Copying a Website with wget

Today I was asked to download an old website for archiving purposes.

I decided to use wget.

The command is:

wget --recursive --no-clobber --page-requisites --convert-links --html-extension --no-parent

The function of the arguments is as follows:

–recursive -> Kinda obvious… follow links on the website to download more than just the index-page

–no-clobber -> do not download files that are already there.

–page-requisites -> download everything needed for displaying the page

–convert-links -> convert the links from the original to the now local copy (if you dont do that, clicking on a link will get you to the original site on the server…)

–html-extension -> converts other extensions to html, or in other words: makes remote scripts (visiter-counter for example) work on your local copy

–no-parent -> is used to tell wget not to follow links outside of the given domain (for example facebook-buttons etc.) only download subpaths of the given domain.

Thats it, basically… Easy… 😉

Leave a Reply

Your email address will not be published.