Today I was asked to download an old website for archiving purposes.
I decided to use wget.
The command is:
wget --recursive --no-clobber --page-requisites --convert-links --html-extension --no-parent http://www.domain.xyz
The function of the arguments is as follows:
--recursive -> Kinda obvious… follow links on the website to download more than just the index page
--no-clobber -> do not re-download files that are already there.
--page-requisites -> download everything needed for displaying the page (images, stylesheets, and so on)
--convert-links -> convert the links in the downloaded pages to point at the local copy (if you don't do that, clicking on a link will take you back to the original site on the server…)
--html-extension -> save pages with an .html extension even if the server calls them something else (.php, .asp, …), so the local copy opens properly in a browser. Newer wget versions know this flag as --adjust-extension.
--no-parent -> tells wget not to climb up to parent directories; only paths at or below the given URL get downloaded. (Links to other domains, facebook buttons etc., aren't followed anyway: recursive wget stays on the starting host unless you add --span-hosts.)
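A quick sketch of where --no-parent matters: point wget at a subdirectory (the /blog/ path here is just a made-up example) and nothing above it gets fetched:

wget --recursive --no-clobber --page-requisites --convert-links --html-extension --no-parent http://www.domain.xyz/blog/

And if you like it short, the same flags have one-letter forms:

wget -r -nc -p -k -E -np http://www.domain.xyz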
That's it, basically… Easy… 😉