I needed to back up an entire site and its content when I couldn’t get hold of the FTP details quickly. Luckily the site was just static content, so I was able to use one of the many tools available in a regular Linux shell.

Here is the code I typed into my terminal:

wget -m --tries=5 "http://www.foo.com"

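According to the man page, “-m” turns on mirroring: wget follows the links it finds on each page, recursing to unlimited depth, and uses time-stamping so that unchanged files are not downloaded again on a later run. “--tries=5” caps the number of attempts per file at five (the default is 20), which stops wget from retrying a failing download over and over.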
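
For anyone who prefers to see what “-m” stands for, here is a rough sketch of the same command with the option spelled out. The wget manual describes “-m” as currently equivalent to “-r -N -l inf --no-remove-listing”; the long option names below are those same flags. The script only prints the command, and the real call is left commented out so nothing downloads by accident.

```shell
#!/bin/sh
# Long-form spelling of the mirror command. Per the wget manual,
# -m is currently equivalent to: -r -N -l inf --no-remove-listing
url="http://www.foo.com"

# Load the options as positional parameters so they can be reused below.
set -- --recursive --timestamping --level=inf --no-remove-listing --tries=5 "$url"

# Print the command first, so the options can be checked before any download.
echo "wget $*"

# Uncomment to run the mirror for real:
# wget "$@"
```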

I’m not sure how well this will work with dynamic sites. It may just capture the HTML output by the server-side scripts, but at least it’s better than nothing.

Further options can help if a particular website is proving tricky to download.

To set the referrer:

--referer=www.google.com

To set the user agent (quoted, because the value contains spaces and parentheses that the shell would otherwise split up or try to interpret):

--user-agent="Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1.1) Gecko/20090717 Fedora/3.5.1-3.fc11 Firefox/3.5.1"
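
Putting the pieces together, the whole thing might look something like the sketch below. The URL, referrer and user agent are the same placeholder values used in this post. DRY_RUN is a made-up switch for this sketch, not a wget feature: it defaults to 1, which only prints each argument in square brackets so you can see that the user agent stays in one piece.

```shell
#!/bin/sh
# Placeholder values: swap in the real site and headers.
url="http://www.foo.com"
referer="www.google.com"
user_agent="Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1.1) Gecko/20090717 Fedora/3.5.1-3.fc11 Firefox/3.5.1"

# Load the options as positional parameters. The double quotes keep the
# spaces and parentheses in the user agent inside a single argument.
set -- -m --tries=5 --referer="$referer" --user-agent="$user_agent" "$url"

# DRY_RUN is a hypothetical switch for this sketch; it defaults to 1.
if [ "${DRY_RUN:-1}" = "1" ]; then
  # Print one bracketed item per argument instead of downloading anything.
  printf 'wget'
  printf ' [%s]' "$@"
  printf '\n'
else
  wget "$@"
fi
```

Save it as a script and run it with DRY_RUN=0 in the environment to start the actual download.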