How to download a website from the archive.org Wayback Machine?

I want to get all the files for a given website at archive.org. Reasons might include:

  • the original author did not archive his own website and it is now offline, and I want to make a public cache of it
  • I am the original author of some website and lost some content, which I want to recover

How do I do that?

Keep in mind that the archive.org Wayback Machine is special: links in an archived page do not point to the archive itself, but to the original web page, which might no longer be there. JavaScript is used client-side to update the links, so a trick like a recursive wget won’t work.

 

ANSWER:

I came across the same issue and coded a gem. To install: gem install wayback_machine_downloader. Run wayback_machine_downloader with the base URL of the website you want to retrieve as a parameter: wayback_machine_downloader http://example.com
More information: github.com/hartator/wayback_machine_downloader – Hartator Aug 10 ’15 at 6:32
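For anyone curious what a downloader like this has to do under the hood, here is a minimal Python sketch. It is not the gem's code and makes no claim about how the gem works internally. It assumes two Wayback Machine features as I understand them: the CDX index endpoint at web.archive.org/cdx/search/cdx, which lists captures for a URL pattern, and the "id_" suffix after the timestamp, which returns a capture as it was archived, without rewritten links. The function names are made up for illustration.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint of the Wayback Machine capture index (CDX server).
CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def cdx_query_url(site: str) -> str:
    """Build an index query that lists captures of every URL under `site`."""
    params = {
        "url": site.rstrip("/") + "/*",   # wildcard: everything below the site root
        "output": "json",                 # rows come back as JSON arrays
        "fl": "timestamp,original",       # only the two fields needed here
        "filter": "statuscode:200",       # skip redirects and errors
        "collapse": "urlkey",             # keep one capture per distinct URL
    }
    return CDX_ENDPOINT + "?" + urlencode(params)


def raw_snapshot_url(timestamp: str, original: str) -> str:
    """URL of one capture as archived; 'id_' asks for unmodified content."""
    return f"https://web.archive.org/web/{timestamp}id_/{original}"


def list_captures(site: str) -> list:
    """Fetch (timestamp, original URL) pairs for `site`. Needs network access."""
    with urlopen(cdx_query_url(site), timeout=30) as response:
        rows = json.load(response)
    # With output=json the first row is a header naming the fields.
    return [(timestamp, original) for timestamp, original in rows[1:]]
```

Calling list_captures("example.com") and passing each pair to raw_snapshot_url gives the list of files to fetch. Saving them to disk, retrying failures, and fixing up links between pages are the parts a real tool still has to handle.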
A step-by-step guide for Windows users (Win 8.1 64-bit for me) who are new to Ruby. Here is what I did to make it work:
1) I downloaded the installer from rubyinstaller.org/downloads, then ran “rubyinstaller-2.2.3-x64.exe”
3) unzip the zip on your computer
4) search the Windows start menu for “Start command prompt with Ruby” (to be continued) – Erb Oct 2 ’15 at 7:40
5) follow the instructions at github.com/hartator/wayback_machine_downloader (e.g. copy and paste “gem install wayback_machine_downloader” into the prompt, hit Enter, and it will install the program; then follow the “Usage” guidelines).
6) once your website is captured, you will find the files in C:\Users\YOURusername\websites
Another service: https://ru.archivarix.com/
