Last December, my college's Web server crashed because of a hard-disk failure. As I was fiddling with my most recent backup of my own Web pages, I clumsily managed to delete that, too. I had to resort to a much older backup that was missing several pages that I had created since then. I put the missing pages on my "to do" list to re-create, but never got around to doing it.
Just today in a thread elsewhere on PF, someone mentioned the Wayback Machine, which I had forgotten about. I used it to search for my Web site, and voilà, there were the missing pages! :!)
There was just one problem. The pages are part of a large photo gallery, and all the images are in a directory that I've forbidden to Web crawlers via a robots.txt file. (Some people were slurping hundreds of pictures at once, and bogging the server down.) So the Wayback Machine has the Web page text, including the picture captions, but not the images.
At least I can find the images again in my collection, based on the captions and URLs, but it will take some time to track them down and fix them up in Photoshop again. To make things easier in the future, I've added an entry to robots.txt that allows the Wayback Machine's crawler to fetch my images, while still forbidding other crawlers from doing so.
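For anyone wanting to do the same, here is a rough sketch of what such a robots.txt entry might look like, plus a quick check with Python's standard `urllib.robotparser`. The details are assumptions, not my actual file: `/images/` stands in for the gallery directory, `www.example.edu` is a placeholder host, and `ia_archiver` is the user-agent name historically associated with the Wayback Machine's crawler (worth confirming against the Internet Archive's current documentation). Keep in mind that robots.txt is only advisory, so badly behaved crawlers can ignore it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: "/images/" is an assumed directory name, and
# "ia_archiver" is assumed to be the Wayback Machine crawler's user agent.
ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow:

User-agent: *
Disallow: /images/
"""

# Placeholder host, used only to build complete URLs for the check.
BASE_URL = "http://www.example.edu"


def allowed(agent: str, path: str) -> bool:
    """Return True if the rules in ROBOTS_TXT let `agent` fetch `path`."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, BASE_URL + path)


if __name__ == "__main__":
    for agent in ("ia_archiver", "SomeOtherBot"):
        for path in ("/images/photo1.jpg", "/gallery/page1.html"):
            print(agent, path, allowed(agent, path))
```

The empty `Disallow:` line in the first record means "nothing is forbidden" for that crawler, while every other crawler falls through to the `*` record and stays locked out of the image directory.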