Hacker News

I don't think that works. It's not remotely browsable or searchable. It would be quite challenging to put these scrapes up, anyway. They're regular wget crawls with a regular directory/file structure, the problem is that there's so much material and so many files that it can be almost impossible to find what you are looking for... (Plus you need to rewrite links into relative links to make everything render properly.)
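For what it's worth, wget can handle the link rewriting at crawl time rather than as a post-processing step. A sketch of the usual mirroring flags (example.com stands in for the actual site being crawled):

```shell
# --mirror          : recursion + timestamping, suitable for re-crawls
# --convert-links   : rewrite links in downloaded pages to relative ones,
#                     so the mirror renders properly from the local disk
# --adjust-extension: append .html to pages served without an extension
# --page-requisites : also fetch CSS/JS/images needed to render each page
# --no-parent       : don't ascend above the starting directory
wget --mirror --convert-links --adjust-extension \
     --page-requisites --no-parent \
     https://example.com/
```

Note that --convert-links only runs after the whole crawl finishes, so an interrupted crawl still leaves absolute links behind.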


Hmm. Now I'm thinking that I might end up using your idea (scraping the dark web) and using something like httrack[0] to get exactly that: structure.

[0] https://en.wikipedia.org/wiki/HTTrack
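For anyone trying it, HTTrack's command-line form looks roughly like this (a sketch; example.com is a placeholder, and the filter syntax is HTTrack's own +/- pattern language):

```shell
# Mirror a site into ./mirror, restricted to the example.com domain.
#   -O  : output directory for the mirror
#   "+*.example.com/*" : scan filter, only follow links matching this pattern
#   -v  : verbose output
httrack "https://example.com/" -O "./mirror" "+*.example.com/*" -v
```

Unlike wget, HTTrack rewrites links incrementally and can resume an interrupted mirror with the --continue option, which is part of the "magic" the comment below refers to.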


I once tried using HTTrack, but I found it was doing too much magic under the hood and was hard to work with. As dumb as wget is (that blacklist bug is over 12 years old now!), at least it's understandable.


Thanks for saving me the headache :)



