Web archival
Table of Contents
- 1. Org web tools
- 2. thread with recommendations where much of this info was sourced from
- 3. paid, software-as-a-service, but probably the best solution for organizations
- 4. Cancel's OP asking for help
- 5. ArchiveBox
- 6. download images that are loaded in parts
- 7. Wayback Machine downloader
- 8. turn youtube videos into web pages
- 9. https://leetusman.com/archiving-artist-spaces/reading.html
- 10. Web Archiving Community · ArchiveBox/ArchiveBox Wiki · GitHub
- 11. Webrecorder: Web Archiving for All
1. Org web tools
Good for turning links in Org notes into Org markup.
4. Cancel's OP asking for help
There's wget, obviously, plus Archive Team's fork of it. There is httrack, which is what I normally use.
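A rough sketch of what a wget-based mirror could look like when driven from Python. This is not taken from the thread: the function names, the `archive` destination directory, and the one-second wait are assumptions for illustration, and the flags are just a commonly used combination for offline-browsable copies.

```python
import shutil
import subprocess


def build_wget_mirror_command(url, dest_dir="archive", wait_seconds=1):
    """Return the argv list for a polite, offline-browsable wget mirror.

    dest_dir and wait_seconds are illustrative defaults, not recommendations
    from the thread.
    """
    return [
        "wget",
        "--mirror",                        # recursive download with timestamping
        "--convert-links",                 # rewrite links so the copy works offline
        "--adjust-extension",              # add .html/.css suffixes where needed
        "--page-requisites",               # also fetch images, CSS, and scripts
        "--no-parent",                     # stay below the starting directory
        f"--wait={wait_seconds}",          # pause between requests
        f"--directory-prefix={dest_dir}",  # where the mirror is stored
        url,
    ]


def mirror(url, dest_dir="archive"):
    """Run the mirror; return wget's exit code, or None if wget is not installed."""
    if shutil.which("wget") is None:
        return None
    return subprocess.run(build_wget_mirror_command(url, dest_dir)).returncode
```

Calling `mirror("https://example.org/")` would write the copy under `archive/`. Building the command as a list keeps the URL from being interpreted by a shell.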
5. ArchiveBox
Self-hostable and open source, but development has essentially stopped and it is stuck on an end-of-life (EOL) version of Django.
11. Webrecorder: Web Archiving for All
Works by recording user activity.
11.1. Browsertrix
Crawler built on top of Webrecorder. Self-hostable, but requires Kubernetes (K8s).