I'd say this is currently our 2nd most-requested feature :) It's definitely on the roadmap: #120, but not planned anytime soon because it's not ArchiveBox's primary use case and it's extremely difficult to do well. For now I recommend using other software to do the crawling and produce a list of URLs of all the pages, then pipe that list into ArchiveBox to do the actual archiving (there's a rough example of that workflow at the bottom of this comment).

Eventually, the rough plan is to expose flags on ArchiveBox similar to the ones available in wget:

`--mirror --level=5 --span-hosts --recursive --no-parent`

https://www.gnu.org/software/wget/manual/wget.html#Recursive-Retrieval-Options-1

Together, these flags should cover all the use cases:

- archiving an entire domain with all its pages
- archiving an entire domain, but only pages below the current directory level
- archiving recursively from a single page across all domains, down to a given depth

I anticipate it will take a while to get to this point though (12+ months likely), as we first have to build or integrate a crawler of some sort, and web crawling is an extremely complex process with lots of subtle nuance around configuration and environment. Recursive archiving will also naturally build on top of snapshot support once that's added: #179.

Unfortunately, doing mirroring / full-site crawling properly is extremely non-trivial, as it involves building or integrating with an existing crawler/spider. Even just the logic to parse URLs out of a page is deceptively complex, and there are tons of intricacies around mirroring that never come up with the kind of single-page archiving ArchiveBox was designed for. Currently this is blocked on setting up our proxy archiver, which has support for deduplicating response data in the WARC files; after that we'll also need to pick a crawler to build, or integrate with an existing one.

For people landing on this issue and looking for an immediate solution, I recommend this command (it's essentially what ArchiveBox runs right now, with a few recursive options added):

```bash
wget --server-response \
    --no-verbose \
    --adjust-extension \
    --convert-links \
    --force-directories \
    --backup-converted \
    --compression=auto \
    -e robots=off \
    --restrict-file-names=unix \
    --timeout=60 \
    --page-requisites \
    --no-check-certificate \
    --no-hsts \
    --span-hosts \
    --no-parent \
    --recursive \
    --level=2 \
    --warc-file=$(date +%s) \
    --user-agent="Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.75 Safari/537.36" \
    https://example.com
```

Set `--level=[n]` to the depth of links you want to follow during archiving, or add `--mirror` and remove `--span-hosts` and `--no-parent` if you want to archive an entire domain.
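For reference, here's roughly how the three use cases listed above map onto wget's recursive flags. These are bare-bones sketches with only the crawl-related options (combine them with the archiving options from the full command above, and treat the URLs as placeholders):

```bash
# 1. entire domain, all pages (--mirror is shorthand for -r -N -l inf --no-remove-listing)
wget --mirror --page-requisites https://example.com/

# 2. entire domain, but only pages at or below the starting directory
wget --recursive --level=inf --no-parent --page-requisites https://example.com/docs/

# 3. recurse outward from a single page across all domains, to a fixed depth
wget --recursive --level=5 --span-hosts --page-requisites https://example.com/some-page.html
```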
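And if you'd rather keep ArchiveBox's own per-page archiving methods (screenshot, PDF, DOM dump, etc.) instead of relying on wget's mirror output, the interim workflow mentioned at the top is to use a crawler purely to collect URLs and then feed that list to ArchiveBox. Here's a rough sketch using wget's spider mode as the crawler; the grep/awk log-scraping and the depth are just one way to do it, and it assumes a recent ArchiveBox where `archivebox add` reads URLs from stdin:

```bash
# crawl 2 levels deep without saving anything, and scrape the visited URLs out of wget's log
wget --spider --recursive --level=2 --no-parent https://example.com/ 2>&1 \
    | grep '^--' \
    | awk '{ print $3 }' \
    | sort -u > urls.txt

# then hand the list to ArchiveBox to archive each page individually
archivebox add < urls.txt
```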