Hi! Is there a way to download all 42TB of Web Almanac’s data?
Hi! All of the raw data is publicly available in the HTTP Archive BigQuery project, and the queries we used this year are available on GitHub. Does that help you get what you’re looking for?
Worth noting that you’d have to pay quite a bit to run queries on the whole set or export it.
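If it helps, here is a rough sketch of how you could estimate the cost before running anything. It assumes the `google-cloud-bigquery` Python client and on-demand (per-byte-scanned) billing. The function names and the `rate` value are just placeholders I made up for illustration, so check the current BigQuery pricing page for the real per-TiB rate.

```python
TIB = 2 ** 40  # bytes in one tebibyte


def estimate_cost_usd(bytes_scanned: int, usd_per_tib: float) -> float:
    """Rough on-demand cost for a query that scans `bytes_scanned` bytes.

    `usd_per_tib` is an assumption supplied by the caller, not a quoted price.
    """
    if bytes_scanned < 0 or usd_per_tib < 0:
        raise ValueError("bytes_scanned and usd_per_tib must be non-negative")
    return bytes_scanned / TIB * usd_per_tib


def dry_run_bytes(sql: str) -> int:
    """Ask BigQuery how many bytes a query would scan, without running it.

    Needs the google-cloud-bigquery package and configured credentials.
    """
    from google.cloud import bigquery  # imported lazily so the estimator works offline

    client = bigquery.Client()
    config = bigquery.QueryJobConfig(dry_run=True, use_query_cache=False)
    job = client.query(sql, job_config=config)  # dry run: nothing is executed or billed
    return job.total_bytes_processed


if __name__ == "__main__":
    rate = 5.0  # placeholder USD per TiB, NOT a current price
    whole_set = 42 * 10 ** 12  # the 42TB figure from the question, read as decimal terabytes
    print(f"Scanning everything once: about ${estimate_cost_usd(whole_set, rate):,.2f}")
```

You can pass any of the queries to `dry_run_bytes` first and feed the result into `estimate_cost_usd`, which gives you a number to look at before committing to a full scan.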
Having worked directly with the data before, I can understand why you’d want to do it. I also know how much effort went into providing it for download, and the costs that can be incurred if the robots decide they want it.
Charlie