I use HTTP Archive data a lot in public presentations and I’ve started to “name names” more often. For example, this chart of CMS performance:
Web transparency shines a light on the good and the bad equally. By measuring the state of the web we can understand how good or bad things are and track progress from that baseline. It’s also a way of keeping the web honest, like when we identified sites that use cryptocurrency miners.
So I thought it’d be good to codify what’s ok and what’s not ok when it comes to web transparency datasets like the HTTP Archive. Is this something people would find useful and important to have? If so, what “commandments” would you like to see on that list?
I like this idea a lot!
A core commandment to me: reproducibility
i.e., any conclusion presented must have the query that led to it publicly available.
The methods of web transparency should themselves be transparent, after all.
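As one hedged illustration of what that could look like in practice: pair every published claim with the exact query and run date that produced it, so anyone can re-run it. The `httparchive` BigQuery dataset is real, but the `Finding` helper, the claim, and the placeholder query below are purely illustrative, not an actual HTTP Archive query or tool.

```python
# Sketch: attach the query to the conclusion, so readers can verify it.
# Everything here is illustrative; the real query text would be included verbatim.

from dataclasses import dataclass

@dataclass
class Finding:
    claim: str      # the conclusion being shared
    query: str      # the exact SQL that produced it
    run_date: str   # when it was run, since the dataset updates monthly

    def summary(self) -> str:
        # Render the claim with its query and run date attached.
        return (
            "Claim: " + self.claim + "\n"
            + "Query (run " + self.run_date + "):\n"
            + self.query
        )

finding = Finding(
    claim="CMS X has a higher median page weight than CMS Y",  # illustrative
    query="SELECT ...",  # placeholder; not a real HTTP Archive query
    run_date="2018-06-01",
)
print(finding.summary())
```

The point isn't the helper class, just the discipline: a chart without its query isn't reproducible, and a query without its run date isn't either, since the dataset changes every month.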
+1 on reproducibility.
Probably goes without saying, but "assume good intent" is important too.
There shouldn’t be any reason for someone to use HTTP Archive data to publicly call out a certain site/tool as “bad”. This data helps us move the web forward, and we can only do that if we work together.
I agree that the data should be unbiased; poor scores are an opportunity to fix problems. Public dissemination of these data should provide an incentive to compete. These charts are a great service.