You imply that I think this would be a good way of measuring the impact of HTTP/2, but I don’t really think it is!
Each site will be unique, and just running WordPress won’t counter that. Maybe if you look at sites with a similar number and size of resources you might be able to leverage the scale of WordPress usage to get a decent approximation, but that’s far from a given.
Also, removing CDN-backed installations - under the assumption that they provide HTTP/2 by default and also have other optimisations - leads to other challenges. What if a site using HTTP/2 is based in Australia, but a similar HTTP/1.1 site is based in the US, much closer to where the HTTP Archive crawl runs from? So you should bring location into play too. What if one is on crappy hardware and the other is not? So maybe you should bring TTFB in too?
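To illustrate the kind of like-for-like bucketing I mean, here’s a rough sketch in Python/pandas. It assumes you’ve already exported page-level rows for WordPress sites from the HTTP Archive; the file name, column names, and the protocol strings ("HTTP/1.1" / "HTTP/2") are all placeholders of mine, not the real schema.

```python
import pandas as pd

# Hypothetical export of HTTP Archive page-level data for WordPress sites.
# Column names are placeholders for illustration, not the real schema.
pages = pd.read_csv("wordpress_pages.csv")
# expected columns: protocol, req_total, bytes_total, ttfb_ms, onload_ms

# Bucket pages by resource count and total weight so we only compare
# like-for-like sites, and crudely account for server/location via TTFB.
pages["req_bucket"] = pd.cut(pages["req_total"], bins=[0, 25, 50, 100, 200, 10_000])
pages["kb_bucket"] = pd.cut(pages["bytes_total"] / 1024, bins=[0, 500, 1_000, 2_000, 5_000, 1e9])
pages["ttfb_bucket"] = pd.cut(pages["ttfb_ms"], bins=[0, 200, 500, 1_000, 1e9])

# Median load time per bucket, split by protocol.
grouped = (
    pages.groupby(["req_bucket", "kb_bucket", "ttfb_bucket", "protocol"], observed=True)["onload_ms"]
    .median()
    .unstack("protocol")
)

# Only keep buckets where both protocols are represented, then compare.
comparable = grouped.dropna(subset=["HTTP/1.1", "HTTP/2"]).copy()
comparable["h2_delta_ms"] = comparable["HTTP/2"] - comparable["HTTP/1.1"]
print(comparable.sort_values("h2_delta_ms"))
```

Even then, the buckets only paper over the variables rather than remove them, which is really my point.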
I fear there are too many variables. The only fair way to measure this would be to do an HTTP/1.1-only crawl and compare the results, but that is a huge resource challenge!
And then that’s before you get into who (if anyone!) is benefiting from HTTP/2. What if desktop is ever so slightly slower (1.5s becomes 1.6s) but mobile is dramatically faster with HTTP/2 (8s becomes 4s)? Is HTTP/2 “worth it” then? Or do you need to measure the percentage of your visitors on each and take that into account?
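To make that weighting concrete, here’s a trivial back-of-the-envelope calculation with those hypothetical numbers and an assumed 60/40 desktop/mobile split:

```python
# Hypothetical figures from above: desktop regresses slightly, mobile
# improves dramatically. Weight each by its (assumed) share of traffic.
desktop_before, desktop_after = 1.5, 1.6   # seconds
mobile_before, mobile_after = 8.0, 4.0     # seconds
desktop_share, mobile_share = 0.6, 0.4     # assumed visitor split

weighted_before = desktop_share * desktop_before + mobile_share * mobile_before  # 4.10s
weighted_after = desktop_share * desktop_after + mobile_share * mobile_after     # 2.56s
print(f"Traffic-weighted: {weighted_before:.2f}s -> {weighted_after:.2f}s")

# With these particular numbers HTTP/2 wins for almost any split (the desktop
# regression only dominates if mobile is under roughly 2-3% of traffic), but
# the point stands: "worth it" depends on who your visitors actually are.
```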
This is where things like CrUX could help, but again it only measures Chrome users (and we know Chrome supports HTTP/2 and will use it if it can). Ideally we would need a Chrome experiment that disables HTTP/2 for a percentage of users and then leverage that to get some good measurements. But that’s quite an ask, and well beyond the HTTP Archive.
Also, what stat are we using to define the effect? What if FCP is slightly slower, but TTI is massively faster?
And then we come to browsers, which we know vary widely in their HTTP/2 implementations. Of course Chrome has a massive percentage of users, but Safari obviously has significant usage on iOS.
Then there are the servers, which also vary massively (in particular in terms of prioritisation!). Should you limit the comparison to the same Server header? And what about versions, given that h2 is still relatively new, bugs are still being found and fixed, and optimisations are still being implemented?
All in all, while I would love to get an answer to this question, I’m not convinced there is an easy one to be had just by querying the HTTP Archive. This, incidentally, is why I simply avoided this question for the HTTP/2 chapter of the Web Almanac:
“The impact of HTTP/2 is much more difficult to measure, especially using the HTTP Archive methodology. Ideally, sites should be crawled with both HTTP/1.1 and HTTP/2 and the difference measured, but that is not possible with the statistics we are investigating here. Additionally, measuring whether the average HTTP/2 site is faster than the average HTTP/1.1 site introduces too many other variables that require a more exhaustive study than we can cover here.”
I think to truly answer this would require a concerted effort from the browsers, or a research effort (from academia maybe?).
But maybe I’m being overly pessimistic? Maybe the law of averages will hold true - especially if the difference is quite noticeable? As I say, I’d certainly love to know the answer, so I’m definitely interested in hearing other people’s opinions in this thread. And then I might tackle the HTTP Archive queries myself if people can weigh in on what they think would be good things to query for.
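For example, something along these lines is roughly what I have in mind - a sketch only, using the BigQuery Python client; the crawl table, the reqTotal/onLoad columns, and especially the protocol field are assumptions that would need checking against the actual HTTP Archive schema:

```python
from google.cloud import bigquery

# Sketch only: the table name and columns (especially `protocol`) are
# assumptions and would need verifying against the real HTTP Archive schema.
QUERY = """
SELECT
  protocol,                                          -- assumed page-level protocol field
  COUNT(0) AS pages,
  APPROX_QUANTILES(onLoad, 100)[OFFSET(50)] AS median_onload_ms
FROM
  `httparchive.summary_pages.2019_07_01_desktop`     -- placeholder crawl date
WHERE
  reqTotal BETWEEN 50 AND 100                        -- one like-for-like bucket
GROUP BY
  protocol
"""

client = bigquery.Client()
for row in client.query(QUERY).result():
    print(row.protocol, row.pages, row.median_onload_ms)
```

Grouping by the Server header as well would start to get at the prioritisation question, but as above I suspect the variables will swamp the signal.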