To gauge how much HTTP pipelining or a move to HTTP/2 could help, I wanted to know the average number of HTTP requests made over a single TCP connection. I ran the following query:
SELECT ROUND(AVG(reqTotal/_connections)) AS average,
       ROUND(NTH(50, QUANTILES(reqTotal/_connections))) AS median,
       ROUND(NTH(75, QUANTILES(reqTotal/_connections))) AS p75,
       ROUND(NTH(90, QUANTILES(reqTotal/_connections))) AS p90,
       ROUND(NTH(99, QUANTILES(reqTotal/_connections))) AS p99
FROM [httparchive:runs.latest_pages];
Interestingly, the results are as follows:
The average is 4 requests per connection, which is the same as the 75th percentile. In fact, fully half of the pages make fewer than 3 requests per open connection.
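An average that sits at the 75th percentile suggests a right-skewed distribution: a few pages with many requests per connection pull the mean above the median. The sketch below illustrates that effect in Python. The (reqTotal, _connections) pairs are invented for illustration and are not HTTP Archive data, and the helper names are hypothetical. The percentiles are interpolated with Python's statistics module, which only approximates what QUANTILES and NTH return in BigQuery.

```python
import statistics

# Hypothetical (reqTotal, _connections) pairs, invented for illustration only.
SAMPLE_PAGES = [
    (10, 10), (12, 6), (20, 10), (30, 10),
    (45, 15), (40, 10), (60, 12), (754, 61),
]


def requests_per_connection(pages):
    """Return reqTotal/_connections for every page that opened a connection."""
    return [req_total / connections
            for req_total, connections in pages
            if connections > 0]


def summarize(ratios):
    """Rounded average, median, p75 and p90 of the per-page ratios."""
    # quantiles(n=100) returns 99 cut points; index k-1 is the k-th percentile.
    cuts = statistics.quantiles(sorted(ratios), n=100, method="inclusive")
    return {
        "average": round(statistics.mean(ratios)),
        "median": round(cuts[49]),
        "p75": round(cuts[74]),
        "p90": round(cuts[89]),
    }


if __name__ == "__main__":
    print(summarize(requests_per_connection(SAMPLE_PAGES)))
```

With these made-up pairs the single heavy page is enough to lift the rounded average to 4 while the median stays at 3, the same shape the query reports.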
To see how this ratio is distributed across pages, we can group it into buckets of 10:
SELECT INTEGER(ROUND(reqs_per_conn/10)*10) AS req_bucket, SUM(pages) AS pages
FROM (
  SELECT reqTotal/_connections AS reqs_per_conn, COUNT(*) AS pages
  FROM [httparchive:runs.latest_pages]
  GROUP BY reqs_per_conn
)
GROUP BY req_bucket
ORDER BY req_bucket;
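The bucketing expression ROUND(reqs_per_conn/10)*10 snaps each ratio to the nearest multiple of 10, so bucket 0 holds every page with fewer than 5 requests per connection. A minimal Python sketch of the same grouping follows. It assumes ROUND rounds halves away from zero (Python's built-in round does not, hence the floor-based helper), and the function names and sample pairs are hypothetical.

```python
import math
from collections import Counter


def req_bucket(reqs_per_conn):
    """Nearest multiple of 10, rounding halves up (non-negative input only)."""
    return int(math.floor(reqs_per_conn / 10 + 0.5) * 10)


def bucket_pages(pages):
    """Count pages per bucket, sorted by bucket like ORDER BY req_bucket."""
    counts = Counter()
    for req_total, connections in pages:
        if connections > 0:  # skip pages with no recorded connections
            counts[req_bucket(req_total / connections)] += 1
    return sorted(counts.items())


if __name__ == "__main__":
    # Invented (reqTotal, _connections) pairs, not HTTP Archive data.
    sample = [(10, 10), (50, 10), (754, 61), (200, 10)]
    for bucket, pages in bucket_pages(sample):
        print(bucket, pages)
```

Because the first bucket spans 0 up to (but not including) 5, a finer step than 10 may be more informative for a ratio whose median is around 3.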
Some of the top sites that make only one request per connection (reqTotal equal to _connections) are Craigslist (http://httparchive.org/viewsite.php?pageid=18240620) and www.wordpress.com (http://httparchive.org/viewsite.php?pageid=18240585), as the following query shows:
SELECT rank, url, reqTotal, _connections
FROM [httparchive:runs.latest_pages]
WHERE reqTotal == _connections AND rank < 1000;
By contrast, the Rakuten.co.jp site serves 754 requests over only 61 connections, roughly 12 requests per connection (http://httparchive.org/viewsite.php?pageid=18240705).