What's the distribution of requests per page?

SELECT reqtotal as Requests, COUNT(*) AS Pages
FROM [httparchive:runs.latest_pages] 
GROUP BY Requests 
ORDER BY Requests

When I used to do this against the MySQL version of HTTP Archive data, I’d use ROUND(reqTotal, -1) to group in tens.

Looks like this, but what is this?

image

2 Likes

@fhoffa total number of requests per page: images, CSS, javascript, etc. That’s a looong tail too… 400+ requests for some pages, sheeesh!

@andydavies to get the “group in tens”, I think this should work:

SELECT INTEGER(ROUND(requests/10)*10) as req_bucket, SUM(pages) as pages FROM (
  SELECT reqtotal as requests, COUNT(*) AS pages
  FROM [httparchive:runs.latest_pages] 
  GROUP BY requests
) 
GROUP BY req_bucket
ORDER BY req_bucket

Based on above query:
image

Looking at the tail there… Oh my…

SELECT url, reqtotal FROM [httparchive:runs.latest_pages]
ORDER by reqtotal DESC
LIMIT 10

image

Coingig appears to have been fixed, but loading #2 yields:
image

They have a map widget which continue to load more and more icons… My DevTools froze after 2000+ requests. I’d say that’s an antipattern!


Percentile request stats for desktop and mobile:
image

SELECT * FROM
  (SELECT 'desktop' type,
    round(avg(reqTotal)) average,
    NTH(50, quantiles(reqTotal)) p50,
    NTH(75, quantiles(reqTotal)) p75,
    NTH(90, quantiles(reqTotal)) p90,
    NTH(99, quantiles(reqTotal)) p99
  FROM [httparchive:runs.latest_pages]),
  (SELECT 'mobile' type,
    round(avg(reqTotal)) average,
    NTH(50, quantiles(reqTotal)) p50,
    NTH(75, quantiles(reqTotal)) p75,
    NTH(90, quantiles(reqTotal)) p90,
    NTH(99, quantiles(reqTotal)) p99
  FROM [httparchive:runs.latest_pages_mobile]);

All of which adds up to a total of (in KB):
image

SELECT * FROM
  (SELECT 'desktop' type,
    round(avg(round(bytesTotal/1024))) average,
    NTH(50, quantiles(round(bytesTotal/1024))) p50,
    NTH(75, quantiles(round(bytesTotal/1024))) p75,
    NTH(90, quantiles(round(bytesTotal/1024))) p90,
    NTH(99, quantiles(round(bytesTotal/1024))) p99
  FROM [httparchive:runs.latest_pages]),
  (SELECT 'mobile' type,
    round(avg(round(bytesTotal/1024))) average,
    NTH(50, quantiles(round(bytesTotal/1024))) p50,
    NTH(75, quantiles(round(bytesTotal/1024))) p75,
    NTH(90, quantiles(round(bytesTotal/1024))) p90,
    NTH(99, quantiles(round(bytesTotal/1024))) p99
  FROM [httparchive:runs.latest_pages_mobile]);
2 Likes

Something that the “grouping by 10” hides: The big jump at 2, and the weird smaller big jumps at 14 and 21. Why those numbers?

Before 34 everything is below 2100, except: 2:3387, 14:2211, and 21:2235.

Will dig into the pages with just two requests, first hypothesis would be an error page of some sort and a favicon but suspect the reality is murkier.

I first did this query when the Resource Timing spec was in the works and I wanted to check the sanity of the 150 entry buffer limit.

This is what the distribution of pages with only two requests and that are less than 10,000 bytes looks like (Of the 3387 pages with just two requests, 3299 are less than 10,000 bytes).

There’s API end-points serving JSON, Joomla installs stating they need PHP upgraded, blank pages and all sort of other things in there.

Recalculating the average total transfer size for the HTTP Archive excluding all pages below 1,000 bytes increases that average up to 1,479kB (vs 1,466kB)

Interesting. It would also skew the counts for number of scripts, etc. I wonder if we should be dropping the tails (on both ends) when analyzing the data…

P.S. We should dump all the averages, and focus on quantiles. Averages of long-tail distributions… can be highly misleading.

Yeh, I don’t use averages when I write up results for my clients - tend to focus on distributions and quantiles.

I certainly think we should drop the tails but my stats isn’t strong enough to know where we should place the limits.

Updating the percentiles distribution for April 2019:

SELECT * FROM 
  (SELECT 'desktop' type, 
    round(avg(reqTotal)) average, 
    NTH(50, quantiles(reqTotal)) p50, 
    NTH(75, quantiles(reqTotal)) p75, 
    NTH(90, quantiles(reqTotal)) p90, 
    NTH(99, quantiles(reqTotal)) p99 
  FROM [httparchive:runs.latest_pages]), 
  (SELECT 'mobile' type, 
    round(avg(reqTotal)) average, 
    NTH(50, quantiles(reqTotal)) p50, 
    NTH(75, quantiles(reqTotal)) p75, 
    NTH(90, quantiles(reqTotal)) p90, 
    NTH(99, quantiles(reqTotal)) p99 
  FROM [httparchive:runs.latest_pages_mobile]);
Row type average p50 p75 p90 p99
1 mobile 98.0 77 126 190 377
2 desktop 107.0 84 136 206 429