What is the distribution of Domains per Page (Desktop vs Mobile)?

I recently wanted to know whether 52 domains on a given page was too many, and what the general distribution of domains per page on httparchive looked like.

-- BigQuery legacy SQL: the comma between the two subqueries unions their rows.
SELECT * FROM
(SELECT 'desktop' type,
  round(avg(numDomains)) average,
  NTH(50, quantiles(numDomains)) p50,
  NTH(75, quantiles(numDomains)) p75,
  NTH(90, quantiles(numDomains)) p90,
  NTH(99, quantiles(numDomains)) p99
FROM [httparchive:runs.latest_pages]),
(SELECT 'mobile' type,
  round(avg(numDomains)) average,
  NTH(50, quantiles(numDomains)) p50,
  NTH(75, quantiles(numDomains)) p75,
  NTH(90, quantiles(numDomains)) p90,
  NTH(99, quantiles(numDomains)) p99
FROM [httparchive:runs.latest_pages_mobile])

resulted in the following

As can be seen, 52 was way out in the 90+ percentile range :slight_smile: The query is trivial, but since it has not been discussed before, I am posting it.
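
For anyone who wants to sanity-check what the query computes, here is a minimal Python sketch of the same summary (rounded average plus p50/p75/p90/p99) on a local list of per-page domain counts. The `sample` values are made up for illustration and are not HTTP Archive data, and `percentile` and `summarize` are hypothetical helper names. It uses an exact nearest-rank percentile, so its numbers may differ slightly from BigQuery's approximate `quantiles`.

```python
import math


def percentile(values, p):
    """Nearest-rank percentile: the smallest value with at least p% of the data at or below it."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    # Rank is 1-based; clamp to 1 so that very small p still returns the minimum.
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def summarize(values):
    """Return the same columns as the query: rounded average and four percentiles."""
    return {
        "average": round(sum(values) / len(values)),
        "p50": percentile(values, 50),
        "p75": percentile(values, 75),
        "p90": percentile(values, 90),
        "p99": percentile(values, 99),
    }


# Hypothetical per-page domain counts (assumption: NOT real HTTP Archive data).
sample = [3, 5, 8, 12, 15, 18, 22, 30, 41, 52]
summary = summarize(sample)
print(summary)
```

In this made-up sample, 52 is the largest value, so it only shows up at p99.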


great stuff :thumbsup:

Ilya, What do you think we can do to get that p75 mobile down to 10 in the next 2 years? Does it just come down to evangelization, or what?

@igrigorik, does this have any implications for how many dns-prefetch links might be considered appropriate or too many?

I know dns-prefetch shouldn’t be abused of course. :smile: Just wondering if these numbers shape the way you feel about that at all.

@aranjedeath before we answer that, it would be interesting to run a second round of analysis on these domains to isolate cases of third party deps (e.g. widgets, ads, etc.) vs. domain sharding. In an HTTP 2.0 world, sharding is an anti-pattern, so that’s definitely part education, part (hopefully) automation to undo the damage. That said, I doubt the number of third party deps will go down – my guess is it’ll continue going up (in fact, that’s another interesting graph that I’d like to see!)

@cqueern I don’t think the sheer number is as relevant as when/which resources are in the critical path. If the bulk of these domains are loaded after the fact and outside of the critical path – great!
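
A rough sketch of how the third-party vs. sharding split suggested above might look for a single page, in Python. Everything here is an assumption for illustration: `site_key` and `split_domains` are hypothetical helpers, the URLs are placeholders, and the "last two labels" rule is a naive stand-in for a real public-suffix lookup (it gets domains like `example.co.uk` wrong, and it counts a site's own CDN on a different domain as third party).

```python
from urllib.parse import urlsplit


def site_key(host):
    """Naive registrable-domain guess: the last two labels of the hostname."""
    labels = host.lower().rstrip(".").split(".")
    return ".".join(labels[-2:])


def split_domains(page_url, request_urls):
    """Split the distinct request hosts into same-site hosts and third-party hosts."""
    page_site = site_key(urlsplit(page_url).hostname)
    hosts = {urlsplit(url).hostname for url in request_urls}
    hosts.discard(None)  # drop URLs that have no hostname
    same_site = sorted(h for h in hosts if site_key(h) == page_site)
    third_party = sorted(h for h in hosts if site_key(h) != page_site)
    return same_site, third_party


# Placeholder URLs (assumption: not taken from any real crawl).
page = "http://www.example.com/"
requests = [
    "http://www.example.com/app.js",
    "http://static1.example.com/a.png",
    "http://static2.example.com/b.png",
    "http://cdn.example.net/lib.js",
    "http://ads.example.org/ad.js",
]
same_site, third_party = split_domains(page, requests)
print(len(same_site), "same-site hosts (possible sharding):", same_site)
print(len(third_party), "third-party hosts:", third_party)
```

More than one same-site host is only a hint of sharding, not proof, so the counts are best read as a first pass before looking at individual pages.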

@igrigorik good point!

I’m guessing this data would also be useful for folks who work in webspam detection because it’d be easier to identify sites and pages that have suspiciously high numbers of outbound links. I doubt it’s a novel idea to many teams that work for search engines but perhaps hosting companies hoping to detect compromises in their customers’ sites would find it helpful.

Nice work @pganti!