Segmenting HTTP Archive results by rank with CrUX

Ever since HTTP Archive moved away from an Alexa-based dataset, we’ve lacked comparable ranking information from the Chrome UX Report (CrUX) to segment results by popularity. CrUX now supports coarse rank magnitude segmentation! I think this will unlock a ton of really useful analyses for HTTP Archive users.

Rather than fine-grained ranking like top 1, 2, 3…, rank magnitudes indicate whether an origin is in the top 1k, 10k, 100k, etc. This is especially useful for analyses in which you want to segment the web into head/torso/tail in terms of popularity.
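To get a feel for the segments before joining anything, a quick sketch like the one below counts how many origins fall into each rank magnitude. It assumes that `experimental.popularity.rank` holds the bucket value itself (1000, 10000, 100000, and so on) and that each origin is assigned the smallest magnitude it qualifies for; the `rank_magnitude` and `origins` aliases are just illustrative names.

```sql
-- Count distinct origins at each rank magnitude in the March 2021 dataset.
-- Assumption: rank stores the bucket value (1000, 10000, ...), so each
-- row of the output is one head/torso/tail segment.
SELECT
  experimental.popularity.rank AS rank_magnitude,
  COUNT(DISTINCT origin) AS origins
FROM
  `chrome-ux-report.all.202103`
GROUP BY
  rank_magnitude
ORDER BY
  rank_magnitude
```

If the buckets are non-cumulative as assumed, a filter like `rank <= 10000` selects the whole top 10k, while `rank = 10000` selects only the origins between the top 1k and the top 10k.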

What technologies are used by the top 1k websites?

Here’s an example:

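SELECT
  category,
  app,
  -- Share of the top 1k origins using each technology
  -- (a fraction; multiply by 100 for a percentage)
  COUNT(0) / 1000 AS pct
FROM (
  -- CrUX has several rows per origin, so deduplicate.
  -- Appending '/' makes the origin match the HTTP Archive home page URL.
  SELECT DISTINCT
    CONCAT(origin, '/') AS url
  FROM
    `chrome-ux-report.all.202103`
  WHERE
    experimental.popularity.rank <= 1000)
JOIN (
  -- The wildcard covers both the desktop and mobile tables;
  -- DISTINCT counts each technology once per URL.
  SELECT DISTINCT
    url,
    category,
    app
  FROM
    `httparchive.technologies.2021_03_01_*`
  WHERE
    category IN (SELECT DISTINCT category FROM `httparchive.technologies.2021_01_01_mobile`))
USING
  (url)
GROUP BY
  category,
  app
ORDER BY
  pct DESC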

(The WHERE category IN... clause works around a bug in the technologies tables that affects recent datasets.)

This query measures the most popular technologies used by the top 1k websites’ home pages. Here are the 10 most popular technologies in the “head” segment of the web:

| category | app | pct |
| --- | --- | --- |
| Analytics | Google Analytics | 56.6% |
| Tag managers | Google Tag Manager | 42.7% |
| JavaScript libraries | jQuery | 42.6% |
| Web servers | Nginx | 24.5% |
| Reverse proxies | Nginx | 24.5% |
| Widgets | Facebook | 24.3% |
| Advertising | Google Publisher Tag | 23.1% |
| CDN | Cloudflare | 22.9% |
| Ecommerce | Cart Functionality | 19.4% |
| Font scripts | Google Font API | 16.9% |

The most popular technology that we measure is Google Analytics, which is found on 56.6% (566 websites) of the head of the web.

Surprisingly, jQuery is still extremely popular in the head of the web at 42.6%, although this is much lower than the 83% adoption measured globally in the 2020 Web Almanac. Speaking of which, I’m excited to see what insights we can find in the 2021 Web Almanac with the help of this new ranking information!

JS libraries and frameworks

Filtering specifically for JS libraries and frameworks, we can see that jQuery is far ahead at 42.6%, followed by React, jQuery UI, and Vue.js:

| app | pct |
| --- | --- |
| jQuery | 42.6% |
| React | 9.2% |
| jQuery UI | 5.7% |
| Vue.js | 4.6% |
| Slick | 3.4% |
| Polyfill | 2.9% |
| Modernizr | 2.8% |
| AMP | 2.7% |
| RequireJS | 2.6% |
| jQuery Migrate | 2.3% |
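For anyone who wants to reproduce a category-specific breakdown like this one, here is a rough sketch of how the initial query could be narrowed. The category labels are an assumption: 'JavaScript libraries' appears in the results above, but 'JavaScript frameworks' is my guess at the matching label, so check the distinct `category` values in the technologies table first. This is not necessarily the exact query behind the table.

```sql
SELECT
  app,
  COUNT(0) / 1000 AS pct
FROM (
  SELECT DISTINCT
    CONCAT(origin, '/') AS url
  FROM
    `chrome-ux-report.all.202103`
  WHERE
    experimental.popularity.rank <= 1000)
JOIN (
  -- Select only url and app so that an app listed under both
  -- categories is still counted once per URL.
  SELECT DISTINCT
    url,
    app
  FROM
    `httparchive.technologies.2021_03_01_*`
  WHERE
    -- Assumed category labels; adjust to the values in the table.
    category IN ('JavaScript libraries', 'JavaScript frameworks'))
USING
  (url)
GROUP BY
  app
ORDER BY
  pct DESC
LIMIT
  10
```

Swapping the category list for a single value such as 'CMS' (again, assuming that is the label used in the table) should give the equivalent breakdown for any other category.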

CMS

In the CMS space, WordPress only makes up 3.4% (34 websites):

| app | pct |
| --- | --- |
| WordPress | 3.4% |
| Adobe Experience Manager | 1.1% |
| Drupal | 0.5% |
| Wix | 0.1% |
| Microsoft SharePoint | 0.1% |
| Craft CMS | 0.1% |

If we want to list those 5 Drupal websites, we can modify our initial query to return the URLs:

SELECT DISTINCT
  url
FROM (
  SELECT DISTINCT
    CONCAT(origin, '/') AS url
  FROM
    `chrome-ux-report.all.202103`
  WHERE
    experimental.popularity.rank <= 1000)
JOIN (
  SELECT DISTINCT
    url,
    category,
    app
  FROM
    `httparchive.technologies.2021_03_01_*`)
USING
  (url)
WHERE
  app = 'Drupal'

Split by country

It’s also worth mentioning that the rank segments are available at the country level as well. So we could run a similar query for the top 1000 websites in any given country by replacing chrome-ux-report.all with chrome-ux-report.country_<COUNTRY_CODE>. For example, this query analyzes the technologies used by the 1000 most popular websites in Japan:

SELECT
  category,
  app,
  COUNT(0) / 1000 AS pct
FROM (
  SELECT DISTINCT
    CONCAT(origin, '/') AS url
  FROM
    `chrome-ux-report.country_jp.202103`
  WHERE
    experimental.popularity.rank <= 1000)
JOIN (
  SELECT DISTINCT
    url,
    category,
    app
  FROM
    `httparchive.technologies.2021_03_01_*`
  WHERE
    category IN (SELECT DISTINCT category FROM `httparchive.technologies.2021_01_01_mobile`))
USING
  (url)
GROUP BY
  category,
  app
ORDER BY
  pct DESC

| category | app | pct |
| --- | --- | --- |
| Analytics | Google Analytics | 75.5% |
| JavaScript libraries | jQuery | 71.5% |
| Tag managers | Google Tag Manager | 64.9% |
| Widgets | Facebook | 37.9% |
| Web servers | Apache | 36.3% |
| Advertising | Google Publisher Tag | 27.1% |
| Web servers | Nginx | 26.2% |
| Reverse proxies | Nginx | 26.2% |
| JavaScript libraries | Slick | 15.4% |
| Programming languages | PHP | 15.2% |

It’s interesting to see how the composition changes. Google Analytics (75.5% vs. 56.6%) and jQuery (71.5% vs. 42.6%) are much more prevalent on the 1000 most popular websites in Japan than on the global top 1000.
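To line the two segments up side by side instead of comparing separate result sets by eye, something like the following sketch might work. It reuses the tables and workaround from the queries above; the `top_sites` and `tech` CTE names, the `segment` labels, and the `pct_global` / `pct_jp` column names are all made up for illustration.

```sql
WITH top_sites AS (
  -- Top 1k origins worldwide and in Japan, each tagged with a segment label.
  SELECT DISTINCT
    'global' AS segment,
    CONCAT(origin, '/') AS url
  FROM
    `chrome-ux-report.all.202103`
  WHERE
    experimental.popularity.rank <= 1000
  UNION ALL
  SELECT DISTINCT
    'jp' AS segment,
    CONCAT(origin, '/') AS url
  FROM
    `chrome-ux-report.country_jp.202103`
  WHERE
    experimental.popularity.rank <= 1000
),
tech AS (
  SELECT DISTINCT
    url,
    category,
    app
  FROM
    `httparchive.technologies.2021_03_01_*`
  WHERE
    category IN (SELECT DISTINCT category FROM `httparchive.technologies.2021_01_01_mobile`)
)
SELECT
  category,
  app,
  -- Fraction of each segment's 1000 origins that use the technology.
  COUNTIF(segment = 'global') / 1000 AS pct_global,
  COUNTIF(segment = 'jp') / 1000 AS pct_jp
FROM
  top_sites
JOIN
  tech
USING
  (url)
GROUP BY
  category,
  app
ORDER BY
  pct_jp DESC
```

An origin that is in both top 1k lists appears once per segment, so it contributes to both columns, which is the intended behavior for a side-by-side comparison.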

Let me know if you’ve found any other interesting insights using the new rank magnitude field!

I’ve written a little data explorer app to help visualize the global 1k results: https://codepen.io/rviscomi/full/GRrYMbJ
