Tracking WebP adoption and impact, especially in WordPress

Hi!

I recently helped land WebP support in WordPress (Changeset 50810 – WordPress Trac) and would like to start tracking adoption of WebP images.

Mainly I want to track:

  • How many WordPress (>= version 5.8.0) sites use WebP images (vs. pre 5.8.0 which would have required using a plugin). WordPress 5.8.0 will ship around July 7th, so impact will start then.

Also nice to know:

  • Percentage of weight/total weight for images for sites using WebP vs. not using WebP.

  • How do CWV scores (more maybe just LCP?) compare for WP sites that do or don’t use WebP images?

Appreciate any help writing a query to track these!

1 Like

Too early to get 5.8+ data but to get started here’s a look at WebP adoption by WordPress version as of the April dataset (mobile only):

SELECT
  version,
  COUNTIF(has_webp) AS pages_with_webp,
  COUNT(0) AS pages,
  COUNTIF(has_webp) / COUNT(0) AS pct_webp
FROM (
  SELECT DISTINCT
    url,
    info AS version
  FROM
    `httparchive.technologies.2021_04_01_mobile`
  WHERE
    app = 'WordPress')
JOIN (
  SELECT
    url,
    has_webp
  FROM (  
    SELECT
      pageid,
      COUNTIF(ext = 'webp') > 0 AS has_webp
    FROM
      `httparchive.summary_requests.2021_04_01_mobile`
    GROUP BY
      pageid)
    JOIN (
      SELECT
        pageid,
        url
      FROM
        `httparchive.summary_pages.2021_04_01_mobile`)
    USING
      (pageid))
USING
  (url)
GROUP BY
  version
ORDER BY
  pages DESC
version pages_with_webp pages pct_webp
47,480 938,168 5.06%
5.7 8,212 343,103 2.39%
5.6.2 3,946 186,681 2.11%
5.5.3 1,878 111,512 1.68%
5.7.1 2,316 99,861 2.32%
5.4.4 1,052 71,304 1.48%
4.9.16 569 58,499 0.97%
5.3.6 856 46,232 1.85%
5.6.3 795 40,345 1.97%
5.2.9 601 39,449 1.52%
5.5.4 443 30,668 1.44%
5.6 606 24,580 2.47%
5.4.5 370 23,259 1.59%
5.1.8 165 20,615 0.80%
4.9.17 172 19,366 0.89%
5.4.2 271 15,531 1.74%
5.3.7 237 15,435 1.54%
5.2.10 208 13,431 1.55%
4.8.15 93 13,349 0.70%
4.7.19 83 13,299 0.62%

There are a lot of pages without a discernible version and 5.1% of those have a WebP image. The most popular detectible version of WordPress is 5.7 and 2.4% of those 343K pages have a WebP image.

If we remove the version grouping and look at any WordPress page, WebP adoption is at 3.1%.


Here’s a slight modification to look at the median image bytes per page segmented by those with and without any WebP:

SELECT
  has_webp,
  APPROX_QUANTILES(bytesImg, 1000)[OFFSET(500)] / 1024 / 1024 AS median_img_mbytes
FROM (
  SELECT DISTINCT
    url,
    info AS version
  FROM
    `httparchive.technologies.2021_04_01_mobile`
  WHERE
    app = 'WordPress')
JOIN (
  SELECT
    url,
    has_webp,
    bytesImg
  FROM (  
    SELECT
      pageid,
      COUNTIF(ext = 'webp') > 0 AS has_webp
    FROM
      `httparchive.summary_requests.2021_04_01_mobile`
    GROUP BY
      pageid)
    JOIN (
      SELECT
        pageid,
        url,
        bytesImg
      FROM
        `httparchive.summary_pages.2021_04_01_mobile`)
    USING
      (pageid))
USING
  (url)
GROUP BY
  has_webp
has_webp median_img_mbytes
true 0.64
false 1.10

So the median WordPress page with any WebP contains 0.64 MB of total images while the median page without any WebP contains 1.1 MB of total images. Nice to see that difference!

And as a median percent of all bytes:

SELECT
  has_webp,
  APPROX_QUANTILES(pct_img_bytes, 1000)[OFFSET(500)] AS median_pct_img_bytes
FROM (
  SELECT DISTINCT
    url,
    info AS version
  FROM
    `httparchive.technologies.2021_04_01_mobile`
  WHERE
    app = 'WordPress')
JOIN (
  SELECT
    url,
    has_webp,
    pct_img_bytes
  FROM (  
    SELECT
      pageid,
      COUNTIF(ext = 'webp') > 0 AS has_webp
    FROM
      `httparchive.summary_requests.2021_04_01_mobile`
    GROUP BY
      pageid)
    JOIN (
      SELECT
        pageid,
        url,
        SAFE_DIVIDE(bytesImg, bytesTotal) AS pct_img_bytes
      FROM
        `httparchive.summary_pages.2021_04_01_mobile`)
    USING
      (pageid))
USING
  (url)
GROUP BY
  has_webp
has_webp median_pct_img_bytes
true 39.7%
false 55.1%

Similarly, WordPress sites with WebP tend to have fewer image bytes in relation to all of a page’s bytes.


Here’s a look at the median lab-based LCP for pages that use WebP vs don’t:

Beware: this query is slow and expensive

SELECT
  has_webp,
  APPROX_QUANTILES(lcp, 1000)[OFFSET(500)] AS median_lcp
FROM (
  SELECT DISTINCT
    url,
    info AS version
  FROM
    `httparchive.technologies.2021_04_01_mobile`
  WHERE
    app = 'WordPress')
JOIN (
  SELECT
    url,
    has_webp
  FROM (  
    SELECT
      pageid,
      COUNTIF(ext = 'webp') > 0 AS has_webp
    FROM
      `httparchive.summary_requests.2021_04_01_mobile`
    GROUP BY
      pageid)
    JOIN (
      SELECT
        pageid,
        url
      FROM
        `httparchive.summary_pages.2021_04_01_mobile`)
    USING
      (pageid))
USING
  (url)
JOIN (
  SELECT
    url,
    CAST(JSON_EXTRACT_SCALAR(payload, "$['_chromeUserTiming.LargestContentfulPaint']") AS INT64) / 1000 AS lcp
  FROM
    `httparchive.pages.2021_04_01_mobile`)
USING
  (url)
GROUP BY
  has_webp
has_webp median_lcp
true 7.944
false 9.046

So the median WordPress page with WebP has a lab-based LCP of 7.9s while the median WordPress page without any WebP has a lab-based LCP of 9.0s. Emphasis on lab-based LCP because these are mobile tests that are throttled. The exact values don’t matter much but it’s still valid for us to compare which one is bigger.

I know @patmeenan has been working on integrating page-level CrUX API data into the results but I don’t think that’s available in this crawl yet. But soon we should also be able to look at the real-user LCP experiences and segment by WebP adoption.


I’d love to add something to HTTP Archive: State of Images to track WebP adoption. We can segment by the WordPress lens, although that doesn’t give us version-level granularity.

Thanks Rick, these are great!

The first chart of existing user per version is interesting because currently users need to install a plugin to get WebP support, so its impressive to see how many sites already use it! This should show a bump up in 5.8 when WebP becomes usable without a plugin.

I’d love to add something to HTTP Archive: State of Images

That sounds great, I would be happy to contribute here if I can.

One improvement might be to truncate the WordPress minor version:

SELECT
  version,
  COUNTIF(has_webp) AS pages_with_webp,
  COUNT(0) AS pages,
  COUNTIF(has_webp) / COUNT(0) AS pct_webp
FROM (
  SELECT DISTINCT
    url,
    CONCAT( SPLIT( info, '.' )[SAFE_OFFSET(0)], '.', SPLIT( info, '.' )[SAFE_OFFSET(1)] )  AS version
  FROM
    `httparchive.technologies.2021_04_01_mobile`
  WHERE
    app = 'WordPress')
JOIN (
  SELECT
    url,
    has_webp
  FROM (  
    SELECT
      pageid,
      COUNTIF(ext = 'webp') > 0 AS has_webp
    FROM
      `httparchive.summary_requests.2021_04_01_mobile`
    GROUP BY
      pageid)
    JOIN (
      SELECT
        pageid,
        url
      FROM
        `httparchive.summary_pages.2021_04_01_mobile`)
    USING
      (pageid))
USING
  (url)
GROUP BY
  version
ORDER BY
  pages_with_webp DESC

Good idea. A bit cleaner with regular expressions:

REGEXP_EXTRACT(info, r'(\d+\.\d+)') AS version
version pages_with_webp pages pct_webp
48,660 993,385 4.9%
5.7 10,528 442,970 2.4%
5.6 5,725 263,756 2.2%
5.5 2,576 157,366 1.6%
5.4 1,893 119,875 1.6%
5.3 1,426 79,046 1.8%
4.9 1,217 115,147 1.1%
5.2 1,158 69,122 1.7%
5.1 316 35,164 0.9%
5.0 227 21,417 1.1%
4.8 197 26,406 0.7%
4.7 191 26,429 0.7%
4.6 101 10,022 1.0%
4.5 62 8,920 0.7%
4.4 46 9,060 0.5%
1.0 0 101 0.0%

Seems like this approach results in a few more null versions, maybe those are major-only version numbers that can’t be parsed?

A bit cleaner with regular expressions

Nice, I suspected there would be a cleaner approach!