What percent of WordPress websites contain structured data?

Related to structured data adoption, I’d like to zoom in on WordPress specifically. It’s the most popular CMS, and its site owners are not necessarily hand-coding their own websites, so this is a good measure of what the platform can do with native or plugin support.

Borrowing from one of the custom metrics created by @Tiggerito for the SEO chapter of the 2020 Web Almanac, we can write a query that encompasses Microformats2, Microdata, JSON-LD, and RDFa formats.

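CREATE TEMP FUNCTION hasStructuredData(payload STRING) RETURNS BOOL LANGUAGE js AS '''
try {
  var $ = JSON.parse(payload);
  // _wpt_bodies is a JSON string nested inside the payload, so parse it again.
  var wpt_bodies = JSON.parse($._wpt_bodies);
  // Sum the item counts across all formats; any nonzero total counts as structured data.
  return Object.values(wpt_bodies.structured_data.rendered.items_by_format).reduce((sum, i) => sum + i, 0) > 0;
} catch (e) {
  // A missing or malformed payload is treated as having no structured data.
  return false;
}
''';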

SELECT
  COUNT(DISTINCT IF(hasStructuredData(payload), url, NULL)) AS freq,
  COUNT(DISTINCT url) AS total,
  COUNT(DISTINCT IF(hasStructuredData(payload), url, NULL)) / COUNT(DISTINCT url) AS pct
FROM
  `httparchive.technologies.2020_08_01_desktop`
JOIN
  `httparchive.pages.2020_08_01_desktop`
USING
  (url)
WHERE
  app = 'WordPress'

| freq | total | pct |
|---|---|---|
| 1,284,919 | 1,754,704 | 73.23% |

The results show that 73% of WordPress pages (on desktop) contain some kind of structured data!
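To make the UDF's logic easier to follow outside BigQuery, here is a minimal standalone sketch of the same check as a plain JavaScript function. The `mockPayload` helper and the format keys and counts it is given are made up for illustration; real payloads contain far more fields.

```javascript
// Same logic as the body of the hasStructuredData UDF, as a plain function.
function hasStructuredData(payload) {
  try {
    const $ = JSON.parse(payload);
    // _wpt_bodies is itself a JSON string nested inside the payload.
    const wptBodies = JSON.parse($._wpt_bodies);
    const counts = Object.values(wptBodies.structured_data.rendered.items_by_format);
    // Any nonzero total across formats counts as "has structured data".
    return counts.reduce((sum, i) => sum + i, 0) > 0;
  } catch (e) {
    // Missing or malformed data is treated as no structured data.
    return false;
  }
}

// Hypothetical helper: builds a double-encoded payload like the one the UDF expects.
function mockPayload(itemsByFormat) {
  const wptBodies = { structured_data: { rendered: { items_by_format: itemsByFormat } } };
  return JSON.stringify({ _wpt_bodies: JSON.stringify(wptBodies) });
}

// Made-up counts for illustration only.
const withJsonLd = mockPayload({ microformats2: 0, microdata: 0, jsonld: 2, rdfa: 0 });
const withNothing = mockPayload({ microformats2: 0, microdata: 0, jsonld: 0, rdfa: 0 });

console.log(hasStructuredData(withJsonLd));  // true
console.log(hasStructuredData(withNothing)); // false
console.log(hasStructuredData('not json'));  // false
console.log(hasStructuredData('{}'));        // false
```

Note that the `catch` branch means pages whose payload lacks the custom metric are counted as not having structured data, rather than being excluded from the total.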

Yoast is a popular WordPress plugin that offers SEO functionality like structured data markup.

Let’s modify the previous query to segment by WordPress pages with and without the Yoast plugin.

CREATE TEMP FUNCTION hasStructuredData(payload STRING) RETURNS BOOL LANGUAGE js AS '''
try {
  var $ = JSON.parse(payload);
  var wpt_bodies = JSON.parse($._wpt_bodies);
  return Object.values(wpt_bodies.structured_data.rendered.items_by_format).reduce((sum, i) => sum + i, 0) > 0;
} catch (e) {
  return false;
}
''';

SELECT
  'Yoast SEO' IN UNNEST(apps) AS has_yoast,
  COUNT(DISTINCT IF(hasStructuredData(payload), url, NULL)) AS freq,
  COUNT(DISTINCT url) AS total,
  COUNT(DISTINCT IF(hasStructuredData(payload), url, NULL)) / COUNT(DISTINCT url) AS pct
FROM
  -- Collapse technologies to one row per URL, with an array of every app detected on it.
  (SELECT url, ARRAY_AGG(app) AS apps FROM `httparchive.technologies.2020_08_01_desktop` GROUP BY url)
JOIN
  `httparchive.pages.2020_08_01_desktop`
USING
  (url)
WHERE
  'WordPress' IN UNNEST(apps)
GROUP BY
  has_yoast

| has_yoast | freq | total | pct |
|---|---|---|---|
| TRUE | 645,305 | 650,202 | 99.25% |
| FALSE | 639,614 | 1,104,502 | 57.91% |

Surprisingly, almost all (99%) WordPress pages with Yoast installed use at least one of the structured data formats. These results also show that Yoast is found on 37% of all WordPress pages. Only 58% of pages without Yoast use structured data, so Yoast appears to be a big driver of adoption.
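The percentages quoted here can be rechecked from the two result rows. This is just a small arithmetic sketch in JavaScript using the counts copied from the results; the variable and helper names are illustrative.

```javascript
// Counts copied from the has_yoast result rows.
const withYoast = { freq: 645305, total: 650202 };
const withoutYoast = { freq: 639614, total: 1104502 };

// Format a ratio as a percentage with two decimals.
const pct = (n, d) => (100 * n / d).toFixed(2) + '%';

const allWordPress = withYoast.total + withoutYoast.total;

console.log(allWordPress);                                           // 1754704
console.log(pct(withYoast.total, allWordPress));                     // 37.05% (share of WordPress pages with Yoast)
console.log(pct(withYoast.freq, withYoast.total));                   // 99.25%
console.log(pct(withoutYoast.freq, withoutYoast.total));             // 57.91%
console.log(pct(withYoast.freq + withoutYoast.freq, allWordPress));  // 73.23% (matches the first query)
```

The two segments add back up to the totals from the first query, which is a useful sanity check that the segmentation did not drop or double-count any pages.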


Cool, that means my custom metric worked :partying_face:

This is just for home pages, so it is mainly picking up things like WebSite markup that indicates how the internal search works. My custom metric also includes a string array of the types discovered (jsonld_and_microdata_types).
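A rough sketch of how those types might be read out of a payload, written as plain JavaScript that could be adapted into a UDF. Where the array lives is an assumption: this guesses that `jsonld_and_microdata_types` sits next to `items_by_format` under `structured_data.rendered`, which should be checked against a real payload. The function name `getStructuredDataTypes` and the sample data are hypothetical.

```javascript
// Assumption: the types array sits under structured_data.rendered,
// alongside items_by_format. Verify against a real payload before relying on it.
function getStructuredDataTypes(payload) {
  try {
    const $ = JSON.parse(payload);
    const wptBodies = JSON.parse($._wpt_bodies);
    const types = wptBodies.structured_data.rendered.jsonld_and_microdata_types;
    return Array.isArray(types) ? types : [];
  } catch (e) {
    // Missing or malformed data yields no types.
    return [];
  }
}

// Made-up payload for illustration only.
const sample = JSON.stringify({
  _wpt_bodies: JSON.stringify({
    structured_data: { rendered: { jsonld_and_microdata_types: ['WebSite', 'Organization'] } }
  })
});

console.log(getStructuredDataTypes(sample)); // [ 'WebSite', 'Organization' ]
console.log(getStructuredDataTypes('{}'));   // []
```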

Interesting about the low adoption of home page structured data for sites not using Yoast. I know there are other popular SD plugins out there.
