We could query for patterns like <script type="application/ld+json">, e.g.:
body LIKE '%<script type="application/ld+json">%'
The result is 88,467 pages containing the JSON-LD signature. There are obviously several other forms of structured data, but this gives us a rough idea.
Changing the table in the query to 2017_05_15_desktop (one year ago) gives 25,471, so the count has grown roughly 3.5 times in a year. Adoption of structured data is definitely growing, at least for JSON-LD!
With Lighthouse support in HTTP Archive, we will soon be able to more easily query for structured data usage and validity. There is a new audit being developed: https://github.com/GoogleChrome/lighthouse/issues/4359
Hi Rick!
And for the microdata format, we could query something like:
body LIKE '%itemtype="http%://schema.org%'
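One detail worth noting: LIKE compares against the whole string, so the pattern needs a leading % to find the attribute anywhere in the body rather than only at its very start. Here is a minimal Python sketch of that behaviour. The helper names `like_to_regex` and `sql_like` are made up for illustration, and the translation is a simplified model of LIKE (no escape-character support), not BigQuery's actual implementation.

```python
import re


def like_to_regex(pattern):
    """Translate a SQL LIKE pattern into a compiled regex (simplified model)."""
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")   # % matches any run of characters, including none
        elif ch == "_":
            parts.append(".")    # _ matches exactly one character
        else:
            parts.append(re.escape(ch))
    # DOTALL so that % can also span newlines in a response body
    return re.compile("".join(parts), re.DOTALL)


def sql_like(value, pattern):
    """LIKE must match the entire value, hence fullmatch rather than search."""
    return like_to_regex(pattern).fullmatch(value) is not None


# Hypothetical response body used only for this illustration
body = '<html><body><div itemscope itemtype="http://schema.org/Product"></div></body></html>'

without_leading_wildcard = 'itemtype="http%://schema.org%'
with_leading_wildcard = '%itemtype="http%://schema.org%'

print(sql_like(body, without_leading_wildcard))  # False: body does not start with itemtype
print(sql_like(body, with_leading_wildcard))     # True
```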
Ah good idea. Modified query:
SUM(IF(body LIKE '%<script type="application/ld+json">%', 1, 0)) AS jsonld,
SUM(IF(REGEXP_CONTAINS(body, 'itemtype=[\'"]?https?://schema.org'), 1, 0)) AS microdata
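To sanity-check the two patterns before running them over a full table, here is a rough local equivalent in Python. The sample bodies and the `count_formats` helper are invented for illustration; only the substring and the regular expression come from the query above.

```python
import re

# Same signatures as in the query: a literal substring for JSON-LD and
# a regex for microdata (optional quote, http or https).
JSONLD_MARKER = '<script type="application/ld+json">'
MICRODATA_RE = re.compile(r"""itemtype=['"]?https?://schema.org""")


def count_formats(bodies):
    """Count how many bodies contain each structured data signature."""
    jsonld = sum(1 for body in bodies if JSONLD_MARKER in body)
    microdata = sum(1 for body in bodies if MICRODATA_RE.search(body))
    return {"jsonld": jsonld, "microdata": microdata}


# Hypothetical response bodies, not real HTTP Archive data
sample_bodies = [
    '<script type="application/ld+json">{"@type": "Article"}</script>',
    '<div itemscope itemtype="http://schema.org/Product"></div>',
    "<div itemscope itemtype='https://schema.org/Person'></div>",
    "<div itemscope itemtype=https://schema.org/Event></div>",
    "<p>No structured data here.</p>",
]

print(count_formats(sample_bodies))  # {'jsonld': 1, 'microdata': 3}
```

As in the SQL version, the JSON-LD check is an exact substring match, so a script tag with extra attributes or single quotes would be missed.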
And the 2017 equivalent:
So we could say that both formats are growing rapidly and that microdata is the more popular format.
Nice! Are we counting per origin or per document here?
The query is a bit lazy and searches through non-HTML resources as well, but the likelihood of the patterns matching there is low, so we could simplify by saying “per document”. But keep in mind that a page can load multiple documents. For example, if the page has no structured data but embeds an iframe that does, it’d count as 1 detection. And if 1000 pages embed the same iframe (Facebook like button, etc.), it’d be counted 1000 times.
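A toy Python example of that caveat, using made-up pages and bodies. It assumes the first body in each list is the page's main document, which is a simplification for the sketch and not how the HTTP Archive tables are laid out.

```python
MARKER = '<script type="application/ld+json">'

# A shared third-party widget that carries JSON-LD (hypothetical)
widget = MARKER + '{"@type": "Thing"}</script>'

# Hypothetical data: each page maps to the response bodies it loaded,
# with the main document first (an assumption made for this sketch).
pages = {
    "https://a.example/": ["<p>plain page</p>", widget],
    "https://b.example/": ["<p>another plain page</p>", widget],
    "https://c.example/": [MARKER + '{"@type": "Article"}</script>'],
}

# What the query effectively counts: every matching response body
per_body_hits = sum(
    MARKER in body for bodies in pages.values() for body in bodies
)

# Pages whose own main document contains the markup
pages_with_own_markup = sum(MARKER in bodies[0] for bodies in pages.values())

print(per_body_hits)          # 3
print(pages_with_own_markup)  # 1
```

The shared widget is counted once for every page that embeds it, which is why the per-body total can overstate how many sites publish structured data themselves.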