Agreed.
I found a bug in Rick’s original query above. We need to JSON.parse
the string or it’s just an array of letters and returns true for any non-null values.
I.e.
{"json_ld":false,"microdata":false...}
Is stringified to the following string array:
[{,",j,s,o,n,_,l,d,",:,f,a,l,s,e,,,",m,i,c,r,o,d,a,t,a,",:,f,a,l,s,e,...}]
which always returns True for some(i => i)
except when the string is null
(i.e. when the custom metric failed completely).
Using JSON.parse
to convert the passed in string back to an Object to allow Object.values
to work properly works:
CREATE TEMP FUNCTION HAS_STRUCTURED_DATA(data STRING) RETURNS BOOLEAN LANGUAGE js AS '''
try {
return Object.values(JSON.parse(data)).some(i => i);
} catch {
return false;
}
''';
WITH sd AS (
SELECT
client,
root_page,
JSON_QUERY(custom_metrics, '$.structured-data.structured_data.rendered.present') AS data,
HAS_STRUCTURED_DATA(JSON_QUERY(custom_metrics, '$.structured-data.structured_data.rendered.present')) AS has_structured_data
FROM
`httparchive.all.pages`
WHERE
date = '2023-10-01'
)
SELECT
client,
COUNT(DISTINCT IF(has_structured_data, root_page, NULL)) / COUNT(DISTINCT root_page) AS pct
FROM
sd
GROUP BY
client
It gives the following results, which seems much more reasonable and in keeping with the Web Almanac’s split out results:
client | pct |
---|---|
desktop | 71.8% |
mobile | 72.6% |
Still trying to recall what that really common was that I was thinking about and whether we fixed it or not… <meta... property="...">
might have been it (in which case it’s still there, and it’s correct that it is) but thought it was something even more common than that (most regularly used <meta>
elements use name
and content
attributes rather than property
).