Was having a discussion with Scott Jehl on Twitter: https://twitter.com/tunetheweb/status/1264162826123427843?s=20
He has, what I thought was a well-known method of loading low-priority CSS asynchronously (since the async
attribute is not natively supported on link elements) basically like this:
<link rel="stylesheet" media="print" href="mystyles.css" onload="this.media='all'">
I was curious how often this was used and knocked up this query: https://gist.github.com/bazzadp/0b894e7e645e077431c67dc744431224
#standardSQL
# Desktop pages that include link[rel=stylesheet][media=print][onload="*this.media*"]
# Warning this uses 250GB of BigQuery usage so can be expensive to run multiple times!
CREATE TEMP FUNCTION hasPrintCSSWithOnload(payload STRING)
RETURNS BOOLEAN LANGUAGE js AS '''
try {
var $ = JSON.parse(payload);
var almanac = JSON.parse($._almanac);
return !!almanac['link-nodes'].find(
e => e.rel.toLowerCase() == 'stylesheet' && e.media.toLowerCase() == 'print' && e.onload.toLowerCase().includes('this.media')
);
} catch (e) {
return false;
}
''';
SELECT
url
FROM
`httparchive.pages.2020_05_01_desktop`
WHERE
hasPrintCSSWithOnload(payload) = true
This showed it wasn’t actually used that much (only 14 sites on desktop!).
However there are a few variants of this such as:
<link rel="preload" href="mystyles.css" onload="this.rel='stylesheet'">
So what I really want is to return counts of distinct rel
, media
and onload
attributes for any page with a <link>
element with a href
attribute containing .css
to analyse this data.
So I want a table of results like this:
rel | media | onload | count | percentage |
---|---|---|---|---|
spreadsheet | this.media=‘all’ | 5 | 0.05% | |
preload | this.rel=‘spreadhseet’ | 10 | 0.10% | |
spreadsheet | 5349874 | 100% |
I tried amending this query as looks close to what I want https://github.com/HTTPArchive/almanac.httparchive.org/blob/master/sql/2019/19_Resource_Hints/19_03.sql but am not getting very far so thought I’d ask for help rather than wasting all my BigQuery budget
Any ideas?