Font transfer size & font requests

I was curious what SQL query powers the “trends” data for Font transfer size and Font requests over time? I’d love to be able to play with the query in Google BigQuery.

I believe the Trends page is querying against a stats table that is updated over time - so not directly linked to BigQuery. That may be changing soon, and @rviscomi can confirm. Regardless, you can see the parts of the SELECT clause being used to build this in the source. In particular:

"ROUND(reqFont, 1) as reqFont"
"ROUND(bytesFont/1024) as bytesFont"

A few months ago I was able to create a wildcard query to look at page weight over time - but that query doesn’t seem to be working anymore. @fhoffa - any idea what may have changed?

That said, we can still look at this by UNIONing the stats together in the _pages tables via BigQuery. For example the content in the Fonts trends graph can be recreated with the following standard SQL query. (direct link to it is here too)

SELECT date, fontsKB, numFonts
FROM (
SELECT '2016-12-15' date, ROUND(AVG(bytesFont/1024),1) fontsKB, ROUND(AVG(reqFont),1) numFonts FROM `httparchive.runs.2016_12_15_pages` GROUP BY date UNION ALL
SELECT '2017-01-01' date, ROUND(AVG(bytesFont/1024),1) fontsKB, ROUND(AVG(reqFont),1) numFonts FROM `httparchive.runs.2017_01_01_pages` GROUP BY date UNION ALL
SELECT '2017-01-15' date, ROUND(AVG(bytesFont/1024),1) fontsKB, ROUND(AVG(reqFont),1) numFonts FROM `httparchive.runs.2017_01_15_pages` GROUP BY date UNION ALL
SELECT '2017-02-01' date, ROUND(AVG(bytesFont/1024),1) fontsKB, ROUND(AVG(reqFont),1) numFonts FROM `httparchive.runs.2017_02_01_pages` GROUP BY date UNION ALL
SELECT '2017-02-15' date, ROUND(AVG(bytesFont/1024),1) fontsKB, ROUND(AVG(reqFont),1) numFonts FROM `httparchive.runs.2017_02_15_pages` GROUP BY date UNION ALL
SELECT '2017-03-01' date, ROUND(AVG(bytesFont/1024),1) fontsKB, ROUND(AVG(reqFont),1) numFonts FROM `httparchive.runs.2017_03_01_pages` GROUP BY date UNION ALL
SELECT '2017-03-15' date, ROUND(AVG(bytesFont/1024),1) fontsKB, ROUND(AVG(reqFont),1) numFonts FROM `httparchive.runs.2017_03_15_pages` GROUP BY date UNION ALL
SELECT '2017-04-01' date, ROUND(AVG(bytesFont/1024),1) fontsKB, ROUND(AVG(reqFont),1) numFonts FROM `httparchive.runs.2017_04_01_pages` GROUP BY date UNION ALL
SELECT '2017-04-15' date, ROUND(AVG(bytesFont/1024),1) fontsKB, ROUND(AVG(reqFont),1) numFonts FROM `httparchive.runs.2017_04_15_pages` GROUP BY date UNION ALL
SELECT '2017-05-01' date, ROUND(AVG(bytesFont/1024),1) fontsKB, ROUND(AVG(reqFont),1) numFonts FROM `httparchive.runs.2017_05_01_pages` GROUP BY date UNION ALL
SELECT '2017-05-15' date, ROUND(AVG(bytesFont/1024),1) fontsKB, ROUND(AVG(reqFont),1) numFonts FROM `httparchive.runs.2017_05_15_pages` GROUP BY date UNION ALL
SELECT '2017-06-01' date, ROUND(AVG(bytesFont/1024),1) fontsKB, ROUND(AVG(reqFont),1) numFonts FROM `httparchive.runs.2017_06_01_pages` GROUP BY date UNION ALL
SELECT '2017-06-15' date, ROUND(AVG(bytesFont/1024),1) fontsKB, ROUND(AVG(reqFont),1) numFonts FROM `httparchive.runs.2017_06_15_pages` GROUP BY date UNION ALL
SELECT '2017-07-01' date, ROUND(AVG(bytesFont/1024),1) fontsKB, ROUND(AVG(reqFont),1) numFonts FROM `httparchive.runs.2017_07_01_pages` GROUP BY date UNION ALL
SELECT '2017-07-15' date, ROUND(AVG(bytesFont/1024),1) fontsKB, ROUND(AVG(reqFont),1) numFonts FROM `httparchive.runs.2017_07_15_pages` GROUP BY date UNION ALL
SELECT '2017-08-01' date, ROUND(AVG(bytesFont/1024),1) fontsKB, ROUND(AVG(reqFont),1) numFonts FROM `httparchive.runs.2017_08_01_pages` GROUP BY date UNION ALL
SELECT '2017-08-15' date, ROUND(AVG(bytesFont/1024),1) fontsKB, ROUND(AVG(reqFont),1) numFonts FROM `httparchive.runs.2017_08_15_pages` GROUP BY date UNION ALL
SELECT '2017-09-01' date, ROUND(AVG(bytesFont/1024),1) fontsKB, ROUND(AVG(reqFont),1) numFonts FROM `httparchive.runs.2017_09_01_pages` GROUP BY date UNION ALL
SELECT '2017-09-15' date, ROUND(AVG(bytesFont/1024),1) fontsKB, ROUND(AVG(reqFont),1) numFonts FROM `httparchive.runs.2017_09_15_pages` GROUP BY date UNION ALL
SELECT '2017-10-01' date, ROUND(AVG(bytesFont/1024),1) fontsKB, ROUND(AVG(reqFont),1) numFonts FROM `httparchive.runs.2017_10_01_pages` GROUP BY date UNION ALL
SELECT '2017-10-15' date, ROUND(AVG(bytesFont/1024),1) fontsKB, ROUND(AVG(reqFont),1) numFonts FROM `httparchive.runs.2017_10_15_pages` GROUP BY date UNION ALL
SELECT '2017-11-01' date, ROUND(AVG(bytesFont/1024),1) fontsKB, ROUND(AVG(reqFont),1) numFonts FROM `httparchive.runs.2017_11_01_pages` GROUP BY date
)
ORDER BY date ASC

You may find it interesting to query this data from specific HTTP Archive runs as well. For example, this simple query summarizes the number of fonts per page from the latest HTTP Archive run and can be used to create a histogram. Averages often mislead, as you can see below where that average of 3.1 fonts per page is hiding quite a bit.

-- Standard SQL
SELECT reqFont, count(*) 
FROM `httparchive.runs.2017_11_01_pages`
GROUP BY reqFont
ORDER BY reqFont

image

That’s correct. The beta site is directly linked to BigQuery, so you could reproduce the queries. As an added bonus, the charts include a little menu button that pops open a link to “Show Query”. So for example, the query for HTTP Archive: Page Weight can be found at https://raw.githubusercontent.com/HTTPArchive/beta.httparchive.org/master/sql/timeseries/bytesFont.sql

Wildcard queries that span tables with different schemas behave unpredictably. I’ve started grouping tables with similar schemas into distinct datasets to address this. Not all of these datasets are publicly available (not sure why or if it’s necessary) but httparchive.lighthouse is one public dataset for example. I also need to manually copy tables into their respective datasets for now until I update the dataflow pipeline, so it’s still very much a “beta” feature. Querying tables in those datasets should not have any issues with wildcards.

1 Like