I was running an audit of a Japanese website today and was surprised to see that it loads over 1 MB of fonts. Here’s the content breakdown provided by WebPageTest:
MIME Type | Bytes | Uncompressed |
---|---|---|
image | 5,437,450 | 5,437,450 |
font | 1,025,064 | 1,025,064 |
js | 287,016 | 313,603 |
css | 143,103 | 143,103 |
html | 15,828 | 15,048 |
other | 13,882 | 13,882 |
Yeah, images are the biggest issue (over 5 MB!), but this post is specifically about the fonts. Here’s the info we have about each font file:
Resource | Bytes Downloaded |
---|---|
…/fonts/NotoSansCJKjp-DemiLight.woff2 | 483.8 KB |
…/fonts/NotoSansCJKjp-Medium.woff2 | 487.2 KB |
https://fonts.gstatic.com/s/roboto/v18/...qEu92Fr1Mu4mxK.woff2 | 15.0 KB |
https://fonts.gstatic.com/s/roboto/v18/...2Fr1MmWUlfBBc4.woff2 | 15.1 KB |
So there were two Roboto files loaded from the Google Fonts CDN for a total of only 30 KB. The big issue is the Noto Sans CJK files.
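To put numbers on that, here is a quick Python tally of the four font files. The sizes are copied from the table above; the family labels and the helper name `kb_by_family` are my own additions for grouping, not something WebPageTest reports.

```python
# Sizes in KB, copied from the font table above.
# The family labels are added here only to group the files.
FONT_FILES = [
    ("Noto Sans CJK JP", 483.8),  # NotoSansCJKjp-DemiLight.woff2
    ("Noto Sans CJK JP", 487.2),  # NotoSansCJKjp-Medium.woff2
    ("Roboto", 15.0),
    ("Roboto", 15.1),
]


def kb_by_family(files):
    """Sum the downloaded KB for each font family."""
    totals = {}
    for family, kb in files:
        totals[family] = totals.get(family, 0.0) + kb
    return {family: round(kb, 1) for family, kb in totals.items()}


totals = kb_by_family(FONT_FILES)
total_kb = round(sum(totals.values()), 1)
noto_share = round(100 * totals["Noto Sans CJK JP"] / total_kb, 1)

print(totals)      # KB per family
print(total_kb)    # KB across all four files
print(noto_share)  # Noto's share of font KB, in percent
```

The two Noto files add up to 971 KB, roughly 97% of the font weight. The overall total of about 1,001 KB also lines up with the 1,025,064 font bytes in the content breakdown when divided by 1,024.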
Noto Sans CJK and Noto Serif CJK comprehensively cover Simplified Chinese, Traditional Chinese, Japanese, and Korean in a unified font family. This includes the full coverage of CJK Ideographs with variation support for 4 regions, Kangxi radicals, Japanese Kana, Korean Hangul, and other CJK symbols and letters in the Basic Multilingual Plane of Unicode. It also provides limited coverage of CJK Ideographs in Plane 2 of Unicode as necessary to support standards from China and Japan.
Also:
…be aware that the web latency for large fonts, such as for Noto Sans CJK, can be large.
In a State of the Web episode on web fonts with my guest Dave Crossland, Dave talked about the challenges of loading CJK fonts and how they can be 100x larger than a European font. Here’s the relevant part of the transcript:
The biggest challenge has been for Chinese, Japanese, and Korean fonts. A typical font for Indian languages can maybe be two or three times larger than a European font. But for East Asia, it can be a hundred times bigger.
That was a big wind-up but that leads me to my web transparency question: if CJK fonts are so huge, how does the median number of font KB compare across countries? We should expect to see Chinese, Japanese, and Korean websites have more font bytes, right?
I adapted the following query from the CrUX Cookbook. The country-specific CrUX tables contain origins for popular websites visited by Chrome users. Because the HTTP Archive crawls the home pages of all CrUX origins, we can JOIN these datasets together in BigQuery and answer our web transparency question.
#standardSQL
WITH
  countries AS (
    SELECT *, 'ad' AS country_code, 'Andorra' AS country FROM `chrome-ux-report.country_ad.201903` UNION ALL
    SELECT *, 'ae' AS country_code, 'United Arab Emirates' AS country FROM `chrome-ux-report.country_ae.201903` UNION ALL
    SELECT *, 'af' AS country_code, 'Afghanistan' AS country FROM `chrome-ux-report.country_af.201903` UNION ALL
    SELECT *, 'ag' AS country_code, 'Antigua and Barbuda' AS country FROM `chrome-ux-report.country_ag.201903` UNION ALL
    SELECT *, 'ai' AS country_code, 'Anguilla' AS country FROM `chrome-ux-report.country_ai.201903` UNION ALL
    SELECT *, 'al' AS country_code, 'Albania' AS country FROM `chrome-ux-report.country_al.201903` UNION ALL
    SELECT *, 'am' AS country_code, 'Armenia' AS country FROM `chrome-ux-report.country_am.201903` UNION ALL
    SELECT *, 'ao' AS country_code, 'Angola' AS country FROM `chrome-ux-report.country_ao.201903` UNION ALL
    SELECT *, 'ar' AS country_code, 'Argentina' AS country FROM `chrome-ux-report.country_ar.201903` UNION ALL
    SELECT *, 'as' AS country_code, 'American Samoa' AS country FROM `chrome-ux-report.country_as.201903` UNION ALL
    SELECT *, 'at' AS country_code, 'Austria' AS country FROM `chrome-ux-report.country_at.201903` UNION ALL
    SELECT *, 'au' AS country_code, 'Australia' AS country FROM `chrome-ux-report.country_au.201903` UNION ALL
    # All ~200 countries...
    # See the link to the query below for the unabridged version.
  )

SELECT
  # The wildcard suffix of the HTTP Archive table is the client (desktop or mobile).
  _TABLE_SUFFIX AS client,
  country,
  COUNT(0) AS urls,
  # Approximate median of font bytes per page, converted to KB.
  ROUND(APPROX_QUANTILES(bytesFont, 1001)[OFFSET(501)] / 1024, 2) AS median_font_bytes
FROM
  countries
JOIN
  `httparchive.summary_pages.2019_02_01_*`
ON
  # A CrUX origin has no trailing slash, while the crawled home page URL does.
  CONCAT(origin, '/') = url
GROUP BY
  client,
  country
Here are the results for China, Japan, and Korea:
client | country | median_font_bytes (KB) | urls |
---|---|---|---|
desktop | China | 0 | 23138 |
desktop | Japan | 14.54 | 560364 |
desktop | Korea | 70.35 | 143057 |
mobile | China | 1.17 | 13342 |
mobile | Japan | 10.54 | 653001 |
mobile | Korea | 51.29 | 131356 |
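To make the join in the query a bit more concrete, here is a minimal Python sketch of the same logic, run on made-up rows rather than real CrUX or HTTP Archive data. The origins, byte counts, and the function name `median_font_kb_by_country` are all hypothetical. It also ignores the desktop/mobile split and uses an exact median where the query uses `APPROX_QUANTILES`, so treat it as an illustration only.

```python
from collections import defaultdict
from statistics import median

# Hypothetical rows standing in for the CrUX country tables: (origin, country).
crux_rows = [
    ("https://a.example", "Japan"),
    ("https://b.example", "Japan"),
    ("https://c.example", "Japan"),
    ("https://d.example", "Korea"),
    ("https://e.example", "Korea"),
    ("https://missing.example", "Korea"),  # no crawled page, so the join drops it
]

# Hypothetical rows standing in for summary_pages: home page URL -> bytesFont.
page_font_bytes = {
    "https://a.example/": 0,
    "https://b.example/": 20480,
    "https://c.example/": 10240,
    "https://d.example/": 51200,
    "https://e.example/": 102400,
}


def median_font_kb_by_country(origins, pages):
    """Join origins to home pages on origin + '/', then take the median font KB per country."""
    fonts_by_country = defaultdict(list)
    for origin, country in origins:
        url = origin + "/"  # same key as CONCAT(origin, '/') = url
        if url in pages:    # inner join: unmatched origins are skipped
            fonts_by_country[country].append(pages[url])
    return {
        country: {
            "urls": len(values),
            "median_font_kb": round(median(values) / 1024, 2),
        }
        for country, values in fonts_by_country.items()
    }


result = median_font_kb_by_country(crux_rows, page_font_bytes)
print(result)
```

The point to notice is that an origin only counts toward a country's median if its home page was actually crawled, which is why the `urls` column reflects matched pages rather than every origin in the CrUX table.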
[Figures: Asia Desktop, Asia Mobile]
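So according to the stats in the Page Weight report, the median font weight across all websites is around 100 KB on both desktop and mobile. China and Japan load relatively few font bytes by comparison, with medians of about 0 to 1 KB and 11 to 15 KB respectively, although China’s sample size is much smaller. Interestingly, Korea has more font bytes than the other CJK countries, at 70 KB for desktop and 51 KB for mobile, which is still below the overall median.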
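For a rough sense of scale, this small Python sketch expresses each CJK median as a percentage of that overall figure and compares the sample sizes. The medians and URL counts are copied from the results table above. `GLOBAL_MEDIAN_KB = 100` is only the approximate Page Weight number quoted here, not a precise value, so the percentages are ballpark at best.

```python
# (median font KB, number of URLs), copied from the results table above.
RESULTS = {
    ("desktop", "China"): (0.0, 23138),
    ("desktop", "Japan"): (14.54, 560364),
    ("desktop", "Korea"): (70.35, 143057),
    ("mobile", "China"): (1.17, 13342),
    ("mobile", "Japan"): (10.54, 653001),
    ("mobile", "Korea"): (51.29, 131356),
}

# Assumption: the "around 100 KB" overall median, treated as exactly 100.
GLOBAL_MEDIAN_KB = 100

# Each median as a whole-number percentage of the overall median.
pct_of_global = {
    key: round(100 * kb / GLOBAL_MEDIAN_KB) for key, (kb, _urls) in RESULTS.items()
}

# How many times larger Japan's sample is than China's, per client.
japan_vs_china = {
    client: round(RESULTS[(client, "Japan")][1] / RESULTS[(client, "China")][1], 1)
    for client in ("desktop", "mobile")
}

print(pct_of_global)
print(japan_vs_china)
```

By this rough measure Japan sits at 11 to 15% of the overall median and Korea at 51 to 70%, while Japan's sample is roughly 24 times (desktop) to 49 times (mobile) the size of China's.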
So why would CJK websites load fewer font bytes? Maybe it’s because the web font files are just so prohibitively huge that it’s not even worth it and they rely on system fonts. Korea is known to have some of the fastest internet in the world, so maybe the download cost is more tolerable. Does anyone have any other insights either from the data or real world experience?