Internet Explorer emits a handy X-Download-Initiator header that describes the type of resource being fetched, what triggered the fetch, and some additional metadata (see the MSDN blog post). With a bit of regex gymnastics, we can figure out how and why the image requests are initiated…
SELECT initiator, reason, cnt,
  ROUND(ratio*100,2) AS percent,
  ROUND(byteRatio*100,2) AS bytePercent
FROM (
  -- Share of requests and of response bytes per (initiator, reason) pair.
  SELECT initiator, reason,
    COUNT(initiator) AS cnt, RATIO_TO_REPORT(cnt) OVER() AS ratio,
    SUM(respBodySize) AS bodySize, RATIO_TO_REPORT(bodySize) OVER() AS byteRatio
  FROM (
    -- Keep image requests only and reduce the reason to its first word.
    SELECT reqOtherHeaders, respBodySize, type, initiator,
      REGEXP_EXTRACT(reason, r'(\w+)') AS reason
    FROM (
      -- Pull the resource type, initiator, and reason out of the raw header string.
      SELECT url, reqOtherHeaders, respBodySize,
        REGEXP_EXTRACT(reqOtherHeaders, r'X-Download-Initiator\s=\s(\w+)') AS type,
        REGEXP_EXTRACT(reqOtherHeaders, r'"doc[^;]*;([^;]*);') AS initiator,
        REGEXP_EXTRACT(reqOtherHeaders, r'"doc[^;]*;[^;]*;([^;]*)') AS reason
      FROM [httparchive:runs.2015_04_01_requests]
    ) WHERE type = 'image'
  )
  GROUP BY initiator, reason
  HAVING cnt > 100
)
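To make the regex steps easier to follow, here is a minimal Python sketch that applies the same three patterns to a single header string and then computes the per-group shares. It is an illustration only, not how the numbers in this post were produced. The helper names (`first_group`, `classify`, `summarize`) and the `SAMPLE_ROWS` strings are hypothetical: the sample values are made up to satisfy the patterns, and real X-Download-Initiator values may look different. The sketch also assumes that the ratios are computed over the groups that survive the count filter.

```python
import re
from collections import defaultdict

# Same patterns as in the query above, compiled for Python's re module.
TYPE_RE = re.compile(r'X-Download-Initiator\s=\s(\w+)')
INITIATOR_RE = re.compile(r'"doc[^;]*;([^;]*);')
REASON_RE = re.compile(r'"doc[^;]*;[^;]*;([^;]*)')
WORD_RE = re.compile(r'(\w+)')


def first_group(pattern, text):
    """Mimic REGEXP_EXTRACT: return capture group 1, or None if nothing matches."""
    match = pattern.search(text)
    return match.group(1) if match else None


def classify(headers):
    """Return (type, initiator, reason) for one reqOtherHeaders-style string."""
    kind = first_group(TYPE_RE, headers)
    initiator = first_group(INITIATOR_RE, headers)
    raw_reason = first_group(REASON_RE, headers)
    # The query keeps only the first word of the reason field.
    reason = first_group(WORD_RE, raw_reason) if raw_reason is not None else None
    return kind, initiator, reason


def summarize(rows, min_count=0):
    """Group image requests by (initiator, reason).

    Returns {(initiator, reason): (count, percent of requests, percent of bytes)}.
    Assumption: shares are computed over the groups with count > min_count.
    """
    counts = defaultdict(int)
    sizes = defaultdict(int)
    for headers, body_size in rows:
        kind, initiator, reason = classify(headers)
        if kind != 'image':
            continue
        key = (initiator, reason)
        if initiator is not None:  # COUNT(initiator) ignores NULLs
            counts[key] += 1
        sizes[key] += body_size
    kept = [key for key in sizes if counts[key] > min_count]
    total_count = sum(counts[key] for key in kept)
    total_bytes = sum(sizes[key] for key in kept)
    return {
        key: (
            counts[key],
            round(100 * counts[key] / total_count, 2),
            round(100 * sizes[key] / total_bytes, 2) if total_bytes else 0.0,
        )
        for key in kept
    }


# Hypothetical rows: (header string, response body size in bytes).
SAMPLE_ROWS = [
    ('X-Download-Initiator = image; "doc top"; speculative; "html lookahead"', 3000),
    ('X-Download-Initiator = image; "doc top"; speculative; "html lookahead"', 1000),
    ('X-Download-Initiator = image; "doc top"; docparser; "img src"', 2000),
    ('X-Download-Initiator = script; "doc top"; docparser; "script src"', 9000),
]

if __name__ == '__main__':
    for key, stats in summarize(SAMPLE_ROWS).items():
        print(key, stats)
```

Note that the captured initiator keeps the whitespace that follows the semicolon, because the patterns capture everything between the two delimiters; trimming it would be a reasonable extra step before grouping.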
Results for the “desktop” run of April 1st, 2015:
- ~43% of image fetches are initiated by the speculative HTML scanner, and they account for ~50% of transferred bytes.
- ~37% of image fetches are initiated by the document parser, i.e. by parsing the src attribute of an img tag.
- ~20% of image fetches are initiated via CSS (“background-image”).
In total, this means ~80% of images are declared in the HTML markup, and they amount to ~84% of transferred bytes; the remaining ~20% are declared via CSS. That said, a few caveats apply to these numbers:
- These numbers cover the initial page load only and don’t account for images fetched later when, for example, a script injects an image based on user input, or some CSS rule is activated with a new background-image.
- This does not account for XHR-fetched images.