I am building a talk on Images, and I (of course) begin with the HTTPArchive pie chart breaking down the number of image requests on mobile:
I wondered to myself, "How many of these requests are really images vs. tracking beacons?" In this study, I assume that any image under 35 bytes is a tracking beacon. I know that will cut out some actual images, and may not actually remove all the tracking beacons. But it is a decent first approximation. If there are suggestions on how to improve this, please comment below.
Let’s dive in:
SELECT
  SUM(gif) GifCnt,
  SUM(jpg) jpgCnt,
  SUM(png) PngCnt,
  SUM(svg) SvgCnt,
  SUM(webp) WebpCnt,
  SUM(respSize) imgbytes,
  SUM(IF(gif>0,respSize,0)) GifBytes,
  SUM(IF(jpg>0,respSize,0)) JpgBytes,
  SUM(IF(png>0,respSize,0)) PngBytes,
  SUM(IF(svg>0,respSize,0)) SvgBytes,
  SUM(IF(webp>0,respSize,0)) WebpBytes
FROM (
  SELECT
    type,
    format,
    respSize,
    IF(format CONTAINS 'gif', 1,0) gif,
    IF(format CONTAINS 'jp', 1,0) jpg,
    IF(format CONTAINS 'png', 1,0) png,
    IF(format CONTAINS 'svg', 1,0) svg,
    IF(format CONTAINS 'webp', 1,0) webp
  FROM httparchive.runs.2018_02_15_requests_mobile
  WHERE type CONTAINS 'image'
  AND respSize > 35)
If you run this without the last condition (AND respSize > 35), keeping the closing parenthesis, you get the count of all the images and the byte count of all the images. Running it with that condition, you get only the “big images.” Putting both results into a spreadsheet, you can calculate the tracker count for each format by subtracting the big-image count from the total count.
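The spreadsheet step can also be sketched in a few lines of Python. This is only an illustration of the subtraction described above: the function name `tracker_stats` is my own, and the counts are made-up placeholders, not the values returned by the query.

```python
def tracker_stats(all_counts, big_counts):
    """For each format, estimate possible trackers as (all images - big images)."""
    stats = {}
    for fmt, total in all_counts.items():
        big = big_counts.get(fmt, 0)
        small = total - big  # images at or below the size cutoff
        share = small / total if total else 0.0
        stats[fmt] = {"trackers": small, "share": share}
    return stats


# Placeholder counts (NOT real HTTP Archive data): one dict per query run.
all_counts = {"gif": 1000, "jpg": 2000, "png": 1500}  # without the size filter
big_counts = {"gif": 800, "jpg": 1990, "png": 1470}   # with respSize > 35

for fmt, row in tracker_stats(all_counts, big_counts).items():
    print(f"{fmt}: {row['trackers']} possible trackers ({row['share']:.1%})")
```

Swapping in the real counts from the two query runs gives the per-format tracker share.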
21% of GIFs are under 35 bytes and are possibly trackers.
We can now rebuild the HTTPArchive chart:
We see the share of GIF images drops by 4 percentage points, and PNGs and JPGs each gain 2 percentage points of the total number of requests. All in all, the HTTPArchive chart at the top is still very close to the numbers I calculated, but perhaps overestimates GIFs slightly.
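For anyone who wants to redo the comparison, here is a rough Python sketch of how each format's share of requests shifts once the tiny images are removed. The helper names (`format_shares`, `share_shift`) and the counts are hypothetical placeholders, not the numbers behind the chart.

```python
def format_shares(counts):
    """Return each format's share of the total request count, in percent."""
    total = sum(counts.values())
    if total == 0:
        return {fmt: 0.0 for fmt in counts}
    return {fmt: 100 * n / total for fmt, n in counts.items()}


def share_shift(all_counts, big_counts):
    """Change in share (percentage points) after dropping the tiny images."""
    before = format_shares(all_counts)
    after = format_shares(big_counts)
    return {fmt: after.get(fmt, 0.0) - before[fmt] for fmt in before}


# Placeholder counts (NOT real HTTP Archive data).
all_counts = {"gif": 300, "jpg": 400, "png": 300}
big_counts = {"gif": 200, "jpg": 400, "png": 300}

for fmt, delta in share_shift(all_counts, big_counts).items():
    print(f"{fmt}: {delta:+.1f} percentage points")
```

Because the shares always sum to 100%, whatever one format loses the others gain, which is why a drop for GIFs shows up as a rise for JPGs and PNGs.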