Images: tracking beacon or actual image?



I am building a talk on Images, and I (of course) begin with the HTTPArchive pie chart breaking down the number of image requests on mobile:
[Image: HTTPArchive pie chart of image requests on mobile]

But then I remembered a thread with @rviscomi and @HenriHelvetica on Twitter.

I wondered to myself, "How many of these requests are really images vs. tracking beacons?" In this study, I assume that any image of 35 bytes or less is a tracking beacon. I know that will cut out some actual images, and may not remove all the tracking beacons. But it is a decent first approximation :) If there are suggestions on how to improve this, please comment below.

Let’s dive in:

SELECT
  SUM(gif) GifCnt,
  SUM(jpg) jpgCnt,
  SUM(png) PngCnt,
  SUM(svg) SvgCnt,
  SUM(webp) WebpCnt,
  SUM(respSize) imgbytes,
  SUM(IF(gif>0,respSize,0)) GifBytes,
  SUM(IF(jpg>0,respSize,0)) JpgBytes,
  SUM(IF(png>0,respSize,0)) PngBytes,
  SUM(IF(svg>0,respSize,0)) SvgBytes,
  SUM(IF(webp>0,respSize,0)) WebpBytes,
FROM (
  SELECT
    type,
    format,
    respSize,
    IF(format CONTAINS 'gif', 1,0) gif,
    IF(format CONTAINS 'jp', 1,0) jpg,
    IF(format CONTAINS 'png', 1,0) png,
    IF(format CONTAINS 'svg', 1,0) svg,
    IF(format CONTAINS 'webp', 1,0) webp,
  FROM
    httparchive.runs.2018_02_15_requests_mobile
  WHERE
    type CONTAINS 'image'
    AND respSize > 35)

If you run this without the last condition (AND respSize > 35), you get the request counts and byte counts for all of the images. Run it with that condition, and you get only the “big images.” Putting both sets of results into a spreadsheet, you can calculate the tracker count for each format by subtracting the big-image count from the total count.
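
The spreadsheet step boils down to one subtraction per format. Here is a minimal Python sketch of that arithmetic, assuming you have copied the counts from both query runs into two dicts. The function name `tracker_stats` and the numbers are made up for illustration; they are not HTTPArchive results.

```python
def tracker_stats(all_counts, big_counts):
    """Return {format: (tracker_count, tracker_share)} for each format.

    all_counts: counts from the query run WITHOUT `AND respSize > 35`
    big_counts: counts from the query run WITH that condition
    """
    stats = {}
    for fmt, total in all_counts.items():
        # Images that vanish once the size filter is applied are the
        # possible trackers (35 bytes or less).
        trackers = total - big_counts.get(fmt, 0)
        share = trackers / total if total else 0.0
        stats[fmt] = (trackers, share)
    return stats


# Placeholder numbers for illustration only (NOT real HTTPArchive data).
all_counts = {"gif": 1000, "jpg": 2000, "png": 1500}
big_counts = {"gif": 800, "jpg": 1990, "png": 1400}

for fmt, (trackers, share) in tracker_stats(all_counts, big_counts).items():
    print(f"{fmt}: {trackers} possible trackers ({share:.1%})")
```

The same subtraction works for the byte columns if you want to know how many bytes the possible trackers account for.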

[Image: spreadsheet of tracker counts by format]
21% of GIFs are 35 bytes or smaller and are possibly trackers.

We can now rebuild the HTTPArchive chart:

[Image: rebuilt chart of image requests by format, with possible trackers removed]

We see the share of GIF images drops by 4 percentage points, while PNGs and JPGs each gain 2 percentage points of the total number of requests. All in all, the HTTPArchive chart at the top is still very close to the numbers I calculated, but perhaps overestimates GIFs slightly.
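
If you would rather compute the chart shares than read them off a spreadsheet, the calculation might look like this rough Python sketch. The helper names `format_shares` and `share_shift` are hypothetical, and the counts are placeholders rather than the real query output.

```python
def format_shares(counts):
    """Return each format's share of the total request count, in percent."""
    total = sum(counts.values())
    if total == 0:
        return {fmt: 0.0 for fmt in counts}
    return {fmt: 100.0 * n / total for fmt, n in counts.items()}


def share_shift(all_counts, big_counts):
    """Percentage-point change per format after dropping the tiny images."""
    before = format_shares(all_counts)
    after = format_shares(big_counts)
    return {fmt: after.get(fmt, 0.0) - before[fmt] for fmt in before}


# Placeholder counts for illustration only (NOT real HTTPArchive data).
all_counts = {"gif": 60, "jpg": 40}
big_counts = {"gif": 10, "jpg": 40}

for fmt, shift in share_shift(all_counts, big_counts).items():
    print(f"{fmt}: {shift:+.1f} percentage points")
```

Because the shares always sum to 100, whatever GIFs lose has to show up as a gain for the other formats, even though their own request counts barely change.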