Big changes during the gap from April → June

https://github.com/HTTPArchive/httparchive/issues/105#issuecomment-301913465 mentions “new agents,” but I can’t find info on what’s new/changed about them.

Lots of interesting drops. One guess I have is Brotli?

That doesn’t explain the drop in image weight on Mobile (but not Desktop) though. Perhaps that’s something to do with viewports and responsive images?

I’m working on an analysis of how Brotli usage has changed over the last year.
Currently the only anomaly I see is a sudden spike for 2017_02_01, but perhaps it is an echo of 2017_01_01/2017_01_15.
There is rapid growth in Brotli usage in 2017_06_01 / 2017_06_15, but that is somewhat expected (I have yet to analyse it in more depth).

The “new agents” change was a rewrite of the code that drives the testing and records the page results. Instead of running only on Windows and intrusively intercepting OS API calls, it is now cross-platform and uses the supported dev tools interfaces, netlogs and traces. As part of the move we also switched to running on Linux VMs instead of Windows, which reduced the overhead and made it possible to get accurate performance stats out as well.

What kind of “drop” are you seeing with Brotli? There have been a few issues with the response headers being reported differently, specifically cases where the same header is repeated (the recorded value ended up with a newline in it), and we’d like to fix any cases that can be identified.
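For anyone checking their own Brotli queries against this, here is a minimal Python sketch of reading a Content-Encoding value that tolerates the repeated-header case. It assumes the repeated values were simply joined with a newline in the recorded data (my reading of the description above, not a confirmed format); the function names are made up for illustration.

```python
from typing import List


def content_encodings(raw_header: str) -> List[str]:
    """Split a recorded Content-Encoding value into lowercase tokens.

    Assumption: a repeated header shows up as newline-separated values,
    and each value may itself be a comma-separated list of codings.
    """
    tokens = []
    for line in raw_header.splitlines():
        for part in line.split(","):
            token = part.strip().lower()
            if token:
                tokens.append(token)
    return tokens


def uses_brotli(raw_header: str) -> bool:
    """True if "br" is one of the codings in the recorded value."""
    return "br" in content_encodings(raw_header)
```

An exact comparison such as `value == "br"` would miss a value like "gzip" + newline + "br", which is one way an apparent drop could come from the data format rather than from real usage.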

I still don’t understand the reported drop in HTML, JS and CSS bytes (while the top-level bytes remained consistent) which looks like it may have also hit images on mobile. Best guess right now is that the content type matching somewhere is missing some resources but I haven’t found where yet.
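To illustrate the kind of mismatch I mean, here is a hedged Python sketch of content type bucketing. It is not the actual HTTP Archive code; the `classify` function and the list of JavaScript MIME types are assumptions chosen for the example.

```python
# Hypothetical set of MIME types treated as JavaScript for this example.
JS_TYPES = {
    "text/javascript",
    "application/javascript",
    "application/x-javascript",
}


def classify(mime_type: str) -> str:
    """Bucket a response by its Content-Type value."""
    # Drop parameters such as "; charset=utf-8" and normalise case, so
    # that "Application/JavaScript; charset=UTF-8" still counts as JS.
    base = mime_type.split(";", 1)[0].strip().lower()
    if base == "text/html":
        return "html"
    if base == "text/css":
        return "css"
    if base in JS_TYPES:
        return "js"
    if base.startswith("image/"):
        return "image"
    return "other"
```

A matcher that compared the raw header string exactly would put many of these responses in "other", which would lower the per-type byte totals while leaving total bytes unchanged.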

As to the big jump in brotli usage in June, my guess is that it is related to Google enabling Brotli for ads (the timing lines up pretty well).

I have found that it was not only Google ads that gave a huge boost to Brotli, but Cloudflare as well.
The February anomaly is also caused by Cloudflare. This is easy to track via the resp_server field.
For Google ads (DoubleClick) it is “cafe”; for Cloudflare it is “cloudflare-nginx”.
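A rough Python sketch of that per-server breakdown, assuming each request row is a dict with a resp_server field (as mentioned above) and a resp_content_encoding field (an assumed column name; adjust to whatever your export uses). The function name is hypothetical.

```python
from collections import Counter


def brotli_share_by_server(requests):
    """Return {server: (brotli_count, total_count)} for the given rows.

    Assumes rows are dicts with "resp_server" and "resp_content_encoding"
    keys; missing or empty servers are grouped under "(unknown)".
    """
    totals = Counter()
    brotli = Counter()
    for row in requests:
        server = (row.get("resp_server") or "").strip().lower() or "(unknown)"
        encoding = (row.get("resp_content_encoding") or "").lower()
        # Treat newlines like commas so repeated headers are still parsed.
        codings = [t.strip() for t in encoding.replace("\n", ",").split(",")]
        totals[server] += 1
        if "br" in codings:
            brotli[server] += 1
    return {server: (brotli[server], totals[server]) for server in totals}
```

Running this once per crawl (for example 2017_01_15, 2017_02_01 and 2017_06_01) and comparing the counts for "cafe" and "cloudflare-nginx" should show which server accounts for each jump.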