From the standard best practices for reducing payload we know the following rule:
Don’t use gzip for image or other binary files.
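The usual reasoning behind this rule is that JPEG and PNG data is already compressed, so a second gzip pass has almost nothing left to squeeze out. A minimal sketch of that effect follows; it assumes random bytes are a fair stand-in for already-compressed image data and repetitive markup is a fair stand-in for text. The helper name `gzip_savings` is made up for illustration.

```python
import gzip
import os


def gzip_savings(payload: bytes) -> float:
    """Return the fraction of bytes saved by gzip (negative means the output grew)."""
    compressed = gzip.compress(payload)
    return 1 - len(compressed) / len(payload)


# Assumption: random bytes are high-entropy, much like JPEG/PNG payloads.
image_like = os.urandom(100_000)
# Assumption: repetitive markup behaves like typical HTML/CSS/JS.
text_like = b"<div class='item'>hello world</div>\n" * 3000

print(f"image-like savings: {gzip_savings(image_like):.2%}")
print(f"text-like savings:  {gzip_savings(text_like):.2%}")
```

On the image-like payload the savings should hover around zero (or go slightly negative because of the gzip header and framing), while the text-like payload shrinks dramatically. Real images may differ a little, since their metadata can still be compressible.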
However, let's see whether this rule is followed in practice and, if not, which sites do use gzip/deflate for their images, using the following sample query:
select count(requestid) as ct, domain(url) as dom from [httparchive:runs.latest_requests] where
(resp_content_type='image/png' or resp_content_type='image/jpeg' or resp_content_type='image/jpg')
and resp_content_encoding in ('gzip', 'deflate')
group by dom
having ct > 333
order by ct desc limit 20
This returns the following domains:
We can drill down into specific URLs for the domain using the following query
select url from [httparchive:runs.latest_requests] where resp_content_type='image/jpeg' and resp_content_encoding in ('gzip', 'deflate') and domain(url) = 'facebook.com'
The top one happens to be ‘external.ak.fbcdn.net’, which looks like Facebook but is served from Akamai. The funny thing is that we can verify that the “Content-Encoding: gzip” header is actually added by the Akamai config, not by the original content publisher, as seen here:
Content Publisher: http://www.webpagetest.org/result/140416_1P_64S/1/details/#request1
FB Shared Content: http://www.webpagetest.org/result/140416_BB_675/1/details/#request1
Let's pick Slideshare as another example. It does use Akamai as its CDN, but here the origin is the one setting the gzip header on the JPEG (maybe a misconfiguration on their end):
Origin: http://www.webpagetest.org/result/140416_33_69J/1/details/#request1
CDN : http://www.webpagetest.org/result/140416_6T_69Q/1/details/#request1
Finally, given that facebook.com appeared among the top 20 domains doing this, I picked a profile pic URL: http://www.webpagetest.org/result/140416_S2_SSX/1/details/#request1
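The same header check can be scripted instead of reading WebPageTest waterfalls. The sketch below mirrors the filter used in the query (image MIME type plus gzip/deflate encoding). The names `is_compressed_image` and `check_url` are hypothetical, and it assumes the server answers a HEAD request with the same headers it would send for a GET, which is not guaranteed.

```python
import urllib.request

# Same content types and encodings the query filters on.
IMAGE_TYPES = {"image/png", "image/jpeg", "image/jpg"}
COMPRESSED_ENCODINGS = {"gzip", "deflate"}


def is_compressed_image(content_type: str, content_encoding: str) -> bool:
    """True when an image response also carries a gzip/deflate Content-Encoding."""
    # Drop parameters such as "; charset=..." before comparing the MIME type.
    mime = content_type.split(";")[0].strip().lower()
    encodings = {e.strip().lower() for e in content_encoding.split(",") if e.strip()}
    return mime in IMAGE_TYPES and bool(encodings & COMPRESSED_ENCODINGS)


def check_url(url: str) -> bool:
    """Fetch headers for url while advertising gzip support (needs network access)."""
    req = urllib.request.Request(
        url, method="HEAD", headers={"Accept-Encoding": "gzip, deflate"}
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return is_compressed_image(
            resp.headers.get("Content-Type", ""),
            resp.headers.get("Content-Encoding", ""),
        )
```

Running `check_url` against both the origin URL and the CDN URL of the same image would show which side adds the header, which is the comparison made by hand above.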
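I was wondering whether we pay any decompression cost (which is higher on mobile than on desktop), so I ran the same query against the mobile dataset. It returns 0 records, which is good news: it suggests that no sites in the HTTP Archive mobile dataset gzip their images, or at least none above the query's threshold of 333 requests per domain.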
select count(requestid) as ct, domain(url) as dom from [httparchive:runs.latest_requests_mobile] where
(resp_content_type='image/png' or resp_content_type='image/jpeg' or resp_content_type='image/jpg')
and resp_content_encoding in ('gzip', 'deflate')
group by dom
having ct > 333
order by ct desc limit 20
The net of all this is that I can conclude that most of these are just misconfigurations (as in, the generating endpoint indiscriminately compresses without regard to content type).
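For contrast, an endpoint that does take content type into account might look something like the sketch below. The allowlist is only an assumption about what a reasonable config would include, not any particular server's or CDN's defaults, and `should_compress` is a made-up name.

```python
# Assumed allowlist of text-like types worth compressing; adjust to taste.
COMPRESSIBLE_PREFIXES = ("text/",)
COMPRESSIBLE_TYPES = {
    "application/javascript",
    "application/json",
    "application/xml",
    "image/svg+xml",  # SVG is XML text, unlike JPEG/PNG
}


def should_compress(content_type: str) -> bool:
    """Decide whether to gzip a response based on its Content-Type header."""
    # Ignore parameters such as "; charset=utf-8".
    mime = content_type.split(";")[0].strip().lower()
    return mime.startswith(COMPRESSIBLE_PREFIXES) or mime in COMPRESSIBLE_TYPES


print(should_compress("text/html; charset=utf-8"))
print(should_compress("image/jpeg"))
```

With a check like this in front of the compressor, JPEG and PNG responses would pass through untouched while HTML, CSS, JS and SVG still get compressed.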
Is there any other harm I am not aware of when compressing images?