Once again, @igrigorik's awesome post on HTML caching got me thinking about mobile caching. When I took his two queries and modified them for mobile, there were minor differences, but the general trend was similar. (The bigger differences are likely due to the smaller sample size for mobile.)
This got me wondering: how do sites treat their mobile caching differently from their desktop caching? I ran a few tests and found that the overall results for mobile and desktop were similar, but that some values were missing. Certainly, there would be outlying sites that treat mobile significantly differently than desktop. But how many? And how? So I mashed up Ilya’s query with GuyPo’s (from his m. study) to grab the cache control headers and max-age from the first HTML response on desktop and mobile and compare them:
SELECT dData.url, dData.age, mData.age, dData.resp_cache_control, mData.resp_cache_control
  //, COUNT(dData.age) AS web_count, COUNT(mData.age) AS Mobile_count
FROM
  // Desktop: the first HTML response (status 200) for each page
  (SELECT pages.pageid AS pid, url, urlhash, wptid, fHtml, fReq, fStatus, loc, age, resp_cache_control
   FROM [httparchive:runs.latest_pages] AS pages
   JOIN
     (SELECT pageid, MAX(firstHtml) AS fHtml, MAX(firstReq) AS fReq, MAX(status) AS fStatus, MAX(resp_location) AS loc,
        INTEGER(REGEXP_EXTRACT(resp_cache_control, r'max-age=(\d+)')) AS age,
        resp_cache_control
      FROM [httparchive:runs.latest_requests]
      WHERE firstHtml = true
        AND status = 200
      GROUP BY resp_cache_control, age, pageid) AS reqs
   ON (reqs.pageid = pages.pageid)
  ) AS dData
JOIN
  // Mobile: the same subquery against the mobile tables
  (SELECT pages.pageid AS pid, url, wptid, fHtml, fReq, fStatus, loc, age, resp_cache_control
   FROM [httparchive:runs.latest_pages_mobile] AS pages
   JOIN
     (SELECT pageid, MAX(firstHtml) AS fHtml, MAX(firstReq) AS fReq, MAX(status) AS fStatus, MAX(resp_location) AS loc,
        INTEGER(REGEXP_EXTRACT(resp_cache_control, r'max-age=(\d+)')) AS age,
        resp_cache_control
      FROM [httparchive:runs.latest_requests_mobile]
      WHERE firstHtml = true
        AND status = 200
      GROUP BY resp_cache_control, age, pageid) AS reqs
   ON (reqs.pageid = pages.pageid)
  ) AS mData
ON mData.url = dData.url
// Keep only URLs present in both crawls whose max-age values differ
WHERE mData.url = dData.url AND dData.age != mData.age
GROUP BY dData.url, dData.age, mData.age, dData.resp_cache_control, mData.resp_cache_control
//HAVING web_count > 20
//ORDER BY dData.age ASC
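The key step in the query is pulling max-age out of the raw header with REGEXP_EXTRACT. As a rough illustration of what that pattern matches, here is a minimal Python sketch using the same regular expression; the function name `extract_max_age` is my own and not part of the query. Note that when the directive is absent the result is empty (NULL in SQL), and since a SQL comparison against NULL is never true, the `!=` filter only returns sites where both versions carry a max-age.

```python
import re
from typing import Optional

# Same pattern the query passes to REGEXP_EXTRACT.
MAX_AGE_RE = re.compile(r"max-age=(\d+)")


def extract_max_age(cache_control: Optional[str]) -> Optional[int]:
    """Return max-age in seconds, or None when the header or directive is absent."""
    if not cache_control:
        return None
    match = MAX_AGE_RE.search(cache_control)
    return int(match.group(1)) if match else None


print(extract_max_age("public, max-age=300"))   # 300
print(extract_max_age("no-cache, no-store"))    # None
print(extract_max_age("private, s-maxage=60"))  # None (s-maxage is a different directive)
```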
Of the 4672 sites that appear in both datasets, 3645 (78%) have the same cache control response header on mobile and desktop. 996 (21%) have the same max-age (mData.age = dData.age). I then changed the WHERE clause to break the sites down further into various categories.
What I am interested in are the sites that are outside the norm. There are certainly legitimate reasons to cache longer (or shorter) on a mobile device compared to desktop. So, let’s look into the 1027 (22%) sites that handle caching differently on mobile vs. desktop:
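The percentages follow directly from the counts, and the sites with differing headers are simply the remainder. A quick sanity check of the arithmetic, using only the numbers quoted above (the helper `pct` is just for illustration):

```python
TOTAL_SITES = 4672   # sites found in both the desktop and mobile data
SAME_HEADER = 3645   # identical Cache-Control header on both
SAME_MAX_AGE = 996   # identical max-age on both


def pct(count: int, total: int = TOTAL_SITES) -> float:
    """Share of the sample as a percentage, rounded to one decimal place."""
    return round(100 * count / total, 1)


different_header = TOTAL_SITES - SAME_HEADER

print(different_header)       # 1027
print(pct(SAME_HEADER))       # 78.0
print(pct(SAME_MAX_AGE))      # 21.3
print(pct(different_header))  # 22.0
```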
Table 1: Breakdown of sites with different cache control headers for Mobile and Desktop.
Let’s look through these one by one (ignoring headers with the same max-age – because that sounds kind of boring):
322 have different cache headers but no max age values. Of these:
Table 2: Breakdown of different cache control headers with no max-age values for mobile or desktop.
The first two lines of Table 2 show sites that have cache control headers for only mobile (69) or only desktop (85), but not the other version (that’s 3.3% of all sites). A large number differ by only a few characters and, glancing at the results, they are generally missing commas between parameters. Then there are 1.5% of sites whose cache control header is longer on either mobile or desktop because more parameters were added for one or the other.
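To separate headers that differ only by a missing comma from ones that differ in substance, one option is to compare the sets of directives rather than the raw strings. This is a sketch of that idea, not how the tables above were produced; the function names are my own, and the split is a simplification that would mishandle quoted directive values containing commas. Treating a comma-less header as equivalent is also an assumption made for analysis only, since directives are meant to be comma-separated and caches may not parse the comma-less form the same way.

```python
import re


def directives(cache_control: str) -> frozenset:
    """Split a Cache-Control value into a set of lowercase directives."""
    # Split on commas and whitespace so a missing comma does not change the result.
    parts = re.split(r"[,\s]+", cache_control.strip().lower())
    return frozenset(part for part in parts if part)


def same_directives(desktop: str, mobile: str) -> bool:
    """True when both headers list the same directives, ignoring order and separators."""
    return directives(desktop) == directives(mobile)


print(same_directives("no-cache, no-store, must-revalidate",
                      "no-cache no-store must-revalidate"))  # True
print(same_directives("private", "private, max-age=0"))      # False
```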
Table 3: When Cache Control Max ages differ
In Table 3, the top and bottom lines show the two extremes, where the max-age values differ by over 15 minutes one way or the other. 1.1% of the websites studied fall into this group. Another 1.6% of sites have max-age values that differ by more than 2 minutes (but less than 15 minutes).
Tables 4 and 5: Sites with max-age values for only mobile or desktop, broken down by available max-age.
In Tables 4 and 5, we see a breakdown of mobile max-ages when there is a mobile max-age but no desktop max-age (and of desktop max-ages when there is no mobile max-age). Most are under 5 minutes but, interestingly, 51 sites (1.1% of all sites) have a max-age between 5 minutes and 1 day on one version and none on the other. 25 sites have a max-age over 1 day (while not specifying the other)! That’s 0.54% of all sites studied.
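For anyone wanting to reproduce this kind of breakdown from the query results, here is a small sketch of the bucketing. The 2-minute and 15-minute thresholds come from the text above; the exact bucket boundaries and labels are my assumption, and the function name is made up.

```python
def bucket_max_age_gap(desktop_age: int, mobile_age: int) -> str:
    """Label how far apart two max-age values (in seconds) are, and in which direction."""
    gap = mobile_age - desktop_age  # positive means mobile caches longer
    if gap == 0:
        return "same"
    side = "mobile longer" if gap > 0 else "desktop longer"
    size = abs(gap)
    if size > 15 * 60:
        return f"{side} by over 15 min"
    if size > 2 * 60:
        return f"{side} by 2-15 min"
    return f"{side} by 2 min or less"


print(bucket_max_age_gap(300, 3600))  # mobile longer by over 15 min
print(bucket_max_age_gap(600, 300))   # desktop longer by 2-15 min
print(bucket_max_age_gap(60, 90))     # mobile longer by 2 min or less
```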
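The same kind of tally can be sketched for sites where only one version sets a max-age. The 5-minute and 1-day cut-offs are the ones mentioned above; the sample values in `ages` are made up purely to show the mechanics and are not from the HTTP Archive data.

```python
from collections import Counter

FIVE_MINUTES = 5 * 60
ONE_DAY = 24 * 60 * 60


def bucket_single_max_age(age_seconds: int) -> str:
    """Bucket a max-age (in seconds) that exists on only one of mobile or desktop."""
    if age_seconds > ONE_DAY:
        return "over 1 day"
    if age_seconds >= FIVE_MINUTES:
        return "5 min to 1 day"
    return "under 5 min"


# Hypothetical max-age values, for illustration only.
ages = [60, 120, 299, 300, 3600, 86400, 86401, 604800]
tally = Counter(bucket_single_max_age(age) for age in ages)

for label in ("under 5 min", "5 min to 1 day", "over 1 day"):
    print(label, tally[label])
```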
In conclusion, cache control headers and the max-age for caching can (and probably should) vary between mobile and desktop sites. In our sample of 4672 sites, 22% have headers that vary. However, there are no real patterns in the data that identify an ideal caching length, and ~10% of sites have cache control headers or max-age values that are extremely different between their mobile and desktop offerings. This goes to show that developers should 1. add cache headers and 2. review the values on a regular basis to ensure that all of the sites they deliver have cache headers that make sense for mobile and desktop.
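One lightweight way to do that periodic review is to request a page with a desktop and a mobile User-Agent and compare the Cache-Control headers that come back. The following is a minimal sketch using only the Python standard library. The User-Agent strings are illustrative placeholders, the function names are my own, and a real check would likely need to cover more pages and device types.

```python
from typing import Callable, Dict, Optional
from urllib.request import Request, urlopen

# Illustrative User-Agent strings (assumptions); substitute the ones your audience uses.
USER_AGENTS = {
    "desktop": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
    "mobile": "Mozilla/5.0 (iPhone; CPU iPhone OS 16_0 like Mac OS X)",
}


def fetch_cache_control(url: str, user_agent: str) -> Optional[str]:
    """Request the URL with the given User-Agent and return its Cache-Control header."""
    request = Request(url, headers={"User-Agent": user_agent})
    with urlopen(request, timeout=10) as response:  # redirects are followed by default
        return response.headers.get("Cache-Control")


def compare_cache_control(
    url: str,
    fetch: Callable[[str, str], Optional[str]] = fetch_cache_control,
) -> Dict[str, Optional[str]]:
    """Fetch the header once per User-Agent; `fetch` can be swapped out for testing."""
    return {name: fetch(url, agent) for name, agent in USER_AGENTS.items()}


def headers_differ(result: Dict[str, Optional[str]]) -> bool:
    """True when the desktop and mobile headers are not identical."""
    return result["desktop"] != result["mobile"]
```

Calling `compare_cache_control("https://example.com/")` (with your own URL in place of the placeholder) returns a dictionary with one header per device type, which `headers_differ` can then flag for a closer look.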