Looks like 191 sites in the archive are using mod_pagespeed, and 29 of them have keep-alives disabled:
SELECT * FROM [httparchive:runs.latest_requests]
WHERE LOWER(respotherheaders) CONTAINS "x-mod-pagespeed" AND
LOWER(resp_connection) CONTAINS "close" AND
firstreq = true
@andydavies That doesn’t seem like a valid way to test keep-alive? You’re assuming that firstreq = disabled keepalive, but the browser could have just opened a new connection because others are busy or occupied. Shouldn’t you look for “Keep-Alive: close” instead?
I should have been clearer: the query returns the 191 sites that use mod_pagespeed. Filtering down to those that don’t have keep-alive enabled needs the extra check for connection = close (I don’t have the data in front of me ATM, so the column name might be wrong).
The firstreq = true condition is an attempt to filter for just the document request, but if the first response is a redirect then some sites might be missed.
@andydavies right. Updated the query in your original post to include “Connection: close” condition – also getting 29 hosts.
For what it’s worth, the actual number of hosts with mod_pagespeed and ngx_pagespeed is much higher (hundreds of thousands); I am curious what the difference is. But in any case, when I took Andy’s query and removed the Connection:close test I got 1656 rows.
http://trends.builtwith.com/Server/mod_pagespeed reports 235k sites but we believe the number is higher.
Be interesting to see if we could work out what the penetration of the other accelerators is.
I’ve got customers using mod_pagespeed, Akamai’s Aqua Ion and Radware’s Fastview, plus some looking at PSS
I think Akamai uses X-Akamai-Transformed and Radware X-Strangeloop but I guess I should check with Guy and Tammy to be certain
Those (Akamai & Strangeloop) look like great BigQuery queries to do. What I’m curious about is: how many sites total are represented in BigQuery? My extrapolation says that’s somewhere between 1/100 and 1/1000 of the whole internet. Does that sound plausible?
I can’t be very accurate in my extrapolation because builtwith’s data indicates mod_pagespeed percent usage is higher in the top 100k sites than the long tail.
The Akamai & Strangeloop queries both worked: 62 and 40 rows, respectively. However, the sites using those technologies included large ones on the first page (www.target.com and www.walgreens.com, respectively).
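For anyone wanting to reproduce those counts, here is a rough sketch of what the Akamai variant might look like, modeled on the mod_pagespeed query at the top of the thread. The table and column names are copied from that query (and, as noted there, may not be exactly right), and the header name is the X-Akamai-Transformed guess from above, not something confirmed with Akamai. Swapping in "x-strangeloop" should give the Radware equivalent.

```sql
-- Sketch only: assumes the same table and columns as the mod_pagespeed query above.
-- "x-akamai-transformed" is the unconfirmed header name suggested earlier in the thread.
SELECT url
FROM [httparchive:runs.latest_requests]
WHERE LOWER(respotherheaders) CONTAINS "x-akamai-transformed" AND
  firstreq = true
```

As with the original query, firstreq = true will miss sites whose first response is a redirect, so the row count is probably a lower bound.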
HTTP Archive is currently tracking the top 300K Alexa sites for desktop and top 5K for mobile. Not sure how that extrapolates to the “entire internet”, but all numbers obtained here would be relative to that.
Thanks Ilya, that 1656/300k sites (about 0.55%) correlates plausibly with builtwith’s data, which says MPS is installed on 676 of the top 100k (0.7%).
Also of note: there are 952 sites in the archive with the x-page-speed header, which is used by ngx_pagespeed, PageSpeed Service, and IISpeed from we-amp.com
So together that’s (952 + 1656)/300k ≈ 0.87% of the archive.