Also modified the boomerang.js
query a bit to match a bunch of URLs we found from open-source users like /boomerang-n.n.n.min.js
and /boomerang
:
SELECT vendor, sites, ROUND(ratio*10000)/100 as percent FROM (
SELECT
COUNT(DISTINCT page, 1000000) as sites,
RATIO_TO_REPORT(sites) OVER() AS ratio,
CASE
WHEN REGEXP_MATCH(url, "http[s]*://[www.|ssl.]*google-analytics.com/ga.js") THEN "Google Analytics"
WHEN REGEXP_MATCH(url, "http[s]*://[www.|ssl.]*google-analytics.com/analytics.js") THEN "Google Analytics"
WHEN REGEXP_MATCH(url, "http[s]*://[www.|ssl.]*googletagmanager.com/gtag/js") THEN "Google Analytics"
WHEN REGEXP_MATCH(url, "http[s]*://rum-static.pingdom.net/prum") THEN "Pingdom"
WHEN REGEXP_MATCH(url, "insight.torbit.com/.*.js") THEN "Torbit"
WHEN url CONTAINS "aksb.min.js" THEN "Akamai (Legacy RUM)"
WHEN url CONTAINS "go-mpulse.net/boomerang/"
OR url CONTAINS "lognormal.net/boomerang/" THEN "Akamai mPulse"
WHEN url CONTAINS "/gomez"
AND url CONTAINS "rum"
AND url CONTAINS ".js" THEN "Gomez"
WHEN REGEXP_MATCH(url, r".*boomerang[\d\-\w\.]*(\.js)?$") THEN "boomerang.js"
WHEN url CONTAINS "js-agent.newrelic.com" THEN "New Relic"
WHEN (url CONTAINS "spdcrv" OR url CONTAINS 'speedcurve.com/js/lux.js') THEN "SpeedCurve"
WHEN url CONTAINS "nccrum.core.js" THEN "NCC Web Perf"
WHEN url CONTAINS "cdn.appdynamics.com" THEN "AppDynamics"
WHEN url CONTAINS "static.cloudflareinsights.com/beacon.min.js" THEN "CloudFlare"
END as vendor
FROM [httparchive:requests.2020_01_01_desktop]
GROUP BY vendor
)
ORDER BY sites desc
Which bumps it up quite a bit:
Row | vendor | sites | percent | |
---|---|---|---|---|
1 | null | 4280567 | 56.88 | |
2 | Google Analytics | 2994638 | 39.79 | |
3 | New Relic | 142304 | 1.89 | |
4 | boomerang.js | 76162 | 1.01 | |
5 | Akamai mPulse | 14058 | 0.19 | |
6 | Pingdom | 5795 | 0.08 | |
7 | CloudFlare | 3585 | 0.05 | |
8 | AppDynamics | 3192 | 0.04 | |
9 | Akamai (Legacy RUM) | 3047 | 0.04 | |
10 | SpeedCurve | 1940 | 0.03 | |
11 | NCC Web Perf | 19 | 0.0 |