Who are the top RUM analytics providers?


#1
SELECT vendor, sites, ROUND(ratio*10000)/100 as percent FROM (
  SELECT     
    COUNT(DISTINCT pageid, 1000000) as sites, 
    RATIO_TO_REPORT(sites) OVER() AS ratio,
    CASE 
      WHEN REGEXP_MATCH(url, "http[s]*://[www.|ssl.]*google-analytics.com/ga.js") THEN "Google Analytics"
      WHEN REGEXP_MATCH(url, "http[s]*://rum-static.pingdom.net/prum") THEN "Pingdom"
      WHEN REGEXP_MATCH(url, "insight.torbit.com/.*.js") THEN "Torbit"
      WHEN url CONTAINS  "aksb.min.js" THEN "Akamai"
      WHEN url CONTAINS "mpulse.net/boomerang/" 
        OR url CONTAINS "lognormal.net/boomerang/" THEN "SOASTA"
      WHEN url CONTAINS "/gomez" 
        AND url CONTAINS "rum" 
        AND url CONTAINS ".js" THEN "Gomez"
      WHEN url CONTAINS "boomerang.js" THEN "boomerang.js"
      WHEN url CONTAINS "/eum/rum.js" THEN "New Relic"
    END as vendor
  FROM [httparchive:runs.latest_requests] 
  GROUP BY vendor
  HAVING vendor IS NOT NULL
)
ORDER BY sites desc

Note: Akamai dynamically injects their RUM script at the time when the HTML is being generated (with customer defined sampling rate). As a result, the absolute count will vary (and won’t report all sites).


#2

@souders all merged into a single query now! The percent column is the relative market share for all the sites that have RUM installed. Trying to figure out how to add an absolute % as well, but that’s turning out to be a bit more tricky… /cc @fhoffa


#3

Reran the query with 4/15/17 data:

Not much change with GA but Pingdom, SOASTA, and Akamai have grown substantially.

Would also be interesting to see the distribution of Alexa ranks for each vendor before and after…

Edit: Looking at the absolute numbers, GA now has fewer sites, but a similar % of vendorshare. Where did they go?


#4

Was there a change in New Relic detection? Going from 6440 sites to 2 seems suspect.


#5

Changing the NewRelic detection to:

WHEN url CONTAINS “js-agent.newrelic.com” THEN “New Relic”

appears to work -


#6

To add a few more providers into the mix (as AppDynamics can be self-hosted it may under-report their numbers)

SELECT vendor, sites, ROUND(ratio*10000)/100 as percent FROM (
  SELECT     
    COUNT(DISTINCT pageid, 1000000) as sites, 
    RATIO_TO_REPORT(sites) OVER() AS ratio,
    CASE 
      WHEN REGEXP_MATCH(url, "http[s]*://[www.|ssl.]*google-analytics.com/ga.js") THEN "Google Analytics"
      WHEN REGEXP_MATCH(url, "http[s]*://rum-static.pingdom.net/prum") THEN "Pingdom"
      WHEN REGEXP_MATCH(url, "insight.torbit.com/.*.js") THEN "Torbit"
      WHEN url CONTAINS  "aksb.min.js" THEN "Akamai"
      WHEN url CONTAINS "mpulse.net/boomerang/" 
        OR url CONTAINS "lognormal.net/boomerang/" THEN "SOASTA"
      WHEN url CONTAINS "/gomez" 
        AND url CONTAINS "rum" 
        AND url CONTAINS ".js" THEN "Gomez"
      WHEN url CONTAINS "boomerang.js" THEN "boomerang.js"
      WHEN url CONTAINS "js-agent.newrelic.com" THEN "New Relic"
      WHEN url CONTAINS "spdcrv" THEN "SpeedCurve"
      WHEN url CONTAINS "nccrum.core.js" THEN "NCC Web Perf"
      WHEN url CONTAINS "cdn.appdynamics.com" THEN "AppDynamics"
    END as vendor
  FROM [httparchive:runs.latest_requests] 
  GROUP BY vendor
  HAVING vendor IS NOT NULL
)
ORDER BY sites desc


#7

Looking for my own purposes and thought I’d share the results as of May.

I’m curious what others think about this:
For the sites I’ve inspected and companies I’ve worked with that explore RUM as a performance tool, many of them use Google Analytics AND something else - is this analysis a fair representation of what’s out there?

I’m going to experiment with the querying tool some more, but haven’t arrived at any different results with a simple reordering of the case statement.


#8

Another conclusion/question -
I reran the query without the “HAVING vendor IS NOT NULL” and ended up with about 82% of sites having no observable RUM presence. That’s not terribly surprising, but it’s interesting when thinking about market penetration of paid-for performance monitoring solutions.


#9

Rerunning the query against latest tables (with larger origin count)…

SELECT vendor, sites, ROUND(ratio*10000)/100 as percent FROM (
  SELECT     
    COUNT(DISTINCT page, 1000000) as sites, 
    RATIO_TO_REPORT(sites) OVER() AS ratio,
    CASE 
      WHEN REGEXP_MATCH(url, "http[s]*://[www.|ssl.]*google-analytics.com/ga.js") THEN "Google Analytics"
      WHEN REGEXP_MATCH(url, "http[s]*://rum-static.pingdom.net/prum") THEN "Pingdom"
      WHEN REGEXP_MATCH(url, "insight.torbit.com/.*.js") THEN "Torbit"
      WHEN url CONTAINS  "aksb.min.js" THEN "Akamai"
      WHEN url CONTAINS "mpulse.net/boomerang/" 
        OR url CONTAINS "lognormal.net/boomerang/" THEN "SOASTA"
      WHEN url CONTAINS "/gomez" 
        AND url CONTAINS "rum" 
        AND url CONTAINS ".js" THEN "Gomez"
      WHEN url CONTAINS "boomerang.js" THEN "boomerang.js"
      WHEN url CONTAINS "js-agent.newrelic.com" THEN "New Relic"
      WHEN url CONTAINS "spdcrv" THEN "SpeedCurve"
      WHEN url CONTAINS "nccrum.core.js" THEN "NCC Web Perf"
      WHEN url CONTAINS "cdn.appdynamics.com" THEN "AppDynamics"
    END as vendor
  FROM [httparchive:response_bodies.2018_07_01_desktop]  
  GROUP BY vendor
  HAVING vendor IS NOT NULL
)
ORDER BY sites desc

image


#10

I would suggest one change from @igrigorik’s query above to use the requests dataset rather than response_bodies, which is 3x the size. Since the query doesn’t care about the responses themselves (just the URL) requests would do.


#11

Hmm, just realized that we’re missing some GA tags in above query (analytics.js, gtag.js)…

SELECT vendor, sites, ROUND(ratio*10000)/100 as percent FROM (
  SELECT     
    COUNT(DISTINCT page, 1000000) as sites, 
    RATIO_TO_REPORT(sites) OVER() AS ratio,
    CASE 
      WHEN REGEXP_MATCH(url, "http[s]*://[www.|ssl.]*google-analytics.com/ga.js") THEN "Google Analytics"
      WHEN REGEXP_MATCH(url, "http[s]*://[www.|ssl.]*google-analytics.com/analytics.js") THEN "Google Analytics"
      WHEN REGEXP_MATCH(url, "http[s]*://[www.|ssl.]*googletagmanager.com/gtag/js") THEN "Google Analytics"
      WHEN REGEXP_MATCH(url, "http[s]*://rum-static.pingdom.net/prum") THEN "Pingdom"
      WHEN REGEXP_MATCH(url, "insight.torbit.com/.*.js") THEN "Torbit"
      WHEN url CONTAINS  "aksb.min.js" THEN "Akamai"
      WHEN url CONTAINS "mpulse.net/boomerang/" 
        OR url CONTAINS "lognormal.net/boomerang/" THEN "SOASTA"
      WHEN url CONTAINS "/gomez" 
        AND url CONTAINS "rum" 
        AND url CONTAINS ".js" THEN "Gomez"
      WHEN url CONTAINS "boomerang.js" THEN "boomerang.js"
      WHEN url CONTAINS "js-agent.newrelic.com" THEN "New Relic"
      WHEN url CONTAINS "spdcrv" THEN "SpeedCurve"
      WHEN url CONTAINS "nccrum.core.js" THEN "NCC Web Perf"
      WHEN url CONTAINS "cdn.appdynamics.com" THEN "AppDynamics"
    END as vendor
  FROM [httparchive:requests.2018_07_01_desktop]  
  GROUP BY vendor
)
ORDER BY sites desc

Using the above, a big bump for GA…

image


#12

I can’t see the inside view, but is this really accurately representing RUM by including all of the GA content? I know developers can create and send performance metrics through GA, but it seems like that’s more of a boundary case.

GA isn’t on the short list at all when I think about RUM (performance) in any context. I’d be happy to know otherwise.