Some awesome analysis here! Nice work @doug_sillars!
Expanding on @rviscomi’s query, we can use REGEXP_EXTRACT_ALL
to extract the library used in the SELECT clause. The query below shows which cryptocurrency libraries are most popular -
SELECT
library,
COUNT(DISTINCT page) library_count
FROM (
SELECT
page,
req.url,
REGEXP_EXTRACT_ALL(LOWER(req.url), r'(cnhv.co|coin-hive.com|coinhive.com|gus.host|load.jsecoin.com|miner.pr0gramm.com|minemytraffic.com|ppoi.org|projectpoi.com|azvjudwr.info|jroqvbvw.info|jyhfuqoh.info|kdowqlpt.info|xbasfbno.info|crypto-loot.com|coinerra.com|coin-have.com|minero.pw|minero-proxy-01.now.sh|minero-proxy-02.now.sh|minero-proxy-03.now.sh|api.inwemo.com|jsecoin.com)') library
FROM
`httparchive.har.2017_10_15_chrome_requests` AS req
JOIN
`httparchive.runs.2017_10_15_pages` AS pages
ON
req.page = pages.url
WHERE
REGEXP_CONTAINS(req.url, '(cnhv.co|coin-hive.com|coinhive.com|gus.host|load.jsecoin.com|miner.pr0gramm.com|minemytraffic.com|ppoi.org|projectpoi.com|azvjudwr.info|jroqvbvw.info|jyhfuqoh.info|kdowqlpt.info|xbasfbno.info|crypto-loot.com|coinerra.com|coin-have.com|minero.pw|minero-proxy-01.now.sh|minero-proxy-02.now.sh|minero-proxy-03.now.sh|api.inwemo.com|jsecoin.com)') ) libraries
CROSS JOIN
UNNEST(libraries.library) library
GROUP BY
library
ORDER BY
library_count DESC
It looks like coinhive seems to be the most prevalent, followed by jsecoin and crypto-loot -
Also, it occurred to me that some sites might be using more than 1 crytpocurrency mining library. Running the following query, I was able to see that there are 4 sites using 3 libraries and 270 sites using 2 libraries!
SELECT
library_count,
COUNT(*)
FROM (
SELECT
page,
COUNT(DISTINCT library) library_count
FROM (
SELECT
page,
req.url,
REGEXP_EXTRACT_ALL(LOWER(req.url), r'(cnhv.co|coin-hive.com|coinhive.com|gus.host|load.jsecoin.com|miner.pr0gramm.com|minemytraffic.com|ppoi.org|projectpoi.com|azvjudwr.info|jroqvbvw.info|jyhfuqoh.info|kdowqlpt.info|xbasfbno.info|crypto-loot.com|coinerra.com|coin-have.com|minero.pw|minero-proxy-01.now.sh|minero-proxy-02.now.sh|minero-proxy-03.now.sh|api.inwemo.com|jsecoin.com)') library
FROM
`httparchive.har.2017_10_15_chrome_requests` AS req
JOIN
`httparchive.runs.2017_10_15_pages` AS pages
ON
req.page = pages.url
WHERE
REGEXP_CONTAINS(req.url, '(cnhv.co|coin-hive.com|coinhive.com|gus.host|load.jsecoin.com|miner.pr0gramm.com|minemytraffic.com|ppoi.org|projectpoi.com|azvjudwr.info|jroqvbvw.info|jyhfuqoh.info|kdowqlpt.info|xbasfbno.info|crypto-loot.com|coinerra.com|coin-have.com|minero.pw|minero-proxy-01.now.sh|minero-proxy-02.now.sh|minero-proxy-03.now.sh|api.inwemo.com|jsecoin.com)') ) libraries
CROSS JOIN
UNNEST(libraries.library) library
GROUP BY
page
ORDER BY
library_count DESC )
GROUP BY
library_count
Many of the sites using 2 libraries were a combination of coinhive and coin-hive, but there were still a handful that used a combination of coin-hive and cryptoloot or coin-hive and jsecoin.