The Performance Impact of Cryptocurrency Mining on the Web


#1

Last week an article pointed out that hundreds of websites had been hacked to add cryptocurrency mining scripts. This is of course interesting to the media because it puts the words ‘hacked’ and ‘cryptocurrency’ in the same sentence. Tools like AT&T’s Video Optimizer will find these connections in a mobile application or website, and webpagetest.org would also work for mobile or desktop websites.

Now, when I hear things like “hundreds of sites” and “infected”, I immediately start thinking: “I wonder how many sites are mining cryptocurrency?” This (of course) leads to “How badly does cryptocurrency mining affect the performance of the site?”

I know these are normal things to ask oneself. Right?

How Does One Mine Cryptocurrency on a Website?

Essentially, a JavaScript library is added to the page, and as long as the page is open, the JavaScript (through the browser) will use the local CPU to perform cryptocurrency calculations, and the person holding the cryptocurrency key can make money for any hashes that are uncovered. The consumer may not know that their device is crunching some pretty heavy numbers, but may notice faster battery drain, or perhaps the device getting warm to the touch.

The most popular JavaScript library for this is CoinHive. You add a script dependency:

<script src="https://coinhive.com/lib/coinhive.min.js"></script>

and a few lines of code - and you are on your way to wealth and riches. The documentation talks about how “you should not do this without telling your customers that you are doing it, because that is not ethical…” Clearly, this disclaimer is not working. :slight_smile:

So, How Many Sites ARE Mining Cryptocurrency?

A quick search in the 10-15-2017 mobile HTTPArchive shows 1,040 sites that are using some form or other of “coinhive.min.js”. There are a few duplicates - it appears some sites are running 2 separate instances (one hosted at coin-hive.com and the other at coinhive.com). Since mining cryptocurrency is CPU intensive, I would imagine that 2 instances would compete for resources, and might not be the most efficient way to mine.

#legacySQL
SELECT
  pages.rank rank,
  pages.url url
FROM
  [httparchive:runs.latest_pages_mobile] pages
JOIN (
  SELECT
    url,
    pageid
  FROM
    [httparchive:runs.latest_requests_mobile]
  WHERE
    url CONTAINS "coinhive.min.js") requests
ON
  requests.pageid = pages.pageid
ORDER BY
  pages.rank ASC

We can modify this query to get SpeedIndex and VisuallyComplete from “latest_pages_mobile.”

Can We Get CPU Usage?

My assumption is that cryptocurrency mining is going to really impact CPU usage - and we can get CPU usage from the HAR data in HTTPArchive. While I am at it, I’ll grab the Time to Interactive too:

CREATE TEMPORARY FUNCTION
  getHarEntry(payload STRING,
    field STRING)
  RETURNS INT64
  LANGUAGE js AS """
  try {
    // payload is the page's HAR entry serialized as JSON
    var $ = JSON.parse(payload);
    return $[field];
  } catch (e) {
    // malformed payload: return NULL rather than an empty string
    return null;
  }
""";
SELECT
  url AS url,
  getHarEntry(payload, "_TTIMeasurementEnd") AS tti,
  getHarEntry(payload, "_fullyLoadedCPUpct") AS cpu
FROM
  `httparchive.har.2017_10_15_android_pages`

I saved this result in a table, which I can then join with the first query to learn the TTI and CPU % for each site that mines cryptocurrency.
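The body of the getHarEntry UDF can be sanity-checked outside BigQuery with plain JavaScript. This is only a sketch: the sample payload is made up, and returning null for missing or non-numeric fields is an assumption about the desired behaviour. Only the `_TTIMeasurementEnd` and `_fullyLoadedCPUpct` field names come from the query above.

```javascript
// Local sketch of the UDF body: parse a HAR page payload and pull out one field.
function getHarEntry(payload, field) {
  try {
    const har = JSON.parse(payload);
    const value = har[field];
    // Missing fields and non-numeric values become null (assumed behaviour).
    return typeof value === 'number' ? value : null;
  } catch (e) {
    // Malformed JSON (or a payload that is not an object) also yields null.
    return null;
  }
}

// Hypothetical payload for illustration, not real HTTP Archive data.
const samplePayload = JSON.stringify({
  _TTIMeasurementEnd: 8500,
  _fullyLoadedCPUpct: 45
});

console.log(getHarEntry(samplePayload, '_TTIMeasurementEnd'));
console.log(getHarEntry('not json', '_fullyLoadedCPUpct'));
```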

Timing Metrics

Let’s take a look at the load performance metrics. I have timing stats for all of the sites that mine cryptocurrency. I calculated QUANTILES(metric, 11) to get the percentiles at 10% intervals for each of the speed metrics. The pattern is similar at every decile, so for simplicity, I will only report the median values here.
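For anyone unfamiliar with the decile idea, here is a rough JavaScript sketch of what asking for 11 quantile boundaries means: the minimum, the 10th through 90th percentiles, and the maximum. The function name, the nearest-rank method, and the sample numbers are assumptions for illustration; BigQuery computes its quantiles approximately, so its output will not match this exactly.

```javascript
// Return `buckets` boundary values: minimum, evenly spaced quantiles, maximum.
// With buckets = 11 these are the 0th, 10th, ..., 100th percentiles.
function quantiles(values, buckets) {
  const sorted = [...values].sort((a, b) => a - b);
  const result = [];
  for (let i = 0; i < buckets; i++) {
    // Nearest-rank pick of the element closest to the i-th boundary.
    const index = Math.round((i / (buckets - 1)) * (sorted.length - 1));
    result.push(sorted[index]);
  }
  return result;
}

// Hypothetical SpeedIndex values in milliseconds, not real data.
const speedIndex = [4200, 3100, 9800, 5600, 7400, 6100, 2500, 8800, 5000, 6900, 4700];
const deciles = quantiles(speedIndex, 11);
// The middle of the 11 boundaries is the 50th percentile (the median).
const median = deciles[5];
console.log(deciles, median);
```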

In this chart, I am comparing the median site that mines cryptocurrency to the median value for all mobile sites in the HTTPArchive:
image

Sites with cryptocurrency mining show a small 6% impact on SpeedIndex, but VisuallyComplete and TimeToInteractive are 18.5% and 21% slower. This is not terribly surprising, right? The site is throwing another process on the CPU that we know is going to use as much CPU as possible - and that will impact page loading (especially on a mobile device).

So, how much more CPU is used when JavaScript cryptocurrency mining is going on?
image
The median site with cryptocurrency mining uses roughly 14 percentage points more CPU than the median site in the HTTPArchive (approximately 45% versus 32%).

Coinhive API Parameters

What else can we learn about cryptocurrency mining in the browser? The Coinhive API lists a bunch of different parameters that can be configured in your mining operation. To see these, we need to get the response bodies where the Coinhive JavaScript is called:

SELECT
  page, url, body
FROM [httparchive:har.2017_10_15_android_requests_bodies]
WHERE body CONTAINS "coinhive.min.js"

This is a ~500GB query, so I made sure to save the results in a table to simplify future calculations.
There are only 645 results (from the 1040 pages found).

The first parameter to look for is “isMobile()”, which allows the site running the miner to turn off mining when the customer is on a mobile device. There are only 16 matches (of 645), and a manual check shows that none are used in reference to the CoinHive scripting. So basically - if a site has cryptocurrency mining running - it will mine on your phone as well.

We now know that every site that mines cryptocurrency will do it on your phone… So how hot is your phone going to get? The mining API has parameters to adjust how many threads to use for mining (the default is navigator.hardwareConcurrency, the number of logical processors available). The API also allows you to throttle the usage of the CPUs; the default throttle is 0, meaning the mining threads never idle and 100% of the CPU will be used to mine.

Here is the query I used to measure the percentiles of thread count: https://bigquery.cloud.google.com/savedquery/554669893916:ff156ca38c414d8a9b69bd7d03d865a0

Let’s see what we can learn. Only 157 sites return a value for “thread” based on my regex. This means that the remaining 488 (76% of the 645) will be using all of the available CPUs. Of those 157, 61 set the thread count with a JavaScript expression and 96 use a simple numeric value. Of those 96, the median number of threads specified is 2, but from the 60th percentile onward sites specify 4-8 threads:
image
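The saved query is not reproduced here, but the kind of extraction described above can be sketched in JavaScript. The regular expression, the classifyThreads name, and the sample snippets are all assumptions for illustration; this is not the regex actually used to produce the numbers above.

```javascript
// Assumed pattern: an options object containing `threads: <value>`.
const THREADS_RE = /threads["']?\s*:\s*([^,}\s]+)/;

// Classify a response body by how (or whether) it sets the thread count.
function classifyThreads(body) {
  const match = THREADS_RE.exec(body);
  if (!match) {
    // No explicit setting: the miner falls back to all available processors.
    return { kind: 'default' };
  }
  const n = Number(match[1]);
  return Number.isInteger(n) && n > 0
    ? { kind: 'literal', threads: n }            // simple numeric value
    : { kind: 'expression', source: match[1] };  // computed in JavaScript
}

// Hypothetical page snippets, written for this example only.
const samples = [
  "var miner = new CoinHive.Anonymous('SITE_KEY', {threads: 4, throttle: 0.5});",
  "var miner = new CoinHive.Anonymous('SITE_KEY', {threads: navigator.hardwareConcurrency});",
  "var miner = new CoinHive.Anonymous('SITE_KEY');"
];
const classified = samples.map(classifyThreads);
console.log(classified);
```

Bodies that land in the 'default' bucket correspond to the sites that mine on every available processor.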

How much CPU will these threads use? In the Coinhive API, you can configure the throttle on your mining operation. A throttling value closer to 0 means “mine fast and cook the phone” and a number close to 1 means “mine slowly, so no one will notice.” The default value is 0. In my query for throttling parameters, I find 292 sites that set this parameter. Of course, this implies that 353 (or 55%) of these sites have throttle set to 0.

Of the sites that set a parameter for throttling, there are 221 entries that specify a value for the throttle, and we find that the median site runs at 50% throttle.
image
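To make the thread and throttle settings concrete, here is a back-of-the-envelope JavaScript sketch of how much of a device's CPU a miner could occupy. It assumes throttle is the fraction of time each mining thread sits idle and that each thread can saturate one processor; the function name and the simple model are mine, not part of the Coinhive API.

```javascript
// Rough estimate of the share of total CPU a miner can use.
// cores:    logical processors on the device
// threads:  mining threads requested (assumed default: one per processor)
// throttle: fraction of time each thread idles (assumed default: 0)
function estimatedCpuShare(cores, threads = cores, throttle = 0) {
  // Threads beyond the processor count cannot add more CPU time.
  const busyThreads = Math.min(threads, cores);
  return (busyThreads / cores) * (1 - throttle);
}

console.log(estimatedCpuShare(4));         // defaults on a 4-core phone
console.log(estimatedCpuShare(4, 4, 0.5)); // all cores at the median throttle
console.log(estimatedCpuShare(8, 2, 0.5)); // median thread count and throttle on 8 cores
```

Under this model the default settings take the whole CPU, while the median explicit settings (2 threads, 50% throttle) leave most of it free on a multi-core device.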

Conclusion

There are a small number of websites (1040 on mobile, 1165 on desktop) that mine cryptocurrency using the CoinHive JavaScript library on their home page. Of the 645 whose JavaScript code I was able to examine, over 50% use the default settings of mining on “all available CPUs” with 0 throttling.

The median site using the CoinHive mining library uses roughly 14 percentage points more CPU than the median site in the HTTPArchive dataset, and has a Time to Interactive that is 21% slower.

It may be the case that adding cryptocurrency mining to your site allows you to earn money without using ads. However, the data presented here shows that adding these mining libraries to your site has a performance cost.

NOTE: CoinHive mines Monero, not Bitcoin, and I used these as synonyms. I changed all instances to “cryptocurrency.”


#2

Nice digging. I get the really strong feeling that this is going to be getting worse very quickly. If you’re feeling adventurous there are a few other miners that I had to block to keep people from using WPT as a mining farm:

variants on coinhive:
coin-hive.com coinhive.com cnhv.co

Other miner scripts:
load.jsecoin.com miner.pr0gramm.com crypto-loot.com


#3

Wow really interesting. I’m imagining a transparency report powered by the HTTP Archive that publishes a list of URLs found to contain crypto mining libraries.


#4

Here’s a spreadsheet of all requests matching the domains @patmeenan mentioned including the Alexa rank of the parent pages: https://docs.google.com/spreadsheets/d/1ehxSVoK9sCDwwqQQ6NS07UpkBTLRoySSaQOw1zyWK6w/edit?usp=sharing

1071 unique pages. Up to ~80 requests flagged per page.

Not many surprises in there (mostly porn and stream/pirate/torrent sites), but I did find one .gov domain for Brazil: http://www.cidadao.sp.gov.br/


#5

Way back in the day I wrote about the idea of “collaborative Map-Reduce in the browser”, little did I know… cryptomining would become the leading use case. :expressionless:

@doug_sillars kudos, awesome analysis.


#6

I found an even longer list:


#7

Nice. Here’s an all-inclusive query that gives rank/page/request info:

SELECT
  rank,
  page,
  req.url
FROM
  `httparchive.har.2017_10_15_chrome_requests` AS req
JOIN
  `httparchive.runs.2017_10_15_pages` AS pages
ON
  req.page = pages.url
WHERE
  REGEXP_CONTAINS(req.url, '(cnhv.co|coin-hive.com|coinhive.com|gus.host|load.jsecoin.com|miner.pr0gramm.com|minemytraffic.com|ppoi.org|projectpoi.com|azvjudwr.info|jroqvbvw.info|jyhfuqoh.info|kdowqlpt.info|xbasfbno.info|crypto-loot.com|coinerra.com|coin-have.com|minero.pw|minero-proxy-01.now.sh|minero-proxy-02.now.sh|minero-proxy-03.now.sh|api.inwemo.com|jsecoin.com)')
ORDER BY
  rank ASC

(takes 6.7 GB to process FYI)


#8

Some awesome analysis here! Nice work @doug_sillars!

Expanding on @rviscomi’s query, we can use REGEXP_EXTRACT_ALL to extract the library used in the SELECT clause. The query below shows which cryptocurrency libraries are most popular -

SELECT
  library,
  COUNT(DISTINCT page) library_count
FROM (
  SELECT
    page,
    req.url,
    REGEXP_EXTRACT_ALL(LOWER(req.url), r'(cnhv.co|coin-hive.com|coinhive.com|gus.host|load.jsecoin.com|miner.pr0gramm.com|minemytraffic.com|ppoi.org|projectpoi.com|azvjudwr.info|jroqvbvw.info|jyhfuqoh.info|kdowqlpt.info|xbasfbno.info|crypto-loot.com|coinerra.com|coin-have.com|minero.pw|minero-proxy-01.now.sh|minero-proxy-02.now.sh|minero-proxy-03.now.sh|api.inwemo.com|jsecoin.com)') library
  FROM
    `httparchive.har.2017_10_15_chrome_requests` AS req
  JOIN
    `httparchive.runs.2017_10_15_pages` AS pages
  ON
    req.page = pages.url
  WHERE
    REGEXP_CONTAINS(req.url, '(cnhv.co|coin-hive.com|coinhive.com|gus.host|load.jsecoin.com|miner.pr0gramm.com|minemytraffic.com|ppoi.org|projectpoi.com|azvjudwr.info|jroqvbvw.info|jyhfuqoh.info|kdowqlpt.info|xbasfbno.info|crypto-loot.com|coinerra.com|coin-have.com|minero.pw|minero-proxy-01.now.sh|minero-proxy-02.now.sh|minero-proxy-03.now.sh|api.inwemo.com|jsecoin.com)') ) libraries
CROSS JOIN
  UNNEST(libraries.library) library
GROUP BY
  library
ORDER BY
  library_count DESC

It looks like coinhive is the most prevalent, followed by jsecoin and crypto-loot -
image

Also, it occurred to me that some sites might be using more than 1 cryptocurrency mining library. Running the following query, I was able to see that there are 4 sites using 3 libraries and 270 sites using 2 libraries!

SELECT
  library_count,
  COUNT(*)
FROM (
  SELECT
    page,
    COUNT(DISTINCT library) library_count
  FROM (
    SELECT
      page,
      req.url,
      REGEXP_EXTRACT_ALL(LOWER(req.url), r'(cnhv.co|coin-hive.com|coinhive.com|gus.host|load.jsecoin.com|miner.pr0gramm.com|minemytraffic.com|ppoi.org|projectpoi.com|azvjudwr.info|jroqvbvw.info|jyhfuqoh.info|kdowqlpt.info|xbasfbno.info|crypto-loot.com|coinerra.com|coin-have.com|minero.pw|minero-proxy-01.now.sh|minero-proxy-02.now.sh|minero-proxy-03.now.sh|api.inwemo.com|jsecoin.com)') library
    FROM
      `httparchive.har.2017_10_15_chrome_requests` AS req
    JOIN
      `httparchive.runs.2017_10_15_pages` AS pages
    ON
      req.page = pages.url
    WHERE
      REGEXP_CONTAINS(req.url, '(cnhv.co|coin-hive.com|coinhive.com|gus.host|load.jsecoin.com|miner.pr0gramm.com|minemytraffic.com|ppoi.org|projectpoi.com|azvjudwr.info|jroqvbvw.info|jyhfuqoh.info|kdowqlpt.info|xbasfbno.info|crypto-loot.com|coinerra.com|coin-have.com|minero.pw|minero-proxy-01.now.sh|minero-proxy-02.now.sh|minero-proxy-03.now.sh|api.inwemo.com|jsecoin.com)') ) libraries
  CROSS JOIN
    UNNEST(libraries.library) library
  GROUP BY
    page
  ORDER BY
    library_count DESC )
GROUP BY
  library_count

Many of the sites using 2 libraries were a combination of coinhive and coin-hive, but there were still a handful that used a combination of coin-hive and cryptoloot or coin-hive and jsecoin.
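The counting logic in the query above can be tried on a handful of rows in plain JavaScript. This is a sketch under assumptions: the domain list is shortened, the dots are escaped, only the first match per URL is kept (the SQL extracts all matches), and the request rows are invented for the example.

```javascript
// Shortened, escaped version of the miner-domain pattern (an assumption).
const MINER_RE = /(cnhv\.co|coin-hive\.com|coinhive\.com|crypto-loot\.com|jsecoin\.com)/;

// Map each page to the set of distinct miner domains it requested.
function librariesPerPage(requests) {
  const byPage = new Map();
  for (const { page, url } of requests) {
    const match = MINER_RE.exec(url.toLowerCase());
    if (!match) continue; // not a miner request
    if (!byPage.has(page)) byPage.set(page, new Set());
    byPage.get(page).add(match[1]);
  }
  return byPage;
}

// Count how many pages use 1 library, 2 libraries, and so on.
function libraryCountHistogram(byPage) {
  const counts = {};
  for (const libs of byPage.values()) {
    counts[libs.size] = (counts[libs.size] || 0) + 1;
  }
  return counts;
}

// Hypothetical request rows, not real HTTP Archive data.
const requests = [
  { page: 'http://a.example/', url: 'https://coinhive.com/lib/coinhive.min.js' },
  { page: 'http://a.example/', url: 'https://coin-hive.com/lib/coinhive.min.js' },
  { page: 'http://b.example/', url: 'https://load.jsecoin.com/load' },
  { page: 'http://c.example/', url: 'https://cdn.example/app.js' }
];
const libraryHistogram = libraryCountHistogram(librariesPerPage(requests));
console.log(libraryHistogram);
```

Note that coinhive.com and coin-hive.com count as two libraries here, just as they do in the query results.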


#9

This thread has already influenced one of the affected websites!


#10

30 days later.

In the 10/15 run, there were 1,040 mobile sites with the CoinHive JavaScript embedded. In the 11/15 HTTPArchive run, this has dropped to 759 - a drop of 27%!
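The 27% figure is a plain relative change between the two runs, which a small JavaScript helper can confirm. The function name is just for illustration.

```javascript
// Relative change from `before` to `after`, in percent (negative means a drop).
function percentChange(before, after) {
  return ((after - before) / before) * 100;
}

// Site counts from the 10/15 and 11/15 mobile runs quoted above.
const change = percentChange(1040, 759);
console.log(change.toFixed(1) + '%');
```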

Updating @paulcalvano’s query to 11/15 (and mobile instead of chrome), we see:
image

(there are probably lots of double requests for coinhive - hence the number being higher than 759)


#11

This
https://blog.trendmicro.com/trendlabs-security-intelligence/malvertising-campaign-abuses-googles-doubleclick-to-deliver-cryptocurrency-miners/ reminded me of this post


#16

Since we’ve integrated Wappalyzer into the HTTP Archive, we now have access to their Cryptominer detection category to make this analysis even easier.

Here’s an example query:

SELECT
  SUBSTR(_TABLE_SUFFIX, 0, 10) AS date,
  IF(ENDS_WITH(_TABLE_SUFFIX, 'desktop'), 'desktop', 'mobile') AS client,
  COUNT(0)
FROM
  `httparchive.technologies.*`
WHERE
  category = 'Cryptominer'
GROUP BY
  _TABLE_SUFFIX
ORDER BY
  date DESC,
  client
