Should we make importScripts faster?

I recently learned that importScripts() is slow when given more than one URL, at least in the Chromium implementation (and in the current spec, which has a pending PR to fix that). When multiple URLs are given, the browser downloads them serially: each download only starts once the previous one has finished, so the whole operation is latency bound.
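To make "latency bound" concrete, here is a rough back-of-the-envelope model. It is only a sketch: it assumes each script costs a fixed fetch time, ignores bandwidth and connection limits, and the function names and timings are made up for illustration.

```python
def serial_fetch_time(fetch_times_ms):
    """Serial downloads: each fetch starts only after the previous one ends."""
    return sum(fetch_times_ms)


def parallel_fetch_time(fetch_times_ms):
    """Idealized parallel downloads: all fetches start at once."""
    return max(fetch_times_ms, default=0)


# Hypothetical timings for three scripts, 100 ms each.
timings = [100, 100, 100]
print(serial_fetch_time(timings))    # 300
print(parallel_fetch_time(timings))  # 100
```

In this simplified model the serial cost grows with the number of URLs, while the parallel cost stays close to the slowest single fetch. The scripts would still need to execute in order either way.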

A discussion ensued, during which folks asked: “Should we even bother optimizing them? Or can we just tell folks to move to module workers?”

IMO, a large part of the equation is optimizing existing content. If we made importScripts faster, how much content would benefit from that?
So, as you can probably guess from the fact that I’m writing this here, I turned to HTTP Archive (HA) for help (with @rviscomi’s assistance, because my regex skills are… not great).

I wanted to search for response bodies that contain importScripts(...), and then count the instances that have more than a single URL in them (and would therefore benefit from the optimization).

Here’s the query I ended up with:

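SELECT *
FROM (
  SELECT
    URL AS URL,
    -- Approximate the number of URLs by splitting the captured arguments on commas.
    ARRAY_LENGTH(SPLIT(import, ',')) AS num_urls
  FROM (
    SELECT
      bodies.url AS URL,
      -- Capture the argument list of every importScripts(...) call in the body.
      REGEXP_EXTRACT_ALL(bodies.body,
                         r'importScripts\(([^)]*)\)') AS imports
    FROM `httparchive.response_bodies.2020_04_01_mobile` AS bodies
    -- Cheap pre-filter before running the regex.
    WHERE bodies.body LIKE '%importScripts(%'
  ), UNNEST(imports) AS import
)
-- Keep only the calls that pass more than one URL.
WHERE num_urls > 1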
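For anyone who wants to sanity-check the matching logic locally before spending BigQuery quota, here is a small Python sketch that applies the same regex and comma-splitting heuristic to a made-up script body. The function names and the sample body are hypothetical. The heuristic is only an approximation: a comma inside a URL string, or a nested call within the arguments, would throw the count off.

```python
import re

# Same pattern as in the query: capture everything between the parentheses.
IMPORT_SCRIPTS_CALL = re.compile(r'importScripts\(([^)]*)\)')


def count_urls_per_call(body):
    """Return the comma-separated argument count of each importScripts(...) call."""
    return [len(args.split(',')) for args in IMPORT_SCRIPTS_CALL.findall(body)]


def multi_url_calls(body):
    """Keep only the calls that appear to pass more than one URL."""
    return [n for n in count_urls_per_call(body) if n > 1]


# Hypothetical worker script body.
sample = '''
importScripts("a.js");
importScripts("b.js", "c.js", "d.js");
'''
print(count_urls_per_call(sample))  # [1, 3]
print(multi_url_calls(sample))      # [3]
```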

While the final query does plow through 12.4TB of data, the (many) iterations on that query were done using the httparchive.sample_data.response_bodies_desktop_10k table, which is significantly smaller.

The results show over 30K current worker scripts that would benefit from that optimization! That, in my book, means making it faster is worth our while.

How many pages are affected? If we assume each matched URL corresponds to a distinct page, that’s ~0.6% of pages. But a single page can probably have multiple matching URLs, so the actual number of pages would be lower than that.

I changed the query to include the page. It looks like this would affect 8,116 unique pages.
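The modified query isn't shown here, but the idea of counting pages rather than matched URLs can be sketched in Python. This assumes each result row carries a (page, url) pair. The rows below are invented for illustration and are not taken from the actual results.

```python
def count_unique_pages(rows):
    """Count distinct pages among (page, url) result rows."""
    return len({page for page, _url in rows})


# Hypothetical rows: two matching scripts on one page, one on another.
rows = [
    ("https://example.com/", "https://example.com/sw.js"),
    ("https://example.com/", "https://cdn.example.com/worker.js"),
    ("https://example.org/", "https://example.org/sw.js"),
]
print(len(rows), count_unique_pages(rows))  # 3 2
```

Because one page can load several matching scripts, the page count can only be equal to or lower than the row count.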

Thanks! So assuming 5m pages, it’s ~0.16%. Still seems good to optimize :slight_smile:
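For reference, the percentages in this thread can be reproduced with a quick calculation. The 5 million page total is the assumption stated above, 30,000 is a round stand-in for "over 30K", and the helper name is made up.

```python
def share_of_pages(affected, total_pages):
    """Return the affected count as a percentage of all pages."""
    return 100.0 * affected / total_pages


TOTAL_PAGES = 5_000_000  # assumed total, as stated in the thread

print(round(share_of_pages(30_000, TOTAL_PAGES), 2))  # 0.6
print(round(share_of_pages(8_116, TOTAL_PAGES), 2))   # 0.16
```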