Use of custom elements with attributes

By ask on twitter https://twitter.com/briankardell/status/1100081446528708611

I ran

#standardSQL
SELECT * FROM (
SELECT
  COUNT(*) AS num,
  REGEXP_EXTRACT(r.body, r"(?i)(<[a-z]+-[a-z0-9_-]*\s+(?:[^>/]|/[^>])+>)") AS match
FROM
  `httparchive.response_bodies.2019_02_01_desktop` AS r
WHERE
  page = url
GROUP BY
  match
ORDER BY
  num DESC
) WHERE match IS NOT NULL

See https://gist.github.com/zcorpan/56d1040e1afaa883b610c342c2e7a437

1 Like

I’m not skilled with this and it seems potentially pricey to try to learn on the fly but would it make sense to do

 #standardSQL
SELECT * FROM (
SELECT
  COUNT(*) AS frequency,
  REGEXP_EXTRACT(r.body, r"(?i)(<[a-z]+-[a-z0-9_-]*\s+(?:[^>/]|/[^>])+>)") AS match,
  COUNT(DISTINCT url) AS urls
FROM
  `httparchive.response_bodies.2019_02_01_desktop` AS r
WHERE
  page = url
GROUP BY
  match
ORDER BY
  frequency DESC
) WHERE match IS NOT NULL

So that this matches the same basic format/data of better one in Use of HTML elements ? Maybe not entirely as this needs further processing still to get just the tag name and join up the counts… It feels like these two ‘ideas’ tho would be good ones to run/publish every so often in a place that is easy to find/discuss.