I’m interested in analyzing which (moreso how many, but for this query just which) URLs from a given CDN belong to a certain country. I figured the easiest way to do this using the archive would be by utilizing the lang attribute, but the following query doesn’t appear to return the proper results. Any ideas on how to fix this, or just any better ways to go about detecting country?
#standardSQL
SELECT
t1.url, t2.cdn, t1.service1
FROM
(SELECT
A.url,
JSON_EXTRACT_SCALAR(report, "$.audits.valid-lang.score") AS score
FROM `httparchive.lighthouse.2017_06_01_mobile` as A) as t1
INNER JOIN
(SELECT
B.url, B.cdn,
FROM `httparchive.summary_pages.2017_06_01_mobile` as B
) as t2
ON t1.url = t2.url
WHERE
score IS NOT NULL
AND
cdn LIKE 'Akamai'
ORDER BY
service1 ASC
The query runs but it’s either empty or the third column is null.