I want to know whats the rough ballpark estimate for different third party domains when you insert a tag. I agree that some of these load in async so time spent is not a valid metric and obviously there’s this big debate on what constitutes first party vs third party. With all those caveats in, I write the following query which should give me the avg, median, p75 and p90/95 wall clock times attributed to the top third party domains:
SELECT DOMAIN(req.url) as domain, avg(req.time) as avg, NTH(50, quantiles(time)) p50,
NTH(75, quantiles(time)) p75,
NTH(90, quantiles(time)) p90,
NTH(95, quantiles(time)) p95,count(*) as num_requests
FROM [httparchive:runs.latest_requests] req JOIN (
SELECT DOMAIN(url) self, pageid
FROM [httparchive:runs.latest_pages]
WHERE rank <= 1000
) as pages ON pages.pageid = req.pageid
WHERE DOMAIN(req.url) != pages.self
GROUP BY domain
order by num_requests desc
limit 100;
which yields the following results
Is my method correct? Are there any other things I am missing?