Share your questions, comments, and ideas about the Performance chapter of the Web Almanac…
Hi Rick. If I want to use SQL to query BigQuery to look at percentiles for a given metric, for example TTI, should I use the queries here: https://github.com/HTTPArchive/almanac.httparchive.org/tree/master/sql/2019/07_Performance ? I ask because I’ve been looking at the PSI and Lighthouse scoring documentation and BigQuery SQL in an attempt to better understand performance analysis across a range of Google speed tools, and something is bugging me. At https://web.dev/interactive/ the scoring for TTI, for example, is nicely explained. However, when I run this query https://github.com/HTTPArchive/almanac.httparchive.org/blob/master/sql/2019/07_Performance/07_16.sql I expect to get percentile results that pretty much match those mentioned on the TTI scoring page above, but they never do. Often, not even close. I appreciate that they change somewhat over time. I’d just like to know: are the queries actually used by the Lighthouse tool for metrics like TTI, FCP, SI, etc. listed anywhere?
I don’t think the Lighthouse queries are necessarily kept up to date. The web.dev resource you cited links to https://httparchive.org/reports/loading-speed#ttci, which shows the coarse distribution of TTI measured by HTTP Archive. The most recent data is from May 2019 (which is an issue) but the p25 value of 5 seconds aligns with the LH threshold for “fast”. They seem to be inverting the percentiles so that the lower, faster percentiles get higher scores, for example if you’re in the 25th percentile then you get a score of 75. Maybe that’s causing the confusion.
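To make the inversion concrete, here’s a minimal Python sketch. This is my own illustration of the idea described above, not Lighthouse’s actual scoring curve (which is more nuanced than a straight inversion):

```python
# Illustrative only: map a site's percentile rank within the
# HTTP Archive distribution of a metric (e.g. TTI) to a
# Lighthouse-style 0-100 score by inverting it, so that the
# lower/faster percentiles get the higher scores.

def percentile_to_score(percentile: float) -> float:
    """Invert a percentile rank (0-100) into a 0-100 score."""
    if not 0 <= percentile <= 100:
        raise ValueError("percentile must be between 0 and 100")
    return 100 - percentile

# A site at the fast 25th percentile gets a score of 75,
# matching the example in this thread.
print(percentile_to_score(25))  # 75
```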
Also, you can see the results for all of the queries used by the Performance chapter in this sheet: https://docs.google.com/spreadsheets/d/1zWzFSQ_ygb-gGr1H1BsJCfB7Z89zSIf7GX0UayVEte4/edit#gid=1524152444
These results were taken in July 2019 and are still roughly aligned with the May results and the LH thresholds, although the median is higher than 7.3 seconds now.
Thanks for your prompt reply. I realised the percentiles were inverted. Thanks for confirming that I am using the correct queries at https://github.com/HTTPArchive/almanac.httparchive.org/tree/master/sql/2019/07_Performance . It’s a shame if they are not kept up to date. I posted the question because I wanted to make sure I wasn’t using the wrong set of queries. The information at web.dev is a little confusing. For the majority of metrics there is no mention of sampling, but at https://web.dev/lighthouse-total-blocking-time/ Google says the TBT percentiles are taken from the top 10,000 sites, so I wasn’t sure if there was a different set of queries that took data from a sample of the HTTP Archive dataset. It would be helpful if the percentile/scoring grids on the per-metric pages beneath https://web.dev/lighthouse-performance/ were updated dynamically, if the queries were identical to those used by Lighthouse, and if the Google Sheet you shared were permanently available and updated regularly with the most recent percentile ranges. It would add some transparency to the current state of the web performance data that Lighthouse uses. Many thanks for your help and advice.
One possible issue with keeping the thresholds in sync with the live data is that sites’ scores would change unexpectedly, even if all else stayed the same. What matters most is that users are experiencing good performance and to them a site doesn’t feel any faster or slower just because the distribution of other sites’ performance changed.
I thought that Lighthouse did in fact monitor the performance of sites in the HTTP Archive, and that performance scores could fluctuate because the values of specific metrics, such as TTI at the 75th percentile, would change over time. Is this not the case, then? If not, how often are the percentile benchmarks updated? An example of the benchmarking is on this page: https://web.dev/interactive/
LH did a one-time study of the distribution of scores using HTTP Archive, but they’re not necessarily monitoring the month-to-month results. The study was to make an informed decision about where to put the goal posts for fast/avg/slow performance. Even though the underlying distribution may change over time, the goal posts stay relatively consistent so developers have predictable experiences. You can imagine the confusion if a developer improved TTI but their score went down because the improvement didn’t keep pace with the rest of the web.
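As a sketch of the fixed-goal-post idea: the thresholds are frozen constants rather than live percentiles, so a site’s rating only changes when its own metric changes. This is my own illustration; the 5-second “fast” cutoff comes from the p25 figure mentioned earlier in this thread, and the 12.5-second “slow” cutoff is purely an invented placeholder, not a real Lighthouse threshold:

```python
# Hypothetical fixed goal posts for TTI, in seconds.
FAST_TTI_SECONDS = 5.0    # assumed from the p25 figure in this thread
SLOW_TTI_SECONDS = 12.5   # placeholder value for illustration only

def rate_tti(tti_seconds: float) -> str:
    """Classify a TTI value against frozen thresholds, so the rating
    doesn't shift just because the rest of the web's distribution did."""
    if tti_seconds <= FAST_TTI_SECONDS:
        return "fast"
    if tti_seconds <= SLOW_TTI_SECONDS:
        return "average"
    return "slow"

# Improving TTI from 6 s to 4.5 s moves a site from "average" to
# "fast" regardless of how other sites' performance changed.
print(rate_tti(6.0), "->", rate_tti(4.5))
```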
The LH team may periodically fine-tune things like audit weighting, primary metrics, and metric thresholds as needed to prioritize the things developers should care most about. AFAIK this is done about once a year, if even that often.
Ahh… that information is gold. Thank you very much for the explanation. I recommend including this on the Google pages that reference Lighthouse scoring; at present, it isn’t mentioned anywhere.