Is there an explanation of the columns?

I am going through httparchive.summary_pages.2021_09_01_mobile in the BigQuery preview and I don’t know what some of the columns mean. Is there documentation somewhere that describes each column?

Unfortunately not. The entire BigQuery dataset is in dire need of documentation, and writing it would be a project unto itself.

Are there particular columns that you’re wondering about? We can try to explain them here in lieu of the docs.

I have been able to guess at the meaning of most of them (and found a few answers on this forum), but I can’t figure out what maxDomainReqs, numGlibs, and usertiming are. I would also love to know the difference between onLoad, fullyLoaded, and visualComplete. I’m pretty sure I know what a lot of the others mean, but it would be nice if there were a list so I could sanity-check myself.

  • onLoad - beginning of the window’s onload event. The load event fires at the end of the document loading process. At this point, all of the objects in the document are in the DOM, and all the images, scripts, links and sub-frames have finished loading. source
  • visualComplete - When the page was 100% visually complete for the first time based on the video capture and comparing page histograms (SpeedIndex completeness) source
  • fullyLoaded - The Fully Loaded time is measured as the time from the start of the initial navigation until there was 2 seconds of no network activity after Document Complete. This will usually include any activity that is triggered by javascript after the main page loads. source
  • maxDomainReqs - The HTTP Archive counts the number of requests made on each domain. The domain with the most requests is the “max domain” and the number of requests on that domain is the “maxDomainReqs”. The average maxDomainReqs value has risen from 47 to 50 over the past year. That’s not a huge increase, but the fact that the average number of requests on one domain is so high is startling. source

Not sure about the others.
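
As a rough illustration of the maxDomainReqs description above, here is a minimal sketch that counts requests per hostname and reports the busiest one. The function name and the input shape (an array of request URLs) are assumptions made for this example; this is not the HTTP Archive pipeline code.

```javascript
// Hypothetical helper, for illustration only (not the HTTP Archive pipeline).
// Takes an array of absolute request URLs for one page.
function maxDomainReqs(requestUrls) {
  const counts = new Map();
  for (const url of requestUrls) {
    // Group requests by hostname.
    const host = new URL(url).hostname;
    counts.set(host, (counts.get(host) || 0) + 1);
  }

  // Find the hostname with the most requests (the "max domain").
  let maxDomain = null;
  let maxReqs = 0;
  for (const [host, n] of counts) {
    if (n > maxReqs) {
      maxDomain = host;
      maxReqs = n;
    }
  }
  return { maxDomain, maxDomainReqs: maxReqs };
}
```

On ties, this sketch keeps whichever hostname reached the top count first; how the real pipeline breaks ties is not something this example claims to know.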

Thanks, that is very helpful.

For more information about the timings, you can visit webpagetest.org, since that is the software that runs the tests, and there you can see the performance of individual sites. It is difficult to beat waterfall diagrams for understanding how slow some websites are.

It’s worth noting that maxDomainReqs is occasionally wrong but good enough. Since HTTP/2, it doesn’t matter as much, because requests to the same domain are multiplexed over a single connection. numDomains is, of course, an indicator of potential data privacy issues.

Charlie

Does anyone know what numGlibs and usertiming are?
@rviscomi maybe you will be able to explain?

Here’s how numGlibs is defined in the pipeline:

if ( FALSE !== stripos($row['req_host'], "googleapis.com") ) {
	$numGlibs++;
}

So it’s a count of the requests whose hostname contains googleapis.com (for example, fonts.googleapis.com).
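
Note that stripos in the PHP above is a case-insensitive substring search, so any hostname containing googleapis.com is counted, subdomains included. A minimal JavaScript sketch of the same check follows; the function name and the array-of-hostnames input are assumptions for this example.

```javascript
// Hypothetical JavaScript equivalent of the PHP check above.
// reqHosts is assumed to be an array of request hostnames for one page.
function countGlibs(reqHosts) {
  let numGlibs = 0;
  for (const host of reqHosts) {
    // Case-insensitive substring match, mirroring PHP's stripos().
    if (host.toLowerCase().includes("googleapis.com")) {
      numGlibs++;
    }
  }
  return numGlibs;
}
```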

usertiming comes from the custom metric defined here:


var perf = ( "undefined" !== typeof( window.performance) ? window.performance : null );
var numMM = 0;
if ( perf && "function" === typeof( perf.mark ) && "function" === typeof( perf.measure ) && "function" === typeof( perf.getEntriesByType) ) {
	numMM = perf.getEntriesByType("mark").length + perf.getEntriesByType("measure").length;
}

return numMM;

It’s the sum of the marks and measures set with the User Timing API.
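
To make that concrete, here is a sketch that wraps the same counting logic in a function and shows how a page might create the entries it counts. The function name and the mark and measure names are made up for this example; the custom metric itself is the snippet above.

```javascript
// Hypothetical wrapper around the same logic as the custom metric above.
// perf is expected to be a Performance-like object (e.g. window.performance).
function countUserTimingEntries(perf) {
  if (
    !perf ||
    typeof perf.mark !== "function" ||
    typeof perf.measure !== "function" ||
    typeof perf.getEntriesByType !== "function"
  ) {
    return 0; // User Timing API not available
  }
  return (
    perf.getEntriesByType("mark").length +
    perf.getEntriesByType("measure").length
  );
}

// Example of a page setting marks and measures (names are illustrative):
//   performance.mark("hero-start");
//   performance.mark("hero-end");
//   performance.measure("hero", "hero-start", "hero-end");
//   countUserTimingEntries(performance);
// On a page with no other marks or measures, that would count
// 2 marks + 1 measure.
```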

Thanks, that helps a lot.