Data issues for 06/01/16 - 08/15/16 runs (desktop and mobile)

Between Jun 1 and Aug 15 we had a bug in the agent which affected TLS pages that had keep-alive disabled, full details…

The WPT agent was using the SSL session pointer as an index for a session/connection, but it did not cope well when the same pointer was re-used. The bug only showed up on TLS pages that had keep-alive disabled: some requests were shown as not being TLS, and their response data overlapped with the end of the body from the previous response.
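To make the failure mode concrete, here is a minimal sketch of how keying per-connection state on a pointer can leak data between connections once that pointer is handed out again. This is not the WPT agent's code: `PointerKeyedTracker`, `simulate`, and the `reset_on_open` flag are hypothetical names, and the assumption that stale buffers were the mechanism is only my reading of the description above.

```python
class PointerKeyedTracker:
    """Hypothetical tracker that keys per-connection state on a raw pointer value."""

    def __init__(self, reset_on_open):
        # reset_on_open=False models the buggy behaviour (stale state is kept),
        # reset_on_open=True models a fix (state is replaced on every new connection).
        self.reset_on_open = reset_on_open
        self.buffers = {}

    def open(self, ptr):
        # A new connection starts. With keep-alive disabled this happens for
        # every request, so a freed pointer value can be handed out again.
        if self.reset_on_open or ptr not in self.buffers:
            self.buffers[ptr] = bytearray()

    def data(self, ptr, chunk):
        # Append received bytes to whatever state is stored under this pointer.
        self.buffers[ptr] += chunk

    def response(self, ptr):
        return bytes(self.buffers[ptr])


def simulate(reset_on_open):
    tracker = PointerKeyedTracker(reset_on_open)
    ptr = 0x7F00  # the same address is used for two consecutive connections

    tracker.open(ptr)
    tracker.data(ptr, b"first body")

    tracker.open(ptr)  # second connection re-uses the pointer
    tracker.data(ptr, b"HTTP/1.1 200 OK")

    return tracker.response(ptr)


if __name__ == "__main__":
    print(simulate(reset_on_open=False))  # second response starts with the old body
    print(simulate(reset_on_open=True))   # second response is clean
```

In the buggy mode the second response begins with the tail of the first one, which matches the kind of overlap described above.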

This was fixed a few weeks back. The Sept 1st run had the fix deployed and, as far as I can tell, the issue should be resolved.

The side effect of the above is that some responses were incorrectly logged by our agents and resulted in import failures of both CSV and HAR data. Unfortunately we can’t “fix” those responses. The best we can do is drop them for the time period, which is what we’ve done for the CSV imports.

  • [httparchive.runs] (CSV): bad records are dropped. This means the number of reported requests is less than the actual number of requests made by the browser. However, the missing data is on the order of a few thousand requests (out of ~51M).
  • [httparchive.runs] (HAR): the bad records are still there. If you’re extracting {"response": {"headers": [...]}} data, you may want to add additional checks for this time period to catch bad results.
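For the HAR case, one possible sanity check is sketched below. It is a heuristic, not an official filter: `response_looks_valid` is a hypothetical helper, and it assumes that a corrupted record shows up as a missing or malformed `headers` list, a header name that is not a valid HTTP token, or a status code outside 100-599. Note that it would also reject entries with status 0, which some HAR producers use for failed requests, so adjust to taste.

```python
import re

# Characters allowed in an HTTP header field name (the "token" grammar).
TOKEN = re.compile(r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+")


def response_looks_valid(entry):
    """Return True if a HAR entry's response passes basic structural checks."""
    response = entry.get("response")
    if not isinstance(response, dict):
        return False

    # Assumption: a usable record has a numeric status in the normal HTTP range.
    status = response.get("status")
    if not isinstance(status, int) or not 100 <= status <= 599:
        return False

    headers = response.get("headers")
    if not isinstance(headers, list):
        return False

    for header in headers:
        if not isinstance(header, dict):
            return False
        name, value = header.get("name"), header.get("value")
        if not isinstance(name, str) or not isinstance(value, str):
            return False
        # Body bytes that leaked into the header block will rarely form a valid name.
        if not TOKEN.fullmatch(name):
            return False
    return True


if __name__ == "__main__":
    good = {"response": {"status": 200,
                         "headers": [{"name": "Content-Type", "value": "text/html"}]}}
    bad = {"response": {"status": 200,
                        "headers": [{"name": "</html>HTTP/1.1 200 OK", "value": ""}]}}
    print(response_looks_valid(good), response_looks_valid(bad))
```

Applying a check like this only to entries from the 06/01/16 - 08/15/16 runs keeps the rest of the data untouched.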

Overall, I don’t believe the above should result in significant skew of any metrics (the affected number of rows is small compared to total number of requests), but you may nonetheless want to be careful with data from this (06/01/16 - 08/15/16) date range.

Last but not least, if you noticed delayed availability / missing tables / missing data over the last few months, this is why… apologies! It took a while to track this one down. The good news is, everything should now be back on track.
