HTTP Keep-Alive analysis

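The HTTP “Keep-Alive” header is an optional header with which a server advertises its connection management policy through two attributes, “timeout” and “max” (for example, Keep-Alive: timeout=5, max=100).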
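
As a rough illustration of what the queries below pull out of this header, here is a minimal Python sketch that extracts both attributes with the same kind of regular expression. The function name `parse_keep_alive` and the sample header value are assumptions made for this example; they are not part of the original analysis.

```python
import re

def parse_keep_alive(value):
    """Return (timeout, max) from a Keep-Alive header value; None where absent."""
    def _param(name):
        # Same idea as REGEXP_EXTRACT(resp_keep_alive, r'<name>=(\d+)')
        m = re.search(r'\b' + name + r'=(\d+)', value or '')
        return int(m.group(1)) if m else None
    return _param('timeout'), _param('max')

# Hypothetical sample value, for illustration only
print(parse_keep_alive('timeout=5, max=100'))
```

A header that carries only one of the two attributes yields `None` for the other, which mirrors the NULL that the regex extraction returns in the queries.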

~22% of responses contain the “timeout” attribute. The query and the distribution are as follows:

SELECT
  REGEXP_EXTRACT(resp_keep_alive,r'timeout=(\d+)') timeout,
  COUNT(*) cnt
FROM [httparchive:runs.latest_requests]
GROUP BY timeout;

That’s pretty aggressive! ~65% of responses specify a <5s timeout (yikes), and timeouts of up to 30s account for 95%. I guess this is not surprising given that Apache httpd 2.2 uses a 5s default (KeepAliveTimeout), while nginx defaults to 75s (keepalive_timeout).

~19% of responses contain the “max” attribute, with the “100” default being by far the most prevalent value (85%). The query and the distribution are as follows (note that I’m rounding the values up into buckets of 100):

SELECT maxr, COUNT(*) AS cnt FROM (
  SELECT CEIL(INTEGER(REGEXP_EXTRACT(resp_keep_alive,r'max=(\d+)'))/100)*100 maxr
  FROM [httparchive:runs.latest_requests]
) GROUP BY maxr;
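
To make the rounding concrete, here is a small Python sketch of the same arithmetic as the CEIL(.../100)*100 expression in the query above. The helper name `bucket` and the sample inputs are assumptions chosen for illustration, not values from the dataset.

```python
import math

def bucket(max_value, width=100):
    """Round up to the nearest multiple of `width`, like CEIL(x/100)*100."""
    return math.ceil(max_value / width) * width

# Illustrative inputs only
for v in (0, 1, 100, 101, 250):
    print(v, '->', bucket(v))
```

With this rounding, only max=0 lands in the 0 bucket, while 1 through 100 share the 100 bucket, so a non-empty 0 bucket always means literal max=0 values.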

Wondering what max=0 means? Yeah, that doesn’t make much sense, does it…

SELECT * from (
  SELECT pageid, resp_keep_alive, 
    HOST(url) as host, 
    INTEGER(REGEXP_EXTRACT(resp_keep_alive,r'max=(\d+)')) as maxr
  FROM [httparchive:runs.latest_requests]
) WHERE maxr = 0

Thankfully, it’s just one misconfigured server (all 300 URLs):