I was just playing around with sessionizing the data by crawlid, since it's available, to see whether we can infer how the average crawl duration has been changing over time. This is what it looks like initially:
A crawl took around 4 days in 2013 and currently takes around 9 days to finish, based on the current data. I think the reason for the increase is the growing number of hosts to be tracked.
To vet that hypothesis, given that the mobile page crawls are substantially smaller, let's compare the crawl time for those. I see it's roughly the same as for desktop.
So essentially the crawl duration (defined as the max time for a crawl id minus the min time for the same crawl id) seems independent of the number of hosts to monitor. Is that right?
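
For anyone who wants to reproduce the number, here is a minimal Python sketch of that duration definition (max timestamp minus min timestamp per crawl id). The function name `crawl_durations`, the `(crawl_id, timestamp)` row shape, and the sample rows are all assumptions made up for illustration; they are not the real schema or real data.

```python
from datetime import datetime


def crawl_durations(rows):
    """Return {crawl_id: timedelta} where each value is max(ts) - min(ts).

    `rows` is assumed to be an iterable of (crawl_id, timestamp) pairs;
    this shape is hypothetical, not the actual table layout.
    """
    first_seen = {}
    last_seen = {}
    for crawl_id, ts in rows:
        # Track the earliest and latest timestamp seen for each crawl id.
        if crawl_id not in first_seen or ts < first_seen[crawl_id]:
            first_seen[crawl_id] = ts
        if crawl_id not in last_seen or ts > last_seen[crawl_id]:
            last_seen[crawl_id] = ts
    return {cid: last_seen[cid] - first_seen[cid] for cid in first_seen}


# Made-up sample rows, only to exercise the function.
rows = [
    (1, datetime(2013, 1, 1)),
    (1, datetime(2013, 1, 3, 12)),
    (1, datetime(2013, 1, 5)),
    (2, datetime(2013, 1, 15)),
    (2, datetime(2013, 1, 24)),
]

durations = crawl_durations(rows)
for cid, delta in sorted(durations.items()):
    # Express each duration in fractional days.
    print(cid, delta.total_seconds() / 86400)
```

Note that this definition only measures the wall-clock span between the first and last record of a crawl, so any idle gap or pause inside a crawl counts toward its duration.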