So, I took a look at last year's Iditarod. The first thing I did was to plot speed against bib number:
If there's a relationship, it certainly did not pop out as clearly as it did in the Copper Basin. So, I ran the actual numbers. Using the R statistical package, I ran a Kendall's rank correlation (tau) test, which has the advantage of not making assumptions about the underlying distribution (for example, that the underlying data have a "normal" distribution). In a nutshell, nope, there does not appear to be a relationship between bib number and speed in these particular data, or at least not one that cannot be explained as the result of random fluctuation. Specifically, the results of the run are:
    Kendall's rank correlation tau

    data:  y$Bib and y$speed
    z = -1.8238, p-value = 0.06818
    alternative hypothesis: true tau is not equal to 0
    sample estimates:
           tau
    -0.1557844
So, with a p-value of 0.06818, the result does not meet a 0.05 significance criterion. The estimated tau of about -0.16 is slightly negative, but not by enough to rule out chance.
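For anyone who wants to try this kind of test themselves, here is a minimal sketch of how a Kendall's tau test can be run in R with cor.test(). The data frame below is simulated purely for illustration and is not the actual race data; the column names Bib and speed simply mirror the ones in the output above, and the bib range and speed values are assumptions.

```r
# Simulated stand-in data: NOT the real Iditarod results.
# Bib range and speeds are made up for illustration only.
set.seed(1)
y <- data.frame(Bib = 2:70)
y$speed <- 9 + rnorm(nrow(y), sd = 0.8)  # hypothetical mph to the first checkpoint

# Kendall's rank correlation test. exact = FALSE asks for the normal
# approximation, so the test statistic is reported as z.
result <- cor.test(y$Bib, y$speed, method = "kendall", exact = FALSE)
print(result)

# Compare the p-value against a 0.05 significance criterion.
significant <- result$p.value < 0.05
```

The printed result has the same layout as the output above, with the tau estimate available as result$estimate and the p-value as result$p.value.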
It doesn't appear likely that distance explains why the pattern shows up in one race and not the other. You'd reasonably expect that a shorter trail would see greater impacts, but there's not that much difference in distance between the start and first checkpoints in the two races (42 miles in Iditarod, 50 miles in Copper Basin). So, there's still a bunch of work to do. It may have something to do with trail differences, or weather, or ... ? I'll be taking a look at the 2014 numbers after the last team is into Yentna this weekend, but over the longer term I'd like to aggregate a bunch of years and see what falls out.