Friday, January 31, 2014

2012, from the start to Two Rivers

Here's a quick summary of times from the start to Two Rivers in 2012:

Times ranged from 7:15 (7 hours and 15 minutes) to 12:51, with average speeds from 5.6 mph to 9.93 mph.  The fastest time was Hugh Neff's.  I would not expect times tomorrow to be quite as fast, given trail conditions.  The average (mean) time was 10:01:49, but the distribution was skewed and the median time was 10:50 (a mean speed of 7.4 mph and a median speed of 6.65 mph).

Here's a histogram of the speeds:

As you can see, most of the speeds were clustered around the lower end of the scale, but a few speedsters pulled the average up.

Here's a complete table of the average speeds from the start to Two Rivers:

Musher Name            Speed (mph)
Kyla Durham            5.6
Jason Weitzel          5.95
Misha Pedersen         6.07
Maren Bradley          6.12
Abbie West             6.23
Paige Drobny           6.3
Allen Moore            6.38
Brian Wilmshurst       6.42
Michael Telpin         6.43
Mike Ellis             6.49
Joar Leifseth Ulsom    6.65
Sonny Lindner          6.24
Brent Sass             7.13
Yuka Honda             7.32
Nikolay Ettyne         7.59
Marcelle Fressineau    8.45
Trent Herbst           8.71
Jake Berkowitz         8.8
David Dalton           9.27
Gus Guenther           9.33
Lance Mackey           9.6
Kristy Berington       9.8
Hugh Neff              9.93
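
If you'd like to check the summary figures yourself, here's a minimal sketch in Python (any stats tool would do the same job).  The speeds are copied straight from the table; the variable names are just mine for illustration.

```python
from statistics import mean, median

# Average speeds (mph) copied from the table above, in table order.
speeds_mph = [
    5.6, 5.95, 6.07, 6.12, 6.23, 6.3, 6.38, 6.42, 6.43, 6.49, 6.65, 6.24,
    7.13, 7.32, 7.59, 8.45, 8.71, 8.8, 9.27, 9.33, 9.6, 9.8, 9.93,
]

mean_speed = mean(speeds_mph)
median_speed = median(speeds_mph)

# A mean sitting well above the median is what a right-skewed distribution looks like.
print(f"teams:        {len(speeds_mph)}")
print(f"mean speed:   {mean_speed:.2f} mph")
print(f"median speed: {median_speed:.2f} mph")
```

The mean coming out higher than the median is the "few speedsters pulling the average up" effect in numeric form.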

Returning mushers (in order of speed) are Hugh Neff, Dave Dalton, Brent Sass, Mike Ellis, Brian Wilmshurst, and Allen Moore.

2014 Yukon Quest

The 2014 Yukon Quest starts tomorrow, with a small field and a tough trail.  While the field is small it's pretty strong, and there's nobody who pops out as likely to get into a lot of trouble (note that "likely to get into a lot of trouble" is not the same as "likely to scratch").  At this point we know that the race will be starting on 2nd Avenue in Fairbanks rather than on the river, and will be ending at Takhini Hot Springs rather than Whitehorse, in both cases due to poor ice on the rivers.

Something new this year that should be a lot of fun is a feature in the "starter" auction: auction winners will ride in a tag sled behind their musher from the start chute to a drop-off point a few miles up the river, on Fort Wainwright.  All of the money raised by the auction is going directly into the purse.

I think that from a spectator perspective it ought to be a very good race.  For one thing, the coverage should be excellent.  The outstanding Suitcase Media crew will be back, providing photographs and video from the trail, and both KUAC and the News-Miner will have reporters covering the race from the trail as well.  Tracking reliability should be much improved, with Trackleaders now putting two trackers on each sled and splicing the data streams together.

Note that I've got a running Twitter search for Quest-related tweets in a panel on the right side of this page.  Twitter can be an excellent place to pick up information not available elsewhere.

The Quest has a new website, which means that they'll also have a new data display.  I have no idea what to expect as it's not up yet, but I'm hoping that it's going to be data-crunching friendly.  In the meantime I'll be keeping (and publishing) my own spreadsheets, as usual, so watch for that information to be posted.

One piece of really good news for fans is that the Quest 300 is going to be tracked.  It should be a top-notch race this year.  As usual there are folks doing the race as a Quest or Iditarod qualifier, but we've also got some ringers in there.  With Aliy Zirkle and Sebastian Schnuelle in there, as well as a few less well-known but fast teams (keep an eye on Heidi Sutter and Shaynee Seipke, and while Ryne Olson is running a young team she could do quite well), it could be quite an exciting race.

So here's what I'm thinking right now ...  The first thing is that the trail is very hard and very fast (Scott Chesney has described our trails in Two Rivers as "frictionless") and frankly I'm expecting to see a lot of dropped dogs.  I also think that mushers who provide excellent dog care will have an advantage.  Teams that go out fast and hard may regret it later.

Anything can happen (and probably will!), but it looks like Allen Moore is going into the race in excellent shape and is a good bet to repeat.  Brent Sass and Ken Anderson could both have an excellent year, and Mike Ellis is going to surprise a few people.  Unless Matt Hall overdoes the first half of the race, I think a lot of people outside of Alaska, or who don't follow the mid-distance races, are likely to be wondering where that Matt Hall guy came from by the time the race is entering the Yukon.  Also keep an eye on Cody Strathe - Team Squid had an excellent Copper Basin, another tough race with difficult trail conditions.

Purebred fans should note that there are three Siberian Husky teams in the race this year, which is a relatively high percentage.  The teams are Mike Ellis, Tony Angelo, and Hank DeBruin.  Both Hank and Mike are race veterans, and this is Tony's first Quest.

This is also a race where learning what's available from the Trackleaders interface could provide a lot of insight into what's happening on the trail, insight that's not available from just looking at locations on a map.  Mushers' individual pages can give you insight into run/rest schedules and a better handle on speeds while moving, and as always the race flow plot allows you to get some understanding of the dynamics of the race.

Saturday, January 11, 2014

The relationship between speed and bib number

The Copper Basin 300 is underway, with nearly all teams having arrived at the first checkpoint.  It looks like the race organization is doing an excellent job, and while most people don't care about this sort of thing I'm grateful that their race data are in a form that requires nearly no massaging to be useful for analytical purposes.

Anyway, one of the things that's come up is that people on the CB300 Facebook page have expressed surprise that Nic Petit is as fast as he is, and there's been a general sense that it has something to do with trail conditions.  That trails deteriorate with traffic is well-known and just a given, but I was surprised that they were surprised that someone whose nickname is "Quick Nic," who was Iditarod rookie of the year in 2011 and who took 6th in last year's Iditarod, would be fast.  So, I decided to take a closer look at whether or not the numbers support their argument.

I use the R statistical package for analysis.  R is an open source, free, and widely-used statistical tool, and a follow-on to Bell Labs's S package (I know, right?).  I've been running it inside the RStudio development environment, which packages a bunch of tools together in an easy-to-use manner that really boosts productivity.

So, the first thing I did was plot speed against bib number:

Eyeballing it, it certainly looked like there was a negative correlation: that is to say, low bib numbers tended to have higher speeds than high bib numbers.  Did the actual numbers support it?  As it turns out they did, with a Pearson correlation coefficient (r) of -0.43 and a p-value of 0.0037.  That's a moderately strong correlation, and one that's unlikely to be the result of random fluctuation.

Then I decided to re-run it without Nic, since it's just a coincidence that "Quick Nic" drew bib #3 and that could have impacted the results quite a bit.  Without Nic we ended up with a correlation coefficient of -0.38 and a p-value of 0.0119, so the correlation is not quite as strong but is still present.
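
For readers who haven't computed a correlation coefficient by hand, here's a rough sketch of the calculation in Python (not the R session I ran).  The bib numbers and speeds below are HYPOTHETICAL, invented purely to show the mechanics of running the numbers with and without one team; they are not the CB300 data, and the function and variable names are mine.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# HYPOTHETICAL data, invented for illustration only (not the CB300 results).
bibs = [3, 5, 8, 12, 17, 21, 26, 30, 34, 40]
speeds_mph = [11.9, 9.4, 9.8, 9.1, 9.5, 8.7, 9.0, 8.9, 8.4, 8.6]

r_all = pearson_r(bibs, speeds_mph)
# Drop the first (lowest-bib, fastest) team and recompute.
r_without_first = pearson_r(bibs[1:], speeds_mph[1:])

print(f"r with every team:     {r_all:.2f}")
print(f"r without bib 3 team:  {r_without_first:.2f}")
```

This only produces r; the p-value needs the t distribution as well, which a stats package will handle for you.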

I have a pile of deadlines hanging over me and some undone work, but when I get a moment I'd like to look at some other races (for example, last weekend's Knik 200, where fast mushers were more evenly distributed through the bib number space) and also take a look at whether or not the correlation we're seeing here deteriorates over the course of the race.  I expect it to end up close to 0, but my intuitions can be very, very wrong.  In the meantime, this is kind of interesting.

Saturday, January 4, 2014

More on Jake Berkowitz's "jaggies"

If you're following the Knik 200 GPS trackers, you may have noticed some erratic behavior in Jake Berkowitz's line in the race flow chart.  Here's an example:

(Jake is the purple line).  And if you've looked very closely at his track you may have noticed some oddness in the timestamps.  For example, readings 32 and 33 have timestamps 16 seconds apart but appear to claim that the distance covered during those 16 seconds was about 0.25 miles, which isn't possible (yes, Jake's got a very fast team, but not that fast).  After my last blog post I heard from Trackleaders with an explanation of what they're doing and why this looks odd.
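
To put a number on "not that fast," here's the arithmetic as a tiny Python sketch.  The function name is mine; the only inputs are the roughly quarter-mile distance and the 16-second gap between readings 32 and 33.

```python
def implied_speed_mph(distance_miles, elapsed_seconds):
    """Average speed, in mph, needed to cover a distance in the given time."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return distance_miles * 3600 / elapsed_seconds

# Readings 32 and 33: about 0.25 miles apart, timestamps 16 seconds apart.
speed = implied_speed_mph(0.25, 16)
print(f"{speed:.2f} mph")  # prints "56.25 mph"
```

That's highway speed, so the timestamps have to be what's off, not the dogs.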

They've been working on having one team carry two trackers, providing extra reliability (and hopefully accuracy) during events.  The Knik is the first time that they've experimented with using two different types of tracking devices, putting both a SPOT Gen 3 and a SPOT Trace on Jake's sled.  They're splicing the two data sources together into one data stream.  It appears that the Gen 3 and the Trace treat timestamps differently from one another, which has introduced some inconsistency into the times on the readings, and that in turn has necessarily affected the race flow chart since the X axis is time.
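
I don't know how Trackleaders actually does the splice, but a toy example shows why a timestamp mismatch between two devices would produce jaggies.  Everything in this Python sketch is HYPOTHETICAL: the readings, the steady 9.6 mph pace, and the 60-second skew are made up, and the names are mine.

```python
from heapq import merge

# HYPOTHETICAL readings as (timestamp_seconds, trail_miles).  The team moves at
# a steady 9.6 mph, so the true track is a straight line on a time/distance plot.
gen3 = [(0, 0.0), (300, 0.8), (600, 1.6)]

# The second device really took its readings at t=150 and t=450, but in this
# made-up scenario it reports timestamps that are 60 seconds early.
CLOCK_SKEW_SECONDS = -60
trace = [(150 + CLOCK_SKEW_SECONDS, 0.4), (450 + CLOCK_SKEW_SECONDS, 1.2)]

# Splice the two already-sorted streams into one, ordered by reported timestamp.
spliced = list(merge(gen3, trace))

def segment_speeds_mph(readings):
    """Speed over each consecutive pair of (seconds, miles) readings."""
    return [
        (m2 - m1) * 3600 / (t2 - t1)
        for (t1, m1), (t2, m2) in zip(readings, readings[1:])
    ]

for speed in segment_speeds_mph(spliced):
    print(f"{speed:.1f} mph")
```

Each device on its own gives a smooth line, but the spliced stream alternates between segments that look too fast and segments that look too slow, which is a sawtooth on a chart whose X axis is time.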

Jake's location on the map is correct and the overall slope of the line is correct, and even with Jake's jaggies I think the chart remains the most useful tool Trackleaders provides for understanding how the race is unfolding.

Knik 200, an anomaly

The Knik 200 and Knik 100 got underway this morning with a full field of 41 mushers in the 200 and 10 in the 100.  They're the first races in Alaska to be GPS-tracked this winter so we've been poking around, trying to see if there's anything new.  Nothing pops out other than some refinements to the race flow chart.

As usual, I'm watching the race flow chart to get a handle on what's happening on the trail, and as usual there's a hiccup here or there that bears further investigation.  In this case we're seeing a lot of jaggies (if that's not a technical term yet it really needs to become one) in Jake Berkowitz's curve:

We've seen this a lot in the past and it seems that the plotting software being used by Trackleaders is throwing these in.  The more interesting question is why they're showing up in Jake's curve more than in other teams'.  But one more surprise popped out.  If the x-axis (horizontal) is the race time and the y-axis (vertical) is the trail distance, why does it appear that Jake went backwards at one point?  That is to say, why is he farther down the trail (around 30.3 miles) at 3.41 hours than he is a short time later (about 29.9 miles), at 3.48 hours?

Here's why:

[A brief digression: on SPOT 1st and 2nd generation trackers, the location readings were taken at a fixed interval, 10 minutes.  The new SPOT 3 devices have a settable reading interval, and based on what we're seeing in the Knik Trackleaders have set the tracking interval to 5 minutes.]

This is taken from Jake's individual tracker page, and I've zoomed into the corner where they make a very sharp turn to the northeast after having been traveling to the WNW from the start (the red line is the "trail," the blue line connects the tracker's individual location readings, one per reporting interval; that is to say, the blue line is more-or-less where Jake's been).  If you've been following distance dogsled racing for any time you're probably very familiar with trails being moved from where they're shown on the map.  Often this is because the trail was drawn in based on a previous year, or drawn in by hand, etc., and in the meantime conditions on the ground required moving the actual trail for the safety of the teams, because a section of trail has become otherwise impassable, etc.  So, for whatever reason this trail has been moved a short distance.

Here's another issue: the timestamps on Jake's updates became very erratic.  You would expect them to be taken every 5 minutes and for the first part of the race, they are.  However, things start to change around mile 26, where there's an update after two minutes, another after three, and a spate of readings at fairly random shorter intervals, with the occasional longer one thrown into the mix.  Because the readings are being taken more often they show the actual trail more accurately than do tracks taken at longer intervals.  At any rate, what we're really interested in here is how they calculate trail miles, and how it is that Jake was calculated as traveling backwards for a short distance as he made his way towards Eagle Quest.

I believe they calculate trail miles by calculating the distance between the reading and a known reference point - if they measure you as being one mile past a point they've marked as 56 miles, they mark you as being at mile 57.  However, when there are unexpected deviations from the trail as marked there's a good chance that they'll be calculating your distance from markers incorrectly.  So, in this case Jake is at 30.30 miles at reading #62, 29.87 at reading #63, and 30.36 at reading #64.  In other words, at the point farthest away from the mapped trail he was closer to the point they were measuring from than he was when the actual trail and the mapped trail merged (which is interesting by itself and raises the question of what point they were measuring from).
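
Here's a rough Python sketch of that guess, to show how it can send a team "backwards."  To be clear, this is my assumption about the method, not Trackleaders' actual code, and the marker, the flat x/y coordinates (in miles), and the readings are all HYPOTHETICAL.

```python
from math import hypot

def trail_mile(marker_mile, marker_xy, reading_xy):
    """Marker's trail mile plus straight-line distance from marker to reading.

    Coordinates are flat x/y positions in miles, a simplification that avoids
    latitude/longitude math.
    """
    dx = reading_xy[0] - marker_xy[0]
    dy = reading_xy[1] - marker_xy[1]
    return marker_mile + hypot(dx, dy)

# HYPOTHETICAL geometry: a marker at trail mile 10, and three readings taken in
# order as the team leaves the mapped trail, swings back toward the marker,
# and then heads away from it again.
MARKER_MILE = 10.0
MARKER_XY = (0.0, 0.0)
readings_xy = [(3.0, 0.0), (2.4, 1.0), (2.4, 3.2)]

miles = [trail_mile(MARKER_MILE, MARKER_XY, xy) for xy in readings_xy]
for number, mile in enumerate(miles, start=1):
    print(f"reading {number}: mile {mile:.2f}")
```

The team keeps moving the whole time, but the second reading comes out at a lower trail mile than the first, simply because the rerouted trail passes closer to the marker at that point.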

The calculations here are harder than they look, particularly with an out-and-back trail (I'm curious to see how smoothly the turnaround goes), and Trackleaders does a highly creditable job handling odd data conditions that have left others spinning in place.  But still, odd things creep in from time to time, and this is one of them.