Thursday, March 7, 2013

Using the "analytics" to answer questions

I think that I've probably underestimated the value of the IonEarth "analytics."  With all due respect to Iditarod fans, it's my sense that many of them are from warm places and don't have much (or any) experience with dogsled racing.  It shouldn't be surprising that they tend to be less knowledgeable than, say, Yukon Quest fans or Two Rivers 200 (this weekend!) fans.  And so the "analytics" are showing them things they don't know and are finding enlightening.  I could be wrong, but it's my sense that this year we've got fewer people freaking out every time a team stops moving, because fans are coming to understand that there are run/rest cycles, that some teams prefer to take long rests on the trail rather than at checkpoints, etc.  That's definitely valuable.  The "analytics" have an important educational role to play.

However, they have a less useful analytical role to play, at least in terms of utility in answering the sorts of questions I've been wondering about.  Even when you clear away some of the bad visual design and disregard some of the odd "analytical" decisions (seriously: who thought plotting altitude against time was a good idea?), there are still some problems.  For example, quite possibly the most common question someone might want answered is whether one team is gaining on another or falling back.  Right now Martin Buser is the furthest down the trail, and Aliy Zirkle is chasing him, having just come off her 24.  Is she gaining ground?

So, here's how I'm approaching the problem.  It involves a ridiculous amount of clicking and may not be the most efficient way, and I'd love to hear from people who've come up with better ways to answer this question.

The first thing I do is remove all mushers from the tracker, then add back the two I'm interested in.  This reduces the likelihood of annoying unwanted pop-ups should my mouse cross another musher.  Unfortunately the "add musher" feature in the tracker uses bib numbers, which is kind of insane when you think about the number of teams in this race, but it is what it is.  So to find the two I'm interested in, I either sort by name or sort by trail mile, and then use the list to find their bib numbers.  (You can sort the "Selected mushers" list by clicking on the column header of interest.)

Once I've got the bib numbers, I remove all mushers and then add the ones I'm interested in.  Right now, that leaves me with this:

[This is actually the second screenshot I took of this.  The screen updated during the first attempt and they threw up a big "updating" alert box.  I hope that sometime between this Iditarod and next, IonEarth hires someone who understands a little bit about software development and user interfaces and this kind of amateurish nonsense goes away.  Or better still, that Iditarod hires a better tracking service.]

Okay, so here's the problem: given that the tools IonEarth provides don't give an easy way to assess how teams are moving in relation to one another, what do I do?  I figure I've got some basic choices:

  1. Use the instantaneous speed reading - how fast both teams are moving as of the most recent reading.
  2. Average the instantaneous speed reading over some number of readings, say 6 (an hour) or 3 (a half-hour).
  3. Measure the distance apart at several different points and see how it's changing.

Note that I am not looking at either the average speed or the average speed while moving.  The reason I'm not is that they're averages from the beginning of the race.  They fail to capture anything about overall trajectory.  A team that was moving very fast on Sunday and is plodding today could have the exact same averages as a team that was plodding on Sunday and is very fast today.  IonEarth's averages may or may not be interesting in and of themselves, but they don't help answer this question at all.
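To make the point about race-long averages concrete, here's a minimal Python sketch.  The two teams and their daily mileages are invented for illustration (they are not real race data), and the function names are my own.  It just shows that two opposite trajectories can produce identical averages since the start.

```python
# Hypothetical miles covered per day by two made-up teams (NOT real race data).
fast_then_slow = [90, 80, 50, 40]
slow_then_fast = [40, 50, 80, 90]


def overall_average(daily_miles):
    """Average miles per day since the start of the race."""
    return sum(daily_miles) / len(daily_miles)


def recent_average(daily_miles, days=1):
    """Average miles per day over only the most recent `days` days."""
    recent = daily_miles[-days:]
    return sum(recent) / len(recent)


for name, miles in [("fast_then_slow", fast_then_slow),
                    ("slow_then_fast", slow_then_fast)]:
    print(name, overall_average(miles), recent_average(miles))
```

Both made-up teams average 65 miles a day overall, but over the most recent day one is doing 40 and the other 90.  The since-the-start number can't tell them apart.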

I've decided that my best bet is to see how the distance between the teams changes over time.  This should be more-or-less equivalent to averaging the instantaneous speed readings over the same period, but I've noticed some odd readings in the instantaneous speeds and suspect that they're not that reliable (too many 0s).
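Here's a rough sketch of why a few bogus zeros matter.  The readings are invented, the 10-minute spacing is inferred from "6 readings an hour," and treating the zeros as bad data is my assumption, not something IonEarth documents.

```python
# Hypothetical readings for one made-up team, 10 minutes apart:
# (trail_mile, instantaneous_mph).  The 0.0 speeds stand in for the suspect
# readings; the trail mile keeps advancing across them, so this sketch
# ASSUMES they are bad data rather than real stops.
readings = [
    (400.0, 8.0),
    (401.3, 0.0),
    (402.7, 8.0),
    (404.0, 0.0),
    (405.3, 8.0),
    (406.7, 8.0),
    (408.0, 8.0),
]
INTERVAL_MINUTES = 10


def mean_instantaneous_speed(readings):
    """Average the speed readings, skipping the first so the window
    matches the one covered by speed_from_distance()."""
    speeds = [mph for _, mph in readings[1:]]
    return sum(speeds) / len(speeds)


def speed_from_distance(readings):
    """Miles gained between the first and last reading, per hour elapsed."""
    elapsed_hours = (len(readings) - 1) * INTERVAL_MINUTES / 60
    return (readings[-1][0] - readings[0][0]) / elapsed_hours


print(round(mean_instantaneous_speed(readings), 2))
print(speed_from_distance(readings))
```

With these made-up numbers the team covers 8 miles in the hour, but the two zeros drag the averaged speed readings well below 8 mph.  Working from trail miles sidesteps that.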

So, what I'm doing is calculating the distance between them now by subtracting Aliy's trail mile (427) from Martin's (446) and finding that they're roughly 19 miles apart.  I'll then use the dropdown time menu to the left of the map (the terrible user interface design choices just keep piling up, don't they?) to move the two teams backwards through time (if only I could find a way to do that for myself ... ).  20 minutes before the most recent reading, Martin was at trail mile 443 and Aliy was at trail mile 425, or they were 18 miles apart.  That is to say, Martin's gained about a mile in the last 20 minutes.  Going back an additional 20 minutes, Martin was at trail mile 441 and Aliy was at trail mile 423, also about 18 miles apart.  So, it looks like they're traveling at roughly the same speed, with Martin just a hair faster.
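The same arithmetic as a short Python sketch, for anyone who'd rather type the numbers in once than redo the subtraction by hand.  The trail miles are the ones quoted above; the function names and the leader-minus-chaser sign convention are my own.

```python
# Trail miles read off the tracker, oldest reading first.
# Values are the ones quoted in this post.
readings = [
    # (minutes_ago, martin_mile, aliy_mile)
    (40, 441, 423),
    (20, 443, 425),
    (0, 446, 427),
]


def gaps(readings):
    """Return (minutes_ago, gap_in_miles) pairs; gap is leader minus chaser."""
    return [(t, leader - chaser) for t, leader, chaser in readings]


def gap_change(readings):
    """Change in the gap from the oldest reading to the newest.
    Positive: the leader is pulling away.  Negative: the chaser is gaining."""
    g = gaps(readings)
    return g[-1][1] - g[0][1]


print(gaps(readings))        # [(40, 18), (20, 18), (0, 19)]
print(gap_change(readings))  # 1
```

Since the trail miles quoted here are whole numbers, a one-mile change is about the size of the rounding error, so it's worth checking a few more readings before drawing conclusions.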

With the Trackleaders race flow plot this is graphically displayed in a way that you can take in in an instant, but given that this is what we've got, we can figure out how to get questions answered anyway, with a few more steps and a lot more effort and with some loss of information.  The main thing is understanding that the race isn't just a matter of a team moving through space and time; it's a matter of a lot of teams moving through space and time and having their relationships constantly shifting as a result.  It would be nice to have tools that represented those relationships better.  But in the meantime, I think that if we can figure out what questions we want answered we can also figure out how to answer them with the tools at hand.

1 comment:

  1. Thanks so much for all this information. I have done similar work with the analytics and am glad to know that there really is not an easier way!! You are a great teacher!