Tuesday, October 14, 2014

A look at 2013 Quest runtimes

It's looking like this year's Yukon Quest has a pretty good field of entries, and with fall training well underway in interior Alaska we're all starting to speculate about how the race is going to go this year. It's only natural to look at past races, so I've started poking at the data from 2013, the last year the race was run in the Whitehorse-to-Fairbanks direction.  I'm also interested in having a baseline set of data to which this year's race can be compared, once it's underway.

So, I've taken my own spreadsheets from 2013 and used them as a basis for running some numbers.  In particular, I've created a spreadsheet containing the 2013 checkpoint runtime data and extracted summary data to get some basic descriptive statistics: fastest, slowest, mean, median, 1st quartile, and 3rd quartile for each race segment (between checkpoints).  For example, between Braeburn and Carmacks, I've got a table that looks like this:




You'll note that I've also calculated the ratio between the fastest and slowest runtimes; there may be something interesting there to look at later.  I've also plotted all runtimes as a histogram:



Again, this is largely to create a baseline dataset for comparison with this year.
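
For anyone who would rather script this than use a spreadsheet, here is a minimal sketch of the same per-segment summary in Python.  The runtimes are made up for illustration (they are not the 2013 numbers), the function name is hypothetical, and two choices are assumptions on my part: the quartiles use the "inclusive" interpolation method, and the ratio is taken as slowest divided by fastest.

```python
from statistics import mean, median, quantiles


def summarize_segment(runtimes_hours):
    """Descriptive statistics for one race segment's runtimes, in hours."""
    # n=4 returns the three quartile cut points; the middle one is the median.
    q1, _, q3 = quantiles(runtimes_hours, n=4, method="inclusive")
    fastest = min(runtimes_hours)
    slowest = max(runtimes_hours)
    return {
        "fastest": fastest,
        "slowest": slowest,
        "mean": mean(runtimes_hours),
        "median": median(runtimes_hours),
        "q1": q1,
        "q3": q3,
        # Assumption: ratio expressed as slowest / fastest.
        "slowest_to_fastest": slowest / fastest,
    }


# Hypothetical runtimes for a single segment, NOT real race data.
example_runtimes = [8.0, 9.0, 10.0, 11.0, 12.0]
summary = summarize_segment(example_runtimes)
print(summary)
```

Running the same function over each segment's column of runtimes would give one row of the baseline table per segment.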

However, I also ran some correlations, and while the results are obvious if you think about them for a few seconds, I haven't ever seen anybody say so explicitly:

The ranking of runtimes on longer race segments (more miles between checkpoints) tends to correlate more strongly with final standings, at least in this data set.  That is to say, people who had the faster runtimes between checkpoints which are far apart tended to finish better than people with slower runtimes.  Some of this is tautological (the longer runs are a greater percentage of the total race), some of it might (I haven't looked at this) be because on long runs everybody has to camp, even people who would otherwise prefer to rest at checkpoints, or because longer runs smooth out the variability you might see in shorter runs (law of large numbers, sort of).

Here's a plot of the correlations between segment runtime and finishing position, against segment distance.



The correlations are on the "correlations" worksheet in the spreadsheet.  I'm using a standard Pearson product-moment correlation coefficient (r), which is not the best test for a complete dataset but is adequate for exploring these data.  I'm not posting the numbers here in the interest of not having reader eyes glaze over, but definitely feel free to visit the spreadsheet, poke through the data, copy the spreadsheets, and ask questions.
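
To make the calculation concrete, here is a small sketch of a Pearson r between segment runtime and finishing position.  The numbers are invented to show the mechanics (a "long" segment that tracks the final standings closely and a "short" one that mostly doesn't); they are not taken from the spreadsheet, and the function name is hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation coefficient of two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length, at least 2 items")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    # Note: raises ZeroDivisionError if either list is constant.
    return cov / sqrt(var_x * var_y)


# Hypothetical data for five teams, NOT the 2013 results.
finish_position = [1, 2, 3, 4, 5]
long_segment_hours = [10.0, 10.5, 11.0, 12.0, 12.5]
short_segment_hours = [4.0, 4.6, 3.9, 4.4, 4.2]

r_long = pearson_r(long_segment_hours, finish_position)
r_short = pearson_r(short_segment_hours, finish_position)
print(round(r_long, 3), round(r_short, 3))
```

Computing one such r per segment and plotting it against segment distance gives the kind of plot shown above.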

I'm planning on doing something similar with 2011 data, as well as years run in the opposite direction, to see how well what we're seeing in 2013 holds up.  Unfortunately getting the data into the spreadsheet and clean enough to use is pretty labor-intensive, so it might be some while before a follow-up to this post.  But, once the data are in a spreadsheet there's a lot we can do with them, so there's incentive to do it beyond answering just these questions.

Here's looking forward to a great winter of distance dogsled racing!

Monday, September 1, 2014

New (ish?) event tracking software

It looks like there's a new GPS-based event tracking application, RaceBeacon.  From what I can glean from their website it looks like they're consolidating GPS feeds from individual participants' personal smart phones.  This is probably reasonable for shorter (as in, very short) informal events, and may provide a solid basis for building out a more robust platform with more features in the future.  As much as I'm a fan of Trackleaders (and I'm a huge fan, for reasons I'll go into in the next paragraph), it's always great to see some competition in this space.

That said, just showing locations on a map is not that interesting, particularly for events with staggered starts (like sprint races).  A map alone can show you where teams are in relation to each other but it doesn't capture the dynamic nature of racing - the stuff that makes racing exciting.  Prominent among the reasons that I like Trackleaders is that they really are both data guys and competitive cyclists, and they're interested in showing the story of a race as it unfolds.  Also, because they're data guys and computer scientists they've already dealt with some relatively difficult problems, for example calculating with some degree of precision the race mile at which a given team is currently located (harder than you'd think).  In this case, for sprint mushing races, the problem RaceBeacon is facing is how to demonstrate the relationship between two different teams' performances when the entire race is run in 20 minutes without stopping and the teams started 2 (or 4 or 18) minutes apart.  But, if you're not committed to using another tracking system and you're putting on a sprint event (and you can count on most or all of your teams carrying smart phones with data plans, having their batteries fully charged, and running a platform supported by RaceBeacon), this could be very interesting to experiment with.
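
The staggered-start problem is easy to see with a tiny sketch.  This is not how RaceBeacon or Trackleaders do it (I don't know their internals); it just illustrates, with made-up teams and times and a hypothetical function name, why clock order at the finish line says little until each team's own start offset is subtracted.

```python
def elapsed_standings(start_offsets_s, clock_times_s):
    """Rank teams by elapsed time: clock time at a point minus their own start offset.

    Both arguments map team name -> seconds since the first team left the chute.
    """
    elapsed = {
        team: clock_times_s[team] - start_offsets_s[team]
        for team in clock_times_s
    }
    # Fastest elapsed time first.
    return sorted(elapsed.items(), key=lambda item: item[1])


# Hypothetical sprint heat: teams leave two minutes apart.
start_offsets_s = {"A": 0, "B": 120, "C": 240}
# Clock time (seconds after the first start) when each team crossed the finish.
finish_clock_s = {"A": 1260, "B": 1350, "C": 1490}

standings = elapsed_standings(start_offsets_s, finish_clock_s)
for team, seconds in standings:
    print(team, seconds)
```

Here team A crosses the line first on the clock but has the slowest elapsed time, which is exactly the relationship a plain dots-on-a-map display can't show.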

If you've used them for tracking your event, how did it go?  Who's planning on using them this fall or winter?

Saturday, May 31, 2014

Packable beer

Recently there's been some discussion of instant beer for backpacking or other backcountry travel.  After all, beer is almost entirely water and water's really heavy, so if the water can be eliminated and added later, the problem is basically solved (well, almost).

This sounded promising but turned out to be a hoax.  In the meantime, Pat's Backcountry Beverages has developed the real thing - a beer concentrate and a convenient technology for rehydrating it and adding the fizzy back.  Note that they also have concentrates for various soft drinks including colas, lemon-lime drinks, ginger ale, and others (and you can actually probably use it to carbonate nearly anything).  In the interest of Science we decided to use empirical methods to test the manufacturer's claims.

It's a 3-part system, consisting of a very lightweight easy-to-pack plastic carbonator:




the "activator" (a mix of citric acid and sodium bicarbonate, which comes in convenient small packets):


and the drink concentrate:


(a fun fact about the concentrate - yup, that's 58% alcohol):


Also, your basic low bar:


Anyway, the lid of the carbonator has a lever that's used to pump air in and pressurize the container, and making the brew is a quite straightforward process of pressurizing the device, releasing the pressure, and repeating that cycle for about two minutes.  When you're done you've got something that looks like this:


which is a not-bad head on the beer.  Poured into pint glasses we have:



There are two beers available, a "Pale Rail" and a "Black Hops."  This is the Black Hops.  It smells very malty and a little sweet, but the sweetness doesn't come through in the flavor.  I'm pleased with the flavor (a little bitter) and find it very drinkable.  Chris is German and therefore has profound beer expertise, and she thought it would be better much cooler (we used plain tap water) but was otherwise quite good.

Here's the bad news: while the purchase cost was high but not unreasonable, the shipping costs to Alaska were nuts.  As in, about $30 just for shipping the carbonator, activator, and brew.  It's only available from one online vendor and there doesn't seem to be anybody selling it locally.  (The good news: market opportunity!)  The carbonator itself is about $40 (but it's reusable and seems durable), the activator is about $0.50/packet, the beer is about $2.50 for a packet to make a pint, and the soda is a bit under $1.50 for a packet to make a pint.

My one reservation is that because it's a liquid concentrate it's likely to freeze at low temperatures, but otherwise I'm very pleased.  I'll be ordering the Pale Rail and some of their sodas, I think - I do think this is a pretty nifty gizmo, and very highly packable.

Hey, does anybody know if alcohol kills giardia?

Saturday, March 8, 2014

Iditarod and software development

As I posted on my Facebook page, Iditarod has removed my ability to post to their Facebook page.  The proximate cause appears to have been my suggestion that they hire a tech support professional.  I accept that I've been a little obnoxious.  I don't think, however, that I'm wrong.

I don't think Iditarod realizes yet that they're now in the commercial software business.  They wrote some software, they're selling it, and that's kind of that.  I don't think they had any idea at all what they were getting into, and I've been trying to figure out why they did it in the first place.  I think part of the issue is that Trackleaders' user interface really does look dated, even if they've got the best functionality in the event tracking business.  Iditarod wants their stuff to be "branded."  (They also want to charge a lot of money for it; Trackleaders is committed to tracking being free to fans).  I think a better outcome would have been to work with Trackleaders on figuring out how to "skin" the Trackleaders app to develop a distinctive look.

But, that's not what they did.  They decided to write their own software.  I can't imagine they did it to save money, since programmers are really pretty expensive.  Better ones make upwards of $90,000/year, plus benefits; really good ones make big piles of cash.  On the other hand there are web sites for jobbing out work to what are really very good programmers in places like Pakistan and Bangladesh, and those folks are quite inexpensive (at the cost of some reliability issues due to both infrastructure and political stability - a friend hired some developers in Pakistan and then Bhutto was assassinated a few days later, which, among other things, pushed deadlines back).

What Stan and the other fun folks in that office might not have realized is that you never finish software, never.  You release it, but there's always more work to do, bugs to fix, features to add, underlying technology changes to adapt to, and so on.  For example, when Google changed their maps API, it put one of my favorite Alaska GIS websites out of business, because there was no money to hire programmers to adapt their software to the new interface.  So, when the ITC decided to develop their own software, they decided to commit money to it, year after year after year after year.

Something else they apparently didn't realize is that software is not ready for release when the developer says "it works for me."  The developer knows how it works and naive programmers only test happy path application use.  Once the software is released into the wild, particularly if it's a web app, it's going to be run in a variety of platform and browser environments, users are going to try to do things you could not possibly have imagined, and so on.  Bringing in a test professional to bang on your application is going to turn up problems before the software is released, giving you an opportunity to fix bugs and head off support issues.  A typical test environment has a bunch of different operating systems running in clean VMs (virtual machines), with as many browsers (and versions of browsers) as they can possibly get their hands on.  I'm still boggled that Iditarod developers apparently didn't test their stuff on Internet Explorer, which may be losing market share but is still the second most-widely used browser after Chrome (see here for browser stats).  Test and QA professionals are in high demand and well-compensated for a very good reason - over the long run they save a project money, reputation, and headaches.

So here we are, with Iditarod having developed their own software and not having tested it before releasing it.  Now what?  Well, this is where having tech support people makes a huge, huge difference.  For starters, actually knowing something about the technology is kind of a time-saver when trying to solve a technical problem.  In library school, people who are planning on a career providing reference services are taught a skill called "negotiating the question," where they're taught techniques for finding out what a library patron's real question is when they come in with something vague or somewhat oblique to what they really want to find out.  Tech support people do the same thing.  They find out how to reproduce a problem and they know how to describe it to developers to find a solution if it's not something they can figure out themselves.  They recognize the difference between a user error and a real bug.

But that's not what Iditarod has.  They hired some social media people, whose job is to say "Watch this fantastic video!  Then buy things."  They may be able to navigate Facebook and Twitter extremely well, but those are different skills from being able to sort out technical problems.  And so it is that Iditarod's social media people were answering questions from someone who was unable to watch videos because she clearly wasn't logged in. They told her "Reboot your computer" (I'm only sorry I never had an opportunity to ask them how they thought that would help).  That led to a situation in which the social media people were flustered and frustrated and the user with the unsolved question was pissed off.  It's not good for anybody.

So, perfect storm of really bad decisions on ITC's part.  I don't expect Iditarod to fix this situation, because Iditarod doesn't fix these things and it's completely consistent with past performance (here's something I wrote about this almost exactly two years ago, when the handwriting was already on the wall with regard to IonEarth's long-term viability).  It's a tough situation for those putzes who blocked me from posting on Iditarod's wall, and it's a tough situation for fans.  I don't expect it to improve, at least not any time soon.

I really don't like the Iditarod organization, in case that's in any way unclear (!).  But when you get past all the stupid decisions, the commercialism, the minstrel show aspects of some of what goes on, that they really do not put dogs first, it's still a 1000-mile race with superb mushers and incredible athlete dogs.  Fortunately, as was foreseeable a few years ago, better and better photos, videos, and coverage are coming from free sources (in addition to the Anchorage Daily News and KNOM, KUAC started sending Emily Schwing out on the Iditarod trail two years ago, and this year Alaska Dispatch has really upped their coverage).  The only thing that Iditarod provides that isn't available elsewhere is the tracking, and at this point it's nearly worthless, anyway.  I am cheering for friends running the race and wishing them the best.  The ITC, well, whatever.

Monday, March 3, 2014

This one's for the mapping nerds

Map enthusiasts (and that would be nearly all of us of a nerdly disposition) should know about a really nice, free mapping service built on top of Google Maps.  Gmap4 includes not just road maps and satellite images, but also topo maps for the US and Canada, Open Street Maps, and a wide variety of tools, plus what's basically a RESTful API that allows you to integrate your own data without having to create an account or reveal personally identifiable information (PII).  Bravo to the author, and enjoy!

Gmap4 is online here.

Sunday, March 2, 2014

Good morning, yo

I'm in totally the wrong timezone for following Iditarod.  I went to bed before the start (sorry, all, I just can't bring myself to call it a "restart") and woke up at 4am GMT to a lot of complaints about the new tracker.  But I wasn't interested in the complaints as much as I was in looking to see what friends on the trail were up to, so I opened the tracker myself.

With Trackleaders, the first thing I'd do in the morning was open the race flow chart and get a quick picture of who was moving the fastest, who was resting, who was passing whom, and so on.  What I get with the Iditarod tracker is dots on a map.  This morning I can infer who went out fast by what order the bib numbers are in on the trail (because they left the start in order), but that's not going to last for very much longer.  But nevermind that, how are my friends doing?

To find that out, I had to open up the so-called "leaderboard" and chew up more screen space, then select someone from the list.  Because the default sort is by bib number and it's a huge list, to find Mike I needed to click to sort by musher name, then select his name.  A box pops up with basic information, and it COVERS UP THE DOTS ON THE MAP (yes, I'm shouting).


That is to say, the dots on the map aren't even visible.  When I drag the screen around to uncover the dots on the map, is Mike highlighted?  No, he's not, so I cannot even tell with a quick look where he is in relation to everybody else.  So, I go back to the "leaderboard," sort it by name, and get his trail mile.  Since trail miles are not displayed on the map, I need to find him in relation to the pack, which means that now I've got to sort the "leaderboard" by trail mile, which gives me some sense of how to find him (by finding other teams around him, which means looking for their bib numbers - I know, right?).

Mike's running last, so he was easy to find by dragging the map around some more.  But here's an exercise for those following along at home: find Dan Kaduce.  I'm waiting ...

Find him?  How much clicking and dragging did you need to do to do that?  If you'll recall, with Trackleaders all you needed to do was hover over his name and his dot-on-a-map would bounce, making him very easy to find, indeed.  So basically, at this point in the race the tracker isn't carrying very much information.  I think some of this is deliberate (they really don't want to make it easy for you to spot teams in trouble) but much of it is just lack of understanding of design issues and the software development process.

A digression: I just tried looking at mouseover pop-ups to see if that's a little easier, once you've got trail mile.  It is, but the trail miles are wrong - they're showing a few people further down the trail at lower trail miles - see the screenshot showing Kristy Berington and, uh, somebody whose name is covered up by Kristy's pop-up:



Kristy is shown as being at race mile 36, yet she's ahead of someone who's shown as being at race mile 37.
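
To give a feel for why computing trail mile is harder than it looks, here is a minimal sketch of the usual starting point: snap a GPS fix to the nearest spot on the trail line and report the distance along the line to that spot.  This is an illustration only, not the Iditarod tracker's or Trackleaders' method (I don't know what either actually does), the function name is hypothetical, and it assumes the trail and the position are already in flat x/y coordinates measured in miles.

```python
from math import hypot


def trail_mile(route, point):
    """Distance along `route` to the spot on it nearest `point`.

    `route` is a list of (x, y) vertices in miles (assumed planar);
    `point` is an (x, y) position in the same coordinates.
    """
    px, py = point
    best_dist = float("inf")
    best_mile = 0.0
    travelled = 0.0  # trail miles accumulated before the current segment
    for (ax, ay), (bx, by) in zip(route, route[1:]):
        dx, dy = bx - ax, by - ay
        seg_len = hypot(dx, dy)
        if seg_len == 0.0:
            continue  # skip duplicate vertices
        # Fraction along the segment of the perpendicular projection, clamped to the ends.
        t = ((px - ax) * dx + (py - ay) * dy) / (seg_len ** 2)
        t = max(0.0, min(1.0, t))
        cx, cy = ax + t * dx, ay + t * dy
        dist = hypot(px - cx, py - cy)
        if dist < best_dist:
            best_dist = dist
            best_mile = travelled + t * seg_len
        travelled += seg_len
    return best_mile


# Hypothetical L-shaped trail: 10 miles east, then 10 miles north.
route = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0)]
mile_a = trail_mile(route, (4.0, 0.5))   # a bit off the first leg
mile_b = trail_mile(route, (10.3, 6.0))  # a bit off the second leg
print(mile_a, mile_b)
```

Even this simple version has traps: where the trail doubles back or runs close to itself, a slightly noisy GPS fix can snap to the wrong stretch of trail, which is one plausible way for a team that is physically ahead to be assigned a lower trail mile.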

Iditarod have made some very, very basic user interface mistakes, but because it's their software, they're the ones paying directly to fix every bug, improve every user interface error in the design, answer every user complaint, and so on.  I am pretty sure they had absolutely no idea what they were getting into when they made the decision not to use someone else's software.

And one last screenshot before going back to bed, because if nothing else I'm grateful to Iditarod for such a clear demonstration that the development of production software should not be left to hobbyists:



Saturday, March 1, 2014

Start order and speed impacts, 2013

I've just run some numbers on the 2013 Iditarod, from the start to the first checkpoint (Yentna), with an eye towards getting a handle on the relationship between bib number (i.e. start order) and speed.  The assumption is that because the trail gets torn up by everybody who passes over it, the teams with higher bib numbers will be traveling more slowly over the first section of trail.  I looked at this in the Copper Basin in this blog post, and found that there was, in fact, a negative correlation (later teams traveled slower), albeit a fairly loose one.

So, I took a look at last year's Iditarod.  The first thing I did was to plot speed against bib number:


If there's a relationship it certainly did not pop out as clearly as it did in the Copper Basin.  So, I ran the actual numbers.  Using the R statistical package, I ran a Kendall's rank correlation tau, which has the advantage of not making assumptions about the underlying distribution (for example, that the underlying data have a "normal" distribution).  In a nutshell, nope, there does not appear to be a relationship between bib number and speed in these particular data, or at least not one that cannot be explained as the result of random fluctuation.  Specifically, the results of the run are:

 Kendall's rank correlation tau

data:  y$Bib and y$speed 
z = -1.8238, p-value = 0.06818
alternative hypothesis: true tau is not equal to 0 
sample estimates:
       tau 
-0.1557844 

So, with a p value of 0.06818, it does not meet a .05 significance level criterion.
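
For readers who want to see what's under the hood, here is a rough sketch in Python rather than R.  It does two things: it checks that the reported two-sided p-value follows from the reported z under a normal approximation (the output reporting a z statistic suggests that's what R used here, though that is my inference), and it computes Kendall's tau from scratch on a tiny made-up set of bibs and speeds.  The from-scratch version is the simple tau-a with no correction for ties, so it is not a drop-in replacement for R's cor.test, and the example data and function names are hypothetical.

```python
from math import erfc, sqrt


def two_sided_p_from_z(z):
    """Two-sided p-value for a standard normal test statistic."""
    return erfc(abs(z) / sqrt(2.0))


def kendall_tau(xs, ys):
    """Kendall's tau-a and its normal-approximation z, assuming no ties."""
    n = len(xs)
    s = 0  # concordant pairs minus discordant pairs
    for i in range(n - 1):
        for j in range(i + 1, n):
            product = (xs[i] - xs[j]) * (ys[i] - ys[j])
            if product > 0:
                s += 1
            elif product < 0:
                s -= 1
    tau = s / (n * (n - 1) / 2)
    # Variance of S under the null hypothesis when there are no ties.
    z = s / sqrt(n * (n - 1) * (2 * n + 5) / 18)
    return tau, z


# Check the p-value reported above against the reported z.
p_reported = two_sided_p_from_z(-1.8238)
print(round(p_reported, 4))

# Hypothetical bibs and speeds (mph), NOT the 2013 Iditarod data.
bibs = [1, 2, 3, 4, 5, 6]
speeds = [9.1, 8.7, 9.0, 8.2, 8.5, 8.0]
tau, z = kendall_tau(bibs, speeds)
print(round(tau, 4), round(z, 4), round(two_sided_p_from_z(z), 4))
```

A negative tau means higher bib numbers tend to go with lower speeds; whether that tendency is distinguishable from noise is what the p-value is for.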

It doesn't appear likely that distance makes a difference.  You'd reasonably expect that a shorter trail would see greater impacts, but there's not that much difference between the two races in the distance from the start to the first checkpoint (42 miles in Iditarod, 50 miles in Copper Basin).  So, there's still a bunch of work to do.  It may have something to do with trail differences, or weather, or ... ?  I'll be taking a look at the 2014 numbers after the last team is into Yentna this weekend, but over the longer term I'd like to aggregate a bunch of years and see what falls out.