Wednesday, December 18, 2013

Oops, and catching up

I generally don't write much about my life or my dogs.  We're a very small kennel and I just run the dogs recreationally.  I love it but I think it's not of much interest to other people.  Last Thursday, however, we had an exceptional (I HOPE) event.  In the early afternoon I heard one of the dogs in the yard, Harry, making a racket so I went out to check.  Everything seemed normal except that Lara didn't want to come out of her house.

I bent over to take a look at her and heard a noise coming from behind her.  I thought "Oh, damn - she's got a squirrel or a bird in there!"  I pulled her out of the house, took a look, and had one of those moments where something that's perfectly natural looks impossible.  I wondered where she'd found a newborn puppy.  The "What's that?" moment turned into a "Holy crap!" moment and I dashed off to ask more experienced breeders (and that would be pretty much anybody) what I needed to do.  Because she hadn't shown any signs at all of being pregnant I hadn't been feeding her differently or providing different care, so I thought it was likely that she'd have only one puppy, maybe as many as three.  Well, I was wrong about that, too, and after the third puppy arrived I moved them all into the heated guest cabin to keep everybody warm and safe.  I watched the fifth puppy being born, and ultimately we ended up with seven.



We know who the father is because we saw the tie happen and nobody else could have been close enough.  The puppies should be very nice - both parents are purebred Siberian Huskies, with the mother being out of Nikeenuk's Sedna and Kraken's Kermit, and the father being Tumnatki's Dr Watson.


This increases our kennel size by over 50%, and I really had not planned on adding more dogs.  So, it's a little bit of a shock, a little bit of a blessing, and a lot of joy.

In the meantime, the winter is finally getting underway.  Several of the early mid-distance races here in Alaska have been cancelled because of trail conditions, but an impromptu "Alpine Creek Excursion" race along the Denali Highway, from Cantwell to the Alpine Creek Lodge, happened last weekend and was won by Two Rivers musher Judy Currier (our pick for the rookie of the year in the Yukon Quest, although Matt Hall is certainly a top contender for that, as well).  This weekend brings the Solstice 100/50 race here in Two Rivers, put on by the Two Rivers Dog Mushers Association.  It's a friendly, low-key local fun race we do to give people an early season trail tune-up experience, but since it's Two Rivers you can always expect a few ringers to show up.  In keeping with this year's crazy race sign-ups (where races are filling up very shortly after opening), we've got a surprisingly large number of entries with some top kennels participating.

Now that the race season is starting up and we're seeing more activity I'll be posting more often.  Feel free to drop me a line with any questions you might have about what you're seeing in the race data, about tools that might make races easier to understand, or more importantly easier to put on.  In the meantime, I have some puppies to attend to.

Saturday, December 7, 2013

Don't do arithmetic. Seriously, don't.

One of my goals for this winter is to reduce the amount of arithmetic that race organizations and volunteers have to do.  It would be fantastic to reduce it to none whatsoever, but that's not going to happen.  Still, there are things that can be done.

One of the toughest pieces of arithmetic that distance races need to do is time-related.  Base 24 arithmetic is not intuitive for most people and not easy for many, but that's what needs to be done, and often it involves carrying a day when someone arrives one day and leaves the next, or leaves one day and arrives the next (and the Quest has one race segment that takes more than a day, the roughly 200-mile run between Pelly and Dawson).  But even for short races, especially ones with a lot of participants, just computing times, speeds, and rankings can be a nuisance.  It's very, very easy to make mistakes.
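
The day-carry is the part that trips people up, so here's a minimal sketch of the arithmetic in JavaScript.  The function name and the example times are just illustrative; the point is that once everything is converted to minutes, the "carry a day" case is a single addition.

// Minimal sketch: elapsed time between two clock readings on a 24-hour clock,
// carrying a day when the second reading falls on a later calendar day.
// Times are "HH:MM" strings; dayOffset is how many days later the second one is.
function elapsedMinutes(first, second, dayOffset = 0) {
  const [fh, fm] = first.split(":").map(Number);
  const [sh, sm] = second.split(":").map(Number);
  return dayOffset * 24 * 60 + (sh * 60 + sm) - (fh * 60 + fm);
}

// A team that checks in at 22:47 and leaves at 04:15 the next morning:
const rest = elapsedMinutes("22:47", "04:15", 1);
console.log(Math.floor(rest / 60) + "h " + (rest % 60) + "m");  // "5h 28m"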

So, over the past year or so I've gradually been putting together Google Drive spreadsheets that do an increasing amount of work on race data.  This weekend the Langfjordløpet, in Norway, is using spreadsheets which do nearly all of the arithmetic for race organizers.  The format is essentially similar to sprint races in North America, but over longer distances.  The final results are completely generated -- volunteers only need to enter start and finish times, and everything else is computed.  Definitely feel free to borrow as much as you'd like from it, and let me know if you'd like help setting up race spreadsheets.
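
I can't reproduce the Langfjordløpet sheet's formulas here, but as a rough sketch of the kind of thing it does, here's a hypothetical Google Apps Script custom function.  The name and column layout are invented, and it assumes start and finish fall on the same day; the day-carry case works as in the earlier sketch.

// Hypothetical Sheets custom function: =RACE_SPEED(B2, C2, 42) would return the
// average speed in km/h for a 42 km race leg.  Sheets passes date/time cells to
// Apps Script as JavaScript Date objects.
function RACE_SPEED(start, finish, distanceKm) {
  const hours = (finish.getTime() - start.getTime()) / (1000 * 60 * 60);
  return hours > 0 ? distanceKm / hours : NaN;
}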

Langfjordløpet spreadsheet

But very often races and race checkpoints are in remote places with little-to-no infrastructure - no electricity, no telecomm, etc.  That means that race volunteers can't use online tools to help them out.  For those cases I've written a very simple app to find the difference between two times, so at least volunteers won't have to do base 24 arithmetic.  It runs on both iOS and Android devices, phones and tablets alike, and I'll be uploading it to the Google Play Store and the iTunes store as soon as I improve the layout, visual design not being my strong suit, exactly.  I'll keep you posted.  And in the meantime, drop me a line if there's some race-related problem you'd like to see automated.

[thanks to Mike Ellis for pointing out that it's not 200 miles between Scroggie and Dawson, as I originally wrote, but about 200 between Pelly and Dawson.  It's been corrected.]

Monday, October 21, 2013

SPOT tracker API update

I've been noticing a fair amount of activity on my write-your-own-tracker post and I thought I ought to give a heads-up that there have been changes to the SPOT API, some rather major.  For the most part these seem to have been implemented to accommodate the new SPOT 3 and the updated services.  I haven't dinked with it myself yet, but the API described in my old post has apparently been deprecated and either is no longer available or won't be for much longer.  There's a new URI to retrieve the data.  Definitely rely on the information in the SPOT API web page, not my previous post, for what you'll need to be able to parse the SPOT messages.

From the description of the updated API:

Battery State will be sent along with most message types; not every message will have it, but most of them will.  It has two states.  Examples:
<batteryState>GOOD</batteryState>
<batteryState>LOW</batteryState>

ModelID will also appear in the XML.  Values:
<modelId>SPOT3</modelId>
<modelId>SPOT2</modelId>
<modelId>SPOT-2-IS</modelId>
<modelId>SPOT</modelId>
<modelId>SPOTCONNECT</modelId>
<modelId>SPOTDC</modelId>
<modelId>HUG</modelId>

Three new message types as well.  
<messageType>NEWMOVEMENT</messageType>
<messageType>UNLIMITED-TRACK</messageType>
<messageType>EXTREME-TRACK</messageType>

The new message types are undocumented, unfortunately, and may take some experimentation to sort out (they're clearly related to the SPOT 3).  What they're not telling you, and what also appears to be undocumented, is that the overall format of the feed has changed: it's no longer a messageList.  It is now a response containing new elements, including a feedMessageResponse, which itself contains a feed element with metadata describing the device and service, plus the messages themselves.  Note that the messageList element is now called messages and that it's deeper in the tree.  There's also a new element tagged hidden in the message (no, I have no idea, either).

I should add that if I were coding this up today I'd use the JSON-formatted response, both for ease of parsing and for efficiency.
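
For what it's worth, here's roughly what that might look like.  The feed URL pattern and the field names are my reading of the current SPOT documentation and may not be exact, so treat them as assumptions and check them against the API page; the feed ID is a placeholder.

// Rough sketch of pulling the JSON feed.  URL pattern and field names are
// assumptions -- verify them against SPOT's API documentation.
const FEED_ID = "YOUR-SHARED-FEED-ID";  // placeholder

async function fetchSpotMessages() {
  const url = "https://api.findmespot.com/spot-main-web/consumer/rest-api/2.0/"
            + "public/feed/" + FEED_ID + "/message.json";
  const data = await (await fetch(url)).json();
  // The XML's response > feedMessageResponse > messages > message nesting
  // appears to be mirrored in the JSON.
  const messages = data.response.feedMessageResponse.messages.message;
  return messages.map(m => ({
    time: m.dateTime,
    lat: m.latitude,
    lon: m.longitude,
    type: m.messageType,      // e.g. TRACK, NEWMOVEMENT, ...
    battery: m.batteryState,  // GOOD or LOW, when present
  }));
}

fetchSpotMessages().then(points => console.log(points.length, "points"));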

So basically, while the old post may still be useful for examples of how to parse XML in Javascript, how to do some basic things with the Google Maps API, etc., the URI needs to change to accommodate the updated API, and the XML (or JSON!) parsing will need to be tweaked.

Wednesday, October 16, 2013

A listing of which mushers are running which races

We were sitting around the other day, wondering which races various mushers were doing, and it occurred to me that it would be very easy to throw together a list of which mushers are doing which races.  So, I did.  It was basically a cut-and-paste job, using a Google Drive spreadsheet.

I think a lot of people are using spreadsheets only to store data as tables and might not be aware of what kind of data munging, both simple and complex, is available.  In this case I basically just cut and pasted race rosters, and then sorted by musher name, using this:



As you can see, under the "Data" menu item there's an option to sort the spreadsheet by a column.  With your mouse, select the column you'd like to sort by (in our case, column A contains the musher names), and it will show up in the menu.  Handy, dandy, and awfully easy.
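
If you'd rather script it than click through the menu, the same sort is a couple of lines of Google Apps Script (the sheet name here is made up, and column 1 is assumed to hold the musher names):

// Hypothetical Apps Script equivalent of Data > Sort sheet by column A:
// sort everything below the header row by the musher-name column.
function sortByMusher() {
  const sheet = SpreadsheetApp.getActiveSpreadsheet().getSheetByName("Rosters");
  const numRows = sheet.getLastRow() - 1;  // skip the header row
  sheet.getRange(2, 1, numRows, sheet.getLastColumn())
       .sort({ column: 1, ascending: true });
}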

The spreadsheet is available here.  Let me know if you have questions, comments, or have spotted an error that needs fixing.  I'll be adding races as their rosters come online.

Saturday, October 5, 2013

Race calendar

This year I'm planning on providing more tools and less whining analysis.  Here's an easy one that may or may not be useful.  Because I have to travel for work fairly often I'm trying to get a handle on my schedule this winter, so I've thrown together a calendar of distance mushing events in Alaska on Google Calendar, and since I'm not the only person interested in these things I've made it public.  Here's how to add it to your own calendar on Google.

First, log into Google Calendar from your web browser.  On the left-hand side of the page you'll see "My calendars" and "Other calendars".




If you click the triangle-in-a-box immediately to the right of "Other calendars" you'll get a drop-down menu, like this:




Choose "Add by URL".  This will pop a small dialogue box.  In the text field, paste in (and seriously, copy and paste this rather than typing by hand) "https://www.google.com/calendar/ical/fvidsm2c61mn3tr7106d9kk4ao%40group.calendar.google.com/public/basic.ics" (without the quotes) and click "Add calendar".


That's it!  Piece of cake.  The calendar is for Alaska distance mushing events.  I'm sure I've missed some and gotten others wrong, so let me know what's missing or needs to be corrected.

Here's looking forward to another great season of dogsled racing!


Thursday, August 1, 2013

Some thoughts on what to ask scanner vendors

In my previous post I took a brief look at what some considerations might be for putting together a tracking system for dropped dogs in Iditarod.  Let's pretend that Iditarod has decided to go ahead and implement something based on tracking dogs through their RFID chips.

One thing that's surprised me in Alaska, or at least in my experience at the university, is that there tends not to be very much rigor about defining project goals and requirements.  Because this is Iditarod and money is very tight (we want to put as much into the race purse as we possibly can), we cannot afford to be sloppy about goals or requirements.  We cannot waste money on a bad or useless purchase.

The first problem is how to read ID numbers from the microchips Iditarod requires to be implanted in the dogs.  That means acquiring scanners, quite possibly several dozen of them (someone needs to figure out how and when dropped dogs would be scanned, but certainly at least one at each checkpoint where dogs can be dropped).  We'll require that they have a computer interface, and that that interface is USB.  Several scanners on the market have RS-232 interfaces, and they tend to be somewhat less expensive, but they're more "brittle."  That is to say, they're very sensitive to misconfiguration, transmission errors, funky cables or connectors, etc.  Find someone who's done a lot of RS-232 and you'll be finding someone who's got experience with weird, inexplicable failures that turned out to be pins pushed in a fraction of an inch, just barely enough to miss contact, and someone who knows more about parity bits, connector pinouts and pin assignments, flow control over slow links, and so on than any human should need to.  So, USB.

Let's break our questions down into functional questions and environmental questions.

Functional questions:

  • Does this device have a USB interface?
  • Is there an API for developers?
  • What about technical documentation - is there sufficient documentation that a developer would be able to sit down, read it, and start writing useful code?  What about sample code?
  • What about USB identifiers and/or USB device classes?  What do you use?
  • Is a device-specific device driver required?  If so, which operating systems are supported?
  • Is there a "shim" or conduit or some other mechanism for pulling identifiers directly from the device into some piece of standard software (say, MS Excel)?
  • If so, how much flexibility is there around that?  For example, would we be able to pull the numbers directly into cells on our own spreadsheets, or would we need to pull them from cells in a sheet in a format you define?
  • Are any metadata associated with the scanned microchip (for example, a timestamp)?
  • In what format are the data when read off the USB port?  What about through a device driver (if you provide one)?
  • Will the scanner cache one or multiple scans and then allow those to be dumped to a computer, or does it need to be attached to a computer at the time it does the scan?
  • Do you have any particular licensing restrictions on what can be done with the scanner or scanner output?  Would you object to a dog tracking system based on your device being released under a standard open source license?
Environmental questions:
  • Some Iditarod checkpoints are quite remote and have challenging operating environments. What's the temperature range within which your scanner is expected to perform?
  • Is it waterproof?
  • Iditarod has volunteers from a tremendous variety of backgrounds and experiences.  Some are not technically sophisticated, some are extremely technically sophisticated.  How difficult is the scanner to operate?  How complicated is the user interface?  How many buttons on the scanner would someone have to push to start a scan and get it into a computer?  
  • What about power - can it be powered off the USB interface, does it require batteries, does it run off AC, ... ?
  • If it runs on battery power, what's the battery lifetime?  Can it run on replaceable batteries, or only on your rechargeables?  If it can run on replaceables, what about lithium ion?
  • Iditarod would need spares in case of device failure.  What's the mean time between failures on this scanner?

Sunday, July 28, 2013

A few notes on Iditarod dropped dog tracking

I wasn't alone in being quite disturbed by what we learned about Iditarod dropped dog handling and tracking last year, and it seems as if there are opportunities to simplify and improve Iditarod's record-keeping.  At any time (at every time!) it should be possible to know exactly where a given dog is, and to have a list of which dogs are at a given location.  I should emphasize that I do not know how they're currently tracking dropped dogs and some of what I write may be redundant with what they're already doing.

While writing this I found that there's a combinatorial explosion of options and that it would probably be a good idea to summarize up front what I'm thinking.  People who are interested can read on further for more detail.

Requirements and nice-to-haves


What to expect from an automated dropped dog tracking system probably includes
  • ease of use by checkpoint volunteers, and robustness against user errors
  • data collected at checkpoints should be available to ITC staff and volunteers who may not be present at the checkpoint - this probably means loading the data into a centralized data store
  • but it should also work well with poor or intermittent network connectivity
  • it should be relatively inexpensive - Iditarod has a lot of checkpoints and the hardware will need to be available at each. This probably means cheap laptops or netbook computers
  • maintainable by volunteers - this means a mainstream development environment if a decision is made to develop a custom solution. Alas, Prolog and APL are out.
  • modular - by decoupling input mechanisms from the data store, we can provide multiple input options, or swap in newer technologies as they become available
What are the options?

Here are what I think are workable options:

  • Manual transcription of microchip scanner readings into a spreadsheet, using low-cost scanners.  This is error-prone, labor-intensive, and almost certainly the worst option.  However, it's the cheapest and requires the least labor up front (the amount of development effort required for a project like this may be a consideration as we start to run out of summer, and time)
  • Get a higher-end scanner with a USB interface and software that allows it to scan directly into a spreadsheet (for example, Datamars's ISO MAX V).  This is more expensive than the first option but reduces the effort required of checkpoint volunteers, and is far less error-prone.  Because it's low-effort it's more likely to be done consistently.  It also allows for the possibility of future development on top of what's already being done, in the form of feeding spreadsheet data into a centralized data store automatically.  It also doesn't require much development overhead, although some spreadsheet programming may be required, and someone will have to test and document the process, for two reasons: 1) to make sure that it actually works the way it's believed to work, and 2) to be able to provide direction and support to volunteers
  • Get a higher-end scanner with a USB interface and develop a system more closely tailored to Iditarod requirements.  This probably costs about the same as the previous option, but requires custom programming, continuing support and development, etc.  It may be possible to turn this into a class assignment or project for CS students at UAF, although you can expect that the code quality would be inconsistent.  However, if a dropped dog tracking system is to be able to answer both the question of what dogs are at a given checkpoint and where a given dog is, it's going to take some development, whether it's from scratch or on top of a spreadsheet system (a minimal sketch of those two lookups follows this list).  Note, as well, that documentation for the API (if there is one) or the USB command set (if there's no API) appears not to be easy to come by.  This approach could produce the best tools, but it's very high-overhead
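
To make those two lookups concrete, here's a minimal sketch in plain JavaScript.  Nothing in it is tied to any particular scanner or vendor, and the chip ID and checkpoint names are invented for illustration; a real system would persist this in a proper data store rather than keeping it in memory.

// Two core lookups a dropped dog tracking system has to answer:
// "where is this dog?" and "which dogs are at this location?"
const dogLocations = new Map();  // chipId -> { location, updatedAt }

function recordScan(chipId, location, when = new Date()) {
  dogLocations.set(chipId, { location, updatedAt: when });
}

function whereIs(chipId) {
  return dogLocations.get(chipId) ?? null;
}

function dogsAt(location) {
  return [...dogLocations.entries()]
    .filter(([, rec]) => rec.location === location)
    .map(([chipId]) => chipId);
}

// Example: a dog dropped at a checkpoint, then flown back to Anchorage.
recordScan("985112003456789", "Nikolai");
recordScan("985112003456789", "Anchorage");
console.log(whereIs("985112003456789"));  // { location: "Anchorage", ... }
console.log(dogsAt("Nikolai"));           // []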



Hardware


At this point we'll assume that we'll be identifying dogs by their microchips (dogs are required to be chipped by the rules), which pretty obviously implies a scanner at each checkpoint.  Microchip scanners have displays showing the ID string on the chip, and it's certainly possible to transcribe this manually.  But, we're trying to make it easy for checkpoint volunteers and we absolutely want to minimize errors, and the best way to do this is to have software pull the data off the scanners.  In turn, this suggests that the scanners should have interfaces that allow us to do just that.
I've been taking a look at available scanners and it looks like the most common interface is RS-232 serial.  And that, right there, is a problem.  RS-232 hardware used to be standard on nearly all consumer-grade computers - it's what you used to plug your external modem (hey, remember those?) into.  But when was the last time you saw one on a laptop or netbook?  Also, RS-232 can be flaky and difficult to configure (line speed, parity, data formats, etc.).
USB is a much better choice - ubiquitous, works out of the box.  However, are scanners available that have USB interfaces?  As it turns out, yes they are, but I haven't yet been able to find one for less than US $654 (the Datamars ISO MAX V), which is awfully expensive.  Datamars has another scanner that also has a USB interface and is probably less expensive (their Scanfinder XTEND MAX), but I haven't been able to find a price on it.
Microchips are just RFID devices, and microchip scanners are just RFID readers, but documentation of radio characteristics (frequencies, for example) and data formats (plus, some chips are encrypted) is not easily come by.  I would think that a scanner manufacturer would be very, very interested in working with Iditarod on this.  I am also quite sure that there are scanners out there that I haven't looked at and which may meet interface and cost requirements.  More digging is needed.  The primary issue is going to be documentation, or lack of it.  If Iditarod wants to buy microchip scanners and use those as the basis for a dropped dog tracking system, they need to have some very specific questions to take to scanner vendors, and they need to have them in advance of talking to vendors.  This would include a description of the problem that they're trying to solve and requirements and conditions for solving the problem.

I'll talk more about software in a later post.  This one is quite long enough as it is.

Saturday, July 27, 2013

SPOT 3 is out!

SPOT has finally released its SPOT 3 device, an upgrade to the SPOT 2.  It addresses some of the limitations of the SPOT Satellite GPS Messenger (aka the SPOT 2) while retaining its strengths.  I think it's still a much better choice for bush travel tracking and emergency signaling than the GPS tracking units that pair with smartphones, like the original DeLorme Inreach (the new generation of which is the DeLorme Inreach Smart Phone) or the SPOT Connect.

Here's what the new SPOT device looks like:


It's got the same basic buttons as the SPOT 2, although the power button has been moved to a location less likely to be pushed accidentally.  It's a hair smaller than the SPOT 2, and a tad lighter.  I like that it now has slots on the frame to make it easier to strap down or otherwise attach to, say, a sled bag (an ongoing headache when using SPOT devices to track sled dog races), a backpack, etc., rather than relying on an external case to hold it.

It's a little more expensive than the SPOT 2, but at $169 still not even close to being as spendy as the DeLorme stand-alone InReach ($299).

New features include

  • Settable track rate options, varying from preset intervals of 2.5 minutes to 60 minutes.  This gives better tracking resolution when you want it or better battery life when you're moving slowly or out there for a loooooong time
  • Motion-activated tracking.  This is a huge win, since the tracker will stop sending out updates when you haven't moved for awhile.  That is to say, it won't be doing expensive (in terms of power consumption) radio operations.
  • Some new, kind of confusing service plans.  The same old $100/year plan is still available, but if you want motion-sensitive tracking (it stops updating if you stay in the same place for awhile) you need to subscribe to the new $150/year plan.  I think this means that you cannot subscribe to the old $100/year plan with a SPOT 3 device, but only with SPOT 1 and SPOT 2 devices.  Don't hold me to that.  Apparently setting the tracking interval to something other than 10 minutes costs extra on top of the $150/year, but I'm not really sure about that.  The plan info is here.
I still think that having a very basic device with a limited user interface and some simple functions is the right approach for wilderness travel or tracking in remote locations.  There are always going to be tradeoffs between function and power consumption, and I'd prefer to err on the side of basic functions and lower power consumption, myself.  For people who use GPS trackers to stay in touch with sponsors, etc., while in remote locations, something like the InReach or even the SPOT Connect makes sense.  But for my own purposes, and for those of many people I know, having a reliable basic tracker and emergency signaling device that doesn't do much else gives a little extra peace of mind.

As a side note, SPOT continues to provide evidence that it's a bad idea to let hardware people write software.  Their website, tracking pages, and device management interface remain hot messes.

I can't find my own SPOT 2 anywhere (I know, right?), and I think I'll be picking up one of these to replace it.

Saturday, July 13, 2013

DeLorme update

I haven't been paying much attention to DeLorme's offerings for two reasons:

  1. I think having to pair the tracker with a smart phone is a risky strategy in resource-constrained environment (i.e. in the bush for a few days), and
  2. the lack of an API meant that you were completely dependent on DeLorme for access to and manipulation of your data
But I recently looked at the DeLorme website and it turns out that they've fixed both problems.  There's a new device, the inReach SE, which launched in April.  The device looks like this:


It's pretty clearly a far more elaborate interface than what's provided by SPOT.  It allows you to compose and send SMS messages (as well as allowing you to define some preset messages), and it allows you to receive SMS messages.  It also has an emergency notification button, like SPOT, which sends messages to the GEOS center.  Unlike SPOT it allows you to send arbitrary messages to the GEOS center, as well, so you can answer questions and provide additional information to rescuers.

That said, it uses a rechargeable battery, which I think suggests that DeLorme is somewhat unserious about extended wilderness use.  Disposables, while being wasteful and trashy, are easy to carry and easy to swap in and out as needed.  So that's disappointing, particularly given that the additional functionality in these things is likely to encourage you to run down your batteries more quickly.

As for costs, the device runs $300, and subscriptions are a little spendy.  There's a "Safety Plan," for folks who are using these strictly as an emergency signaling gizmo (although I think that if that's all you want, you're better off with a SPOT), and it runs $9.99/month with a yearly subscription and renewal requirement.  The other two plans are the "Recreation Plan" ($24.95/month for a yearly subscription) and "Expedition Plan" ($49.95/month for a yearly subscription).  You can also get "seasonal" subscriptions to the latter two plans, with a minimum of 4 months, at $39.95 and $54.95 per month.  These allow unlimited sending of predefined messages and either 40 or 120 messages you compose in the field, as well as unlimited tracking.

As for an API, they're now providing your tracking data through a KML feed.  Contrast this with SPOT's RESTful interface, here.  There are definitely some tradeoffs, although I think on balance the KML feed is a win.  There are a lot of tools out there that already know how to parse and use KML data representations.  At this point I think it's safe to say that we're all tired of XML, but if you're using someone else's tools that issue is largely moot.
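
As a rough illustration of why a KML feed is convenient, here's how you might pull track points out of one in a browser.  This relies only on standard KML elements (Placemark, coordinates), not on anything DeLorme-specific, and it assumes the KML text has already been fetched.

// Sketch: extract lat/lon points from KML.  KML coordinates are
// "lon,lat[,alt]" tuples separated by whitespace.
function placemarkPoints(kmlText) {
  const doc = new DOMParser().parseFromString(kmlText, "text/xml");
  return [...doc.getElementsByTagName("Placemark")].flatMap(pm => {
    const coords = pm.getElementsByTagName("coordinates")[0];
    if (!coords) return [];
    return coords.textContent.trim().split(/\s+/).map(tuple => {
      const [lon, lat] = tuple.split(",").map(Number);
      return { lat, lon };
    });
  });
}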

SPOT has a very simple JSON format, but still, a lot of folks would like to be able to use their tracking data without having to parse and extract it themselves.  I'd count this as a win for DeLorme.

In looking at the tradeoffs I have to say that I still lean towards SPOTs, ghastly software and all.  I have no interest in tweeting or posting Facebook status updates from the field, the battery issue is pretty annoying, and the additional cost is difficult for me to justify.  I expect that someone who's got sponsors and fans would weigh things quite differently, though.  You have to understand your own priorities and the tradeoffs between the devices when making a choice.

Friday, May 3, 2013

Cody Strathe writing for Mushing

Cody Strathe of Dogpaddle Designs just announced on Facebook that he'll be writing a regular column for Mushing Magazine.  This is excellent news.  There are people out there building decent sleds and there are fewer people out there innovating in dogsled design, but Cody's one of the few doing really innovative work and has a business selling sleds.  Top-quality, highly-regarded sleds being driven by people who are kind of hard on their gear and/or put huge miles on their sleds on tough terrain.  And while some of the innovative work in sled building by distance mushers can be a little gimmicky (for example, heated driving bow) Cody's been trying new materials and new construction techniques and having excellent results.  Check out this video, where he's taking a new sled for a test run:



I've been really interested in his ideas about building flexibility into a sled while retaining strength and structure.  His column should be a very good one.  He's also looking for suggestions for topics, so if you've got one, drop a note at the Dogpaddle Designs Facebook page.

Wednesday, May 1, 2013

Spotwalla

If you're a SPOT tracker user you probably rely on SPOT's shared page mapping services to share your trips with friends and family.  And if you're doing that you probably know that the user interface is clunky, there's not particularly sophisticated control over what you can share and what you can't, and your data go away after a week.  (I've never used SPOT's "SPOT Adventures" service so I can't comment on what it provides).

Piia and Julien, an adventurous young couple who spent the last year here in Two Rivers and recently left for a new life in the Yukon, clued me into a service I'd been unfamiliar with, called Spotwalla.  Spotwalla provides a mapping interface for your GPS data, including but not limited to SPOT trackers. Others include the Delorme InReach, a variety of conventional GPS devices, and smartphones.  (See the list here).

It looks pretty neat, although I think it probably requires a slightly higher level of technical sophistication than SPOT shared pages do.  One thing that I particularly like is that it gives you reasonably granular control over sharing and privacy, allowing not just a variety of access control mechanisms but also the ability to disable sharing from certain locations.  (Privacy is something that probably should be taken more seriously by people using trackers - they don't just show where you are, but they also show where you're not.  Be thoughtful about what you share.)

They also provide their own RESTful API to the service, including "value-add" features specific to their service and  not part of the SPOT service (for example, if you've uploaded pictures you can retrieve those programmatically through the API).

But the big deal, as far as I'm concerned, is that the locations aren't purged after a week, as they are with SPOT's service.

So, pretty interesting.  I very rarely carry my SPOT in tracking mode but at this point I'm inclined to use Spotwalla rather than SPOT's shared pages when I do in the future.

Tuesday, March 19, 2013

I don't even know where to start

So, Dorado.  I find it nearly impossible to write about something as personal, as intimate, as losing a dog.  If it were me I'd be running around Unalakleet with an axe or a gun, or more likely just sitting out on the ice for a couple of weeks (or a couple of months) and avoiding people entirely.  Much love and honor to Paige and Cody for walking a hard road.

Danny Seavey wrote about what happened to Dorado on his business's Facebook page, and that piece has received a lot of attention, reposts, and wide acclaim from Iditarod fans.  It made me more than a little ill, frankly, and I'd like to talk about that and about responsibility to the sport.

The upshot of his post is that life is short, everything that lives must die at some point, and we need to decide where to draw the line between tragic and just sad.  He talks about eating meat and he talks about the euthanasia of unwanted pets.  He says a well-intended volunteer messed up and that we shouldn't take it out on that person.  He's saying that accidents happen and that's just the way it is.

But here's the thing: this wasn't a freak accident.  This was negligence on the part of the person or people staffing the dog yard at Unalakleet.  Whether or not they understood how snow fences work, they weren't checking on the dogs regularly enough to identify a developing problem, and they weren't checking the dogs regularly enough to remediate what was going wrong.

I think we're living in a very good period in the development of dog mushing.  Veterinary research has made great strides in identifying and developing mitigations against preventable dog deaths.  Ethical standards in dog care, husbandry, breeding, and so on are improving a lot.  Much improved nutrition has both improved dogs' performance and improved their quality of life.  Some top kennels are working directly with veterinary researchers, and veterinary care awards have become some of the most prestigious titles awarded for many races.

But many dog mushing fans come from warm places, places without mushers or dog teams, places without roadless areas or true wilderness.  They're drawn to the sport for the romance and adventure and often don't really know anything about running dogs beyond what they learn while following Iditarod and reading blog posts or Facebook statuses from mushers.  They don't have enough experience to contextualize what they read.  They want to support the mushers and dog teams, and when a relatively high-profile musher says something they tend to believe it.  The kind of fans they become depends on what they read and experience.  And so even leaving aside moral and ethical questions raised by Danny's unfortunate post, I think that it's important to the sport that mushers are clear that any dog death caused by negligence on the part of race volunteers, race staff, mushers, whomever, is completely unacceptable.  We don't just shrug it off and compare it to the euthanasia of an unwanted, homeless dog or to eating chicken.  What happened here is intolerable, and much shame on anybody who not only thinks otherwise but influences people who don't know better to think otherwise as well.

So, onward.  I believe that the sport is going to continue to improve, that dog care will be valued more and more highly, that a solid understanding of the ethics of how we live with dogs will spread, and that fans will learn what these extraordinary dogs really mean to the people who raise them and care for them and travel vast distances with them as partners.  But it will take outreach, and communicating to those fans who think that Danny Seavey's explanation is brilliant that no, this is not how I feel about my dogs and this is not how it works.

Thursday, March 14, 2013

The whizzdom of crowds

I've been enjoying the heck out of the Seaveys' "Fantasy Iditarod" game.  It occurred to me that with so many participants (469!) it might be interesting to look at how everybody bet, to see whether or not the game actually had any predictive value.

A few years back James Surowiecki wrote a book called "The Wisdom of Crowds," the basic premise of which is that "a large group's aggregated answers to questions involving quantity estimation, general world knowledge, and spatial reasoning has generally been found to be as good as, and often better than, the answer given by any of the individuals within the group." Over the past decade or so there's been tremendous growth in what are called "prediction markets," in which participants buy and sell prediction shares in things like political elections, Academy Awards, etc.

So, how well can a group of Iditarod fans predict the race outcome?  Not that well, as it turns out, but not that badly, either.  There seems to be some accuracy at the high ends (winners) and low ends (um, not winners), but the pricing and rules of the game have a distortive effect in the middle, I think.

Here's my premise: according to the rules of Fantasy Iditarod, each person has $27,000 to allocate to up to 7 mushers.  Each musher had a "price," with very experienced, successful mushers being priced quite high and rookies or people who hadn't been particularly successful in the past priced quite low.  The prices were set such that a player couldn't spend it all on top mushers - if they wanted 7 mushers they'd need to bet on some lesser-known or less-successful mushers.  I thought it was possible that the raw counts of how many people had included a given musher might indicate something about how the race would turn out.  So, I wrote a simple script to count the number of bets on each of the mushers (and did the same for the rookie bets - more on that later).
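
The script itself isn't anything fancy - the heart of it is just tallying names, something like the sketch below.  The input format is invented for illustration; the tallying is the whole trick.

// Sketch: picks is an array of entries, each an array of up to seven musher
// names.  Returns [name, count] pairs sorted by count, descending.
function countBets(picks) {
  const counts = new Map();
  for (const entry of picks) {
    for (const musher of entry) {
      counts.set(musher, (counts.get(musher) ?? 0) + 1);
    }
  }
  return [...counts.entries()].sort((a, b) => b[1] - a[1]);
}

const sample = [
  ["Aliy Zirkle", "Dallas Seavey", "Martin Buser"],
  ["Aliy Zirkle", "Mitch Seavey"],
];
console.log(countBets(sample));  // [["Aliy Zirkle", 2], ["Dallas Seavey", 1], ...]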

I was surprised that Martin Buser had gotten as many bets as he had, and he was the jackrabbit early in the race.  Aliy Zirkle was the most popular choice and she came in a very close second.  Joar Leifseth Ulsom was ranked 5th, which is not what you'd normally expect for a rookie, but was quite close to how he actually did (7th).  Mitch Seavey was the 9th most popular choice but had far, far fewer bets than Aliy (74, to her 232).

I think to some extent the results were distorted by our inability to pick the seven teams we thought would place the highest, although I don't think they were distorted that much.  Almost certainly people with no chance received more votes than they would have absent the $27,000 limit, and people with some chance received a bit fewer.  I also think there was a lot of sentimental voting as a way of showing support for mushers people particularly like, regardless of what the expected outcome would be.

In the rookie race, Joar was the hands-down favorite.  If you follow dog mushing at all he would have had to have been your choice.  He received 92 votes for the Rookie Award, and the closest competitor was Travis Beals, at 45.  Josh Cadzow received 42, which surprised me quite a bit - I would have thought he'd get the second-most votes, based on past performance.  If anybody has any insight into why Travis got more votes from fans I'd be really interested to hear your take on it.

The tables are below, with names, votes, and actual placement.  Some teams are still on the trail so the final standings aren't complete, but 36 teams are in and I think that's enough to get a handle on how well the Fantasy Iditarod bets line up with the actual results.


Fantasy Iditarod bet counts
Name Bets Placement
Aliy Zirkle 232 2
Dallas Seavey 206 4
Martin Buser 156 17
Lance Mackey 127 19
Joar Leifseth Ulsom 98 7
Jeff King 97 3
Jake Berkowitz 86 8
Ramey Smyth 80 20
Mitch Seavey 74 1
DeeDee Jonrowe 71 10
Gerry Willomitzer 71 withdrawn
Kristy Berington 70
Mike Ellis 68 30
Travis Beals 68
Brent Sass 65 22
Josh Cadzow 61 14
Matt Giblin 54
Newton Marshall 51 scratched
Peter Kaiser 50 13
John Baker 47 21
Cindy Abbott 46 scratched
Allen Moore 44 33
Aaron Peck 42
Anna Berington 42
Richie Diehl 40 36
Cim Smyth 40 15
Matt Failor 38 28
Paige Drobny 36 34
Mikhail Telpin 32
Christine Roalofs 31
James Volek 30
Aaron Burmeister 28 11
Scott Janssen 28 scratched
Jason Mackey 27 scratched
Paul Gebhardt 26 16
Charley Bejna 25 scratched
Nicolas Petit 25 6
Jessie Royer 23 18
Luan Ramos Marques 23
Mike Williams Sr 23
Wade Marrs 22 32
Jodi Bailey 21
Karin Hendrickson 21
Angie Taggart 20
Jan Steves 18 scratched
Michelle Phillips 18 24
Justin Savidis 17
Ken Anderson 17 12
Ray Redington Jr 17 5
Jim Lanier 15 35
Michael Williams Jr 14 23
Curt Perano 14 27
Kelley Griffin 14 26
Louie Ambrose 13
Bob Chlupach 11
Jessica Hendricks 10 25
Robert Bundtzen 9 scratched
Kelly Maixner 9 31
Sonny Lindner 7 9
Cindy Gallea 7
Rudy Demoski Sr 6 scratched
Linwood Fiedler 5 29
Michael Suprenant 3 scratched
Gerald Sousa 3
Ed Stielstra 3 scratched
David Sawatzky 1 scratched



Fantasy Iditarod rookie bet counts
Joar Leifseth Ulsom 92
Travis Beals 45
Josh Cadzow 42
Paige Drobny 32
Mike Ellis 30
Richie Diehl 23
Cindy Abbott 11
Mikhail Telpin 10
Charley Bejna 9
James Volek 9
Christine Roalofs 8
Luan Ramos Marques 7
Louie Ambrose 6

Saturday, March 9, 2013

Something interesting from the "analytics"

I know I'm getting pretty repetitive about the limitations of IonEarth's so-called "analytics," so I thought it might be novel to write about something I just noticed that may or may not be useful.  I was looking at Aliy's and Martin's plots for the last 24 hours or so, and it appears that while both Martin and Aliy are now traveling slower than their average speeds while moving (although I don't think IonEarth's numbers here are particularly reliable), the difference between Martin's current speeds and his average speed is larger than the difference between Aliy's current speeds and her average speeds.  That is to say, he's slowed down more than she has.  Or at least I think that's the case - we're relying on visual guesstimates here but the y-axis scales seem to be the same.



So, at long last - IonEarth finally produces an insight not otherwise easily recognizable.

Thursday, March 7, 2013

Using the "analytics" to answer questions

I think that I've probably underestimated the value of the IonEarth "analytics."  With all due respect to Iditarod fans it's my sense that many of them are from warm places and don't have much (or any) experience with dogsled racing.  It shouldn't be surprising that they tend to be less knowledgeable than, say, Yukon Quest fans or Two Rivers 200 (this weekend!) fans.  And so the "analytics" are showing them things they don't know and that they find enlightening.  I could be wrong but it's my sense that this year we've got fewer people freaking out every time a team stops moving, because fans are coming to understand that there are run/rest cycles, that some teams prefer to take long rests on the trail rather than at checkpoints, etc.  That's definitely valuable.  The "analytics" have an important educational role to play.

However, they have a less useful analytical role to play, at least in terms of utility in answering the sorts of questions I've been wondering about.  Even when you clear away some of the bad visual design and disregard some of the odd "analytical" decisions (seriously: who thought plotting altitude against time was a good idea?), there are still some problems.  For example, quite possibly the most common question someone might want answered is whether one team is gaining on another or falling back.  Right now Martin Buser is the furthest down the trail, and Aliy Zirkle is chasing him, having just come off her 24.  Is she gaining ground?

So, here's how I'm approaching the problem.  It involves a ridiculous amount of clicking and may not be the most efficient way, and I'd love to hear from people who've come up with better ways to answer this question.

The first thing I do is remove all mushers from the tracker, then add back the two I'm interested in.  This reduces the likelihood of annoying unwanted pop-ups should my mouse cross another musher.  Unfortunately the "add musher" feature in the tracker uses bib numbers, which is kind of insane when you think about the number of teams in this race, but it is what it is so to find the two I'm interested in I either sort by name, or I sort by trail mile and then use the list to find their bib numbers.  (You can sort the "Selected mushers" list by clicking on the column header of interest).



Once I've got the bib numbers, I remove all mushers and then add the ones I'm interested in.  Right now, that leaves me with this:


[This is actually the second screenshot I took of this.  The screen updated during the first attempt and they threw up a big "updating" alert box.  I hope that sometime between this Iditarod and next, IonEarth hires someone who understands a little bit about software development and user interfaces and this kind of amateurish nonsense goes away.  Or better still, that Iditarod hires a better tracking service.]

Okay, so here's the problem: given that the tools IonEarth provides don't give an easy way to assess how teams are moving in relation to one another, what do I do?  I figure I've got some basic choices:

  1. Use the instantaneous speed reading - how fast both teams are moving as of the most recent reading
  2. Average the instantaneous speed reading over some number of readings, say 6 (an hour) or 3 (a half-hour)
  3. Measure the distance apart at several different points and see how it's changing.
Note that I am not looking at either the average speed or the average speed while moving.  The reason I'm not is that they're averages from the beginning of the race.  They fail to capture anything about overall trajectory.  A team that was moving very fast on Sunday and is plodding today could have the exact same averages as a team that was plodding on Sunday and is very fast today.  IonEarth's averages may or may not be interesting in and of themselves, but they don't help answer this question at all.

I've decided that my best bet is to see how distance between the teams changes over time.  This should be more-or-less equivalent to averaging the instantaneous speed readings over the same period, but I've noticed some odd readings in the instantaneous speeds and suspect that they're not that reliable (too many 0s).

So, what I'm doing is calculating the difference between them now by subtracting Aliy's trail mile (427) from Martin's (446) and finding that they're roughly 19 miles apart.  I'll then move backwards 20 minutes by using the dropdown time menu to the left of the map


(the terrible user interface design choices just keep piling up, don't they?) to move the two teams backwards through time (if only I could find a way to do that for myself ... ).  20 minutes before the most recent reading, Martin was at trail mile 443 and Aliy was at trail mile 425, or they were 18 miles apart.  That is to say, Martin's gained about a mile in the last 20 minutes.  Going back an additional 20 minutes, Martin was at trail mile 441 and Aliy was at trail mile 423, also about 18 miles apart.  So, it looks like they're traveling at roughly the same speed, with Martin just a hair faster.  
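
The same subtraction is easy to mechanize.  The trail miles below are the ones read off the tracker above; everything else is just illustration.

// Sketch: gap between two teams at each reading, oldest reading first.
function gapTrend(leaderMiles, chaserMiles) {
  return leaderMiles.map((mile, i) => mile - chaserMiles[i]);
}

// 40 minutes ago, 20 minutes ago, most recent reading:
const martin = [441, 443, 446];
const aliy   = [423, 425, 427];
console.log(gapTrend(martin, aliy));  // [18, 18, 19] - Martin gaining slightly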

With the Trackleaders race flow plot this is graphically displayed in a way that you can take it in an instant, but given that this is what we've got, we can figure out how to get questions answered anyway, with a few more steps and a lot more effort and with some loss of information.  The main thing is understanding that the race isn't just a matter of a team moving through space and time, it's a matter of a lot of teams moving through space and time and having their relationships constantly shifting as a result.  It would be nice to have tools that represented those relationships better.  But in the meantime, I think that if we can figure out what questions we want answered we can also figure out how to answer them with the tools at hand.

Wednesday, March 6, 2013

Measurement artifacts in the Iditarod altitude plot

I just really don't like having altitude and temperature curves on the Iditarod speed plots, so when I start to look more closely at someone's "analytics" (I still cannot describe a speed/time plot as an "analysis" - sorry) the first thing I do is make them go away.  But today someone asked me a question about something he'd seen on a plot and while looking at his screen grab I noticed that someone's altitude was changing while the sled wasn't moving (i.e. when the speed was 0.0).  So, I decided it might be fun to take a look at someone who was parked a long time and see what happened.  Here's Martin Buser's 24 in Rohn:




As you can see, the altitude line is basically level but wiggles a bit (and I wish they'd get rid of the drop shadow on that line - it's pretty but it makes it a little more difficult to read the curve - more chartjunk!). How much does it actually vary?  Well, running my cursor back and forth and back and forth and back and forth and back and forth over the curve, the highest value appears to be 1489 feet on 3/5 at 11:10am, and the lowest value appears to be 1309 feet on 3/5 at 1:40pm.  In other words, by the plot the sled lost 180 feet of altitude in 2 1/2 hours while sitting still.

So, what's really going on?  A couple of possibilities:

  • The tracker is using barometric pressure to measure altitude, and this is due to normal fluctuation or weather changes.  Not likely
  • The GPS in the tracker is triangulating altitude, along with lat/long.  This tends to be much less reliable than GPS lat/long readings and can fluctuate more, and is the more likely explanation
Note that this represents an error range of about 12% (180 feet against a reading of roughly 1,490 feet).

[n.b. note that IonEarth will redraw the "analytics" while you're looking at it, taking it back out to the full scale even if you've zoomed in.  Thanks tons, guys - your stuff is a pleasure to work with.]

Tuesday, March 5, 2013

The Mike Williamses are so confusing!

Chris got home tonight and said "I think they swapped the Mike Williamses."  I had no idea what she was talking about but when she said "the tracker!" I thought "not too likely."  Well, she's right - the Mike Williams Jr on the tracker is the Mike Williams Sr on the leaderboard, and vice versa.  They swapped the bib numbers but, as it turns out, not the humans.  Here's how I approached figuring it out.

Since they arrived in Nikolai about an hour apart I thought I'd use the tracker's history mechanism to see which the tracker thinks arrived first.  The first thing I did was to remove all the mushers, then add the Mike Williamses back in (note to Iditarod: ordering the mushers by bib number on the "Choose your mushers" drop-down menu when there are 66 mushers is really dumb).  Then I took a look at the leaderboard to see when a Mike Williams first arrived in Nikolai.   We have a Mike Williams whose bib number is 46 arriving at 15:36, and a Mike Williams whose bib number is 35 arriving at 16:33.


That gave me a rough idea of what time to rewind the tracker to, to see which Mike Williams it thought arrived at 15:36.  Here's what the tracker showed at 15:30 this afternoon:


So it thinks bib 35, Mike Williams Sr, arrived at 3:36pm, and the leaderboard thinks it was bib 46, Mike Williams, Jr.  We'd sort of expect it to be Mike Williams, Jr, given that he's a very speedy guy who's expected to place very high.

However, this:


It appears to be the case that they've associated the right name but the wrong bib number with that particular tracker.  With the unfortunate reliance on bib number in the IonEarth user interface it would be nice if they could fix it, but given the ham-handed way they treat data I would not expect them to be able to move retrospective race data correctly when and if they fix this, so there's a reasonable expectation of buggered-up speed calculations, run/rest schedules, and whatnot.

Seriously, Iditarod: do you really want to stick with this tracking service?

Exploring the "analytics" a little further

I like to use race analytics to try to answer questions I have about how a given race is unfolding or to reveal something otherwise not obvious, and I thought it might be worth spending some time looking at what's possible with the new IonEarth "analytics" (I have a hard time calling plots of speed against time  an "analytic," but we get what we get).  So, here's Justin Savidis's "analytic" plot:




First, let's get rid of the chartjunk - the stuff that doesn't add information but clutters up the plot with a bunch of noise.


Aaaah - much better.  Goodbye, altitude curve and insane, ridiculous temperature curve!  To get rid of a given curve, click its name in the box at the right side, just above the plot.

One of the things I do like about the IonEarth analytics is that they flag checkpoint locations on the x (horizontal) axis (and it should be noted that this is really the only place in their "analytics" where they provide distance information).


When the instantaneous speed curve (the green one) is horizontal and at 0 on the Y axis (chartjunk alert!  IonEarth needs to move some of those labels to the right side of the plot and move temperature up to the altitude curve), the team isn't moving.  What we can see is that Justin's longer rests have almost all been in checkpoints so far.  We can also see that he stopped for about three hours on the trail earlier this morning.  So that's kind of revealing.

Contrast this with Brent Sass's plot:


For the most part Brent is taking his rest on the trail.  This should not be a surprise, for a few reasons:

  • Brent is racing and he's not going to let checkpoint locations control his run/rest schedule
  • Brent's got mad skillz and is expert at wilderness travel.  He knows how to camp and he's comfortable doing it
None of this is hugely surprising.  But one thing we can learn from these "analytic" plots is teams' run/rest schedules, and it's enormously interesting to compare those.  And, of course, it was on this plot that we could tell that Martin was just not giving his dogs a break as he hot-footed it to Rohn.

It's a lot harder to use IonEarth's "analytics" to do things like figuring out who's traveling together (often sort of interesting) and how teams are moving in relation to each other.  I'll start to look at those questions in subsequent posts.

Monday, March 4, 2013

Iditarod, Monday morning

So, it's Monday morning and the talk is all about Martin Buser opening a big gap on the rest of the field.  It's 8:40 here in Alaska and according to the tracker he's at trail mile 156 while the person running in the second position, Matthew Failor, is at trail mile 132.  That's a difference of 24 miles, or roughly three hours of running time.  Perhaps more interesting is that Lance Mackey is resting at trail mile 109, or 47 miles behind Martin.

Before the race started commentators and mushers alike stressed the importance of run/rest cycles, how they figure into the race, and what it means to keep your dogs perky.  I don't think we should forget that now that we've got a jackrabbit, especially since we're less than 24 hours into the race and we don't know whether or not it's going to work for him.

So, here are the IonEarth speed/time plots for Martin and Lance, with some chartjunk removed for clarity.  Martin's plot is the upper one; Lance's the lower:




The thing that pops out here is that Lance is banking a lot of rest and Martin is banking what appears to be none.  Lance took a three-hour break last night at 10pm and has now been parked for nearly 5 hours.  This is less rest than the old-school even run/rest schedules that used to be common, but it's still a good amount of rest.

Also note that Martin is moving more slowly than other teams (when they're moving) so it's crossed my mind that it's *possible* that he's carrying one or two dogs at a time to give them some rest rather than breaking the whole team, but there's no way to tell.  He's averaging 9mph while moving and Lance is averaging 9.7mph while moving, which is a respectable difference (although to be honest I don't really trust IonEarth's averages, so while I'm probably more confident than not in these numbers I'm not 100% confident).

Also worth mentioning, and this should definitely figure into your thinking about positions: Martin has the lowest bib number, which means that he owes the most time (over two hours).  To figure out the difference between what he owes and another team owes, subtract Martin's bib number from the other team's bib number and multiply by two.  That's the number of minutes more that Martin owes.
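
If you want to make that adjustment yourself it's a one-liner; the bib numbers below are placeholders rather than the actual 2013 assignments.

// Extra minutes Martin owes relative to another team, given two-minute
// start intervals: (other team's bib - Martin's bib) * 2.
const extraMinutesOwed = (martinBib, otherBib) => (otherBib - martinBib) * 2;

console.log(extraMinutesOwed(2, 30));  // 56 minutes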

Sunday, March 3, 2013

Analytics and smell tests

One of the things I've really tried to hammer on over the last year or so is that the statistics that we talk about and that race organizations offer up need to be connected to something going on in the real world, and that they need to make sense.  Whether or not they're useful depends on whether or not they reflect something real and on whether or not they have any explanatory power.  I feel pretty strongly that tracker analytics are worthless without meeting those two conditions.

And so it is that I cast an eye upon the IonEarth analytics and find myself scratching my head.  This one is really easy, popped out immediately, required absolutely no arithmetic.  And here it is, tracker info on Rudy Demoski:






Here we are, less than three hours into the race.  Rudy is just a little past Scary Tree, at about race mile 25.  So we take a look at the IonEarth run/rest calculations, and they claim that he's taken about 1:20 of rest, and been running 2:40.  First, this fails the smell test just on the rest time alone.  It's quite unlikely that someone's already taken 1:20 rest before getting to mile 25 of the Iditarod, although we'll circle back around to this later.  If we add 1:20 and 2:40, we get 4 hours.  So, has Rudy been running 4 hours at 5:50 pm on Sunday?  Not very likely, since the first musher left at 2:00pm.  Bib 39 definitely did not leave before the race started.  If we drop the 1:20 rest, does the notion that he's been running 2:40 make sense?  Maybe.  By the clock he should have left at 3:14pm, and if he'd been running 2:40 as of 5:50 he would have left at about 3:10, which is when Trackleaders thought he left.
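
That clock check is easy to script, too.  This assumes the 2:00pm first start and two-minute intervals used above, with bib 2 leaving first (which is what those numbers imply).

// Sketch: expected start time by the clock for a given bib number.
function expectedStart(bib, firstOutBib = 2, firstStartMinutes = 14 * 60) {
  const minutes = firstStartMinutes + (bib - firstOutBib) * 2;
  const h = Math.floor(minutes / 60), m = minutes % 60;
  return (h > 12 ? h - 12 : h) + ":" + String(m).padStart(2, "0") + "pm";
}

console.log(expectedStart(39));  // "3:14pm", matching the clock above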

So that all makes sense, but where did the 1:20 rest come from?  Here's my guess:  IonEarth was incredibly sloppy in their treatment of pre-event data in the Junior Iditarod, including readings from when the trackers were in someone's car or truck on the way to the start.  So, early in the race you'd see average speeds of something like 45mph.  So what I think is that they truncated the pre-event data up to the start from some of their calculations and displays, but not the run/rest calculations.  The tracker was on Rudy's sled and turned on for about 1:20 before he started the race.

So the other thing that doesn't pass the smell test, as it relates to rest time, is the average speeds.  If someone is taking rest their average speed will fall relative to their average moving speed (and IonEarth, seriously, kill that "moving average" label, at least if you don't want people to know that you don't have people working on this who've been through an introductory statistics class - it doesn't mean what you're using it to mean, and I think you could be using actual moving averages to provide better insights into your data).  Here, they're identical, as you'd intuitively expect early in the race.  If he'd really spent 1/3 of his race time parked, the average speed would be substantially lower.

So, we're not off to a good start, IonEarth analytics and I.  This is just really sloppy, and it's pretty clear that it did not go through any QA process.  There are a couple of things they can do.  One is to manually edit the raw data down.  Preferable, I think, would be to include the team's actual start time in their input data and use the Holy Mackerel of really simple programming to completely exclude anything that shows up before the start time (or after the finish time).  What they have now is internally inconsistent, confusing, and lacks the ability to explain what's happening on the trail.

Saturday, March 2, 2013

Terrific improvements in the Iditarod website

I'd like to give a nod and a huge "Thank you!" to the Iditarod Trail Committee for one of the things they've done for their website.  It looks like the site, and the Iditarod media content, are being served out of Amazon Web Services.

What this means in practice is improved performance, potentially much improved performance.  Amazon Web Services is what's known as a "cloud" service, and it includes the ability to add resources (computational, disk, bandwidth) when they're needed and remove them when they're not.  It makes it possible to move services around, keeping them available even in the face of hardware failure.  If they do a good job, and Amazon does an excellent job, you won't notice it at all because the data will flow smoothly and services will remain available.  I've been an Insider subscriber since they first started charging for it (although the last few years I haven't gotten a video subscription, but this year I have), and it seems like year after year after year after year they knew exactly how many subscribers they had and still underprovisioned their services, resulting in poor video performance, site outages, etc.  This should solve that problem.

I should mention that in the first few days after they started to see a lot of traffic there were some performance issues that appeared to be related to their database servers rather than their web services, and it looks like they've sorted that out.  So, credit where credit is due, and many thanks to the ITC for taking these problems seriously and taking the right steps to get them fixed.