Wednesday, May 2, 2012

A closer look at the Spot tracker API - Write your own tracker

Well, not very much closer, since the data and their representation are pretty simple, as described here.  So, I'll take a look at how to pull down the data, parse them out, and put them on a map.  This is where race tracking software starts, and the tracker put together by the folks who tracked the Percy DeWolfe didn't go a lot further.  A "real" tracker is a far more complicated thing: it needs to be smart about speed to be credible, it has to deal with storing and retrieving data for all participants over the length of the race and subsetting the data on demand, and then there's all that fancy stuff that Trackleaders and others do.

Snagging the data

An "API," or "application programming interface," is how services are exposed (or provided) to third parties programmatically.  That is to say, if I want to write my own programs that make use of Spot's tracking data, or Google Maps (or both!), I use their APIs from my programs.

The truth is that Spot's API can barely be considered an API, since all it really allows you to do is to download XML-formatted records.  Here's how it works:
  1. When you register your Spot tracker, or edit your settings, you have the opportunity (if and only if you've paid for the tracking service) to set up a "shared page" to display the data.  That shared page is automatically assigned a "glId" by Spot.  I believe that "glId" is a "guest link" identifier, but I wouldn't swear to it.  Anyway, you can find the glId for your shared pages by going to your list of shared pages at https://login.findmespot.com/spot-main-web/share/list.html and choosing the shared page that has the data you'd like to use.
  2. You'll find the glId embedded in the URI of the shared page.  So, if the URI of the shared page is "http://share.findmespot.com/shared/faces/viewspots.jsp?glId=0BW5A0B1QQFTHG4GivzbYsQFyIFo5VHTh" the glId is 0BW5A0B1QQFTHG4GivzbYsQFyIFo5VHTh.
  3. The data can be accessed through the URI http://share.findmespot.com/messageService/guestlinkservlet?glId=0BW5A0B1QQFTHG4GivzbYsQFyIFo5VHTh&completeXml=true with the glId from your shared page substituted in for the glId value.
When you do an HTTP GET on that URI, what you'll get is a chunk of XML, as described in this earlier post.
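
Steps 2 and 3 can be sketched in a few lines of JavaScript.  The helper names here (glid_from_shared_page and feed_uri_for) are made up for illustration and aren't part of anything Spot provides, and the code assumes the URI layout shown above doesn't change.

```javascript
// Pull the glId out of a shared-page URI.  Returns null if there isn't one.
// (glid_from_shared_page is a hypothetical helper, not part of any Spot API.)
function glid_from_shared_page(shared_page_uri) {
 var match = /[?&]glId=([^&#]+)/.exec(shared_page_uri);
 return match ? decodeURIComponent(match[1]) : null;
}

// Build the data URI described in step 3 from a glId.
function feed_uri_for(glid) {
 return "http://share.findmespot.com/messageService/guestlinkservlet" +
  "?glId=" + encodeURIComponent(glid) + "&completeXml=true";
}

var shared = "http://share.findmespot.com/shared/faces/viewspots.jsp?glId=0BW5A0B1QQFTHG4GivzbYsQFyIFo5VHTh";
var glid = glid_from_shared_page(shared);
console.log(feed_uri_for(glid));
```

The result is the URI you'd hand to whatever does your HTTP GET.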

Displaying the data

Once you get the data out of your Spot shared page, you need to parse the XML, then turn it into data you can display using the mapping API of your choosing.  These days the most popular option is Google Maps, which has a rich API and which gives the programmer a lot of control over what goes on the page and how the user interacts with it.  I've written a very simple example of using Spot data to create a track on a Google map, with clickable track points and pop-up info windows that show the timestamp and the distance from the previous track point.

I wrote it in JavaScript because I wanted to be able to provide code you could play with without having to install additional infrastructure (languages, libraries, etc.).  If you're writing a real tracker you probably don't want to do this, since it puts some computational and storage load on the user's browser (seriously, doing trigonometric calculations in a browser window doesn't seem like a great idea).  It's cleaner and more efficient to do the computational work (including XML parsing) on the server side, in the language of your choice.

Note that in the code below I'm grabbing the data from localhost.  I made a local copy, since Spot only keeps your data for a week.  If you're writing a real tracker you'll have to add local data storage and data management (de-duplication, for example) to your code.

I've also left out all error-checking, data validation, etc., to keep the sample code compact and clean.  If you're writing code for production use you must check to make sure that operations are successful, that the data you're getting are clean, etc. - if you're going to fail, fail gracefully.  As an example of what I'm talking about, take a look at extract_gps_data, and notice that I'm making a lot of assumptions about what data are present and that the XML document hasn't been corrupted in some way - that's terrible programming practice.  Checking for run-time errors and validating your data gives you control over what your users experience if something goes wrong.  A program that "works" doesn't really work if it blows up on unexpected inputs.
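
As a rough sketch of the kind of validation and de-duplication I mean, here's one way to clean up an array of point objects (the same shape as the ones the sample code builds) before putting them on a map.  The function names is_valid_point and clean_points are made up for this example, and treating "same timestamp and same position" as a duplicate is an assumption on my part; a real tracker may well want something smarter.

```javascript
// True if a point has a parseable timestamp and coordinates in a sane range
function is_valid_point(p) {
 var lat = parseFloat(p.latitude);
 var lng = parseFloat(p.longitude);
 return !isNaN(Date.parse(p.timestamp)) &&
  isFinite(lat) && lat >= -90 && lat <= 90 &&
  isFinite(lng) && lng >= -180 && lng <= 180;
}

// Drop invalid points and repeated readings, then sort oldest-first
function clean_points(points) {
 var seen = {};
 var cleaned = [];
 for (var i = 0; i < points.length; i++) {
  var p = points[i];
  if (!is_valid_point(p)) { continue; }
  // assumption: two readings with the same time and place are duplicates
  var key = p.timestamp + "|" + p.latitude + "|" + p.longitude;
  if (seen[key]) { continue; }
  seen[key] = true;
  cleaned.push(p);
 }
 cleaned.sort(function (a, b) {
  return Date.parse(a.timestamp) - Date.parse(b.timestamp);
 });
 return cleaned;
}

// Made-up sample data: one duplicate, one bad timestamp, one bad latitude
var sample = [
 { timestamp: "2012-03-27T20:08:12.000Z", latitude: "64.9", longitude: "-147.1" },
 { timestamp: "2012-03-27T19:58:12.000Z", latitude: "64.8", longitude: "-147.2" },
 { timestamp: "2012-03-27T19:58:12.000Z", latitude: "64.8", longitude: "-147.2" },
 { timestamp: "garbage", latitude: "64.8", longitude: "-147.2" },
 { timestamp: "2012-03-27T20:18:12.000Z", latitude: "640.8", longitude: "-147.2" }
];
console.log(clean_points(sample).length + " usable points");
```

Silently dropping bad points is only one option.  Depending on your users you might prefer to log them or show a warning.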

So here's the code.  Drop me a line if anything isn't clear, or if you notice a problem.


<!DOCTYPE html>
<html>
<head>

 <title>My Wee Tracker</title>
 <meta name="viewport" content="initial-scale=1.0, user-scalable=no" />
 <style type="text/css">
  html { height: 100% }
  body { height: 100%; margin: 0; padding: 0 }
  #map_canvas { height: 100% }
 </style>
 <script type="text/javascript" 
  src="http://maps.googleapis.com/maps/api/js?key=AIzaSyBFoJjPtS9vWXIENOa-egd0XFFnnQbfTIk&sensor=false&libraries=geometry">
 </script>

<script type="text/javascript">
//<![CDATA[

// load_xml_doc takes a uri and retrieves the document at that location,
// and returns it.  

function load_xml_doc(uri)  {
 // synchronous GET: fine for a demo, but it blocks the page while it loads
 var request = new XMLHttpRequest();
 request.open("GET", uri, false);
 request.send();
 return request.responseXML;
}


// "point" is an object we use to hold the data we'll be putting on the map
//  I hate that javascript has us declare classes as functions

function point(timestamp, latitude, longitude)  {
 this.timestamp = timestamp;
 this.latitude = latitude;
 this.longitude = longitude;
}

// here's where we pull the tracker data out of the XML document
// and convert it into something easier to deal with when scribbling
// on the map.  It creates and returns an array of tracker points (messages) 

function extract_gps_data(trackerdata)  {
 var points = new Array();
 
 // declare everything with var so we don't leak globals
 var tracker_points = trackerdata.documentElement.getElementsByTagName("message");
 for ( var i = 0 ; i < tracker_points.length ; i++ ) {
  var tracker_point_node = tracker_points[i];
  var timestamp = tracker_point_node.getElementsByTagName("timestamp")[0].textContent;
  // the XML gives us strings; the map wants numbers
  var latitude = parseFloat(tracker_point_node.getElementsByTagName("latitude")[0].textContent);
  var longitude = parseFloat(tracker_point_node.getElementsByTagName("longitude")[0].textContent);
  
  var point_holder = new point(timestamp, latitude, longitude); 
  points.push(point_holder);
 }
 return points;
}
 
function makeinfobox(pointnum, thispoint, theotherpoint)  {
 var latlnga, latlngb; 
 var distance;
 var infoboxtext;
 var timestamp;
 
 timestamp = new Date(thispoint.timestamp); // we convert it from ISO format to something more readable
 infoboxtext = String(timestamp);
 if (pointnum > 0)  {  // no point calculating distance on the first point
  latlnga = new google.maps.LatLng(thispoint.latitude, thispoint.longitude);
  latlngb = new google.maps.LatLng(theotherpoint.latitude, theotherpoint.longitude);
  distance = google.maps.geometry.spherical.computeDistanceBetween(latlnga, latlngb) / 1609.344; // meters to miles
  infoboxtext = infoboxtext + "<br />" + distance.toFixed(2) + " miles";
 } 
 return infoboxtext; 
}

// here's our pseudo-"main"

function initialize()  {
 var i = 0;
 var trackline = new Array();
 var windowtext;

 // First we pull down the tracker data and load it into an array of point objects

 var trackerdata = load_xml_doc("http://localhost/~melinda/trackerdata.xml");
 var points = extract_gps_data(trackerdata);
 
 // Next, we set up the map
 
 var spot = new google.maps.LatLng(points[0].latitude, points[0].longitude);
 var my_options = {
  center: spot,
  zoom: 12,
  mapTypeId: google.maps.MapTypeId.ROADMAP
 };
 var map = new google.maps.Map(document.getElementById("map_canvas"), my_options);


 // one info window is enough; every marker reuses it
 var infowindow = new google.maps.InfoWindow();

 for ( i = 0 ; i < points.length ; i++ )  {
  spot = new google.maps.LatLng(points[i].latitude, points[i].longitude);
  // here we create the text that is displayed when we click on a marker
  // (points[i-1] is undefined when i is 0, but makeinfobox never looks at it then)
  windowtext = makeinfobox(i, points[i], points[i-1]);  // if you tell anybody I did this I'll deny it vehemently
  var marker = new google.maps.Marker( {
   position: spot, 
   map: map,
   title: points[i].timestamp,
   html: windowtext
  } );

  // when you click on a marker, pop up an info window
  google.maps.event.addListener(marker, 'click', function() {
   infowindow.setContent(this.html);
   infowindow.open(map, this);
  });

  // set up the array from which we'll draw a line connecting the readings
  trackline.push(spot);
 }  
 
 // here's where we actually draw the path 
 var trackpath = new google.maps.Polyline( {
  path: trackline,
  strokeColor: "#FF00FF",
  strokeWeight: 3
 } );
 trackpath.setMap(map);
}
//]]>

</script>
</head>

<body onload="initialize()">

<div id="map_canvas" style="width:100%; height:100%"></div>

</body>
</html>

Friday, April 13, 2012

Out-and-back is hard

The Kobuk 440 is underway!  It's basically the last race of the season, a very highly-regarded mid-distance race from Kotzebue to Kobuk and back.  It draws a nice mix of participants, from big-name professional mushers to smaller kennels from the coastal villages.  By all reports the hospitality is incredible.  Checkpoint in/out times are here, and kudos to the Kobuk organization for saving themselves some work and using Google Docs to display the data.

Online tracking is being provided by Trackleaders, who are definitely growing into the go-to guys for dogsled race tracking.  They've developed a lot of experience and domain-specific knowledge over the past few years.

One disappointment in the tracking has been the disappearance of the "Race Flow" plot.  It showed up briefly after one refresh and disappeared after the next, and I'm hopeful that they've just taken it offline while working out some kinks and that it will be back next year.

In the meantime things seem to be going pretty smoothly, modulo Ed Iten's tracker seeming to have been left in Noorvik.  Unfortunately the race projections are dubious, as usual, but this time they're providing a pretty great illustration of some of the challenges around computing statistics for out-and-back races.  To wit, here's a screen shot from this morning (note that the "race clock" and all other times are given as days:hours:minutes since the start of the race, April 12th at 6pm).


I've circled Ed's projected times into Kobuk (the turnaround point) and Kotzebue (the finish).  You'll notice that he's projected to arrive into Kobuk at two days, nine hours and change into the race, and to finish one day, five hours, and 21 minutes into the race.  In other words, the projections show him finishing the race more than a day before arriving at the halfway point.  While I think most of us agree that the laws of physics are often inconvenient, I hope that we can also agree that there's really not that much we can do about them, and that the projections are being calculated incorrectly.  You'll also note that Ed is shown as winning despite his tracker being in last place by about 40 miles.  These both have the same root cause: I'm pretty sure that they're doing projections based on straight-line distance to given checkpoints, rather than doing them based on trail distance.  For any given blob of tracker data all you really know is when and where it was sampled - you don't know how fast the tracker was moving or the direction in which it was moving, so it's incumbent on the folks developing the software to compute that stuff for us based on the set of data.
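
To make the trail-distance idea concrete, here's a minimal sketch of projecting arrival times from miles traveled along the trail rather than straight-line distance to a checkpoint.  I have no idea what Trackleaders' code looks like, so this is not their method.  The checkpoint mileages are round illustrative numbers, project_arrivals is a made-up name, and assuming a constant average speed from here on is a big simplification.

```javascript
// Checkpoints are listed in race order with cumulative *trail* miles, so the
// finish (back in Kotzebue) is a different entry from the start even though
// it's the same spot on the map.  The mileages are illustrative, not official.
var checkpoints = [
 { name: "Kotzebue (start)", trail_mile: 0 },
 { name: "Kobuk (turnaround)", trail_mile: 220 },
 { name: "Kotzebue (finish)", trail_mile: 440 }
];

// Project arrival times (in hours on the race clock) at every checkpoint the
// team hasn't reached yet, assuming it holds average_mph from here on.
function project_arrivals(race_clock_hours, trail_mile, average_mph, checkpoints) {
 var projections = [];
 for (var i = 0; i < checkpoints.length; i++) {
  var remaining = checkpoints[i].trail_mile - trail_mile;
  if (remaining <= 0) { continue; }  // already been there
  projections.push({
   name: checkpoints[i].name,
   eta_hours: race_clock_hours + remaining / average_mph
  });
 }
 return projections;
}

// A hypothetical team 15 hours into the race, 100 trail miles out, averaging 8 mph
var etas = project_arrivals(15, 100, 8, checkpoints);
for (var j = 0; j < etas.length; j++) {
 console.log(etas[j].name + ": " + etas[j].eta_hours.toFixed(1) + " hours");
}
```

Because the trail mile only ever goes up, a later checkpoint can never be projected ahead of an earlier one, which is exactly the property the straight-line approach loses on an out-and-back course.  The hard part, which this sketch skips entirely, is working out the trail mile from a raw position when the outbound and inbound trails overlap.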

So, these projections aren't useful, unfortunately.  I tend not to like projections much, anyway, since they don't accommodate the art and judgement part of the runtime equation (what are trail conditions?  how's the weather?  This guy rested a lot in a checkpoint on this leg but camped for 6 hours on another one - what's he going to do on the next leg?  etc.), but these are just wrong because they're being computed on an incorrect basis.

Maybe they'll get it straightened out for next year!  I hope so, although I imagine they've got another busy summer of tracking bike races.  I'm also looking forward to seeing what they do with the race flow chart.  It's got a few hiccups but I think it's an incredibly useful visualization tool, possibly the single best tool for understanding what's happening in a race available from any online tracking system today.

In the meantime, enjoy the last big race of the season as we turn our plans towards summer and preparing for next winter's mushing.

Tuesday, April 3, 2012

A few quick notes


  1. The season is winding down but it's not over yet!  The Kobuk 440 starts on April 12 (mass start!  I hope they get some video).  Trackleaders will be providing race tracking
  2. The new Mushing magazine is out and includes an article I wrote on GPS race tracking
  3. Having been peripherally involved in organizing several local races, I'm kind of appalled by the amount of paperwork that has to be done the hard way (manually).  So, I'm starting to write a race management package - RGOs have events, events have races, races have OMGWTF, etc.  One goal is to automate much of the form generation and accounting, and I hope to integrate it with race reporting but there are some tough constraints, like not being able to assume network connectivity, a few volunteers not being comfortable using computers, etc.  Any thoughts or wish-list items would be much appreciated
  4. Check it out - a blog on backpacking stoves!  It has excellent stuff on cold weather fuel considerations, etc.  I don't know about you but I almost always carry an Esbit solid fuel stove when out on the trail.  They weigh next to nothing and fold down nearly flat and I think they're great survival gear, but there's no way that I'd choose to cook with one given an option.  So, when I expect that I'll actually be cooking I have a couple of Jetboils and I'll carry one, although they aren't great in the cold.  

Thursday, March 22, 2012

Another tracking service!

The 2012 Percy DeWolfe started this morning.  It's a very nice, small race from Dawson City to Eagle and back on the Yukon River.  This year there are six entries in the main race, with additional entries in the shorter "Percy Junior."

I was a little surprised to see that they were providing trackers (although it's something we've talked about doing for the Two Rivers 200, another friendly, small mid-distance race).  When I saw that trackers would be available I assumed it would be through Trackleaders, since they're low-cost, use inexpensive hardware, and are familiar to the dog mushing community because they're used in so many races.

So, I was very surprised to see that the tracking service is being provided by Mammoth Mapping, a Dawson City-based GIS company.  They're projecting locations from Spot devices onto an embedded Google Earth map.  They aren't doing much beyond projecting locations, although if you click on each of the mushers you'll see their checkpoint times, which I gather are pulled out of manually-entered checkpoint timesheets.  There are no speed or distance computations, no historical track, no analytical tools.

I think that at this point, given what's been available through Trackleaders and IonEarth (the tracking service provider for the Iditarod), if this were a bigger race or one that attracts a broader fan base, people watching the trackers would be frustrated by the limited functionality provided by the Percy trackers.  But it seems to me that this is just about right for a small, friendly, slightly out-of-the-way race.  When talking about whether or not to provide trackers for races here in Two Rivers one person said "I don't want [famous musher] showing up!", suggesting that he thought that trackers were both an indicator of a fancy, high-end race and likely to draw big-name mushers, causing us to lose the friendly local feeling.  What we're seeing with the Percy is that there's a happy medium, where we can watch the race unfold without bringing to it the competitive, less intimate feeling of one of the huge mid-distance races.

Sunday, March 18, 2012

Experimenting with Google Docs spreadsheet

The Two Rivers 200 was this weekend.  Although it's a small, local Alaskan race we always get a few ringers in there, and now that mushing has turned into a spectator sport interest in seeing results as they come in extends beyond just the participants and friends.  So, we decided to try using a Google Docs spreadsheet to share the data with fans online.

Throwing something basic together was trivial, but I'm terrible at user interface and esthetics so Chris cleaned it up to make it more legible and easier to deal with.  Here's what we ended up with, basically just in/out data.

Once we had that in place it took just another 1/2 hour or so to put together a sheet that contained run-time summaries for time between checkpoints (see the tabs on the left-hand side of the bottom of the spreadsheet).  Google spreadsheets does a nice job supporting multiple sheets and allowing referencing between sheets (and hooray for providing the ability to do arithmetic on time and date!).

Here's what we found:

  • This was quick and easy to put together, and having it embeddable in other web pages is a big win
  • Format and arithmetic support was richer than I expected
  • It's something that pretty much anybody who's ever worked on a spreadsheet can figure out how to use
  • We did lose a little bit of data in what appeared to be a race condition situation - two people working on the same cell at the same time.  When we noticed it we added it back
  • We really needed to be more systematic about how we gathered the data in the first place, since there were checkpoints with no phone or internet, etc.
  • There appears to be a data validation glitch, in that stuff I didn't want displayed if it didn't meet some criteria was displayed, anyway
  • I think we can code our way around that last one (for example, if a given value is less than 0 display 0).
  • I love being able to provide different views into the same data and that's definitely something we can do here.  Fans are often interested in things like run times, average traveling speeds, etc., and that's something we can do pretty easily with spreadsheets.  We can do it with web pages, too, but frankly it's a lot more work.  I cannot overstate how easy this was.
  • The big drawback was that you need internet connectivity to be able to update the spreadsheet, and that's not always available.  
So, basically I think it was a win, and certainly the data display was superior to much of what we've seen from a number of other small races.  The biggest challenge is definitely the internet connectivity question.

[edited to add: I don't think this is a great approach for major races, since I think the question of who owns the data and what happens to them later is not a trivial one]

Thursday, March 8, 2012

Speed calculations - where do they come from?

When you click on a racer on the IonEarth satellite tracker, it shows you their geographic coordinates, the time of the reading, the race mile, and their speed.  I addressed the question of how they calculate the race mile in this post, and I thought it might be interesting to see how they calculate speed.  One thing that interested me a lot is why their speed numbers are so smooth, compared to the ones being computed by the Spot-based trackers.  The latter can produce quite erratic numbers with a systematic bias on the low side.  Ultimately where I ended up is that the IonEarth GPS units are probably doing more work than I realized, and are another example of good hardware engineering and smart tradeoffs.

Consumer GPS units typically calculate and display speed for you, but they're taking frequent location readings, which drains the batteries pretty quickly.  The IonEarth battery lasts for two weeks in extreme cold, so it seems unlikely that they're taking frequent readings.  My starting assumption was the obvious bet that they're using the distance between adjacent readings to calculate speed.  I chose Paul Gebhardt at random and two readings as he was nearing Rainy Pass: one at 10:20am on Monday and the following at 10:30am on Monday.  Based on the coordinates from the GPS readings, he traveled 1.35 miles in that 10 minutes.  Multiply by 6 (since we've got 6 10-minute periods in an hour), and we get 8.1 mph.  However, the tracker says he was traveling 8.4 mph in that period.  Why the difference?
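
Here's a minimal sketch of that "distance between adjacent readings" calculation.  It uses the standard haversine formula for great-circle distance and an Earth radius of 3958.8 miles.  The function names are mine, and nothing here is known to be what IonEarth or anybody else actually runs.

```javascript
var EARTH_RADIUS_MILES = 3958.8;

// Great-circle (haversine) distance in miles between two lat/lon pairs in degrees
function distance_miles(lat1, lon1, lat2, lon2) {
 var to_rad = Math.PI / 180;
 var dlat = (lat2 - lat1) * to_rad;
 var dlon = (lon2 - lon1) * to_rad;
 var a = Math.sin(dlat / 2) * Math.sin(dlat / 2) +
  Math.cos(lat1 * to_rad) * Math.cos(lat2 * to_rad) *
  Math.sin(dlon / 2) * Math.sin(dlon / 2);
 return 2 * EARTH_RADIUS_MILES * Math.asin(Math.sqrt(a));
}

// Speed in mph given a distance and two timestamps (anything Date can parse)
function speed_mph(miles, earlier, later) {
 var hours = (new Date(later) - new Date(earlier)) / 3600000;
 return miles / hours;
}

// The example from the text: 1.35 miles between the 10:20 and 10:30 readings
console.log(speed_mph(1.35, "2012-03-05T10:20:00Z", "2012-03-05T10:30:00Z").toFixed(1) + " mph");
```

In a real tracker you'd feed distance_miles the coordinates from two consecutive readings and pass the result to speed_mph, after checking that the two timestamps aren't identical.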

So, I backed up to find the difference between the 10:30 reading and the 10:10 reading.  That gave me 2.7 miles in 20 minutes, or 8.1 mph.  Still no match.  Back one more!  Paul traveled 4.1 miles in 1/2 hour, or 8.2 mph.  I'm still not getting the 8.4 mph that the tracker claimed.  So, I went back and calculated the distance he'd traveled since 9:30, or exactly an hour, and I got 7.8 miles, or 7.8 mph - not even close.

This is starting to get annoying.

What I've been doing, here, is trying to calculate what are known as "moving averages."  Moving averages are a terrific way of smoothing erratic data and looking at longer-term trends.  It's a handy tool for taking a look at stock trends, polling data, all sorts of collections of numbers where you're interested in how they change over time.  I've noticed that the Iditarod speed numbers are more consistent than what we'd seen in the Quest and it occurred to me that they might be calculating moving averages on speed (although when they're stopped, they're stopped) as a way of smoothing the speed numbers.  If they are, I can't see it, since the speed numbers all seem to be on the high side of what I'm calculating (you wouldn't see a consistent bias in an average like that).
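
For reference, a simple trailing moving average looks something like the sketch below.  The speeds are made-up numbers, and this says nothing about what IonEarth actually computes.

```javascript
// Simple trailing moving average: each output value is the mean of the last
// window_size inputs (or of however many are available so far).
function moving_average(values, window_size) {
 var averaged = [];
 var sum = 0;
 for (var i = 0; i < values.length; i++) {
  sum += values[i];
  if (i >= window_size) { sum -= values[i - window_size]; }  // drop the oldest value
  averaged.push(sum / Math.min(i + 1, window_size));
 }
 return averaged;
}

// Made-up per-interval speeds in mph, including a stop
var speeds = [8.4, 8.1, 8.2, 0, 0, 7.9];
console.log(moving_average(speeds, 3));
```

Notice what happens around the stop: the averaged speed tapers off instead of dropping straight to zero, which is why a tracker that smooths this way would probably want to special-case stopped teams.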

So, the next candidate is weighting.  Trackleaders knows that since they're calculating speed based on the distance between two points and a racer is never traveling on a perfectly straight line, they need to do something to correct for underestimating the distance traveled and therefore underestimating the speed.  So, they increase the distance by about 7%, or basically multiply the distance by 1.07.  Is that what IonEarth is doing?  Back to the calculator!  Let's see if the speed calculations are consistently high by some percentage.

So, the tracker says he traveled at 8.4 mph from 10:20 to 10:30.  Based on the straight-line distance he traveled at 8.1 mph.  In other words, the tracker says he traveled about 4% faster than my calculations.  From 10:10 to 10:20 he traveled 1.4 miles.  The tracker says he was traveling 7.7 mph, and by my calculation he was traveling 8.4 mph.  Whoa!  That's off in the other direction, so clearly they're not weighting.
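
The check itself is only a few lines.  If the tracker were applying a constant multiplier such as 1.07, the percentage difference between its speed and the straight-line speed would come out about the same for every interval.  The function name here is made up for the example.

```javascript
// How far the tracker's reported speed is from the straight-line speed, in percent
function percent_difference(reported_mph, computed_mph) {
 return (reported_mph / computed_mph - 1) * 100;
}

console.log(percent_difference(8.4, 8.1).toFixed(1) + "%");  // tracker vs. me, 10:20 to 10:30
console.log(percent_difference(7.7, 8.4).toFixed(1) + "%");  // tracker vs. me, the earlier interval
```

One difference comes out positive and the other negative, which a fixed correction factor can't produce.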

So, so far it's not really clear where the speed calculations are coming from, by just looking at the data being provided to us.  However, my best guess at this moment would be that the GPS is actually taking location readings more often and using those to calculate speed, but the only locations they're displaying on the map are 10 minutes apart.  One of the primary issues here is preserving the battery, and I think it would almost certainly use less power to calculate the speed on the GPS device than it would to uplink the additional data.  If this is what's going on it's another example of excellent hardware engineering on the part of IonEarth.

Tuesday, March 6, 2012

You know what would be pretty great?

If IonEarth had some predefined sets of mushers, so you could choose one menu item to see all the rookies (the rookie race is a major race-within-a-race), the women, the Canadians, etc.  There are a few subsets of the racers that group together naturally and that we're trying to sort through.  What groupings would you like to see?