Tuesday, February 19, 2013

Using my spreadsheets for race data - first post

Somebody-or-other said "The two qualities of a great programmer are laziness and hubris," but I'm too lazy to look it up (and am sufficiently hubristic to suggest that I'm a great programmer by claiming to be lazy).  When you create something it's a really good idea to make it easy to adapt for other uses.  So I put together these spreadsheets for my own use during the Yukon Quest and I think they're now mature enough that someone else may find them interesting.  I've been trying to figure out how to write up a description in a way that's got minimal babbling and maximal useful information and haven't been very successful, so I thought it might be worth stepping through each of the fields.

But first, design goals.  In the best of all possible worlds,

  1. you should only have to enter a piece of data once, and
  2. you should never, ever have to do arithmetic
And that's what I did with the spreadsheets.

I started with something in mind along the lines of a relational database implemented with a spreadsheet.  That is to say, I'd have different sheets/pages within the spreadsheet for each checkpoint, plus the start and finish.  On each sheet, all I'd enter would be the musher's name, the time in, and the time out, and the other data would all be calculated from those.

I chose to use bib numbers to tie data together.  For example, if I wanted to calculate runtime between two checkpoints for Brian Wilmshurst (bib number 1), I'd search for the departure time for bib #1 on the sheet for the previous checkpoint and subtract that from his arrival time at the current checkpoint.  The reason that I chose bib number instead of name that there can be a lot of variation in the spelling of some names, especially when volunteers are tired, and if there was an error it would be easier to find it if we used bib numbers.  During the Copper Basin this year one musher's name went back and forth between "Matt" and "Mark" across multiple checkpoints, which convinced me to use something more reliable - bib numbers!

A couple of words on errors: it's a matter of professionalism for code to recognize and, where possible, correct errors.  A programmer who isn't diligent about error (or exception) handling is a slob.  What I found with the Google Drive spreadsheet (and spreadsheets in general, I guess) is that it's pretty easy to catch certain kinds of errors so that you wouldn't end up with garbage in cells, but catching input errors is pretty hard. Because of this, over the long term I think it's probably a good idea to write a race management package that's a little more specialized and that can be more robust against errors that are otherwise difficult to process in spreadsheets.

I'm going to assume that you've got some basic experience with spreadsheets, that you know that columns run up-and-down and rows run across, and that columns are labeled with letters, rows are labeled with numbers, and a "cell" is specified by its row and column.

So, I put mushers in rows and data types in columns.  Here's the column header for the race start, along with the first few rows:

As you can see I've got a column for the "position in," which is actually the bib number and should be changed to reflect that, the musher name, and the date and time out.  That's it!

But you'll probably have noticed that I've got the date and time out in the same field (cell).  It's possible to put the date in one cell and the time in another cell, but combining them makes it easier to do arithmetic on them (particularly across day values - for example, someone leaves at 11pm Tuesday and arrives at 3am Wednesday).

Here's what the column headers and first few rows for each checkpoint sheet looks like:

For each team we've got a cell for the bib number, name, date and time in, runtime from previous checkpoint, speed from previous checkpoint, speed rank, date and time out, and rest time at that checkpoint.  The only cells in which I actually entered data were the name, date and time in, and date and time out.  All of the other fields (including bib number - more on that later) were calculated.

Next up, how the arithmetic works.

No comments:

Post a Comment