Gavin’s bug

(I call it Gavin’s bug because he found it, not because he created it).

Gavin’s bug concerns Step 3, and in particular the part where for each cell a list of stations within 1200 km is created. Creating this list of “nearby” stations means potentially consulting all the stations. All the stations could be quite a lot. So for each of the 80 boxes (see overview.txt), the code restricts the search list to an extended box. Here’s a typical box (in white) and the extended box (in green):

In principle it’s faster to first cull the entire station list by rejecting those stations outside of an enclosing rectangle based on latitude and longitude, because we can do that quickly without needing to do the trigonometry for the proper “within 1200 km” test.

The bug is that the rectangle chosen for the culling is not large enough. Here’s the code from revision 665 of step3.py:

# Extend box, by half a box east and west and by arc north
# and south.
extent = [box[0] - arcdeg,
box[1] + arcdeg,
box[2] - 0.5 * (box[3] - box[2]),
box[3] + 0.5 * (box[3] - box[2])]
if box[0] <= -90 or box[1] >= 90:
# polar
extent[2] = -180.0
extent[3] = +180.0

Note that the comment says that the box is extended by half a width east and west. The polar boxes (there are 4 for the North Pole, and they all meet at the pole) have to be special-cased because otherwise stations close to the pole would be missed.

The bug concerns the boxes that span latitudes 44 to 64 (there are 8 of these in each hemisphere, see overview.txt. The extended box isn’t quite wide enough:

But wait… that circle (above) is centred on the northwest corner of the box. When it comes to cells, it’s the centre of the cell that used to select the stations within 1200 km, and the northwest cell is set inside the box a little bit. Is it enough?

The white circle is 1200 km around the centre of the northwest cell. It just fits. Phew!

So even though there are point within 1200km of the box that are not inside the extended box, no stations are missed, because the cells inside the box are too large to be close enough to the edge.

Perhaps at one time computers were slow enough for this “optimisation” to make a difference, but it’s been irrelevant for probably at least 20 years. Whilst the code seems to be correct, it’s not clearly correct.

It would be simpler and clearer without it. So it’s gone.

It does change the results by a tiny tiny bit. The, essentially bogus, reasons for which I may writeup in another post.

5 Responses to “Gavin’s bug”

  1. Nick.Barnes Says:

    Could the inbox() function go as well?

  2. Gareth Rees Says:

    It seems kind of wrong to mark issue 95 as WontFix, though, since there was in fact a defect (the code wasn’t clear enough) and you did fix it.

  3. drj Says:

    @Gareth: I agree. Really what i wanted to say was “NotABug”. I’ve created Issue 102 for the code clarity bug.

  4. drj Says:

    @Nick: yes! now it’s gone too.

  5. Robert Way Says:

    Hey all at CCC

    See the following, you might find it interesting!

    http://www.skepticalscience.com/Arctic_Temperature_Change.html

Leave a Reply