We find bug in GISTEMP; GISS fixes it

Reto Ruedy of GISS has changed GISTEMP to fix a collection of minor bugs in STEP5’s SBBXotoBX.f, which David Jones and I found while re-implementing STEP5 in Python.  The fix had no effect on the final numeric outputs of GISTEMP.

This particular program combines land and ocean temperature data.  Each sub-box (an area of about 64,000 km^2) is given an “ocean weight”, depending on the amount of ocean data and the distance to the nearest surface station.  The land and ocean series for each sub-box are then given weights depending on the ocean weight and on the number of valid monthly temperatures.  Finally, the 200 series for each box (the land and ocean series for each of its 100 sub-boxes) are combined, in order of decreasing weight, to form a single series for the box.
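The combination scheme can be sketched in miniature. This is a hypothetical Python illustration of the order of operations described above (a weighted running mean, heaviest series first), not the actual logic of SBBXotoBX.f; all names here are invented.

```python
def combine_box(series_list):
    """Combine (weight, data) series for one box, heaviest first.

    Hypothetical sketch only, not the SBBXotoBX.f algorithm: each
    series is folded into a weighted running mean of monthly values,
    in order of decreasing weight.
    """
    # Sort into decreasing-weight order, so the best-supported
    # series anchors the combination.
    ordered = sorted(series_list, key=lambda s: s[0], reverse=True)
    combined = None
    total_weight = 0.0
    for weight, data in ordered:
        if combined is None:
            combined, total_weight = list(data), weight
            continue
        if weight == 0.0:
            continue  # zero-weight series contribute nothing
        combined = [(c * total_weight + d * weight) / (total_weight + weight)
                    for c, d in zip(combined, data)]
        total_weight += weight
    return combined
```

Note that any series with zero weight drops out of the running mean entirely, which matters for the bug discussed below.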

The error was in the way the land and ocean series were combined after sorting into order: sometimes the index of an entry in the sorted set was used to index into the unsorted set.

As it happens, with the parameters used for this program (in particular, the Rintrp parameter set to zero), this error has no effect: the ocean weight is always either 1 or 0, so after sorting, the second half of the set of data series always has zero weight.
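The indexing mistake can be illustrated in miniature. This is hypothetical Python with invented names (the real code is Fortran):

```python
# Hypothetical miniature of the bug described above; the weights and
# series names are invented, not taken from SBBXotoBX.f.
weights = [0.0, 1.0, 1.0, 0.0]   # with Rintrp = 0, ocean weights are 0 or 1
series  = ['ocean-1', 'land-1', 'land-2', 'ocean-2']

# Indices of the series, sorted into decreasing-weight order.
order = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)
# order == [1, 2, 0, 3]: the zero-weight series sort to the back.

for k in range(len(order)):
    correct = series[order[k]]   # entry k of the *sorted* set
    buggy   = series[k]          # bug: k used to index the *unsorted* set
```

Because the weights here are only ever 0 or 1, every zero-weight series lands in the second half of the sorted set and contributes nothing to the weighted combination, which is why, as described above, the mix-up left GISTEMP’s output unchanged.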

In email to David and myself, Reto Ruedy expressed thanks to us and to the CCC-GISTEMP project.

7 Responses to “We find bug in GISTEMP; GISS fixes it”

  1. Tweets that mention Clear Climate Code Blog Archive We find bug in GISTEMP; GISS fixes it -- Topsy.com Says:

    […] This post was mentioned on Twitter by wmconnolley, Clear Climate Code. Clear Climate Code said: We find bug in GISTEMP; GISS fixes it: Reto Ruedy of GISS has changed GISTEMP to fix a collectio.. http://bit.ly/4RV58L #climatechange […]

  2. turkeylurkey Says:

    I imagine that it would be interesting to eventually plot the ‘long outreach’ situations where a rather distant record is used to create or ‘adjust’ another record.
    (i.e., Excel has a way to show the cells which are controlled by other cells).

    I think this would be a good way to highlight the stations whose influence extends far beyond their merits.
    Hopefully you could ‘instrument’ the code to generate a separate file where the ‘influencers’ of a given record or gridbox are captured during the making of the sausage.
    Probably the best way to do this is to just keep the Lat/Lon and Station ID for each ‘influencer’.
    This would potentially highlight two vulnerabilities; inappropriate ‘Rural’ UHI adjustments, and inappropriate spatial extrapolation.
    Chiefio has described these situations, but elucidation beyond counting them has not been feasible with the Gistran ‘code’.

    I imagine this would be a benefit of the ‘clean machine’ you have built.

    Of course, I think his suggestions of examination of the massive deletions of thermometers after 1990 (but not from the baseline), and the whole South America thing, are clearly higher priorities.
    But these are questions I’ve been hoping to ask him, while realizing it would be an heroic endeavor to implement on his running of the legacy code.
    TIA
    TurkeyLurkey
    If you can spit out the files, I’ll find a way to plot them on a georeferenced grid.

  3. TurkeyLurkey Says:

    After sleeping on it, I realized that the list of ‘influencers’ might be temporally dynamic…

    So, one easy way to assess this is to count the number of influencers of any given station.
    So, the desired output would be a matrix with one ‘station record’ per row (Chiefio found 4 quartiles of 3368 such station records), and one column per year (month?) of operation. The individual cell entries could be the count of ‘influencers’ for that time period.

    Chiefio had mentioned the dynamic nature of such influencers, as their records might enter or leave the fray when their ’20-year’ qualifier status flips and flops. This might indeed be the impetus behind some very odd adjustment behaviours, as apparently occur.

    From the status of the GimpTran code, they would seem to have very little ability to get insight into behaviours of this ilk.

    If indeed you see significant variability in the influencer count, this would tell you how to design the more detailed probe instrument, which would capture the erratic adjusters in action.

    As far as I can tell, CCC is alone in its ability to root out this kind of algorithmic anarchy; Chiefio has caught glimpses of it, but his evidence is anecdotal and, to date, irreproducible.

    If there is not a significant level of temporal dynamism in the influencers of the stations, then it will be relatively straightforward to capture the {lat/lon/StationId} of each influencer.
    There I guess the desired format would be to have the Influenced station data, the number of influencers, and then their info.
    This would greatly facilitate the drawing of a geo-referenced map of ‘spider-webs’ where each web shows the vector(s) drawn from influencer to influenced.

    Again, I think that CCC is unique among the ‘Army of Davids’ in their technical ability to address these points. The others are all just looking at the raw inputs and adjusted outputs, without delving into the ‘blackbox’ that is between them.
    (OK, Chiefio could obviously do it, but it is much harder in the context of his chosen mission.)

    Like he said, ‘It is a beautiful thing’ that you are doing. Bless you all.
    TIA
    TL

  4. TurkeyLurkey Says:

    OK, I found another person pointing in the direction of algorithmic anarchy:

    http://www.kilty.com/pdfs/revisions.pdf

    Anyway, a nice review of what the adjustments are purportedly striving to accomplish.

    I think that the CCC project gives the opportunity to construct validation test cases to demonstrate fulfillment or mangling of the assumed functionality of the adjustment procedures.

    As Chiefio ~sez, the big news is probably in the behavior of the (narrowly-validated-conceptual) adjustment procedures, when applied to a large (temporally and spatially inconsistent) dataset, for which the adjustment procedures give bizarre results.

    Your guidance in the creation of suitable test sets will be most valuable.
    TL

  5. TurkeyLurkey Says:

    Dropping thermometers:

    Of the many fine posts by Chiefio, I think this one describes the issue best:
    http://chiefio.wordpress.com/2009/10/22/thermometer-langoliers-lunch-2005-vs-2008/#20yrs

    Again, CCC is the only horse in the race to unearth this issue.
    Regards,
    TLakaAC

  6. Nick.Barnes Says:

    TurkeyLurkey, you have written a great deal here and I don’t have time to respond to very much of it. Thank you for the complimentary things you write about the CCC project. You make a lot of references which I don’t immediately understand, for instance to “GimpTran”, to “20-year status”, to “the whole South America thing”, and so on.

    Your concerns as expressed are too high-level or vague for me to be able to address them in the limited time I have available. If you can be more specific and succinct, I might be able to respond constructively. For instance, if you can say “this part of step0.py seems strange to me, can you explain it?” or “how do the results change if you omit this section, or these stations?”, then one of us might well be able to help.

    As I have said elsewhere, I believe that the CCC-GISTEMP code can and will provide a baseline for other people to investigate the effects of varying the GISTEMP algorithms, and of course you are very welcome to use it in that way. If you find it hard to understand, you should report that as a bug.

    Some of your remarks, such as “GimpTran”, seem as if they might be intended to cast aspersions on GISS staff or their work, in which case they have no place here; please take care to avoid giving that appearance.

  7. Climate codes | Klimapolis Says:

    […] most fascinating aspects is the amount of constructive skepticism the CCC people show. When they found a bug in GISTEMP, they sent the information to NASA who fixed it. Ultimately, GISS may take over the new […]
