
ccc-gistemp release 0.6.0

[ed: 2010-10-29: This was an announcement for release 0.6.0, but that has a bug (see comments). Please use release 0.6.1 instead. I’ve edited the article to update the links]

I’m pleased to announce ccc-gistemp release 0.6.1 (making the previous buggy 0.6.0 obsolete). ccc-gistemp is our project to reimplement the NASA GISS GISTEMP algorithm in clearer code (in Python). This release is the first release made under the aegis of the Climate Code Foundation.

Many of the significant changes in this release have already been previewed in earlier blog posts.

Further details are available in the release notes.

We intend to carry on our work on ccc-gistemp, and we urge you to download our code, try it, and read it. We welcome contributions.

The work for release 0.6.0 was carried out by David Jones and Nick Barnes, and the work for release 0.6.1 was carried out by David Jones.

ccc-gistemp release 0.5.1

I am pleased to announce ccc-gistemp 0.5.1. (The astute reader will note that there is no announcement for release 0.5.0: it is available but does not work in Python 2.5.1, so I fixed that for release 0.5.1.)

Compared to the previous release, the changes are not so grand. This release incorporates many incremental improvements to clarity. It also has a couple of bug fixes: one to cope with the fact that the GISTEMP source tarfile that we used changed its layout (see this comment, for example); and one to run once again on Python 2.4 (a thoroughly ancient version; please try to use Python 2.6).

I have spent a large amount of time trying to clarify Step 2, the peri-urban adjustment described in Hansen et al. 1999. I encourage you to try out this release, read the code, and help us improve it.

David Jones, Nick Barnes, and Ronan Lamy have contributed to this release.

ccc-gistemp release 0.4.0

[Updated: ccc-gistemp release 0.4.1 is now available]

I am pleased to announce ccc-gistemp release 0.4.0. This release is much clearer than previous releases. Give it a go.

  • Almost all of our code has now been rewritten to remove the Fortran style which remained from the original conversion from GISTEMP. Previous releases had greatly improved steps 0-2; this release continues the improvement work there and also carries those improvements through steps 3-5. Almost all of the code now has sensible variable and function names, clearer data handling, and helpful comments. Many unused variables and functions have been removed. The current core algorithm has 3740 lines of code, of which more than half are either comments, documentation strings, or blank.
  • Rounding has been completely eliminated from the system. Previously, rounding and truncation code was used to exactly emulate GISTEMP. Rounding made the code less clear, and Dr Reto Ruedy of NASA GISS confirmed that rounding was not important to the algorithm, so it has been removed. All temperature data is now handled internally as floating point degrees Celsius (previously it was a mixture of integer tenths, floating point tenths, and floating point degrees) and all location information is handled as floating point degrees latitude and longitude (previously it was a mixture of floating point degrees and integer hundredths).
  • In a normal run of ccc-gistemp, no data passes through intermediate files. Much of GISTEMP is concerned with generating and consuming intermediate files, to separate phases and to avoid keeping the whole dataset in memory at once (an important consideration when GISTEMP was originally written). We have now completely replaced this with an in-memory pipeline, which is clearer, automatically pipelines all the processing where possible, and avoids all code concerned with serialization and deserialization.
    We now have separate code to generate data files between the distinct steps of the GISTEMP algorithm, and to allow running a step from a data file instead of in a pipeline. This allows the running of single steps, and is useful for testing purposes.
  • Parameters that were previously scattered throughout the code, such as the 1200 km radius used when gridding and the number of rural stations (3) required to adjust an urban station, are now all to be found, with explanatory comments, in code/parameters.py.
  • It’s now possible to omit Step 4 and produce a land-only index, which closely matches GISTEMP.
  • It’s also possible to omit Step 2, and run the algorithm without the urban heat-island adjustment.
  • GISTEMP recently switched to using nighttime brightness to determine urban/rural stations. We made the corresponding change, which is switchable.
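To make the unit change described above concrete, here is a minimal sketch of normalising the older integer representations to floating point. It is not code from ccc-gistemp: the function names and the -9999 missing-value marker are assumptions chosen for illustration.

```python
# Illustrative only: these helpers are hypothetical, not ccc-gistemp functions.

def tenths_to_celsius(value_tenths, missing_tenths=-9999):
    """Convert integer tenths of a degree C to floating point degrees C.

    Returns None for the (assumed) missing-value marker.
    """
    if value_tenths == missing_tenths:
        return None
    return value_tenths / 10.0


def hundredths_to_degrees(value_hundredths):
    """Convert integer hundredths of a degree of latitude or longitude
    to floating point degrees."""
    return value_hundredths / 100.0
```

Converting once, at the point where data is read, means that every later stage can work in a single unit and no stage needs its own rounding or truncation code.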

Note that none of these changes altered any of our results by more than 0.01 degrees C, except for the change to urban station identification, for which the changes in our results (none greater than 0.03 degrees C) closely match the changes in the GISTEMP results.

The work for this release has been done by David Jones, Paul Ollis, and Nick Barnes.

[Updated: this release has been swiftly followed by ccc-gistemp release 0.4.1, to fix a bug reported in comments here.]

GISTEMP Land Index

GISS publish a land-only temperature anomaly (referred to as their “traditional analysis”).

As I pointed out in an earlier article, ccc-gistemp can now create a land index by omitting Step 4: python tool/run.py -s0-3,5.

Here’s how we compare with official GISTEMP:

The 1990s station dropout does not have a warming effect

Tamino gives his results for his GHCN based temperature reconstruction. It is well worth reading. He also gives a comparison between stations that are reporting after 1992, and those that “dropped out” before 1992. He concludes that there is no significant difference in the overall trend, which refutes the claim that the 1990s station dropout has a warming effect. His results are preliminary and for the Northern Hemisphere only.

Tamino’s analysis uses only the land stations; in order to write this blog post I tweaked ccc-gistemp so that we can produce a land index (python tool/run.py -s 1-3,5 now skips step 4, avoids merging in the ocean data, and effectively produces a global average based only on land data).

It is very easy to subset the input to ccc-gistemp and run it with smaller input datasets. So in this case I can split the input data into stations reporting since 1992, and those that have no records since 1992, and run ccc-gistemp separately on each input. I created tool/v2split.py to split the input data. Specifically, I ran step 0 (which merges USHCN, Antarctic, and Hohenpeissenberg data into the GHCN data) to create work/v2.mean_comb, then split that file into those stations reporting in 1992 and after, and those not reporting after the cutoff. Then I ran steps 1, 2, 3, and 5 of ccc-gistemp to create a land index:
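For readers who want to try a similar split themselves, here is a minimal sketch of the idea. It is not the code of tool/v2split.py. It assumes the GHCN v2.mean record layout (an 11-character station identifier, a 1-character duplicate number, a 4-character year, then twelve 5-character monthly values with -9999 marking a missing month), and the function names are made up for this example.

```python
# Illustrative sketch only; not the actual tool/v2split.py.

MISSING = -9999  # assumed missing-value marker in v2.mean records


def station_id(line):
    """Return the 11-character station identifier (assumed layout)."""
    return line[:11]


def has_data(line):
    """True if at least one of the twelve monthly fields is not missing."""
    fields = [line[i:i + 5] for i in range(16, 76, 5)]
    return any(int(field) != MISSING for field in fields)


def split_by_cutoff(lines, cutoff=1992):
    """Split records into (reporting, dropped) lists.

    A station counts as reporting if it has any non-missing monthly
    value in the cutoff year or later; all of its records, early and
    late, then go into the first list.
    """
    lines = [line for line in lines if line.strip()]
    # First pass: find the last year with data for each station.
    last_year = {}
    for line in lines:
        if has_data(line):
            sid = station_id(line)
            year = int(line[12:16])
            last_year[sid] = max(last_year.get(sid, year), year)
    # Second pass: route every record according to its station.
    reporting, dropped = [], []
    for line in lines:
        if last_year.get(station_id(line), cutoff - 1) >= cutoff:
            reporting.append(line)
        else:
            dropped.append(line)
    return reporting, dropped
```

The two passes matter: a station is classified as a whole, so its pre-1992 records stay with its post-1992 records rather than being split across the two outputs.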

It is certainly not the case that the warming trend is stronger in the data from the post-cutoff stations. [edit 2010-03-22: In a subsequent post I add trend lines to this chart]

The differences between these results and Tamino’s are interesting. Both show good agreement for most of the 20th century. These data show more divergence than Tamino’s in the 1800s. Is that because we’re using Southern Hemisphere data as well, or is it because of the difference in station combining? Further investigation is merited.

We hope to make “experiments” of this sort easier to perform using ccc-gistemp and encourage anyone interested to download the code and play with it.

Update: Nick B obliges with a graph of the differences:

ccc-gistemp release 0.3.0

I am pleased to announce ccc-gistemp release 0.3.0. This includes a number of bug fixes and features in our framework and tools, and a great deal of clarification work especially in steps 1 (station combination) and 2 (peri-urban adjustment). Really, it’s much better. Give it a go.

Much of GISTEMP was concerned with generating and consuming intermediate files, to separate phases and to avoid keeping the whole dataset in memory at once (an important consideration when GISTEMP was originally written). In 0.3.0 this has largely been replaced by an iterator-based approach, which is clearer, automatically pipelines all the processing where possible, and avoids all code concerned with serialization and deserialization.
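As a rough illustration of what an iterator-based approach looks like in Python, here is a small sketch using generators. The stage names and the toy "station year temperature" input format are invented for this example and do not correspond to the real ccc-gistemp steps.

```python
# Toy pipeline; stage names and input format are hypothetical.

def read_records(lines):
    """Stage 1: lazily parse 'station year temperature' text lines."""
    for line in lines:
        station, year, temp = line.split()
        yield station, int(year), float(temp)


def to_anomalies(records, baseline):
    """Stage 2: subtract a per-station baseline, one record at a time."""
    for station, year, temp in records:
        yield station, year, temp - baseline[station]


def yearly_mean(records):
    """Final stage: consume the stream and average the values per year."""
    totals = {}
    for _, year, value in records:
        total, count = totals.get(year, (0.0, 0))
        totals[year] = (total + value, count + 1)
    return dict((year, total / count)
                for year, (total, count) in totals.items())


def run_pipeline(lines, baseline):
    """Chain the stages; no intermediate files or lists are created."""
    return yearly_mean(to_anomalies(read_records(lines), baseline))
```

Each stage pulls records from the previous one only as it needs them, so the stages run interleaved and nothing has to be written out and read back in between.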

We have retained intermediate files between the distinct steps of the GISTEMP algorithm, for compatibility with GISTEMP and for testing purposes. We have also retained some code to round or truncate some data at the points where Fortran truncates it for serialization. This will be removed in future.

Some of the original GISS code was already in Python, and survived almost unchanged in 0.2.0. Much of the rest of 0.2.0, especially the more complex arithmetical processing in step 2, was more-or-less transliterated from the Fortran. A lot of this code has been rewritten in 0.3.0, especially improving the clarity of the station-combining code (in step1.py) and the peri-urban adjustment (now in step2.py).

There has been a rearrangement of the code: the code/ directory now only contains code which we consider part of the GISTEMP algorithm. Everything else – input data fetching, run framework, testing, debugging utilities – is in the tool/ directory. This division will continue, to allow us to add useful tools while still reducing and clarifying the core code.

There is better code for comparing results, and a regression test against genuine GISTEMP results.

All-Python ccc-gistemp release

I am proud to announce release 0.2.0 of ccc-gistemp.  This is an all-Python reimplementation of GISTEMP, the NASA GISS surface temperature analysis.  Please feel free to download and play with it.  It will automatically fetch input data across the internet, and produce textual and graphical result files.

This release works on Windows, Linux, Mac OS X, FreeBSD, and probably anywhere else you can get Python to work.  The only dependency is on Python (2.5.2 or later, as we discovered today that the code to fetch input data trips over a bug in earlier Python libraries).

The results of running this release match GISTEMP results very closely indeed:

Comparison of ccc-gistemp with GISTEMP, on common input data

In fact, the annual global, northern hemisphere, and southern hemisphere anomaly results are identical, as are the southern hemisphere monthly anomalies.  The global monthly anomalies differ 7 times, out of more than 1000, each time by one digit in the least-significant place.
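A comparison of this kind can be sketched in a few lines. This is not our comparison tool: it assumes each result has already been loaded into a dictionary mapping (year, month) to an integer anomaly in the least-significant unit of the published table, and the function name is hypothetical.

```python
# Illustrative sketch; the data layout and function name are assumptions.

def count_differences(ours, theirs):
    """Compare two anomaly series keyed by (year, month).

    Returns (compared, differing, largest), where compared is the number
    of keys present in both series, differing is how many of those have
    unequal values, and largest is the biggest absolute difference.
    """
    common = sorted(set(ours) & set(theirs))
    diffs = [abs(ours[key] - theirs[key]) for key in common]
    differing = sum(1 for d in diffs if d != 0)
    largest = max(diffs) if diffs else 0
    return len(common), differing, largest
```

Keeping the values as integers in the table's own unit avoids floating point noise in the comparison, so a reported difference of 1 really is one digit in the least-significant place.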

This ends phase 1 of the CCC-GISTEMP project.  However, although there is no remaining Fortran, ksh, or C source code, much of step1.py is still GISS code, and a lot of the large-scale structure of the code is still dictated by its 1980s Fortran heritage.  For instance, the data is broken up into pieces because it couldn’t all fit into memory at once [ed: 2010-01-19: this particular instance is Issue 25 and it’s now fixed].  This obscures the underlying algorithms being applied.  Phase 2 of CCC-GISTEMP will refactor the code to eliminate this obscurity.  We expect one side-effect to be an increase in speed.

Thanks to all who have contributed, including David Jones, Paul Ollis, Gareth Rees, John Keyes, and Richard Hendricks. Thanks also to Reto Ruedy at GISS, who has been helpful and responsive.