GISTEMP tab

I added a tab page about GISTEMP which has more detail on the status of ccc-gistemp. Of note from that page:

It is our opinion that the GISTEMP code performs substantially as documented in Hansen, J.E., and S. Lebedeff, 1987: Global trends of measured surface air temperature. J. Geophys. Res., 92, 13345-13372., the GISTEMP documentation, and other papers describing updates to the procedure.

How close are we to GISTEMP?

This close:
gistemp (red) vs ccc-gistemp (black).

The two graphs are almost on top of each other. I’ll add 0.02K to the black line to separate them a bit:
gistemp (red) vs ccc-gistemp (black), artificially separated.

We can now see the red series that the black series was hiding, and we can see that the differences between the two series are tiny: one or two centikelvin here and there. Red is official GISTEMP; black is our ccc-gistemp code.

What exactly am I comparing? GISTEMP’s global temperature anomalies, one set from their website, one set from our ccc-gistemp code. I’m running the vischeck command:

code/vischeck.py -o 2 result/GLB.Ts+dSST.txt result/GLB.Ts.ho2.GHCN.CL.PA.txt

(the -o option is used to produce the offset graphs, bottom picture)

The first file is GLB.Ts+dSST.txt, which I downloaded from NASA yesterday. The second file, GLB.Ts.ho2.GHCN.CL.PA.txt, is the result of me running ccc-gistemp yesterday.
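For anyone who prefers a number to a picture, the same comparison can be sketched in a few lines of Python. This is not what vischeck.py does internally; the function names and the year-to-anomaly dictionaries are hypothetical, and the sample values are made up purely for illustration (they are not taken from either file).

```python
def max_abs_difference(series_a, series_b):
    """Largest absolute difference over the years both series cover.

    Each series is a dict mapping year -> anomaly in kelvin
    (a hypothetical in-memory form, not the GISTEMP file format).
    """
    common = sorted(set(series_a) & set(series_b))
    if not common:
        raise ValueError("series have no years in common")
    return max(abs(series_a[y] - series_b[y]) for y in common)


def offset_series(series, offset):
    """Return a copy of the series shifted by a constant, for plotting."""
    return {year: value + offset for year, value in series.items()}


# Made-up sample values, purely for illustration.
official = {1880: -0.25, 1881: -0.20, 1882: -0.22}
ours = {1880: -0.25, 1881: -0.19, 1882: -0.22}

print(round(max_abs_difference(official, ours), 2))

# Shift one series by 0.02 K so the two lines no longer overlap on a chart.
separated = offset_series(ours, 0.02)
```

Restricting the comparison to the years both series share avoids spurious differences when one file has a more recent final year than the other.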

But it’s not a very careful comparison. The inputs I am using are SBBX.HadR2 and v2.mean downloaded on 2009-12-04 and an hcn_doe_mean_data downloaded in June (!). Also, the version of the GISTEMP code we are coding against is quite old (about a year) and GISTEMP has been updated several times since. For example, GISTEMP currently uses USHCN version 2; ccc-gistemp does not (yet). The fact that we’re not keeping up with GISTEMP is Issue 7.

Furthermore the exact output may depend on the Fortran compilers being used, the architecture on which I’m running, and the Python versions we’re using.

The bottom line is that we’re already very close to the GISTEMP output, well within any meaningful error threshold. As we get closer we’ll need to be a lot more careful about keeping track of exactly what inputs and software tools are being used. We’ve requested from GISS a copy of the exact inputs and outputs for one of the runs, so that we have a fixed set for comparison purposes.
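One way of keeping track might look something like the sketch below: record a checksum for each input file alongside the Python version and platform, so that two runs can be checked for identical inputs. This is only an illustration; ccc-gistemp does not currently do this, and the names file_sha256 and run_manifest are made up for the example.

```python
import hashlib
import platform
import sys


def file_sha256(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def run_manifest(input_paths):
    """Describe one run: input checksums plus interpreter and platform.

    input_paths is any list of file paths; nothing here is specific
    to ccc-gistemp (hypothetical helper for illustration).
    """
    return {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
        "inputs": {path: file_sha256(path) for path in input_paths},
    }
```

Calling run_manifest with the paths of the input files and saving the result next to the outputs would make it possible to tell later whether two runs differed in their inputs or only in their tools. It does not capture the Fortran compiler, which would need recording separately.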

Detailed ccc-gistemp status

Thank you, David, for kicking off the blog.  Thank you, John Keyes, for setting it up and hosting it.  This entry is a brief description of the current status of the CCC-GISTEMP project.  Anyone interested should feel free to wander over to the project page to browse or download the code.

GISTEMP as published by NASA consists of six steps, numbered 0 to 5.  Each step includes some FORTRAN code and one or more driver shell scripts (written in the slightly-obscure ksh shell), and takes one or more input files from preceding steps or from external data sources, and sometimes some config files.  Most steps produce a number of executables (by compiling the FORTRAN), some intermediate files, and one or more output files (which are either consumed by subsequent steps or are the outputs of GISTEMP as a whole).  Some of the data files are in formatted text, some are in big-endian binary FORTRAN formats.  The ksh driver scripts would rename and delete some intermediate files as necessary as they went along.

The first thing we did in CCC-GISTEMP was to regularize this structure.  The ksh scripts were rewritten in /bin/sh and consolidated into a single run.sh file.  All the files were placed in consistent subdirectories (all the config files in config/, all the executables in bin/, all the intermediate files in work/, log files in log/, and so on).  Some files have been renamed, and no files are now deleted during a run, so intermediate files can be inspected.

Following that, we have done the following re-implementation work:

- David Jones wrote a preflight script, which fetches any source data as necessary.

- I rewrote STEP0 as step0.py.  This reads the input met station data from various sources and consolidates it into a single consistent plain-text data file.  Our version does not use the various intermediate data files of GISTEMP STEP0.

- I rewrote STEP1 as step1.py.  This takes the data file and performs some adjustments (for instance, step-changes where a weather station has been replaced), as specified by configuration files.  STEP1 was already partly in Python and partly in C.  I have rewritten the C but I have not tinkered much with the existing Python; just enough to consolidate it into a single file.  This stage produces a number of intermediate files in DB2 format; I haven’t changed that.

- Paul Ollis rewrote STEP2 as a number of Python files.  This applies peri-urban adjustment and calculates anomaly values.  Paul has maintained the structure of the FORTRAN code quite closely in the Python.

- David Jones rewrote STEP3 as step3.py.  This produces monthly weighted anomaly values for each of the geographical “boxes” and “sub-boxes” – a division of the Earth’s surface into 8000 parts of equal area – according to a weighted averaging system based on the distance of each station from the centre of the sub-box.

- STEP4 is still in FORTRAN.  This is an optional step which updates a boxed sea-surface temperature file based on recent sea-surface temperature measurements.  We haven’t done anything to this, and in fact our current ./run.sh file doesn’t run any of this code.

- We are in the process of rewriting STEP5 as step5.py.  This combines the land data from step 3 and the sea data from step 4 into a single data set (according to land/ocean weighting in boxes with both land and ocean), and outputs a set of formatted text files giving monthly and annual temperature anomalies for a number of zones and for the globe as a whole.

- We have a little script step5res.py, which takes the global anomaly file produced by step 5 and turns it into a chart using Google Charts.
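To give a flavour of the distance weighting mentioned under STEP3 above, here is a minimal sketch. The 1200 km radius and the linear fall-off come from Hansen and Lebedeff (1987); everything else is a simplification for illustration. In particular the function names are made up, the haversine distance is my choice, and the plain weighted mean stands in for the more involved way GISTEMP combines station records. This is not the code in step3.py.

```python
import math

EARTH_RADIUS_KM = 6371.0
CUTOFF_KM = 1200.0  # radius of influence from Hansen and Lebedeff (1987)


def great_circle_km(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance in km; arguments in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    # Clamp to guard against rounding pushing the value just above 1.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


def station_weight(distance_km, cutoff_km=CUTOFF_KM):
    """Weight falls linearly from 1 at the centre to 0 at the cutoff."""
    return max(0.0, 1.0 - distance_km / cutoff_km)


def weighted_anomaly(centre, stations):
    """Weighted mean anomaly for one month at a sub-box centre.

    centre is (lat, lon); stations is a list of (lat, lon, anomaly).
    Returns None when no station lies within the cutoff.
    Simplified: a plain weighted mean, an assumption for illustration.
    """
    total = 0.0
    weight_sum = 0.0
    for lat, lon, anomaly in stations:
        w = station_weight(great_circle_km(centre[0], centre[1], lat, lon))
        total += w * anomaly
        weight_sum += w
    return total / weight_sum if weight_sum > 0 else None
```

A station at the centre of the sub-box gets weight 1, a station 600 km away gets weight 0.5, and anything at or beyond 1200 km contributes nothing.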

There was a long hiatus this year, but David and I are both active in the CCC project again and we hope to complete our first-cut Python version of GISTEMP soon.

Welcome to CCC

Clear Climate Code is an open project created by Ravenbrook; we aim to write and maintain software for climate modelling and analysis, with an emphasis on clarity and correctness. Our goals are:

  1. To produce clear climate science software;
  2. To encourage the production of clear climate science software;
  3. To increase public confidence in climate science results;
  4. To promote Ravenbrook’s software consultancy services.

[Updated to add: of course, these are the goals of Ravenbrook's internal project, out of which this open project has grown.  We don't expect third parties to sign up to goal 4, and of course they may have other goals of their own. - Nick B]

We are not new, but our blog is. Nick Barnes had the idea for the project in 2007 and he and David Jones started work on it at Ravenbrook in 2008. We talked about it at PyCon UK 2008. Since then we have been joined by some contributors (on our mailing list), including John Keyes who has provided hosting for this blog.

Currently we are working on ccc-gistemp which is a reimplementation of the GISTEMP algorithm in Python. We are nearing the end of “step 1” of that project, at which point we will have a Python program that uses exactly the same inputs as GISTEMP, and produces the same intermediate files, and the same outputs (right now, we have such a program but bits of it still use some of the Fortran code from GISS).