Detailed ccc-gistemp status
Posted by Nick.Barnes | Filed under status
Thank you, David, for kicking off the blog. Thank you, John Keyes, for setting it up and hosting it. This entry is a brief description of the current status of the CCC-GISTEMP project. Anyone interested should feel free to wander over to the project page to browse or download the code.
GISTEMP as published by NASA consists of six steps, numbered 0 to 5. Each step includes some FORTRAN code and one or more driver shell scripts (written in the slightly obscure ksh shell). Each takes one or more input files from preceding steps or from external data sources, and sometimes some config files. Most steps produce a number of executables (by compiling the FORTRAN), some intermediate files, and one or more output files (which are either consumed by subsequent steps or are the outputs of GISTEMP as a whole). Some of the data files are formatted text; others are in big-endian binary FORTRAN formats. The ksh driver scripts rename and delete some intermediate files as they go along.
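As a rough illustration of what reading those binary files involves, here is a minimal Python sketch. It assumes the common compiler convention in which each FORTRAN unformatted sequential record is wrapped in a 4-byte big-endian length marker before and after the payload. That framing, the function names, and the sample values are assumptions for illustration, not a description of the actual GISTEMP file layouts.

```python
import io
import struct


def read_fortran_records(stream):
    """Yield the raw payload of each record in a big-endian FORTRAN
    unformatted sequential file (assumed 4-byte length markers)."""
    while True:
        header = stream.read(4)
        if not header:
            return  # clean end of file
        if len(header) != 4:
            raise ValueError("truncated record header")
        (length,) = struct.unpack(">i", header)
        payload = stream.read(length)
        if len(payload) != length:
            raise ValueError("truncated record payload")
        trailer = stream.read(4)
        if len(trailer) != 4 or struct.unpack(">i", trailer)[0] != length:
            raise ValueError("record length markers disagree")
        yield payload


def make_record(payload):
    """Frame a payload the same way, so the reader can be exercised."""
    marker = struct.pack(">i", len(payload))
    return marker + payload + marker


# Hypothetical sample data: one record of three integers, one of two floats.
sample = (make_record(struct.pack(">3i", 1880, 2009, 12))
          + make_record(struct.pack(">2f", 1.5, -0.25)))
records = list(read_fortran_records(io.BytesIO(sample)))
ints = struct.unpack(">3i", records[0])
floats = struct.unpack(">2f", records[1])
```

The ">" prefix in the struct format strings is what selects big-endian byte order; on a little-endian machine, omitting it would silently produce wrong numbers.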
The first thing we did in CCC-GISTEMP was to regularize this structure. The ksh scripts were rewritten in /bin/sh and consolidated into a single run.sh file. All the files were placed in consistent subdirectories (all the config files in config/, all the executables in bin/, all the intermediate files in work/, log files in log/, and so on). Some files have been renamed, and no files are now deleted during a run, so intermediate files can be inspected.
Since then, we have done the following re-implementation work:
- David Jones wrote a preflight script, which fetches any source data as necessary.
- I rewrote STEP0 as step0.py. This reads the input met station data from various sources and consolidates it into a single consistent plain-text data file. Our version does not use the various intermediate data files of GISTEMP STEP0.
- I rewrote STEP1 as step1.py. This takes the data file and performs some adjustments (for instance, step-changes where a weather station has been replaced), as specified by configuration files. STEP1 was already partly in Python and partly in C. I have rewritten the C but I have not tinkered much with the existing Python; just enough to consolidate it into a single file. This stage produces a number of intermediate files in DB2 format; I haven’t changed that.
- Paul Ollis rewrote STEP2 as a number of Python files. This applies peri-urban adjustment and calculates anomaly values. Paul has maintained the structure of the FORTRAN code quite closely in the Python.
- David Jones rewrote STEP3 as step3.py. This produces monthly weighted anomaly values for each of the geographical “boxes” and “sub-boxes” – a division of the Earth’s surface into 8000 parts of equal area – according to a weighted averaging system based on the distance of each station from the centre of the sub-box.
- STEP4 is still in FORTRAN. This is an optional step which updates a boxed sea-surface temperature file based on recent sea-surface temperature measurements. We haven’t done anything to this, and in fact our current ./run.sh file doesn’t run any of this code.
- We are in the process of rewriting STEP5 as step5.py. This combines the land data from step 3 and the sea data from step 4 into a single data set (according to land/ocean weighting in boxes with both land and ocean), and outputs a set of formatted text files giving monthly and annual temperature anomalies for a number of zones and for the globe as a whole.
- We have a little script step5res.py, which takes the global anomaly file produced by step 5 and turns it into a chart using Google Charts.
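To give a feel for the distance-based weighting mentioned under STEP3, here is a minimal Python sketch. It is not the code in step3.py: it assumes a weight that falls off linearly from 1 at the sub-box centre to 0 at a cutoff radius, and the 1200 km default, the function names, and the sample stations are all assumptions made for illustration. The real step does considerably more work when combining station records.

```python
import math

EARTH_RADIUS_KM = 6371.0


def great_circle_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, by the haversine formula."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    a = (math.sin((p2 - p1) / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


def station_weight(distance_km, radius_km=1200.0):
    """Assumed linear taper: 1 at the centre, 0 at radius_km and beyond."""
    return max(0.0, 1.0 - distance_km / radius_km)


def subbox_anomaly(centre, stations, radius_km=1200.0):
    """Weighted mean anomaly for one sub-box and one month.

    centre is (lat, lon); stations is a list of (lat, lon, anomaly).
    Returns None when no station lies within the cutoff radius.
    """
    total_weight = 0.0
    weighted_sum = 0.0
    for lat, lon, anomaly in stations:
        distance = great_circle_km(centre[0], centre[1], lat, lon)
        weight = station_weight(distance, radius_km)
        total_weight += weight
        weighted_sum += weight * anomaly
    if total_weight == 0.0:
        return None
    return weighted_sum / total_weight


# Hypothetical stations: two equally distant from the centre, one far away.
example = subbox_anomaly((0.0, 0.0),
                         [(0.0, 5.0, 1.0), (0.0, -5.0, 3.0), (0.0, 180.0, 9.0)])
```

In this example the distant station gets zero weight, so the result is simply the mean of the two nearby anomalies.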
There was a long hiatus this year, but David and I are both active in the CCC project again and we hope to complete our first-cut Python version of GISTEMP soon.
December 21st, 2009 at 11:37 pm
Since you have all the intermediary data as well as the overall adjustment, it would be interesting to see graphs of the adjustments broken down by stage, year, region, etc.
Thus one could graphically see how North American temps have been adjusted over the last 100 years and in turn drill down into which step is having the most influence on the adjustment.
I’m not sure that is part of your current goals, but the adjustment process is certainly in public debate at present and I’m not aware of sites that allow the casual user to delve into the details via a graphic presentation.