Code
Posted by drj
Our ccc-gistemp code is available from our googlecode project. For downloading, we recommend our featured release (generally this will be our most recent packaged release). If you just want to look around, then please use the source code repository browser. All our development is done using the SVN repository, and you can download our latest development code (or any other version) using SVN.
We would like our code to be clear. Please look at it, try it, and let us know which bits you think are not clear.
March 11th, 2010 at 5:20 am
[comment moved to 0.4.0 release post — NB]
March 31st, 2010 at 6:40 pm
Hey Nick et al.
I’m struggling mightily to understand exactly what is going on with the various versions of data read into GISTEMP:
In particular this:
Looking in order through the various other source files used:
antarc_to_v2.sh is identical
antarc1.txt, antarc2.txt, antarc3.txt, antarc3.list all differ: new data, updated data, corrected station metadata
There is a small difference in antarc_comb.f, fixing a potential bug
which doesn’t actually occur in practice (one line of data would be
discarded if Antarctic data comes later than all v2.mean data).
dump_old.f is identical
ushcn.tbl is now called ushcn1.tbl (identical contents).
get_USHCN:
If hcn_doe_mean_data is missing, or 9641C_200907_F52.avg is present,
run get_USHCN_v2 instead (see below)
Otherwise, we’re still using USHCNv1 data:
USHCN2v2.f has been renamed USHCN_to_v2.f
move ushcn1.tbl to ushcn.tbl
move the IN_full file to IN
get_USHCN_v2 is all new. It runs unify_us_ids to translate new
station IDs in the USHCNv2 data to old station IDs. Then it runs
USHCNv2_to_v2.f. Finally it runs the reduce_strange.sh script to
remove USHCNv2 stations from the Ts.strange file:
#!/bin/ksh
fortran_compile=${FC}
echo "replacing USHCN station data in $1 by USHCN_V2 data (all adjustments, but ignoring fill-ins)"
cd input_files
echo "unifying station-ids"
./unify_us_ids 9641C_200907_F52.avg
cd ..
mkdir temp_files 2> /dev/null
sort -n input_files/ushcn2.tbl > ID_US_G
${fortran_compile} USHCNv2_to_v2.f -o USHCNv2_to_v2.exe
USHCNv2_to_v2.exe ; rm -f USHCNv2_to_v2.exe
sort -n USHCN.v2.mean_noFIL > USHCN.v2.mean_noFIL.sort
mv -f USHCN.v2.mean_noFIL.sort USHCN.v2.mean_noFIL
cd input_files
reduce_strange.sh
mv Ts.strange.RSU.list.IN ../temp_files/.
cp ushcn2.tbl ../temp_files/ushcn.tbl
unify_us_ids is all new, and changes 9641C_200907_F52.avg by replacing
the new station IDs with old station IDs (columns 1 and 3 in
ushcnV2_cmb.tbl):
#! /bin/ksh
while read a b c d
do sed "s/^$a/$c/g" $1.1
mv -f $1.1 $1
done $1.1
mv -f $1.1 $1
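A rough Python sketch of the ID substitution that unify_us_ids is described as doing, for readers who find the ksh hard to follow. It assumes the table is whitespace-separated with the new ID in column 1 and the old ID in column 3, as the notes say of ushcnV2_cmb.tbl. The function name `unify_ids` and the sample IDs are invented for illustration; this is not the GISTEMP or ccc-gistemp code.

```python
def unify_ids(data_lines, table_lines):
    """Replace a leading new-style station ID with its old-style ID.

    Assumption: each table line has at least three whitespace-separated
    fields, with the new ID first and the old ID third.
    """
    mapping = {}
    for row in table_lines:
        fields = row.split()
        if len(fields) >= 3:
            mapping[fields[0]] = fields[2]

    result = []
    for line in data_lines:
        for new_id, old_id in mapping.items():
            if line.startswith(new_id):
                # Swap only the prefix; the rest of the record is untouched.
                line = old_id + line[len(new_id):]
                break
        result.append(line)
    return result


# Hypothetical sample data, not real station IDs.
table = ["011084 x 425000011084 y"]
data = ["011084 1950 123", "999999 1950 456"]
print(unify_ids(data, table))
```

Unlike a loop that runs sed over the whole file once per table row, this makes a single pass and rewrites each line at most once.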
USHCNv2_to_v2.f is based on USHCN_to_v2.f, with some changes for the
new data format.
reduce_strange is all new, and removes some lines from
Ts.strange.RSU.list.IN_full to make Ts.strange.RSU.list.IN.
Specifically, ones which have country code 425 and which don’t have
any matching lines in USHCN.v2.mean_noFIL (the output from USHCNv2_to_v2):
#! /bin/ksh
while read a b
do if [[ $a != '425'* ]]
then echo "$a $b"
else c=$( grep $a ../USHCN.v2.mean_noFIL | head -1 ) 2> /dev/null
if [[ $c != '' ]]
then echo "$a $b"
fi
fi
done Ts.strange.RSU.list.IN
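The filtering rule described for reduce_strange can be sketched in Python as follows. This is only an illustration under assumptions: the function name mirrors the script, but the signature and the sample lines are made up, and matching is a plain substring test, which is roughly what grep does for an all-digit ID.

```python
def reduce_strange(strange_lines, ushcn_lines):
    """Drop US (425...) entries whose ID appears nowhere in the USHCN data.

    Assumption: the station ID is the first whitespace-separated field
    of each line in the strange list.
    """
    ushcn_text = "\n".join(ushcn_lines)
    kept = []
    for line in strange_lines:
        fields = line.split(None, 1)
        if not fields:
            continue  # this sketch skips blank lines
        station_id = fields[0]
        # Keep non-US entries, and US entries that still have data.
        if not station_id.startswith("425") or station_id in ushcn_text:
            kept.append(line)
    return kept


# Hypothetical sample records.
strange = ["425123 1900-1950", "425999 1900-1910", "101000 1920"]
ushcn = ["4251230001950 -12 34"]
print(reduce_strange(strange, ushcn))
```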
USHCN2v2.f has become USHCN_to_v2.f, and is identical apart from
correcting a trivial bug (which we have already fixed in step0.py).
dif.ushcn.ghcn.f identical
cmb2.ushcn.v2.f has changed to support the skip_US behaviour.
It now takes the $skip_US argument.
Internally, that becomes only_USHCN.
In a normal run only_USHCN will be false,
but if do_comb_step0.sh is given a non-zero second argument
then only_USHCN will be true.
When reading a line from GHCN, if icc is 425 and
the id is in [710000000, 900000000), cont_US is set to true,
otherwise false.
If we’re about to write a GHCN line, and if (cont_US and
only_USHCN), skip the line.
So if $skip_US is zero, the semantics are identical. If $skip_US
is 1, it omits any GHCN lines from this range of station
IDs (which, based on the variable name, is the continental US) –
but still copies the USHCN lines.
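The skip_US rule in these notes boils down to one predicate. A minimal Python sketch, assuming the country code and station id have already been parsed as integers; the function names are hypothetical, and the Fortran in cmb2.ushcn.v2.f remains the authority.

```python
def is_continental_us(country_code, station_id):
    """True for country code 425 with an id in [710000000, 900000000)."""
    return country_code == 425 and 710000000 <= station_id < 900000000


def keep_ghcn_line(country_code, station_id, only_ushcn):
    """Decide whether a GHCN line is written to the combined output.

    With only_ushcn false (a normal run) every GHCN line is kept.
    With only_ushcn true, GHCN lines in the continental-US id range
    are skipped; USHCN lines are handled elsewhere and still copied.
    """
    return not (only_ushcn and is_continental_us(country_code, station_id))


# Hypothetical ids chosen only to exercise the range boundaries.
print(keep_ghcn_line(425, 720000000, only_ushcn=False))
print(keep_ghcn_line(425, 720000000, only_ushcn=True))
```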
hohp_to_v2.f identical
************************************************
There are a bunch of folks (see them at Lucia’s) who have taken their own approaches, and comparing those to GISTEMP is complicated by these data replacements, fixes, etc.
I’m not questioning them, just trying to understand them all and hopefully find a way for people to do a comparison that is more apples to apples. Any help, or a post on this important step, would be very cool. Not that I want to give people work.
March 31st, 2010 at 8:07 pm
Hi Steven.
That looks like a copy of most of the file doc/step0-update-notes, which I wrote in early 2009-12, as we were working towards ccc-gistemp release 0.2.0 – our first all-Python release, which we made on 2010-01-11. There is a corresponding file doc/steps1-5-update-notes, which I wrote at the same time. These files can be fetched with SVN or browsed, with revision history, here.
I wrote these documents to help myself to understand, and to communicate to ccc-gistemp people (mainly David Jones at the time), changes that GISS had made to GISTEMP over the preceding year or so. I discarded these notes shortly after they were written, and haven’t revisited them since. So some of them may be terse, opaque, ambiguous, or downright wrong. But the summary at the top of the file, which I will have written last, is probably both accurate and sufficient.
At the time, in 2009-12, we were working towards our first all-Python version of ccc-gistemp, release 0.2.0, which was actually released on 2010-01-11. When we started ccc-gistemp, back in 2008, ccc-gistemp 0.1.0 was equivalent to GISTEMP in 2008-09 (the document says 2008-09-11, although I think the actual GISTEMP release may have been on 2008-09-10). We wanted ccc-gistemp 0.2.0 to match the then-current GISTEMP sources. So I downloaded the GISTEMP sources released on 2009-12-03, ran diff over the two GISTEMP source trees, and spent a while pondering the results and writing these two documents (step0-update-notes and steps1-5-update-notes).
I emailed Reto Ruedy at GISS on 2009-12-17 to ask about the first version of these documents (I sent him a link to the SVN browser); his reply informed my next update (r102).
So this document isn’t really about versions of data (except inasmuch as the GISTEMP changes included a switch from USHCNv1 to USHCNv2, and some updates to SCAR data and metadata); it’s about versions of code.
April 2nd, 2010 at 7:42 am
Shucks,
I was hoping for some clarification, as I’m trying to understand exactly what files GISTEMP uses, what gets added, what gets modified, what gets discarded, and the various errors in station IDs.
With Zeke’s program, say, it’s easy to understand: he uses GHCN v2.mean.
Anyway, I’m not being critical of the file or the operation; I’m just trying to understand it.
When I get some time, I’ll try to come up with clearer questions.
April 2nd, 2010 at 10:33 am
Oh, OK. Yes, this should be documented better. Release 0.5.0 will do much better at documentation in general. But those -update-notes documents aren’t going to be very helpful to anyone except an archaeologist.
April 2nd, 2010 at 5:31 pm
Understood, Nick.
It struck me as notes one would write to oneself when deep into coding, where you live and breathe the code like it was your native tongue. But it was helpful to me just to get my feet back under me with respect to the GISTEMP code.
Part of this is that I’m putting together a bunch of metadata for anyone to use, so I’m painfully aware that I had better get all the IDs correct,
so some of the ID correction stuff was important. I’m also learning R and trying to get the other data (Antarctica) into a GHCN format, just as an R learning exercise.
April 25th, 2010 at 7:35 am
Hmm. Did GISS change names?
MISSING: input/ushcnV2_cmb.tbl
MISSING: input/mcdw.tbl
MISSING: input/ushcn2.tbl
MISSING: input/sumofday.tbl
MISSING: input/v2.inv
MISSING: input/oisstv2_mod4.clim.gz
PROBLEM: Tried fetching missing files but it didn’t work.
====> STEPS 0 to 5 ====
Traceback (most recent call last):
  File "tool/run.py", line 225, in <module>
    sys.exit(main())
  File "tool/run.py", line 206, in main
    data = step_fn[step](data)
  File "tool/run.py", line 56, in run_step0
    data = giss_io.step0_input()
  File "/Users/mosher/Downloads/ccc-gistemp-0.4.1/tool/giss_io.py", line 674, in step0_input
    input.ushcn_stations = read_USHCN_stations('input/ushcn2.tbl', 'input/ushcnV2_cmb.tbl')
  File "/Users/mosher/Downloads/ccc-gistemp-0.4.1/tool/giss_io.py", line 609, in read_USHCN_stations
    for line in open(ushcn_v1_station_path):
IOError: [Errno 2] No such file or directory: 'input/ushcn2.tbl'
April 25th, 2010 at 11:38 am
@Steven: Yes, it changed a couple of weeks ago; I fixed our trunk sources to work either way. See issue 65 and r436.
April 25th, 2010 at 3:43 pm
Cool. I have 0.4.1, so I just got the files from one of the tarballs on your downloads (gisstemp test), hit enter, went to bed, and woke up to a successful run: about 2100 seconds on a Mac laptop. All the work files look to be in order. Nice work. I’ll check out your comment above.
April 30th, 2010 at 10:31 pm
hmm.
We fetch various files such as station metadata from the GISTEMP source
archive. The top-level directory in this archive has changed name from
GISTEMP_sources to GISTEMP_sources_open. Our preflight code fails as a
result. We should be flexible about this: for instance, by using a regular
expression to match either old or new directory names.
Is that URL correct?
ftp://data.giss.nasa.gov/pub/gistemp/GISS_Obs_analysis/
Exists.
ftp://data.giss.nasa.gov/pub/gistemp/GISS_Obs_analysis/GISTEMP_sources/
Exists.
GISTEMP_sources_open doesn’t exist.
May 1st, 2010 at 11:15 am
@Steven: We fetch the GISTEMP sources from
http://data.giss.nasa.gov/gistemp/sources/GISTEMP_sources.tar.gz
Because that is the link given on the relevant web page:
http://data.giss.nasa.gov/gistemp/sources/
When we unpack this tarball, it used to make a directory called GISTEMP_sources. Now it makes a directory called GISTEMP_sources_open. Our preflight.py/fetch.py code is now independent of the name of this directory.
We’ve never used the GISS_Obs_analysis link.
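One way to be independent of the top-level directory name, sketched in Python. This uses the regular-expression idea from the issue text quoted earlier; it is an assumption for illustration, not a copy of what preflight.py or fetch.py does, and the member paths in the example are hypothetical.

```python
import re

# Matches either the old or the new top-level directory name.
TOP_DIR = re.compile(r"^GISTEMP_sources(?:_open)?/")


def strip_top_dir(member_name):
    """Return the path below the top-level directory, or None if no match."""
    match = TOP_DIR.match(member_name)
    if match is None:
        return None
    return member_name[match.end():]


# Hypothetical archive member names.
for name in ("GISTEMP_sources/STEP1/input_files/mcdw.tbl",
             "GISTEMP_sources_open/STEP1/input_files/mcdw.tbl",
             "something_else/mcdw.tbl"):
    print(name, "->", strip_top_dir(name))
```

A looser alternative would be to drop the first path component whatever it is called, at the cost of accepting archives with an unexpected layout.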
July 19th, 2010 at 6:01 pm
gistemp 0.5.0 still has this strange behaviour …:
kampmann@ibk-node14:~/Projekte/GlobalWarming/ccc-giss 0.5.0/ccc-gistemp-0.5.0> python tool/run.py
MISSING: input/antarc1.list
MISSING: input/antarc1.txt
MISSING: input/antarc2.list
MISSING: input/antarc2.txt
MISSING: input/antarc3.list
MISSING: input/antarc3.txt
MISSING: input/t_hohenpeissenberg_200306.txt_as_received_July17_2003
MISSING: input/ushcn2.tbl
MISSING: input/ushcnV2_cmb.tbl
MISSING: input/mcdw.tbl
MISSING: input/ushcn2.tbl
MISSING: input/sumofday.tbl
MISSING: input/v2.inv
MISSING: input/oisstv2_mod4.clim.gz
MISSING: input/v2.mean
MISSING: input/ushcnv2
MISSING: input/SBBX.HadR2
Attempting to fetch missing files: antarc1.list antarc1.txt antarc2.list antarc2.txt antarc3.list antarc3.txt t_hohenpeissenberg_200306.txt_as_received_July17_2003 ushcn2.tbl ushcnV2_cmb.tbl mcdw.tbl ushcn2.tbl sumofday.tbl v2.inv oisstv2_mod4.clim.gz v2.mean ushcnv2 SBBX.HadR2
Extracting members from http://data.giss.nasa.gov/gistemp/sources/GISTEMP_sources.tar.gz …
Traceback (most recent call last):
  File "tool/run.py", line 225, in <module>
    sys.exit(main())
  File "tool/run.py", line 172, in main
    preflight.checkit(sys.stderr)
  File "/home/kampmann/Projekte/GlobalWarming/ccc-giss 0.5.0/ccc-gistemp-0.5.0/tool/preflight.py", line 103, in checkit
    fetch.main(argv=['fetch'] + missing)
  File "/home/kampmann/Projekte/GlobalWarming/ccc-giss 0.5.0/ccc-gistemp-0.5.0/tool/fetch.py", line 405, in main
    fetch(args)
  File "/home/kampmann/Projekte/GlobalWarming/ccc-giss 0.5.0/ccc-gistemp-0.5.0/tool/fetch.py", line 219, in fetch
    handler[hname](group, prefix, output)
  File "/home/kampmann/Projekte/GlobalWarming/ccc-giss 0.5.0/ccc-gistemp-0.5.0/tool/fetch.py", line 262, in fetch_tar
    compression_type=tar_compression_type)
  File "/home/kampmann/Projekte/GlobalWarming/ccc-giss 0.5.0/ccc-gistemp-0.5.0/tool/fetch.py", line 348, in fetch_from_tar
    tar = tarfile.open('', mode='r|%s' % compression_type, fileobj=inp)
  File "/usr/lib/python2.5/tarfile.py", line 1046, in open
    _Stream(name, filemode, comptype, fileobj, bufsize))
  File "/usr/lib/python2.5/tarfile.py", line 352, in __init__
    self._init_read_gz()
  File "/usr/lib/python2.5/tarfile.py", line 439, in _init_read_gz
    raise ReadError("not a gzip file")
tarfile.ReadError: not a gzip file
kampmann@ibk-node14:~/Projekte/GlobalWarming/ccc-giss 0.5.0/ccc-gistemp-0.5.0>
Could you explain or help?
Thanks
Jörg
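The traceback shows only that the bytes received were not gzip data; it does not say why. A small, hedged Python sketch of a check that would make the cause easier to see, by looking at the two-byte gzip magic number before handing the data to tarfile. The helper names are invented here and are not part of fetch.py.

```python
import io
import tarfile

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of any gzip stream


def looks_like_gzip(data):
    """True if the byte string starts with the gzip magic number."""
    return data[:2] == GZIP_MAGIC


def open_tar_gz(data):
    """Open downloaded bytes as a gzipped tar stream.

    Raises ValueError, quoting the first bytes received, when the
    data is not gzip (for example an HTML error page).
    """
    if not looks_like_gzip(data):
        raise ValueError("not gzip data; first bytes: %r" % data[:40])
    return tarfile.open(fileobj=io.BytesIO(data), mode="r|gz")


print(looks_like_gzip(b"<html>error page</html>"))
```

Printing the first few bytes usually makes it obvious whether the download was an error page, a truncated file, or an archive in some other compression format.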
January 5th, 2011 at 8:42 pm
So they (google) will be shutting down the mailing list?
January 6th, 2011 at 9:04 am
@clearscience: ah… no?
April 26th, 2013 at 4:58 pm
[…] and easy. And if you can't read FORTRAN, the same code translated to Python can be found here: http://clearclimatecode.org/code/ Use either source. (Hint: you won't find what you're looking for. Because it's not there.) […]