Code

Our ccc-gistemp code is available from our googlecode project. For downloading, we recommend our featured release (generally this will be our most recent packaged release). If you just want to look around, then please use the source code repository browser. All our development is done using the SVN repository, and you can download our latest development code (or any other version) using SVN.

We would like our code to be clear. Please look at it, try it, and let us know which bits you think are unclear.

15 Responses to “Code”

  1. Tim W Says:

    [comment moved to 0.4.0 release post — NB]

  2. steven mosher Says:

    Hey Nick et al.

    I’m struggling mightily to understand what exactly is going on with the various versions of data read into GISTEMP:

    In particular this:

    Looking in order through the various other source files used:

    antarc_to_v2.sh is identical

    antarc1.txt, antarc2.txt, antarc3.txt, antarc3.list all differ: new data, updated data, corrected station metadata

    There is a small difference in antarc_comb.f, fixing a potential bug
    which doesn’t actually occur in practice (one line of data would be
    discarded if Antarctic data comes later than all v2.mean data).

    dump_old.f is identical

    ushcn.tbl is now called ushcn1.tbl (identical contents).

    get_USHCN:
    If hcn_doe_mean_data is missing, or 9641C_200907_F52.avg is present,
    run get_USHCN_v2 instead (see below)

    Otherwise, we’re still using USHCNv1 data:
    USHCN2v2.f has been renamed USHCN_to_v2.f
    move ushcn1.tbl to ushcn.tbl
    move the IN_full file to IN

    get_USHCN_v2 is all new. It runs unify_us_ids to translate new
    station IDs in the USHCNv2 data to old station IDs. Then it runs
    USHCNv2_to_v2.f. Finally it runs the reduce_strange.sh script to
    remove USHCNv2 stations from the Ts.strange file:
    #!/bin/ksh

    fortran_compile=${FC}

    echo "replacing USHCN station data in $1 by USHCN_V2 data (all adjustments, but ignoring fill-ins)"
    cd input_files
    echo "unifying station-ids"
    ./unify_us_ids 9641C_200907_F52.avg
    cd ..

    mkdir temp_files 2> /dev/null
    sort -n input_files/ushcn2.tbl > ID_US_G
    ${fortran_compile} USHCNv2_to_v2.f -o USHCNv2_to_v2.exe
    USHCNv2_to_v2.exe ; rm -f USHCNv2_to_v2.exe
    sort -n USHCN.v2.mean_noFIL > USHCN.v2.mean_noFIL.sort
    mv -f USHCN.v2.mean_noFIL.sort USHCN.v2.mean_noFIL

    cd input_files
    reduce_strange.sh
    mv Ts.strange.RSU.list.IN ../temp_files/.
    cp ushcn2.tbl ../temp_files/ushcn.tbl

    unify_us_ids is all new, and changes 9641C_200907_F52.avg by replacing
    the new station IDs with old station IDs (columns 1 and 3 in
    ushcnV2_cmb.tbl):
    #! /bin/ksh
    while read a b c d
    do sed "s/^$a/$c/g" < $1 > $1.1
    mv -f $1.1 $1
    done < ushcnV2_cmb.tbl
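
The ID translation described above can be sketched in Python. This is only an illustration of the described behaviour, not the GISS script or the ccc-gistemp code: the function name `unify_ids` and the in-memory list interface are assumptions, and it rewrites each line at most once, whereas repeated sed passes could in principle chain replacements.

```python
def unify_ids(data_lines, table_lines):
    """Replace a leading new station ID with the matching old station ID.

    table_lines are whitespace-separated rows in which column 1 is the
    new ID and column 3 is the old ID (as the notes describe for
    ushcnV2_cmb.tbl). The function name and interface are hypothetical.
    """
    mapping = {}
    for row in table_lines:
        cols = row.split()
        if len(cols) >= 3:
            mapping[cols[0]] = cols[2]

    result = []
    for line in data_lines:
        for new_id, old_id in mapping.items():
            # Only a match at the start of the line counts, like sed's ^ anchor.
            if line.startswith(new_id):
                line = old_id + line[len(new_id):]
                break
        result.append(line)
    return result
```

With made-up IDs, `unify_ids(['011084 1990 1'], ['011084 a 010084 b'])` rewrites the prefix `011084` to `010084` and leaves the rest of the line alone.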

    USHCNv2_to_v2.f is based on USHCN_to_v2.f, with some changes for the
    new data format.

    reduce_strange is all new, and removes some lines from
    Ts.strange.RSU.list.IN_full to make Ts.strange.RSU.list.IN.
    Specifically, ones which have country code 425 and which don’t have
    any matching lines in USHCN.v2.mean_noFIL (the output from USHCNv2_to_v2):
    #! /bin/ksh

    while read a b
    do if [[ $a != '425'* ]]
    then echo "$a $b"
    else c=$( grep $a ../USHCN.v2.mean_noFIL | head -1 ) 2> /dev/null
    if [[ $c != '' ]]
    then echo "$a $b"
    fi
    fi
    done < Ts.strange.RSU.list.IN_full > Ts.strange.RSU.list.IN
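
The filtering rule described above can also be sketched in Python. This is an illustration under stated assumptions, not the GISS script: the function name `reduce_strange` and the list-based interface are hypothetical, lines are kept unchanged rather than re-joined as the shell `echo` does, and a plain substring test stands in for `grep`.

```python
def reduce_strange(strange_lines, ushcn_lines):
    """Filter the full 'strange' list down to the lines still needed.

    A line is kept if its first field does not start with '425', or if
    that first field occurs in at least one USHCN line.
    """
    kept = []
    for line in strange_lines:
        fields = line.split(None, 1)
        if not fields:
            # Blank lines do not start with '425', so they pass through.
            kept.append(line)
            continue
        key = fields[0]
        if not key.startswith('425'):
            kept.append(line)
        elif any(key in ushcn_line for ushcn_line in ushcn_lines):
            kept.append(line)
    return kept
```

The station IDs used to try this out would be made up; the point is only that a `425` entry survives when some USHCN line mentions it and is dropped otherwise.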

    USHCN2v2.f has become USHCN_to_v2.f, and is identical apart from
    correcting a trivial bug (which we have already fixed in step0.py).

    dif.ushcn.ghcn.f identical

    cmb2.ushcn.v2.f has changed to support the skip_US behaviour.
    It now takes the $skip_US argument.
    Internally, that becomes only_USHCN.
    In a normal run only_USHCN will be false,
    but if do_comb_step0.sh is given a non-zero second argument
    then only_USHCN will be true.

    When reading a line from GHCN, if icc is 425 and
    the id is in [710000000, 900000000), cont_US is set to true,
    otherwise false.

    If we’re about to write a GHCN line, and if (cont_US and
    only_USHCN), skip the line.

    So if $skip_US is zero, the semantics are identical. If $skip_US
    is 1, it omits any GHCN lines from this range of station
    IDs (which, based on the variable name, is the continental US) –
    but still copies the USHCN lines.
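
The skip_US rule described above is small enough to sketch in Python. This illustrates the described behaviour only; it is not the Fortran in cmb2.ushcn.v2.f nor the ccc-gistemp code, and the function names and the (country code, station id) arguments are assumptions.

```python
def is_continental_us(icc, station_id):
    """Mirror the cont_US flag from the notes: country code 425 and a
    station id in the half-open range [710000000, 900000000)."""
    return icc == 425 and 710000000 <= station_id < 900000000


def keep_ghcn_line(icc, station_id, only_ushcn):
    """Return True if a GHCN line should be written to the output.

    A line is skipped only when only_ushcn is set and the station falls
    in the continental-US range; USHCN lines are not affected by this.
    """
    return not (only_ushcn and is_continental_us(icc, station_id))
```

With `only_ushcn` false, every GHCN line is kept, which matches the "semantics are identical" case in the notes.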

    hohp_to_v2.f identical

    ************************************************

    There are a bunch of folks (see them at Lucia’s) who have done their own approaches, and comparing to GISTEMP is complicated by these data replacements, fixes, etc.
    I’m not questioning them, just trying to understand them all and hopefully get a way for people to do a comparison that is more apples to apples. Any help or a post on this important step would be very cool. Not to give work to people...

  3. Nick.Barnes Says:

    Hi Steven.

    That looks like a copy of most of the file doc/step0-update-notes, which I wrote in early 2009-12, as we were working towards ccc-gistemp release 0.2.0 – our first all-Python release, which we made on 2010-01-11. There is a corresponding file doc/steps1-5-update-notes, which I wrote at the same time. These files can be fetched with SVN or browsed, with revision history, here.

    I wrote these documents to help myself to understand, and to communicate to ccc-gistemp people (mainly David Jones at the time), changes that GISS had made to GISTEMP over the preceding year or so. I discarded these notes shortly after they were written, and haven’t revisited them since. So some of them may be terse, opaque, ambiguous, or downright wrong. But the summary at the top of the file, which I will have written last, is probably both accurate and sufficient.

    At the time, in 2009-12, we were working towards our first all-Python version of ccc-gistemp, release 0.2.0 which was actually released on 2010-01-11. When we started ccc-gistemp, back in 2008, ccc-gistemp 0.1.0 was equivalent to GISTEMP in 2008-09 (the document says 2008-09-11, although I think the actual GISTEMP release may have been on 2008-09-10). We wanted ccc-gistemp 0.2.0 to match current GISTEMP sources. So I downloaded the GISTEMP sources released on 2009-12-03, ran diff over the two GISTEMP source trees, and spent a while pondering the results and writing these two documents (step0-update-notes and steps1-5-update-notes).

    I emailed Reto Ruedy at GISS on 2009-12-17 to ask about the first version of these documents (I sent him a link to the SVN browser); his reply informed my next update (r102).

    So this document isn’t really about versions of data (except inasmuch as the GISTEMP changes included a switch from USHCNv1 to USHCNv2, and some updates to SCAR data and metadata); it’s about versions of code.

  4. steven mosher Says:

    Shucks,

    I was hoping for some clarification, as I’m trying to understand exactly what files GISTEMP uses, what gets added, what gets modified, what gets discarded, and the various errors in station IDs.

    With Zeke’s program, say, it’s easy to understand: he uses GHCN v2.mean.

    Anyways, I’m not being critical of the file or the operation I’m just trying to understand it.

    When I get some time, I’ll try to come up with clearer questions.

  5. Nick.Barnes Says:

    Oh, OK. Yes, this should be documented better. Release 0.5.0 will do much better at documentation in general. But those -update-notes documents aren’t going to be very helpful to anyone except an archaeologist.

  6. steven mosher Says:

    Understood nick,

    It struck me as notes one would write to oneself when deep into coding, where you live and breathe the code like it was your native tongue. But it was helpful to me to just get my feet back under me WRT the GISTEMP code.

    Part of this is I’m putting together a bunch of metadata for anyone to use, so I’m painfully aware that I had better get all the IDs correct etc.,
    so some of the ID correction stuff was important. I’m also learning R and trying to get the other data (Antarctica) into a GHCN format, just as an R learning tool.

  7. steven mosher Says:

    Hmm. Did GISS change names?

    MISSING: input/ushcnV2_cmb.tbl
    MISSING: input/mcdw.tbl
    MISSING: input/ushcn2.tbl
    MISSING: input/sumofday.tbl
    MISSING: input/v2.inv
    MISSING: input/oisstv2_mod4.clim.gz
    PROBLEM: Tried fetching missing files but it didn’t work.
    ====> STEPS 0 to 5 ====
    Traceback (most recent call last):
      File "tool/run.py", line 225, in <module>
        sys.exit(main())
      File "tool/run.py", line 206, in main
        data = step_fn[step](data)
      File "tool/run.py", line 56, in run_step0
        data = giss_io.step0_input()
      File "/Users/mosher/Downloads/ccc-gistemp-0.4.1/tool/giss_io.py", line 674, in step0_input
        input.ushcn_stations = read_USHCN_stations('input/ushcn2.tbl', 'input/ushcnV2_cmb.tbl')
      File "/Users/mosher/Downloads/ccc-gistemp-0.4.1/tool/giss_io.py", line 609, in read_USHCN_stations
        for line in open(ushcn_v1_station_path):
    IOError: [Errno 2] No such file or directory: 'input/ushcn2.tbl'

  8. Nick.Barnes Says:

    @Steven: Yes, it changed a couple of weeks ago; I fixed our trunk sources to work either way. See issue 65 and r436.

  9. steven mosher Says:

    Cool, I have the 0.4.1, so I just got the files from one of the tars on your downloads (gistemp test), hit enter, went to bed, and woke up to a successful run: about 2100 seconds on a Mac laptop. All the work files look to be in order. Nice work. I’ll check out your comment above.

  10. Steven mosher Says:

    hmm.

    We fetch various files such as station metadata from the GISTEMP source
    archive. The top-level directory in this archive has changed name from
    GISTEMP_sources to GISTEMP_sources_open. Our preflight code fails as a
    result. We should be flexible about this: for instance, by using a regular
    expression to match either old or new directory names.

    Is that URL correct?

    ftp://data.giss.nasa.gov/pub/gistemp/GISS_Obs_analysis/

    Exists.

    ftp://data.giss.nasa.gov/pub/gistemp/GISS_Obs_analysis/GISTEMP_sources/

    Exists.

    GISTEMP_sources_open doesn’t exist.

  11. Nick.Barnes Says:

    @Steven: We fetch the GISTEMP sources from

    http://data.giss.nasa.gov/gistemp/sources/GISTEMP_sources.tar.gz

    Because that is the link given on the relevant web page:

    http://data.giss.nasa.gov/gistemp/sources/

    When we unpack this tarball, it used to make a directory called GISTEMP_sources. Now it makes a directory called GISTEMP_sources_open. Our preflight.py/fetch.py code is now independent of the name of this directory.
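
One way to make member lookup independent of the top-level directory name is to ignore the first path component of every member. The sketch below illustrates that idea only; it is not the actual preflight.py/fetch.py code, and the helper names and the demo file path are hypothetical.

```python
import io
import tarfile


def strip_top_dir(member_name):
    """Drop the first path component, e.g. 'top/a/b' becomes 'a/b'."""
    parts = member_name.split('/', 1)
    return parts[1] if len(parts) == 2 else ''


def find_members(tar, wanted):
    """Map each wanted relative path to its TarInfo, whatever the
    top-level directory inside the archive happens to be called."""
    found = {}
    for info in tar.getmembers():
        rel = strip_top_dir(info.name)
        if rel in wanted:
            found[rel] = info
    return found


def make_demo_tar(top_dir):
    """Build a tiny in-memory tarball with one file under top_dir (demo only)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode='w') as tar:
        payload = b'demo'
        info = tarfile.TarInfo(top_dir + '/subdir/example.tbl')
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
    buf.seek(0)
    return buf


# The same lookup works for both the old and the new directory name.
for top in ('GISTEMP_sources', 'GISTEMP_sources_open'):
    with tarfile.open(fileobj=make_demo_tar(top), mode='r') as tar:
        print(top, sorted(find_members(tar, {'subdir/example.tbl'})))
```

A regular expression that matches either directory name, as the issue text suggests, would also work; dropping the first component simply avoids having to update the pattern if the name changes again.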

    We’ve never used the GISS_Obs_analysis link.

  12. Jorge Kampmann Says:

    gistemp 0.5.0 still has this strange behaviour …:
    kampmann@ibk-node14:~/Projekte/GlobalWarming/ccc-giss 0.5.0/ccc-gistemp-0.5.0> python tool/run.py
    MISSING: input/antarc1.list
    MISSING: input/antarc1.txt
    MISSING: input/antarc2.list
    MISSING: input/antarc2.txt
    MISSING: input/antarc3.list
    MISSING: input/antarc3.txt
    MISSING: input/t_hohenpeissenberg_200306.txt_as_received_July17_2003
    MISSING: input/ushcn2.tbl
    MISSING: input/ushcnV2_cmb.tbl
    MISSING: input/mcdw.tbl
    MISSING: input/ushcn2.tbl
    MISSING: input/sumofday.tbl
    MISSING: input/v2.inv
    MISSING: input/oisstv2_mod4.clim.gz
    MISSING: input/v2.mean
    MISSING: input/ushcnv2
    MISSING: input/SBBX.HadR2
    Attempting to fetch missing files: antarc1.list antarc1.txt antarc2.list antarc2.txt antarc3.list antarc3.txt t_hohenpeissenberg_200306.txt_as_received_July17_2003 ushcn2.tbl ushcnV2_cmb.tbl mcdw.tbl ushcn2.tbl sumofday.tbl v2.inv oisstv2_mod4.clim.gz v2.mean ushcnv2 SBBX.HadR2
    Extracting members from http://data.giss.nasa.gov/gistemp/sources/GISTEMP_sources.tar.gz
    Traceback (most recent call last):
      File "tool/run.py", line 225, in <module>
        sys.exit(main())
      File "tool/run.py", line 172, in main
        preflight.checkit(sys.stderr)
      File "/home/kampmann/Projekte/GlobalWarming/ccc-giss 0.5.0/ccc-gistemp-0.5.0/tool/preflight.py", line 103, in checkit
        fetch.main(argv=['fetch'] + missing)
      File "/home/kampmann/Projekte/GlobalWarming/ccc-giss 0.5.0/ccc-gistemp-0.5.0/tool/fetch.py", line 405, in main
        fetch(args)
      File "/home/kampmann/Projekte/GlobalWarming/ccc-giss 0.5.0/ccc-gistemp-0.5.0/tool/fetch.py", line 219, in fetch
        handler[hname](group, prefix, output)
      File "/home/kampmann/Projekte/GlobalWarming/ccc-giss 0.5.0/ccc-gistemp-0.5.0/tool/fetch.py", line 262, in fetch_tar
        compression_type=tar_compression_type)
      File "/home/kampmann/Projekte/GlobalWarming/ccc-giss 0.5.0/ccc-gistemp-0.5.0/tool/fetch.py", line 348, in fetch_from_tar
        tar = tarfile.open('', mode='r|%s' % compression_type, fileobj=inp)
      File "/usr/lib/python2.5/tarfile.py", line 1046, in open
        _Stream(name, filemode, comptype, fileobj, bufsize))
      File "/usr/lib/python2.5/tarfile.py", line 352, in __init__
        self._init_read_gz()
      File "/usr/lib/python2.5/tarfile.py", line 439, in _init_read_gz
        raise ReadError("not a gzip file")
    tarfile.ReadError: not a gzip file
    kampmann@ibk-node14:~/Projekte/GlobalWarming/ccc-giss 0.5.0/ccc-gistemp-0.5.0>

    Could you explain or help?
    Thanks
    Jörg

  13. clearscience Says:

    So they (google) will be shutting down the mailing list?

  14. drj Says:

    @clearscience: ah… no?

  15. Yet another study confirms hockey stick - Page 3 (politics) Says:

    […] and easy. And if you can't read FORTRAN, the same code translated to Python can be found here: http://clearclimatecode.org/code/ Use either source. (Hint: you won't find what you're looking for. Because it's not there.) […]
