Why bother with words when abbreviations are so much more cryptic? As pd points out, there is a new version of the Global Historical Climatology Network: version 3. There isn’t an official announcement yet, but others have noticed.

GHCN-M is the monthly dataset. Version 3 is still in beta, so we’re all still learning.

The file format is different, and more like USHCNv2. As in USHCN, each datum has a set of flags that indicate quality checks (isolated value, inconsistent with climatology, month has missing days, and so on). One of the flags is a source flag: each monthly datum is tagged with its source (UK Met Office, CLIMAT report, MCDW, and so on).
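To make the per-datum flags concrete, here is a rough Python sketch of reading one line of a v3 data file. The column layout is an assumption based on my reading of the beta README (11-character station ID, 4-digit year, 4-character element, then twelve 8-character groups of a 5-character value in hundredths of a degree C plus three one-character flags, with -9999 for missing); check it against the README you actually download. The names `parse_v3_line`, `Datum` and `Record` are made up for illustration and are not ccc-gistemp functions.

```python
from typing import List, NamedTuple, Optional

MISSING = -9999  # assumed missing-value sentinel in the v3 .dat file


class Datum(NamedTuple):
    value: Optional[float]  # degrees C, or None when the month is missing
    dmflag: str             # data measurement flag (e.g. month has missing days)
    qcflag: str             # quality control flag (e.g. isolated value)
    dsflag: str             # data source flag (e.g. CLIMAT report, MCDW)


class Record(NamedTuple):
    station_id: str
    year: int
    element: str
    months: List[Datum]


def parse_v3_line(line: str) -> Record:
    """Split one fixed-width line into a station-year of 12 flagged values.

    Column positions are assumptions, not taken from the post.
    """
    station_id = line[0:11]
    year = int(line[11:15])
    element = line[15:19]
    months = []
    for m in range(12):
        start = 19 + 8 * m
        raw = int(line[start:start + 5])
        # Pad in case trailing blanks were stripped from the end of the line.
        flags = line[start + 5:start + 8].ljust(3)
        value = None if raw == MISSING else raw / 100.0
        months.append(Datum(value, flags[0], flags[1], flags[2]))
    return Record(station_id, year, element, months)
```

Keeping the three flags alongside each value, rather than discarding them at parse time, leaves it to later code to decide whether to filter on quality or on source.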

Unlike GHCN v2, there is only one record for each station in GHCN v3. There are no “duplicates”. This makes one job (the job of Step 0) easier: we don’t have to decide how to select or combine multiple records for the same station, because that’s been done for us. On the other hand, we may have wanted to combine records in a different way.

I’ve been modifying ccc-gistemp to experiment with GHCN v3. At first I thought I could use the v2.inv file supplied by GISTEMP, but the GHCN station identifiers for the contiguous US have changed (so that they’re based on their USHCN station identifiers, which is probably a good thing). Writing code to parse the new v3 .inv file is straightforward enough.
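For a flavour of what parsing the v3 .inv file involves, here is a minimal Python sketch that pulls out a handful of fields from one fixed-width line. The column positions are assumptions from my reading of the beta README (ID in columns 1-11, latitude 13-20, longitude 22-30, elevation 32-37, name 39-68, POPCLS in column 74) and may not match the file you have; `Station` and `parse_v3_inv_line` are hypothetical names, not the ones used in ccc-gistemp.

```python
from typing import NamedTuple


class Station(NamedTuple):
    station_id: str   # 11-character GHCN v3 identifier
    latitude: float   # decimal degrees
    longitude: float  # decimal degrees
    elevation: float  # metres
    name: str
    popcls: str       # population class: R, S or U


def parse_v3_inv_line(line: str) -> Station:
    """Extract a few metadata fields from one .inv line.

    Slices are 0-based and half-open; the column layout is assumed.
    """
    return Station(
        station_id=line[0:11],
        latitude=float(line[12:20]),
        longitude=float(line[21:30]),
        elevation=float(line[31:37]),
        name=line[38:68].strip(),
        popcls=line[73:74],
    )
```

Only the fields needed here are extracted; the remaining columns (vegetation, topography and so on) are simply ignored by this sketch.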

Of course the v3 .inv file doesn’t have the night-time satellite brightness that GISTEMP uses in its analysis (globally, since 2010-01-16). So I also added a parameter to use the GHCN population index (POPCLS in the documentation) globally.

This result should be considered preliminary.

When making comparisons with official GISTEMP there are several caveats:

  • Only GHCN v3 data is used. No SCAR READER (and no Hohenpeissenberg correction).
  • In Step 2, urban adjustment, the GHCN v3 analysis uses the POPCLS field for the rural/urban designation. The field has three values, R/S/U, for Rural/Semi-Rural/Urban. R maps to rural in the analysis, the others count as urban. The current GISTEMP analysis uses night-time satellite brightnesses.
  • Each GHCN v3 station is treated as a single record. GISTEMP, using v2 data, combines duplicate records for the same station into one record (sometimes more than one); this record may not be the same as the GHCN v3 record. And in particular…
  • (because I appended a ‘G’ to all the 11-digit v3 station identifiers) the “hand picked” list of deletions and adjustments is not used. The most obvious example of where this matters is St Helena, 14761901000.
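Two of the caveats above are simple enough to sketch in Python: the POPCLS mapping (only R counts as rural) and the effect of the appended ‘G’ on lookups of station identifiers. The function names and the one-entry `hand_picked` set are hypothetical, for illustration only; they are not the actual ccc-gistemp code or the real list of deletions and adjustments.

```python
def is_rural(popcls: str) -> bool:
    """Rural/urban designation from POPCLS: R is rural, S and U count as urban."""
    return popcls == "R"


def tagged_station_id(v3_id: str) -> str:
    """Append 'G' to an 11-digit v3 station identifier."""
    if len(v3_id) != 11:
        raise ValueError("expected an 11-character GHCN v3 station id")
    return v3_id + "G"


# Hypothetical stand-in for a list keyed on untagged 11-digit identifiers.
hand_picked = {"14761901000"}

tagged = tagged_station_id("14761901000")
# The tagged identifier no longer matches the untagged entry, so a list
# keyed this way is silently skipped.
list_applies = tagged in hand_picked
```

Here `list_applies` ends up `False`, which is the mechanism by which the St Helena entry has no effect in the v3 run.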

I changed ccc-gistemp to use GHCN v3 and wrote this post ages ago, but when I met Jay Lawrimore at the Exeter workshop, he said I should probably hold off posting. Here’s the record of my GHCN v3 changes in googlecode (made on 2010-09-04).