Posted by drj | Filed under Uncategorized
Over at his blog, Nick Stokes selects rural stations with more than 90 years of data and that report in 2010. In GHCN he counts just 61 stations.
We can do something similar with ccc-gistemp. We get a result that is unreasonably similar to the official GISTEMP with full series.
The major differences in methodology are:
- I use the GISTEMP version of rural: locations with a brightness of 10 or less (units of whatever the night-time satellite map is in);
- I count stations after Step 1, which is after USHCN data has been incorporated and multiple records at the same location have been combined;
- I use the GISTEMP analysis algorithm rather than Stokes’s.
Here’s the short (but not very clear) Python script that identifies long rural stations:
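import itertools

def id11(x): return x[:11]
def id12(x): return x[:12]

# 11-digit station identifiers whose brightness is 10 or less.
rural = set(row[:11] for row in open('input/v2.inv')
            if int(row[102:106]) <= 10)

# 12-digit record identifiers of rural records that have more than
# 90 yearly rows and whose last row is for 2010.
rurlong = [group
           for group, rows in ((g, list(r)) for g, r in
                               itertools.groupby(open('work/v2.step1.out'), id12))
           if id11(group) in rural
           and len(rows) > 90
           and rows[-1][12:16] == '2010']
print(len(rurlong))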
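For anyone who finds the script hard to follow, here is a minimal sketch of the same selection unpacked into a function. It assumes the same column layout as the script (station identifier in the first 11 characters, record identifier in the first 12, year in columns 12 to 16, and brightness in columns 102 to 106 of v2.inv). The name `long_rural_records` and its parameters are my own invention for illustration, not part of ccc-gistemp.

```python
import itertools

def long_rural_records(inv_lines, data_lines, max_brightness=10,
                       min_years=90, last_year='2010'):
    """Return the 12-digit ids of long rural records (hypothetical helper).

    inv_lines:  rows in v2.inv layout.
    data_lines: rows in v2.mean layout, one row per year, ordered so that
                all rows of a record are adjacent and in year order.
    """
    # Stations that count as rural: brightness at or below the threshold.
    rural = set(row[:11] for row in inv_lines
                if int(row[102:106]) <= max_brightness)
    selected = []
    # groupby only merges adjacent rows, hence the ordering assumption above.
    for record_id, rows in itertools.groupby(data_lines,
                                             lambda row: row[:12]):
        rows = list(rows)
        if (record_id[:11] in rural
                and len(rows) > min_years
                and rows[-1][12:16] == last_year):
            selected.append(record_id)
    return selected
```

Because it takes any iterables of lines, it can be called on open files, as in `long_rural_records(open('input/v2.inv'), open('work/v2.step1.out'))`, or tried out on a handful of hand-made rows.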
There are 440 stations. With these 12-digit record identifiers in hand it is a trivial matter to create a new v2.mean file («open('v2.longrural', 'w').writelines((row for row in open('work/v2.step1.out') if row[:12] in rurlong))» if you must).
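If you would rather not squeeze that into one line, the filtering step can be sketched roughly as follows. The names `filter_records` and `write_long_rural` are hypothetical; the only assumption is that each row begins with its 12-digit record identifier, as in work/v2.step1.out. Turning the identifiers into a set keeps each membership test cheap, which a list does not.

```python
def filter_records(rows, record_ids):
    """Yield only the rows whose first 12 characters are in record_ids."""
    wanted = set(record_ids)  # set lookup instead of scanning a list per row
    for row in rows:
        if row[:12] in wanted:
            yield row

def write_long_rural(src_path, dst_path, record_ids):
    """Copy the selected records from src_path to dst_path, rows unchanged."""
    with open(src_path) as src, open(dst_path, 'w') as dst:
        dst.writelines(filter_records(src, record_ids))
```

With the list from the script above, the call would be `write_long_rural('work/v2.step1.out', 'v2.longrural', rurlong)`.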
As I said before, the results are pretty close to the standard analysis.
The new v2.mean file is over 10 times smaller (uncompressed) than the official GHCN v2.mean file. The analysis is correspondingly about 10 times quicker to run (a few minutes on my old laptop). In case you want to use this file, to replicate my results or run it through your own analysis, I’ve made it available from our googlecode repository.