Analysis of Canada data

In an earlier post I describe the trials and tribulations of tracking down some station data from Environment Canada’s website.

The obvious question to ask is, how does this affect the ccc-gistemp analysis?

For starters, how much extra data do we get, once we’ve merged all the duplicates and rejected short records and so on? Here’s the station count by year for the GHCN stations (dark lines), and the extra Environment Canada stations (lighter lines):

This count is made after ccc-gistemp Step 2 processing, so duplicate records for the same station have been merged, short records have been discarded, and urban stations that could not be adjusted have been dropped (the log tells me there are 18 such stations). New Environment Canada stations (recall that some of the Environment Canada data is for stations that are not in GHCN) do not get any brightness information in the v2.inv file; it so happens that in ccc-gistemp this means they get marked as rural, more by accident than by design. I should probably fix this (by calculating brightnesses for the new stations, and rejecting stations with no brightness data), but this will certainly do for a preliminary analysis.
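A count like this can be sketched in a few lines. This is only an illustration: the record layout (station id mapped to a dictionary of year to twelve monthly values, with None for a missing month) and the function name are assumptions made for the example, not ccc-gistemp's actual data structures.

```python
from collections import Counter

def station_count_by_year(records):
    """Count, for each year, the stations with at least one valid month.

    `records` uses a hypothetical layout:
    {station_id: {year: [12 monthly values, None where missing]}}.
    """
    counts = Counter()
    for station_id, years in records.items():
        for year, months in years.items():
            # A station counts towards a year if any month has data.
            if any(m is not None for m in months):
                counts[year] += 1
    return dict(sorted(counts.items()))

# Made-up data for two stations.
example = {
    "A": {1990: [1.0] * 12, 1991: [None] * 12},
    "B": {1990: [None] * 11 + [2.5], 1991: [0.5] * 12},
}
print(station_count_by_year(example))
```

Whether a year with a single valid month should count is a design choice; a stricter rule would simply change the `any(...)` test.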

The 1990s still don’t reach the dizzying peaks of the 1960s (in terms of station count), but the Environment Canada data is certainly a welcome contribution, more than doubling the number of stations for recent decades.

The effect of this on the analysis? Here’s the arctic zone:

The first thing to note, if you haven’t seen one of these before, is the scale. The swings in this zone are much larger than the global average (this zone is 5% of the Earth’s surface); the recent warming in this zone is over 5 °C per century! The remaining points of note are the slight differences here and there in the very recent period. That large dip in the 2000s is 2004, and the new analysis has the anomaly some 0.16 °C colder (+0.57 versus +0.73). The warm spike in 1995 is 0.09 °C warmer. The same blips are also just about visibly different on the Northern Hemisphere analysis, but the differences are smaller.
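The "5% of the Earth's surface" figure can be checked with a little spherical geometry: on a sphere, the fraction of the surface between two latitudes is half the difference of their sines. A minimal sketch, treating the Earth as a perfect sphere (an assumption) and using the 64N to 90N zone limits; the function name is made up for the example:

```python
import math

def zone_area_fraction(lat_south, lat_north):
    """Fraction of a sphere's surface lying between two latitudes (degrees)."""
    south = math.sin(math.radians(lat_south))
    north = math.sin(math.radians(lat_north))
    return (north - south) / 2.0

arctic = zone_area_fraction(64.0, 90.0)
print(f"64N-90N covers {arctic:.1%} of the sphere")
```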

The additional Environment Canada data is welcome, and does affect the result just enough to be visible, but the trends and any conclusion one could derive are not affected at all.


The data are available here, but you don’t need to download that if you’re using ccc-gistemp. Run «python tool/» to download the data, and then run «python tool/» to generate a mapping table. «python tool/ -d ‘data_sources=ghcn ushcn scar hohenpeissenberg ca.v2’» will then run the analysis.

33 Responses to “Analysis of Canada data”

  1. Nick.Barnes Says:

    0.73 – 0.57 = 0.16 [ed: oops! fixed. ta.]

  2. cth Says:

    nice work. It does interest me why the Canadian station data has not been sought after. I can only guess that scientist are largely satisfied with what we have, or rather they realize that adding more stations won’t have any huge benefit. Probably actual variation in global temperature surpasses anything they can squeeze out from additional station data at this point. I guess they are right. After all you’ve shown here that effectively doubling the station count in Canada doesn’t even change the trend much in that region, let alone the world. It’s probably areas of sparse station coverage which are more of a problem, like central Africa.

  3. clearscience Says:

    [I would like to point out that I’ve done a similar analysis on a small region in Northern Canada and also found the spike in 1980 that you find for all of Canada. I have been unable to find any climatically significant causes of this spike. Any ideas? I would also like to point out that the 2010 data will likely put this as being the warmest year in Canada (at least if one uses a 12-month running mean) as it has already been established that it was the warmest winter and spring in Canadian history. Parts of Northeastern Canada were 5-10 °C above normal for the winter – Clearscience]
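For readers unfamiliar with the 12-month running mean mentioned in the comment above, here is a minimal sketch. It uses a trailing window over a plain list of monthly values with no gaps; both are simplifying assumptions, and the function name is made up for the example.

```python
def running_mean(values, window=12):
    """Trailing running mean; returns one value per complete window."""
    if window <= 0:
        raise ValueError("window must be positive")
    means = []
    total = 0.0
    for i, v in enumerate(values):
        total += v
        if i >= window:
            # Drop the value that has just left the window.
            total -= values[i - window]
        if i >= window - 1:
            means.append(total / window)
    return means

# Two years of made-up monthly anomalies give 13 complete 12-month windows.
monthly = [0.1 * m for m in range(24)]
print(len(running_mean(monthly)))
```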

  4. Just a small note… « clearscience Says:

    […] the comments section Keon point out the following post which I found interesting over at clear climate code. They integrate all available records for the […]

  5. HR Says:

    Thanks for this.

    Any chance of posting the whole Canada graph rather than just arctic Canada?

  6. HR Says:

    Sorry to push this further but the first graph is for the whole of Canada while the 2nd graph is arctic Canada, which makes interpreting this result a little difficult.

    What is the number of stations for the black line and red line in the second graph?

  7. drj Says:

    @HR: You misinterpret the second graph, it is not Arctic Canada, it is the entire Arctic zone.

    But you’re right in your general implication that many of the Canada stations will not be in the Arctic zone; map here. However, because GISTEMP will combine station data from up to 1200 km away (see Hansen & Lebedeff 1987), non-arctic stations will influence the series of arctic grid cells (stations at 53 North will (just) influence cells at 64 North). Because of this, it’s not easy to separate stations and zones.
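To get a feel for the 1200 km radius, here is a rough sketch of the distance weighting described in Hansen & Lebedeff 1987, where a station's weight falls linearly from 1 at the cell to 0 at 1200 km. The haversine distance, the mean Earth radius and the function names are assumptions of this sketch; ccc-gistemp's own code computes the distance differently in detail.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption of this sketch

def great_circle_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points (degrees), haversine formula."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlat = p2 - p1
    dlon = math.radians(lon2 - lon1)
    a = math.sin(dlat / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def station_weight(distance_km, radius_km=1200.0):
    """Linear weight: 1 at zero distance, 0 at radius_km and beyond."""
    return max(0.0, 1.0 - distance_km / radius_km)

# A made-up station at 55N and a made-up cell centre at 65N on the same meridian.
d = great_circle_km(55.0, -100.0, 65.0, -100.0)
print(round(d), round(station_weight(d), 3))
```

Ten degrees of latitude is a little over 1100 km, so such a station still contributes to the cell, but with a small weight.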

    GISTEMP does not produce a “Canada” series. In principle ccc-gistemp could be adapted to produce a Canada series (possibly using the landmask feature), but it would require some effort.

    I previously opined that the US-only temperature series was “wildly parochial”; I guess Canada is much bigger than the US in land area, but a “Canada only” series is not high on our list of priorities.

    If you’d like to code something up, we could provide assistance; join the mailing list.

  8. HR Says:

    Thanks drj for the clarification.

    I guess what I was trying to get at is to what extent is the pool of stations that make up the red line different to the pool of stations that make up the black line? I take it from your answer that you can’t give a definitive answer to that?

    You might be aware I came to your site via Skeptical Science. I just thought the wording there, and here in fact, could be interpreted as saying that the number of stations is doubled. This obviously isn’t the case because you are talking about a completely different data set but I think there is scope for confusion.

    The implication of this work (shown in the second graph) and implied by the SkSci title is that an extended data set gives essentially the same result as the ‘official results’. I just wanted to gauge how much greater your data set was. I’m not saying this is the case but it strikes me that it would be of little significance if you’ve only added a few extra stations in the arctic comparison.

    “code something up” I wouldn’t know where to start. I’m happier just criticising other people’s efforts 😉

  9. drj Says:

    @HR: Hmm, I see your point, but I think you are mis-reading me. I’ve reviewed what I wrote, and I still stand by it. This is an article about adding a source of Canadian stations to ccc-gistemp. This addition does double the number of stations in Canada for recent years.

    The point of the arctic zone graph is not that the number of stations in this zone has doubled, but this is the largest change I could find having added the Environment Canada stations. The other zones, and the hemispheric averages, have an even smaller change.

    I agree that this may well be of little significance, but the point is that I went out and found extra stations for a poorly sampled area of the globe. Instead of just whinging about poorly sampled areas, as some other people are content to do.

  10. HR Says:

    Look I get the point of Ned’s and your post now but I hope you’ll indulge me a little longer.

    I don’t think it’s fair to say that skeptics are just whinging, there is usually some level of analysis involved so I was wondering just exactly which skeptic post you had in mind when you started this work?

    I had a quick look and noticed DiggingInTheClay had something on this topic. Is it this?

    They highlight one point which I think your analysis will have missed. The coloured line chart on their post shows that it’s not just whether a station is present or not in GISS but whether all the data is used. I assume the black and red lines in your chart are derived from your own database? In that case the GISS-like graph you have generated (the black line) would actually be different to the graph that GISS has generated, because while you both use the same stations you don’t both use the same data set.

    Rather than whinge about that I thought I’d actually graph what GISS use. The GISS data is available below and fortunately includes a 64N-90N zone.

    When I graphed this and added a trend for the past 30 years I got 5.93 °C/century. That’s 0.4 °C greater than you get; I don’t know the significance of that though.

    Certainly this comment by them is wildly over stating the case (i.e. is plain wrong)

    “the warming in Canada seems to be less about the warming in the high latitude long-lived stations, and more about the loss of stations generally, after 1989/90.”

    but then it seems your own comment may be going too far as well

    “but the trends and any conclusion one could derive are not affected at all”

    (I just noticed your comment on the DITC post so I assume you were inspired to do this by them)

  11. HR Says:

    Hey you’ve got my curiosity going here.

    I know nobody is really that interested in Canada (don’t tell my Canadian father-in-law I wrote that) but I’m curious what your extended database does to a ‘Canadian warming trend’. The reason is that, while you say the arctic zone is the most affected zone in your analysis, a better test of whether the GISS data is compromised in any way would seem to be the trend in Canada; after all, the overlap between the whole of Canada and the whole of the Arctic zone is relatively small.

    You probably know KNMI Climate Explorer lets you play with lat and long settings. I don’t know if your database allows the same thing? I graphed the GISS 250km data between 50N and 90N and 50W and 140W as a rough representation of Canada. That gave me a warming trend of

    0.7 °C/century (1880-2009)
    3.5 °C/century (1979-2009)

    If your database can mask(?) using lat and long I’d be really interested what figures you come up with.


  12. HR Says:

    That should be 60W-140W

  13. pd Says:


    Did you plan to include some tools to make maps? Something like this:

    Second question:
    I have problems with ccc-gistemp-0.6.1 when I put an oiv2mon file (from )
    into the input directory. Gistemp crashes with error messages like in the 0.6.0 version:

    ====> STEPS 0, 1, 2, 3, 4, 5 ====
    reading input/oiv2mon.201010
    Load GHCN records
    Load USHCN records
    (Reading average temperature)
    Load SCAR records
    Correct the GHCN Hohenpeissenberg record.
    Adjust USHCN records
    1 USHCN records had no GHCN counterpart.
    Traceback (most recent call last):
    File “tool/”, line 280, in
    File “tool/”, line 261, in main
    data = step_fn[step](data)
    File “tool/”, line 104, in run_step5
    result = step5.step5(data)


    There was no problem with ccc-0.5.

  14. drj Says:

    @HR: Lots of points to respond to, do pardon me and nudge again if I miss one.

    This post (and the coding effort behind it) are not a response to any particular post (“skeptic” or otherwise). Yes, I read the Digging in the Clay post (as my comment in May there shows); it was that post that alerted me to the existence of non-GHCN data that might be useful in the ccc-gistemp analysis.

    I didn’t say skeptics were whinging. I said that some other people were content to whinge. My point is that anyone whinging about poor coverage of Canadian stations could have gone out and found the data and analysed it.

    Regarding some of your other points, I think you are a little bit confused as to the relationship between ccc-gistemp (this project) and GISTEMP (GISS’s project). We implement exactly the same algorithm as GISTEMP. We use exactly the same data, and since the beginning of 2010 we get exactly the same results. You say GISS and this project don’t use the same dataset, but that’s not so. Same stations, same data, same results.

    Compared to GISTEMP, ccc-gistemp is, I think, a little easier to adapt to accept more (and different) sources of data. This Canada data is an example of that.

    The ZonAnn file you link to is a combined land and ocean analysis. The graph in this post is for land stations only (that’s what “land stations” in the graph title hints at; sorry it’s not clearer). However, you have found something interesting, because when I use GISTEMP’s land only analysis for that zone I get a trend of 5.84 °C/Century, which is still different from my graph. So I’ll have to investigate that.

    Also, it’s surprising that your figure for land+ocean has a higher trend than the land-only figure. I checked. I get a trend of 5.64 °C/Century for GISTEMP’s land+ocean analysis Zone 64N to 90N:

    Using ccc-gistemp’s tools, it’s easy for me to draw this graph:

    tool/ -x 45,50

    (the fact that GISTEMP land+ocean trend is very close to what I’ve plotted in the post, makes me wonder if I made some stupid mistake in my blog post, like plotting the land+ocean analysis instead of land-only)

    Note the huge variation between 1979 and 1981; the trend you get will be very sensitive to selection of years, we use the last 30 years with data, so that’s 1980 to 2009 (or it should be).
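The sensitivity to the choice of years is easy to trip over: 1979 to 2009 inclusive is 31 years, not 30. A minimal sketch of a "last 30 years with data" trend, using an ordinary least-squares slope; the series layout (year mapped to annual anomaly) and the function names are assumptions for the example, not the code behind the graph:

```python
def linear_trend_per_century(years, anomalies):
    """Ordinary least-squares slope of anomaly against year, per century."""
    n = len(years)
    if n < 2 or n != len(anomalies):
        raise ValueError("need at least two (year, anomaly) pairs")
    mean_x = sum(years) / n
    mean_y = sum(anomalies) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, anomalies))
    return 100.0 * sxy / sxx

def last_n_years(series, n=30):
    """Pick the n most recent years that have data from {year: anomaly}."""
    years = sorted(series)[-n:]
    return years, [series[y] for y in years]

# Synthetic series warming at exactly 0.05 degrees per year (made-up data).
series = {year: 0.05 * (year - 1979) for year in range(1979, 2010)}
years, values = last_n_years(series)
print(years[0], years[-1], len(years))
print(linear_trend_per_century(years, values))
```

With real data, which swings a lot around 1979 to 1981, moving the window by a single year can change the slope noticeably.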

    I fail to see how my statement “but the trends and any conclusion one could derive are not affected at all” is “going too far”. I’m talking about conclusions one can make from the graph of zonal anomalies that I’ve drawn. Can you give an example of a conclusion that one could draw from the black line, that would be different from the conclusion one could draw from the red line?

    As to a Canada trend, I refer you to my previous statement. But… yes, something like a simple rectangle would be doable (but no it’s not something we can do right away). And you mean 50W, right? Blanc Sablon is in Canada (or are you worried about “infecting” the analysis with Greenland stations?).

  15. drj Says:

    @pd: Maps. Yeah, of course. I’d prefer something that didn’t have any dependencies (like ccc-gistemp does now), but I’d settle for something fairly clean that depended on only one or two modules. Since visualisation is extra icing I guess it’s okay that running it might need more modules.

    It’s mostly a matter of resources.

    Are you keen on creating any mapping visualisations? We’d welcome contributions to the repository.

    As for the bug… thanks, I’ll create an issue. Testing that combining new ocean data works is something that I don’t do often enough (clearly!).

  16. HR Says:


    Thanks for your patience. I hope you’ll keep that going a little longer.

    1) Ok I think I was struggling with how you were handling the new data from EC. I re-read the previous posting (“Canada”) on this and it opened up more options than I had previously thought of. So the database contains duplicate records where both GHCN and EC have records for the same station, and merged(?) records also exist for some stations?

    The DITC website had identified a problem that went beyond station dropout: it was showing some stations were retained by GHCN but data collection stopped (for many before 1990). I had only considered that your database contained merged records, that these merged records were used to generate the red and black lines, and so this aspect of the DITC analysis had been missed. I now see that it’s possible to generate the black line using the GHCN station data and the red line using EC (or merged) station data. So would that be the case then? Taking one of the examples from the coloured “piano roll” plots, if Mayo was in your analysis then the black line would include data up to 1990ish while the red line would include all the EC data for that station.

    2) I think you identified my basic error in generating the trend number. I had 1979-2009 as 30 years (oops!). I didn’t have time to go back and check but I’ll take your word on this.

    3) Yeah I had wavered between 50W and 60W and chose 60W to avoid Greenland.

    4) Just on the subject of the ‘intent’ of this analysis I guess I was conflating some of the claims in the SkSci article with what you have written here. I see that you make fairly measured claims about this analysis. I guess the claim over on SkSci that I object to is this.

    “Is this result surprising? Not really. As discussed elsewhere on this site (e.g., here) previous claims of problems with global temperature reconstructions have been shown to be mistaken. ”

    Here Ned is suggesting that this work in some way validates the ‘official’ temperature reconstructions. As you point out, the analysis you chose to do suggests that in your database the extra EC data has little effect on the results, and that’s fine. But wider claims that this in some way validates what GISS (or others) are doing seem to be stretching things. While I now understand why you did the particular analysis (whole arctic) I still think it falls short of the claim made by Ned. I guess you’re not responsible for what Ned writes but I wonder, as author of this work, whether you think Ned is justified in making that claim.


  17. HR Says:

    I don’t know if it’s your general intention to hunt down more climate data but I recently read a paper that may interest you.

    Role of Polar Amplification in Long-Term Surface Air Temperature
    Variations and Modern Arctic Warming

    It contains an arctic temp reconstruction with heaps of extra data (particularly in Russia). I’ll copy their list of data sources in case it helps you (a bit long sorry). The paper is well worth reading if you can get behind the paywall.

    # monthly mean SATs from the Climate Research Unit,
    University of East Anglia (CRU UEA) provided by
    P. Jones;
    # monthly data from the Web site of the NOAA National
    Climatic Data Center (NCDC) and NOAA Global
    Historical Climatology Network (GHCN), http://www.;
    # monthly data from the Web site of the Goddard Institute
    for Space Studies (GISS), http://data.giss.nasa.
    # monthly data from the University Corporation for Atmospheric
    Research (UCAR) Web site, http://dss.ucar.
    # The Environment Canada, National Climate Data and
    Information Archive, available online at http://climate.;
    # The Environmental Working Group Arctic Meteorology
    and Climate Atlas (Fetterer and Radionov 2000),
    available online at
    # National Snow and Ice Data Center (NSIDC) meteorological
    data from the Russian Arctic, 1961–2000,
    available online at
    # monthly SAT for several Russian meteorological stations
    from the All-Russian Institute of HydroMeteorological
    Information, World Data Center (RIHMI-WDC)
    Web site,;
    # monthly and daily data from Alaskan meteorological
    stations provided by D. Atkinson and M. Shulski;
    # hourly data from the NOAA Web site, ftp://ftp.ncdc.;
    # daily observations from Russian stations provided by
    P. Groisman;
    # daily data from European stations provided by The
    Royal Netherlands Meteorological Institute (KNMI)
    Web site,
    # hourly data from the Web site of the Russian Academy
    of Science Space Research Institute, Moscow, Russia,;
    # hourly data from the Web site ‘‘Raspisanie Pogodi,’’;
    # monthly data from the Web site of the Icelandic Meteorological
    # monthly data from the Finnish Meteorological Institute
    Web site,;
    # monthly data from the Web site Rimfrost, http://;
    # monthly data from the meteorological station Abisko
    (Sweden) for 2000–07 provided by C. Jonasson, Abisko
    Scientific Research Station;
    # monthly data from several Scandinavian stations updated
    using the NordKlim project archive Web site,;
    # monthly data for several Scandinavian and Greenland
    stations updated or corrected using archive created under
    the Nordic Arctic Research Program project Web
    site,;narp/; and
    # data from several Greenland stations from the Danish
    Meteorological Institute Web site,

    As I said, a little on the long side (fingers crossed on the formatting).

  18. HR Says:

    Did my second post with the climate data website links make it?

  19. Nick.Barnes Says:

    Thanks for the tip HR: it was in the spam trap.

  20. drj Says:

    @HR: Just a brief response now, more next week hopefully.

    With regard to your point 1), yes there are “duplicate” records, in other words multiple records for overlapping periods for the same station. These are a fact of life with GHCN v2, but the Environment Canada data can introduce a further source of duplicates. I’ve recently written this overview document and I would welcome your comments (preferably via the mailing list). This duplicate stuff is the concern of “Step 1”.

    The red line and the black line: the black line is the traditional analysis, GHCN data only. The red line is the GHCN data augmented by the Environment Canada data. Where EC stations are identified as being identical to GHCN stations, the data will be merged (nothing special here, it’s just what Step 1 does); where the EC stations are identified as being new, they will simply be treated as additional records.
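The split between "identical to a GHCN station" and "new" can be pictured with a small sketch. Everything here is hypothetical: the mapping-table layout (EC id to GHCN id), the record layout and the function name are invented for illustration, and the sketch deliberately stops before the actual combining of duplicate records, which is Step 1's job.

```python
def assign_station_ids(ec_records, mapping):
    """Split EC records into duplicates of known GHCN stations and new stations.

    ec_records: {ec_id: record}   (hypothetical layout)
    mapping:    {ec_id: ghcn_id}  (hypothetical mapping table)
    Returns (duplicates, new), where duplicates maps ghcn_id to the list of
    EC records to be combined with that station, and new maps ec_id to record.
    """
    duplicates = {}
    new = {}
    for ec_id, record in ec_records.items():
        ghcn_id = mapping.get(ec_id)
        if ghcn_id is None:
            # No GHCN counterpart known: treat as an additional station.
            new[ec_id] = record
        else:
            duplicates.setdefault(ghcn_id, []).append(record)
    return duplicates, new

# Made-up identifiers and records.
ec = {"EC001": "record-1", "EC002": "record-2", "EC003": "record-3"}
table = {"EC001": "GHCN-A", "EC003": "GHCN-A"}
dups, extra = assign_station_ids(ec, table)
print(dups, extra)
```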

  21. clearscience Says:

    drj and nick,

    Something I think you’ll find interesting is that if you compute monthly anomalies for the region you will likely see a pretty big warming for the year 2010, particularly as the Canadian arctic had a winter around 5-10 °C above normal.

    Although I should note that much of the warmth was in the Labrador and baffin island region which is still considered “Arctic” but does not meet the 64 N criteria.

  22. clearscience Says:

    Is there any way to adjust the code so as to specify specific region to be used and also to produce maps from the output?

  23. drj Says:

    @HR: An interesting list. Curious in some respects… GISS (3rd source) in the arctic region is exactly the same as GHCN (2nd source). Why do they list both? (rhetorical). In principle I would be pleased to see many of these data sources used in a ccc-gistemp style analysis, in a similar way to the way I’ve integrated the Canada data.

  24. drj Says:

    @clearscience: This year’s anomaly in the Canadian region sounds like a great subject for a blog post (you, on your blog, I mean!).

    “Is there any way to adjust the code…?” Yes. It’s just a Simple Matter of Programming. Are you asking because you want me to do it, or because you’re interested in doing it? The technical details are probably best sorted out on the mailing list. Short version: Compute your favourite region as a subset of the 8000 cells used in the GISTEMP grid, and modify step 5 to compute an average over just that region.
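As a very rough sketch of that short version: pick the cells whose centres fall inside a latitude/longitude rectangle and average their anomalies. The cell layout (centre latitude, centre longitude, anomaly or None) and the function names are assumptions for the example, and the unweighted mean relies on the cells being equal in area; the real step 5 is more involved than this.

```python
def in_box(lat, lon, south, north, west, east):
    """True if a point lies in the rectangle (no date-line wrapping handled)."""
    return south <= lat <= north and west <= lon <= east

def region_mean(cells, south, north, west, east):
    """Unweighted mean anomaly over cells whose centres lie in the rectangle.

    cells: list of (centre_lat, centre_lon, anomaly_or_None), a hypothetical layout.
    Returns None if no cell in the region has data.
    """
    values = [a for lat, lon, a in cells
              if a is not None and in_box(lat, lon, south, north, west, east)]
    if not values:
        return None
    return sum(values) / len(values)

# Made-up cells; the rectangle is the rough "Canada" box from the comments above.
cells = [
    (55.0, -100.0, 1.0),
    (70.0, -90.0, 3.0),
    (40.0, -100.0, 10.0),   # south of the box
    (60.0, 20.0, -5.0),     # east of the box
    (65.0, -120.0, None),   # inside the box but no data
]
print(region_mean(cells, 50.0, 90.0, -140.0, -60.0))
```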

  25. drj Says:

    @HR (#16): re Ned’s Skeptical Science post; it’s true that Ned makes some statements that I would probably word a little bit more carefully (but then, I like to be very careful when talking about ccc-gistemp), but the statement you quoted, “Is this result surprising? Not really. As discussed elsewhere on this site (e.g., here) previous claims of problems with global temperature reconstructions have been shown to be mistaken”, does not seem very controversial to me. It isn’t surprising that a bit more Canada data doesn’t alter the trends significantly; and it is the case that other claims of problems have been shown to be mistaken.

    Skeptical Science is probably the best place to comment on their posts.

  26. Views part 1 – Canadian weather stations | Scraperwiki Data Blog Says:

    […] more about ScraperWiki and the Canada weather stations in the posts Canada and Analysis of Canada Data on the Clear Climate Code […]

  27. Doug Proctor Says:

    Do you support the adjustments made to the datasets that give rise to the above graph? Are we in Canada not subject to the same adjustment biases as in the rest of the GHCN data?

    It is so odd that until 1990 it was important to collect all this temperature data, to bring it into the computer system, and then, just as Gore & Hansen say that the world is in trouble, they discontinue station after station. As far as I know most of those locations still have operating weather stations. So why did NOAA decide they don’t need the data?

    I guess in 1990 the science was settled.

  28. Nick.Barnes Says:

    Doug: can you be more specific about the adjustments you mean? And can you say what you mean by “support”, or by “adjustment biases”? Unless you can be more specific, your questions are not answerable.

    The rest of your comment suggests that the fall in station counts after 1990 in GHCN reflects a 1990 change in procedures or systems. My understanding is that it’s mostly an effect of the WWR (World Weather Records) decadal reporting system: the 1980s report has been incorporated into GHCN, but the 1990s report has not yet. Many station records from the 1990s are likely to be incorporated into GHCN in future. See also

    Your political remarks, and your general tone, are not really acceptable on this blog. Please be more civil and on-topic.

  29. ferd berple Says:

    It looks like the rise in temperature over the past 30 years is almost identical to what happened 65 years prior. If anything the slope of the rise from 1918 to 1948 is greater than the more modern rise. However, CO2 is significantly higher today.

    What is equally interesting is that the CET, the longest thermometer record available, has been rising gradually since the LIA without any signs of atypical acceleration in spite of massive increases in human CO2 production.

    While many rational explanations can be found for these observations, the simplest and thus most likely explanation is that atmospheric CO2 levels are driven by temperature. As temperatures go up, more CO2 is released.

    Coincidentally, human prosperity is also largely driven by temperature, with increased temperatures leading to increased food production, both due to increased arable land and increased CO2 to fertilize the plants.

  30. LightRain Says:

    Yabut, there are many more stations in southern Ontario than the far north. If you have 15 stations within a couple of hundred miles of each other and it’s a hot day, where there is one station in the north every 500-1000 miles, which station(s) will dominate the average for a particular day? If it’s hot in southern Ontario then you have many stations reflecting that, same for cold whereas the north is under serviced and one station can represent a huge area (kind of like what Hansen does) and make that huge area all hot, all cold, or average. I wonder if Hansen adjusts the Canadian temperatures for Environment Canada?

    Satellites are the only unbiased way to go!

  31. Nick.Barnes Says:

    ferd berple: I believe that most of your assertions are false, but more importantly your comment is completely off-topic. Future off-topic comments will be deleted without comment.

  32. drj Says:

    @LightRain: this, and my previous post, documents my solution to exactly that problem. Find more station data and use them. As to the other thing. Gridding.
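A toy illustration of why gridding deals with the "fifteen stations in southern Ontario" worry: average the stations inside each cell first, then average the cells, so a dense cluster counts once. The plain 5 degree latitude/longitude cells, the unweighted cell average and the function names are simplifications assumed for this sketch; GISTEMP uses equal-area cells and distance weighting.

```python
import math
from collections import defaultdict

def naive_mean(stations):
    """Plain average over all stations; dense clusters dominate."""
    return sum(t for _, _, t in stations) / len(stations)

def gridded_mean(stations, cell_deg=5.0):
    """Average stations within each lat/lon cell, then average the cells."""
    cells = defaultdict(list)
    for lat, lon, t in stations:
        key = (math.floor(lat / cell_deg), math.floor(lon / cell_deg))
        cells[key].append(t)
    cell_means = [sum(v) / len(v) for v in cells.values()]
    return sum(cell_means) / len(cell_means)

# Made-up data: 15 clustered southern stations reading +2.0,
# and one lone northern station reading -1.0.
south = [(43.0 + 0.1 * i, -80.0 + 0.1 * i, 2.0) for i in range(15)]
north = [(70.0, -95.0, -1.0)]
stations = south + north
print(naive_mean(stations), gridded_mean(stations))
```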

    I’m not familiar with the satellite datasets, but what does the UAH North pole anomaly dataset show?

  33. drj Says:

    @LightRain: oh yes, and could you please clarify your question about Hansen? Or, better, avoid making snide rhetorical questions. This is not that sort of blog.
