On integers, floating-point numbers, and rounding

Progress continues on the ccc-gistemp project. Anyone interested is welcome to go on over to the source code browse page and peruse it.

  • Paul Ollis has done excellent work separating all the I/O code from the main algorithm, and refactoring it so that data can flow through the entire program without passing through several intermediate data files.
  • David Jones has made a tool for indexing plain-text data files for random access, and has been working SVG-based visualisation tools. Together, one day these will let us provide a snappy graphical interface for answering questions like “how did the peri-urban adjustment on this station work?”
  • I have been working on removing rounding from the whole system. Until now we have often found ourselves having to round values in order to maintain exact equivalence with GISS results (which may have been rounded for output to an intermediate data file which is read by a later phase). For example, rounding temperatures to the nearest tenth degree Celsius, or latitude and longitude values to the nearest tenth degree. I mentioned this in email with Dr Reto Ruedy of GISS, and he assured me that all such rounding is incidental to the algorithm – an accident of history. So we are removing it from our version, to help clarify the algorithm. We will end up with the only explicit rounding in the system being done in order to write the final result files.
  • Next I am hoping we will extract the main numerical parameters of the algorithm – for instance, the 1200km station radius for gridding, the 4 rural stations required for peri-urban adjustment – to a separate module, where they can be easily modified by anyone interested in experimenting with different values.

We are aiming for a release 0.4.0 of ccc-gistemp to happen around the end of February or in early March, time permitting. The specification of this version is something like “no I/O, no rounding, and explicit parameters”, and we’re pretty close to that now.

Rounding in GISTEMP has prompted a lot of discussion in the blogosphere, and since I have been working in that area in ccc-gistemp, I thought I could write a few words here to clarify it. There is a lot of general misunderstanding of computer arithmetic, even among professional programmers. I have dealt with the nitty-gritty of it in various capacities in the past, and hopefully can convey some of my expertise.

Computer programs generally deal with two sorts of numbers. The first sort is “integers”, whole counting numbers such as 42, 57, -73, and 4294967296. The second sort is “floating-point”, which some people refer to as “real” numbers. These are often used when manipulating fractions and numbers with decimal points (although the most obvious example – monetary amounts such as $5.99 – is in fact often handled as an integer number of pennies).

Many people, including some professional programmers, distrust floating-point arithmetic. Maybe they have had too many bad experiences with poorly-programmed computers displaying a number like 94.8 as “94.799999999999997”. However, a programmer who is familiar with the way the system works can use floating-point arithmetic in very accurate and powerful ways.

The first thing to understand is that every floating-point number in a computer is an exact fraction, in which the denominator is a power of 2. Numbers such as 1.5, 15/16, and 348304583453/2199023255552 are all exact floating-point numbers (the denominator of the last number there is 241). Any fraction which doesn’t have a power-of-2 denominator can’t be represented exactly. For instance, 0.1 can’t be represented exactly. It falls between the two exact floating-point numbers 7205759403792793 . 2-56 and 7205759403792794 . 2-56. That is to say between 0.09999999999999999167332731531132594682276248931884765625 and 0.1000000000000000055511151231257827021181583404541015625. It is slightly closer to the latter number, so the computer chooses this number to represent 0.1, and some systems might (unwisely, I think) print it out as “0.10000000000000001”.

Exactly this problem is also true of decimal arithmetic. One third falls between 0.333333 and 0.333334, closer to the first. So if we want to represent one third as a decimal to six places, we write 0.333333 (in floating-point we choose the number 0.333333333333333314829616256247390992939472198486328125). This hasn’t prevented decimal arithmetic from being tremendously useful over the past thousand years, so it is unreasonable to shy away from binary arithmetic just because of this.

Every time a number is read from a file, or numbers are added, multiplied, or otherwise combined, the computer has to choose a floating-point result to represent the answer. Because it can only use fractions with power-of-two denominators (and some other limits – the power can’t exceed about a thousand, and the numerator can’t be greater than about 9 million billion), many operations will result in a tiny bit of rounding – approximately 1 part in 9 million billion (one digit in the 16th decimal place). As operation follows operation, this rounding error accumulates. In some kinds of iterative algorithm, that accumulation can blow up out of control and the error will come to completely skew the results. Numeric programming experts are well aware of this and may spend their whole careers identifying such algorithms and designing ways around them.

Similarly some aspects of the real world – known as “chaotic systems” – are inherently sensitive to tiny changes, which can become amplified to dominate the behaviour, in just the same way that rounding errors in programs can blow up out of control. The “butterfly effect” in the weather is a famous example. No program can model such a system accurately for an extended period, because any rounding error can soon change the behaviour of the whole model. This is why, despite huge computer systems and ranks of experts, the Met Office can’t yet tell me whether it will rain on my birthday.

Happily, GISTEMP does not include any algorithm of that sort, and does not represent a system of that sort (climate is not weather, as one can easily see by predicting whether London this June will average warmer, or cooler, than February). GISTEMP does include, in a few places, code with sudden “cut-offs” (for instance, a rural station 999.9 km from an urban station may contribute to adjusting its record, whereas one 1000.1 km away will not), and these parts of the code are potentially sensitive to small changes, such as rounding errors. However, it is reasonable to suppose that such effects balance out, in terms of their effect on the final results. And this is indeed what we see: as we remove the explicit rounding from the algorithm, occasionally a monthly temperature in the final results might increase by 0.01K, and occasionally one might decrease by the same amount, but overall the rounding does not affect the trend. As the code becomes clearer, it is becoming easier for us, or for others, to experiment with different rounding effects.

Finally, a concrete example. In the original GISTEMP code, this line of Fortran converted USHCN temperatures (recorded in Fahrenheit with one decimal place) to an integer number of tenths of degrees celsius:
itemp(m)=nint( 50.*(temp-32.)/9 )
In our first ccc-gistemp code, that line became this:
temp = round_to_nearest((temp_fahrenheit - 32) * 50/9.0)
A typical value read might be “68.1”. On my machine, Python reads this as 2396055739249459 * 2-45 (i.e. 68.099999999999994315658113919198513031005859375). After the multiplication and division, this becomes 7056421291149084 * 2-45 (i.e. 200.5555555555555429236846975982189178466796875). This rounds to 201. If we had written the arithmetic in a different order (e.g. the division before the multiplication), we would get an answer which was smaller by 2-45, but this difference of course would disappear in the rounding.
Later, the file format changed so that we would read “681” instead of “68.1”, and the arithmetic became
temp = round_to_nearest((temp_fahrenheit - 320) * 5/9.0)
Doing the multiplication first again gives us 7056421291149084 * 2-45. This time, doing the division first produces a number which is larger by 2-45, and again this difference disappears in the rounding.
Recently, in removing the explicit rounding, and treating temperatures as floating-point values rather than integral numbers of tenths of degrees, this code has become:
temp = (temp_fahrenheit - 320) * 5/90.0.

The result is 5645137032919268 * 2-48 i.e. 20.0555555555555571345394128002226352691650390625, and this number is taken, without rounding, into the rest of the computation. It differs from the exact Fahrenheit/Celsius conversion by less than one hundred-thousand-billionth of a degree Celsius. If we had done the division before the multiplication, the number would have been less by 2-48, which is less than one million billionth of a degree Celsius. In the context of an input dataset precision of at least 0.1 degree Fahrenheit, this difference is obviously totally negligible.

9 Responses to “On integers, floating-point numbers, and rounding”

  1. AndyB Says:

    I’ve never known how floats actually worked – I supposed if pressed I would have guessed something like that given the nature of computing. But, c’mon 2^-45? Is there really discussion out in the blogs that claims floating point arithmetic is at all relevant to the thermometer record? That’s just silly on the face of it.

    Thanks for the lesson!

  2. drj Says:

    @AndyB: Yes, there really is.

  3. Nick.Barnes Says:

    A comment to correct an error I made in the post. The USHCNv1 dataset has Fahrenheit temperatures to two decimal places, not to one as I stated in the post. For instance, a reading might be “66.83”. Reading this produces 4702743173393285.2-46, i.e. 66.8299999999999982946974341757595539093017578125. Converting with multiplication first produces 387.2-1, i.e. 193.5, exactly the “right” answer (19.35 Celsius == 66.83 Fahrenheit). Converting by division first, or by multiplying by the factor (50.0/9.0), produces 6808175999188991.2-45, which is less by 2-45. Critically, this latter value rounds to a different integer (193 instead of 194). In this way, explicit rounding and truncation can amplify some floating-point rounding errors. A small proportion of the USHCNv1 dataset is prone to exactly this problem: some of them round up, some down. This problem does not occur with the USHCNv2 dataset, or the other input datasets.

    As it happens, even these amplified errors balance out in the later parts of the system, especially in the combination, averaging, and anomaly steps, and are not visible in the end results. But still, now that we no longer need to pass the dataset through plain-text intermediate files, the code is clearer without these explicit rounding and truncation steps.

  4. anon Says:

    Why are Fahrenheit numbers being used in scientific data anyway?

  5. Nick Barnes Says:

    Why are Fahrenheit numbers being used in scientific data anyway?
    Because it is US weather data, which has always been, and continues to be, recorded in degrees Fahrenheit. Degrees Fahrenheit may not be my favoured unit, but they are a respectable scientific unit nonetheless.
    Any more dissing of Fahrenheit is off-topic here.

  6. Rojarani Says:

    why square of floting points are less.

  7. Nick.Barnes Says:

    Less than what?

  8. nerd Says:

    how javascript prints 0.1 instead of 0.10000000000000001?

  9. Nick.Barnes Says:

    I’m not sure I understand the question. Printing floating-point numbers both accurately and well is a solved problem since at least the Steele and White paper in 1990: google “Printing floating-point numbers accurately”. Sadly many language implementors seem to remain unaware of the solutions.

Leave a Reply