The United Kingdom Meteorological Office produces (in conjunction with the University of East Anglia’s Climatic Research Unit) the downloadable and widely used gridded temperature anomaly data sets known as HadCRUT and CRUTEM3. Yet even such a high-profile data set, developed by an organization with a good standard of software development [34], contained errors that would have been more quickly identified and rectified had the underlying code been readily available.
In 2009, on examining the available data sets and the description of the algorithm [35], J.G.-C. identified a number of errors (the software he used to check the meteorological database is available upon request). One set of errors was procedural, and involved incorrect computation of historical average temperatures in a number of records in New Zealand and Australia. The Meteorological Office confirmed the errors, showed that they had produced errors of up to 0.2 °C (either warmer or cooler) in the average temperature for Australia and New Zealand in some years before 1900, and issued an update to CRUTEM3. Two other errors occurred in the code that calculates station errors (an estimate of the error in any average temperature reading). Correcting them slightly reduced the station errors, improving the accuracy of the data. So, although these implementation problems did not lead to serious errors in the temperature data sets, they highlight the difficulty of translating a natural-language description (even one with some formulae expressed mathematically) into code.
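To make the kind of calculation at issue concrete, the following is a minimal sketch of how a station's historical monthly averages and the resulting anomalies might be computed. It is not the Meteorological Office's code. The function names, the data layout, the 1961 to 1990 base period and the `min_years` threshold are all assumptions made for illustration.

```python
from statistics import mean


def monthly_normals(records, base_start=1961, base_end=1990, min_years=15):
    """Return {month: baseline mean temperature in deg C} for one station.

    `records` maps (year, month) -> monthly mean temperature in deg C.
    The base period and `min_years` are illustrative assumptions, not
    values taken from the published algorithm description.
    """
    by_month = {m: [] for m in range(1, 13)}
    for (year, month), temp in records.items():
        if base_start <= year <= base_end:
            by_month[month].append(temp)
    # Skip months with too few base-period years to give a stable average.
    return {m: mean(v) for m, v in by_month.items() if len(v) >= min_years}


def anomalies(records, normals):
    """Return {(year, month): temperature minus that month's baseline}."""
    return {
        (year, month): temp - normals[month]
        for (year, month), temp in records.items()
        if month in normals
    }
```

Even in a sketch this short, choices that a prose description can easily leave unstated (whether the end year is inclusive, how many years a month needs before its average is trusted) change the result, which is the sort of ambiguity that published code removes.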
These errors do not in any way reflect badly on the original authors. The code rewriting simply plays the part of peer review, and it is normal to find such errors. Indeed, the discovery of such errors in ‘working’ software is exceedingly common in all computing, even when the software has been in use for a considerable time. This was emphatically demonstrated in a seminal IBM study [36], which showed that fully a third of all the software failures observed took more than 5,000 execution years (execution time is the total time spent executing a program) to occur for the first time.
Darrel C. Ince: Department of Computing, Open University, Walton Hall, Milton Keynes MK7 6AA, UK
Leslie Hatton: School of Computing and Information Systems, Kingston University, Kingston KT1 2EE, UK
John Graham-Cumming: 83 Victoria Street, London SW1H 0HW, UK
D.C.I., L.H. and J.G.-C. contributed to all aspects of this article.
Competing financial interests
The authors declare no competing financial interests.