  • Article

Artificial intelligence reconstructs missing climate information

Abstract

Historical temperature measurements are the basis of global climate datasets like HadCRUT4. This dataset contains many missing values, particularly for periods before the mid-twentieth century, although recent years are also incomplete. Here we demonstrate that artificial intelligence can skilfully fill these observational gaps when combined with numerical climate model data. We show that recently developed image inpainting techniques perform accurate monthly reconstructions via transfer learning using either 20CR (Twentieth-Century Reanalysis) or the CMIP5 (Coupled Model Intercomparison Project Phase 5) experiments. The resulting global annual mean temperature time series exhibit high Pearson correlation coefficients (≥0.9941) and low root mean squared errors (≤0.0547 °C) as compared with the original data. These techniques also provide advantages relative to state-of-the-art kriging interpolation and principal component analysis-based infilling. When applied to HadCRUT4, our method restores a missing spatial pattern of the documented El Niño from July 1877. With respect to the global mean temperature time series, a HadCRUT4 reconstruction by our method points to a cooler nineteenth century, a less apparent hiatus in the twenty-first century, an even warmer 2016 being the warmest year on record and a stronger global trend between 1850 and 2018 relative to previous estimates. We propose image inpainting as an approach to reconstruct missing climate information and thereby reduce uncertainties and biases in climate records.
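
The two skill scores quoted above can be illustrated with a minimal Python sketch. It assumes that two annual global mean anomaly series of equal length are already available as NumPy arrays; the function names and the sample values are hypothetical and are not taken from the published software.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation coefficient between two 1-D series."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xa = x - x.mean()
    ya = y - y.mean()
    return float((xa * ya).sum() / np.sqrt((xa ** 2).sum() * (ya ** 2).sum()))

def rmse(x, y):
    """Root mean squared error between two 1-D series (same units as the input)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(np.sqrt(np.mean((x - y) ** 2)))

# Hypothetical annual global mean anomalies in °C (illustrative values only).
original = np.array([-0.30, -0.25, -0.10, 0.05, 0.20, 0.40])
reconstructed = original + np.array([0.02, -0.01, 0.03, -0.02, 0.01, -0.03])

r = pearson_r(original, reconstructed)
err = rmse(original, reconstructed)
```

With these made-up inputs, `r` is above 0.99 and `err` is about 0.02 °C; neither value relates to the scores reported in the abstract.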


Fig. 1: AI models reconstruct two exemplary monthly show cases with many missing values.
Fig. 2: Evaluation of AI models with different methods for annual global mean temperature reconstructions.
Fig. 3: AI model spatial reconstruction of an observed El Niño with many missing values in HadCRUT4.
Fig. 4: AI model reconstruction of HadCRUT4 for the full time series between 1850 and 2018.

Data availability

A software snapshot, trained AI models (checkpoints), missing value masks and the HadCRUT4 reconstructions by the AI models can be downloaded at https://doi.org/10.5281/zenodo.3766741. Training data from 20CR and CMIP5 cannot be hosted due to copyrights, but are available at National Oceanic and Atmospheric Administration and ESGF (Methods). Contact kadow@dkrz.de for further information. Source Data are provided with this paper.

Code availability

All the code used in this project can be downloaded or cloned at https://github.com/FREVA-CLINT/climatereconstructionAI. This code will be updated and changed over time.

References

  1. Brázdil, R. et al. European climate of the past 500 years: new challenges for historical climatology. Clim. Change 101, 7–40 (2010).

  2. Cubasch, U. & Kadow, C. Global climate change and aspects of regional climate change in the Berlin–Brandenburg Region. Erde 142, 3–20 (2011).

  3. Hartmann, D. L. et al. in Climate Change 2013: The Physical Science Basis (eds Stocker, T.F. et al.) Ch. 2 (IPCC, Cambridge Univ. Press, 2013).

  4. Morice, C. P., Kennedy, J. J., Rayner, N. A. & Jones, P. D. Quantifying uncertainties in global and regional temperature change using an ensemble of observational estimates: the HadCRUT4 dataset. J. Geophys. Res. 117, D08101 (2012).

  5. Vose, R. S. et al. NOAA’s merged land-ocean surface temperature analysis. Bull. Am. Meteorol. Soc. 93, 1677–1685 (2012).

  6. Lenssen, N. et al. Improvements in the GISTEMP uncertainty model. J. Geophys. Res. Atmos. 124, 6307–6326 (2019).

  7. Cowtan, K. & Way, R. G. Coverage bias in the HadCRUT4 temperature series and its impact on recent temperature trends. Q. J. R. Meteorol. Soc. 140, 1935–1944 (2014).

  8. Rayner, N. A. et al. Global analyses of sea surface temperature, sea ice, and night marine air temperature since the late nineteenth century. J. Geophys. Res. 108, 4407 (2003).

  9. Rohde, R. et al. A new estimate of the average Earth surface land temperature spanning 1753 to 2011. Geoinfor. Geostat. Overview 1, https://doi.org/10.4172/2327-4581.1000101 (2013).

  10. Beckers, J. & Rixen, M. EOF calculations and data filling from incomplete oceanographic data sets. J. Atmos. Oceanic Technol. 20, 1839–1856 (2003).

  11. Wang, K. & Clow, G. D. Reconstructed global monthly land air temperature dataset (1880–2017). Geosci. Data J. https://doi.org/10.1002/gdj3.84 (2019).

  12. Smith, T. M., Reynolds, R. W., Livezey, R. E. & Stokes, D. C. Reconstruction of historical sea surface temperatures using empirical orthogonal functions. J. Clim. 9, 1403–1420 (1996).

  13. Kaplan, A., Kushnir, Y., Cane, M. A. & Blumenthal, M. B. Reduced space optimal analysis for historical data sets: 136 years of Atlantic sea surface temperatures. J. Geophys. Res. Oceans 102, 27835–27860 (1997).

  14. Elken, J., Zujev, M., She, J. & Lagemaa, P. Reconstruction of large-scale sea surface temperature and salinity fields using sub-regional EOF patterns from models. Front. Earth Sci. 7, 232 (2019).

  15. Reichstein, M. et al. Deep learning and process understanding for data-driven Earth system science. Nature 566, 195–204 (2019).

  16. Monteleoni, C., Schmidt, G. A. & McQuade, S. Climate informatics: accelerating discovering in climate science with machine learning. Comput. Sci. Eng. 15, 32–40 (2013).

  17. Barnes, E. A., Hurrell, J. W., Ebert-Uphoff, I., Anderson, C. & Anderson, D. Viewing forced climate patterns through an AI lens. Geophys. Res. Lett. 46, 13389–13398 (2019).

  18. Racah, E. et al. ExtremeWeather: a large-scale climate dataset for semi-supervised detection, localization, and understanding of extreme weather events. Adv. Neural Inform. Process. Syst. 30, 3405–3416 (2017).

  19. Kadow, C., Illing, S., Kröner, I., Ulbrich, U. & Cubasch, U. Decadal climate predictions improved by ocean ensemble dispersion filtering. J. Adv. Model. Earth Syst. 9, 1138–1149 (2017).

  20. Irrgang, C., Saynisch, J. & Thomas, M. Estimating ocean heat content from tidal magnetic satellite observations. Sci. Rep. 9, 7893 (2019).

  21. Bertalmio, M., Sapiro, G., Caselles, V. & Ballester, C. Image inpainting. In Proc. ACM Conf. Comp. Graphics (SIGGRAPH) (eds Brown, J. R. & Akeley, K.) 417–424 (ACM/Addison-Wesley, 2000).

  22. Shibata, S., Iiyama, M., Hashimoto, A. & Minoh, M. Restoration of sea surface temperature satellite images using a partially occluded training set. In 24th International Conference on Pattern Recognition (ICPR), Beijing (IEEE Computer Society) 2771–2776 (IEEE, 2018).

  23. Dong, J. et al. Inpainting of remote sensing SST images with deep convolutional generative adversarial network. IEEE Geosci. Remote Sens. Lett. 16, 173–177 (2019).

  24. Liu, G. et al. in Computer Vision—ECCV 2018 Lecture Notes in Computer Science, Vol. 11215 (eds Ferrari, V. et al.) 19–35 (Springer, 2018).

  25. Barnes, C., Shechtman, E., Finkelstein, A. & Goldman, D. B. Patchmatch: a randomized correspondence algorithm for structural image editing. ACM Trans. Graph. 28, 24 (2009).

  26. Iizuka, S., Simo-Serra, E. & Ishikawa, H. Globally and locally consistent image completion. ACM Trans. Graph. 36, 107 (2017).

  27. Yu, J. et al. Generative Image Inpainting with Contextual Attention. In IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA 5505–5514 (IEEE/CVF, 2018).

  28. Perez, P., Gangnet, M. & Blake, A. Poisson image editing. ACM Trans. Graph. 22, 313–318 (2003).

  29. Elharrouss, O., Almaadeed, N., Al-Maadeed, S. & Akbari, Y. Image inpainting: a review. Neural Process. Lett. 51, 2007–2028 (2019).

  30. Compo, G. P. et al. The Twentieth Century Reanalysis project. Q. J. R. Meteorol. Soc. 137, 1–28 (2011).

  31. Taylor, K. E., Stouffer, R. J. & Meehl, G. A. An overview of CMIP5 and the experiment design. Bull. Am. Meteorol. Soc. 93, 485–498 (2012).

  32. Folland, C. K., Boucher, O., Colman, A. & Parker, D. E. Causes of irregularities in trends of global mean surface temperature since the late 19th century. Sci. Adv. 4, eaao5297 (2018).

  33. Kiladis, G. N. & Diaz, H. F. An analysis of the 1877–78 ENSO episode and comparison with 1982–83. Mon. Weather Rev. 114, 1035–1047 (1986).

  34. Aceituno, P. et al. The 1877–1878 El Niño episode: associated impacts in South America. Clim. Change 92, 389–416 (2009).

  35. Knutson, T. R., Zhang, R. & Horowitz, L. W. Prospects for a prolonged slowdown in global warming in the early 21st century. Nat. Commun. 7, 13676 (2016).

  36. Kosaka, Y. & Xie, S. P. Recent global-warming hiatus tied to equatorial Pacific surface cooling. Nature 501, 403–407 (2013).

  37. Saffioti, C., Fischer, E. M. & Knutti, R. Contributions of atmospheric circulation variability and data coverage bias to the warming hiatus. Geophys. Res. Lett. 42, 2385–2391 (2015).

  38. Marotzke, J. & Forster, P. M. Forcing, feedback and internal variability in global temperature trends. Nature 517, 565–570 (2015).

  39. Yan, Z. X., Li, M., Zuo, W. & Shan, S. in Computer Vision—ECCV 2018 Lecture Notes in Computer Science, Vol. 11215 (eds Ferrari, V. et al.) 3–19 (Springer, 2018).

  40. Kennedy, J. J., Rayner, N. A., Atkinson, C. P. & Killick, R. E. An ensemble data set of sea-surface temperature change from 1850: the Met Office Hadley Centre HadSST.4.0.0.0 data set. J. Geophys. Res. Atmos. 124, 7719–7763 (2019).

  41. Eyring, V. et al. Overview of the Coupled Model Intercomparison Project Phase 6 (CMIP6) experimental design and organization. Geosci. Model Dev. 9, 1937–1958 (2016).

  42. Dufresne, J. L. et al. Climate change projections using the IPSL-CM5 Earth System Model: from CMIP3 to CMIP5. Clim. Dyn. 40, 2123–2165 (2013).

  43. Illing, S., Kadow, C., Oliver, K. & Cubasch, U. MurCSS: a tool for standardized evaluation of decadal hindcast systems. J. Open Res. Softw. 2, e24 (2014).

  44. Lewis, S. C. & Karoly, D. J. Assessment of forced responses of the Australian Community Climate and Earth System Simulator (ACCESS) 1.3 in CMIP5 historical detection and attribution experiments. Aust. Meteorol. Oceanogr. J. 64, 87–101 (2014).

  45. Collier, M. & Uhe, P. CMIP5 Datasets from the ACCESS1.0 and ACCESS1.3 Coupled Climate Models CAWCR Technical Report 059 (CAWCR, 2012).

  46. Xin, X., Wu, T. & Zhang, J. Introduction of CMIP5 experiments carried out with the climate system models of Beijing Climate Center. Adv. Clim. Change Res. 4, 41–49 (2013).

  47. Ji, D., Wang, L., Feng, J., Wu, Q. & Cheng, H. BNU-ESM Model Output Prepared for CMIP5 rcp45 Experiment, Served by ESGF (WDCC at DKRZ, 2015); https://doi.org/10.1594/WDCC/CMIP5.BUBUr4

  48. Canadian Centre for Climate Modelling and Analysis (CCCma). CanESM2 Model Output Prepared for CMIP5 Historical, Served by ESGF (WDCC at DKRZ, 2015); https://doi.org/10.1594/WDCC/CMIP5.CCE2hi

  49. Scoccimarro, E. et al. Effects of tropical cyclones on ocean heat transport in a high resolution coupled general circulation model. J. Clim. 24, 4368–4384 (2011).

  50. Centre National de Recherches Météorologiques and Centre Européen de Recherche et Formation Avancée en Calcul Scientifique WCRP CMIP5: The CNRM-CERFACS Team CNRM-CM5-2 Model Output for the Historical Experiment (Centre for Environmental Data Analysis, 2017); http://catalogue.ceda.ac.uk/uuid/6ea812758cf14de8a5577406e896c3f9

  51. Rotstayn, L. et al. Improved simulation of Australian climate and ENSO-related climate variability in a GCM with an interactive aerosol treatment. Int. J. Climatol. 30, 1067–1088 (2010).

  52. Hazeleger, W. et al. EC-Earth. Bull. Am. Meteorol. Soc. 91, 1357–1364 (2010).

  53. Li, L. et al. The flexible global ocean–atmosphere–land system model, Grid-point Version 2: FGOALS-g2. Adv. Atmos. Sci. 30, 543–560 (2013).

  54. Qiao, F. et al. Development and evaluation of an Earth System Model with surface gravity waves. J. Geophys. Res. Oceans 118, 4514–4524 (2013).

  55. Miller, R. L. et al. CMIP5 historical simulations (1850–2012) with GISS ModelE2. J. Adv. Model. Earth Syst. 6, 441–477 (2014).

  56. Volodin, E. M., Dianskii, N. A. & Gusev, A. V. Simulating present-day climate with the INMCM4.0 coupled model of the atmospheric and oceanic general circulations. Atmos. Ocean. Phys. 46, 414–431 (2010).

  57. Watanabe, M. et al. Improved climate simulation by MIROC5: mean states, variability, and climate sensitivity. J. Clim. 23, 6312–6335 (2010).

  58. Giorgetta, M. et al. CMIP5 Simulations of the Max Planck Institute for Meteorology (MPI-M) based on the MPI-ESM-LR Model: the rcp45 Experiment, Served by ESGF (WDCC at DKRZ, 2012); https://doi.org/10.1594/WDCC/CMIP5.MXELr4

  59. Meteorological Research Institute (MRI) MRI-CGCM3 Model Output Prepared for CMIP5, Served by ESGF (WDCC at DKRZ, 2012); http://cera-www.dkrz.de/WDCC/CMIP5/Compact.jsp?acronym=MRMC

  60. Iversen, T. et al. The Norwegian Earth System Model, NorESM1-M—Part 2: climate response and scenario projections. Geosci. Model Dev. 6, 389–415 (2013).

  61. Gent, P. R. et al. The Community Climate System Model version 4. J. Clim. 24, 4973–4991 (2011).

Acknowledgements

We thank the HPC-Service of ZEDAT, Freie Universität Berlin and the German Climate Computing Center (DKRZ) for the computation resources; the Climatic Research Unit (CRU) of the University of East Anglia (UEA) and the UK Met Office for providing the HadCRUT4 and HadSST4 datasets; the Earth System Grid Federation (ESGF) for providing the CMIP5 experiments; J. Marotzke (MPI-M), M. Schuster (FUB), E. Barnes (CSU) and K. Buscher (UKM) for discussions; N. Inoue (University of Tokyo) for providing the applicable code for image inpainting; A. Richling (FUB) for reproducing the Intergovernmental Panel on Climate Change trend, uncertainty and confidence values; and K. Cowtan, R. Way and the University of York for providing not just the reconstructed HadCRUT4 data (used in Fig. 3b), but also software to apply the kriging scheme (used in Fig. 2). Support for the 20CR Project dataset is provided by the US Department of Energy, Office of Science Innovative and Novel Computational Impact on Theory and Experiment (DOE INCITE) programme, by the Office of Biological and Environmental Research (BER) and by the National Oceanic and Atmospheric Administration Climate Program Office.

Author information

Contributions

C.K. initiated the study design, coded the AI technology for climate research, performed the analysis and drafted the paper. D.M.H. supervised the NVIDIA AI technology and U.U. supervised the climate research results. All the authors discussed the results and edited the manuscript.

Corresponding author

Correspondence to Christopher Kadow.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Peer review information Primary Handling Editors: Stefan Lachowycz; Heike Langenberg.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Scheme for the study setup including training set.

Input for the AI models, training of the models, and their output. HadCRUT4 data are shown in black, CMIP data or AI in red, and 20CR data or AI in blue. Numbers at the bottom of the boxes give the number of ‘images’ / months / time steps that are used as input or produced as output (see Methods).

Extended Data Fig. 2 Detailed grid space evaluation of 20CR reconstruction.

Correlation (left) and root mean squared error in centigrade (right) comparing the 20CR 56th member reconstructed by the 20crAI model with the original 20CR 56th member. Rows 1 and 2 compare all grid points in an annual (row 1) and monthly (row 2) analysis. Rows 3 and 4 show the respective analyses for the reconstructed grid points only, that is, without (w/o) the grid points that were available during reconstruction. Grey grid points indicate points that exist for the whole time series.

Extended Data Fig. 3 Detailed grid space evaluation of CMIP reconstruction.

Correlation (left) and root mean squared error in centigrade (right) comparing the CMIP 145th member reconstructed by the cmipAI model with the original CMIP 145th member. Rows 1 and 2 compare all grid points in an annual (row 1) and monthly (row 2) analysis. Rows 3 and 4 show the respective analyses for the reconstructed grid points only, that is, without (w/o) the grid points that were available during reconstruction. Grey grid points indicate points that exist for the whole time series.

Extended Data Fig. 4 Time-series analysis and evaluation of AI model reconstruction.

As Fig. 2, but for the annual global mean temperature anomaly reconstructions in centigrade of the 20CR (a, b) / CMIP (c, d) test suite: monthly grid reconstructions of the held-out 56th / 145th member using the HadCRUT4 missing value mask (1870–2005). The original held-out member is shown in black, and the original but masked held-out member in dashed black to show the effect of the missing values. The reconstructed grid time series of the 20crAI / cmipAI are shown in blue / red. Tables show the anomaly correlation (r) and root mean squared error (rmse) compared with the original dataset over four selected time ranges (see also Fig. 2).

Extended Data Fig. 5 Spatial evaluation of AI models over time.

Field correlation of the annual (a) and monthly (b) mean reconstruction of the 20CR 56th / CMIP 145th member by the 20crAI / cmipAI models with the original 20CR 56th / CMIP 145th member in blue / red. The solid line compares the full grid space, while the dashed line shows the respective analysis for the reconstructed grid points only, without (w/o) the grid points that were available during reconstruction.

Extended Data Fig. 6 Evaluation on reconstructed grid points only.

Annual global mean temperature anomaly reconstruction in centigrade of 20CR (a) and CMIP (b) from monthly grid reconstructions, using only the reconstructed missing values of the extra 56th / 145th member under the HadCRUT4 missing value mask between 1870 and 2005. The extra member without (w/o) existing grid points is shown in black, and the original full left-out member in dashed black to show the effect of the missing values. The reconstructed grid time series of the 20crAI / cmipAI models without (w/o) existing grid points are shown in blue / red.

Extended Data Fig. 7 Reconstruction analysis of additional Hadley Centre products.

Annual global mean temperature anomaly time series between 1850 and 2018. (a) HadCRUT4 original (masked) 100-member data in black (median, 95th and 5th percentiles). The HadCRUT4 reconstruction of the 20crAI / cmipAI models in blue / red (median, 95th and 5th percentiles). (b) HadCRUT4 original (masked) data in black, HadSST4 original (masked) data in pink, HadMIX original (masked) data in orange. The originals are dashed; the reconstructions are solid lines. HadMIX uses every grid point available in HadSST4; where HadSST4 has no value (usually over land), the HadCRUT4 grid point is used.
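
The HadMIX rule described in this caption amounts to an element-wise fallback. The sketch below is a minimal illustration in Python, assuming that both products share the same grid and that missing grid points are stored as NaN; the function name `merge_hadmix` and the sample values are hypothetical and do not come from the published software.

```python
import numpy as np

def merge_hadmix(hadsst4, hadcrut4):
    """Use HadSST4 where it has a value, otherwise fall back to HadCRUT4."""
    hadsst4 = np.asarray(hadsst4, dtype=float)
    hadcrut4 = np.asarray(hadcrut4, dtype=float)
    # Assumption: NaN marks a missing grid point in both inputs.
    return np.where(np.isnan(hadsst4), hadcrut4, hadsst4)

# Hypothetical 2x3 anomaly grids in °C (illustrative values only).
sst = np.array([[0.1, np.nan, 0.3], [np.nan, np.nan, -0.2]])
crut = np.array([[0.0, 0.5, np.nan], [np.nan, 0.7, -0.1]])
mix = merge_hadmix(sst, crut)
```

A grid point that is missing in both inputs stays missing in the merged field, which is the kind of gap the reconstruction would then need to fill.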

Extended Data Fig. 8 HadCRUT4 trends of AI models in grid space.

Trends in surface temperature from Fig. 4 for 1901–2012. White areas indicate incomplete or missing data. Trends have been calculated only for those grid boxes with more than 70% complete records and more than 20% data availability in the first and last deciles of the period. Black plus signs (+) indicate grid boxes where trends are significant (that is, a trend of zero lies outside the 90% confidence interval). The graphics are constructed for comparison with IPCC AR5 Chapter 2 Figure 2.21. Here HadCRUT4 version 4.6.0.0 is used, whereas the IPCC report used version 4.1.1.
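
The coverage rule in this caption can be sketched for a single grid box as follows. This is a minimal Python illustration under stated assumptions: missing values are NaN, a decile is taken as one tenth of the series length rounded down, and the trend is a plain ordinary least-squares slope. The function names are hypothetical, and the significance test against the 90% confidence interval is not reproduced here.

```python
import numpy as np

def trend_eligible(series, min_complete=0.70, min_edge=0.20):
    """Return True if a grid-box series meets the coverage rule."""
    valid = ~np.isnan(np.asarray(series, dtype=float))
    # Assumption: a decile is a tenth of the period, rounded down.
    decile = max(1, valid.size // 10)
    return bool(
        valid.mean() > min_complete
        and valid[:decile].mean() > min_edge
        and valid[-decile:].mean() > min_edge
    )

def linear_trend(series):
    """Ordinary least-squares slope per time step, ignoring missing values."""
    y = np.asarray(series, dtype=float)
    t = np.arange(y.size)
    valid = ~np.isnan(y)
    slope, _intercept = np.polyfit(t[valid], y[valid], 1)
    return float(slope)

# Hypothetical 112-step series (for example annual values for 1901-2012)
# that warms by 0.01 per step.
complete = 0.01 * np.arange(112)
gappy_start = complete.copy()
gappy_start[:11] = np.nan  # first decile entirely missing

ok_complete = trend_eligible(complete)
ok_gappy = trend_eligible(gappy_start)
slope = linear_trend(complete)
```

In this example the complete series passes the rule, while the series with an empty first decile is rejected even though about 90% of its values are present.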

Extended Data Fig. 9 Spatial reconstruction of an observed El Niño.

As Fig. 3 but with additional datasets. Recently, the HadSST4 (b) dataset was released as an update to HadSST3 (the ocean component of HadCRUT4 (a)). The kriging analysis of Cowtan & Way (c) is set next to Berkeley Earth (d). In July 1877 HadSST4 has three new grid points, which show very high (warm) temperature anomalies in a region (further south than usual) where the PCA reconstructions of 20crPCA (e) and cmipPCA (f) show a weak signal. The neural network reconstructions of 20crAI (g) and cmipAI (h) show a strong signal of an El Niño-like temperature pattern.

Extended Data Fig. 10 CMIP numerical models to train the neural network.

CMIP5 Historical monthly experiments between 1850 and 2005 applied to train the cmipAI. Data from refs. 42,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61.

Supplementary information

Supplementary Information

Supplementary Figs. 1–5.

Source data

Source Data Fig. 1

Temperature anomaly maps in NetCDF format.

Source Data Fig. 2

Temperature anomaly time series in NetCDF format.

Source Data Fig. 3

Temperature anomaly maps in NetCDF format.

Source Data Fig. 4

Temperature anomaly time series in NetCDF format.

Source Data Extended Data Fig. 2

Statistical Source Data on maps in NetCDF format.

Source Data Extended Data Fig. 3

Statistical Source Data on maps in NetCDF format.

Source Data Extended Data Fig. 4

Temperature anomaly time series in NetCDF format.

Source Data Extended Data Fig. 5

Statistical measure time series in NetCDF format.

Source Data Extended Data Fig. 6

Temperature anomaly time series in NetCDF format.

Source Data Extended Data Fig. 7

Temperature anomaly time series in NetCDF format.

Source Data Extended Data Fig. 8

Temperature trend maps in NetCDF format.

Source Data Extended Data Fig. 9

Temperature anomaly maps in NetCDF format.

About this article

Cite this article

Kadow, C., Hall, D.M. & Ulbrich, U. Artificial intelligence reconstructs missing climate information. Nat. Geosci. 13, 408–413 (2020). https://doi.org/10.1038/s41561-020-0582-5

  • DOI: https://doi.org/10.1038/s41561-020-0582-5
