Interpretable weather forecasting for worldwide stations with a unified deep model


Automatic weather stations are essential for fine-grained weather forecasting; they can be built almost anywhere around the world and are much cheaper than radars and satellites. However, these scattered stations only provide partial observations governed by the continuous space–time global weather system, thus introducing thorny challenges to worldwide forecasting. Here we present the Corrformer model with a novel multi-correlation mechanism, which unifies spatial cross-correlation and temporal auto-correlation into a learned multi-scale tree structure to capture worldwide spatiotemporal correlations. Corrformer reduces the canonical double quadratic complexity of spatiotemporal modelling to linear in spatial modelling and log-linear in temporal modelling, achieving collaborative forecasts for tens of thousands of stations within a unified deep model. Our model can generate interpretable predictions based on inferred propagation directions of weather processes, facilitating a fully data-driven artificial intelligence paradigm for discovering insights for meteorological science. Corrformer yields state-of-the-art forecasts on global, regional and citywide datasets with high confidence and provided skilful weather services for the 2022 Winter Olympics.
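The two correlation types named in the abstract have classical FFT-based forms, which is one way to see why a log-linear cost in series length is achievable for temporal modelling. The following is a minimal sketch under that assumption, using NumPy. It is not Corrformer's learned multi-correlation mechanism, and the function names `autocorrelation_fft` and `propagation_lag` are hypothetical, chosen only for illustration.

```python
import numpy as np


def autocorrelation_fft(x):
    """Normalised circular auto-correlation of a 1-D series at every lag.

    By the Wiener-Khinchin theorem the auto-correlation is the inverse
    transform of the power spectrum, so all L lags cost O(L log L)
    instead of the O(L^2) of a direct sum.
    """
    x = np.asarray(x, dtype=float)
    x = x - x.mean()                        # remove the mean before correlating
    spectrum = np.fft.rfft(x)
    power = spectrum * np.conj(spectrum)    # power spectrum |X(f)|^2
    acf = np.fft.irfft(power, n=len(x))
    return acf / acf[0]                     # lag 0 is normalised to 1


def propagation_lag(x, y):
    """Circular lag k that maximises sum_t x[t] * y[t + k].

    Illustrative assumption: if station y observes a delayed copy of what
    station x observed, the maximising lag estimates that delay in steps.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x = x - x.mean()
    y = y - y.mean()
    # Cross-correlation theorem: the transform of the cross-correlation
    # is conj(X) * Y.
    cross = np.fft.irfft(np.conj(np.fft.rfft(x)) * np.fft.rfft(y), n=len(x))
    return int(np.argmax(cross))
```

In this sketch a returned lag `k` means that `y` trails `x` by `k` steps. Because the correlation is circular, lags above half the series length correspond to `y` leading `x`.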

Fig. 1: Collaborative forecasting of worldwide stations.
Fig. 2: Overall design of the multi-correlation mechanism for spatiotemporal correlation modelling.
Fig. 3: Station distributions.
Fig. 4: Case study.
Fig. 5: Computational details.

Data availability

The Global datasets are available from the National Oceanic and Atmospheric Administration (NOAA); processed versions have been deposited in our GitHub repository. A partial set of the Regional and Olympics datasets can be obtained from the China Meteorological Administration (CMA) website after registration as a real-name user. The complete set was used under license for the current study and is available from the authors upon reasonable request and with permission from the CMA. The map data are made with Natural Earth.

Code availability

The code is available on Code Ocean (ref. 41) and on GitHub.


  1. Palmer, T. Climate forecasting: build high-resolution global climate models. Nature 515, 338–339 (2014).

  2. Phillips, N. A. The general circulation of the atmosphere: a numerical experiment. Q. J. R. Meteorolog. Soc. 82, 123–164 (1956).

  3. Jacox, M. G. et al. Global seasonal forecasts of marine heatwaves. Nature 604, 486–490 (2022).

  4. Gao, Z. et al. Earthformer: exploring space-time transformers for earth system forecasting. In 36th Conference on Neural Information Processing Systems (NeurIPS, 2022).

  5. Chatfield, C. The Analysis of Time Series: An Introduction (Chapman and Hall/CRC, 1981).

  6. Papoulis, A. & Saunders, H. Probability, Random Variables and Stochastic Processes (McGraw-Hill Education, 1989).

  7. Krishnamurti, T. N. & Bounoua, L. An Introduction to Numerical Weather Prediction Techniques (CRC Press, 1996).

  8. McGuffie, K. & Henderson-Sellers, A. The Climate Modelling Primer (John Wiley and Sons, 2014).

  9. Reichstein, M. et al. Deep learning and process understanding for data-driven earth system science. Nature 566, 195–204 (2019).

  10. Wu, H., Xu, J., Wang, J. & Long, M. Autoformer: decomposition transformers with auto-correlation for long-term series forecasting. In 35th Conference on Neural Information Processing Systems Vol. 34, 22419–22430 (NeurIPS, 2021).

  11. Robert, C., William, C. & Irma, T. STL: a seasonal-trend decomposition procedure based on loess. J. Off. Stat. 6, 3–73 (1990).

  12. Berg, M. Applied General Statistics (Prentice-Hall, 2016).

  13. Dietrich, C. F. Uncertainty, Calibration and Probability: The Statistics of Scientific and Industrial Measurement (Routledge, 1973).

  14. Park, K. I. Fundamentals of Probability and Stochastic Processes with Applications to Communications (Springer, 2017).

  15. Chou, Y.-L. Statistical Analysis, with Business and Economic Applications (Holt, Rinehart and Winston, 1975).

  16. Sutton, G. Weather forecasting: the future outlook. Nature 176, 993–996 (1955).

  17. Nielsen, F. Introduction to HPC with MPI for Data Science (Springer, 2016).

  18. Rokach, L. & Maimon, O. in Data Mining and Knowledge Discovery Handbook Ch. 15 (Springer, 2005).

  19. Wiener, N. Generalized harmonic analysis. Acta Math. 55, 117–258 (1930).

  20. van den Oord, A. et al. WaveNet: a generative model for raw audio. In Proc. 9th ISCA Workshop on Speech Synthesis Workshop (ISCA, 2016).

  21. Hersbach, H. et al. The ERA5 global reanalysis. Q. J. R. Meteorolog. Soc. 146, 1999–2049 (2020).

  22. Dee, D. P. et al. The ERA-Interim reanalysis: configuration and performance of the data assimilation system. Q. J. R. Meteorolog. Soc. 137, 553–597 (2011).

  23. Global Forecast System (NOAA, accessed 1 July 2022).

  24. Bauer, P., Thorpe, A. & Brunet, G. The quiet revolution of numerical weather prediction. Nature 525, 47–55 (2015).

  25. Kendall, M. Time-Series 2nd edn (Griffin, 1976).

  26. Hyndman, R. J. & Athanasopoulos, G. Forecasting: Principles and Practice (OTexts, 2018).

  27. Ke, G. et al. LightGBM: a highly efficient gradient boosting decision tree. In 31st Conference on Neural Information Processing Systems (NeurIPS, 2017).

  28. Taylor, S. J. & Letham, B. Forecasting at scale. Am. Stat. 72, 37–45 (2018).

  29. Salinas, D., Flunkert, V., Gasthaus, J. & Januschowski, T. DeepAR: probabilistic forecasting with autoregressive recurrent networks. Int. J. Forecast. 36, 1181–1191 (2020).

  30. Li, S. et al. Enhancing the locality and breaking the memory bottleneck of transformer on time series forecasting. In 33rd Conference on Neural Information Processing Systems (NeurIPS, 2019).

  31. Oreshkin, B. N., Carpov, D., Chapados, N. & Bengio, Y. N-BEATS: neural basis expansion analysis for interpretable time series forecasting. In International Conference on Learning Representations (2019).

  32. Beltagy, I., Peters, M. E. & Cohan, A. Longformer: the long-document transformer. Preprint at (2020).

  33. Lee-Thorp, J., Ainslie, J., Eckstein, I. & Ontanon, S. FNet: mixing tokens with Fourier transforms. In Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 4296–4313 (Association for Computational Linguistics, 2022).

  34. Zhou, H. et al. Informer: beyond efficient transformer for long sequence time-series forecasting. In Proceedings of the AAAI Conference on Artificial Intelligence Vol. 35, 11106–11115 (AAAI, 2021).

  35. Liu, S. et al. Pyraformer: low-complexity pyramidal attention for long-range time series modeling and forecasting. In International Conference on Learning Representations (2022).

  36. Zhou, T. et al. FEDformer: frequency enhanced decomposed transformer for long-term series forecasting. In International Conference on Machine Learning (PMLR, 2022).

  37. Woo, G., Liu, C., Sahoo, D., Kumar, A. & Hoi, S. ETSformer: exponential smoothing transformers for time-series forecasting. Preprint at (2022).

  38. Challu, C. et al. N-Hits: neural hierarchical interpolation for time series forecasting. In Proceedings of the AAAI Conference on Artificial Intelligence Vol. 37 (AAAI, 2023).

  39. Sun, F.-K. & Boning, D. S. FreDo: frequency domain-based long-term time series forecasting. Preprint at (2022).

  40. Cao, D. et al. Spectral temporal graph neural network for multivariate time-series forecasting. In 33rd Conference on Advances in Neural Information Processing Systems Vol. 33, 17766–17778 (NeurIPS, 2020).

  41. Wu, H. et al. Code of interpretable weather forecasting for worldwide stations with a unified deep model. CodeOcean (2023).


Acknowledgements

All research described in this paper was funded by the National Key Research and Development Project through grant 2021YFC3000905 (M.L.) and the National Natural Science Foundation of China through grants 62021002 (J.W.) and 62022050 (M.L.). We thank our colleagues from the CMA, including B. Bi, K. Dai and Y. Gong, for their suggestions and support for the paper.

Author information

Authors and Affiliations



Contributions

M.L. conceived and designed the project. H.W. and M.L. developed the Corrformer model and wrote the paper. H.W. and H.Z. implemented all of the methods, processed the data, conducted the experiments, analysed the results and validated the Winter Olympics weather cases. M.L. supervised the work, investigated the methodology and accepted responsibility for the overall integrity of the paper. J.W. revised and approved the paper and provided the research environment and funding support.

Corresponding authors

Correspondence to Mingsheng Long or Jianmin Wang.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Machine Intelligence thanks Boris Oreshkin, Danielle Robinson and Tobias Finn for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Forecasting of global temperature.

Forecasting of global temperature from 2020-08-19 19:00:00 to 2020-08-20 19:00:00. A brighter pixel indicates a higher temperature. The changing regions are highlighted with red dotted boxes for readability. Corrformer generates more accurate temperature predictions for each station and also captures temperature changes along the longitude over time.

Extended Data Fig. 2 Forecasting of global wind.

Forecasting of global wind from 2020-10-24 19:00:00 to 2020-10-25 19:00:00. A brighter pixel indicates a stronger wind. The changing regions are highlighted with red dotted boxes for readability. Compared with the baselines, Corrformer predicts both mean and extreme values more accurately.

Extended Data Fig. 3 Single station case studies.

We plot the predictions of different models for single stations. Corrformer surpasses the baselines in modelling both seasonal patterns and peak values. In particular, for Olympics Wind, Corrformer captures the weakening phase of the wind speed well, which is useful for scheduling competitions between periods of strong wind.

Extended Data Fig. 4 Model efficiency analysis.

We fix the batch size to 1 and the number of model channels to 512, then record how GPU memory and running time change as the number of stations increases. The running time is averaged over 1,000 iterations. Corrformer exhibits linear complexity in both memory and running time with respect to the number of stations.
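A measurement protocol of this kind can be sketched as follows, assuming a generic callable in place of a model step. The helpers `mean_runtime` and `scaling_exponent` are hypothetical names introduced for illustration, the sketch times wall-clock work on the host only and does not measure GPU memory, and it is not the authors' benchmarking code.

```python
import time

import numpy as np


def mean_runtime(step, iterations=1000):
    """Average wall-clock seconds per call of `step` over `iterations` calls."""
    start = time.perf_counter()
    for _ in range(iterations):
        step()
    return (time.perf_counter() - start) / iterations


def scaling_exponent(sizes, costs):
    """Slope of log(cost) against log(size), fitted by least squares.

    A slope close to 1 is consistent with linear scaling and a slope
    close to 2 with quadratic scaling.
    """
    slope, _intercept = np.polyfit(np.log(sizes), np.log(costs), deg=1)
    return float(slope)
```

For example, one could call `mean_runtime` once per station count and pass the resulting pairs to `scaling_exponent`. A large fixed overhead pulls the fitted slope below 1 at small sizes, so the fit is more informative over the larger station counts.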

Extended Data Fig. 5 Model adaptability analysis.

Corrformer is trained on roughly 10% of the stations, that is, 350 stations for the Global dataset and 3,404 stations for the Regional dataset, and then evaluated on the larger dataset as new stations are progressively added. Corrformer adapts to the newly added stations seamlessly, with no need for retraining.

Supplementary information

Supplementary Information

Supplementary Figs. 1–4, Tables 1–13, ablations and related work.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Wu, H., Zhou, H., Long, M. et al. Interpretable weather forecasting for worldwide stations with a unified deep model. Nat Mach Intell 5, 602–611 (2023).
