Over the past few decades, weather prediction has grown steadily more accurate. Every ten years or so, our ability to forecast with a given accuracy extends by another day; today's four-day forecasts are as accurate as three-day forecasts were a decade ago. The improvement is the result of better technology for numerical modelling and deeper knowledge of atmospheric physics, especially of processes taking place below the resolution of numerical models: clouds, radiation and drag due to mountains or other small-scale surface features. With modern satellite data, we can also estimate atmospheric initial conditions more precisely.

Equally responsible for this improvement was a radical shift in modelling perspective undertaken some 25 years ago. Atmospheric processes mostly follow classical physics and deterministic laws, so it is natural to think that the best forecasts should come from using all available computing resources and the most accurate initial conditions to simulate the physics as precisely as possible, yielding a single ‘best guess’ prediction. Yet weather modellers have come to make more accurate and useful predictions by turning away from such heroic best guesses. An alternative approach, known as ensemble forecasting, instead runs a host of cruder simulations that explore the space of possible outcomes more effectively.
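
To make the idea concrete, here is a minimal sketch in Python. The toy system (the chaotic logistic map) and all the numbers are illustrative assumptions, not anything from an operational forecasting system: rather than trusting one run from the best-guess initial state, we run many copies from slightly perturbed states and report the distribution of outcomes.

```python
import numpy as np

def step(x):
    # Logistic map at r = 4: a standard toy chaotic system.
    return 4.0 * x * (1.0 - x)

rng = np.random.default_rng(0)
x0 = 0.3                                              # best-guess initial condition
members = np.clip(x0 + rng.normal(0.0, 1e-4, size=50), 0.0, 1.0)

for _ in range(25):
    members = step(members)                           # advance all 50 members

best_guess = x0
for _ in range(25):
    best_guess = step(best_guess)                     # single deterministic run

print(f"single best-guess run:    {best_guess:.3f}")
print(f"ensemble mean and spread: {members.mean():.3f} +/- {members.std():.3f}")
```

After a couple of dozen iterations the single run is just one arbitrary sample from a wide range of possibilities; the ensemble makes that range visible.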

The superior predictive performance of ensemble forecasting stems from deep insights into the nature of prediction in inherently chaotic systems, of which weather is a classic example. The method respects the irreducible uncertainty of weather dynamics in a way earlier forecasting methods did not. In a recent personal history of the development of ensemble forecasting, Oxford physicist Tim Palmer notes (preprint at https://arxiv.org/abs/1803.06940) that this shift in philosophy carries other advantages besides accuracy — in particular, it makes the communication of uncertainty more natural.

In his famous 1963 paper introducing a low-dimensional model of atmospheric convection, Edward Lorenz first indicated that the chaotic nature of the atmosphere would limit deterministic prediction to fairly short times. Based on numerical experiments with weather models, a more quantitative view later emerged: the atmosphere has a ‘limit of deterministic predictability’ of about ten days, a result that encouraged the formation of European and other national prediction centres in the 1970s.
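
Lorenz's system is simple enough to reproduce in a few lines. The sketch below uses the standard parameter values and a crude forward-Euler integrator chosen for brevity (Lorenz himself used a more careful scheme); it shows two trajectories started a distance of 1e-8 apart ending up in completely different states.

```python
import numpy as np

def lorenz_step(state, dt=0.01, sigma=10.0, rho=28.0, beta=8.0/3.0):
    # One forward-Euler step of the Lorenz 1963 convection model.
    x, y, z = state
    return state + dt * np.array([sigma * (y - x),
                                  x * (rho - z) - y,
                                  x * y - beta * z])

a = np.array([1.0, 1.0, 1.0])
b = a + np.array([1e-8, 0.0, 0.0])   # almost identical initial condition

for t in range(3000):                 # 30 model time units
    a, b = lorenz_step(a), lorenz_step(b)
    if t % 500 == 0:
        print(f"t={t * 0.01:5.1f}  separation={np.linalg.norm(a - b):.2e}")
```

The separation grows roughly exponentially until it saturates at the size of the attractor, at which point the two forecasts are no more alike than two random weather states.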

But for times within the ten-day frame, why use ensemble forecasts rather than a single deterministic simulation? The reasons turn out to be subtler than one might think. As Palmer relates, the ten-day deterministic prediction limit reflects an average over the state space of dynamics that are more unstable in some regions than in others. In the most unstable regions, nearby trajectories separate much faster than the ten-day limit would suggest, so even short-range forecasts remain prone to large errors. Gaining insight into these unstable regions and branch points requires running multiple simulations from many initial conditions.
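
One way to see this state dependence in miniature, again with the Lorenz toy model, is to launch a small ensemble from each of two different points and watch how quickly each cloud of trajectories spreads. The starting points, ensemble size and perturbation scale below are arbitrary illustrative choices.

```python
import numpy as np

def lorenz_step(state, dt=0.01, sigma=10.0, rho=28.0, beta=8.0/3.0):
    # Vectorized forward-Euler step: 'state' holds one row per member.
    x, y, z = state[..., 0], state[..., 1], state[..., 2]
    d = np.stack([sigma * (y - x), x * (rho - z) - y, x * y - beta * z], axis=-1)
    return state + dt * d

def spread_after(start, steps=200, n=100, eps=1e-3, seed=1):
    # Launch n members in a tiny cloud around 'start' and integrate.
    rng = np.random.default_rng(seed)
    ens = start + eps * rng.normal(size=(n, 3))
    for _ in range(steps):
        ens = lorenz_step(ens)
    return ens.std(axis=0).max()      # largest per-coordinate spread

# Error growth over the same interval differs from one starting point to another.
for start in (np.array([-8.0, 8.0, 27.0]), np.array([2.0, 2.0, 20.0])):
    print(start, "->", f"spread after 2 time units: {spread_after(start):.3f}")
```

Operational centres do something analogous but smarter, constructing their initial perturbations to pick out the fastest-growing directions; the toy version simply uses isotropic noise.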

There is a second issue, too, because weather is a physical process rather than a precise set of mathematical equations. Forecast uncertainty also enters through imperfections in the models themselves, which always fail to capture some aspects of atmospheric physics on scales below the resolution of the simulation. These effects can be included approximately with deterministic formulae, but weather scientists have learned that inherently stochastic schemes give better results. An ensemble of runs from different initial conditions, with some noise added to the physics, makes for better forecasts.
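
Operational schemes of this kind, such as the stochastically perturbed parameterization tendencies scheme used at ECMWF, are elaborate; the sketch below only gestures at the idea. A deterministic stand-in for a sub-grid tendency is multiplied by random noise, drawn independently for each member at each step, so that members starting from identical states still spread apart. The function name and all numbers here are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)

def physics_tendency(x):
    # Hypothetical stand-in for a deterministic sub-grid parameterization.
    return -0.5 * x

def step(x, dt=0.1, stochastic=False):
    tend = physics_tendency(x)
    if stochastic:
        # Multiply the parameterized tendency by (1 + noise), one draw per
        # member per step: the flavour of multiplicative-noise schemes.
        tend = tend * (1.0 + 0.5 * rng.normal(size=x.shape))
    return x + dt * tend

for flag in (False, True):
    ens = np.full(20, 1.0)            # identical initial conditions
    for _ in range(50):
        ens = step(ens, stochastic=flag)
    print(f"stochastic={flag}:  spread = {ens.std():.4f}")
```

With the noise switched off the twenty members stay identical forever; with it on, the ensemble spread reflects model uncertainty as well as initial-condition uncertainty.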

Forecasting of this kind represents a shift in philosophical orientation that’s well captured, Palmer suggests, by the phrase ‘the primacy of doubt’, as used by James Gleick in his biography of physicist Richard Feynman: “He believed,” Gleick wrote, “in the primacy of doubt: not as a blemish on our ability to know, but as the essence of knowing.” Predictive knowledge is made more valuable when qualified by clear information on the uncertainties of that knowledge, a point with several implications.

One has to do with the communication of uncertainty. The ensemble approach, Palmer argues, demands clarity about what is known and what is not. Media forecasters, working under intense pressure from users who crave certainty, can be tempted to claim more of it than the science warrants. By embracing the primacy of doubt through ensemble forecasting, forecasters can communicate their results more honestly; the approach brings uncertainty to the fore.

It also encourages the assessment of forecasts by looking more closely at their real-world consequences. Traditionally, forecasts were judged for accuracy by comparing certain standard physical variables in the predicted and actual weather, such as differences in air-pressure patterns. But while physically meaningful, such measures bear little relation to how much a forecast’s inaccuracy matters to its users. With ensemble forecasting, researchers have moved to measures that reflect the range of possible outcomes and the social and economic consequences of forecast errors, which are the quantities modellers really seek to reduce.
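
A standard example of such a measure is the Brier score: the mean squared difference between a forecast probability for an event (‘rain tomorrow’, say) and the 0-or-1 outcome. The numbers below are made up, but they illustrate how a hedged ensemble-based probability can outscore a confident yes/no call.

```python
import numpy as np

def brier(p, outcome):
    # Brier score: mean squared error of a probability forecast (lower is better).
    return np.mean((np.asarray(p) - np.asarray(outcome)) ** 2)

# Made-up verification set: forecast probabilities of rain vs what happened.
outcome        = [1, 0, 1, 1, 0]            # 1 = rain occurred
deterministic  = [1, 0, 0, 1, 1]            # confident yes/no forecast
ensemble_probs = [0.8, 0.2, 0.6, 0.9, 0.4]  # fraction of members with rain

print(f"deterministic Brier score: {brier(deterministic, outcome):.3f}")
print(f"ensemble Brier score:      {brier(ensemble_probs, outcome):.3f}")
```

Here the yes/no forecast scores 0.4 while the probabilistic one scores about 0.08: the ensemble is rewarded for being honestly unsure on the cases it got wrong.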

Finally, one other counterintuitive finding is that embracing ensembles can sometimes improve computing efficiency. Studies have shown that many weather models in use were overly complex, with variables represented more precisely than needed. For example, Palmer points out that a careful accounting of all elements of uncertainty for one model found that its consistent use of 64-bit floating-point precision brought no improvement in accuracy over 32-bit precision. An unthinking quest for precision had achieved the opposite, wasting computation without any gain.
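
The precision question is easy to probe in miniature, though the sketch below is only an illustration, not the accounting Palmer describes. It integrates the same Lorenz toy model in 32-bit and 64-bit arithmetic and compares that discrepancy with the effect of a modest initial-condition error: the rounding difference stays far below the initial-condition error that dominates the forecast anyway.

```python
import numpy as np

def run(x0, steps, dtype=np.float64):
    # Forward-Euler integration of Lorenz 1963 at the given precision.
    s = np.asarray(x0, dtype=dtype)
    for _ in range(steps):
        x, y, z = s
        s = s + dtype(0.01) * np.array(
            [10.0 * (y - x), x * (28.0 - z) - y, x * y - (8.0 / 3.0) * z],
            dtype=dtype)
    return s.astype(np.float64)

x0 = [1.0, 1.0, 1.0]
perturbed = [1.01, 1.0, 1.0]   # crude stand-in for observational error

for steps in (500, 1000, 1500):
    prec = np.linalg.norm(run(x0, steps) - run(x0, steps, np.float32))
    init = np.linalg.norm(run(x0, steps) - run(perturbed, steps))
    print(f"t={steps * 0.01:4.1f}:  precision error {prec:9.2e}   "
          f"initial-condition error {init:9.2e}")
```

Since halving the precision can roughly halve memory traffic, the savings can instead buy extra resolution or extra ensemble members, which do improve the forecast.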

Palmer quotes the Dutch meteorologist and fluid dynamicist Henk Tennekes, who said that “no forecast is complete without a forecast of the forecast skill”. This view from the study of weather and chaotic dynamical systems may also hold lessons for other areas of science and public policy where better predictions could be made by putting more focus on what we don’t know, and possibly cannot know.