Structural decomposition of decadal climate prediction errors: A Bayesian approach

Zanchettin, Davide; Gaetan, Carlo; Arisido, Maeregu Woldeyes; Modali, Kameswarrao; Toniazzo, Thomas; Keenlyside, Noel; Rubino, Angelo

doi:10.1038/s41598-017-13144-2


Article
Open access
Published: 09 October 2017


Scientific Reports volume 7, Article number: 12862 (2017)


Abstract

Decadal climate predictions use initialized coupled model simulations that are typically affected by a drift toward a biased climatology determined by systematic model errors. Model drifts thus reflect a fundamental source of uncertainty in decadal climate predictions. However, their analysis has so far relied on ad-hoc assessments of empirical and subjective character. Here, we define the climate model drift as a dynamical process rather than a descriptive diagnostic. A unified statistical Bayesian framework is proposed where a state-space model is used to decompose systematic decadal climate prediction errors into an initial drift, seasonally varying climatological biases and additional effects of co-varying climate processes. An application to tropical and south Atlantic sea-surface temperatures illustrates how the method allows the evaluation and elucidation of dynamic interdependencies between drift, biases, hindcast residuals and background climate. Our approach thus offers a methodology for objective, quantitative and explanatory error estimation in climate predictions.


Introduction

The multiannual forecast horizon inherent to decadal climate prediction requires that the complexity and uncertainties arising from the interaction between the climate response to external forcing and the evolution of internal modes of climate variability are accounted for^1,2. To generate an ensemble of decadal forecasts, various procedures are applied in climate models to obtain sets of initial conditions based on the observed state of the climate system at a certain time². The observed evolution of many geophysical parameters however resides outside the climatological envelope simulated by coupled climate models used in contemporary climate prediction systems, due to the presence of large systematic model biases with respect to observations^3,4,5. These affect all of mean state, seasonal cycle and interannual internal variability. Decadal climate forecasts based on full-field initialization therefore unavoidably include a growing systematic error, which corresponds with the adjustment of the simulations from the assimilated state drawn from the observed climatology towards a state consistent with the biased climatology of the model². This signal is commonly referred to as climate model drift⁶.

Model drifts and biases can result from the erroneous representation of oceanic and atmospheric processes in climate models^{1,4,7,8,9,10,11}, but more generally they reflect our limited understanding of many of the interactions and feedbacks in the climate system and approximations and simplifications inherent to the numerical representation of climate processes (so-called parameterizations). Sea-surface temperature (SST) biases are often central in the characterization of climate model biases³. Typical examples of systematic model bias are the warm and weak upwelling systems at the eastern boundaries of the tropical oceans^12,13,14. Among these, the severe warm SST bias in the southeastern tropical Atlantic has been especially studied^{5,12,15,16,17}.

A multi-model analysis of SST drifts in this region highlighted that similar biases arise there in different models through a variety of locally as well as remotely growing error mechanisms¹². The dynamical processes governing model drifts (or their numerical representation) are identical to those of the simulated climate itself, so that climate and error signals are indistinguishable by first principles. The fact that model biases and drifts often have a prominent seasonal character^11,16 further increases the difficulty of evaluating, quantifying and understanding their impacts on the overall quality of simulated climates. Various techniques for drift mitigation have been proposed, such as anomaly initialization or flux correction¹⁸. Nonetheless, in decadal climate predictions model drifts remain most often regarded as mere biases, which are often corrected for by means of empirical techniques with varying complexity^19,20,21,22, but rarely investigated in depth²³.

A substantial improvement in our understanding of mechanisms contributing to the generation and propagation of systematic decadal climate prediction errors may be achieved by means of a process-oriented statistical characterization of their temporal and spatial complexity and a robust quantification of associated uncertainties. Here, we propose a structural decomposition analysis²⁴ of systematic hindcast errors (i.e., systematic discrepancies between values from “retrospective predictions” and the corresponding observation) in an ensemble of decadal climate predictions. Structural decomposition refers to a process model in which temporal changes of an observable variable are attributed to the discriminant effects (or “driving forces”) underlying these changes. Accordingly, each time series of hindcast errors D_j(t) in the ensemble is represented as the sum of a systematic component Δ(t), which captures the (systematic) portion of hindcast errors common to all realizations, and a non-systematic irregular component ε_j(t), which is specific to the individual considered hindcast realization j. Here, Δ(t) is defined as a linear combination of systematic hindcast error δ(t) and systematic seasonal bias σ(t). In particular, δ(t) evolves as a stochastic trend with slope β(t): initially, when β(t) ≠ 0, δ(t) describes the drift; then, when β(t) ≈ 0, δ(t) corresponds to the climatological bias. Moreover, Δ(t) can incorporate the explanatory effects of a covariate X(t) representing local and/or remote processes and quantified by the associated stochastic regression coefficient γ(t) of Δ(t) on X(t). As discriminant effects are non-observable, a state-space model is built in which the state vector includes all unobserved elements and the transition matrix describes their individual dynamics (see the methods section for details).

The variances of the error terms in the state-space model are unknown parameters, and inference is performed following a Bayesian approach that specifies prior distributions for them. Therefore, our modeling strategy consists of three hierarchical levels (see the methods section for details): the data level, where observational errors, ε_j(t), for all hindcasts are accounted for conditionally on the systematic component Δ(t); the process level, where the systematic components of the bias, δ(t) and σ(t), and its constituting discriminant effects, X(t), are represented by means of a stochastic model; and the parameter level, where prior knowledge about the distributions of model errors (via the parameters τ²) is formalized.

In this contribution, we choose to illustrate our hierarchical statistical framework and its value for a robust quantification of systematic model prediction errors and their associated uncertainty by means of a specific example. We discuss the diagnosed distinctive characteristics of the different error components and use them for a reliable identification of connections between drift dynamics and the physics of the simulated climate. This study thus focuses on SST errors in the Tropical and South Atlantic Ocean estimated for an ensemble of decadal hindcasts initialized from observations, with the MiKlip prototype system for decadal climate prediction²⁵ (see Methods).

Results

We consider the spatially averaged SST in a region of the south-eastern tropical Atlantic containing the Angola-Benguela frontal zone (SST_ABF; the region is defined in the methods section). Empirical hindcast errors consist of large seasonal errors superposed on a warm climatological mean bias (grey lines in Fig. 1a). The posterior marginal distributions, i.e., the posterior distributions of the individual systematic error components obtained by the structural decomposition, characterize three major drift phases: an initial strong warming (β(t) > 0) in the first two hindcast years, which peaks at ~4 °C; a subsequent progressive weak cooling (β(t) < 0), which extends into the 7th hindcast year; and finally a transition into climatological bias conditions (β(t) ≈ 0) quantified as a bias of 3.12 [3.00, 3.23] °C – estimated as median and 5th–95th percentile range of δ(t = 90…120). Annual and semiannual seasonal biases – σ^A(t) and σ^SA(t) – with amplitudes of ~1.3 °C and ~1.05 °C, respectively, interfere constructively in July/August and destructively in January/February. Further, the semiannual bias component is substantially damped during the first hindcast year, possibly highlighting the effects of the initial coupling shock. The associated Bayesian analysis (Fig. 1b) demonstrates that the data strongly modify our (weakly informative) prior assumption about the model’s parameters, leading to well constrained posterior estimates. The variance of errors at the data level (τ² _D, see equation (2) of the methods section) is substantially larger than those at the process level, and therefore contributes more strongly to the overall uncertainty in the hindcast errors. Posterior distributions of error variances are noticeably skewed at the process level, particularly for the seasonal bias component (τ² _σ).

The residuals r_j(t), i.e., the differences between D_j(t) and the Bayesian estimates of Δ(t) (see Fig. 1a, bottom panel), become roughly stationary around zero and homoskedastic (i.e., consistent with being drawn from the same distribution at all time steps t) from roughly the third integration year onward, indicating that our decomposition captures the bulk evolution of systematic hindcast errors. However, systematic departures from zero of the residuals are diagnosed in the first two hindcast years. The positive residuals in the first few months and the negative residuals around January of the second integration year possibly reflect unresolved issues with the seasonal error (i.e., the initial shock). Figure 2 expands the residual analysis to the case of individual hindcasts: stripes of similar colors in Fig. 2a that are parallel to the black dashed lines characterize specific events that the decadal climate prediction system systematically fails to predict. Among these events are the El Niños of 1972/73, 1982 (concomitant with the El Chichón eruption), 1991/92 (concomitant with the Pinatubo eruption) and 1997/98. For long hindcast times (>2 years), failure to predict these events is not unexpected since their occurrence is determined by the chaotic nature of the climate system. It also reflects the difficulties inherent to simulating the Atlantic response to the Pacific²⁶.

The hindcasts also fail to predict multiannual SST_ABF anomalies. The residual error signal displays an interdecadal fluctuation with predominant negative errors before the mid-1970s and predominant warm errors afterwards. This signal robustly emerges in different seasons (Fig. 2a, right panel) and for different hindcast times (Fig. 2b). Accounting for the delay of the error signal with respect to the initialization year, its interdecadal modulation can be shown to match the shift of the Pacific Decadal Oscillation (PDO) from a cold to a warm phase in the mid-1970s (Fig. 2b, right panel).

Based on this empirical analysis, the statistical model for the estimation of systematic SST_ABF errors is expanded to assess the explanatory effect of the PDO. In practice, the observed PDO index is included as an additional covariate in the state-space model, and the posterior marginal distributions of all unknowns obtained from the expanded model are compared with those obtained in the original analysis illustrated in Fig. 1. Consistent with the analysis of residuals without co-variates, the PDO index is lagged by 24 months, i.e., X(t) = PDO(t-24). The approach is applied to the full hindcast ensemble and for two equally-sized sub-ensembles defined by the period of the hindcast initialization year, i.e., 1960–1979 and 1980–1999. In the full-ensemble analysis, the stochastic regression coefficient γ_PDO(t) indicates that the PDO index is informative about SST_ABF hindcast errors, with strongest impact in the second and third hindcast years (black line/shading in Fig. 3b). Inclusion of the PDO index does not appreciably affect the posterior estimation of the parameters of the systematic components (posterior estimates as black lines for τ² _β and τ² _σ in Fig. 3c), but it substantially reduces uncertainty in the non-systematic component (see τ² _D in Fig. 3c). Similar damping of τ² _D is inferred also for the two sub-ensembles. The PDO index is relevant for the estimation of SST_ABF hindcast errors for both sub-ensembles, but with a smaller impact in the 1980–1999 sub-ensemble (Fig. 3b), in line with the weaker relationship seen in the empirical analysis (Fig. 2b). The PDO-corrected posterior estimates of δ(t) agree with the full-period estimate better than the corresponding estimates from a model without covariates, particularly around the peak warm error phase (compare dashed and continuous lines in Fig. 3a). A plausible interpretation may be given as follows. 
A teleconnection exists between the PDO and the southeastern tropical Atlantic, which is observed as a negative PDO-SST_ABF correlation; the positive explanatory term γ_PDO (in this context, a sort of correction term) suggests that this teleconnection is active also in the simulated climate; removing the effects of remote forcing from the erroneous Pacific state leads to improved estimation of the local drift component. In general, it can be concluded that the estimation of the drift and of its uncertainty depends, to some extent, on the systematic portion of hindcast errors that depends on (inter)decadal internal climate variability. Our method could therefore be used in practice to improve drift estimation by accounting for known explanatory factors of such internal climate variability.

As a further example of the possibility of including explanatory covariates in the statistical model, Fig. 4a shows the structural decomposition of systematic SST_ABF hindcast errors when the effect of hindcast errors in selected terms of the heat budget for the local mixed layer is accounted for. The blue curves in Fig. 4a correspond to results of a model setup accounting for the explanatory effect of the mixed layer heat content (Q_ml). Whereas empirical estimates of systematic errors (see methods) in Q_ml and SST_ABF are significantly correlated (r = −0.66, p < 0.001), the Bayesian analysis indicates that systematic SST_ABF errors are largely unexplained by Q_ml alone, reflective of the weaker average correlation between errors of both variables in the individual hindcasts (r = −0.32) and suggesting a more complex causal association beyond the thermal state of the mixed layer.

Hindcast errors in both adjusted surface heat flux (hfs) and 1-month lagged sea-level height (slh) reduce the error component δ(t) and damp the estimated annual and semiannual biases (red and green lines in Fig. 4a, respectively). Changes in slh tend to reflect ocean currents. Horizontal circulation changes tend to be associated with meridional shifts of the Angola-Benguela SST front, while changes in upwelling result in local shoaling or sinking of the thermocline. With slh as covariate, the annual seasonal bias σ^A(t) nearly vanishes from the second hindcast year onward, and the semiannual seasonal bias σ^SA(t) is also appreciably reduced. The larger amplitudes of σ^A(t) during the first years suggest decoupling of surface from subsurface errors during the initial shock phase. The posterior regression coefficient γ(t) (Fig. 4b) corresponding to this parameter illustrates a seasonally-varying connection between errors in SST and slh that is strongest during austral spring, possibly reflecting the seasonality of wind-dependent errors. Analysis of the posterior marginal distributions of the statistical model parameters indicates that inclusion of slh as explanatory covariate largely damps the variance of the observational error (Fig. 4c), which is the major source of uncertainty in the original estimation (Fig. 1b).

An even larger reduction of δ(t) is obtained when the explanatory effects of hindcast errors in hfs and slh are accounted for in conjunction (orange lines in Fig. 4a). With this choice, δ(t) has values below 1 °C for long hindcast times, indicating that the climatological bias is almost completely damped when accounting for the explanatory effects of hfs and slh. The posterior estimate of σ^A(t) obtained with this model configuration is phase-shifted compared to the original analysis and has larger amplitudes compared to the configuration with slh as the only covariate, which suggests concomitant effects on SST biases from misrepresented seasonal variations in hfs and slh that mutually compensate or reinforce each other depending on their phase.

The SST_ABF drift thus largely develops due to erroneous heat gain by the ocean through surface fluxes coupled with an erroneous redistribution of that heat in the water mass. In contrast, only a marginal connection is found with associated systematic errors in the mixed layer heat content, which are in turn dominated by the erroneous volume loss simulated in the mixed layer. Overall, the analysis highlights the complexity of errors in a region featuring a great variety of coupled dynamics and teleconnections²⁷. Fully disentangling such complexity exceeds the scope of the illustrative analysis presented here.

Figures 1b, 3c and 4c consistently show that the largest uncertainty that affects the estimation of the SST_ABF drift stems from the data model parameter τ² _D included in the observation equation of our statistical model (see the methods section). We tested the impact of using different input data to the dynamic linear model by constructing different hindcast ensembles, from small sub-ensembles to that used in the main analysis to super-ensembles including multiple realizations for each hindcast (results not shown). Tighter constraints on τ² _D are generally achieved by increasing the size of the ensemble, without significant impacts on its estimated mean. By contrast, reducing the ensemble size can lead to major discrepancies in the mean estimation. It is especially in this case that the proposed Bayesian hierarchical formulation could be valuable – through the implementation of prior knowledge about uncertainty parameters and/or explanatory factors, as shown in Fig. 3 – to partly overcome the dependency of drift uncertainty estimation on the quality and quantity of available empirical information about the drift.

So far, the state-space decomposition method has been illustrated for a single predictand, SST_ABF. Its analytic potential, however, is exploited more fully when it is applied simultaneously to spatially distributed data. Here, we illustrate how it may aid the characterization of the spatio-temporal complexity of decadal climate prediction errors, and assist in understanding the underlying dynamics with an application to gridded upper-ocean potential temperature data in the southern Atlantic mid-latitudes, near the Brazil–Malvinas confluence zone. The results indicate intricate variability in the sub-surface ocean for the δ(t) and σ^A(t) components (see supplementary movie). For instance, at [44°S, 50°W] during the initial drift phase negative (i.e., cold) errors (δ(t) < 0) are diagnosed throughout the water column down to 120 m depth, with peak negative values around 50 m depth, while errors of opposite sign characterize near-surface and subsurface evolution during the climatological bias phase (Fig. 5a). In the annual bias component, downwelling pulses of both warm and cold errors are observed (see Fig. 5c). Inclusion of hindcast errors in local seawater salinity as covariate in the state-space model indicates that a considerable part of potential temperature errors in the subsurface ocean is linked with concomitant errors in salinity. Following the initial drift phase, the warm climatological bias below 80 m depth is largely damped in the model including salinity errors as covariate (compare panels a and b of Fig. 5). Also, annual bias fluctuations are damped in the sub-surface ocean while they are phase-shifted in the near-surface layer (compare panels c and d of Fig. 5), as highlighted by the strong seasonal variations of the regression coefficient γ_sal(t) in the near-surface layers (Fig. 5e). Thus, coherent seasonal error evolutions are detected throughout the upper ocean column (Fig. 5d). 
Overall, the analysis highlights once more how our statistical model aids the identification and quantification of linkages between drift and biases affecting different covarying processes. Specifically, it links downwelling of systematic warm (i.e., buoyancy positive) hindcast errors with overcompensation by corresponding local salinity errors.

Discussion and Conclusions

We propose a Bayesian hierarchical framework for the unified statistical assessment of systematic hindcast errors in decadal climate predictions. The major novelty of our approach is that the proposed state-space model allows for an explicit statistical estimation of the temporal evolution of major systematic error components, including drift, climatological bias and seasonal biases. It also allows the explanatory effect of co-varying processes to be quantified, and the associated uncertainties at the data, process and parameter levels to be evaluated separately.

We present an illustrative application of the model for spatially-averaged SST in the Angola-Benguela front region – where coupled climate models are typically affected by a strong warm bias⁵ – simulated by an ensemble modelling system intended for decadal climate predictions. Different aspects of drift/bias quantification and interpretation are discussed, and we demonstrate the value of the proposed statistical approach as a diagnostic, exploratory and hypothesis-testing tool.

First, we show that by virtue of the separation between data and process levels, the hierarchical method allows for well-constrained estimates of statistical uncertainty for the different systematic error components (Fig. 1). Compared to currently adopted empirical approaches, the Bayesian estimates of drift uncertainty yield narrower distributions (Fig. 6). The more reliable estimation of drift uncertainty – at the process level – resulting from this rigorous methodology can provide more robust estimates of prediction skills compared to traditional simple averages. In particular, the posterior ensemble realizations of the drift (and of other systematic hindcast error components) obtained from the Bayesian hierarchical model give a forecast drift correction analogous to the empirical estimate used in current approaches²⁸.

Second, we show that drift estimation is appreciably affected by the interdecadal climate state and evolution, including extant global teleconnections between hindcast errors (Fig. 2). The shorter the analysis period, the larger the risk of erroneously attributing to drift a portion of (inter)decadal climate variability which the system systematically fails to predict. We show that this issue can be in principle overcome within our statistical modeling framework by accounting for the explanatory effect of such unresolved (inter)decadal climate variability (Fig. 3).

Finally, we show that the proposed structural decomposition permits a robust quantification of non-stationary explanatory effects of the bias and their uncertainty (Figs 4 and 5), leading to the possibility of probing causal hypotheses for the origin of model biases with a statistically rigorous quantification that is inaccessible to commonly used approaches in traditional multivariate assessments of systematic decadal climate prediction errors. We stress that the proposed hierarchical framework uniquely allows for a fully traceable quantification of explanatory effects based on the interdependencies expressed in each individual hindcast.

We believe that the strength of these advantages, together with their modest computational requirements, justifies the view that state-space models like the one proposed here, or variants of greater complexity (for instance accounting for spatial dependencies²⁹), can be used in future applications to robustly and quantitatively assess explanatory effects and uncertainties in climate forecasts. Ultimately, improved drift adjustment techniques such as that outlined here will help to advance the quality of the estimation of the predictability and the predictive skills of a forecast system²⁰.

To conclude, the generality of our results makes them relevant well beyond the specific decadal prediction system, the focus region or reference variable employed in this study. Compared with the existing empirical techniques for drift estimation, the structural decomposition and Bayesian hierarchical approach envisaged here represent an opportunity for progress in our understanding of systematic climate model errors: the method yields a reliable estimation of drift and bias within a unified framework; it permits an evaluation of causal relationships and teleconnections by embedding relevant information on associated dynamics at the process level; and it allows a quantitative separation of the process-related uncertainties from those associated with empirical data limitations.

Methods

Bayesian approach

The scientific literature contains numerous studies^{29,30,31,32,33,34} describing the application of Bayesian hierarchical models (BHMs) in the field of climate research, especially to the assessment of climate model outputs. BHMs use Bayes’ theorem to incorporate information from different sources, including observations, physical theories and experts’ knowledge. They provide a flexible framework for developing consistent inference and prediction of unknown quantities under study, which overcomes single-value estimates of uncertainty.

The BHM general formulation is based on three building blocks: the data model, the process model and the parameter model. If Z represents the data, Y the process and θ the parameters of the data and process models, we have:

(a) Data model, [Z|Y, θ]
(b) Process model, [Y|θ]
(c) Parameter model, [θ]

where [A] is the generic notation for the probability distribution of the random quantity A. In practice, (a) defines the statistical model representing the dependence of observations on the unknown process, (b) describes the conditional probability distribution of the process on the model parameters, and (c) describes the (prior) probability distribution of the parameters, which are treated as random quantities according to the Bayesian approach.

BHMs provide estimates of the unknowns (Y and θ) with associated uncertainty through the calculation of their posterior distribution conditional on the available observations, which follows from Bayes’ theorem:

$$[{\boldsymbol{Y}},{\boldsymbol{\theta }}|{\boldsymbol{Z}}]=[{\boldsymbol{Z}}|{\boldsymbol{Y}},{\boldsymbol{\theta }}][{\boldsymbol{Y}}|{\boldsymbol{\theta }}][{\boldsymbol{\theta }}]/[{\boldsymbol{Z}}]$$

(1)

Direct evaluation of (1) to obtain the posterior distribution on the left term is often computationally intractable. This can be circumvented by generating samples from the posterior distribution through a Markov Chain Monte Carlo method³⁵.

Statistical model for the estimation of systematic decadal climate prediction errors

At the data level, the systematic error Δ(t) of the decadal climate prediction system at prediction time t is observed through differences between the predicted and the observed – or, analogously, assimilated – values (D_j(t)) from an ensemble of j = 1,…,p hindcasts initialized at different times, according to the following model:

$${{\rm{D}}}_{{\rm{j}}}(t)={\rm{\Delta }}(t)+{{\rm{\varepsilon }}}_{{\rm{j}}}(t)$$

(2)

where ε_j(t) is a Gaussian white noise random error with zero mean and variance τ² _D. We assume, for simplicity, that the observation error has the same variance in all hindcasts.
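As a minimal sketch of the data level in equation (2), the following Python snippet draws a synthetic ensemble of hindcast errors around a prescribed systematic component. The shape chosen for Δ(t), the ensemble size and the value of τ² _D are illustrative assumptions, not values taken from this study, and the function name is hypothetical.

```python
import numpy as np

def simulate_hindcast_errors(systematic, tau2_d, p, rng):
    """Draw p series D_j(t) = Delta(t) + eps_j(t), with eps_j(t) ~ N(0, tau2_d)."""
    noise = rng.normal(0.0, np.sqrt(tau2_d), size=(p, systematic.size))
    return systematic[None, :] + noise

rng = np.random.default_rng(0)
t = np.arange(120)                             # monthly hindcast times (10 years)
systematic = 3.0 * (1.0 - np.exp(-t / 12.0))   # hypothetical Delta(t), illustration only
D = simulate_hindcast_errors(systematic, tau2_d=0.25, p=40, rng=rng)

# Simple empirical estimates: ensemble mean for Delta(t) and the pooled
# spread around that mean as a rough estimate of tau2_d.
ensemble_mean = D.mean(axis=0)
residual_var = (D - ensemble_mean).var(axis=0, ddof=1).mean()
```

Because the noise variance is shared by all hindcasts, the ensemble mean has variance τ² _D / p at each time step, which is one way to see why larger ensembles constrain the systematic component more tightly.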

At the process level, Δ(t) is decomposed into two components: the drift/bias δ(t) and the seasonal bias σ(t), further split into annual (σ^A(t)) and semiannual (σ^SA(t)). Namely:

$${\rm{\Delta }}(t)={\rm{\delta }}(t)+{{\rm{\sigma }}}^{{\rm{A}}}(t)+{{\rm{\sigma }}}^{{\rm{SA}}}(t)$$

(3)

The drift/bias δ(t) changes through time according to a local linear trend:

$${\rm{\delta }}(t)={\rm{\delta }}(t-{1})+{\rm{\beta }}(t-{1})+{{\rm{\varepsilon }}}_{{\rm{\delta }}}(t)$$

(4)

$${\rm{\beta }}(t)={\rm{\beta }}(t-{1})+{{\rm{\varepsilon }}}_{{\rm{\beta }}}(t)$$

(5)

where ε_δ(t) and ε_β(t) are uncorrelated Gaussian white noise random errors with zero mean and variance τ² _δ and τ² _β, respectively. The term β(t) is a random walk. The effect of ε_δ(t) is to allow the level of the drift/bias to shift up and down, while ε_β(t) allows the slope to change. The larger the variances, the greater the stochastic movements in the drift/bias. If τ² _δ = τ² _β = 0, the local linear trend collapses to a linear deterministic trend.
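The recursion in equations (4) and (5) can be sketched in a few lines of Python. This is an illustrative simulation only; the function name, initial values and variances are assumptions made for the example.

```python
import numpy as np

def simulate_local_linear_trend(n, delta0, beta0, tau2_delta, tau2_beta, rng):
    """Simulate delta(t) and beta(t) following equations (4) and (5)."""
    delta = np.empty(n)
    beta = np.empty(n)
    delta[0], beta[0] = delta0, beta0
    for t in range(1, n):
        # Level: previous level plus previous slope plus level noise
        delta[t] = delta[t - 1] + beta[t - 1] + rng.normal(0.0, np.sqrt(tau2_delta))
        # Slope: random walk
        beta[t] = beta[t - 1] + rng.normal(0.0, np.sqrt(tau2_beta))
    return delta, beta

rng = np.random.default_rng(1)
# With both variances set to zero the trend is a deterministic straight line.
delta_det, beta_det = simulate_local_linear_trend(24, 0.0, 0.1, 0.0, 0.0, rng)
# With tau2_delta = 0 and tau2_beta > 0 the level varies smoothly.
delta_smooth, beta_smooth = simulate_local_linear_trend(24, 0.0, 0.1, 0.0, 1e-3, rng)
```

The deterministic case reproduces δ(t) = δ(0) + β(0)·t exactly, which matches the collapse to a linear deterministic trend described above.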

Seasonal terms are modeled using harmonic functions³⁶. Using monthly data with k = 1 for σ^A(t) and k = 2 for σ^SA(t), the k-th harmonic function takes the general form

$${{\rm{\sigma }}}_{{\rm{k}}}(t)={{\zeta }^{1}}_{{\rm{k}}}\,\cos (2{\rm{\pi }}kt/12)+{{\zeta }^{2}}_{{\rm{k}}}\,\sin (2{\rm{\pi }}kt/12)$$

where ζ¹ _k and ζ² _k are constants. Like the local linear trend, the seasonal term can be built up recursively, leading to the (stochastic) model³⁶ for both components:

$$\begin{pmatrix}{\sigma }_{{\rm{k}}}(t)\\ {\sigma }_{{\rm{k}}}^{\ast }(t)\end{pmatrix}=\begin{pmatrix}\cos (2\pi k/12) & \sin (2\pi k/12)\\ -\sin (2\pi k/12) & \cos (2\pi k/12)\end{pmatrix}\begin{pmatrix}{\sigma }_{{\rm{k}}}(t-1)\\ {\sigma }_{{\rm{k}}}^{\ast }(t-1)\end{pmatrix}+\begin{pmatrix}{\varepsilon }_{\sigma ,k}(t)\\ {\varepsilon }_{\sigma ,k}^{\ast }(t)\end{pmatrix},\quad k=1,2$$

(6)

with ε_σ,k(t) and ε*_σ,k(t) uncorrelated Gaussian noise random errors with zero mean and variance τ² _σ, and * indicating the conjugate.
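To illustrate why the recursion in equation (6) generates a harmonic, the sketch below iterates the noise-free rotation and compares it with the closed-form cosine and sine expression. Function names and the chosen amplitudes are assumptions made for the example.

```python
import numpy as np

def seasonal_transition(k, period=12):
    """2x2 rotation block of equation (6) for the k-th harmonic."""
    w = 2.0 * np.pi * k / period
    return np.array([[np.cos(w), np.sin(w)],
                     [-np.sin(w), np.cos(w)]])

def propagate_harmonic(k, sigma0, sigma_star0, n, period=12):
    """Iterate the noise-free recursion and return sigma_k(0), ..., sigma_k(n-1)."""
    G = seasonal_transition(k, period)
    state = np.array([sigma0, sigma_star0], dtype=float)
    out = np.empty(n)
    for t in range(n):
        out[t] = state[0]
        state = G @ state
    return out

t = np.arange(25)
annual = propagate_harmonic(1, 1.3, 0.0, t.size)       # starts as a pure cosine
semiannual = propagate_harmonic(2, 0.0, 1.05, t.size)  # starts as a pure sine
```

With the initial state (ζ¹ _k, ζ² _k), the first component of the iterated state equals ζ¹ _k cos(2πkt/12) + ζ² _k sin(2πkt/12), so the annual term repeats every 12 months and the semiannual term every 6 months.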

The process model (3) can be easily extended to include the effect of external factors in terms of additional explanatory variables. For one dimensional explanatory variable X(t), the model becomes:

$${\rm{\Delta }}(t)={\rm{\delta }}(t)+{{\rm{\sigma }}}^{{\rm{A}}}(t)+{{\rm{\sigma }}}^{{\rm{SA}}}(t)+{\rm{\gamma }}(t){\rm{X}}(t)$$

(7)

In equation (7) we allow γ(t) to vary according to a random walk:

$${\rm{\gamma }}(t)={\rm{\gamma }}(t-{1})+{{\rm{\varepsilon }}}_{{\rm{\gamma }}}(t)$$

(8)

with ε_γ(t) a Gaussian white noise with zero mean and variance τ² _γ. Time-varying coefficients γ(t) allow us to take into account possible non-stationary effects of the covariate. Again, if τ² _γ = 0, the effect collapses to be constant in time. Finally, the parameter level requires the specification of the prior distribution for the unknown parameters θ = (τ² _D, τ² _δ, τ² _β, τ² _σ, τ² _γ). In this study, we set τ² _δ = 0 to obtain a smoothly varying error δ(t), which then better corresponds to the drift/bias. We define weakly informative lognormal (LN) priors in the form LN(0,1) for all remaining parameters.
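A possible way to evaluate the parameter-level prior is sketched below. It assumes that LN(0,1) denotes a lognormal distribution whose logarithm has mean 0 and standard deviation 1, and that the priors on the free variances are independent; both are assumptions of this example, as is the function name.

```python
import numpy as np

def log_prior(variances):
    """Sum of independent LN(0, 1) log-densities for strictly positive variances."""
    variances = np.asarray(variances, dtype=float)
    if np.any(variances <= 0.0):
        return -np.inf                      # outside the support of the lognormal
    log_v = np.log(variances)
    # log f(x) = -log(x) - 0.5*log(2*pi) - 0.5*(log x)^2 for LN(0, 1)
    return float(np.sum(-log_v - 0.5 * np.log(2.0 * np.pi) - 0.5 * log_v ** 2))

# Free parameters in the order (tau2_D, tau2_beta, tau2_sigma, tau2_gamma);
# tau2_delta is fixed to zero and therefore carries no prior term.
theta_free = [0.5, 1e-3, 1e-2, 1e-2]        # illustrative values only
lp = log_prior(theta_free)
```

The median of LN(0,1) is 1, so this prior places half of its mass on variances below 1 while still allowing values spanning several orders of magnitude.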

Dynamic Linear Model implementation

The formulation of the statistical model above allows for a straightforward implementation within the dynamic linear model (DLM) framework. DLMs are based on a state-space approach, i.e., unobservable state variables are used that allow direct modeling of the process (Y) generating the observed data (Z)³⁶,³⁷. DLMs have the general form:

$${\boldsymbol{Z}}(t)={\boldsymbol{FY}}(t)+{\bf{v}}(t)$$

(9)

$${\boldsymbol{Y}}(t)={\boldsymbol{GY}}(t-{1})+{\bf{w}}(t)$$

(10)

where t is the discrete time variable representing, in our case, monthly values, Z(t) is a vector of p observations at time t, Y(t) is the underlying state vector of dimension m, G is the m × m system matrix, and F is the p × m observation matrix. We suppose that v(t)~N(0,V) and w(t)~N(0,W) are the observation and model Gaussian errors, respectively, that are serially and mutually uncorrelated. In this formulation, the matrices V and W contain the model parameters θ. If we suppose that the unknown parameters are random, the DLM formulation is a BHM, where (9) and (10) are typically referred to as the observation equation and the system equation, respectively.

We have applied the DLM to spatially-averaged indices and, individually, to the model’s grid-points over the investigated domain. Following (2) and (9), and having p values of D(t) (i.e., having p hindcasts) the observation vector is defined as Z(t) = {D¹(t),…, D^p(t)}’.

In the base model without explanatory covariates, following (3) and accounting for annual and semi-annual seasonalities in the decadal climate prediction errors, the state vector is defined as Y(t) = {δ(t), β(t), σ^A(t), σ^A*(t), σ^SA(t), σ^SA*(t)}’. Therefore, the dimension of the state vector is m = 6. We further assume Y(0)~N(0,0.025), i.e., there is practically no systematic error during the assimilation. The observation matrix F is defined following (2) and (3).
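For the six-dimensional state vector above, the system and observation matrices could be assembled as in the following sketch. It rests on two assumptions that are not spelled out in this section: the local linear trend takes the standard form δ(t) = δ(t−1) + β(t−1) and β(t) = β(t−1) (plus noise), and each hindcast observes the sum δ(t) + σ^A(t) + σ^SA(t). F is written with one row per hindcast so that the product F·Y(t) in (9) is conformable. All function names are hypothetical.

```python
import math

def rotation_block(k, period=12):
    """2 x 2 rotation block of eq. (6) for harmonic k."""
    lam = 2.0 * math.pi * k / period
    c, s = math.cos(lam), math.sin(lam)
    return [[c, s], [-s, c]]

def build_system_matrix():
    """6 x 6 matrix G for Y = (delta, beta, sigmaA, sigmaA*, sigmaSA, sigmaSA*)."""
    G = [[0.0] * 6 for _ in range(6)]
    # Assumed local linear trend block: level picks up the slope, slope persists.
    G[0][0], G[0][1], G[1][1] = 1.0, 1.0, 1.0
    # Annual (k = 1) and semi-annual (k = 2) seasonal blocks.
    for offset, k in ((2, 1), (4, 2)):
        block = rotation_block(k)
        for i in range(2):
            for j in range(2):
                G[offset + i][offset + j] = block[i][j]
    return G

def build_observation_matrix(p):
    """p x 6 matrix F: every hindcast sees delta + sigmaA + sigmaSA (assumed)."""
    return [[1.0, 0.0, 1.0, 0.0, 1.0, 0.0] for _ in range(p)]

if __name__ == "__main__":
    for row in build_system_matrix():
        print(" ".join(f"{value:6.3f}" for value in row))
    print(build_observation_matrix(2))
```

The block-diagonal layout keeps the trend and the two seasonal components dynamically independent; they interact only through the observation equation.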

The sequential definition of the process model (having a conditional dependency only on the previous time step) allows the Kalman filter formulas³⁶,³⁷ to be used for calculating the posterior distribution (1) inside a Markov chain Monte Carlo algorithm. In particular, we use a slice-sampler algorithm³⁸ to iteratively sample from the full posterior distribution of the unknown parameters θ.
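The role of the Kalman filter here is to provide, for a given θ, the likelihood of the hindcast errors that a sampler would then combine with the prior. A minimal sketch of that likelihood evaluation follows. It is not the dlmsmo routine used in the study: it assumes a diagonal observation covariance V = v·I so that the p observations at each time step can be processed one at a time (which avoids a matrix inversion), and all names are hypothetical. The slice sampler itself is not shown.

```python
import math

def mat_vec(A, x):
    """Matrix-vector product for nested lists."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def predict_cov(G, P, W):
    """Return G P G' + W."""
    m = len(G)
    GP = [[sum(G[i][k] * P[k][j] for k in range(m)) for j in range(m)] for i in range(m)]
    return [[sum(GP[i][k] * G[j][k] for k in range(m)) + W[i][j] for j in range(m)]
            for i in range(m)]

def kalman_loglik(Z, F, G, W, v, a0, P0):
    """Log-likelihood of Z under the DLM (9)-(10), assuming V = v * I.

    Z: list over time of p observations; F: p rows of length m;
    a0, P0: mean and covariance of the state at time 0.
    Returns (log-likelihood, filtered mean, filtered covariance).
    """
    a, P = list(a0), [row[:] for row in P0]
    m = len(a)
    loglik = 0.0
    for z_t in Z:
        # Prediction step of the system equation (10).
        a = mat_vec(G, a)
        P = predict_cov(G, P, W)
        # Sequential update, one scalar observation at a time.
        for f, z in zip(F, z_t):
            Pf = mat_vec(P, f)                                # P f'
            s = sum(fi * x for fi, x in zip(f, Pf)) + v       # innovation variance
            e = z - sum(fi * ai for fi, ai in zip(f, a))      # innovation
            loglik -= 0.5 * (math.log(2.0 * math.pi * s) + e * e / s)
            a = [ai + x * e / s for ai, x in zip(a, Pf)]
            P = [[P[i][j] - Pf[i] * Pf[j] / s for j in range(m)] for i in range(m)]
    return loglik, a, P

if __name__ == "__main__":
    # Toy case: a constant scalar state observed by three hindcasts at one time step.
    ll, mean, cov = kalman_loglik([[2.0, 2.0, 2.0]], [[1.0]] * 3, [[1.0]],
                                  [[0.0]], 1.0, [0.0], [[1e6]])
    print("filtered mean:", round(mean[0], 4), "variance:", round(cov[0][0], 4))
```

In an MCMC setting, a function of this kind would be called once per proposed θ, with W and v rebuilt from the proposed variances.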

Empirical estimation of systematic decadal climate prediction errors

The BHM results are compared with empirical estimates obtained following the guidelines for drift/bias correction by the Decadal Climate Prediction Panel for the upcoming Coupled Model Intercomparison Project phase 6²⁸. The associated uncertainty is calculated for each hindcast time t as the unbiased sample standard deviation of the D_j(t) values from the j = 1,…,p hindcasts. Similar results are obtained using the 95% confidence interval of the standard error of the mean.
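The two uncertainty measures mentioned above can be computed per hindcast time as in the sketch below. The mean across hindcasts is included only for illustration, and the factor 1.96 assumes a normal approximation for the 95% interval; neither detail is specified in the text, and the function name is hypothetical.

```python
import math
import statistics

def empirical_error_summary(errors_by_hindcast):
    """Summarise D_j(t) across hindcasts j = 1..p for each hindcast time t.

    errors_by_hindcast: list of p equal-length sequences, one per hindcast.
    Returns one (mean, sample_sd, ci95_half_width) tuple per time step.
    """
    p = len(errors_by_hindcast)
    summary = []
    for values in zip(*errors_by_hindcast):
        mean = statistics.fmean(values)
        sd = statistics.stdev(values)            # unbiased: divides by p - 1
        half_width = 1.96 * sd / math.sqrt(p)    # normal approximation (assumed)
        summary.append((mean, sd, half_width))
    return summary

if __name__ == "__main__":
    toy_errors = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]  # three hindcasts, two months
    for t, (mean, sd, hw) in enumerate(empirical_error_summary(toy_errors)):
        print(f"t={t}: mean={mean:.2f}, sd={sd:.2f}, 95% CI half-width={hw:.2f}")
```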

Decadal prediction system and dataset

The MiKlip prototype system for decadal climate predictions²⁵ is based on the low-resolution version of the Max Planck Institute - Earth System Model (MPI-ESM-LR). MPI-ESM-LR is a conglomerate of a coupled general circulation model and subsystem models for land and vegetation, and biogeochemistry. In MPI-ESM-LR, the atmospheric general circulation model ECHAM6³⁹ uses a T63/1.9° horizontal resolution and 47 hybrid sigma pressure levels that extend up to 0.01 hPa; the ocean-sea ice model MPIOM⁹ uses a 1.5° resolution bipolar grid with 40 z-levels. The time step for numerical integration is 600 s for ECHAM6 and 4320 s for MPIOM.

Jungclaus et al.⁹ describe the main oceanic biases, highlighting serious biases prevailing in the intermediate layers of the ocean, which reflect the inability of the model to maintain the correct water mass properties. In general, the ocean in MPI-ESM gets too warm and saline at intermediate levels and in the deep ocean, whereas it is too cold and fresh in the upper layers. MPI-ESM-LR and similar configurations of MPI-ESM have been widely tested and used in studies of climate dynamics and variability⁴⁰,⁴¹.

We use the MiKlip decadal prediction experiments²⁵ based on full-field assimilation of the ORAS4 ocean reanalysis data⁴². The lagged initialization procedure is used to obtain the initial state from the historical assimilation run for each of the 15 ensemble members for each initialization year between 1960 and 2000. The external radiative forcing data are based on the recommendations for the Coupled Model Intercomparison Project phase 5 (CMIP5), an overview of which is given in ref. 25. The ensemble including the first ensemble members (“r1”) was used in the main analysis. The associated historical assimilation run is used as an analog for the observational targets.

Spatially-averaged indices for the Angola-Benguela front region used in the study refer to the domain spanning 10°S–20°S latitude and 10°E–15°E longitude (to the coastline). Empirical hindcast errors in monthly mean spatially-averaged SST in the Angola-Benguela front region are provided as supplementary information. The observed PDO index is the JISAO PDO index⁴³ available at: http://research.jisao.washington.edu/pdo/PDO.latest.txt. The original monthly index was low-pass filtered with an 85-month running average (corresponding to 7 years) before analysis to smooth out ENSO influences. Adjusted surface energy flux for the mixed layer and mixed layer heat content in the ABF region are estimated following⁴⁴. All empirical distributions are smoothed with a 5-point moving average with symmetric Hanning weights.
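The two smoothing operations described above can be sketched as follows. The treatment of the series ends (only fully covered points are returned) and the exact Hanning weights (five non-zero weights taken from a Hann window with its zero endpoints dropped, then normalised) are assumptions, since the text does not specify them; the function names are hypothetical.

```python
import math

def running_mean(x, window):
    """Centred running mean; returns only points with a full window."""
    if window % 2 == 0:
        raise ValueError("window must be odd for a centred average")
    half = window // 2
    return [sum(x[i - half:i + half + 1]) / window
            for i in range(half, len(x) - half)]

def hanning_weights(n_points):
    """n_points non-zero, symmetric Hann weights normalised to sum to one."""
    raw = [0.5 - 0.5 * math.cos(2.0 * math.pi * (i + 1) / (n_points + 1))
           for i in range(n_points)]
    total = sum(raw)
    return [w / total for w in raw]

def weighted_smooth(x, weights):
    """Centred weighted moving average; returns only fully covered points."""
    half = len(weights) // 2
    return [sum(w * x[i - half + j] for j, w in enumerate(weights))
            for i in range(half, len(x) - half)]

if __name__ == "__main__":
    monthly_index = [math.sin(2.0 * math.pi * t / 48.0) for t in range(240)]
    low_pass = running_mean(monthly_index, 85)   # 85-month running average
    print(len(monthly_index), "->", len(low_pass), "values after low-pass filtering")
    print("5-point Hanning weights:", [round(w, 4) for w in hanning_weights(5)])
```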

References

Meehl, G. A. et al. Decadal climate prediction: an update from the trenches. Bull. Am. Meteorol. Soc. 95, 243–267 (2014).
Kirtman, B. et al. Near-term Climate Change: Projections and Predictability in Climate Change 2013: The Physical Science Basis. Contribution of Working Group I to the Fifth Assessment Report of the Intergovernmental Panel on Climate Change (eds Stocker, T. F. et al.) (Cambridge University Press, Cambridge, United Kingdom and New York, NY, USA, 2013).
Wang, C., Zhang, L., Lee, S.-K., Wu, L. & Mechoso, C. R. A global perspective on CMIP5 climate model biases. Nature Clim. Ch. 4, 201–205 (2014).
Hawkins, E., Dong, B., Robson, D. J., Sutton, R. & Smith, D. M. The interpretation and use of biases in decadal climate predictions. J. Clim. 27, 2931 (2014).
Xu, Z. et al. Diagnosing southeast tropical Atlantic SST and ocean circulation biases in the CMIP5 ensemble. Clim. Dyn. 43, 3123 (2014).
Sen Gupta, A., Jourdain, N., Brown, J. & Monselesan, D. Climate drift in the CMIP5 models. J. Clim. 26, 8597–8615 (2013).
Voldoire, A. et al. The CNRM-CM5.1 global climate model: Description and basic evaluation. Clim. Dyn. 40, 2091–2121 (2013).
Danabasoglu, G. et al. The CCSM4 Ocean Component. J. Clim. 25, 1361–1389 (2012).
Jungclaus, J. H. et al. Characteristics of the ocean simulations in MPIOM, the ocean component of the Max Planck Institute Earth System Model. J. Adv. Model. Earth Syst. 5, 422–446 (2013).
Sterl, A. et al. A look at the ocean in the EC-Earth climate model. Clim. Dyn. 39, 2631–2657 (2012).
Ding, H., Keenlyside, N., Latif, M., Park, W. & Wahl, S. The impact of mean state errors on equatorial Atlantic interannual variability in a climate model. J. Geophys. Res. Oceans 120, 1133–1151 (2015).
Toniazzo, T. & Woolnough, S. Development of warm SST errors in the southern tropical Atlantic in CMIP5 decadal hindcasts. Clim. Dyn. 43, 2889–2913 (2014).
Small, R. J., Curchitser, E., Hedstrom, K., Kauffman, B. & Large, W. G. The Benguela upwelling system: quantifying the sensitivity to resolution and coastal wind representation in a global climate model. J. Clim. 28, 9409–9432 (2015).
Ndoye, S. et al. SST patterns and dynamics of the southern Senegal-Gambia upwelling center. J. Geophys. Res. Oceans 119, 8315–8335 (2014).
Richter, I. & Xie, S.-P. On the origin of equatorial atlantic biases in coupled general circulation models. Clim. Dyn. 31, 587–598 (2008).
Wahl, S., Latif, M., Park, W. & Keenlyside, N. On the Tropical Atlantic SST warm bias in the Kiel Climate Model. Clim. Dyn. 36, 891–906 (2011).
Milinski, S., Bader, J., Haak, H., Siongco, A. C. & Jungclaus, J. H. High atmospheric horizontal resolution eliminates the wind-driven coastal warm bias in the south eastern tropical Atlantic. Geophys. Res. Lett. 43, 10455–10462 (2016).
Carrassi, A. et al. Full-field and anomaly initialization using a low-order climate model: a comparison and proposals for advanced formulations. Nonlin. Proc. Geophys. 21, 521–537 (2014).
Garcia-Serrano, J. & Doblas-Reyes, F. J. On the assessment of near-surface global temperature and North Atlantic multi-decadal variability in the ENSEMBLES decadal hindcast. Clim. Dyn. 39, 2025–2040 (2012).
Fyfe, J. C. et al. Skillful predictions of decadal trends in global mean surface temperature. Geophys. Res. Lett. 38, L22801 (2011).
Kharin, V. V., Boer, G. J., Merryfield, W. J., Scinocca, J. F. & Lee, W. S. Statistical adjustment of decadal predictions in a changing climate. Geophys. Res. Lett. 39, L19705 (2012).
Fučkar, N. S., Volpi, D., Guemas, V. & Doblas‐Reyes, F. J. A posteriori adjustment of near‐term climate predictions: Accounting for the drift dependence on the initial conditions. Geophys. Res. Lett. 41, 5200–5207 (2014).
Sanchez-Gomez, E., Cassou, C., Ruprich-Robert, Y., Fernandez, E. & Terray, L. Drift dynamics in a coupled model initialized for decadal forecasts. Clim. Dyn. 46, 1819–1840 (2016).
Hoekstra, R. & van der Bergh, J. J. C. J. M. Comparing structural decomposition analysis and index. Energy economics 25, 39–64 (2003).
Marotzke, J. et al. MiKlip - a National Research Project on Decadal Climate Prediction. Bull. Amer. Meteor. Soc. 97, 2379–2394 (2016).
Luebbecke, J. & McPhaden, M. On the inconsistent relationship between Pacific and Atlantic Niños. J. Clim. 25, 4294–4303 (2012).
Lübbecke, J. F., Böning, C. W., Keenlyside, N. S. & Xie, S.-P. On the connection between Benguela and equatorial Atlantic Niños and the role of the South Atlantic Anticyclone. J. Geophys. Res. 115, C09015 (2010).
Boer, G. J. et al. The Decadal Climate Prediction Project (DCPP) contribution to CMIP6. Geosci. Model Dev. 9, 3751–3777 (2016).
Arisido, M. W., Gaetan, C., Zanchettin, D. & Rubino, A. A Bayesian hierarchical approach for spatial analysis of climate model bias in multi-model ensembles. Stoch. Environ. Res. Risk. Assess., https://doi.org/10.1007/s00477-017-1383-2 (2017).
Duan, Q. & Phillips, T. J. Bayesian estimation of local signal and noise in multimodel simulations of climate change. J. Geophys. Res. Atmos. 115, D18123 (2010).
Tebaldi, C., Smith, R. L., Nychka, D. & Mearns, L. O. Quantifying Uncertainty in Projections of Regional Climate Change: A Bayesian Approach to the Analysis of Multimodel Ensembles. J. Clim. 18, 1524–1540 (2005).
Buser, C. M., Künsch, H. R., Lüthi, D., Wild, M. & Schär, C. Bayesian multi-model projection of climate: bias assumptions and interannual variability. Clim. Dyn. 33, 849–868 (2009).
Kang, E. L., Cressie, N. & Sain, S. R. Combining outputs from the North American Regional Climate Change Assessment Program by using a Bayesian hierarchical model. J. R. Stat. Soc. C 61, 291–313 (2012).
Reilly, J. et al. Uncertainty in climate change assessments. Science 293, 430–433 (2001).
Robert, C. P., & Casella, G. Monte Carlo Statistical Methods. pp. 649 (Springer, 2004).
Laine, M., Latva-Pukkila, N. & Kyrölä, E. Analysing time-varying trends in stratospheric ozone time series using the state space approach. Atmos. Chem. Phys. 14, 9707–9725 (2014).
Brogan, W. L. Modern Control Theory. pp. 736 (Prentice-Hall, 1974).
Neal, R. M. Slice sampling. Ann. Stat. 31, 705–767 (2003).
Giorgetta, M. A. et al. Climate and carbon cycle changes from 1850 to 2100 in MPI-ESM simulations for the Coupled Model Intercomparison Project phase 5. J. Adv. Model Earth Syst. 5, 1–26 (2013).
Zanchettin, D., Bothe, O., Müller, W., Bader, J. & Jungclaus, J. H. Different flavors of the Atlantic Multidecadal Variability. Clim. Dyn. 42, 381–399 (2014).
Moreno-Chamarro, E., Zanchettin, D., Lohman, K. & Jungclaus, J. H. An abrupt weakening of the subpolar gyre as trigger of Little Ice Age-type episodes. Clim. Dyn. 48, 727–744 (2017).
Balmaseda, M. A., Mogensen, K. & Weaver, A. T. Evaluation of the ECMWF ocean reanalysis system ORAS4. Q. J. R. Meteorol. Soc. 139, 1132–1161 (2013).
Mantua, N. J., Hare, S. R., Zhang, Y., Wallace, J. M. & Francis, R. C. A Pacific interdecadal climate oscillation with impacts on salmon production. Bull. Am. Meteorol. Soc. 78, 1069–1079 (1997).
Zanchettin, D. et al. A decadally delayed response of the tropical Pacific to Atlantic multidecadal variability. Geophys. Res. Lett. 43, 784–792 (2016).


Acknowledgements

The research leading to these results has received funding from the European Union, Seventh Framework Programme (FP7/2007-2013) under Grant agreement n° 603521 – PREFACE, and from the German Federal Ministry for Education and Research (BMBF) (MiKlip, project Nr. FKZ01LP1519A). The MPI-ESM simulations were carried out at the German Climate Computing Center, which also provided data services. We thank Wolfgang Mueller for having granted access to the MPI-ESM data. The statistical model was developed in MATLAB using the dlmsmo routine by Marko Laine (http://helios.fmi.fi/~lainema/dlm/). Part of the numerical calculations were done with the scientific computing system of Ca’Foscari (SCSCF). Primary data used in the analysis that may be useful in reproducing the authors’ work are archived by the Max Planck Institute for Meteorology and can be obtained by contacting miklip-mpi-esm@mpimet.mpg.de. We thank three anonymous reviewers, whose critical comments helped to improve the original study.

Author information

Authors and Affiliations

University Ca’Foscari of Venice, Dept. of Environmental Sciences, Informatics and Statistics, Via Torino 155, 30170, Mestre Venezia, Italy
Davide Zanchettin, Carlo Gaetan, Maeregu Woldeyes Arisido & Angelo Rubino
Max Planck Institute for Meteorology, Bundesstrasse 53, 20146, Hamburg, Germany
Kameswarrao Modali
Uni Research, Bjerknes Centre for Climate Research, Bergen, Norway
Thomas Toniazzo
Geophysical Institute, University of Bergen, Postboks 7803, 5020, Bergen, Norway
Noel Keenlyside

Contributions

C.G., D.Z. and A.R. conceived the study. D.Z. performed the DLM simulations and the analyses. K.M. calculated spatially-averaged indices. D.Z., T.T., K.M., C.G., M.A., N.K. and A.R. contributed to the discussion and writing the paper.

Corresponding author

Correspondence to Davide Zanchettin.

Ethics declarations

Competing Interests

The authors declare that they have no competing interests.

Additional information

Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Supplementary Information

Dataset 1

Supplementary Video

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Zanchettin, D., Gaetan, C., Arisido, M. et al. Structural decomposition of decadal climate prediction errors: A Bayesian approach. Sci Rep 7, 12862 (2017). https://doi.org/10.1038/s41598-017-13144-2

Received: 06 July 2017
Accepted: 18 September 2017
Published: 09 October 2017
DOI: https://doi.org/10.1038/s41598-017-13144-2

This article is cited by

Spatio-temporal quantification of climate model errors in a Bayesian framework
- Maeregu Woldeyes Arisido
- Carlo Gaetan
- Angelo Rubino
Stochastic Environmental Research and Risk Assessment (2019)
