Introduction

With the growth in computational capabilities, statistical models of increasing complexity are being used to make predictions under various design conditions. These models often contain uncertain parameters that must be estimated using data obtained from controlled experiments. While methods for parameter estimation have matured significantly, notable challenges remain before a statistical model and its estimated parameters can be considered reliable. One such challenge is the practical identifiability of the model parameters, defined as the possibility of estimating each parameter with high confidence when different forms of uncertainty, such as parameter, model-form, or measurement uncertainty, are present1,2,3. Low practical identifiability of the statistical model can lead to an ill-posed estimation problem, which becomes a critical issue when the parameters have a physical interpretation and decisions are to be made using their estimated values4, 5. Further, such an identifiability deficit can lead to unreliable model predictions, rendering such statistical models unsuitable for practical applications6, 7. Hence, for a reliable parameter estimation process and model prediction, it is of significant interest that practical identifiability be evaluated before any controlled experiment or parameter estimation study is conducted8, 9.

In frequentist statistics, the problem of practical identifiability is to examine the possibility of unique estimation of the model parameters \(\theta\)8. Under such considerations, methods examining identifiability are broadly classified into local and global identifiability methods. While the former examines the possibility that \(\theta =\theta ^k\) is a unique parameter estimate within its neighborhood \(N(\theta ^k)\) in the parameter space, the latter is concerned with the uniqueness of \(\theta ^k\) when considering the entire parameter space. Local sensitivity analysis has been widely used to find parameters that produce large variability in the model response10,11,12,13. In such an analysis, parameters resulting in large variability are considered relevant and therefore assumed to be identifiable for parameter estimation. However, parameters associated with large model sensitivities can still have poor identifiability characteristics14. Another class of frequentist identification methods is based on the analysis of the properties of the Fisher information matrix (FIM). Staley et al.15 proposed that positive definiteness of the FIM is a necessary and sufficient condition for the parameters to be considered practically identifiable. Similarly, Rothenberg16 showed that identifiability of the parameters is equivalent to non-singularity of the FIM. However, subsequent findings17, 18 have reported that models with a singular FIM can also be identifiable. Weijers et al.19 extended the classical FIM analysis and showed that even if an individual parameter has low identifiability, it can belong to an identifiable subset, such that the subset is practically identifiable. Parameters within such subsets have functional relationships with each other, thus resulting in a combined effect on the model response. It has been shown that such identifiable subsets can be found by examining the condition number (E-criterion) and determinant (D-criterion), and selecting parameter pairs with the smallest condition number and largest determinant. Similarly, Machado et al.20 considered the D to E ratio to examine practical identifiability and to find the identifiable subsets. Another popular identification technique is likelihood profiling4, 21,22,23. The method is based on finding the likelihood profile of a parameter by maximizing the likelihood with respect to the rest of the parameters. Parameters whose likelihood profile is shallow are deemed to have low practical identifiability. In addition to evaluating practical identifiability, likelihood profiling can also be used to find functional relationships between parameters, which is helpful for model reparameterization24, 25. However, due to the several re-optimizations required to obtain the likelihood profiles, the method does not scale well with the dimension of the parameter space and can quickly become computationally intractable. While methods based on the FIM or likelihood profiling have gained significant popularity, they only examine local identifiability. This means that the estimate of practical identifiability depends on the \(\theta ^k\) for which the analysis is conducted and is only valid within its neighborhood \(N(\theta ^k)\). To overcome the limitations of local identifiability, global identifiability methods using the Kullback-Leibler divergence26 and identifying functions27 have been proposed. However, such methods are computationally complex and not suitable for practical problems.
Moreover, since such methods are based on frequentist statistics, they are unable to account for parametric uncertainty and therefore unable to provide an honest representation of global practical identifiability.

There have been few studies examining global practical identifiability in a Bayesian framework. Early attempts were based on global sensitivity analysis (GSA), which apportions the variability (either by derivatives or variance) of the model output due to the uncertainty in each parameter6, 28,29,30. Unlike local sensitivity analysis, GSA-based methods simultaneously vary model parameters according to their distributions, thus providing a measure of global sensitivity that is independent of a particular parameter realization. However, global parameter sensitivity does not guarantee global practical identifiability31. Pant et al.32 and Capellari et al.33 formulated the problem of practical identifiability as gaining sufficient information about each model parameter from data. An information-theoretic approach was used to quantify the information gained, such that a larger information gain would mean higher practical identifiability. However, assumptions about the structure of the parameter-data joint distribution were made when developing the estimator. A similar approach was used by Ebrahimian et al.34, where the change in parameter uncertainty from the prior distribution to the estimated posterior distribution was used to quantify the information gained. Pant35 proposed information sensitivity functions by combining information theory and sensitivity analysis to quantify information gain. However, the joint distribution between the parameters and the data was assumed to be Gaussian.

Framed in a Bayesian setting, the information-theoretic approach to identifiability provides a natural extension to include the different forms of uncertainty that are present in practical problems. In this work, a novel estimator is developed from an information-theoretic perspective to examine the practical identifiability of a statistical model. The expected information gained from the data about each model parameter is used as a metric to quantify practical identifiability. In contrast to the aforementioned methods based on information theory, the proposed approach has the following novel advantages: first, the estimator for information gain can be used for an a priori analysis, that is, no data is required to evaluate practical identifiability; second, the framework can account for different forms of uncertainty, such as model-form, parameter, and measurement uncertainty; third, the framework does not make assumptions about the joint distribution between the data and parameters, as in previous methods; fourth, the identifiability analysis is global, rather than being dependent on a particular realization of the model parameters. Another contribution of this work is an information-theoretic estimator that highlights, in an a priori manner, dependencies between parameter pairs that emerge a posteriori. By combining the information gained about each parameter with the parameter dependencies obtained using the proposed approach, it is possible to find parameter subsets that can be estimated with high posterior certainty before any controlled experiment is performed. Broadly, this can dramatically reduce the cost of parameter estimation, inform model-form selection or refinement, and associate a degree of reliability with the parameter estimation.

The manuscript is organized as follows. In "Bayesian parameter inference" the Bayesian paradigm for parameter estimation is presented. In "Quantifying information gain" differential entropy and mutual information are presented as information-theoretic tools to quantify the uncertainty associated with random variables and the information gain, respectively. In "Estimating practical identifiability" an a priori estimator is developed to quantify global practical identifiability in a Bayesian construct. In "Estimating parameter dependence" the problem of estimating parameter dependencies is addressed, and an a priori estimator is developed to quantify parameter dependencies that emerge a posteriori. The practical identifiability framework is applied to a linear Gaussian statistical model and a methane-air reduced kinetics model; results are presented in "Numerical experiments". Concluding remarks are presented in "Concluding remarks and perspectives".

Quantifying practical identifiability in a Bayesian setting

In this section, we first present the Bayesian framework for parameter estimation. Next, we utilize the concepts of differential entropy and mutual information from information theory to quantify information contained in the data about uncertain parameters of the statistical model. Thereafter, we extend the idea of mutual information to develop an a priori estimator to quantify practical identifiability in a Bayesian setting. While in most statistical models low practical identifiability is due to insufficient information about model parameters, it may often be the case that identifiable subsets exist. Parameters within such subsets have functional relations and exhibit a combined effect on the statistical model, such that the subset is practically identifiable. To find such identifiable subsets, we develop an estimator to highlight dependencies between parameter pairs that emerge a posteriori.

Bayesian parameter inference

Consider the observation/data \(y\in \mathcal {Y}\) of a physical system, which is a realization of a random variable \(\text{Y}:\Omega \rightarrow {{\mathbb {R}}^{n}}\) distributed as p(y), where \(\mathcal {Y}\) is the set of all possible realizations of the random variable. Herein, we will use the same lower-case, upper-case, and calligraphic notation to represent a realization, a random variable, and the set of all possible realizations, respectively. Consider another real-valued random variable \(\Theta :\Omega \rightarrow {{\mathbb {R}}^{m}}\) distributed as \(p(\theta ): {{\mathbb {R}}^{m}}\rightarrow {{\mathbb {R}}^{+}}\) which denotes the uncertain parameters of the model. The data is assumed to be generated by the statistical model given as

$$\begin{aligned} y\triangleq \mathscr {F}(\theta , d)+\xi , \end{aligned}$$
(1)

where \(\mathscr {F}(\theta , d): {{\mathbb {R}}^{m}}\times {{\mathbb {R}}^{\ell }}\rightarrow {{\mathbb {R}}^{n}}\) is the forward model which maps the parameters and model inputs \(d\in {{\mathbb {R}}^{\ell }}\) to the prediction space. For simplicity, the model input d is considered known. The random variable \(\xi\) represents the additive measurement noise, that is, the uncertainty in the measurement. Once the observations are collected using controlled experiments, the prior belief about the parameter distribution \(p(\theta | d)\) can be updated to obtain the posterior distribution \(p(\theta | y, d)\) via Bayes’ rule

$$\begin{aligned} p(\theta | y, d) = \frac{p(y\mid \theta , d)p(\theta \mid d)}{p(y\mid d)}, \end{aligned}$$
(2)

where \(p(y\mid \theta , d)\) is called the model likelihood and \(p(y\mid d)\) is called the evidence.
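As a concrete illustration of (1) and (2), the short Python sketch below evaluates the log-posterior up to the evidence \(p(y\mid d)\), which is simply the normalizing constant; the polynomial forward model, Gaussian prior, and Gaussian additive noise are arbitrary stand-ins chosen purely for demonstration.

```python
import numpy as np
from scipy import stats

def forward_model(theta, d):
    # Stand-in forward model F(theta, d); any simulator could be used here
    return theta[0] * d + theta[1] * d**2

d = np.linspace(-1.0, 1.0, 20)                  # known model inputs
sigma_xi = 0.3                                  # measurement noise std. dev.
prior = stats.multivariate_normal(mean=np.zeros(2), cov=np.eye(2))

rng = np.random.default_rng(0)
theta_true = np.array([1.0, -0.5])
y = forward_model(theta_true, d) + sigma_xi * rng.normal(size=d.size)   # data generated as in (1)

def log_unnormalized_posterior(theta):
    # log p(y | theta, d) + log p(theta | d): Bayes' rule (2) up to the evidence p(y | d)
    log_likelihood = stats.norm.logpdf(y, loc=forward_model(theta, d), scale=sigma_xi).sum()
    return log_likelihood + prior.logpdf(theta)

print(log_unnormalized_posterior(theta_true))
```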

Quantifying information gain

Updating parameter belief from the prior to posterior in (2) is associated with a gain in information from the data. This gain can be quantified as the change in the uncertainty of the parameters \(\Theta\). As an example, consider a 1D Gaussian prior and posterior distribution such that the information gain can be quantified as a change in variance (a measure of uncertainty) of the parameter distribution. A greater reduction in parameter uncertainty is a consequence of more information gained from the data.

In general, the change in parameter uncertainty between the prior and posterior distributions for a given input of the model \(d\in \mathcal {D}\) is defined as

$$\begin{aligned} \Delta {\mathscr {U}}(d, y) \triangleq {\mathscr {U}}(p(\theta \mid d)) - {\mathscr {U}}(p(\theta \mid y, d)), \end{aligned}$$
(3)

where \({\mathscr {U}}\) is an operator quantifying the amount of uncertainty or the lack of information for a given probability distribution. Thus, the expected information gained about the parameters is defined as

$$\begin{aligned} \Delta {\mathscr {U}}(d) \triangleq {\mathscr {U}}(p(\theta \mid d)) - \int \limits _{\mathcal {Y}}{\mathscr {U}}(p(\theta \mid y, d))\, p(y\mid d) \,\text{d}y. \end{aligned}$$
(4)

One popular choice for the operator \({\mathscr {U}}\) is the differential entropy32, 33, 36 which is defined as the average Shannon information37 for a given probability distribution. Mathematically, for a continuous random variable \(Z:\Omega \rightarrow {{\mathbb {R}}^{t}}\) with distribution \(p(z): {{\mathbb {R}}^{t}}\rightarrow {{\mathbb {R}}^{+}}\) and support \(\mathcal {Z}\), the differential entropy is defined as

$$\begin{aligned} H(p(z)) = H(Z) \triangleq -\int \limits _{\mathcal {Z}} p(z) \log p(z) \,\text{d}z. \end{aligned}$$
(5)
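As a quick numerical check of (5), the sketch below estimates the differential entropy of a univariate Gaussian by Monte Carlo and compares it with the closed form \(\tfrac{1}{2}\log (2\pi e\sigma ^2)\); the variance and sample size are arbitrary choices.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
sigma = 2.0
z = rng.normal(0.0, sigma, size=200_000)

# H(Z) = -E[log p(Z)] estimated by Monte Carlo for Z ~ N(0, sigma^2)
h_mc = -np.mean(stats.norm.logpdf(z, scale=sigma))
h_exact = 0.5 * np.log(2.0 * np.pi * np.e * sigma**2)
print(h_mc, h_exact)        # both approximately 2.11 nats
```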

Using differential entropy to quantify the uncertainty of a probability distribution, the change in uncertainty (or expected information gain) of \(\Theta\) can be evaluated as

$$\begin{aligned} \Delta {\mathscr {U}}(d)&= H(\Theta \mid d) - H(\Theta \mid Y,d), \end{aligned}$$
(6a)
$$\begin{aligned}{}&= H(\Theta \mid d) + H(Y\mid d) - H(\Theta , Y \mid d) , \end{aligned}$$
(6b)
$$\begin{aligned}{}&= -\int \limits _{{\Theta }}p(\theta \mid d)\log p(\theta \mid d) \,\text{d}\theta + \int \limits _{{\Theta }, \mathcal {Y}} p(\theta , y\mid d) \log p(\theta \mid y, d) \,\text{d}\theta \,\text{d}y,\end{aligned}$$
(6c)
$$\begin{aligned}{}&= \int \limits _{{\Theta }, \mathcal {Y}} p(\theta , y\mid d) \log \frac{p(\theta , y\mid d)}{p(\theta \mid d)p(y\mid d)} \,\text{d}\theta \,\text{d}y,\end{aligned}$$
(6d)
$$\begin{aligned}{}&\triangleq I(\Theta ;Y\mid d). \end{aligned}$$
(6e)

The quantity \(I(\Theta ;Y \mid d)\) is called the mutual information between the random variables \(\Theta\) and Y given the model inputs \(\mathcal {D}=d\)38. Mutual information is measured in bits when the logarithm is taken to base two, and in nats when the natural logarithm is used, as is the case for the continuous random variables considered here.
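Similarly, (6e) can be checked on a jointly Gaussian pair \((\Theta , Y)\) with correlation \(\rho\), for which the closed form is \(I(\Theta ;Y) = -\tfrac{1}{2}\log (1-\rho ^2)\); the correlation value in the sketch below is arbitrary.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
rho = 0.8
cov = np.array([[1.0, rho], [rho, 1.0]])
samples = rng.multivariate_normal(np.zeros(2), cov, size=200_000)

joint = stats.multivariate_normal(np.zeros(2), cov)
# I(Theta; Y) = E[log p(theta, y) - log p(theta) - log p(y)]
mi_mc = np.mean(joint.logpdf(samples)
                - stats.norm.logpdf(samples[:, 0])
                - stats.norm.logpdf(samples[:, 1]))
print(mi_mc, -0.5 * np.log(1.0 - rho**2))     # both approximately 0.51 nats
```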

Remark 1

The mutual information \(I(\Theta ; Y\mid d)\) is always non-negative38. This means that, on average over the data, updating the parameter belief from the prior to the posterior cannot increase parameter uncertainty.

Estimating practical identifiability

In a Bayesian framework where the parameters are treated as random variables, practical identifiability can be determined by examining information gained about each model parameter32, 35. Parameters for which the data is uninformative cannot be estimated with a high degree of confidence and therefore are practically unidentifiable. While mutual information in (6e) is a useful quantity to study information gained from data about the entire parameter set, it does not apportion information gained about each parameter. Therefore, to examine practical identifiability, we define a conditional mutual information

$$\begin{aligned} I(\Theta _{i};Y\mid \Theta _{\sim i}, d)\triangleq & {} \, {\mathbb {E}}_{\Theta _{\sim i}}[I(\Theta _i;Y\mid \Theta _{\sim i}= \theta _{\sim i}, d)], \end{aligned}$$
(7)

where \(\Theta _{\sim i}\) denotes all parameters except \(\Theta _i\) and \({\mathbb {E}}_{\Theta _{\sim i}}[\cdot ]\) denotes the expectation over \(p(\theta _{\sim i} \mid d)\). Using this conditional mutual information to assess practical identifiability is based on the intuition that, on average, a high information gain about \(\Theta _i\) implies high practical identifiability. We can thus present the following definitions for identifiability in a Bayesian setting.

Definition 1

(Local identifiability) Given a statistical model with parameters \(\Theta\), a parameter \(\Theta _i\in \Theta\) is said to be locally identifiable if sufficient information is gained about it for a particular realization \(\theta _{\sim i}\) of \(\Theta _{\sim i}\).

Definition 2

(Global identifiability) Given a statistical model with parameters \(\Theta\), a parameter \(\Theta _i\in \Theta\) is said to be globally identifiable if sufficient information is gained about it on average with respect to the distribution \(p(\theta _{\sim i}{\mid d})\).

The expectation over possible realizations of \(\Theta _{\sim i}\) in (7) therefore provides a statistical measure of global practical identifiability8. By contrast, evaluating (7) at a fixed \(\theta _{\sim i}\) will result in a local identifiability measure, which means that the information gained about \(\Theta _i\) will implicitly depend on \(\theta _{\sim i}\).

Typically, (7) does not have a closed-form expression and must be estimated numerically. Using the definition of differential entropy in (5) the conditional mutual information can be written as

$$\begin{aligned} I(\Theta _i;Y\mid \Theta _{\sim i}, d)&= \int \limits _{{\Theta _{i},\Theta _{\sim i},} \mathcal {Y}} p(\theta _i,\theta _{\sim i}, y\mid d) \log \frac{p(\theta _i, y \mid \theta _{\sim i}, d)}{p(\theta _i\mid \theta _{\sim i}, d)p(y\mid \theta _{\sim i}, d)} \,\text{d}\theta _i \,\text{d}\theta _{\sim i} \,\text{d}y, \end{aligned}$$
(8a)
$$\begin{aligned}{}&= \int \limits _{{{\Theta _{i}},{\Theta _{\sim i}},} \mathcal {Y}} p({\theta _i},{\theta _{\sim i}}, y\mid d) \log \frac{p(y\mid {\theta _i}, {\theta _{\sim i}}, d)p({\theta _i}\mid {\theta _{\sim i}}, d)}{p({\theta _i}\mid {\theta _{\sim i}}, d)p(y\mid {\theta _{\sim i}}, d)} \,\text{d}{\theta _i} \,\text{d}{\theta _{\sim i}} \,\text{d}y, \end{aligned}$$
(8b)
$$\begin{aligned}{}&= \int \limits _{{{\Theta _{i}},{\Theta _{\sim i}},} \mathcal {Y}} p({\theta _i,}{\theta _{\sim i}}, y\mid d) \log \frac{p(y\mid {\theta _i}, {\theta _{\sim i}}, d)}{p(y\mid {\theta _{\sim i}}, d)} \,\text{d}{\theta _i} \,\text{d}{\theta _{\sim i}} \,\text{d}y. \end{aligned}$$
(8c)

Remark 2

In terms of differential entropy, the conditional mutual information in (7) can be defined as

$$\begin{aligned} I(\Theta _{i};Y\mid \Theta _{\sim i}, d)&\triangleq H(\Theta _i\mid \Theta _{\sim i}, d) - H(\Theta _i\mid \Theta _{\sim i}, Y, d), \end{aligned}$$
(9a)
$$\begin{aligned}{}&= H(\Theta _i, \Theta _{\sim i}\mid d) + H(\Theta _{\sim i}, Y\mid d)- H(\Theta _{\sim i}\mid d) - H(\Theta _i, \Theta _{\sim i}, Y\mid d). \end{aligned}$$
(9b)

If the parameters are mutually independent a priori, so that \(H(\Theta _i\mid \Theta _{\sim i}, d) = H(\Theta _i\mid d)\), then

$$\begin{aligned} I(\Theta _i;Y\mid \Theta _{\sim i}, d) = H(\Theta _i\mid d) + H(\Theta _{\sim i}, Y\mid d) - H(\Theta _i, \Theta _{\sim i}, Y\mid d). \end{aligned}$$
(10)

While this formulation does not involve any conditional distributions involving the parameters or data, it requires joint distributions, namely, \(p(\theta _i, \theta _{\sim i}\mid d)\), \(p(\theta _{\sim i}, y\mid d)\), \(p(\theta _i, \theta _{\sim i}, y\mid d)\). Typically, such joint distributions do not have a closed-form expression and must be approximated.

In the special case where \(\Theta _i\) is perfectly correlated with \(\Theta _{\sim i}\), such that the realization \(\theta _{\sim i}\) provides complete information about \(\theta _i\), the term inside the logarithm in (8c) becomes identically unity. For such a case, the data is not informative about \(\Theta _i\) and the effective parameter dimensionality \(m_{\text {eff}}\) becomes less than m. For the more general case, Monte-Carlo integration can be used to approximate the high-dimensional integral as

$$\begin{aligned} I(\Theta _{i};Y\mid \Theta _{\sim i}, d) \approx {\hat{I}}(\Theta _{i};Y\mid \Theta _{\sim i}, d) = \frac{1}{n_{\text {outer}}}\sum _{k=1}^{n_{\text {outer}}}\log \frac{p(y^{k}\mid \theta _{i}^{k}, \theta _{\sim i}^{k}, d)}{p(y^{k}\mid \theta _{\sim i}^{k}, d)}, \end{aligned}$$
(11)

where \((\theta _{i}^{k},\theta _{\sim i}^{k})\) is drawn from the distribution \(p(\theta _i, \theta _{\sim i}\mid d)\); \(y^{k}\) is drawn from the likelihood distribution \(p(y\mid \theta _{i}^k, \theta _{\sim i}^k, d)\); and \(n_{\text {outer}}\) is the number of Monte-Carlo samples. Typically, conditional evidence \(p(y\mid \theta _{\sim i}, d)\) does not have a closed-form expression, and therefore \(p(y^k\mid \theta _{\sim i}^k, d)\) must be numerically approximated. One approach is to rewrite the conditional evidence \(p(y^k\mid \theta _{\sim i}^k, d)\) by means of marginalization as

$$\begin{aligned} p(y^{k}\mid \theta _{\sim i}^{k}, d) \triangleq \int \limits _{{\Theta _{i}}} p(y^{k},\theta _{i}\mid \theta _{\sim i}^{k}, d) \,\text{d}\theta _{i} = \int \limits _{{\Theta _{i}}} p(y^{k}\mid \theta _{i}, \theta _{\sim i}^{k}, d)p(\theta _i\mid \theta _{\sim i}^k, d) \,\text{d}\theta _{i}. \end{aligned}$$
(12)

For simplicity, assume that the parameters are uncorrelated prior to observing the data, and are also independent of the model inputs d. As a result, (12) can be re-written as

$$\begin{aligned} p(y^{k}\mid \theta _{\sim i}^{k}, d) = \int \limits _{{\Theta _{i}}} p(y^{k}\mid \theta _{i}, \theta _{\sim i}^{k}, d)p(\theta _i) \,\text{d}\theta _{i}. \end{aligned}$$
(13)

This results in a low-dimensional integral over a univariate prior distribution \(p(\theta _i)\). Evaluating (13) using the classical Monte-Carlo integration can dramatically increase the overall cost of estimating the conditional mutual information in (11), especially if the likelihood evaluation is computationally expensive. In the special case where the priors are normally distributed, this cost can be reduced by considering a \(\zeta\)-point Gaussian quadrature rule. Using the quadrature approximation in (13) gives

$$\begin{aligned} p(y^{k}\mid \theta _{\sim i}^{k}, d) \approx {\hat{p}}(y^{k}\mid \theta _{\sim i}^{k}, d) = \sum _{\zeta =1}^{\zeta =n_{\text {inner}}} \Big [p(y^{k}\mid \theta _{i}^{\zeta }, \theta _{\sim i}^{k}, d)\Big ]\gamma ^{\zeta }, \end{aligned}$$
(14)

where \(\theta _{i}^{\zeta }\) and \(\gamma ^{\zeta }\) are the \(\zeta ^{th}\) quadrature point and weight, respectively; \(n_{\text {inner}}\) is the number of quadrature points. Here, we use the Gauss-Hermite quadrature rule, which uses the \(t^{th}\) order Hermite polynomial and will be exact for polynomials up to order \(2t-1\)39. In a much more general case where the prior distributions can be non-Gaussian (however, can still be evaluated), the cost of estimating (13) can be reduced by using importance sampling with a proposal distribution \(q(\theta _i)\). Using importance sampling we can rewrite (13) as

$$\begin{aligned} p(y^{k}\mid \theta _{\sim i}^{k}, d) = \int \limits _{{\Theta _{i}}} \Big [p(y^{k}\mid \theta _{i}, \theta _{\sim i}^{k}, d)w(\theta _i)\Big ] q(\theta _i) \,\text{d}\theta _{i}, \end{aligned}$$
(15)

where \(w(\theta _i) = p(\theta _i)/q(\theta _i)\) are the importance sampling weights. In the case where the proposal distribution \(q(\theta _i)\) is Gaussian, the quadrature rule can be applied to (15) as

$$\begin{aligned} p(y^{k}\mid \theta _{\sim i}^{k}, d) \approx {\hat{p}}(y^{k}\mid \theta _{\sim i}^{k}, d) = \sum _{\zeta =1}^{\zeta =n_{\text {inner}}} \Big [p(y^{k}\mid \theta _{i}^{\zeta }, \theta _{\sim i}^{k}, d)w(\theta _i^{\zeta })\Big ]\gamma ^{\zeta }. \end{aligned}$$
(16)

Combining the estimator for conditional evidence ((14) or (16)) with (11) results in a biased estimator for conditional mutual information40, 41. While the variance is controlled by the numerical accuracy of estimating the high-dimensional integral in (11), the bias is governed by the accuracy of approximating the conditional evidence in (12). This means that the variance is controlled by \(n_{\text {outer}}\) Monte-Carlo samples and bias by \(n_{\text {inner}}\) quadrature points.
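To make the estimator concrete, the following sketch implements (11) together with the Gauss-Hermite approximation (14) for a small, illustrative linear Gaussian model with independent standard-normal priors and additive Gaussian noise; the feature matrix, noise level, and sample sizes are arbitrary choices, and the numpy/scipy routines are one possible implementation rather than the one used in this work.

```python
import numpy as np
from scipy import stats
from scipy.special import logsumexp

rng = np.random.default_rng(0)

# Illustrative statistical model: y = A theta + xi, with independent N(0, 1) priors
d = np.linspace(-1.0, 1.0, 25)
A = np.column_stack([d, d**2, d**3])
m, sigma_xi = A.shape[1], 0.3

def log_likelihood(y, theta):
    return stats.norm.logpdf(y, loc=A @ theta, scale=sigma_xi).sum()

# Gauss-Hermite nodes/weights, rescaled for a standard-normal prior on theta_i
gh_x, gh_w = np.polynomial.hermite.hermgauss(10)
gh_nodes = np.sqrt(2.0) * gh_x

def conditional_mutual_information(i, n_outer=2000):
    """Estimate I(Theta_i; Y | Theta_{~i}, d) as in (11), with (14) for the conditional evidence."""
    total = 0.0
    for _ in range(n_outer):
        theta = rng.normal(size=m)                              # (theta_i^k, theta_{~i}^k) ~ prior
        y = A @ theta + sigma_xi * rng.normal(size=d.size)      # y^k ~ likelihood
        log_num = log_likelihood(y, theta)                      # p(y^k | theta_i^k, theta_{~i}^k, d)
        # Conditional evidence p(y^k | theta_{~i}^k, d): marginalize theta_i by quadrature
        log_terms = []
        for node in gh_nodes:
            t = theta.copy()
            t[i] = node
            log_terms.append(log_likelihood(y, t))
        log_den = logsumexp(log_terms, b=gh_w) - 0.5 * np.log(np.pi)
        total += log_num - log_den
    return total / n_outer

for i in range(m):
    print(f"I(Theta_{i+1}; Y | Theta_~{i+1}, d) ~ {conditional_mutual_information(i):.3f} nats")
```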

In practice, estimating conditional evidence can become computationally expensive, especially when the variability in the output of the forward model is high with respect to \(\Theta _i\) given \(\Theta _{\sim i} = \theta _{\sim i}\), that is, large \(\nabla _{\theta _{i}} {\mathscr {F}}(\theta , d)|_{\Theta _{\sim i}=\theta _{\sim i}}\). For such statistical models, conditional evidence can become near zero such that numerical approximation by means of vanilla Monte-Carlo integration or Gaussian quadrature in (14) can be challenging41. Using an estimator based on importance sampling for conditional evidence as shown in (16) can alleviate this problem by carefully choosing the density of the proposal \(q(\theta _i)\). As an example, consider the case where the additive measurement noise \(\xi\) is normally distributed as \(\mathcal {N}(0, \Gamma )\) such that the likelihood of the model is distributed as \(p(y\mid \theta )=\mathcal {N}({\mathscr {F}}(\theta , d), \Gamma )\), and \(y^{k}\) is sampled according to \(\mathcal {N}({\mathscr {F}}(\theta _{i}^{k}, \theta _{\sim i}^{k}, d), \Gamma )\). In the case where model predictions have large variability with respect to the parameter \(\Theta _i\) for a given \(\Theta _{\sim i}=\theta _{\sim i}\) the model likelihood can become small. For such a case, the importance-sampling-based estimator given in (16) can be used by constructing a proposal around the sample \(\theta _{i}^{k}\), such as \(q(\theta _{i})=\mathcal {N}(\theta _{i}^{k}, {\sigma ^2_{\text {proposal}}})\) where \({\sigma ^2_{\text {proposal}}}\) is the variance of the proposal distribution. This results in a robust estimation of conditional evidence and prevents infinite values for conditional mutual information. Here, we consider (16) to estimate conditional evidence.
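A sketch of the importance-sampling variant (16) with a Gaussian proposal centered at the outer sample \(\theta _{i}^{k}\) follows; the proposal variance, the standard-normal prior, and the helper signature `log_likelihood_i` are illustrative assumptions, and the same Gauss-Hermite change of variables as above is applied to the proposal rather than to the prior.

```python
import numpy as np
from scipy import stats
from scipy.special import logsumexp

def conditional_evidence_is(y_k, log_likelihood_i, theta_i_k,
                            prior=stats.norm(0.0, 1.0),
                            sigma_proposal=0.5, n_inner=10):
    """Approximate p(y^k | theta_{~i}^k, d) as in (16): Gauss-Hermite quadrature under the
    proposal q(theta_i) = N(theta_i^k, sigma_proposal^2), with weights w = p(theta_i)/q(theta_i).
    `log_likelihood_i(theta_i)` must return log p(y^k | theta_i, theta_{~i}^k, d)."""
    x, w = np.polynomial.hermite.hermgauss(n_inner)
    nodes = theta_i_k + np.sqrt(2.0) * sigma_proposal * x          # proposal-centered nodes
    q = stats.norm(theta_i_k, sigma_proposal)
    log_terms = [log_likelihood_i(t) + prior.logpdf(t) - q.logpdf(t) for t in nodes]
    return np.exp(logsumexp(log_terms, b=w) - 0.5 * np.log(np.pi))

# Toy check: y = theta_i + noise, so the conditional evidence should be N(y; 0, 1 + 0.25)
sigma_xi = 0.5
log_like = lambda t: stats.norm.logpdf(0.7, loc=t, scale=sigma_xi)
print(conditional_evidence_is(0.7, log_like, theta_i_k=0.6),
      stats.norm.pdf(0.7, scale=np.sqrt(1.0 + sigma_xi**2)))       # both values should be close
```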

Remark 3

Assessing the practical identifiability in a Bayesian framework is dependent on the prior distribution. Although the framework presented in this article is entirely an a priori analysis of practical identifiability, prior selection can affect estimated identifiability. Prior selection in itself is an extensive area of research and is not considered a part of this work.

Physical interpretation of identifiability in an information-theoretic framework

Assessing practical identifiability using the conditional mutual information described in (7) provides a relative measure of how many bits (or nats) of information are gained about a particular parameter. In practical applications where this information gain can vary on disparate scales, it is useful to associate a physical interpretation with identifiability. Following Pant et al.32, consider a hypothetical direct observation statistical model given as \(\psi \triangleq \theta _{i} + \Lambda\), where \(\Lambda \sim \mathcal {N}(0, \sigma ^2_{\Lambda })\) is the additive measurement noise. Given this observation model, we can define an information gain equivalent variance \(\mathscr {C}(\Theta _i)\) as the measurement variance \(\sigma ^2_{\Lambda }\) of the direct observation model for which \(I(\Theta _i;\Psi ) = {\hat{I}}(\Theta _i;Y\mid \Theta _{\sim i}, d)\). A large \(\mathscr {C}(\Theta _i)\) means that the information gained about \(\Theta _i\) (using (7)) for the statistical model (1) is equivalent to observing the parameter directly under high measurement uncertainty, that is, the data is only weakly informative about \(\Theta _i\).

If the prior distribution \(p(\theta _i)\) can be approximated by means of an equivalent normal distribution \(\mathcal {N}(\mu _{e}, \sigma ^2_{e})\) then \(I(\Theta _i;\Psi )\) is given as

$$\begin{aligned} I(\Theta _i;\Psi ) \triangleq \frac{1}{2}\log \Big ( 1 + \frac{\sigma ^2_{e}}{\sigma ^2_{\Lambda }}\Big ), \end{aligned}$$
(17)

such that

$$\begin{aligned} \mathscr {C}(\Theta _i) \triangleq \sigma ^2_{\Lambda } = \sigma ^2_{e} (\exp \{2{\hat{I}}(\Theta _i;Y\mid \Theta _{\sim i}, d)\} - 1)^{-1}. \end{aligned}$$
(18)

This information gain equivalent variance depends only on the information gained about the model parameter and its (equivalent Gaussian) prior variance, and can thus be used as a metric to compare different model parameters on a common, physically interpretable scale.
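The mapping from an estimated information gain to the equivalent direct-observation noise variance in (18) is a one-liner; the numbers below are placeholders chosen only to exercise the formula.

```python
import numpy as np

def info_gain_equivalent_variance(info_gain_nats, sigma_e_sq):
    """C(Theta_i) from (18): the noise variance of a direct observation model
    that would yield the same mutual information."""
    return sigma_e_sq / (np.exp(2.0 * info_gain_nats) - 1.0)

# e.g. a parameter with a unit-variance (equivalent Gaussian) prior and 1.2 nats of information gain
print(info_gain_equivalent_variance(info_gain_nats=1.2, sigma_e_sq=1.0))   # ~0.10
```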

Estimating parameter dependence

In most statistical models, unknown functional relationships or dependencies may be present between parameters such that multiple parameters have a combined effect on the statistical model. Such parameters can form an identifiable subset in which an individual parameter exhibits low identifiability; however, the subset is collectively identifiable. This means that the data is uninformative or weakly informative about an individual parameter within the subset, whereas it is informative about the entire subset. As an example, consider the statistical model \(y = \theta _1\theta _2 d + \xi\), for which individually identifying \(\Theta _1\) or \(\Theta _2\) is not possible as they have a combined effect on the statistical model. However, it is clear that \(\Theta _1\) and \(\Theta _2\) belong to an identifiable subset such that the pair \((\Theta _1, \Theta _2)\) is identifiable. Thus, the statistical model given by \(y = \theta _3 d + \xi\), where \(\theta _3=\theta _1\theta _2\), will have better identifiability characteristics. For such statistical models, the traditional method of examining correlations between parameters is often insufficient, as it only reveals linear functional relations between random variables.
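The \(\theta _1\theta _2\) example can be made concrete with a brute-force grid posterior (a sketch with arbitrary prior, noise, and grid settings): the posterior concentrates along the hyperbola \(\theta _1\theta _2 \approx \text{const}\) rather than at a point, which is the signature of an identifiable subset.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
d = np.linspace(0.1, 1.0, 30)
sigma_xi = 0.2
y = 1.5 * 2.0 * d + sigma_xi * rng.normal(size=d.size)      # data from theta_1^* = 1.5, theta_2^* = 2.0

# Brute-force log-posterior on a grid, with N(0, 2^2) priors on both parameters
t1, t2 = np.meshgrid(np.linspace(0.1, 4.0, 200), np.linspace(0.1, 4.0, 200))
log_post = (stats.norm.logpdf(y, loc=t1[..., None] * t2[..., None] * d, scale=sigma_xi).sum(axis=-1)
            + stats.norm.logpdf(t1, scale=2.0) + stats.norm.logpdf(t2, scale=2.0))

# The posterior mass lies along the ridge theta_1 * theta_2 ~ 3: the product is well determined
# even though neither parameter is individually pinned down (a contour plot makes this visible).
i = np.unravel_index(np.argmax(log_post), log_post.shape)
print(t1[i] * t2[i])
```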

To highlight the parameter dependencies, consider the statistical model given in (1) such that we are interested in examining the relations between \(\Theta _i\in \Theta\) and \(\Theta _j\in \Theta\) that emerge a posteriori. While the conditional mutual information presented in "Estimating practical identifiability" provides information on the practical identifiability of an individual parameter, it does not provide information about dependencies developed between pairs of parameters. To quantify such dependencies, we define a conditional mutual information between parameter pairs

$$\begin{aligned} I(\Theta _i;\Theta _j\mid Y, \Theta _{\sim i, j}, d) \triangleq {\mathbb {E}}_{\Theta _{\sim i, j}}[{\mathbb {E}}_{Y}[I(\Theta _i;\Theta _j\mid Y=y, \Theta _{\sim i, j}=\theta _{\sim i, j}, d)]], \end{aligned}$$
(19)

which evaluates the average information between the variables \(\Theta _i\) and \(\Theta _j\) that is obtained a posteriori. Here, \(\Theta _{\sim i, j}\) is defined as all the parameters of the statistical model except \(\Theta _i\) and \(\Theta _j\).

A closed-form expression for (19) is typically not available, and therefore a numerical approximation is required. In integral form, (19) is given as

$$\begin{aligned} \begin{aligned} I(\Theta _{i};\Theta _{j}\mid Y, \Theta _{\sim i, j}, d) \triangleq \int \limits _{{\Theta _{i}, \Theta _{j}, \Theta _{\sim i, j}}, \mathcal {Y}} p(\theta _{i}, \theta _{j}, \theta _{\sim i, j}, y\mid d) \Big [\log \big [ p(\theta _{i}, \theta _j \mid y, \theta _{\sim i, j}, d)\big ]- \\ \log \big [p(\theta _{i}\mid y, \theta _{\sim i, j}, d)p(\theta _j\mid y, \theta _{\sim i, j}, d)\big ]\Big ] \,\text{d}\theta _{i} \,\text{d}\theta _{j} \,\text{d}\theta _{\sim i, j} \,\text{d}y, \end{aligned} \end{aligned}$$
(20)

where

$$\begin{aligned} p(\theta _i, \theta _j\mid y, \theta _{\sim i, j}, d)&\triangleq \frac{p(y\mid \theta _i, \theta _j, \theta _{\sim i, j}, d)p(\theta _i, \theta _j, \mid \theta _{\sim i, j}, d)}{p(y\mid \theta _{\sim i, j}, d)},\end{aligned}$$
(21a)
$$\begin{aligned} p(\theta _i \mid y, \theta _{\sim i, j}, d)&\triangleq \frac{p(y\mid \theta _i,\theta _{\sim i, j}, d)p(\theta _i\mid \theta _{\sim i, j}, d)}{p(y\mid \theta _{\sim i, j}, d)}, \end{aligned}$$
(21b)
$$\begin{aligned} p(\theta _j \mid y, \theta _{\sim i, j}, d)&\triangleq \frac{p(y\mid \theta _j,\theta _{\sim i, j}, d)p(\theta _j\mid \theta _{\sim i, j}, d)}{p(y\mid \theta _{\sim i, j}, d)}, \end{aligned}$$
(21c)

via Bayes’ theorem.

Remark 4

In terms of differential entropy, the conditional mutual information in (19) can be defined as

$$\begin{aligned} I(\Theta _i;\Theta _j\mid Y, \Theta _{\sim i, j}, d)&\triangleq H(\Theta _i\mid Y, \Theta _{\sim i, j}, d) - H(\Theta _i\mid \Theta _j, Y, \Theta _{\sim i, j}, d), \end{aligned}$$
(22a)
$$\begin{aligned}{}&= H({\Theta _i}, {\Theta _{\sim i, j}}, Y\mid d) + H({\Theta _j}, {\Theta _{\sim i, j}}, Y\mid d) \\&\qquad -H({\Theta _{\sim i,j}}, Y\mid d) - H({\Theta _i}, {\Theta _j},{\Theta _{\sim i, j}}, Y\mid d). \end{aligned}$$
(22b)

Such a formulation requires evaluating joint distributions, namely, \(p(\theta _i, \theta _{\sim i, j}, y\mid d)\), \(p(\theta _j, \theta _{\sim i, j}, y\mid d)\), \(p(\theta _{\sim i, j}, y\mid d)\), and \(p(\theta _i, \theta _j, \theta _{\sim i, j}, y\mid d)\). Typically, such joint distributions do not have a closed-form expression and must be approximated.

For the sake of illustration, assume that the parameters are mutually independent prior to observing the data. As a consequence of this assumption, any relations developed between \(\Theta _i\) and \(\Theta _j\) are discovered solely from the data. Furthermore, it is also reasonable to assume that the prior knowledge of the parameters is independent of the input of the model d. Substituting (21a) through (21c) into (20) we obtain

$$\begin{aligned} \begin{aligned} I(\Theta _{i};\Theta _{j}\mid Y, \Theta _{\sim i, j}, d) = \int \limits _{{\Theta _{i}, \Theta _{j}, \Theta _{\sim i, j}}, \mathcal {Y}}&p(\theta _{i}, \theta _{j}, \theta _{\sim i, j}, y\mid d) \Big [\log \big [p(y\mid \theta _{i}, \theta _j,\theta _{\sim i, j}, d)\big ] \\&+ \log \big [p(y\mid \theta _{\sim i, j}, d)\big ]\\&- \log \big [p(y\mid \theta _{i}, \theta _{\sim i, j}, d)\big ]\\&- \log \big [p(y\mid \theta _j,\theta _{\sim i,j}, d)\big ]\Big ] \,\text{d}\theta _{i} \,\text{d}\theta _{j} \,\text{d}\theta _{\sim i, j} \,\text{d}y. \end{aligned} \end{aligned}$$
(23)

Similar to "Estimating practical identifiability" we can estimate the conditional mutual information in (23) using Monte-Carlo integration as \({\hat{I}}(\Theta _i;\Theta _j\mid Y, \Theta _{\sim i, j}, d) \approx I(\Theta _i;\Theta _j\mid Y, \Theta _{\sim i, j}, d)\) where

$$\begin{aligned} \begin{aligned} {\hat{I}}(\Theta _i;\Theta _j\mid Y, \Theta _{\sim i, j}, d) = \frac{1}{n_{\text {outer}}}\sum _{k=1}^{k=n_{\text {outer}}}\log \frac{p(y^{k}\mid \theta _{i}^{k}, \theta _j^{k},\theta _{\sim i, j}^{k}, d)p(y^{k}\mid \theta _{\sim i, j}^{k}, d)}{p(y^{k}\mid \theta _{i}^{k}, \theta _{\sim i, j}^{k}, d)p(y^{k}\mid \theta _j^{k},\theta _{\sim i,j}^{k}, d)}, \end{aligned} \end{aligned}$$
(24)

where \(\theta _i^{k}\), \(\theta _j^{k}\), and \(\theta _{\sim i, j}^{k}\) are drawn from the prior distributions \(p(\theta _i)\), \(p(\theta _j)\), and \(p(\theta _{\sim i, j})\), respectively; \(y^{k}\) is drawn from the likelihood distribution \(p(y\mid \theta _i^k, \theta _j^k, \theta _{\sim i, j}^k, d)\). The conditional evidence in (24) can be obtained by means of marginalization

$$\begin{aligned} p(y^{k}\mid \theta _{\sim i, j}^{k}, d)&\triangleq \int \limits _{{\Theta _i, \Theta _j}} p(y^{k}\mid \theta _i, \theta _j, \theta _{\sim i, j}^{k}, d)p(\theta _i, \theta _j) \,\text{d}\theta _i \,\text{d}\theta _j, \end{aligned}$$
(25a)
$$\begin{aligned} p(y^{k}\mid \theta _{i}^{k}, \theta _{\sim i, j}^{k}, d)&\triangleq \int \limits _{{\Theta _j}} p(y^{k}\mid \theta _j, \theta _{i}^{k}, \theta _{\sim i, j}^{k}, d)p(\theta _j) \,\text{d}\theta _j,\end{aligned}$$
(25b)
$$\begin{aligned} p(y^{k}\mid \theta _{j}^{k}, \theta _{\sim i, j}^{k}, d)&\triangleq \int \limits _{{\Theta _i}} p(y^{k}\mid \theta _i, \theta _{j}^{k}, \theta _{\sim i, j}^{k}, d)p(\theta _i) \,\text{d}\theta _i. \end{aligned}$$
(25c)

Similar to "Estimating practical identifiability" the conditional evidence in (25a) through (25c) can be efficiently estimated using importance sampling along with Gaussian quadrature rules. However, it should be noted that (25a) is an integral over a two-dimensional space, and therefore requires \(n_{\text {inner}}^{2}\) quadrature points.

Numerical experiments

This section presents numerical experiments to validate the information-theoretic approach for examining practical identifiability. The estimate obtained for global identifiability is compared with variance-based global sensitivity analysis by means of first-order Sobol indices computed using \(\texttt{SALib}\)42, 43 (see S1 in the supplementary material). First, a linear Gaussian statistical model is considered for which practical identifiability can be analytically examined through the proposed information-theoretic approach. This model is computationally efficient and is therefore ideal for conducting estimator convergence studies. Next, the practical identifiability of a reduced kinetics model for methane-air combustion is considered. Reduced kinetics models are widely used in the numerical analysis of chemically reactive flows, since embedding the detailed chemistry of combustion is often infeasible. Such reduced kinetics models are often parameterized, and constructing parameterizations with practically identifiable parameters is desirable to improve confidence in the model prediction.
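For reference, first- and second-order Sobol indices can be computed with \(\texttt{SALib}\) roughly as sketched below, assuming its Saltelli sampler and Sobol analyzer interface with normal input distributions specified through the "dists" key; the stand-in scalar quantity of interest (the model output at a single input) is chosen purely for illustration and does not reproduce the indices reported in the figures.

```python
import numpy as np
from SALib.sample import saltelli
from SALib.analyze import sobol

problem = {
    "num_vars": 3,
    "names": ["theta_1", "theta_2", "theta_3"],
    "dists": ["norm"] * 3,          # for "norm", bounds are interpreted as (mean, std)
    "bounds": [[0.0, 1.0]] * 3,     # standard-normal parameter distributions
}

def model(theta, d=0.5):
    # Stand-in scalar quantity of interest: the forward-model output at a single input d
    return theta[0] * d + theta[1] * d**2 + theta[2] * d**3

X = saltelli.sample(problem, 1024)              # N * (2 * num_vars + 2) parameter samples
Y = np.apply_along_axis(model, 1, X)
Si = sobol.analyze(problem, Y)
print(Si["S1"])                                  # first-order indices
print(Si["S2"])                                  # second-order (pairwise) indices
```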

Application to a linear Gaussian model

The identifiability framework is now applied to a linear Gaussian problem for which closed-form expressions are available for the conditional mutual information in (7) and (19) (see S2 in the supplementary material). Consider the statistical model

$$\begin{aligned} y = {\mathscr {F}}(\theta , d)+ \xi \quad ;\xi \sim \mathcal {N}(0, \Gamma ), \end{aligned}$$
(26)

where \({\mathscr {F}}(\theta , d) = {\textbf{A}}\theta\) and \({\textbf{A}}\in {{\mathbb {R}}^{n\times m}}\) is called the feature matrix. The prior distribution is given by \(p(\theta ) = \mathcal {N}(\mu _{\Theta }, \Sigma _{\Theta })\) where \(\mu _{\Theta } \in {{\mathbb {R}}^{m}}\) and \(\Sigma _{\Theta }\in {{\mathbb {R}}^{m\times m}}\). Model likelihood is therefore given by \(p(y\mid \theta ) = \mathcal {N}({\textbf{A}}\theta , \Gamma )\) where \(\Gamma \in {{\mathbb {R}}^{n\times n}}\). Here, \(\mu _{\Theta }, \Sigma _{\Theta }\), and \(\Gamma\) are all constants and are considered known. The evidence distribution for this model is given by \(p(y) \triangleq \mathcal {N}(\mu _{Y}, \Sigma _{Y}) = \mathcal {N}({\textbf{A}}\mu _{\Theta }, {\textbf{A}}\Sigma _{\Theta }{\textbf{A}}^{T}+\Gamma )\), such that no model-form error exists. Consider a feature matrix

$$\begin{aligned} {\textbf{A}} = \begin{pmatrix} d_{1} & d_{1}^{2} & \dots & d_{1}^{m} \\ d_{2} & d_{2}^{2} & \dots & d_{2}^{m} \\ \vdots & \vdots & \ddots & \vdots \\ d_{n} & d_{n}^{2} & \dots & d_{n}^{m} \end{pmatrix}, \end{aligned}$$
(27)

where \(d_{i\mid _{i=1}^n}\) are n linearly-spaced points in an interval \([-1, 1]\), and \(m=3\), which means that the statistical model has 3 uncertain parameters. Assume an uncorrelated measurement noise \(\Gamma =\sigma _{\xi }^2{\mathbb {I}}\) with \(\sigma _{\xi }^2 = 0.1\). For the purpose of parameter estimation, synthetic data is generated using (26) assuming \(\theta ^{*}=[1, 2, 3]^T\) and \(n=100\).
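The setup in (26) and (27) can be reproduced directly; in the sketch below the random seed is arbitrary, and the evidence covariance is written for the standard-normal prior used in the identifiability study that follows.

```python
import numpy as np

rng = np.random.default_rng(3)
n, m, sigma_xi2 = 100, 3, 0.1
d = np.linspace(-1.0, 1.0, n)
A = np.column_stack([d**p for p in range(1, m + 1)])                 # feature matrix (27)

theta_star = np.array([1.0, 2.0, 3.0])
y = A @ theta_star + rng.normal(scale=np.sqrt(sigma_xi2), size=n)    # synthetic data from (26)

# Evidence covariance Sigma_Y = A Sigma_Theta A^T + Gamma (here with Sigma_Theta = I)
Sigma_Y = A @ A.T + sigma_xi2 * np.eye(n)
```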

Figure 1

Convergence of the variance in practical identifiability estimator of a linear Gaussian statistical model (left); the number of quadrature points \(n_{\text {inner}} = 50\) is considered and the number of Monte-Carlo integration samples \(n_{\text {outer}}\) is varied. Bias convergence for practical identifiability estimator in case of a linear Gaussian statistical model (right); \(n_{\text {outer}} = 10^{4}\) is considered and \(n_{\text {inner}}\) is varied. For a given \(n_{\text {inner}}\) increasing \(n_{\text {outer}}\) reduces the variance, whereas increasing \(n_{\text {inner}}\) for a given \(n_{\text {outer}}\) decreases the bias in the estimate.

Figure 2

Convergence of the variance in practical identifiability for estimating parameter dependencies of a linear Gaussian statistical model (left); the number of quadrature points \(n_{\text {inner}} = 50\) is considered and the number of Monte-Carlo integration samples \(n_{\text {outer}}\) is varied. Bias convergence for estimating parameter dependencies in the case of a linear Gaussian statistical model (right); \(n_{\text {outer}} = 10^{4}\) is considered and \(n_{\text {inner}}\) is varied. For a given \(n_{\text {inner}}\) increasing \(n_{\text {outer}}\) reduces the variance, whereas increasing \(n_{\text {inner}}\) for a given \(n_{\text {outer}}\) decreases the bias in estimating the parameter dependencies.

Parameter identifiability

The goal of the framework developed in "Estimating practical identifiability" is to assess the practical identifiability of the statistical model in (26) before any controlled experiment is conducted. Consider \(\mu _{\Theta } = {\textbf{0}}\) and \(\Sigma _{\Theta } = {\mathbb {I}}\). Using such an uncorrelated prior distribution for the identifiability study ensures that the information obtained is only due to the observation of the data (as discussed in "Estimating parameter dependence"). Using historical parameter estimates can improve the prior (Remark 3), which can affect the identifiability analysis. However, we have not considered any such prior refinement.

Figure 1 illustrates the convergence of error in estimating information gain for each parameter using the estimator developed in "Estimating practical identifiability". As expected, for a fixed number of quadrature points, increasing the number of Monte-Carlo integration points decreases the variance in estimation. However, for a fixed \(n_{\text {outer}}\) increasing the number of quadrature points reduces the bias in the estimate. Figure 2 illustrates the variance and bias convergence of error in estimating parameter dependencies as described in "Estimating parameter dependence". As expected and observed, the variance in error is controlled by the accuracy of Monte-Carlo integration, that is, by \(n_{\text {outer}}\), and the bias is controlled by the quadrature approximation, that is, through \(n_{\text {inner}}\).

Figure 3

First-order Sobol indices (left), information gain (center), and information gain equivalent variance \(\mathscr {C}(\Theta _i)\) (right) for linear Gaussian model. Sobol indices show that the output of the statistical model has the largest variability due to uncertainty in \(\Theta _1\), followed by \(\Theta _2\) and \(\Theta _3\). Variable \(\Theta _{1}\) exhibits the largest gain in information and therefore the highest practical identifiability, followed by \(\Theta _2\) and then \(\Theta _3\). For a direct observation model, the variable \(\Theta _1\) has the lowest measurement uncertainty, followed by \(\Theta _2\) and \(\Theta _3\).

Figure 4

First-order Sobol indices (left) and estimated information gain (right) vs. measurement noise variance \(\sigma _{\xi }^2\) for linear Gaussian model. Increasing measurement noise covariance does not affect the variability of the output with respect to the parameters and therefore the first-order Sobol indices remain unchanged. However, the information gain decreases with increasing measurement noise covariance.

The first-order Sobol indices, estimated information gain, and information gain equivalent variance \(\mathscr {C}(\Theta _i)\) are shown in Figure 3. The estimated first-order Sobol indices (see S3 in the supplementary material for convergence study) show that the considered linear Gaussian forward model has the largest output variability due to uncertainty in \(\Theta _1\), followed by \(\Theta _2\) and \(\Theta _3\). This implies that the forward model is most sensitive to the parameter \(\theta _1\), followed by \(\theta _2\) and then \(\theta _3\). This is not surprising since \(d_{i\mid _{i=1}^n}\) are points in the interval \([-1, 1]\), over which higher powers of the input have smaller magnitude. Thus, according to the first-order Sobol indices, the relevance of the parameters follows the order: \(\theta _1\), \(\theta _2\), and \(\theta _3\). The estimated information gain agrees well with the true value. Further, the obtained trend suggests that the data is most informative about \(\Theta _1\), followed by \(\Theta _2\), and then \(\Theta _3\). As discussed in "Estimating practical identifiability", practical identifiability follows the same trend. Furthermore, as reported in previous work31, it can be seen that parameters with good identifiability characteristics also exhibit high model sensitivity. Using the hypothetical direct observation model described in "Physical interpretation of identifiability in an information-theoretic framework", the smallest measurement uncertainty is obtained for the variable \(\Theta _1\), followed by \(\Theta _2\) and \(\Theta _3\). That is, parameters with high practical identifiability are associated with low measurement uncertainty in a direct observation model.

Figure 4 illustrates the variability of the first-order Sobol indices and the estimated information gain with measurement noise variance \(\sigma _{\xi }^2\). The first-order Sobol indices only account for the parameter uncertainty, and therefore remain unchanged with an increase in measurement noise. However, the estimated information gain and thereby the practical identifiability decreases with measurement noise. This observation corroborates the intuition that large measurement uncertainty will lead to large uncertainty in the parameter estimation.

Figure 5 shows the second-order Sobol indices and the true and estimated dependencies between the parameter pairs for the linear Gaussian model. Examining the second-order Sobol indices (see S3 in the supplementary material for convergence study) shows that there are negligible interactions between parameter pairs. Estimated parameter dependencies agree well with the truth; the trend is preserved. The bias observed is due to the error in approximating the conditional evidence, as shown in Figure 2. It can be clearly seen that the parameters \(\Theta _1\) and \(\Theta _3\) have high dependencies. This means that these parameters compensate for one another such that they will have a combined effect on the output of the statistical model. These parameters are associated with the features \(d_{i\mid _{i=1}^n}\) and \(d_{i\mid _{i=1}^n}^3\) which, in fact, have a similar effect on the statistical model for \(d_{i\mid _{i=1}^n}\in [-1, 1]\). This observation also shows that the low practical identifiability of \(\Theta _3\) is mainly due to the underlying dependency with \(\Theta _1\) such that the pair \((\Theta _1, \Theta _3)\) has a combined effect in the statistical model.

Figure 5

Second-order Sobol indices (left), true parameter dependencies (center), and estimated parameter dependencies (right) for linear Gaussian model. The second-order Sobol indices show negligible interactions between parameter pairs. The obtained estimate of parameter dependency agrees well with their true values; the trend is preserved. \(\Theta _1\) and \(\Theta _3\) have the largest dependency on one another, and therefore are expected to have a combined effect on the output of the statistical model.

Figure 6

Correlation plot for samples obtained from the true posterior distribution (left) and the obtained aggregate posterior prediction (right) for linear Gaussian model. A negative correlation is observed between \(\Theta _1\) and \(\Theta _3\), whereas \(\Theta _2\) is uncorrelated from other parameters. Aggregate posterior prediction agrees well with the data and exhibits high certainty.

Parameter estimation

For the linear Gaussian model, the joint distribution \(p(\theta , y)\) can be written as

$$\begin{aligned} p(\theta , y) \triangleq \mathcal {N}(\mu _{\Theta , Y}, \Sigma _{\Theta , Y}) = \mathcal {N} \begin{pmatrix} \begin{bmatrix}\mu _{\Theta } \\ \mu _{Y}\end{bmatrix}, \begin{bmatrix}\Sigma _{\Theta } & \Sigma _{\Theta }{\textbf{A}}^{T} \\ {\textbf{A}}\Sigma _{\Theta } & \Sigma _{Y} \end{bmatrix} \end{pmatrix}, \end{aligned}$$
(28)

such that the analytical posterior distribution is given as \(p(\theta \mid y) = \mathcal {N}(\mu _{\Theta _{post}}, \Sigma _{\Theta _{post}})\), where \(\mu _{\Theta _{post}} = \mu _{\Theta } + \Sigma _{\Theta }{\textbf{A}}^{T}\Sigma _{Y}^{-1}(y-\mu _Y)\) and \(\Sigma _{\Theta _{post}} = \Sigma _{\Theta } - \Sigma _{\Theta }{\textbf{A}}^{T}\Sigma _{Y}^{-1}{\textbf{A}}\Sigma _{\Theta }\) using Gaussian conditioning.
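A sketch of this Gaussian conditioning follows (with the same standard-normal prior and \(\sigma _{\xi }^2 = 0.1\) as above; posterior samples are drawn only to expose the correlation structure discussed below).

```python
import numpy as np

rng = np.random.default_rng(4)
n, sigma_xi2 = 100, 0.1
d = np.linspace(-1.0, 1.0, n)
A = np.column_stack([d, d**2, d**3])
mu_T, Sigma_T = np.zeros(3), np.eye(3)                  # prior N(0, I)

y = A @ np.array([1.0, 2.0, 3.0]) + rng.normal(scale=np.sqrt(sigma_xi2), size=n)

Sigma_Y = A @ Sigma_T @ A.T + sigma_xi2 * np.eye(n)
K = Sigma_T @ A.T @ np.linalg.inv(Sigma_Y)              # cross-covariance times evidence inverse
mu_post = mu_T + K @ (y - A @ mu_T)
Sigma_post = Sigma_T - K @ A @ Sigma_T

print(mu_post)                                           # close to theta^* = [1, 2, 3]
theta_samples = rng.multivariate_normal(mu_post, Sigma_post, size=5000)
print(np.corrcoef(theta_samples, rowvar=False))          # negative Theta_1-Theta_3 correlation
```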

Samples from the posterior distribution and the aggregate posterior prediction are shown in Figure 6. Variables \(\Theta _1\) and \(\Theta _3\) have a negative correlation, whereas \(\Theta _2\) is uncorrelated with other parameters. This means that the parameter variables \(\Theta _1\) and \(\Theta _3\) have (linear) dependencies on each other, and \(\Theta _2\) does not have such dependencies. These dependencies were suggested during the a priori analysis conducted on the statistical model as illustrated in Figure 5. Aggregate posterior prediction agrees well with the data and exhibits high certainty.

Figure 7

Change in parameter variance \(\Delta (\sigma ^2_{\Theta _i})\) vs. measurement noise covariance \(\sigma _{\xi }^2\) for linear Gaussian model. Increasing measurement noise results in a smaller change in parameter variance from the prior to the posterior. Largest reduction in variance is observed for \(\Theta _2\), followed by \(\Theta _{1}\) and \(\Theta _{3}\).

Figure 7 illustrates the change in variance of the parameter \(\Theta _i\) defined as \(\Delta (\sigma ^2_{\Theta _i}) \triangleq \sigma ^2_{\Theta _i} - \sigma ^2_{\Theta _{i, post}}\) versus \(\sigma _{\xi }^2\). Parameter \(\Theta _2\) exhibits the smallest posterior uncertainty, followed by \(\Theta _1\) and \(\Theta _3\) for all \(\sigma _{\xi }^2\). While \(\Theta _1\) has the largest estimated information gain (Figure 3(center)), it exhibits dependencies with \(\Theta _3\) (Figure 5(right)), thereby resulting in larger posterior uncertainty in comparison to \(\Theta _2\). In practical applications, where model selection or parameter selection is critical, examining the information gain and parameter dependencies can therefore aid in finding parameters that can be estimated with high certainty. Increasing the measurement noise results in a smaller change in parameter variance, that is, the parameters exhibit larger posterior uncertainty. This is also shown by the variation of estimated information gain with measurement noise (Figure 4(right)). On the contrary, the first-order Sobol indices remain unchanged with measurement noise (Figure 4(left)).

Application to methane chemical kinetics

Accurate characterization of chemical kinetics is critical in the numerical prediction of reacting flows. Although there have been significant advancements in computational architectures and numerical methods, embedding the full chemical kinetics in numerical simulations is almost always infeasible. This is primarily because of the high-dimensional coupled ordinary differential equations that have to be solved to obtain concentrations of the large number of involved species. As a result, significant efforts have been made to develop reduced chemical kinetics models that seek to capture features such as ignition delay, adiabatic flame temperature, or flame speed observed using the true chemical kinetics44,45,46,47. These reduced mechanisms are typically formulated using a combination of theory and intuition, leaving some of the chemistry unresolved and resulting in uncertainties in the relevant rate parameters48. Selecting a functional form of the modeled reaction rate terms that leads to reliable parameter estimation is highly desirable7, 49. This means that for high confidence in parameter estimation, and thereby in model prediction, the underlying parameterization of the reaction rate terms must exhibit high practical identifiability.

Shock tube ignition is a canonical experiment used to develop and validate combustion reaction mechanisms50. In such experiments, the reactant mixture behind the reflected shock experiences elevated temperature and pressure, followed by mixture combustion. An important quantity of interest in such experiments is the time difference between the onset of the reflected shock and the ignition of the reactant mixture, defined as the ignition delay \(t_{ign}\)51. The ignition delay is characterized as the time of maximum heat release or steepest change in reactant temperature, and is therefore a key physico-chemical property for combustion systems.

Figure 8

Temperature evolution for stoichiometric methane-air combustion at \(T_o = 1500\,\text{K}\), \(P_o = 100\,\text{kPa}\) using the 2-step mechanism52 in comparison with GRI-Mech 3.0. The ignition delay time \(t_{ign}\), at which the mixture releases maximum heat, is under-predicted by nearly an order of magnitude by the 2-step mechanism.

To illustrate the practical identifiability framework we will consider stoichiometric methane-air combustion in a shock tube under an adiabatic, ideal-gas constant pressure ignition assumption. Typically, the chemical kinetics capturing detailed chemistry of methane-air ignition is computationally expensive due to hundreds of associated reactions. To model the reaction chemistry, consider the classical 2-step mechanism proposed by Westbrook et al.52 that accounts for the incomplete oxidation of methane. This reduced mechanism consists of a total of 6 species (5 reacting and 1 inert species, namely, \(\mathrm {N_2}\)) and 2 reactions (1 reversible), thus drastically reducing the cost of evaluating the chemical kinetics. The reactions involved in this reduced chemical kinetics model are

$$\begin{aligned} {\text{CH}_4} + \frac{3}{2} {\text{O}_2} \xrightarrow []{k_{1}} \mathrm{{CO}} + 2 {\text{H}}_{2}{\text{O}}, \end{aligned}$$
(29)
$$\begin{aligned} {\text{CO}} + \frac{1}{2} {\text{O}}_2 \mathop {\rightleftharpoons }\limits _{k_{2b}}^{ k_{2f} } {\text{CO}}_2, \end{aligned}$$
(30)

where the overall reaction rates are temperature-dependent and are modeled using the Arrhenius rate equation as

$$\begin{aligned} k_1\triangleq & {} Ae^{ \frac{-48400}{RT} }[{\text{CH}_4}]^{-0.3}[{\text{O}}_2]^{1.3}, \end{aligned}$$
(31)
$$\begin{aligned} k_{2f}\triangleq & {} 3.98\times 10^{14}e^{ \frac{-40000}{RT} }[{\text{CO}}][{\text{H}_2O}]^{0.5}[{\text{O}}_2]^{0.25},\end{aligned}$$
(32)
$$\begin{aligned} k_{2b}\triangleq & {} 5\times 10^{8}e^{ \frac{-40000}{RT} }[{\text{CO}_2}], \end{aligned}$$
(33)

where \(A=2.8\times 10^9\) is the pre-exponential factor, R is the ideal gas constant, and T is the temperature in Kelvin. To solve the resulting reaction equations, CANTERA v2.6.053 is used. Figure 8 illustrates the temperature evolution using the 2-step mechanism and GRI-Mech 3.054 for an initial temperature \(T_o = 1500\,\text{K}\), initial pressure \(P_o = 100\,\text{kPa}\), and equivalence ratio \(\phi =1\) (stoichiometric). The GRI-Mech 3.0 mechanism consists of detailed chemical kinetics with 53 species and 325 reactions. As seen, the 2-step mechanism under-predicts the ignition delay by nearly an order of magnitude. To improve the predictive capabilities of the 2-step mechanism, a functional dependency for the pre-exponential factor can be introduced as \(\log A = \mathscr {G} (T_o, \phi )\), where

$$\begin{aligned} \mathscr {G} (T_o, \phi ) \triangleq 18 + \theta _1 + \tanh {(\theta _2 + \theta _3\phi )\frac{T_o}{1000}}. \end{aligned}$$
(34)

Here, \(\theta _1, \theta _2\), and \(\theta _3\) are the uncertain model parameters. A similar parameterization has been used for n-dodecane reduced chemical kinetics48. It should be noted that while a more expressive functional form for the pre-exponential factor could be chosen in (34), the goal of the framework is to ascertain practical identifiability. For parameter estimation, consider the detailed GRI-Mech 3.0 to be the ‘exact solution’ to the combustion problem, which can then be used to generate the data. Consider the logarithm of the ignition delay time at \(T_{o} = 1100, 1400, 1700\), and \(2000\,\text{K}\), with \(\phi =1.0\) and \(P_o = 100\,\text{kPa}\), as the available data for model calibration. Assume an uncorrelated measurement noise \(\Gamma =\sigma _{\xi }^2{\mathbb {I}}\) with \(\sigma _{\xi }^2 = 0.1\).
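The reference data can be generated with CANTERA roughly as sketched below, using the gri30.yaml mechanism distributed with Cantera, a constant-pressure ideal-gas reactor, and the steepest-temperature-rise definition of \(t_{ign}\); the integration horizon is an arbitrary assumption and may need to be extended if ignition has not occurred at the lowest initial temperature.

```python
import numpy as np
import cantera as ct

def ignition_delay(T0, P0=1.0e5, phi=1.0, mechanism="gri30.yaml", t_end=1.0):
    """Constant-pressure ignition delay, defined as the time of steepest temperature rise.
    t_end is an assumed upper bound on the integration time; adjust as needed."""
    gas = ct.Solution(mechanism)
    gas.set_equivalence_ratio(phi, "CH4", "O2:1.0, N2:3.76")    # methane-air
    gas.TP = T0, P0
    reactor = ct.IdealGasConstPressureReactor(gas)
    sim = ct.ReactorNet([reactor])

    times, temps = [], []
    while sim.time < t_end:
        sim.step()
        times.append(sim.time)
        temps.append(reactor.T)
    times, temps = np.array(times), np.array(temps)
    return times[np.argmax(np.gradient(temps, times))]

# Logarithm of the reference ignition delay at the four calibration conditions
for T0 in (1100.0, 1400.0, 1700.0, 2000.0):
    print(T0, np.log(ignition_delay(T0)))
```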

Parameter identifiability

The practical identifiability framework is now applied to the methane-air combustion problem to examine the identifiability of the model parameters in (34) before any controlled experiments are conducted. Consider an uncorrelated prior distribution for the model parameters as \(\theta _1\sim \mathcal {N} {(0, 1)}; \theta _2\sim \mathcal {N} (0, 1); \theta _3\sim \mathcal {N} (0, 1)\). Such priors result in pre-exponential factors of an order similar to those previously reported52, and are therefore considered suitable for the study. Similar to "Application to a linear Gaussian model", historical estimates of the model parameters are not considered for examining identifiability.

Figure 9

First-order Sobol indices (left), information gain (center), and information gain equivalent variance \(\mathscr {C}(\Theta _i)\) (right) for the methane-air combustion model. The Sobol indices show that the largest variability in the output of the statistical model is due to uncertainty in \(\Theta _1\), while \(\Theta _2\) and \(\Theta _3\) exhibit similar variabilities. Variable \(\Theta _{1}\) exhibits the largest information gain and therefore the highest practical identifiability; \(\Theta _2\) and \(\Theta _3\) have similar information gain. Variable \(\Theta _1\) exhibits the lowest measurement uncertainty for the direct observation model, followed by similar uncertainty for \(\Theta _2\) and \(\Theta _3\).

The first-order Sobol indices, estimated information gain, and information gain equivalent variance \(\mathscr {C}(\Theta _i)\) are shown in Figure 9. The information gain is estimated using \(n_{\text {outer}}=12000\) Monte-Carlo samples and \(n_{\text {inner}}=5\) quadrature points. Examining the first-order Sobol indices (see S3 in the supplementary material for a convergence study), the output of the forward model exhibits the largest variability due to uncertainty in the variable \(\Theta _1\), followed by similar variability with respect to \(\Theta _2\) and \(\Theta _3\). The largest information gain is observed for the variable \(\Theta _1\), followed by similar gains for \(\Theta _2\) and \(\Theta _3\). This means that \(\Theta _1\) will have the highest practical identifiability, followed by a much lower identifiability for \(\Theta _2\) and \(\Theta _3\). Using the hypothetical direct observation model described in "Estimating practical identifiability", the variable \(\Theta _1\) with the largest practical identifiability exhibits the lowest measurement uncertainty, followed by similar uncertainty for \(\Theta _2\) and \(\Theta _3\).
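For reference, one way to realize a nested estimator of this form, with outer Monte-Carlo samples and inner quadrature over the parameter of interest, is sketched below. It assumes standard-normal priors, a Gaussian likelihood with covariance \(\sigma _{\xi }^2{\mathbb {I}}\), and a generic forward model g mapping the parameters to the vector of observables (here, hypothetically, the log ignition delays); it illustrates the nested structure only and is not the exact estimator derived in "Estimating practical identifiability".

```python
import numpy as np
from scipy.special import logsumexp

def information_gain(i, g, sigma2, d=3, n_outer=2000, n_inner=5, seed=0):
    """Nested estimate of the information gained about parameter i from data
    y = g(theta) + noise: outer Monte-Carlo over (theta, y), inner
    Gauss-Hermite quadrature over theta_i under its N(0, 1) prior."""
    rng = np.random.default_rng(seed)
    nodes, weights = np.polynomial.hermite.hermgauss(n_inner)
    nodes, log_w = np.sqrt(2.0) * nodes, np.log(weights / np.sqrt(np.pi))

    def log_like(y, theta):
        r = y - np.atleast_1d(g(theta))
        return -0.5 * np.sum(r ** 2) / sigma2   # additive constants cancel below

    total = 0.0
    for _ in range(n_outer):
        theta = rng.standard_normal(d)
        mean = np.atleast_1d(g(theta))
        y = mean + np.sqrt(sigma2) * rng.standard_normal(mean.shape)
        # likelihood marginalized over theta_i, the remaining parameters fixed
        lls = []
        for x in nodes:
            t = theta.copy()
            t[i] = x
            lls.append(log_like(y, t))
        total += log_like(y, theta) - logsumexp(np.array(lls) + log_w)
    return total / n_outer

# Usage (with a user-supplied forward model, e.g. built on the Cantera sketch):
# gain_theta1 = information_gain(0, forward_model, sigma2=0.1)
```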

Figure 10

Second-order Sobol indices (left) and estimated parameter dependencies (right) for the methane-air combustion model. The trend \(S_{2, 3} > S_{1, 2} \approx S_{1, 3}\) suggests that there are underlying interactions between the parameters \(\Theta _2\) and \(\Theta _3\). Pairs (\(\Theta _1\), \(\Theta _2\)) and (\(\Theta _1\), \(\Theta _3\)) have nearly the same dependencies on one another. The pair (\(\Theta _2\), \(\Theta _3\)) exhibits low dependencies.

Figure 10 shows the second-order Sobol indices and estimated parameter dependencies. The second-order Sobol indices (see S3 in the supplementary material for a convergence study) follow the trend \(S_{2, 3} > S_{1, 2} \approx S_{1, 3}\), suggesting that there are underlying interactions between the parameters \(\Theta _2\) and \(\Theta _3\). As observed, the low identifiability of \(\Theta _2\) and \(\Theta _3\) suggested in Figure 9 is primarily due to the underlying dependencies between the pairs \((\Theta _1, \Theta _2)\) and \((\Theta _1, \Theta _3)\). To estimate the parameter dependencies, \(n_{\text {outer}}=12000\) Monte-Carlo samples are used, with \(n_{\text {inner}}=5\) and 10 quadrature points for the one- and two-dimensional integration spaces, respectively. The similar magnitude of the parameter dependencies obtained for the pairs \((\Theta _1, \Theta _2)\) and \((\Theta _1, \Theta _3)\), together with the similar information gain for \(\Theta _2\) and \(\Theta _3\), also suggests an underlying symmetry with respect to \(\Theta _1\). This means that interchanging \(\Theta _2\) and \(\Theta _3\) will not affect the output of the statistical model, which can be clearly seen in (34) for \(\phi =1\). This is also evident from the second-order Sobol indices, which suggest a combined effect on the output of the statistical model due to interactions between \(\Theta _2\) and \(\Theta _3\).
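For the variance-based portion of this analysis, the first- and second-order Sobol indices reported in Figures 9 and 10 can be reproduced in spirit with a standard GSA library such as SALib. The snippet below is a sketch under stated assumptions: the standard-normal priors are passed through SALib's 'dists' mechanism (with 'bounds' interpreted as mean and standard deviation), and a placeholder scalar quantity of interest, the Eq. (34) parameterization at a single condition, stands in for the full forward model used in the paper.

```python
import numpy as np
from SALib.sample import saltelli
from SALib.analyze import sobol

# Standard-normal priors on (theta1, theta2, theta3).
problem = {
    'num_vars': 3,
    'names': ['theta1', 'theta2', 'theta3'],
    'dists': ['norm', 'norm', 'norm'],
    'bounds': [[0.0, 1.0], [0.0, 1.0], [0.0, 1.0]],   # (mean, std) per parameter
}

def forward_model(theta, T0=1500.0, phi=1.0):
    """Placeholder scalar QoI: the Eq. (34) log pre-exponential factor. In the
    actual study this would be the output of the 2-step kinetics model."""
    t1, t2, t3 = theta
    return 18.0 + t1 + np.tanh((t2 + t3 * phi) * T0 / 1000.0)

X = saltelli.sample(problem, 1024, calc_second_order=True)
Y = np.array([forward_model(x) for x in X])
Si = sobol.analyze(problem, Y, calc_second_order=True)
print(Si['S1'])   # first-order indices (cf. Figure 9, left)
print(Si['S2'])   # second-order indices (cf. Figure 10, left)
```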

Figure 11

Correlation plot for samples obtained from the posterior distribution (left) and the obtained aggregate posterior prediction (right) for the methane-air combustion model. Correlation plots do not reveal any relations among variables. Aggregate posterior prediction agrees well with the data and exhibits high certainty.

Parameter estimation

Now consider the parameter estimation problem, which seeks the posterior distribution \(p(\theta \mid y)\). Typically, a closed-form expression for the posterior distribution is not available due to non-linearities in the forward model or the chosen family of prior distributions. As an alternative, sampling-based methods such as Markov chain Monte Carlo (MCMC), which draw samples from the unnormalized posterior, have gained significant attention. These methods construct Markov chains whose stationary distribution is the posterior distribution. The Metropolis-Hastings algorithm is an MCMC method that can be used to generate a sequence of samples from any given probability distribution55. The adaptive Metropolis algorithm is a powerful modification of the Metropolis-Hastings algorithm and is used here to sample from the posterior distribution56.
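A minimal sketch of the adaptive Metropolis algorithm is given below for a generic log-posterior. The tuning constants (initial covariance, adaptation schedule, regularization) and the placeholder log-posterior are illustrative assumptions rather than the exact settings used in this work.

```python
import numpy as np

def adaptive_metropolis(log_post, theta0, n_steps=50000, adapt_start=1000,
                        adapt_every=100, eps=1e-6, seed=0):
    """Adaptive Metropolis: Gaussian random-walk proposal whose covariance is
    periodically re-estimated from the chain history and scaled by 2.4^2 / d."""
    theta = np.asarray(theta0, dtype=float)
    d = theta.size
    sd = 2.4 ** 2 / d
    cov = 0.1 * np.eye(d)                       # initial proposal covariance
    rng = np.random.default_rng(seed)
    samples = np.empty((n_steps, d))
    lp = log_post(theta)
    for k in range(n_steps):
        if k >= adapt_start and k % adapt_every == 0:
            cov = sd * (np.cov(samples[:k].T) + eps * np.eye(d))
        prop = rng.multivariate_normal(theta, cov)
        lp_prop = log_post(prop)
        if np.log(rng.uniform()) < lp_prop - lp:    # Metropolis accept/reject
            theta, lp = prop, lp_prop
        samples[k] = theta
    return samples

# Placeholder log-posterior (standard normal); in the actual problem this would
# combine the N(0, 1) priors with the Gaussian ignition-delay likelihood.
chain = adaptive_metropolis(lambda t: -0.5 * np.sum(t ** 2), np.zeros(3),
                            n_steps=5000)
```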

Figure 11 shows the correlation plot for the posterior samples obtained using the adaptive Metropolis algorithm, together with the aggregate posterior prediction for the ignition delay time. No (linear) correlation is observed between the variables; however, the joint distributions of the pairs \((\Theta _1, \Theta _2)\) and \((\Theta _1, \Theta _3)\) show similarities. These similarities were also observed during the a priori analysis quantifying parameter dependencies, as shown in Figure 10.

Figure 12

Aggregate temperature evolution for methane-air combustion model. Aggregate prediction agrees well with the GRI-Mech 3.0 detailed mechanism.

Figure 13

Aggregate species concentration evolution for methane-air combustion model. Aggregate prediction agrees well with the GRI-Mech 3.0 detailed mechanism.

The obtained aggregate prediction shows a dramatic improvement over the original 2-step mechanism in predicting the ignition delay time over a wide range of temperatures. Using a functional form such as (34) for the pre-exponential factor also improves the predicted evolution of the mixture temperature, as shown in Figure 12. However, the adiabatic flame temperature, defined as the mixture temperature upon reaching equilibrium, is still over-predicted. An improvement in the predicted evolution of the species concentrations over time is also observed, as shown in Figure 13.

Concluding remarks and perspectives

Examining the practical identifiability of statistical models is useful in many applications, such as parameter estimation, model-form development, and model selection. Estimating practical identifiability prior to conducting controlled experiments or parameter estimation studies can assist in choosing a parameterization associated with a high degree of posterior certainty, thus improving confidence in the estimation and in the model prediction.

In this work, a novel information-theoretic approach based on conditional mutual information is presented to assess the global practical identifiability of a statistical model in a Bayesian framework. The proposed framework examines the expected information gain for each parameter from the data before controlled experiments are performed. Parameters with higher information gain are characterized by higher posterior certainty and thereby have higher practical identifiability. The adopted viewpoint is that the practical identifiability of a parameter does not have a binary answer; rather, it is the relative practical identifiability among parameters that is useful in practice.

In contrast to previous numerical approaches used to study practical identifiability, the proposed approach has the following notable advantages: first, no controlled experiment or data is required to conduct the practical identifiability analysis; second, different forms of uncertainties, such as model-form, parameter, or measurement uncertainty, can be taken into account; third, the framework does not make assumptions about the distribution of the data and parameters, as previous methods do; fourth, the estimator provides knowledge about global identifiability and is therefore not dependent on a particular realization of the parameters. To provide a physical interpretation of practical identifiability in the context of examining the information gain for each parameter, an information gain equivalent variance for a direct observation model is also presented. The practical identifiability framework is then extended to examine dependencies among parameter pairs. Even if an individual parameter exhibits poor practical identifiability characteristics, it can belong to an identifiable subset such that parameters within the subset have functional relationships with one another. Parameters within such an identifiable subset have a combined effect on the statistical model and can be collectively identified. To find such subsets, a novel a priori estimator is proposed to quantify the expected dependencies between parameter pairs that emerge a posteriori.

To illustrate the framework, two statistical models are considered: (a) a linear Gaussian model and (b) a non-linear methane-air reduced kinetics model. For the linear Gaussian model, it is shown that parameters with large information gain and low parameter dependencies can be estimated with high confidence. The variance-based global sensitivity analysis (GSA) also illustrates that parameter sensitivity is necessary for identifiability. However, as conclusively shown, the inability of variance-based GSA to capture different forms of uncertainties can lead to unreliable estimates of practical identifiability. The information gain equivalent variance obtained using a direct observation model shows that parameters with high practical identifiability would be associated with low measurement uncertainty if observed directly. In the case of the methane-air reduced kinetics model, it is shown that parameters with large dependencies can have low information gain and therefore low practical identifiability. Further, the proposed estimator can capture non-linear dependencies and reveal structures within the parameter space before controlled experiments are performed. Such non-linear dependencies cannot be observed by considering a posteriori parameter correlations, as correlations only capture linear relationships.