
# The impact of creating mathematical formula to predict cardiovascular events in patients with heart failure

## Abstract

Our previous retrospective study derived a mathematical formula, α = f(x1, …, x252), where α is the probability of cardiovascular events in patients with heart failure (HF) and x1, …, x252 are clinical parameters. Here, we prospectively tested the predictive capability and feasibility of such a formula in HF patients. First, to create a formula that uses a limited number of parameters, we retrospectively determined f(x) from the 50 most influential clinical parameters (x) among the 252 parameters, using 167 patients hospitalized for acute HF; nonlinear optimization provided the formula α = f(x1, …, x50), which fitted the actual probability of cardiovascular events per day. Second, we prospectively examined the predictability of f(x) in another 213 patients from 3 hospitals using the same 50 clinical parameters, and found that the Kaplan–Meier curves based on the actual and estimated occurrence probabilities of cardiovascular events were closely correlated. We conclude that the formula f(x) precisely predicted the per-day occurrence probability of future cardiovascular outcomes in HF patients. Mathematical modelling may predict the occurrence probability of cardiovascular events in HF patients.

## Introduction

Heart failure (HF), one of the leading causes of mortality and morbidity worldwide1, is the end stage of many cardiovascular diseases. Although the cause of HF is usually unique to each patient, numerous clinical and social factors, including disease severity, treatment protocols, comorbidity, lifestyle and social environment, are independently linked to patients’ prognoses2,3,4,5,6, which is relevant to ‘precision medicine’7. However, many studies have not considered the relative contribution of such factors, nor have they examined the contribution of unexpected and unknown factors to cardiovascular events. Even when multiple factors are identified as necessary for the occurrence of cardiovascular events, the results are qualitative rather than quantitative. Finally, these factors are usually identified retrospectively, and researchers rarely test the reproducibility of the results in a prospective study, so the conclusions about the identified factors may not be definitive. To overcome these limitations, we previously devised a mathematical formula that uses all the parameters and factors in the medical records to provide the occurrence probability of cardiovascular events, and revealed that more than 250 factors are linked to this probability in patients with HF8. However, one might argue that this formula is merely the result of fitting clinical data to a function of the occurrence probability of cardiovascular outcomes, and that it may not predict future clinical outcomes in the way that a physical law does; the law of gravity9, for example, predicts the time for an object to reach the ground.

To clarify whether our mathematical model prospectively provides the probability of cardiovascular events, we derived a mathematical formula from retrospective clinical data of patients with HF and tested whether it can predict the per-day probability of future cardiovascular events in patients with HF. If it can, a formula based on many clinical or social parameters could provide the occurrence probability of cardiovascular events in advance, leading to precision medicine for HF10,11.

## Methods

### Ethics statement

This study was approved by the National Cerebral and Cardiovascular Center Research Ethics Committee (M22-49, M24-51). The Committee decided that informed consent from the 167 patients was not required under the Japanese Clinical Research Guideline because this was a retrospective observational study. Instead, we made a public announcement on both the website of our institution and the bulletin boards of our out-patient and in-patient clinics, in accordance with the request of the Ethics Committee and the Guideline. For the prospective observational study of 213 patients, we obtained written informed consent after approval by the Research Ethics Committees of the three institutes: the National Cerebral and Cardiovascular Center, Hokkaido University and Kyushu University. The registration number of the clinical trial is UMIN000018691 at https://upload.umin.ac.jp/cgi-open-bin/ctr/ctr.cgi?function=brows&action=brows&recptno=R000021637&type=summary&language=J.

### Protocols

#### Protocol I: The creation of the mathematical formula using the retrospective data

In the previous study8, we retrospectively obtained 252 clinical parameters among 402 parameters in 152 patients with acute decompensated HF (ADHF) and calculated the formula that provides the probability of cardiovascular events (hospitalization or death due to HF). After sorting the data, we added 16 patients to the cohort; in the present study, we enrolled 167 patients with ADHF admitted between November 2007 and October 2009. We followed up these patients until the time of a cardiovascular event or December 2014. The diagnosis of HF was confirmed by an expert team of cardiologists using the Framingham criteria12.

Here, we describe how the mathematical formula to predict cardiovascular events was created in the previous study. The hypothesis of the previous study was that a mathematical formula for estimating prognosis can be derived, i.e., the equation τ = f(x1, …, xp), where x1, …, xp are clinical features and τ represents the day of the cardiovascular event in patients with HF; that study provided evidence supporting this hypothesis. In the present study, we prospectively tested the predictive capability and feasibility of the formula in HF patients to strengthen the case that such a formula can predict the probability of cardiovascular events.

We next explain how the formula τ = f(x1, …, xp) was created in the previous study. We obtained 402 parameters at discharge following hospitalization for ADHF from careful history-taking, physical examinations, laboratory tests, chest X-rays, electrocardiograms, complete Doppler echocardiographic studies, coronary angiography, right heart catheterization, cardiac scintigraphy, cardiovascular magnetic resonance, cardiopulmonary exercise testing and polysomnography in patients with HF. We hypothesized that all or some of these parameters influence the time of cardiovascular events to some extent, and we quantitatively assessed the occurrence probability of cardiovascular events using a probability model based on the Poisson process. Thus, the probability density $${p}_{i}(t)$$ for the cardiovascular events of patient i at an elapsed time t after discharge is represented by the following exponential formula:

$${p}_{i}(t)=\frac{1}{{\tau }_{i}}\exp (-\frac{t}{{\tau }_{i}})$$
(1)

The mean elapsed time $${\tau }_{i}$$ from discharge to the rehospitalization of patient i depends on some of the given clinical factors $${X}^{i}=\{{x}_{1}^{i},\ldots ,{x}_{p}^{i}\}$$ of the patient, i.e., a common subset $${X}_{S}^{i}\subseteq {X}^{i}$$ over all patients. The dependency is primarily approximated by the following inverse linear relation:

$${\tau }_{i}\cong \frac{1}{\sum _{{x}_{j}^{i}\in {X}_{S}^{i}}{\beta }_{j}{x}_{j}^{i}+\gamma }$$
(2)

where the denominator represents the expected frequency of cardiovascular rehospitalization per day, $${X}_{S}^{i}$$ is the set of values of the factors in $${X}_{S}$$ for patient i, $${\beta }_{j}$$ is the contributing weight of the j-th factor to the frequency, and γ is the intrinsic frequency for any patient. We considered that the $${\tau }_{i}$$ of the patients are sampled from a common population distribution $${p}_{\tau }(\tau )$$. Therefore, the total probability distribution of the rehospitalization time P(t) is expected to be a superposition of Eq. (1) over the values of τ sampled from $${p}_{\tau }(\tau )$$, where p(t) denotes $${p}_{i}(t)$$ in Eq. (1) for a general τ.

From these considerations, we obtained the following equation:

$$P(t)={\int }_{0}^{\infty }{p}_{\tau }(\tau )p(t)d\tau ={\int }_{0}^{\infty }{p}_{\tau }(\tau )\frac{1}{\tau }\exp (-\frac{t}{\tau })d\tau$$
(3)

Then we used the following natural conjugate prior distribution for the unknown $${p}_{\tau }(\tau )$$:

$${p}_{\tau }(\tau )=\frac{{\tau }^{-n}\exp (-1/\tau {\sum }_{i=1}^{n}{\tau }_{i})}{{\int }_{0}^{\infty }{\tau }^{-n}\exp (-1/\tau {\sum }_{i=1}^{n}{\tau }_{i})d\tau }$$
(4)

where $${\tau }_{i}$$ is given by the dataset D.
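Equations (1) and (2) can be illustrated with a minimal Python sketch. The function names and the numerical values of the weights and factors below are hypothetical and chosen only for illustration; they are not taken from the fitted model.

```python
import math

def event_rate(x_norm, beta, gamma):
    """Denominator of Eq. (2): expected number of events per day for one patient."""
    return sum(b * x for b, x in zip(beta, x_norm)) + gamma

def mean_time_to_event(x_norm, beta, gamma):
    """Eq. (2): tau_i is the reciprocal of the per-day event rate."""
    rate = event_rate(x_norm, beta, gamma)
    if rate <= 0:
        raise ValueError("the rate must be positive for Eq. (1) to be a density")
    return 1.0 / rate

def event_density(t, tau):
    """Eq. (1): exponential density of the event time t (days after discharge)."""
    return math.exp(-t / tau) / tau

# Hypothetical patient with two normalized factors (illustrative numbers only).
beta = [0.003, 0.001]
gamma = 0.001
x = [0.5, 1.0]
tau = mean_time_to_event(x, beta, gamma)   # 1 / 0.0035, about 286 days
density_at_discharge = event_density(0.0, tau)
```

In this sketch a larger weighted sum of factors gives a higher daily rate and therefore a shorter expected time to the event.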

After several steps of manipulation, we arrived at the following modeling algorithm. First, the value of every factor $${x}_{j}^{i}\in {X}^{i}$$ for all patients $$i=1,\ldots ,n$$ in D was normalized to fit into the interval [0, 1] using the maximum and minimum values. This normalization, which eliminates differences in the factor scales, was necessary to measure the essential contribution of each factor’s variation to $${\tau }_{i}$$. Subsequently, we applied equations (1) and (2) to the normalized dataset $${D}_{N}$$ to model the probabilistic rehospitalization process, and we determined the model parameters $${\beta }_{j}$$ and γ in equation (2) to maximize the following objective function:

$$L({\beta }_{1},\ldots ,{\beta }_{p},\gamma )=\,\mathrm{ln}[\prod _{i=1}^{n}(\sum _{j=1}^{p}{\beta }_{j}{x}_{j}^{i}+\gamma )\exp \{-(\sum _{j=1}^{p}{\beta }_{j}{x}_{j}^{i}+\gamma ){\tau }_{i}\}]-\lambda (\sum _{j=1}^{p}|{\beta }_{j}|+|\gamma |)$$
(5)

The first term is the log-likelihood of the model consisting of the previous equations over $${D}_{N}$$. The second term is an L1-regularization term, which penalizes the coefficients and sets those of negligible factors to zero; a larger hyper-parameter λ eliminates more factors. This term avoids over-fitting of the model to the dataset by selecting a set of effective factors $${X}_{S}^{i}$$ from a given $${X}^{i}$$. In our study, λ was tuned to 0.02 to maintain the largest value of equation (5), similarly to the other parameters $${\beta }_{j}$$ and γ.
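The normalization step and the objective function of Eq. (5) can be sketched as follows. This is a minimal illustration rather than the original implementation: the helper names, the treatment of constant columns (mapped to 0) and the toy data are assumptions.

```python
import math

def min_max_normalize(rows):
    """Scale every factor (column) into [0, 1] using its minimum and maximum."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    highs = [max(c) for c in cols]
    return [
        # Assumption: a constant column carries no information and is set to 0.
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]

def objective(beta, gamma, rows_norm, taus, lam=0.02):
    """Eq. (5): exponential log-likelihood minus the L1 penalty."""
    log_lik = 0.0
    for x, tau in zip(rows_norm, taus):
        rate = sum(b * v for b, v in zip(beta, x)) + gamma
        if rate <= 0:
            return float("-inf")  # outside the domain of the logarithm
        log_lik += math.log(rate) - rate * tau
    penalty = lam * (sum(abs(b) for b in beta) + abs(gamma))
    return log_lik - penalty

# Toy data: three hypothetical patients, two raw factors, event times in days.
rows = [[60.0, 1.0], [80.0, 3.0], [100.0, 2.0]]
taus = [100.0, 100.0, 100.0]
normalized = min_max_normalize(rows)
value = objective([0.0, 0.0], 0.01, normalized, taus)
```

With all weights at zero only γ contributes, so the value reduces to the log-likelihood of a single common rate minus λ|γ|.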

To seek the optimum parameter values of $${\beta }_{1},\ldots ,{\beta }_{p},\gamma$$ that maximize the objective function $$L({\beta }_{1},\ldots ,{\beta }_{p},\gamma )$$, we applied a simple greedy hill-climbing algorithm, in which the parameter values are iteratively modified in the direction of the gradient $$(\partial L/\partial {\beta }_{1},\ldots ,\partial L/\partial {\beta }_{p},\partial L/\partial \gamma )$$. When the improvement of L becomes nearly negligible, the resulting parameter values are taken as the optima. Because this process depends on the initial values of the parameters, we repeated this optimization 100 times starting with random initial values and selected the result providing the maximum L. This was how we selected 252 influential parameters among the 402 clinical parameters in the previous study8.
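The greedy hill-climbing procedure with random restarts can be sketched in a few lines. This is only an illustration on a toy concave function, not on Eq. (5); the step size, tolerance, numerical gradient and the range of the random initial values are assumptions that the text does not specify.

```python
import random

def numerical_gradient(f, params, eps=1e-6):
    """Central-difference estimate of the gradient of f at params."""
    grad = []
    for k in range(len(params)):
        up = list(params)
        down = list(params)
        up[k] += eps
        down[k] -= eps
        grad.append((f(up) - f(down)) / (2.0 * eps))
    return grad

def hill_climb(f, start, step=0.1, tol=1e-10, max_iter=10000):
    """Greedy ascent: accept a gradient step only if it increases f."""
    params, best = list(start), f(start)
    for _ in range(max_iter):
        grad = numerical_gradient(f, params)
        candidate = [p + step * g for p, g in zip(params, grad)]
        value = f(candidate)
        if value <= best:
            step *= 0.5          # no improvement: shrink the step and retry
            if step < 1e-12:
                break
            continue
        gain = value - best
        params, best = candidate, value
        if gain < tol:           # improvement has become negligible
            break
    return params, best

def multi_start(f, dim, n_starts=100, seed=0):
    """Repeat the ascent from random initial values and keep the best run."""
    rng = random.Random(seed)
    runs = [hill_climb(f, [rng.uniform(-1.0, 1.0) for _ in range(dim)])
            for _ in range(n_starts)]
    return max(runs, key=lambda run: run[1])

def toy(params):
    """Toy concave objective with its maximum at (1, -2)."""
    return -(params[0] - 1.0) ** 2 - (params[1] + 2.0) ** 2

best_params, best_value = multi_start(toy, dim=2, n_starts=5)
```

Restarting from several initial points matters for objectives with more than one local maximum; the toy function has a single maximum, so every run ends near the same point.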

Then we selected the 50 most influential parameters among the 252 parameters and revised the mathematical formula. The 50 most influential parameters in the present study are defined as the clinical parameters with the 50 highest coefficient values reported in the previous manuscript8. The number 50 is arbitrary; it was chosen as a realistic number of parameters to collect in the prospective study.

#### Protocol II: The prospective study to validate the mathematical formula

We prospectively enrolled 213 patients with ADHF admitted between May 2013 and March 2015 in three hospitals, the National Cerebral and Cardiovascular Center (n = 114), Hokkaido University (n = 80) and Kyushu University (n = 19), and followed up these patients until the time of a cardiovascular event or the end of April 2016. The timing of discharge was determined by an expert team of cardiologists in charge of the HF department; discharge was recommended when patients presented no signs of decompensation, such as New York Heart Association (NYHA) Functional Classification <3, no rales, no gallop rhythm, stable blood pressure and an improvement in renal function under optimal treatment that followed international guidelines13. Rehospitalization was defined as hospitalization for decompensated HF, and cardiovascular death was defined as death due to the worsening of HF. The primary endpoint was the first cardiovascular event, either rehospitalization or death due to the worsening of HF.

Then we created the mathematical model for the occurrence probability of cardiovascular events. First, we assumed that a patient’s probability of cardiovascular events per day does not change significantly between discharge and the cardiovascular event. We defined the mathematical formula to predict this constant per-day occurrence probability as follows:

$$\alpha =f({x}_{1},\ldots ,{x}_{p}|{\boldsymbol{\beta }},c)={{\boldsymbol{\beta }}}^{T}X+c=\sum _{j=1}^{p}{{\beta }}_{j}{x}_{j}+c$$
(6)

where α is the estimated occurrence probability of cardiovascular events per day for a patient, $$X={({x}_{1},\mathrm{...},{x}_{p})}^{T}$$ is the clinical feature vector of the patient, $${\boldsymbol{\beta }}={({\beta }_{1},\mathrm{...},{\beta }_{p})}^{T}$$ is the weight vector of the features, and c is the intercept of α. In this study, 50 clinical features were used, that is, p = 50. As any event occurring with a constant probability in a given time period is generated by a Poisson process14, the cardiovascular events of a patient also occur through this process with the patient’s individual α. Thus, the probability that a patient remains free of cardiovascular events up to an elapsed time t after discharge is represented by the following exponential formula:

$$P(t|X;{\boldsymbol{\beta }},c)=\exp (-\alpha t)=\exp \{-({{\boldsymbol{\beta }}}^{T}X+c)t\}=\exp \{-(\sum _{j=1}^{p}{\beta }_{j}{x}_{j}+c)t\}$$
(7)
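As a brief worked step (a sketch, assuming that the daily rate α plays the role of 1/τ in Eq. (1)), the exponential in Eq. (7) is the event-free probability obtained by integrating the exponential event-time density from t onward:

$$\int_{t}^{\infty }\alpha \exp (-\alpha s)\,ds=\exp (-\alpha t)$$

For an illustrative, hypothetical value of α = 0.002 per day, the mean time to an event is 1/α = 500 days, and the event-free probability at one year is exp(−0.002 × 365) = exp(−0.73) ≈ 0.48.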

Given a retrospective dataset $${D}_{R}=\{({X}_{i},{t}_{i})|i=1,\mathrm{...},{N}_{R}\}$$, where $${X}_{i}$$ and $${t}_{i}$$ are the clinical feature vector and the elapsed days from discharge to the cardiovascular event of patient i, respectively, the expected survival curve of the patients in $${D}_{R}$$ is represented as:

$$\begin{array}{c}{P}_{RE}(t|{\boldsymbol{\beta }},c)={\int }_{{D}_{R}}P(t|X;{\boldsymbol{\beta }},c){P}_{RE}(X)dX={\int }_{{D}_{R}}\exp \{-({{\boldsymbol{\beta }}}^{T}X+c)t\}{P}_{RE}(X)dX\\ \quad \quad \quad \quad \,=\frac{1}{{N}_{R}}\sum _{i\in {D}_{R}}\exp \{-({{\beta }}^{T}{X}_{i}+c)\cdot t\}=\frac{1}{{N}_{R}}\sum _{i\in {D}_{R}}\exp \{-(\sum _{j=1}^{p}{{\beta }}_{j}{x}_{ij}+c)\cdot t\}\end{array}$$
(8)

where $${P}_{RE}(X)$$ is the population distribution of the retrospective dataset $${D}_{R}$$; $${N}_{R}$$ is 167 in our case. Separately, we directly derived the Kaplan–Meier survival curve $${P}_{R}(t)$$ from $${D}_{R}$$ by following a standard procedure15. Then, we estimated the best parameter values of $${\boldsymbol{\beta }}$$ and c, which minimize the following Kullback–Leibler divergence (KL-divergence)16. The KL-divergence is a well-known statistical measure of the discrepancy between two probability distributions.

$$\begin{array}{rcl}KL({P}_{R},{P}_{RE}|{\boldsymbol{\beta }},c) & = & \int {P}_{R}(t)\{\mathrm{ln}\,{P}_{R}(t)-\,\mathrm{ln}\,{P}_{RE}(t|{\boldsymbol{\beta }},c)\}dt\\ & = & \frac{1}{{N}_{R}}\sum _{i\in {D}_{RR}}\{\mathrm{ln}\,{P}_{R}({t}_{i})-\,\mathrm{ln}\,{P}_{RE}({t}_{i}|{\boldsymbol{\beta }},c)\}\\ & = & \frac{1}{{N}_{R}}\sum _{i\in {D}_{RR}}[\mathrm{ln}\,{P}_{R}({t}_{i})-\,\mathrm{ln}[\frac{1}{{N}_{R}}\sum _{i^{\prime} \in {D}_{R}}\exp \{-(\sum _{j=1}^{p}{\beta }_{j}{x}_{i^{\prime} j}+c){t}_{i}\}]]\to \,{\rm{\min }}\end{array}$$
(9)

where $${D}_{RR}$$ is the subset of $${D}_{R}$$ that excludes the patients whose observations are censored and who therefore have no $${t}_{i}$$. The parameters $${\boldsymbol{\beta }}$$ and c minimizing this measure were determined using the Nelder–Mead method17, a widely used non-linear optimization algorithm.
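The quantities entering Eqs. (8) and (9) can be sketched in plain Python: a product-limit (Kaplan–Meier) estimate, the expected survival curve as an average of exponentials, and the divergence summed over uncensored patients. The function names, the toy data and the small floor that guards against taking the logarithm of zero are assumptions for illustration only.

```python
import math

def kaplan_meier(times, observed):
    """Return [(t, S(t))] at each distinct event time (product-limit estimator)."""
    event_times = sorted({t for t, e in zip(times, observed) if e})
    surv, curve = 1.0, []
    for t in event_times:
        at_risk = sum(1 for u in times if u >= t)
        events = sum(1 for u, e in zip(times, observed) if e and u == t)
        surv *= 1.0 - events / at_risk
        curve.append((t, surv))
    return curve

def km_at(curve, t):
    """Step-function lookup of the Kaplan-Meier estimate at time t."""
    value = 1.0
    for u, s in curve:
        if u > t:
            break
        value = s
    return value

def expected_survival(t, rows, beta, c):
    """Eq. (8): average of exp(-alpha_i * t) over all patients."""
    total = 0.0
    for x in rows:
        alpha = sum(b * v for b, v in zip(beta, x)) + c
        total += math.exp(-alpha * t)
    return total / len(rows)

def kl_objective(beta, c, rows, times, observed, floor=1e-12):
    """Eq. (9): sum over uncensored patients, scaled by 1/N_R."""
    curve = kaplan_meier(times, observed)
    total = 0.0
    for t, e in zip(times, observed):
        if not e:
            continue  # censored patients have no event time t_i
        p_r = max(km_at(curve, t), floor)    # assumption: floor avoids log(0)
        p_re = max(expected_survival(t, rows, beta, c), floor)
        total += math.log(p_r) - math.log(p_re)
    return total / len(rows)

# Toy cohort: four hypothetical patients, one normalized feature each.
times = [2, 3, 5, 7]
observed = [True, False, True, True]
rows = [[0.0], [0.5], [1.0], [0.2]]
beta, c = [0.1], 0.05
curve = kaplan_meier(times, observed)
kl = kl_objective(beta, c, rows, times, observed)
```

A derivative-free optimizer such as Nelder–Mead (for example `scipy.optimize.minimize` with `method="Nelder-Mead"`, if SciPy is available) could then be used to minimize `kl_objective` over the weights and the intercept.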

We used these estimated parameter values of $${\boldsymbol{\beta }}$$ and c to predict the survival curve of a given prospective dataset $${D}_{P}=\{({X}_{i},{t}_{i})|i=1,\mathrm{...},{N}_{P}\}$$, where $${N}_{P}$$ is 213 in our case. The predicted survival curve was obtained by substituting the above-mentioned best values of $${\boldsymbol{\beta }}$$ and c and the clinical feature vectors $${X}_{i}$$ of the patients in $${D}_{P}$$ into the following $${P}_{PE}(t|{\boldsymbol{\beta }},c)$$:

$$\begin{array}{c}{P}_{PE}(t|{\boldsymbol{\beta }},c)={\int }_{{D}_{P}}P(t|X;{\boldsymbol{\beta }},c){P}_{P}(X)dX={\int }_{{D}_{P}}\exp \{-({{\boldsymbol{\beta }}}^{T}X+c)t\}{P}_{P}(X)dX\\ \quad \quad \quad \quad \,=\frac{1}{{N}_{P}}\sum _{i\in {D}_{P}}\exp \{-({{\boldsymbol{\beta }}}^{T}{X}_{i}+c)\cdot t\}=\frac{1}{{N}_{P}}\sum _{i\in {D}_{P}}\exp \{-(\sum _{j=1}^{p}{\beta }_{j}{x}_{ij}+c)\cdot t\}\end{array}$$
(10)

We compared this predicted curve for the prospective dataset $${D}_{P}$$ with the Kaplan–Meier survival curve15 $${P}_{P}(t)$$ directly derived from $${D}_{P}$$.

### Statistical Analysis

Normally distributed data were expressed as mean ± standard deviation; other values were reported as median and interquartile range (IQR). We conducted a goodness-of-fit test and used the coefficient of determination to assess the relationship between the predictive curves and the actual Kaplan–Meier curves of the cardiovascular event-free rate. The differences in the predictive curves were tested using the Wilcoxon signed-rank test. We estimated the error bounds of the parameters, α and β, by applying standard bootstrap sampling16. All tests were two-tailed, and P < 0.05 was considered significant. All analyses were performed using the JMP software for Windows (version 8.0.2, SAS Inc., Cary, NC).
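As a minimal sketch of how a coefficient of determination between a predicted curve and an actual Kaplan–Meier curve might be computed, the function below uses the common definition 1 − SS_res/SS_tot on values sampled at the same time points. The exact definition used in the analysis is not stated, so this choice and the toy numbers are assumptions.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return 1.0 - ss_res / ss_tot

# Hypothetical event-free rates sampled at four common time points.
km_curve = [1.0, 0.8, 0.6, 0.4]
model_curve = [0.9, 0.8, 0.6, 0.5]
r2 = r_squared(km_curve, model_curve)
```

A value close to 1 indicates that the predicted curve tracks the Kaplan–Meier curve closely at the sampled time points.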

## Results

### Patient characteristics

The clinical characteristics of the patients in the retrospective study (Protocol I) are summarized in Table 1. In 78 patients, cardiovascular events (n = 71 for HF rehospitalization, and n = 14 for HF-related death) occurred at a median time of 260 days after discharge, and the remaining 89 patients had no cardiovascular events by a median time of 859 days after discharge (range, 515–1194 days). Among the clinical parameters, we selected the 50 parameters with the highest coefficient values, excluding cardiac catheterization data; these 50 clinical parameters and their coefficient values for constructing the mathematical formula are shown in Table 2. The clinical characteristics of the 213 patients in the prospective study (Protocol II) are summarized in Table 3. Of these, 84 patients were readmitted to their respective hospitals at a median time of 161 days after discharge, and 21 patients died due to worsening of HF at a median time of 275 days; the remaining 114 patients had no cardiovascular events by a median time of 636 days after discharge (range, 183–898 days).

### Predictive capability of the mathematical formula for the prospective outcomes

We confirmed that, in the retrospective study, the curve produced by this formula properly fitted the Kaplan–Meier curve of the actual data for the probability of cardiovascular outcomes (Fig. 1). Then, in the prospective study, we first analyzed the prospective data from our institute alone. Figure 2 shows that the mathematical formula obtained from the retrospective study can predict the clinical outcomes observed in the prospective study. Next, we tested whether our formula can predict the probability of cardiovascular events across all the institutes, and found that it can predict the clinical outcomes for the three institutes (Fig. 3).

### The factors that provoke or prevent cardiovascular events in 50 clinical factors

Since the mathematical formula could predict the occurrence of cardiovascular events in the prospective study, we assumed that each attribute coefficient of the formula is also relevant to clinical practice for HF (Table 2). When we investigated the contribution of each parameter to the objective measure, we found that ischemic heart disease was associated with a worse prognosis. In the physical examination, a high heart rate or pacemaker implantation was an adverse factor, whereas implantation of a cardiac resynchronization therapy device or an implantable cardioverter defibrillator was associated with better outcomes. Furthermore, the data from blood analysis, echocardiography and oral medications were related to cardiovascular events in complex and confounded ways. Intriguingly, the number of family members was associated with a better prognosis.

## Discussion

This study provides evidence that a mathematical formula derived from retrospective clinical data can provide the occurrence probability of cardiovascular events in a prospective study of patients with HF. We derived the formula α = f(x1, …, x50), where α is the probability of cardiovascular events and x1, …, x50 are clinical factors observed before the cardiovascular events, which could prospectively predict the occurrence probability of cardiovascular events. This study proposes the novel idea that the occurrence probability of future cardiovascular events can be mathematically formulated and deduced from clinical and personal parameters collected before the events.

Importantly, we found that the occurrence probability depends not only on cardiac dysfunction but also on parameters reflecting the dysfunction of other organs, such as the kidneys and liver, and on social factors, such as the number of family members living with a patient. Therefore, we can regard the occurrence probability as the overall severity of HF. This concept matches the idea that, in large-scale clinical trials, the effect of a treatment for HF should be judged by mortality or morbidity rather than by cardiac function18. The mortality or morbidity during a certain observation period is depicted by Kaplan–Meier curves, which represent the occurrence probability of cardiovascular events.

What are the differences between the present and previous studies that assessed clinical outcomes? Earlier studies, including ours19,20,21, merely identified the important factors for cardiovascular outcomes using cohort data of patients with HF. In such studies, clinical data are collected retrospectively or prospectively, and the most influential factors are identified by multivariate analysis. However, to our knowledge, no study has tested whether such multiple factors can quantitatively predict the occurrence probability of future cardiovascular events. Above all, arbitrary factors, which are unintentionally collected by investigators and usually ignored, may be essential for explaining the occurrence probability, and an investigator-intended analysis of the data cannot cover such unexpected factors. This is the concept of big-data analysis or data mining22. Wang et al.23 revealed that although multiple biomarkers are associated with a high relative risk of adverse events, even the combination of these factors only moderately improved the prediction of risk in an individual. This suggests that the occurrence of cardiovascular events may not be well predictable even when multiple factors are combined. In contrast, we collected almost all the numerical data in the medical records documented before the onset of cardiovascular events and solved the mathematical formula using these parameters to provide the exact probability of future cardiovascular events. Of the more than 250 clinical factors that constitute the original mathematical formula8, we selected the 50 most influential factors and re-solved the mathematical formula. The plausibility of the formula using these 50 factors for calculating the occurrence probability of cardiovascular events in patients with HF suggests that more clinical data than we expected are needed to predict future outcomes or to obtain a predictive formula. WBC values at admission may approximately indicate the unique value of each patient. On the other hand, the most abnormal values at admission may determine the severity of the pathophysiology of HF.

How should we interpret the mathematical formulae given in the present study? One may argue that our process merely adjusts or fits the clinical data to the clinical outcomes using a mathematical formula. Nevertheless, if the clinical parameters had no relation to the time of occurrence of cardiovascular events, we could not have fitted the clinical parameters to the objective measures. Since we could fit the clinical parameters obtained before the cardiovascular events to the objective function of the probability of cardiovascular events, we consider our fitting process reasonable. To further confirm the feasibility and applicability of this framework, we accepted this criticism of our previous work8 and performed the prospective study to test the validity of our formula for predicting the probability of future cardiovascular events. Figures 2 and 3 support our hypothesis; thus, we propose that the occurrence of cardiovascular events in patients with HF is predictable and reproducible using mathematical models. On the other hand, the patients’ characteristics in the retrospective and prospective studies are quite different, as shown in Tables 1 and 3. Patients in the prospective study seemed to have suffered from more severe HF than those in the retrospective study. Nevertheless, the curves produced by the formula fitted the actual Kaplan–Meier data of the prospective study, suggesting that the present formula is valid across different groups of patients with HF.

It is instructive to examine the coefficient of each clinical parameter in the mathematical formula. We note that although 50 factors constitute the function of the occurrence probability of cardiovascular events, these factors are confounded with each other within the formula, indicating that we should recognize the importance of the network of these 50 factors rather than the clinical impact of each individual factor. We should also be cautious because some of the 50 clinical parameters are strongly and sensitively affected by acute changes in the pathophysiology of HF. Since such parameters contribute to the present formula, we can only conclude that each patient’s values at admission or discharge affect the occurrence probability of cardiovascular events after discharge. Their pathophysiological meaning needs to be investigated in future studies.

The most important point is that we provide a predictive model of cardiovascular events in HF patients using 50 factors and verified its feasibility in cohorts of HF patients at 3 different institutes.

Another important point of this study is that we derived the mathematical formula from retrospective clinical data at the National Cerebral and Cardiovascular Center, in the central part of Japan, and tested its applicability to prospective data from Hokkaido University, in the north of Japan, and Kyushu University, in the south. Although one may consider that this formula is valid only at the National Cerebral and Cardiovascular Center, this is not the case; the formula to predict the probability of cardiovascular events in patients with HF was valid at institutes throughout Japan. The formula may not be valid in other countries; however, the pathophysiology and treatment strategy of HF are common worldwide, suggesting that such formulas could also provide the future occurrence of cardiovascular events in other countries. The concept of creating a mathematical formula should be shared worldwide so that the real risk of cardiovascular events can be assessed, and the contributing clinical factors treated, using local data from patients with HF.

There are several applications and limitations of the present study. First, since these 50 clinical parameters can be easily obtained in outpatient or inpatient clinics, we can evaluate the severity of HF in each patient from the viewpoint of the probability of the onset of cardiovascular events. Second, we can identify which clinical factors increase the probability of cardiovascular events, suggesting that we can identify treatment targets for HF in each patient. Third, this formula may serve as an educational tool for HF patients. Fourth, the concept of creating a formula to predict clinical outcomes may be applicable to other fields such as cerebral infarction or cancers24. On the other hand, the present formula has limitations because we created it using data from HF patients with mild to moderate HF symptoms. Therefore, we cannot apply it to patients with severe HF to predict the occurrence probability of cardiovascular events, because the present equation was not derived from a cohort of such patients. To address this, we need to create a mathematical formula using data from patients with severe HF.

## Conclusions

We created a mathematical formula that precisely provides the probability of clinical outcomes in patients who are hospitalized with ADHF and discharged after appropriate treatment. Mathematics applied to the present cardiovascular big data may predict the occurrence probability of future cardiovascular events. Because clinical parameters independent of cardiac function proved important, this approach may contribute to better treatment of HF.

Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

## References

1. 1.

Go, A. S. et al. Heart disease and stroke statistics–2014 update: a report from the American Heart Association. Circulation 129, e28–e292, https://doi.org/10.1161/01.cir.0000441139.02102.80 (2014).

2. 2.

Braunwald, E. Biomarkers in heart failure. The New England journal of medicine 358, 2148–2159, https://doi.org/10.1056/NEJMra0800239 (2008).

3. 3.

Fonarow, G. C., Peacock, W. F., Phillips, C. O., Givertz, M. M. & Lopatin, M. Admission B-type natriuretic peptide levels and in-hospital mortality in acute decompensated heart failure. Journal of the American College of Cardiology 49, 1943–1950, https://doi.org/10.1016/j.jacc.2007.02.037 (2007).

4. 4.

Abraham, W. T. et al. Predictors of in-hospital mortality in patients hospitalized for heart failure: insights from the Organized Program to Initiate Lifesaving Treatment in Hospitalized Patients with Heart Failure (OPTIMIZE-HF). Journal of the American College of Cardiology 52, 347–356, https://doi.org/10.1016/j.jacc.2008.04.028 (2008).

5. 5.

Mancini, D. M. et al. Value of peak exercise oxygen consumption for optimal timing of cardiac transplantation in ambulatory patients with heart failure. Circulation 83, 778–786 (1991).

6. 6.

Itoh, H., Taniguchi, K., Koike, A. & Doi, M. Evaluation of severity of heart failure using ventilatory gas analysis. Circulation 81, Ii31–37 (1990).

7. 7.

Collins, F. S. & Varmus, H. A new initiative on precision medicine. The New England journal of medicine 372, 793–795, https://doi.org/10.1056/NEJMp1500523 (2015).

8. 8.

Yoshida, A. et al. Derivation of a mathematical expression for predicting the time to cardiac events in patients with heart failure: a retrospective clinical study. Hypertension research: official journal of the Japanese Society of Hypertension 36, 450–456, https://doi.org/10.1038/hr.2012.200 (2013).

9. 9.

Newton, I. The Principia: mathematical principles of natural philosophy. (Univ of California Press, 1999).

10. Roundtable on Health Literacy et al. In Relevance of Health Literacy to Precision Medicine: Proceedings of a Workshop (National Academies Press (US), 2016).

11. Shah, S. H. et al. Opportunities for the Cardiovascular Community in the Precision Medicine Initiative. Circulation 133, 226–231, https://doi.org/10.1161/circulationaha.115.019475 (2016).

12. McKee, P. A., Castelli, W. P., McNamara, P. M. & Kannel, W. B. The natural history of congestive heart failure: the Framingham study. The New England journal of medicine 285, 1441–1446, https://doi.org/10.1056/nejm197112232852601 (1971).

13. Hunt, S. A. et al. Focused update incorporated into the ACC/AHA 2005 Guidelines for the Diagnosis and Management of Heart Failure in Adults A Report of the American College of Cardiology Foundation/American Heart Association Task Force on Practice Guidelines Developed in Collaboration With the International Society for Heart and Lung Transplantation. Journal of the American College of Cardiology 53, e1–e90, https://doi.org/10.1016/j.jacc.2008.11.013 (2009).

14. Gullberg, J. Mathematics: from the birth of numbers. (WW Norton & Company, 1997).

15. Efron, B. Logistic regression, survival analysis, and the Kaplan-Meier curve. Journal of the American statistical Association 83, 414–425 (1988).

16. Bishop, C. M. Pattern recognition. Machine Learning 128, 1–58 (2006).

17. McKinnon, K. I. M. Convergence of the Nelder–Mead Simplex Method to a Nonstationary Point. SIAM Journal on Optimization 9, 148–158, https://doi.org/10.1137/s1052623496303482 (1998).

18. Ferreira-Gonzalez, I. et al. Problems with use of composite end points in cardiovascular trials: systematic review of randomised controlled trials. BMJ (Clinical research ed.) 334, 786, https://doi.org/10.1136/bmj.39136.682083.AE (2007).

19. Ohara, T. et al. Plasma adiponectin is associated with plasma brain natriuretic peptide and cardiac function in healthy subjects. Hypertension research: official journal of the Japanese Society of Hypertension 31, 825–831, https://doi.org/10.1291/hypres.31.825 (2008).

20. Chen, C. Y. et al. Serum blood urea nitrogen and plasma brain natriuretic peptide and low diastolic blood pressure predict cardiovascular morbidity and mortality following discharge in acute decompensated heart failure patients. Circulation journal: official journal of the Japanese Circulation Society 76, 2372–2379 (2012).

21. Chanson-Rolle, A., Aubin, F., Braesco, V., Hamasaki, T. & Kitakaze, M. Influence of the Lactotripeptides Isoleucine-Proline-Proline and Valine-Proline-Proline on Systolic Blood Pressure in Japanese Subjects: A Systematic Review and Meta-Analysis of Randomized Controlled Trials. PloS one 10, e0142235, https://doi.org/10.1371/journal.pone.0142235 (2015).

22. Zhu, K. et al. Predicting 30-day Hospital Readmission with Publicly Available Administrative Database. A Conditional Logistic Regression Modeling Approach. Methods of information in medicine 54, 560–567, https://doi.org/10.3414/me14-02-0017 (2015).

23. Li, G. et al. The long-term effect of lifestyle interventions to prevent diabetes in the China Da Qing Diabetes Prevention Study: a 20-year follow-up study. Lancet 371, 1783–1789, https://doi.org/10.1016/s0140-6736(08)60766-7 (2008).

24. Tanaka, G., Hirata, Y., Goldenberg, S. L., Bruchovsky, N. & Aihara, K. Mathematical modelling of prostate cancer growth and its application to hormone therapy. Philosophical transactions. Series A, Mathematical, physical, and engineering sciences 368, 5029–5044, https://doi.org/10.1098/rsta.2010.0221 (2010).

## Acknowledgements

This study was supported by Grants-in-Aid from the Ministry of Health, Labour and Welfare, Japan; Grants-in-Aid from the Ministry of Education, Culture, Sports, Science and Technology, Japan; and Grants-in-Aid from the Japan Agency for Medical Research and Development (JP17ek0210080).

## Author information

The role of each author in this manuscript is as follows. Study concept and design: T.W., M.K.; Data collection: M.S., J.K., T.I., S.K., A.F., H.T.; Data analysis: H.F., A.I., S.I., H.A., M.A.; Figures and Tables: H.A., M.A.; Writing: M.K.

### Competing Interests

Nothing to disclose for M.S., H.F., J.K., T.I., S.K., A.F., A.I., S.I., H.A. and T.W. H.T. reports personal fees from Astellas, Otsuka, Takeda, Daiichi-Sankyo, Tanabe-Mitsubishi, Boehringer Ingelheim, Novartis, Bayer, and Bristol Myers Squibb. M.A. reports personal fees from Bayer, Ono, Otsuka, Pfizer, Sanofi, and Takeda. M.K. reports grants from the Japanese government during the conduct of the study; grants from the Japanese government, Japan Heart Foundation, Japan Cardiovascular Research Foundation, Novartis, and Nihon Kohden; grants and personal fees from Astellas, Pfizer, Tanabe-Mitsubishi, Ono, AstraZeneca and Kureha; and personal fees from Daiichi-Sankyo, Bayer, Kowa, MSD, Shionogi, Taisho-Toyama and Toa Eiyo.

Correspondence to Masafumi Kitakaze.