Multifractal foundations of biomarker discovery for heart disease and stroke

Any reliable biomarker has to be specific, generalizable, and reproducible across individuals and contexts. The exact values of such a biomarker must represent similar health states in different individuals and at different times within the same individual to result in the minimum possible false-positive and false-negative rates. The application of standard cut-off points and risk scores across populations hinges upon the assumption of such generalizability. Such generalizability, in turn, hinges upon this condition that the phenomenon investigated by current statistical methods is ergodic, i.e., its statistical measures converge over individuals and time within the finite limit of observations. However, emerging evidence indicates that biological processes abound with nonergodicity, threatening this generalizability. Here, we present a solution for how to make generalizable inferences by deriving ergodic descriptions of nonergodic phenomena. For this aim, we proposed capturing the origin of ergodicity-breaking in many biological processes: cascade dynamics. To assess our hypotheses, we embraced the challenge of identifying reliable biomarkers for heart disease and stroke, which, despite being the leading cause of death worldwide and decades of research, lacks reliable biomarkers and risk stratification tools. We showed that raw R-R interval data and its common descriptors based on mean and variance are nonergodic and non-specific. On the other hand, the cascade-dynamical descriptors, the Hurst exponent encoding linear temporal correlations, and multifractal nonlinearity encoding nonlinear interactions across scales described the nonergodic heart rate variability more ergodically and were specific. This study inaugurates applying the critical concept of ergodicity in discovering and applying digital biomarkers of health and disease.

www.nature.com/scientificreports/Heart rate variability (HRV) has been one of the key noninvasive biomarkers of cardiovascular health 10 .It measures the fluctuations and variations in time intervals between successive heartbeats or R-R intervals (RRi).HRV is an emergent phenomenon that emerges out of the complex and nonlinear interactions between the cardiovascular and nervous systems [11][12][13] and represents the peripheral output of the central autonomic network (CAN) and the capacity for behavioral adaption to environmental stresses [14][15][16][17][18][19][20][21][22] .Because it emerges from such complex and integral interactions, HRV can be a representative marker of cardiovascular health.Healthy human HRV indicates desirable balance and interaction between the functions of the sympathetic and parasympathetic nervous systems [23][24][25] .Group-level findings have shown that HRV might be superior to many other biomarkers in representing the overall state of health and well-being 26,27 .
Ergodicity is an essential requirement of a digital biomarker to be applied reliably in current medical practice.Similar values of a digital biomarker across different individuals must represent similar bodily states.In other words, standard cut-off points of such a biomarker must reliably separate the states of health and disease in each different individual 67,68 .Based on these practices, most medical research, similar to most biological, psychological, and social research, has aggregated the data gathered from randomly selected groups of individuals and used group-based statistical methods to reach conclusions.Such conclusions are then deemed generalizable to the behaviors of different individuals across different contexts.However, ergodicity is a requisite of this generalizability from group-level data to an individual's behaviors.In nonergodic measurements, the behaviors of an individual at a specific time diverge from the average of that measurement across a group of individuals and also the average of that individual's behaviors over an extended period [69][70][71][72] .Ergodicity refers to the convergence of these two averages: the finite-ensemble average and the finite-time average (Fig. 1).The finite-ensemble average, which is also recognized as the "sample average, " is x i (t), x i (t) , where x i (t) is the ith of N individual cases of x(t) included in the finite-ensemble average.The finite-time average when the measured behavior x changes at T = �t/δt discrete times t + δt, t + 2δt, . . . is x �t = 1 Tδt T τ =1 x(t + τ δt). (2) x(t + τ δt).x i (t).

HRV breaks ergodicity
To examine the ergodic properties of the RRi series (exemplified in Fig. 2a-c), we submitted the original RRi series and the corresponding shuffled versions to the Thirumalai-Mountain analysis 102,103 , which yields a dimensionless metric called the ergodicity breaking factor, E B , where is the time average mean-squared displacement of the stochastic series x(t) for lag time .Rapid decay of E B to a finite asymptotic value for progressively larger samples, i.e., E B → 0 as t → ∞ implies ergodicity.Slower decay indicates less ergodic systems in which trajectories are less reproducible.No decay or convergence to a finite asymptotic value indicates strong ergodicity breaking 104,105 .E B (x(t)) thus allows testing whether a given series breaks ergodicity.E B for the original RRi series did not decay at all with t in the finite range of 1000 secs, essentially remaining unchanged over a progressively longer time for healthy controls as well as the two patient groups ( E B (x(t)) = −0.0183� t , 0.0194 � t , and −0.0306 t for healthy controls, CHF nonsurvivors, and CHF survivors, respectively; colored lines in Fig. 2d-f).These values of E B (x(t)) indicate strong ergodicity breaking in the original RRi series.In contrast, E B for the shuffled RRi series rapidly decayed to a finite asymptotic value in the finite range of 1000 secs, indicating ergodicity ( E B (x(t)) = −1.0274� t , −1.0475 � t , and −1.0029 t for healthy controls, CHF nonsurvivors, and CHF survivors,

Table 1.
Baseline clinical characteristics of the chronic heart failure patients.Reproduced from Kiyono et al. 99 .BNP = brain natriuretic protein; BUN = blood urea nitrogen; Cr = creatinine; ACE = angiotensin-converting enzyme inhibitor; ARB = angiotensin II receptor blocker.respectively; grey lines in Fig. 2d-f).As by shuffling the original RRi series, the temporal structure and information of the RRi series are removed, these values of E B (x(t)) suggest that the very temporal structure of HRV is the source of nonergodicity in HRV.

Linear descriptors based on mean and variance are nonergodic
Now that we have witnessed ergodicity breaking in the raw RRi series, let us investigate the ergodic properties of some of the linear descriptors widely used in cardiovascular digital medicine: HRV parameters based on mean and variance 65 .Here, we chose the mean and root mean square of successive RR intervals, hereinafter noted as M and RMS, respectively.M and RMS were computed by dividing the RRi series into nonoverlapping epochs comprising 500 beats.So, for example, a raw RRi series of 100,000 beats yielded 2000 nonoverlapping epochs, each comprising 500 beats, ultimately yielding M and RMS series of 2000 samples each.Similar to the behavior of E B for the raw RRi series, E B for M and RMS did not decay at all over epochs in the narrow range of 20  .E B remained unchanged over a progressively larger number of epochs for all three groups (colored lines in Fig. 3a,c).In contrast, E B for M and RMS of the shuffled RRi series rapidly decayed to a finite asymptotic value in the narrow range of 20 epochs ( E B (M(epoch)) = −1.23373a,c).In other words, M and RMS-based HRV parameters failed to provide ergodic descriptions of HRV.Furthermore, the contrast between behaviors of E B for the original and the shuffled RRi series indicates that the very temporal structure of HRV contributes to this failure.
To test the specificity and reliability of these HRV parameters, we also performed Monte Carlo simulations by randomly sampling the 1000-sample RRi series from the 24-hour recordings for each individual and performing one-way ANOVA tests separately on these series' M and RMS values.We repeated this process 1000 times.One-way ANOVAs failed to detect reduced M of HRV due to the CHF, 36.8% and 12.7% times in nonsurvivors www.nature.com/scientificreports/(red histogram in Fig. 3b) and survivors (green histogram in Fig. 3b), respectively, compared to healthy controls.Likewise, one-way ANOVAs failed to detect reduced RMS of HRV due to the CHF, 34.6% and 11.3% times in nonsurvivors (red histogram in Fig. 3d) and survivors (green histogram in Fig. 3d), respectively, compared to healthy controls.In other words, we found a high likelihood of failing to identify statistically significant differences among the three groups' M and RMS.These results confirm that linear descriptors M and RMS cannot be used as reliable HRV parameters for digital biomarkers of health and disease.
Linear descriptors NN50 and pNN50 are only weakly ergodic but not specific The number of adjacent RR intervals that differ by more than 50 milliseconds and the percentage of such RR intervals are two other linear descriptors widely used as HRV parameters 65 .E B for NN50 and pNN50 had similar behavior to that of the shuffled RRi series; however, E B had a shallower initial decay for NN50 and pNN50 with a progressively larger number of epochs in the narrow range of  .Even- tually, E B reached an asymptotic finite but a relatively larger value over a progressively larger number of epochs (colored lines in Fig. 4a,c 4a,c).Thus, NN50 and pNN50 break ergodicity only weakly, providing more ergodic descriptions of the nonergodic HRV than the previous two linear descriptors, M and RMS.Again, the contrast between the original and shuffled RRi series indicates that the very temporal structure of HRV contributes to this weak ergodicity breaking by NN50 and pNN50.
To test the specificity and reliability of these parameters, we performed Monte Carlo simulations by randomly sampling 1000-sample RRi series from 24-hour recordings for each individual and performing one-way ANOVA tests separately on these series' NN50 and pNN50 values.We repeated this process 1000 times.One-way ANOVAs revealed that NN50 of HRV did not differ between either patient populations and healthy controls: CHF nonsurvivors and survivors (red histogram and green histogram, respectively, in Fig. 4b).Likewise, one-way (a) The ergodicity breaking parameter, E B , did not change for the mean of successive RR intervals, M, in the original RRi series over a progressively larger number of epochs (colored lines).In contrast, E B decayed rapidly for the M of the shuffled RRi series (grey lines).(b) Null hypothesis significance testing (NHST) for M across the three groups.One-way ANOVAs failed to detect reduced M of HRV due to the CHF, 36.8% and 12.7% times in nonsurvivors (red histogram) and survivors (green histogram), respectively, compared to healthy controls.(c) E B , did not change for the root mean square of successive RR interval differences, RMS, in the original RRi series over a progressively larger number of epochs (colored lines).In contrast, E B decayed rapidly for the RMS of the shuffled RRi series (grey lines).(d) Null hypothesis significance testing (NHST) for RMS across the three groups.One-way ANOVAs failed to detect reduced M of HRV due to the CHF, 34.6% and 11.3% times in nonsurvivors (red histogram) and survivors (green histogram), respectively, compared to healthy controls.Thin lines and thick lines in (a,c) represent E B for individuals and mean E B for the three groups, respectively.www.nature.com/scientificreports/ANOVAs revealed that pNN50 of HRV did not differ between either patient populations and healthy controls: CHF nonsurvivors and survivors (red histogram and green histogram, respectively, in Fig. 4d).Hence, NN50 and pNN50 might only weakly break ergodicity but also not diagnose CHF.

Cascade-dynamical descriptors H fGn and t MF are both ergodic and specific
We hypothesized that cascade-dynamical descriptors might provide ergodic descriptions of the nonergodic HRV by capturing the source of ergodicity breaking.The most compelling descriptions of cascading dynamics come from multifractal geometry 43,47,106 .Simulations of cascade processes show two critical features: long-range linear temporal correlations and nonlinear correlations involving interactions across timescales.The former feature appears most frequently as a fractional Gaussian noise (fGn) in which the standard deviation increases as a power function of the timescale.The fractional power in this function is known as the Hurst exponent, H fGn .H fGn and has already been shown to be sensitive to differences in HRV due to congestive heart failure 38 .The latter feature of nonlinear correlations concerns the effects spreading across the hierarchical organization of biological structures, producing a variation in H fGn , i.e., multifractality.We can estimate this nonlinear variation by estimating the variation in power functions over time and then comparing this multifractal variation to what a linear model of the underlying RRi series can produce.That is, by comparing the multifractality (i.e., the number of power functions) estimable for the original RRi series to the same multifractal property for a sample of synthetic RRi series.The one-sample t-test comparing the multifractality of the original to the synthetic RRi series provides a t-statistic, multifractal nonlinearity, t MF , which quantifies nonlinear correlations due to cascade dynamics 107 .
H fGn and t MF have been shown to provide ergodic descriptions of the nonergodic series of both simulated and empirical biological measurements 73,81,82 .We aim to determine whether H fGn and t MF can adequately describe the nonergodic HRV.Furthermore, we test whether H fGn and t MF provide specificity.H fGn and t MF were computed by first interpolating the RRi series to 2 Hz and then dividing it into nonoverlap- ping epochs comprising 1000 samples (i.e., spanning 500 s).So, for example, a raw RRi series of 100, 000 samples yielded 1000 nonoverlapping epochs, each comprising 1000 samples, ultimately yielding H fGn and t MF series of 2000 samples each.The behavior of E B for the epoch series of H fGn and t MF bears a strong resemblance to the behavior of E B for the shuffled raw RRi series.For H fGn , E B had an initial rapid decay in the narrow range of 20   5c).As epoch size increased, CHF survivors and CHF nonsurvivors exhibited an E B decay fully comparable with shuffled versions of the series for short to medium epoch sizes and all epoch sizes, respectively.Healthy patients showed much shallower decay for the original RRi series than for the shuffled RRi series.These results show that the cascade-dynamical descriptors, H fGn and t MF , provide more ergodic descriptions of the nonergodic HRV but only over a narrow range of short epoch sizes for healthy cases and only for longer epochs in CHF cases.The cascade-dynamical nature of HRV that contributed to the nonergodicity of linear descriptors like M and RMS was thus captured by cascadedynamical descriptors H fGn and t MF marginally more than traditional linear descriptors.But the truth is that these E B -vs.-epoch curves show a heterogeneity across epochs that we had not previously observed in theoretical simulations.Hence, the success of ergodic characterization by cascade-dynamical descriptors is mixed at best and encouraging only in contrast to the much poorer performance of the linear descriptors.In the Discussion, we reflect on what these mixed results could mean for methodological and theoretical concerns moving forward.
To test the specificity of these cascade-dynamical descriptors, we performed Monte Carlo simulations by randomly sampling 1000-sample RRi series from 24-hour recordings for each individual and performing oneway ANOVA tests separately on these series' H fGn and t MF values.We repeated this process 1000 times.One-way ANOVAs failed to detect reduced H fGn of HRV due to the CHF, 0.2% and 1.8% times in nonsurvivors (red histo- gram in Fig. 5b) and survivors (green histogram in Fig. 5b), respectively, compared to healthy controls.Likewise, one-way ANOVAs failed to detect reduced M of HRV due to the CHF, 13.2% and 2.4% times in nonsurvivors (red histogram in Fig. 5d) and survivors (green histogram in Fig. 5d), respectively, compared to healthy controls.Thus, H fGn and t MF provide ergodic descriptions of the nonergodic HRV and can specifically differentiate clinical The ergodicity breaking parameter, E B , decayed initially for the H fGn of the original RRi series over short epochs (colored lines).However, this decay was shallower compared to that of the H fGn of the shuffled RRi series(grey lines) overall longer epochs.(b) NHST of H fGn across the three groups.One-way ANOVAs failed to detect reduced H fGn of HRV due to the CHF, only 0.2% and 1.8% times in nonsurvivors (red histogram) and survivors (green histogram), respectively, compared to healthy controls.(c) E B decayed rapidly over epochs for the t MF of both the original RRi series (colored lines) and the shuffled RRi series (grey lines) but only for the CHF nonsurvivors.Rapid decays of E B for the original RRi series comparable to the shuffled RRi series only held for extremely short epochs in the healthy case and for small to medium epoch sizes in the CHF survivors.(d) One-way ANOVAs failed to detect reduced t MF of HRV due to the CHF, 13.2% and 2.4% times in nonsurvivors (red histogram) and survivors (green histogram), respectively, compared to healthy controls.Thin lines and thick lines in (a,c) represent E B for individuals and mean E B for the three groups, respectively.groups with high reliability.These results support our proposal of capitalizing cascade-dynamical descriptors as generalizable and reproducible HRV parameters for digital biomarkers of health and disease.
To assess the prognostic potential of the cascade-dynamical descriptors H fGn and t MF in predicting mortality among CHF patients 108,109 , survival analysis was conducted.Kaplan-Meier cumulative survival curves utilizing H fGn and t MF as predictive variables are depicted in Fig. 6.The dichotomization of these descriptors was deter- mined by their respective medians (Mdn).However, the analysis revealed that H fGn and t MF failed to predict mor- tality effectively.Mantel-Haenszel log-rank statistics were calculated, yielding hazard ratios of 0.814 [0.619, 1.010] with a p-value of 0.295 for H fGn and 0.846 [0.652, 1.040] with a p-value of 0.391 for t MF .These results suggest that neither of these descriptors exhibited significant prognostic capacities in this cohort of CHF patients, highlighting their limited utility.Therefore, incorporating cascade-dynamical descriptors like H fGn and t MF alongside conven- tional ones could be advantageous for optimizing predictivity, specificity, generalizability, and reproducibility in post-CHF prognosis until further advancements enhance the capabilities of these descriptors.

Discussion
Here, we surveyed various descriptors that could be used by traditional and digital medicine to inform the diagnosis and prognosis of cardiovascular conditions such as CHF to identify those descriptors that could provide predictive, specific, generalizable, and reproducible assessments.The primary risk we have highlighted is that the raw RRi series breaks ergodicity.This nonergodicity of HRV is a liability to clinical care because the raw RRi series fails to converge toward an average.Without this convergence, any sequence of RRi cannot be deemed sufficiently representative-whether of the patient's long-term HRV or groups with a definitive clinical diagnosis or prognosis.We found that many of the most conventional linear descriptors are all as nonergodic as the raw RRi series they summarize.We also identified that the primary source of these nonergodicities is the very temporal structure of HRV and its cascade-dynamical nature.Afterward, we hypothesized that this very origin of nonergodicities might hold the key to the ergodic descriptions of HRV.The cascade-dynamical descriptors H fGn and t MF confirmed this hypothesis, provided ergodic descriptions of the nonergodic HRV, and could also specifically differentiate clinical groups.However, our survival analysis indicated that even H fGn and t MF cannot sufficiently predict post-CHF prognosis.Previous works exploring the usefulness of cascade- dynamical descriptors have often relied on a suite of descriptors, including those sensitive to the non-Gaussian statistics of HRV 33,34,[96][97][98][99][100][101] .This suggests that it might be best to employ cascade-dynamical descriptors, like H fGn and t MF , alongside more traditional descriptors to achieve maximum predictivity, specificity, generalizability, and reproducibility.
The widespread adoption of smartwatches and other wearable biosensors with heart-rate monitoring capabilities has sparked hope for the early detection of cardiovascular diseases.However, the belief that more data is sufficient to improve predictions is overly optimistic and misguided.Machine learning/artificial intelligence (ML/ AI) models are being developed to achieve highly accurate, sensitive, and specific measures of cardiovascular health; however, to aptly capitalize on these powerful tools, the need for a theoretical understanding of heart rate variability must be addressed.Despite the availability of vast amounts of data, the advancement in understanding the nature of heart rate variability has been modest despite years of work and thousands of scientific publications.This limitation prevents us from making meaningful inferences using ML/AI models.The current reliance on manual or automatic feature extraction [110][111][112][113][114][115] is problematic since these features may not suitably reflect the www.nature.com/scientificreports/primary causal mechanisms and be too much dependent on contextual variables [116][117][118][119] .This study emphasizes that the current optimism surrounding the use of wearables and ML/AI models to detect cardiovascular diseases must be accompanied by a deeper understanding of the ergodicity-breaking behavior of HRV.
Our conclusions merit urgent attention as they show the unreliability of prevalent linear descriptors of HRVlike mean-based parameters.We have shown that some of the most intuitive conventional descriptors of HRV, like mean-based descriptors, are nonergodic.In contrast, cascade-dynamical descriptors, such as t MF , can improve the assessment of cardiovascular health when used along with traditional linear descriptors.These results align with those of previous studies that had reported that nonlinear descriptors could provide additional prognostic information compared to conventional linear descriptors 39,42,97,99,101 , e.g., short-term scaling exponent is a better predictor of mortality or other primary endpoints in cardiovascular patients 120,121 .Moreover, such nonlinear descriptors have even been found to be reproducible across different populations [122][123][124] and contexts, e.g., receiving or not receiving beta-blockers 125 , and different times and methods of measurement 39,123 .
Our results also show that a descriptor's ergodicity is necessary but insufficient for its prognostic capability.Although H fGn and t MF provided ergodic descriptions of HRV, they failed to predict mortality in CHF patients.Thus, although ergodicity is necessary for generalizable and reproducible inferences, to reach the utmost specific, generalizable, and reproducible assessment, combining descriptors that provide ergodic descriptions, like the cascade-dynamical descriptors investigated here, with other descriptors, like the conventional ones.These cascade dynamics descriptors encapsulate a pivotal facet of physiological functionality and remain firmly rooted in theoretical validity.Although the precise governing mechanism underlying the intricate heart rate dynamics remains elusive, a study by Lin and Hughson 37 has drawn attention to a captivating analogy between heart rate dynamics and turbulence.This analogy is unveiled through the revelation of structural parallels within the realm of multifractal formalism 126 -specifically, Lin and Hughson established a correlation between heart rate increments and spatial velocity differences within a stochastic cascade process, which serves as a model for hydrodynamic turbulence.In our present investigation, we have harnessed this concept together with long-range correlations characterized by the fractal Hurst exponent to delineate the heart's operation at a critical juncture 127,128 .Fluctuations occurring within systems operating near critical points are inherently entwined with scale invariance and universal behavior, as encapsulated by the scaling function 129,130 .Consequently, we could access the ergodic manifestation of these scale-invariant structures quantified by our cascade-dynamical descriptors 73,81,82 .
To also compare the nonlinear descriptors investigated here, it must be noted that H fGn is primarily a monofractal descriptor and is best suited to describe series generated based on one fractal-scaling exponent.However, the modeling of cascade dynamics due to nonlinear interactions across scales inherent to HRV is beyond the scope of H fGn and requires multifractal formalism 47,106,131 .Monofractal fluctuations such as fGn are ideally defined exhaustively by single fractional exponents H fGn and fall cleanly within the linear model through an autocorrelation function indicative of fractional integration 132,133 .The nonlinearity of interactions across scales requires not only one but many fractional scaling exponents in addition to strictly linear long-range correlations.Hence, multifractal modeling is necessary to analyze the putative cascade-dynamical route to nonergodicity thoroughly 73,81,82 , i.e., the inherently multifractal descriptor, t MF , is superior to H fGn in encoding nonlinear interactions across scales, which is characteristic of HRV.
It is important to recognize that ergodicity functions on various levels, including those of the individual and the group.When referring to an individual, ergodicity refers to the capacity to generalize through time.On the other hand, it entails extrapolating from a group level to an individual level when observed in a group setting.The former is ideal since it is compatible with personalized medicine; however, the latter situation is less preferable unless we support non-personalized medicine and more inclusive species-wide strategies.We mainly relied on the ergodicity breaking parameter E B 102,103,134 , which reflects a strictly intraindividual analysis.Specifically, the E B metric refers to the time average across various epochs within the same measurement series, i.e., intra- series variation of the mean, not inter-series variability.When we compare E B to shuffled versions of the same measurement series, this comparison again uses the original measurement to construct the standard.In other words, E B is never calculated by comparing one participant to another, let alone to any population parameters.Nonetheless, our stance is justified because ergodicity-breaking implies the absence of either type of ergodicity at the individual and group levels.
At first glance, it can seem absurd to seek ergodicity in a physiological measurement because biological and psychological processes routinely break ergodicity 69,70,72,74,[135][136][137][138][139][140] .However, biological and physiological sciences explicitly identify those scales and spans where ergodicity holds.They do so because ergodicity supports the clearest information-theoretic and predictive modeling that our statistics could marshal for understanding biology and physiology 141 .This carving out of ranges where ergodicity holds might be why we still have many valuable biomarkers used in clinical practice and established through extensive group data.Indeed, at clinically relevant timescales, the best biomarkers have been those with little temporal change, allowing the expectation of consistent repetition throughout time for each individual and accurately reflecting their biological status.We usually prefer such intraindividual ergodicity rather than the less likely ergodicity at the scale of a whole population of organisms.Whereas ergodicity is likely to fail at the scale of a species population, the best hope for biomarkers is with ergodicity within the intraindividual variation, such as might support a more contextsensitive, personalized clinical approach.Some other points also warrant further attention.For more comprehensive employment of nonlinear descriptors, such as those proposed here, especially in traditional medical settings, it might be necessary to provide more intuitive interpretations for clinicians and educate clinicians so that the biological basis of these mathematical parameters is clear 28 .Also, cascade dynamics is only one of the mechanisms that can lead to ergodicity-breaking physiological variabilities.It is still being determined whether all such mechanisms could be modeled as cascade processes (e.g., 95 ).Despite the central role of cascade processes in biomedical explanation 142 , we hope that future www.nature.com/scientificreports/investigations examine a broader class of anomalous diffusion regimes [143][144][145][146][147][148][149][150][151][152] that can also lead to ergodicitybreaking physiological variabilities.Further work is needed to determine whether cascade-dynamical descriptors enable reproducible health assessment when the sources of ergodicity breaking are more nuanced.The statistical modeling framework presented in the present study will be fundamental in guiding these investigations.This study presents several salient limitations warranting scrutiny: First and foremost, excluding patients with cardiac pacemakers represents a notable constraint.However, we must recognize that patients possessing dual-chamber devices, thoughtfully programmed to forestall atrial pacing, could have been judiciously integrated into the analytical framework.In such instances, the sinus rate and rhythm regulation would have mirrored those observed in non-paced patients.This consideration bears particular relevance in light of the escalating prevalence of CHF patients undergoing cardiac resynchronization therapy.Secondly, it is conceivable that the prognostic potential of the cascade-dynamical descriptor may have been obscured by the relatively modest sample size, comprising a mere cohort of 108 CHF patients.Consequently, it becomes imperative to embark on further inquiries to unravel the prognostic import of the cascade-dynamical descriptors within the realm of CHF prognosis.In summation, notwithstanding the commendable contributions engendered by this study, it is paramount to acknowledge and systematically address these limitations.Such endeavors are quintessential for fostering a more encompassing comprehension of the repercussions and applicability of the findings, particularly within the context of biomarker research.
Eventually, the challenges faced in this study and our proposed solutions should be unrestricted to the case of HRV and cardiovascular health.nonergodicity and cascade dynamics abound in biological processes and are regularities-not exceptions.Much more attention must be paid to the ergodicity of investigated biological phenomena.Moreover, in cases of ergodicity breaking, we have shown here and in previous studies 73,81,82 that cascade dynamics should be considered one of the primary candidates for its origin and that capturing this origin through nonlinear, multifractal, and cascade-dynamical descriptors may be the key to ergodically describing nonergodic phenomena.The importance of these insights cannot be exaggerated as they are crucial for reliable and reproducible diagnosis and prognosis across all fields.nonergodicity may be a signature of life, but seeking ergodicity in our generalizations and causal reasoning is pivotal for arriving at generalizable and reproducible digital biomarkers of health and disease.
In pragmatic terms, our investigation illuminates the inherent constraints of conventional biomarker descriptors predicated on the mean and variance calculations derived from raw R-R interval data.These erstwhile markers exhibit a pronounced nonergodic and non-specific character.In contrast, cascade-dynamical descriptors, exemplified by the Hurst exponent and multifractal nonlinearity, furnish a conspicuously more ergodic and precise portrayal of HRV.The discerned outcome carries profound implications, as it intimates that our cascade dynamics-centered methodology can unearth biomarkers of superior reliability germane to the domains of heart disease and stroke.This addresses a critical lacuna endemic to contemporary clinical practice, charting the course for integrating ergodicity paradigms into digital biomarker exploration.This paradigm shift holds the promise of catalyzing advancements in risk stratification and diagnostic precision, thereby auguring tangible enhancements in the quality of patient care within the intricate realm of cardiovascular health.Future inquiries should delve further into the comparative evaluation of our cascade-dynamical HRV descriptors vis-á-vis established clinical biomarkers, scrutinizing their synergistic potential to yield an amalgamated diagnostic and prognostic arsenal of heightened potency.

Theoretical and practical implications of ergodicity in mining physiological data for biomarkers: the curious choice of RRi series
A significant challenge in using the RRi series to yield a biomarker is the requirement for ergodicity.The standard linear descriptors fail to offer an ergodic description of nonergodic HRV.Theoretical work had previously found that time series like HRV with temporal correlations or non-Gaussian histograms thwart ergodic characterization by linear descriptors 73,81,82 .The same theoretical work had also found that descriptors derived for cascade-like dynamics, H fGN and t MF , could do better.We see in the present results that the ergodic description these cascade- dynamical parameters provide is, at once, better than that from linear descriptors but also only suitable for short time windows.The E B for H fGn and t MF for the original RRi series had decays comparable to the shuffled RRi series only for the smallest epochs.Beyond short epochs on the order of 5 epochs, H fGn series describing the RRi series over longer timescales have comparable ergodicity breaking as the prior linear descriptors.We could punt this limitation back to the fact that the linear model has room to encompass the linear autocorrelation encoded by H fGn .By such logic, t MF should be better at ergodically describing the nonergodic HRV because it encodes the nonlinear interactivity capable of breaking ergodicity 81 .However, the ergodicity breaking of the RRi series is so rampant that t MF now only provides a fleeting improvement over H fGn -t MF provides more ergodic description across all epochs than all prior descriptors but only for the surviving CHF case.This group showed E B for the original RRi series that decayed comparably to the shuffled RRi series across all epochs only in the CHF nonsurviving case.In healthy participants, t MF was scarcely better than H fGn , yielding E B -vs.-epoch cures resembling results for H fGn with the decay of E B resembling those for the shuffled RRi series for slightly longer epochs.In CHF nonsurviving case, t MF exhibited ergodicity breaking similar to H fGn across all epochs, with the E B -vs.-epoch cures for the original RRi series exhibiting shallow decay than the shuffled RRi series over short to medium epochs and even more shallow decay over the longer epochs.
These results reflect a convergence of multiple constraints all at once.First, finite-size limitations are a perennial constraint on empirical work, preventing a clean resemblance to the theoretical work.For instance, even theoretical work using simulated time series with lengths customary to most empirical work rarely shows E B converging to zero, even for the shuffled time series 73,81,82 .Second, previous theoretical work has already shown that the non-Gaussianity and temporal correlations implicit in cascade dynamics can together make the ergodic www.nature.com/scientificreports/characterization by H fGn mediocre, i.e., showing neither the non-decay of E B for characteristically nonergodic process nor the same rapid decay of E B for t MF 81 .Third, the non-Gaussianity of the HRV could be so excessive as to introduce asymmetric multifractal spectra (e.g., 153 ), and such asymmetry could be such as to cloud the t MF with surplus meaning.For instance, the t MF offers a way to test for multifractality arising from nonlinearity as the marginal difference in spectrum width between original and surrogate spectra (e.g., ref 47 ).Nevertheless, this proposed difference comes traditionally without clarifying how each side of the multifractal spectrum contributes to that marginal difference between original and surrogates.It is meanwhile well known that the left side of the spectrum is often more stable than the right for finite-length empirical series [154][155][156][157][158][159] .Hence, for time series with such multifaceted sources of ergodicity breaking as the RRi series, it is likely that t MF is a crude simplification of a multifractal spectrum with more asymmetric nuance than a simple t-test can convey.Non-Gaussianity and finite-size limitations on the RRi series may increase t MF for reasons that do not reflect nonlinearity.This point warrants reexamination of how we use t-tests for multifractal tests of nonlinearity, let alone how we use such tests as biomarkers.These results point to two broader concerns about how we even begin approaching the theoretical and methodological work implicit in mining physiological data for biomarkers.First, it may be worth considering the rationale for using the RRi series in the first place-given current wisdom about the RRi series as containing nonlinear correlations across multiple timescales, treating each RR interval as an isolated event is a curious choice.The reduction of the ECG time series to an interval series through first-differences (i.e., subtractions of previous R-event time from current R-event time) is a methodological choice that could have yet-unknown implications for the reliability of any signal-processing outcomes that follow (e.g., ref 160 ).Although this choice of how to reduce a raw series for subsequent analysis may be explicitly theoretical or may reflect convenience or habit, the fact that it can have implications for the reproducibility of a result may give us pause.Indeed, it is alarming that, at this late date, it remains an ongoing research question how to classify and detect the peaks in the QRS complex of an ECG record 161,162 .
Second, and more deeply theoretically, the heterogeneity of E B -vs.-epoch curves for empirical applications instead of theoretical demonstrations raises old and persistent questions about how to envision ergodicity.In effect, what sort of variable is ergodicity?Is it a dichotomy in which systems are or are not ergodic, with no gradation or grey areas between (e.g., ref 163 )?Such a position feels formally clean, but it may raise questions that reflect a pretheoretical choice instead of a conclusion informed by empirical tests.This choice presents some steep challenges for future scientific enterprise 164 .Then again, is ergodicity more of a continuum?There has been a decades-long tradition of dabbling in considering ergodicity as a continuous property that can become "more" or "less"-or even "quasi" [165][166][167][168][169][170][171][172][173][174][175][176][177][178] .This latter position sometimes deals in quotation marks around these terms as though to dodge the critique of being mush-mouthed or nonspecific-as though defending against or reacting to the claims to the clarity of theories enlisting ergodicity-as-dichotomy.The time may come when the practical need for biomarkers brings enough empirical scientists to the theoretical concern of ergodicity.Measurements like the RRi series and E B curves they yield may be an essential catalyst for the following dialogue.We suspect that the theoretical clarity from dichotomy may need to give way to some compromise with ergodicity-as-continuum if we are to confront the practicalities of diagnosis.Theoretical clarity and empirical/applied practicality reflect different motivations.Still, ergodicity-as-continuum may offer no less and more theoretical clarity-new theoretical clarity that could advise the interpretation of such heterogeneous E B curves.

Methods
Each patient gave informed written consent with full knowledge of the details.The ethics committee of Fujita Health University approved the research, which followed the guidelines stated in the Declaration of Helsinki.All data were fully anonymized before we accessed them.

Subjects
Based on the data of one of our previous studies 99 , we retrospectively enrolled the patients referred to the hospital of the Fujita Health University from January 2000 to December 2001 for assessment or treatment of CHF.24-hour monitoring of Holter ECG was conducted before their hospital discharge.To be eligible for this study, the patients had to be in normal sinus rhythm and had Holter ECG recordings whose periods taken up by artifacts or noise were less than 5% .No intravenous positive-inotropic agents or vasodilators were administered during the Holter ECG recordings.We excluded patients with chronic or paroxysmal atrial fibrillation, permanent or temporary cardiac pacemakers, active thyroid disease, or malignancy.

Follow-up and endpoint
We recorded the baseline data upon hospital discharge and the time-to-event information for each subject in a database.We then periodically sent questionnaires to patients or their families during the follow-up period and conducted telephone interviews to gather mortality information.Death from progressive heart failure was defined as death resulting from multi-organ failure caused by the progression of pump failure, and sudden death was defined as either witnessed cardiac arrest or death within one hour of onset of acute symptoms or the unexpected death of a patient known to have been well within the previous 24 hours.

Analysis of holter ECG
Using proprietary software, we digitized ECG signals at 125 Hz and 12 bits (Cardy Analyzer II, Suzuken Co., Ltd., Nagoya, Japan).We included only recordings with at least 22 hours of data in the analysis and > 95% of quantified sinus beats.Although the Cardy Analyzer II software had detected and labeled all QRS complexes in each recording, we manually corrected any errors in R-wave detection and QRS labeling.then exported www.nature.com/scientificreports/ the individual files containing the duration of individual RRi intervals and morphology classifications of individual QRS complexes (normal, supraventricular, and ventricular premature complexes, supraventricular, and ventricular escape beats).We analyzed the 24-hour sequence of intervals between two successive R waves of sinus rhythm (i.e., heart rate variability or HRV).To avoid the adverse effects of any remaining errors in detecting the R wave, we reviewed large ( > 20% ) consecutive RRi interval differences until all errors were corrected.In addition, when we encountered atrial or ventricular premature complexes, we interpolated the corresponding RRi intervals by the median of the two successive beat-to-beat intervals.We also confirmed that no sustained tachyarrhythmias were present in the HRV recordings.We then interpolated the observed RRi series with a cubic spline function and resampled at an interval ( t ) of 500 ms (2 Hz), yielding interpolated RRi series.
A previous study employing the same dataset, as reported by Kiyono et al. 99 , did not reveal any noticeable distinctions related to sex, disease severity based on the New York Heart Association classification, prevalence of ischemic heart disease, or ventricular premature beat frequency when comparing survivors and nonsurvivors.Furthermore, no significant differences were observed in key heart rate parameters, including mean RRi, timeand frequency-domain heart rate variability (HRV) measures, or the fractal exponents α 1 and α 2 , between the two groups.

Estimating descriptors of HRV for epoch series
We computed the following descriptors of HRV-linear descriptors over nonoverlapping 500-beat epochs extracted from the RRi series and fractal and multifractal descriptors over nonoverlapping 1000-sample epochs extracted from the interpolated RRi series.Hence, we computed fractal and multifractal descriptors in the time domain, as both are time-domain analytical methods.We computed these descriptors for the original (i.e., unshuffled) and a shuffled counterpart (i.e., a version with the temporal information destroyed) of each RRi series.

Conventional linear descriptors
We computed four linear descriptors of HRV.(i) Mean of successive RR intervals (M).(ii) Root mean square of successive RR intervals (RMS) mathematically defined as (iii) Number of pairs of successive RRi intervals that differ by more than 50 ms ( NN50 ).(iv) The percentage of successive RRi intervals that differ from each other by more than 50 ms ( pNN50).

Fractal-scaling descriptor of long-range correlations using monofractal detrended fluctuation analysis
Detrended fluctuation analysis (DFA) computes the Hurst exponent, H fGn , quantifying the strength of long-range correlations in series 179,180 using the first-order integration of T-length series x(t): DFA computes root mean square (RMS; i.e., averaging the residuals) for each linear trend y n (t) fit to N n nonoverlapping n-length bins to build a fluctuation function: where H fGn is the scaling exponent estimable using logarithmic transformation: Higher H fGn corresponds to stronger long-range correlations.

Multifractal spectrum width based on the direct estimation of singularity spectrum
Chhabra and Jensen's 181 direct method estimates multifractal spectrum width �α by sampling a series x(t) at progressively larger scales using the proportion of signal P i (n) falling within the vth bin of scale n as As n increases, P v (n) represents a progressively larger proportion of x(t), www.nature.com/scientificreports/suggesting a growth of the proportion according to one "singularity" strength α 133 .P(n) exhibits multifractal dynamics when it grows heterogeneously across time scales n according to multiple singularity strengths, such that whereby each vth bin may show a distinct relationship of P(n) with n.The width of this singularity spectrum, �α = (α max − α min ) , indicates the heterogeneity of these relationships 182,183 .
Chhabra and Jensen's 181 method estimates P(n) for N n nonoverlapping bins of n-sizes and transforms them into a "mass" µ(q) using a q parameter emphasizing higher or lower P(n) for q > 1 and q < 1 , respectively, in the form Then, α(q) is the singularity for mass µ-weighted P(n) estimated as Each estimated value of α(q) belongs to the multifractal spectrum only when the Shannon entropy of µ(q, n) scales with n according to the Hausdorff dimension f(q) 181 , where For values of q yielding a strong relationship between Eqs. ( 15) and ( 16)-in this study, correlation coefficient r > 0.9975 , the parametric curve (α(q), f (q)) or (α, f (α)) constitutes the multifractal spectrum and �α (i.e., α max − α min ) constitutes the multifractal spectrum width.r determines that only scaling relationships of comparable strength can support the estimation of the multifractal spectrum, whether generated as cascades or surrogates.Using a correlation benchmark aims to operationalize previously raised concerns about misspecifications of the multifractal spectrum 184 .

Surrogate testing using Iterated Amplitude Adjusted Fourier Transformation (IAAFT) generated t-statistic, t MF
While multifractality is necessary for cascade-like interactivity, multifractality is not conclusive evidence of cascade-like interactivity, as it can follow from other sources, e.g., linear autocorrelation and outliers in the histogram 185 .To identify whether non-zero multifractal spectrum width (i.e., �α > 0 ) reflected multifractality due to nonlinear interactions across scales, we compared �α for the original and shuffled RRi series to �α for 32 iterated amplitude adjusted Fourier transform (IAAFT) surrogates 186,187 .IAAFT randomizes original values time-symmetrically around the autoregressive structure, generating surrogates with randomized phase ordering of the series' spectral amplitudes while preserving linear temporal correlations.We refer interesting readers to Kelty-Stephen et al. 107 for a step-by-step guide to generating the IAAFT surrogates for any series.The resulting surrogate series should thus have the same values as the original series and thus the same mean and variance.It should also have the same amplitude spectrum and autocorrelation function as the original series.The onesample t-statistic, t MF takes the subtractive difference between �α for the original series and that for 32 surrogates, dividing by the standard error of �α for the surrogates.

Estimating ergodicity breaking parameter, E B
Ergodicity can be quantified using a dimensionless statistic of ergodicity breaking parameter, E B , also known as the Thirumalai-Mountain metric 102,103 and already mentioned by Rytov et al. 134 , computed as where δ 2 (x(t)) = t−� 0 [x(t ′ + �) − x(t ′ )] 2 dt ′ (t − �) is the time average mean-squared displacement of the stochastic series x(t) for lag time .This relationship is effectively the variance of sample variance divided by the total-sample squared variance.Rapid decay of E B to a finite asymptotic value for progressively larger samples, α(q) = − lim www.nature.com/scientificreports/i.e., E B → 0 as t → ∞ implies ergodicity.Thus, for Brownian motion E B (x(t)) = 4 3 ( � t ) 148,188 .Slower decay indi- cates less ergodic systems in which trajectories are less reproducible, and no decay or convergence to a finite asymptotic value indicates strong ergodicity breaking 104,105 .E B (x(t)) thus allows testing whether a given series fulfills ergodic assumptions or breaks ergodicity.For instance, it has been shown that for fractional Brownian motion (FBM) 104,105 , The present work is less focused on firmly meeting the criterion of E B converging to zero within our finite samples.Instead, we compared the original and shuffled RRi series to assess ergodicity breaking instead of strict convergence of E B to zero.We computed E B for each original and shuffled RRi series (range = T/50 ; lag = 10 ) and for each epoch series of M, RMS, NN50 , PNN50 , H fGn , and t MF for the original and shuffled RRi series (range = N epochs /2 ; lag = 1).

Monte Carlo simulations
We performed Monte Carlo simulations to test our hypothesis that ergodicity breaking by various linear and cascade-dynamical HRV descriptors could compromise these descriptors' reliability as diagnostic biomarkers.We randomly sampled 1000-sample RRi series from 24-hour recordings for each individual and performed linear mixed-effects models separately on M, RMS, NN50 , pNN50 , H fGn , t MF , values calculated from these series.We used linear mixed-effects models with each descriptor as the dependent variable and the participant group as the independent variable.The t-statistic and the resultant p value were saved across the 1000 iterations.We performed all mixed-effects modeling in MATLAB 2022b (Mathworks, Inc., Natick, MA) using the function fitlme().

Survival analysis
We examined whether H fGn and t MF were predictive of death using univariate Cox proportional hazards regres- sion analysis 108,109 .We used the Mantel-Haenszel log-rank test to compare Kaplan-Meier cumulative survival curves to examine the impact of identified risk factors on survival.We performed all survival analysis in R 189 using the function coxph() from the package "survival" 190 .The sex ratio among the survivors, as presented in Table 1, exhibits a notable skew towards males (42M/27F) when contrasted with the non-survivors (19M/20F).Given that sex constitutes a pivotal physiological determinant in cardiac health, it merits consideration in interpreting results.Nonetheless, it is worth noting that a prior study utilizing the same dataset failed to uncover any discernible disparities concerning sex between survivors and nonsurvivors 99 .Consequently, we lacked a compelling rationale to anticipate that sex would influence any facet of the current analysis.(

Figure 1 .
Figure 1.Nonergodicity refers to the lack of equivalence between finite-ensemble and finite-time averages.The finite-ensemble average, which biomedical discourse recognizes as the "sample average, " is �x i (t)� N = 1 N

Figure 2 .
Figure 2. The raw R-R interval (RRi) series are nonergodic.(a-c) Representative examples of the original and shuffled RRi series (colored lines and grey lines, respectively).(a) The RRi series for a healthy control (a 54-year-old woman).(b) The RRi series for a 74-year-old man with congestive heart failure (CHF) who died 101 days after the measurement.(c) The RRi series for an 82-year-old woman who survived CHF.The original RRi series, for healthy controls (d) as well as the two patient groups (e, f), show no change in the ergodicity breaking parameter, E B , over progressively longer periods, reflecting that HRV breaks ergodicity (colored lines).Shuffling the original RRi series produces an RRi series that is ergodic, as indicated by the rapid decay in E B over progressively longer periods (grey lines).Thin lines and thick lines in (d-f) represent ergodicity breaking for individuals and mean ergodicity breaking for the three groups, respectively.

Figure 3 .
Figure 3. Commonly used linear descriptors of HRV based on mean and root mean square are nonergodic.(a) The ergodicity breaking parameter, E B , did not change for the mean of successive RR intervals, M, in the original RRi series over a progressively larger number of epochs (colored lines).In contrast, E B decayed rapidly for the M of the shuffled RRi series (grey lines).(b) Null hypothesis significance testing (NHST) for M across the three groups.One-way ANOVAs failed to detect reduced M of HRV due to the CHF, 36.8% and 12.7% times in nonsurvivors (red histogram) and survivors (green histogram), respectively, compared to healthy controls.(c) E B , did not change for the root mean square of successive RR interval differences, RMS, in the original RRi series over a progressively larger number of epochs (colored lines).In contrast, E B decayed rapidly for the RMS of the shuffled RRi series (grey lines).(d) Null hypothesis significance testing (NHST) for RMS across the three groups.One-way ANOVAs failed to detect reduced M of HRV due to the CHF, 34.6% and 11.3% times in nonsurvivors (red histogram) and survivors (green histogram), respectively, compared to healthy controls.Thin lines and thick lines in (a,c) represent E B for individuals and mean E B for the three groups, respectively. https://doi.org/10.1038/s41598-023-45184-2 e p o c hs t h at b e c am e s h a l l owe r ove r a pro g re s s ive ly l arge r nu mb e r of e p o c hs ( E B (H fGn (epoch)) = −0.4640� epoch , −0.1648 � epoch and − 0.2121 � epoch ; colored lines in Fig.5a).For t MF , E B rapidly decayed initially over a progressively larger number of epochs in the narrow range of 20 epochs

Figure 4 .
Figure 4. NN50 and pNN50 are specific but only weakly ergodic.(a) The epoch series of NN50 describing the original RRi series show an initial decay in the ergodicity breaking parameter, E B , with epochs (colored lines), albeit shallower compared to the epoch series of NN50 describing the shuffled RRi series (grey lines).(b) Null hypothesis significance testing (NHST) for NN50 across the three groups.One-way ANOVAs revealed that NN50 of HRV did not differ between either patient populations and healthy controls: CHF nonsurvivors and survivors (red histogram and green histogram, respectively).(c) The epoch series of pNN50 describing the original RRi series show an initial decay in E B over epochs (colored lines), albeit shallower compared to the epoch series of pNN50 describing the shuffled RRi series (grey lines).(d) NHST for pNN50 across the three groups.One-way ANOVAs revealed that pNN50 of HRV did not differ between either patient populations and healthy controls: CHF nonsurvivors and survivors (red histogram and green histogram, respectively).Thin lines and thick lines in (a,c) represent E B for individuals and mean E B for the three groups, respectively.

Figure 5 .
Figure 5. Cascade-dynamical descriptors of long-range correlations, H fGn , and multifractal nonlinearity, t MF , are ergodic and specific.(a)The ergodicity breaking parameter, E B , decayed initially for the H fGn of the original RRi series over short epochs (colored lines).However, this decay was shallower compared to that of the H fGn of the shuffled RRi series(grey lines) overall longer epochs.(b) NHST of H fGn across the three groups.One-way ANOVAs failed to detect reduced H fGn of HRV due to the CHF, only 0.2% and 1.8% times in nonsurvivors (red histogram) and survivors (green histogram), respectively, compared to healthy controls.(c) E B decayed rapidly over epochs for the t MF of both the original RRi series (colored lines) and the shuffled RRi series (grey lines) but only for the CHF nonsurvivors.Rapid decays of E B for the original RRi series comparable to the shuffled RRi series only held for extremely short epochs in the healthy case and for small to medium epoch sizes in the CHF survivors.(d) One-way ANOVAs failed to detect reduced t MF of HRV due to the CHF, 13.2% and 2.4% times in nonsurvivors (red histogram) and survivors (green histogram), respectively, compared to healthy controls.Thin lines and thick lines in (a,c) represent E B for individuals and mean E B for the three groups, respectively.