Abstract
Novel coronavirus infection (COVID-19) has exerted a certain burden on global public health, spreading around the world with reportedly low mortality and morbidity. This study advocates a novel bio- and health-system reliability approach that is particularly suitable for multi-regional environmental and health systems monitored over a sufficiently representative period of time. The advocated spatiotemporal method has been cross-validated against the well-established bivariate Weibull method, using an available raw clinical dataset. The purpose of this study was to assess the risk of excessive coronavirus death rates that may occur within any given time horizon and in any region or district of interest. The study also aims to benchmark the novel Gaidai bio-reliability method, which allows national public health system risks to be assessed for the years to come. When the underlying biosystem is stationary, or the underlying trend is known, long-term future death-rate risks can be assessed and confidence intervals can be generated. The advocated methodology may be useful for a wide variety of public health applications and is not limited to the example considered here.
Introduction
Statistical characteristics of COVID-19 (SARS-CoV-2) and other comparable recent influenza outbreaks have received substantial research interest in recent years^{1,2,3}. Environmental effects on biological systems typically follow cyclical patterns. For environmental effects see ref. ^{4}; for meteorological parameters see ref. ^{5}; for heat stress and thermal perception see ref. ^{6}. In general, determining the reliability factors of an actual biological system in order to assess future epidemic outbreak risks is fairly challenging, given the variety of epidemic and environmental factors involved. In principle, direct Monte Carlo (MC) simulations or a sufficient number of raw clinical observations might be sufficient to evaluate the reliability of a complex biological system. However, clinical observational data for COVID-19 are limited to the years 2020–2022. To address the challenge of such a limited underlying clinical dataset, the authors have developed the Gaidai reliability approach, which is suitable for biological and health systems when the risks of near-future epidemic outbreaks are of interest. COVID-19 outbreaks in Singapore were the primary focus of this study, which examined cross-correlations between various health data from the same climatic zone. Singapore was chosen because of its extensive national health surveillance and its publicly accessible raw clinical data^{7}. Engineering and medical research both make extensive use of statistical lifetime data modeling by extreme value theory (EVT)^{8}. In ref. ^{9} EVT was used by its authors to forecast H1N1 (swine flu) epidemiological risks. For spatial lag and error models, along with regression techniques, see ref. ^{10}. In the current study an epidemic outbreak is defined as an unexpected random event that might occur at any time and in any administrative region of a particular national health care system. The spatiotemporal aspect of epidemiological risk has therefore been taken into account.
A non-dimensional parameter λ has been introduced to unify various national regions with different epidemiological backgrounds into one multidimensional biodynamic system.
Singapore’s COVID-19 raw clinical data have been retrieved from a public source^{7}. The national public health system under investigation has been modeled as a multi-degree-of-freedom (MDOF) dynamic biosystem with strongly correlated administrative components (spatial dimensions). The goal of this study was to assess the risks of future epidemiological outbreaks; hence the authors considered only daily reported patient numbers, not symptoms. The map of Singapore represents specific recorded clinical instances.
Based on a quasi-stationarity assumption, this study assumed that, despite seasonal fluctuations, the underlying epidemiological process would be statistically representative throughout the two consecutive observational years, 2020–2022. If an underlying trend is of interest, it should be identified first, and the epidemiological thresholds should be made variable with time. In the latter case, the Gaidai-Yakimov method can be applied even to non-stationary biosystems.
Method
The MDOF dynamic system is represented here by a collection of its critical (key) components or dimensions, combined into the biosystem’s representative vector \(\left(X\left(t\right),Y\left(t\right),Z\left(t\right),\ldots \right)\), whose key components \(X\left(t\right),Y\left(t\right),Z\left(t\right),\ldots\) have been measured or observed over a sufficiently long (representative) clinical period \((0,T)\). The global maxima of the biosystem components are denoted \({X}_{T}^{\max }=\mathop{\max }\limits_{0\le t\le T}X\left(t\right)\), \({Y}_{T}^{\max }=\mathop{\max }\limits_{0\le t\le T}Y\left(t\right)\), \({Z}_{T}^{\max }=\mathop{\max }\limits_{0\le t\le T}Z\left(t\right),\ldots\). By a sufficiently long clinical/observational duration \(T\) the authors primarily mean a duration that is long with respect to the dynamic biosystem’s relaxation and autocorrelation time scales. Let \({X}_{1},\ldots ,{X}_{{N}_{X}}\) be temporally consecutive local maxima of the biosystem component \(X=X(t)\), occurring at discrete, increasing time instants \({t}_{1}^{X} < \ldots < {t}_{{N}_{X}}^{X}\) within the clinical observational period \((0,T)\). Identical definitions can be given for the other key components of the MDOF biosystem, \(Y\left(t\right),Z\left(t\right),\ldots\), namely \({Y}_{1},\ldots ,{Y}_{{N}_{Y}};\) \({Z}_{1},\ldots ,{Z}_{{N}_{Z}}\) and so on. For simplicity, all biosystem components, and hence their local maxima, have been assumed to be positive. Hence:
\[P=\int_{0}^{{\eta }_{X}}\int_{0}^{{\eta }_{Y}}\int_{0}^{{\eta }_{Z}}\cdots \;{p}_{{X}_{T}^{\max },{Y}_{T}^{\max },{Z}_{T}^{\max },\ldots }\left({x}_{T}^{\max },{y}_{T}^{\max },{z}_{T}^{\max },\ldots \right)\,d{x}_{T}^{\max }\,d{y}_{T}^{\max }\,d{z}_{T}^{\max }\ldots \tag{1}\]
representing the biosystem’s survival probability \(P\), given in terms of the joint PDF (probability density function) \(p\). Owing to the biosystem’s high dimensionality, it is not practical to assess \({p}_{{X}_{T}^{\max },{Y}_{T}^{\max },{Z}_{T}^{\max },\ldots }\) directly. When any key component exceeds its limit, that is, when \(X\left(t\right)\) exceeds \({\eta }_{X}\), or \(Y\left(t\right)\) exceeds \({\eta }_{Y}\), or \(Z\left(t\right)\) exceeds \({\eta }_{Z}\), etc., the biosystem is viewed as having instantly failed or entered a state of hazard. The fixed hazard/failure levels \({\eta }_{X}\), \({\eta }_{Y}\), \({\eta }_{Z}\),… are set individually for each 1D (one-dimensional) biosystem component. The target biosystem survival probability \(P\) is needed to assess the biosystem’s expected lifetime. The biosystem’s 1D key components \(X,Y,Z,\ldots\) are now rescaled and non-dimensionalized:
\[X\to \frac{X}{{\eta }_{X}},\quad Y\to \frac{Y}{{\eta }_{Y}},\quad Z\to \frac{Z}{{\eta }_{Z}},\ldots \tag{2}\]
making all biosystem key components non-dimensional, with identical failure/hazard limits equal to 1. A synthetic, temporally non-decreasing vector is now created by merging the local maxima of the biosystem components into one 1D combined system vector \(\vec{R}=\left({R}_{1},{R}_{2},\ldots ,{R}_{N}\right)\), coherent with the corresponding combined temporal vector \({t}_{1}\le \ldots \le {t}_{N}\), \(N\le {N}_{X}+{N}_{Y}+{N}_{Z}+\ldots\). Each entry \({R}_{j}\) of the vector \(\vec{R}\) is a key component local maximum actually observed within the biosystem’s temporal record, occurring within either \(X\left(t\right)\), or \(Y\left(t\right)\), or \(Z\left(t\right)\), or another biosystem component. The constructed synthetic vector \(\vec{R}\) entails no data loss.
The temporally non-decreasing synthetic vector \(\vec{R}\), together with the occurrence time instants \({t}_{1}\le \ldots \le {t}_{N}\) of its components, has now been fully introduced^{11,12,13}.
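The construction of the synthetic vector \(\vec{R}\) described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `local_maxima` and `merge_scaled_maxima`, the plateau convention for local maxima, and the toy data are assumptions made purely for illustration.

```python
import numpy as np

def local_maxima(t, x):
    """Return times and values of interior local maxima of x(t).

    Assumed convention: a point is a local maximum if it is strictly larger
    than its left neighbour and not smaller than its right neighbour.
    """
    t = np.asarray(t, dtype=float)
    x = np.asarray(x, dtype=float)
    idx = np.where((x[1:-1] > x[:-2]) & (x[1:-1] >= x[2:]))[0] + 1
    return t[idx], x[idx]

def merge_scaled_maxima(t, components, limits):
    """Scale each component by its hazard limit (as in Eq. (2)) and merge
    all local maxima into one vector ordered by occurrence time."""
    times, values = [], []
    for x, eta in zip(components, limits):
        tm, xm = local_maxima(t, x)
        times.append(tm)
        values.append(xm / eta)  # non-dimensional, hazard limit becomes 1
    times = np.concatenate(times)
    values = np.concatenate(values)
    order = np.argsort(times, kind="stable")  # temporally non-decreasing
    return times[order], values[order]

# Hypothetical toy records of two components sampled at the same instants.
t = np.arange(7.0)
X = np.array([1.0, 3.0, 2.0, 2.0, 5.0, 4.0, 4.5])
Y = np.array([10.0, 8.0, 12.0, 9.0, 9.0, 20.0, 5.0])
t_R, R = merge_scaled_maxima(t, [X, Y], limits=[5.0, 20.0])
```

Because every observed local maximum of every component is retained, the merged vector has at most \({N}_{X}+{N}_{Y}+\ldots\) entries, in line with the statement that no data are lost.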
Results
This section applies the advocated approach to a bivariate random bioprocess \(Z(t)=(X(t),Y(t))\) to demonstrate its efficiency. The approach uses daily records of diagnosed COVID-19 patients, with the components \(X(t),Y(t)\) monitored synchronously over a certain observational time span \((0,T)\). It is assumed for simplicity that the samples \(({X}_{1},{Y}_{1}),\ldots ,({X}_{N},{Y}_{N})\) within the observational time period \(\left(0,T\right)\) were collected at \(N\) equidistant discrete time instants \({t}_{1},\ldots ,{t}_{N}\)^{11,12,14,15}, yielding the bivariate joint CDF \(P\left(\xi ,\eta \right):={\rm{Prob}}\left({\hat{X}}_{N}\le \xi ,{\hat{Y}}_{N}\le \eta \right)\) of the 2D vector \(\left({\hat{X}}_{N},{\hat{Y}}_{N}\right)\), with components \({\hat{X}}_{N}=\max \left\{{X}_{j}{\rm{;}}j=1,\ldots ,N\right\}\) and \({\hat{Y}}_{N}=\max \left\{{Y}_{j}{\rm{;}}j=1,\ldots ,N\right\}\). The bioprocess thus serves as an example of a dynamic two-dimensional (2D) system^{12,13,16}. Critical thresholds were found from one-dimensional extreme response values with given return periods and probabilities \(p\). Both time series \(X,Y\) have been scaled in accordance with Eq. (2), so that each of the two biosystem components has a failure/hazard limit equal to 1. Then, preserving temporally non-decreasing order, the local maxima from each measured component time series have been combined into one single time series \(\vec{R}=\left(\max \left\{{X}_{1},{Y}_{1}\right\},\ldots ,\max \left\{{X}_{N},{Y}_{N}\right\}\right)\).
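For the bivariate case, the combined series and a purely empirical exceedance estimate can be sketched as follows. This is only an illustration under stated assumptions: the helper names `combined_series` and `empirical_exceedance`, the hazard limits and the sample values are hypothetical, and the sketch does not reproduce the extrapolation step of the Gaidai method, only the empirical fraction of scaled samples above a level λ.

```python
import numpy as np

def combined_series(x, y, eta_x, eta_y):
    """Scale both components by their hazard limits and take the
    pointwise maximum, giving one non-dimensional series R."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return np.maximum(x / eta_x, y / eta_y)

def empirical_exceedance(r, lam):
    """Fraction of entries of r that strictly exceed the level lam."""
    return float(np.mean(np.asarray(r) > lam))

# Hypothetical synchronous daily records and hazard limits.
X = np.array([100.0, 250.0, 400.0, 150.0])
Y = np.array([1.0, 4.0, 2.0, 5.0])
R = combined_series(X, Y, eta_x=500.0, eta_y=5.0)
p_half = empirical_exceedance(R, lam=0.5)
```

With real data the level of interest is λ = 1, which is rarely or never exceeded in the record; this is why an extrapolation technique, rather than the raw empirical fraction, is needed in practice.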
Synthetic environmental example
The authors selected a synthetic example for which the exact analytical solution is known in advance. This made it possible to cross-validate the advocated reliability method against the well-established bivariate Weibull method. Note that the Gaidai method can tackle high-dimensional systems, whereas the bivariate Weibull method is suitable only for 2D (two-dimensional) systems. This is a distinctive advantage of the Gaidai method.
The wind speed 3.65-day maxima process \(X\left(t\right)\) has been modeled within the time period \(\left[0,T\right]\), based on a stationary underlying Gaussian stochastic process \(U\left(t\right)\) with zero mean and standard deviation equal to 1. It was assumed for simplicity that the mean zero up-crossing rate of \(U\left(t\right)\) equals \({\nu }_{U}^{+}\left(0\right)={10}^{3}/T\), with \(T=1\) year^{14,15,17,18,19}. As a result, the wind speed maxima process \(X\left(t\right)\) has 365/3.65 = \({10}^{2}\) data points annually, and the total data record contains \({10}^{4}\) data points, which is equivalent to 100 years. The 3.65-day maxima of the underlying wind speed process \(U\left(t\right)\) have the analytical CDF (cumulative distribution function) \({F}_{X}^{3d}\left(x\right)=\exp \left\{-q\exp \left(-\frac{{x}^{2}}{2}\right)\right\}\), corresponding to the 3.65-day wind speed maxima process, denoted \({X}^{3d}\left(t\right)\). Gumbel-Hougaard, Frank, and Clayton are three Archimedean copulas that are often used. The Gumbel-Hougaard copula \(G\left(u,v\right)\) dependence structure was considered first, modeling the cross-correlation between the marginal peak wind speed variable \({X}^{3d}\left(t\right)\) and a symmetrically distributed, cross-correlated process \({Y}^{3d}\left(t\right)\):
\[G\left(u,v\right)=\exp \left\{-{\left[{\left(-\mathrm{ln}\,u\right)}^{m}+{\left(-\mathrm{ln}\,v\right)}^{m}\right]}^{1/m}\right\},\quad m\ge 1\]
with \({X}^{3d}\left(t\right)\) and \({Y}^{3d}\left(t\right)\) having a correlation coefficient \({R}_{{{\rm{corr}}}}\) of 0.5, and the parameter \({m}=1/\sqrt{1-{R}_{{{\rm{corr}}}}}\) being connected to the correlation coefficient \({R}_{{{\rm{corr}}}}\). Since stationary random Gaussian processes underlie both \({X}^{3d}=X\left(t\right)\) and \({Y}^{3d}=Y\left(t\right)\), the Gumbel-Hougaard copula is easily adaptable; hence the bivariate Weibull method prediction agrees well with both the analytical solution x = 6 and the Gaidai prediction. The exact bivariate CDF reads:
\[F\left(x,y\right)=G\left({F}_{X}^{3d}\left(x\right),{F}_{Y}^{3d}\left(y\right)\right)=\exp \left\{-q{\left[\exp \left(-\frac{m{x}^{2}}{2}\right)+\exp \left(-\frac{m{y}^{2}}{2}\right)\right]}^{1/m}\right\}\]
Figure 1 presents the simulated (synthetic) time series, coalesced into the 1D system vector \(\vec{R}\). The bivariate Weibull contour at the target probability level, containing the selected bivariate test point \(\left({X}^{3d},{Y}^{3d}\right)=\left(6,\,5.2\right)\), agreed well with both the analytical solution and the Gaidai method’s prediction \(R=6\), as expected, since the underlying stochastic process was rather simple. Second, the equivalent Clayton copula \(C\left(u,v\right)\), an asymmetric Archimedean copula, was used in place of the Gumbel-Hougaard copula.
The Clayton copula is more challenging for the bivariate Weibull method to fit, since it is not part of the copula library: only the asymmetric logistic and Gumbel logistic copulas are currently implemented^{20,21,22,23,24,25,26}. The bivariate Weibull method is therefore expected to perform less accurately than the Gaidai method in this case.
For the specific numerical example described above, it was found that, on average, the Gaidai method performed 15–20% more accurately than the bivariate Weibull technique. For raw measured non-Gaussian data, cross-correlated by non-Archimedean copulas, the advantage of the Gaidai method would be more pronounced. Last but not least, the bivariate Weibull method clearly required more processing time than the Gaidai approach for any given bivariate failure/hazard limit, since it performs 2D surface interpolation. The Gaidai method produced a 95% CI (confidence interval), whereas the bivariate Weibull method has no such ability.
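The copula-based joint CDF used in the synthetic example can be sketched in a few lines of Python. The sketch relies on the standard form of the Gumbel-Hougaard copula and on the marginal CDF of the maxima given above. The value \(q=10\) is an assumption not stated in the text: it follows from taking \(q\) as the expected number of zero up-crossings per 3.65-day block, \(q={\nu }_{U}^{+}\left(0\right)\cdot T/100\), under a Poisson up-crossing approximation. The function names are hypothetical.

```python
import math

def gumbel_hougaard(u, v, m):
    """Standard Gumbel-Hougaard copula G(u, v) for 0 < u, v <= 1, m >= 1."""
    a = (-math.log(u)) ** m
    b = (-math.log(v)) ** m
    return math.exp(-((a + b) ** (1.0 / m)))

def marginal_cdf(x, q):
    """CDF of block maxima of a standard Gaussian process:
    F(x) = exp(-q * exp(-x^2 / 2))."""
    return math.exp(-q * math.exp(-0.5 * x * x))

def joint_cdf(x, y, q, m):
    """Bivariate CDF obtained by coupling two identical marginals."""
    return gumbel_hougaard(marginal_cdf(x, q), marginal_cdf(y, q), m)

R_CORR = 0.5                          # correlation coefficient from the text
M = 1.0 / math.sqrt(1.0 - R_CORR)     # copula parameter m = 1/sqrt(1 - R_corr)
Q = 10.0                              # ASSUMPTION: (1e3 / T) * (T / 100), see lead-in

# Joint non-exceedance probability at the bivariate test point (6, 5.2).
p_joint = joint_cdf(6.0, 5.2, Q, M)
```

For \(m=1\) the copula reduces to the independence case \(G(u,v)=uv\), which offers a quick sanity check of any implementation.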
Method validation
Figure 2 presents an example of Singapore COVID-19 raw clinical death rate data, recorded during the years 2020–2022 and shown as an observed time series.
Figure 3 presents bivariate Weibull contours for the Singapore COVID-19 death rate data^{7}. As seen from Fig. 3, there is an intrinsic inaccuracy, owing to the specific copula chosen within the bivariate Weibull fit to the raw measured dataset. For more information on the bivariate Weibull technique, see refs. ^{19,20}. The bivariate failure/hazard test point \(\left(X,Y\right)=\left({44,000,\,65}\right)\) has been selected for comparison between the two methods (Gaidai and bivariate Weibull), as this test point lies on the \(p={10}^{-1.3}\) contour line predicted by the bivariate Weibull technique. The 95% CI produced by the Gaidai method included the bivariate point used by the bivariate Weibull method^{20,21,22,23}. The high dimensionality (say, above 2D) of biological and health systems makes it challenging to produce accurate multivariate predictions from the relatively limited raw clinical datasets available. Hence the novel health system reliability approach described above has the advantage of using measured clinical datasets optimally, while taking the biosystem’s high dimensionality into account.
A Poincaré-type plot may be used to analyze intrinsic structural patterns in the data; the second-order difference plot (SODP) is a natural starting point. Being based on consecutive differences, the SODP can be used to observe raw time-series data statistically^{24}.
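As an illustration of what an SODP displays, the following minimal Python sketch computes the point pairs of a second-order difference plot under its usual definition, in which \(x(n+1)-x(n)\) is plotted against \(x(n+2)-x(n+1)\). The function name `sodp_points` and the sample series are hypothetical, and plotting is left out so that the block stays dependency-light.

```python
import numpy as np

def sodp_points(x):
    """Return the coordinate pairs of a second-order difference plot.

    a[n] = x[n+1] - x[n] is the horizontal coordinate and
    b[n] = x[n+2] - x[n+1] is the vertical coordinate.
    """
    x = np.asarray(x, dtype=float)
    d = np.diff(x)          # consecutive first differences
    return d[:-1], d[1:]    # each difference paired with the next one

# Hypothetical short series of daily counts.
series = np.array([3.0, 5.0, 4.0, 8.0, 7.0])
a, b = sodp_points(series)
```

A scatter plot of `a` against `b` then shows how strongly consecutive day-to-day changes vary: a tight cluster around the origin indicates a smooth record, while widely scattered points indicate abrupt changes or outliers.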
Figure 4a presents the SODP. When an entropy-based AI (artificial intelligence) recognition approach is employed, SODPs may be used to spot data patterns and compare them with other similar datasets^{14,15,25,26,27}. This study did not focus on AI pattern analysis as such; Fig. 4a can therefore be seen as motivation for further research, while the quality of the underlying raw dataset remains an open issue.
Figure 4b demonstrates the correlation between the daily number of COVID-19 fatalities and the number of newly registered patients per day. It is clear from Fig. 4b that the raw daily recorded new patient counts contain outliers. Traditional health system reliability techniques that deal with observed raw time series cannot readily handle high-dimensional (above 2D) systems with complex cross-correlations between different biosystem components. The key advantage of the Gaidai method is its ability to assess the reliability of high-dimensional nonlinear dynamic biosystems.
Discussion
Traditional time-series reliability approaches cannot always easily handle high-dimensional dynamic systems with cross-correlations between different key system components. The fundamental advantage of the Gaidai method is its ability to examine the reliability of high-dimensional dynamic biosystems. In this investigation, synthetic wind speeds were used as a validation case, because the analytical solution is known in this case. The theoretical rationale of the proposed approach has been discussed. Although using direct measurement or Monte Carlo simulation to analyze the reliability of dynamic biosystems is often appealing, the complexity and high dimensionality of dynamic biosystems require the development of novel, accurate, and robust techniques that can handle the available raw datasets while using them optimally.
This study’s methodology has already been shown to be successful when applied to a number of simulation models, but only for one-dimensional system components. Overall, quite accurate forecasts have been made. The main goal of this study was to develop a general-purpose, trustworthy, and user-friendly multidimensional reliability strategy. The Gaidai bio-reliability method was compared with the bivariate Weibull method, using both analytically produced synthetic data and actual raw clinical data. To summarize, the suggested methodology may be applied to a wide range of biological and public health studies. The national public health example presented here by no means limits the potential uses of the advocated methodology.
Data availability
Data will be made available on request from the corresponding author.
References
Thomas, M. & Rootzen, H. Realtime prediction of severe influenza epidemics using extreme value statistics. arXiv preprint arXiv:1910.10788 https://doi.org/10.48550/arXiv.1910.10788 (2019).
Chen, J., Lei, X., Zhang, L. & Peng, B. Using extreme value theory approaches to forecast the probability of outbreak of highly pathogenic influenza in Zhejiang, China. PLoS ONE 10, e0118521 (2015).
Mugglin, A., Cressie, N. & Gemmell, I. Hierarchical statistical modelling of influenza epidemic dynamics in space and time. Stat. Med. 21, 2703–2721 (2002).
Sia, A. et al. The impact of gardening on mental resilience in times of stress: a case study during the COVID19 pandemic in Singapore. Urban For. Urban Green. 68 https://doi.org/10.1016/j.ufug.2021.127448 (2022).
Pani, S., Lin, N. & RavindraBabu, S. Association of COVID19 pandemic with meteorological parameters over Singapore. Sci. Total Environ. 740 https://doi.org/10.1016/j.scitotenv.2020.140112 (2020).
Lee, J. et al. Heat stress and thermal perception amongst healthcare workers during the COVID19 pandemic in India and Singapore. Int. J. Environ. Res. Public Health. 17, 8100 (2020).
Singapore COVID-19 data. https://voice.baidu.com/act/newpneumonia/newpneumonia/?from=osari_aladin_banner&city=%E6%96%B0%E5%8A%A0%E5%9D%A1%E6%96%B0%E5%8A%A0%E5%9D%A1. Accessed Jan 2023.
Thomas, M. et al. Applications of extreme value theory in public health. PLoS ONE 11 https://doi.org/10.1371/journal.pone.0159312 (2016).
Coburn, B. J., Wagner, B. G. & Blower, S. Modeling influenza epidemics and pandemics: insights into the future of swine flu (H1N1). BMC Med. 7, 30 (2009).
Meliker, J. R. & Sloan, C. D. Spatiotemporal epidemiology: principles and opportunities. Spat. Spatiotemporal Epidemiol. 2 https://doi.org/10.1016/j.sste.2010.10.001 (2011).
Gaidai, O., Cao, Y. & Loginov, S. Global cardiovascular diseases death rate prediction. Curr. Probl. Cardiol. https://doi.org/10.1016/j.cpcardiol.2023.101622 (2023).
Gaidai, O., Xing, Y., Balakrishna, R. & Xu, J. Improving extreme offshore wind speed prediction by using deconvolution. Heliyon https://doi.org/10.1016/j.heliyon.2023.e13533 (2023).
Gaidai, O. & Xing, Y. Prediction of death rates for cardiovascular diseases and cancers. Cancer Innov. https://doi.org/10.1002/cai2.47 (2023).
Gaidai, O., Yan, P., Xing, Y., Xu, J. & Wu, Y. A novel statistical method for longterm coronavirus modelling. F1000Research https://orcid.org/00000003088348542 (2022).
Gaidai, O. et al. Novel methods for wind speeds prediction across multiple locations. Sci. Rep. 12, 19614 (2022).
Gaidai, O., Wang, F. & Yakimov, V. COVID19 multistate epidemic forecast in India. Proc. Indian Natl Sci. Acad. https://doi.org/10.1007/s43538022001475 (2023).
Gaidai, O., Xing, Y. & Xu, X. COVID19 epidemic forecast in USA East coast by novel reliability approach. Res. Sq. https://doi.org/10.21203/rs.3.rs1573862/v1 (2022).
Gaidai, O. et al. Cargo vessel coupled deck panel stresses reliability study. Ocean Eng. https://doi.org/10.1016/j.oceaneng.2022.113318 (2022).
Gaidai, O. & Xing, Y. A novel multi regional reliability method for COVID19 death forecast. Eng. Sci. https://doi.org/10.30919/es8d799 (2022).
Gaidai, O. & Xing, Y. A novel biosystem reliability approach for multistate COVID19 epidemic forecast. Eng. Sci. https://doi.org/10.30919/es8d797 (2022).
Gaidai, O., Yan, P. & Xing, Y. Future world cancer death rate prediction. Sci. Rep. 13 https://doi.org/10.1038/s4159802327547x (2023).
Gaidai, O., Xu, J., Hu, Q., Xing, Y. & Zhang, F. Offshore tethered platform springing response statistics. Sci. Rep. 12 www.nature.com/articles/s4159802225806x (2022).
Gaidai, O., Xing, Y. & Xu, X. Novel methods for coupled prediction of extreme wind speeds and wave heights. Sci. Rep. https://doi.org/10.1038/s41598023281368 (2023).
Yayık, A., Kutlu, Y. & Altan, G. Regularized HessELM and Inclined Entropy Measurement for Congestive Heart Failure Prediction (Cornell University, 2019).
Gaidai, O., Cao, Y., Xing, Y. & Wang, J. Piezoelectric energy harvester response statistics. Micromachines 14, 271 (2023).
Gaidai, O., Yan, P. & Xing, Y. A novel method for prediction of extreme wind speeds across parts of Southern Norway. Front. Environ. Sci. https://doi.org/10.3389/fenvs.2022.997216 (2022).
Gaidai, O., Fu, S. & Xing, Y. Novel reliability method for multidimensional nonlinear dynamic systems. Mar. Struct. 86 https://doi.org/10.1016/j.marstruc.2022.103278 (2022).
Author information
Contributions
O.G.—conceptualization; V.Y.—writing; J.S.—visualization; E.v.L.—software. All authors contributed equally.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Gaidai, O., Yakimov, V., Sun, J. et al. Singapore COVID19 data crossvalidation by the Gaidai reliability method. npj Viruses 1, 9 (2023). https://doi.org/10.1038/s44298023000060
DOI: https://doi.org/10.1038/s44298023000060