Introduction

Hemorrhagic fever with renal syndrome (HFRS), also known as epidemic hemorrhagic fever, is a natural focal disease caused by Hantaan virus and carried and transmitted by rodents, which are the natural reservoir for hantaviruses1,2. Symptoms of HFRS usually develop within 1 to 2 weeks after exposure to infectious material. Initial symptoms include intense headaches, back and abdominal pain, fever, chills, nausea, and blurred vision. Later symptoms can include low blood pressure, acute shock, vascular leakage, and acute kidney failure, which can cause severe fluid overload3. Active monitoring is one of the most effective measures for controlling the HFRS epidemic. China reports more HFRS cases than any other country, and the epidemic has long been one of its most important public health problems. Over the past 30 years, efforts to control HFRS in China have intensified and great progress has been made in vaccine development, but the threat of an HFRS epidemic has not yet been completely eliminated. A study by Zhang et al.4 forecasted the variation trend before 2004 and indicated that HFRS had the potential to become prevalent.

The epidemics of numerous infectious diseases show periodicity and seasonality5,6,7. As with other diseases, the temporal characteristics of HFRS have long attracted attention. The HFRS epidemic shows a certain periodicity: many cases occur during the peak period, while only sporadic cases are observed outside it. Scholars in China have analysed the annual HFRS report data with time series models8 and identified some characteristics from the annual data; however, trends within the year, such as the specific variation pattern in each year, remain unclear. A National HFRS monitoring program (Trial)9 was promulgated in 2005 by the National Health and Family Planning Commission of China (NHFPC, originally known as the Chinese Ministry of Health), focusing particularly on measuring the effectiveness of public health interventions on HFRS control. Currently, little literature uses monthly data to analyse the variation of HFRS within a year, its variation in recent years, or the periodic variation within the yearly data, yet this information is essential for allocating control resources across the seasons each year. To address these questions, we adopted the seasonal-trend decomposition (STL) and exponential smoothing model (ETS) methods to analyse the monthly data from the National Health and Family Planning Commission reports, and examined the periodicity and seasonality of the monthly data.

Materials and Methods

Data source

The reported HFRS data from January 2006 to June 2016 were retrieved on August 25 and 26, 2016, from the National Health and Family Planning Commission (http://www.nhfpc.gov.cn/); the same data are also available from the Chinese Center for Disease Control (http://www.chinacdc.cn/). The data were assembled as monthly counts of reported cases.

Statistical Analysis

STL analysis

One of the main challenges in time series analysis is selecting an adequate model to describe the seasonal components. In this paper we used Seasonal-Trend decomposition based on locally weighted regression (Loess), known as STL, which was originally presented by Cleveland in 1990 as a filtering procedure for decomposing a time series into trend, seasonal, and remainder components10:

$$Y_v = T_v + S_v + R_v \quad (1)$$

where $Y_v$ is the original time series, $T_v$ is the trend component, which can be viewed as the low-frequency change tendency, $S_v$ is the seasonal component, which can be regarded as the high-frequency variation due to stable seasonal disturbance, and $R_v$ is the remainder component, which can be viewed as irregular variation due to random disturbance. STL works as an iterative nonparametric regression procedure using a series of LOESS smoothers, each based on fitting a weighted polynomial regression.

In detail, LOESS produces a smoothed estimate from the weighted regression, in which βpj is the (d + 1)-dimensional least squares estimate of the weighted regression, the regressor is the (d + 1)-dimensional vector of the time of observation, j is the number of time lags up to the maximum defined by the smoothing parameter (n), p = 0, …, d, and d is the degree of the polynomial fitting11. Finally, the estimates of both components are used to compute the remainder12: $R_v = Y_v - T_v - S_v$. With this procedure, STL can detect both the overall and the seasonal variation of a time series.
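The additive structure behind this decomposition can be illustrated with a much simpler stand-in. The sketch below is not STL: it replaces the iterated LOESS smoothers with a classical centered moving average for the trend and month-wise averages for the seasonal component. The function names, the synthetic series and the `MONTH_EFFECT` values are all hypothetical and serve only to show how a monthly series splits into trend, seasonal and remainder parts.

```python
from statistics import mean


def centered_moving_average(series, period):
    """Trend estimate from a centered moving average (even period only)."""
    if period % 2 != 0:
        raise ValueError("this sketch handles even periods such as 12")
    half = period // 2
    trend = [None] * len(series)
    for i in range(half, len(series) - half):
        window = series[i - half:i + half + 1]
        # Half weight on both end points keeps the window centered on month i.
        total = 0.5 * window[0] + sum(window[1:-1]) + 0.5 * window[-1]
        trend[i] = total / period
    return trend


def decompose_additive(series, period=12):
    """Split a series into trend, seasonal and remainder so that Y = T + S + R."""
    if len(series) < 2 * period:
        raise ValueError("need at least two full seasonal cycles")
    trend = centered_moving_average(series, period)
    # Collect the detrended values month by month.
    buckets = [[] for _ in range(period)]
    for i, (y, t) in enumerate(zip(series, trend)):
        if t is not None:
            buckets[i % period].append(y - t)
    raw = [mean(b) for b in buckets]
    offset = mean(raw)
    pattern = [s - offset for s in raw]  # seasonal effects sum to zero
    seasonal = [pattern[i % period] for i in range(len(series))]
    remainder = [
        None if t is None else y - t - s
        for y, t, s in zip(series, trend, seasonal)
    ]
    return trend, seasonal, remainder


# Synthetic monthly counts (hypothetical, not the HFRS data): a linear decline
# plus a fixed 12-month pattern with a winter peak and a smaller early-summer peak.
MONTH_EFFECT = [30, -10, -20, -10, 10, 15, -15, -40, -35, -5, 40, 40]
series = [1000 - 5 * i + MONTH_EFFECT[i % 12] for i in range(36)]
trend, seasonal, remainder = decompose_additive(series, period=12)
```

The moving average leaves the first and last six months without a trend value, which is one practical reason to prefer LOESS-based STL for real surveillance series.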

In this paper, seasonal time trends for HFRS were analysed using the STL method via the stl() function in R, which enables each of the components to be isolated and analysed. According to Hyndman’s description13,14, two main parameters, the trend window (t.window) and the seasonal window (s.window), control how rapidly the trend and seasonal components can change.

ETS model

The exponential smoothing model (ETS) method is a forecasting method that takes historical information into comprehensive consideration: by weighting the observed values, the forecast reflects all the historical information and accounts for the effect of time on the forecast value15,16. The ETS model treats the original time series as a combination of trend (T), seasonal (S) and error (E) components, each of which can be additive (A), multiplicative (M) or none (N). The ETS framework covers several specific methods, such as single exponential smoothing, double exponential smoothing, Holt trend exponential smoothing (with or without seasonal characteristics), and other methods suited to the characteristics of the original series. According to Yang’s description, the trend component is itself a combination of a level term (l) and a growth term (b). For the forecast trend $T_h$ over the next h time periods, l and b can be combined in the following 5 ways:

$$\begin{aligned}
\text{None:} \quad & T_h = l \\
\text{Additive:} \quad & T_h = l + bh \\
\text{Additive damped:} \quad & T_h = l + (\phi + \phi^2 + \cdots + \phi^h)\,b \\
\text{Multiplicative:} \quad & T_h = l\,b^h \\
\text{Multiplicative damped:} \quad & T_h = l\,b^{(\phi + \phi^2 + \cdots + \phi^h)}
\end{aligned}$$

where $0 < \phi < 1$ is the damping parameter. The seasonal component can be additive (T + S), multiplicative (T × S) or none (N). This gives rise to the combinations of time series components shown in Table 1:

Table 1 Definitions of A, N, M and D in the ETS (A, N, M) model.
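The five ways of combining the level and growth terms can be written out as a short function. This is a minimal sketch under stated assumptions: the function names, the string labels and the default damping value `phi=0.9` are hypothetical choices for illustration, not settings taken from the analysis.

```python
def damped_sum(phi, h):
    """Return phi + phi**2 + ... + phi**h, the damping weight over h periods."""
    return sum(phi ** k for k in range(1, h + 1))


def forecast_trend(kind, level, growth, h, phi=0.9):
    """Forecast trend T_h over the next h periods for one of five trend types.

    kind: "N" (none), "A" (additive), "Ad" (additive damped),
          "M" (multiplicative), "Md" (multiplicative damped).
    """
    if h < 1:
        raise ValueError("h must be a positive number of periods")
    if kind in ("Ad", "Md") and not 0 < phi < 1:
        raise ValueError("the damping parameter must satisfy 0 < phi < 1")
    if kind == "N":
        return level
    if kind == "A":
        return level + growth * h
    if kind == "Ad":
        return level + damped_sum(phi, h) * growth
    if kind == "M":
        return level * growth ** h
    if kind == "Md":
        return level * growth ** damped_sum(phi, h)
    raise ValueError(f"unknown trend type: {kind}")
```

With an additive trend the growth term is a per-period increment, whereas with a multiplicative trend it is a per-period ratio; damping shrinks the contribution of either as the horizon grows.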

In E-views, the parameters A, N and M were chosen with the automatic selection mode, which picks the optimal model from the 30 candidate models for fitting and forecasting. Model selection followed the minimum Bayesian Information Criterion (BIC) principle, and the residuals were then tested with the Ljung-Box Q test. In addition, the mean absolute percentage error (MAPE) was used to assess accuracy:

$$\mathrm{MAPE} = \frac{100\%}{n} \sum_{t=1}^{n} \left| \frac{y_t - \hat{y}_t}{y_t} \right| \quad (3)$$

where $y_t$ is the observed value, $\hat{y}_t$ is the fitted value and n is the number of observations.

According to Lee17, a MAPE of less than or equal to 10% indicates highly accurate forecasts, 10% < MAPE < 20% indicates good forecasts, 20% < MAPE < 50% indicates reasonable forecasts, and MAPE > 50% indicates inaccurate forecasting.
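The MAPE criterion can be sketched in a few lines. The function names below are hypothetical, and because the thresholds as stated leave the values of exactly 20% and 50% unassigned, the sketch assumes they fall into the better category.

```python
def mape(actual, fitted):
    """Mean absolute percentage error, in percent."""
    if len(actual) != len(fitted) or not actual:
        raise ValueError("actual and fitted must be non-empty and equally long")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an observed value is zero")
    total = sum(abs((a - f) / a) for a, f in zip(actual, fitted))
    return 100.0 * total / len(actual)


def grade_forecast(mape_percent):
    """Map a MAPE value to the accuracy categories quoted from Lee."""
    if mape_percent <= 10:
        return "highly accurate"
    if mape_percent <= 20:  # assumption: exactly 20% counts as good
        return "good"
    if mape_percent <= 50:  # assumption: exactly 50% counts as reasonable
        return "reasonable"
    return "inaccurate"
```

Because each error is divided by the observed value, months with very few reported cases can dominate the MAPE of a count series.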

E-views, a statistical package designed for time-series analysis, provides ETS as a built-in analytic procedure (the exact procedure in E-views 8 is described in “ETS Exponential Smoothing in EViews 8” on the official E-views website: http://www.eviews.com/EViews8/ev8ecets_n.html)18.
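The candidate set searched by an automatic selection of this kind can be enumerated directly. The sketch below assumes the usual ETS taxonomy of two error types, five trend types and three seasonal types, which yields the 30 candidates; the function name `select_min_bic` and the BIC values in the usage note are hypothetical and do not reproduce the E-views procedure or its output.

```python
from itertools import product

ERROR_TYPES = ("A", "M")                  # additive, multiplicative
TREND_TYPES = ("N", "A", "Ad", "M", "Md")  # none, additive, damped variants
SEASON_TYPES = ("N", "A", "M")            # none, additive, multiplicative

# Every (error, trend, season) combination is one candidate model.
candidates = list(product(ERROR_TYPES, TREND_TYPES, SEASON_TYPES))


def select_min_bic(bic_by_model):
    """Return the model whose BIC is smallest (minimum BIC principle)."""
    if not bic_by_model:
        raise ValueError("no fitted models to choose from")
    return min(bic_by_model, key=bic_by_model.get)
```

For example, `select_min_bic({("A", "N", "N"): 3.0, ("M", "N", "A"): 1.0})` returns `("M", "N", "A")`, the entry with the lower toy BIC value.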

The analysis in this research used R and Econometric Views 8 (E-views 8) for the fitting and forecasting analysis of the HFRS data, with α = 0.05 as the significance level. E-views is a statistical package developed by Quantitative Micro Software (QMS) and designed mainly for time-series analysis; it is increasingly widely used for time series modeling in various fields.

This study was approved by the Affiliated Hospital/Clinical Medical College of Chengdu University; only aggregated data with no personal information were involved in this study.

Results

General information

We summed the monthly reported cases in each year to analyze the overall annual variation trend from 2006 to 2015 in China. HFRS cases were reported continuously throughout the period, and the variation trend could be divided into three sections: incident cases decreased markedly from 16129 in 2006 to 9203 in 2009, a percent change of −42.94%; they rose to 13918 in 2012, a percent change of 51.23% relative to 2009; and they decreased again from 2012 to 2015, a percent change of −61.64%. On the whole, a decreasing trend was observed from 2006 to 2015, with a percent change of −66.90%. Since 2005, the National Health and Family Planning Commission of China has implemented a National HFRS monitoring program (Trial)9, focusing particularly on measuring the effectiveness of public health interventions on HFRS control. The descriptive analysis of HFRS incidence from 2006 to 2015 indicated that prevention and control have attained certain achievements, as the disease incidence declined overall during this period.
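The percent changes quoted above follow from the annual totals as (new − old) / old × 100. A minimal sketch, with a hypothetical function name and using only the case counts stated in the text:

```python
def percent_change(old, new):
    """Relative change from old to new, in percent."""
    if old == 0:
        raise ValueError("percent change is undefined for a zero baseline")
    return (new - old) / old * 100.0


# Annual totals quoted in the text.
change_2006_2009 = percent_change(16129, 9203)
change_2009_2012 = percent_change(9203, 13918)
```

Each change is expressed relative to the first year of its interval, so the three sectional changes do not add up to the overall change.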

By decomposing the monthly HFRS data into an overall trend and a seasonal component through the STL analysis, we can isolate the seasonality and trend components of the series and also remove part of the random noise, or remainder component. Fig. 1 shows the variability of each component separately over the timescale. The seasonal component showed a 12-month stochastic seasonality in the reporting pattern of HFRS; the trend component showed an overall downward trend with periodic changes in disease incidence; and the remainder component also showed a 12-month stochastic variance. Fig. 2 describes the variation of the decomposed data in each year. The monthly HFRS data had a year-based periodicity, and the data in each year showed distinct seasonality with two peaks of reported cases, one in summer and one in winter, with more cases reported in winter than in summer. The first peak occurred in May and June, most reported cases each year were concentrated from November to January of the following year, and August and September had the fewest reported cases.

Figure 1

Decomposition results of the HFRS series by the STL method.

The variation of HFRS in China clearly showed a periodicity with one year (12 months) as a cycle; there was obvious seasonality within each year, with two peaks in each cycle. The seasonal component showed a 12-month stochastic seasonality in the reporting pattern of HFRS; the trend component showed an overall downward trend with periodic changes in disease incidence; and the remainder component also showed a 12-month stochastic variance.

Figure 2

Variation of the reported HFRS cases in each year after decomposition by year. Red points represent the months with relatively more reported cases in each year, mainly November, December, and January; blue points represent the months with relatively few reported cases, which were August and September.

The ETS model was run in E-views, and altogether 30 candidate models, formed from the various combinations of the single parameters A, N and M, were fitted to the monthly HFRS data. ETS (M, N, A) (BIC = 1946.14, see Table 2 and Fig. 3) was determined to be the optimal model for fitting and forecasting under the minimum BIC principle (refer to Fig. 4 for the fitting and forecasting results), and the forecasts of incident cases from July to December 2016 were 577, 268, 334, 827, 1725 and 1444. The Ljung-Box Q test indicated that the residuals of ETS (M, N, A) were close to white noise (P > 0.05 for the Ljung-Box test); the goodness-of-fit test gave MAPE = 13.12%, which suggested a good fit according to the criteria of Lee et al.17.

Table 2 Characteristics of 30 candidate models.

Figure 3

BIC comparison of adopting the ETS model to forecast HFRS.

Model selection was conducted with the minimum Bayesian Information Criterion (BIC) principle, and ETS (M, N, A) (BIC = 1946.14, see Table 2 and Fig. 3) was selected as the optimal model for fitting and forecasting. (Note: A: Additive, N: None, M: Multiplicative, Ad: Additive damped, Md: Multiplicative damped).

Figure 4

Fitting and forecasting results of the reported HFRS cases by ETS (M, N, A).

Discussion

HFRS is a highly fatal infectious disease with rodents as the major source of infection, and it has had a severe impact worldwide19. The epidemic is milder in Europe and America; most cases occur in Asian countries, among which China is the most affected, with cases seen in most areas of the country20. The incidence of HFRS is highly variable at the state level. Our results clearly show that the HFRS incidence in China has decreased dramatically during the last decade, which is similar to the general trend in several countries in Asia. China witnessed the prevalence of HFRS as far back as the 1920s to 1930s. Targeted vaccine development has lasted for decades since the Hantaan virus was isolated successively at home and abroad in 1980; after almost 20 years of effort, multiple HFRS-targeted vaccines have been developed, and they have played an important role in the control of HFRS in China21,22. Currently, the prevention and treatment of HFRS in China follow the principle of “three-early and one-in-place”, namely early discovery, early rest, early treatment and in-place isolation treatment. This principle has brought great progress in the prevention of HFRS, although challenges remain. With the implementation of NHFPC’s National HFRS monitoring program (Trial) since 2005, the disease incidence detected in most provinces has decreased significantly, as in some previous studies23,24. The analytic result in this paper, a decreasing trend from 2006 to 2015, also indicates that the epidemic trend of HFRS in China is under control and that prevention and control have attained certain achievements.
Even so, the number of annually reported HFRS cases in China remains the highest in the world, higher than in American and European countries25. The short-term forecast from the time series model suggests that HFRS incidence will continue to decline over the next year, which implies that the national monitoring program will continue to operate effectively in HFRS control in the near future.

Obtaining the original data series from reported data and analyzing its spatial-temporal characteristics with time series methods is an important approach to analyzing time data in epidemiology, as it can effectively capture important characteristics of data variation, such as periodicity and seasonality22. In addition, short-term and long-term forecasts can be used to evaluate control measures and to prepare effective and timely responses to an epidemic peak, a resurgence or an outbreak that may occur26. Some scholars have analyzed the annually reported HFRS data in China with time series models such as ARMA, examining the variation trends, seasonal trends and epidemic characteristics of the annual data and verifying the effectiveness of the models27,28. In this research, we analyze the variation characteristics of the monthly HFRS data in China through the STL method. We determine the periodicity of the annual variation of incident cases and further characterize the variation within each year by decomposing the annual data. The epidemic shows two peaks of reported cases, one in summer and one in winter, and more cases are reported in winter than in summer. Figure 2 describes the data variation in each year after decomposition: the first peak of reported cases occurs in May and June, most reported cases each year are concentrated from November to January of the following year, and August and September have the fewest reported cases. Therefore, when formulating control policies for HFRS, the relevant departments should allocate resources according to the incidence in the peak months and in the months with few reported cases.
We forecast the incidence from July to December 2016 with the ETS model. The results suggest that the incidence of HFRS in China still shows periodicity, but the overall trend is a gradual decrease. Meanwhile, the model test results show that the ETS model has good fitting and forecasting accuracy for the incidence of HFRS and is suitable for monitoring morbidity. We therefore suggest that time series models such as ETS be adopted appropriately in subsequent epidemic research on HFRS to support decision-making.

Additional Information

How to cite this article: Ke, G. et al. Epidemiological analysis of hemorrhagic fever with renal syndrome in China with the seasonal-trend decomposition method and the exponential smoothing model. Sci. Rep. 6, 39350; doi: 10.1038/srep39350 (2016).
