Development and comparison of forecast models of hand-foot-mouth disease with meteorological factors

Hand-foot-mouth disease (HFMD) is an acute infectious disease caused by enteroviruses and is one of the major public health problems in mainland China. Previous studies indicated that HFMD was significantly influenced by climatic factors, but the associated factors differed between areas and few studies on HFMD forecast models have been conducted. Here, we analyzed the epidemiological characteristics of HFMD in Yiwu City, Zhejiang Province, and constructed three forecast models. Overall, a total of 32554 HFMD cases, including 12 deaths, were reported in Yiwu City. The incidence of HFMD peaked every other year and the incidence curve was approximately W-shaped. The majority of HFMD cases were children, and 95.76% of cases were aged ≤5 years from 2008 to 2016. Furthermore, we constructed and compared three forecast models: an autoregressive integrated moving average (ARIMA) model, a negative binomial regression model (NBM), and a quasi-Poisson generalized additive model (GAM). All three models showed high agreement between predicted and observed values, and the GAM fitted best. The exposure-response curve of monthly mean temperature and HFMD was approximately V-shaped. Our study explored the epidemiological characteristics of HFMD in Yiwu City and provided accurate methods for early warning, which would be of great importance for the control and prevention of HFMD.

However, few studies have focused on forecast models of HFMD incidence, which are of vital importance for HFMD prevention. Zhejiang is a southeastern coastal province of China, and a total of 875945 HFMD cases were identified there from 2008 to 2015 20 . Here, we not only analyzed the association between HFMD occurrence and meteorological factors in a city of Zhejiang Province, but also constructed and compared three prediction models for HFMD.

Materials and Methods
Data collection. HFMD cases were diagnosed according to the "HFMD Control and Prevention Guide" issued by the National Health Commission of the People's Republic of China. HFMD cases should be reported to the China Information System for Disease Control and Prevention (CISDCP, http://www.cdpc.chinacdc.cn) within 24 h after diagnosis. Information about HFMD cases from 2008 to 2016 in Yiwu City, Zhejiang Province, such as gender, age, occupation, and date of illness onset, was obtained from CISDCP. Ethical approval for the study was obtained from the Chinese Center for Disease Control and Prevention Ethics Committee (No. 201214). Informed consent was obtained from all subjects or, if subjects were under 18, from a parent and/or legal guardian. All methods were carried out in accordance with relevant guidelines and regulations, and all experimental protocols were approved by the Chinese Center for Disease Control and Prevention when HFMD cases were diagnosed in hospitals.
Meteorological data including sunshine duration, monthly precipitation, monthly maximum temperature (Tmax), monthly minimum temperature (Tmin), monthly mean temperature (Tmean), monthly wind speed (Wspeed), monthly minimum relative humidity (RHmin), and monthly mean relative humidity (RHmean) in Yiwu City were downloaded from the China Meteorological Administration Network (http://data.cma.cn/).

Statistical analysis.
Statistical analysis was performed using R 3.5.0 and Statistical Product and Service Solutions (SPSS 20.0; Chicago, IL). Descriptive statistics were used to analyze the demographic characteristics and seasonal distribution of HFMD in Yiwu City, Zhejiang Province, China. We used the chi-square test to compare the gender distribution and seasonal distribution of HFMD cases in different years.
The dataset from 2008 to 2015 was used to develop the forecast models and the dataset from 2016 was used to test their fit. Prior to development of the forecast models, correlation analysis was conducted to identify collinearity among the independent variables.
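A collinearity screen of this kind can be sketched as follows. This is a minimal illustration, not the authors' R/SPSS procedure: the function names, the |r| ≥ 0.8 cut-off, and the monthly values are all assumptions made for the example, and the values are not the Yiwu data.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def collinear_pairs(variables, threshold=0.8):
    """Return (name_a, name_b, r) for every pair with |r| >= threshold.

    The 0.8 threshold is an illustrative assumption, not taken from the paper.
    """
    flagged = []
    for a, b in combinations(sorted(variables), 2):
        r = pearson(variables[a], variables[b])
        if abs(r) >= threshold:
            flagged.append((a, b, round(r, 3)))
    return flagged


# Hypothetical monthly values for a single year (NOT the Yiwu data).
weather = {
    "Tmax": [10, 12, 17, 23, 28, 31, 35, 34, 29, 24, 18, 12],
    "Tmin": [1, 3, 7, 12, 17, 21, 25, 25, 20, 14, 8, 3],
    "Tmean": [5, 7, 12, 17, 22, 26, 30, 29, 24, 19, 13, 7],
    "Wspeed": [2.1, 2.4, 2.2, 2.6, 2.0, 2.3, 2.5, 2.1, 2.4, 2.0, 2.3, 2.2],
}
pairs = collinear_pairs(weather)
for name_a, name_b, r in pairs:
    print(f"{name_a} vs {name_b}: r = {r}")
```

In this toy example the three temperature variables flag one another, so only one of them would be carried into a model, while the hypothetical wind speed series is retained.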
The ARIMA model was constructed as previously described 21 . Briefly, the partial autocorrelation function (PACF) and autocorrelation function (ACF) were analyzed to decide the parameters (p, d, q). The optimal model was selected according to the Akaike information criterion (AIC).
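To illustrate the ACF step only, the sample autocorrelation can be computed by hand as in the sketch below. The series is a synthetic sinusoid with a 12-month period, chosen as an assumption to mimic seasonal counts; it is not the HFMD series, and the PACF and AIC steps are not shown.

```python
from math import pi, sin, sqrt


def acf(series, max_lag):
    """Sample autocorrelation of `series` for lags 0..max_lag."""
    n = len(series)
    mean = sum(series) / n
    dev = [v - mean for v in series]
    denom = sum(d * d for d in dev)
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
        for k in range(max_lag + 1)
    ]


# Synthetic 8-year monthly series with a 12-month cycle (illustrative only).
series = [100 + 50 * sin(2 * pi * t / 12) for t in range(96)]
r = acf(series, 24)

# Approximate 95% bound commonly drawn on ACF plots for white noise.
bound = 1.96 / sqrt(len(series))
seasonal_lags = [k for k in (6, 12, 24) if abs(r[k]) > bound]
print(seasonal_lags)
```

A large positive autocorrelation at lag 12 and a negative one at lag 6 are the kind of pattern that motivates a seasonal term with period 12.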
The number of monthly HFMD cases in the study was overdispersed (variance/mean = 502.98), so a negative binomial model was selected to analyze the relationship between HFMD and meteorological data. Furthermore, a quasi-Poisson generalized additive model (GAM) was also selected to develop a forecast model, as meteorological factors may have a non-linear relationship with HFMD occurrence. These models were also developed and the optimal models were selected as previously described 21 .
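The variance-to-mean ratio used here as an overdispersion check can be sketched as below. The function name and the monthly counts are hypothetical and serve only to show the calculation; they do not reproduce the reported value of 502.98.

```python
from statistics import mean, variance


def dispersion_index(counts):
    """Sample variance divided by the mean.

    A value near 1 is consistent with a Poisson distribution; values far
    above 1 indicate overdispersion.
    """
    return variance(counts) / mean(counts)


# Hypothetical monthly case counts with a strong seasonal peak (NOT the Yiwu data).
monthly_cases = [20, 35, 60, 400, 900, 1500, 700, 150, 90, 60, 200, 300]
ratio = dispersion_index(monthly_cases)
print(f"variance/mean = {ratio:.2f}")
```

When the ratio is far above 1, a Poisson model understates the variability of the counts, which is the usual motivation for a negative binomial or quasi-Poisson specification.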
To compare the agreement between the observed data and the forecasts of the three models, an F test was conducted and the intraclass correlation coefficient (ICC) was calculated.
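The text does not state which form of the ICC was used. As one possible reading, the sketch below computes the two-way, absolute-agreement, single-measurement ICC (often written ICC(2,1)), treating the observed and predicted series as two "raters" of the same months. The choice of this form and the function name are assumptions.

```python
def icc_agreement(observed, predicted):
    """Two-way, absolute-agreement, single-measurement ICC for two series.

    Assumption: this ICC form is chosen for illustration; the paper does
    not specify which variant was calculated.
    """
    n, k = len(observed), 2
    rows = list(zip(observed, predicted))
    grand = (sum(observed) + sum(predicted)) / (n * k)
    row_means = [sum(row) / k for row in rows]
    col_means = [sum(observed) / n, sum(predicted) / n]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in rows for v in row)
    ss_err = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)              # between-month mean square
    msc = ss_cols / (k - 1)              # between-series mean square
    mse = ss_err / ((n - 1) * (k - 1))   # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Toy inputs: a perfect forecast and a forecast with a constant offset.
print(icc_agreement([1, 2, 3], [1, 2, 3]))
print(icc_agreement([1, 2, 3], [2, 3, 4]))
```

Because this variant measures absolute agreement, a forecast that tracks the shape of the epidemic curve but is systematically too high or too low scores below 1, which a consistency-type ICC would not penalize.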

Results
Overall, a total of 32554 HFMD cases, including 12 deaths, were reported in Yiwu City, Zhejiang Province. The numbers of reported HFMD cases from 2008 to 2016 were 747, 886, 3391, 1845, 7724, 2938, 4369, 2196, and 8458, respectively (Fig. 1). Of note, the incidence of HFMD peaked every other year and the incidence curve was approximately W-shaped.
About 63.32% of HFMD cases were male (20613/32554), and the gender distribution of HFMD cases differed significantly between years (χ2 = 41.939, P < 0.001). The numbers of cases in the < 1 year old group, 1∼ years old group, 2∼ years old group, 3∼ years old group, 4∼ years old group, 5∼ years old group and > 5 years old group were 4002, 10140, 7668, 5632, 2609, 2170, and 333, respectively. Most HFMD cases were children, and 95.76% of cases were aged ≤5 years from 2008 to 2016. The majority of HFMD cases were scattered children and kindergarten children, who accounted for 64.13% and 33.36% of cases, respectively (Table 1). According to the results of correlation analysis, Tmax, Tmin, and Tmean were highly correlated, and RHmin and RHmean were collinear. We therefore selected one of Tmax, Tmin, and Tmean, and one of RHmin and RHmean, when constructing the forecast models.
Prior to construction of the ARIMA model, the results of the ADF test indicated that the time series of monthly HFMD cases was stationary (DF = −3.5224, P = 0.04343) and the results of the Ljung-Box test indicated that the time series was not random (χ2 = 57.613, P < 0.001). According to the results of ACF, PACF, and AIC, ARIMA (0, 0, 2) × (0, 1, 1)12 was selected as the optimal model (Fig. 3). Monthly minimum temperature was significantly associated with the number of monthly HFMD cases. Using the optimal ARIMA model, the predicted numbers of monthly HFMD cases of 2016 in Yiwu City were 229, 265, 325, 729, 1136, 1466, 1593, 109, 245, 130, 511, and 713, respectively. The predicted values were in good agreement with the observed data (Fig. 4). ICC of the optimal ARIMA model was 0.878, indicating that this model was excellent (Table 2).
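For orientation, the general structure implied by the orders (0, 0, 2) × (0, 1, 1) with period 12 can be written out as below. This is the textbook form under the sign convention in which moving-average terms enter with a plus sign. The symbols (y_t for the monthly case count, ε_t for the error term, θ and Θ for the non-seasonal and seasonal moving-average coefficients) are notation introduced here. The fitted coefficient values and any temperature regressor are not given in the text and are not shown.

```latex
% B is the backshift operator: B y_t = y_{t-1}.
\begin{align}
(1 - B^{12})\, y_t &= (1 + \theta_1 B + \theta_2 B^{2})\,(1 + \Theta_1 B^{12})\, \varepsilon_t \\
y_t - y_{t-12} &= \varepsilon_t + \theta_1 \varepsilon_{t-1} + \theta_2 \varepsilon_{t-2}
  + \Theta_1 \varepsilon_{t-12} + \theta_1 \Theta_1 \varepsilon_{t-13} + \theta_2 \Theta_1 \varepsilon_{t-14}
\end{align}
```

In words, each month is modeled through its difference from the same month one year earlier, with short-memory (lags 1 and 2) and seasonal (lag 12) moving-average corrections.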
Based on the AIC values, the optimal NBM was the following:  (Fig. 4). ICC of the optimal NBM was 0.834, indicating that the model was excellent (Table 2).
Based on the values of deviance explained (%), R-squared, and generalized cross-validation (GCV), the optimal GAM was the following: [equation not recoverable from the extracted text]. µt, f0, month, Caset1, and Tmean represented the same variables as in the NBM. "RHmean" was the monthly mean relative humidity. As shown in Fig. 4, HFMD cases in the previous month and RHmean were positively associated with HFMD incidence. However, the exposure-response curve of monthly mean temperature and HFMD occurrence was approximately V-shaped, and Tmean showed positive effects on HFMD incidence when it was higher than 17 °C (Fig. 5). Based on the optimal GAM, the predicted numbers of monthly HFMD cases of 2016 in Yiwu City were 199, 58, 215, 824, 1399, 2180, 821, 289, 387, 499, 865, and 523, respectively. The predicted values were in good agreement with the observed data (Fig. 4). ICC of the optimal GAM was 0.993, indicating that the model was perfect (Table 2).
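As a coarse cross-check on the figures reported above, the monthly predictions for 2016 can be summed and compared with the 8458 cases reported for that year. The sketch below assumes the twelve values run from January to December; the helper name is introduced here. It compares annual totals only and is not a substitute for the month-by-month ICC, since the observed monthly counts for 2016 are not listed in the text.

```python
# Monthly predictions for 2016 as printed in the Results (assumed Jan..Dec).
arima_pred = [229, 265, 325, 729, 1136, 1466, 1593, 109, 245, 130, 511, 713]
gam_pred = [199, 58, 215, 824, 1399, 2180, 821, 289, 387, 499, 865, 523]
reported_2016 = 8458  # annual total of reported cases for 2016


def annual_summary(monthly_pred, reported_total):
    """Return (predicted total, percent of reported total, peak month number)."""
    total = sum(monthly_pred)
    percent = round(100 * total / reported_total, 1)
    peak_month = monthly_pred.index(max(monthly_pred)) + 1
    return total, percent, peak_month


print("ARIMA:", annual_summary(arima_pred, reported_2016))  # (7451, 88.1, 7)
print("GAM:  ", annual_summary(gam_pred, reported_2016))    # (8259, 97.6, 6)
```

On this annual measure the GAM predictions sum to about 97.6% of the reported total and the ARIMA predictions to about 88.1%, which points in the same direction as the ICC ranking.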

Discussion
Since HFMD was included in the management of Class C notifiable infectious diseases in mainland China in May 2008, HFMD cases have been reported in most provinces of China. A cluster analysis indicated that the incidence rate of HFMD in Zhejiang Province, Hainan Province, Guangxi Province, Shanghai City and Beijing City was higher than that in other provinces 22 . In Zhejiang Province, at least tens of thousands of HFMD cases were reported every year and 212536 HFMD cases were identified in 2014 20 . Yiwu City is located in the center of Zhejiang Province and clusters of HFMD cases have been identified in this city 23 . In this study, we found that the incidence rate of HFMD showed an upward trend despite fluctuations. These results suggest that comprehensive measures should be taken to prevent a further increase in the HFMD incidence rate.
Similar to other studies, our study indicated that 95.76% of HFMD cases were aged ≤5 years and most cases were male 24,25 . Lower immune function, greater exposure, and genetic susceptibility may contribute to this result. Accordingly, male children aged ≤5 years should be the focus of HFMD control and prevention. Most HFMD cases were reported between April and July, but the peak period differed slightly between years. Notably, the peak period in 2011 was November and December. The reason may be that the weather conditions in November and December 2011 were suitable for the growth and transmission of HFMD pathogens. This result suggests that control measures should also be taken in other months according to the fluctuation of the HFMD incidence rate.
Previous studies reported that some meteorological factors were associated with HFMD incidence, but the associated factors differed between areas. A study in Hefei City, China reported that HFMD occurrence was significantly influenced by extreme precipitation and that the effect was greatest at a lag of 6 days 26 . Yu G et al. reported that high precipitation, extreme temperatures and low O3 concentration increased HFMD incidence, whereas extremely high wind speed, low PM2.5, low precipitation, and high O3 decreased HFMD incidence in Guilin City, China 27 . Wang P et al. found that year-round temperature and relative humidity, sunshine duration in winter, and rainfall in summer significantly influenced HFMD incidence 28 . A study in Hong Kong reported that relative humidity, temperature, rainfall, solar radiation and wind speed were associated with HFMD incidence 29 . Moreover, Tian L et al. found that mean temperature, relative humidity, wind velocity and sunshine hours were all positively associated with HFMD in Beijing City, but Liu W et al. found that relative humidity had no relationship with HFMD in Jiangsu Province 24,30 . In our study, NBM and GAM results indicated that monthly mean relative humidity, monthly mean temperature, and wind speed were significantly associated with HFMD. The differences in risk factors between areas suggest that interventions should be tailored to each area and that control measures should be more precise.
Besides analyzing the meteorological factors associated with HFMD incidence, we developed three forecast models using ARIMA, NBM, and GAM to predict HFMD incidence in Yiwu City. During the construction of the ARIMA models, periodic changes, long-term trends, and random disturbances were all taken into account. To date, ARIMA models have been widely used in the prediction of infectious diseases, including malaria, hemorrhagic fever with renal syndrome (HFRS), and influenza 31,32 . Our study found that ARIMA was also suitable for forecasting HFMD incidence.
Due to overdispersion of the monthly HFMD cases, NBM was selected to construct a prediction model instead of a Poisson model in our study. The optimal NBM also showed good agreement between predicted and observed values, but the ICC of this model was the lowest of the three. We found that the prediction was good in months when the number of HFMD cases was small, but not in months when the number of HFMD cases was large. The predicted value was significantly larger than the observed value during HFMD peak periods.
GAM can analyze non-normally distributed data 33 . It can account not only for nonlinear and nonparametric relationships and trends, but also for the confounding effects of seasonality and weather variables. In our study, we used a quasi-Poisson GAM to analyze the relationship between meteorological factors and HFMD and to develop an HFMD forecast model. The ICC of the GAM indicated that this model was perfect for prediction of HFMD occurrence. In addition, we found that monthly mean temperature had a positive effect on HFMD when it was higher than 17 °C. This suggests that a monthly mean temperature of 17 °C could be considered as an alarm value for early warning of HFMD.
In summary, our study not only analyzed the epidemiological characteristics of HFMD in Yiwu City but also explored associated meteorological factors and developed three forecast models. All three models showed high agreement between predicted and observed values, and the ICC of the GAM was the best. We also identified that temperature, relative humidity, and wind speed were significantly associated with HFMD. The exposure-response curve of monthly mean temperature and HFMD occurrence was approximately V-shaped, and temperature showed positive effects on HFMD incidence when it was higher than 17 °C. Our study explored the epidemiological characteristics of HFMD in Yiwu City and provided accurate methods for early warning, which would be of great importance for the control and prevention of HFMD.