Spatio-Temporal Pattern and Socio-Economic Factors of Bacillary Dysentery at County Level in Sichuan Province, China

Bacillary dysentery (BD) remains a major public health problem in China. Effective spatio-temporal monitoring of BD incidence is important for the successful implementation of control and prevention measures. This study aimed to examine the spatio-temporal pattern of BD and to analyze socio-economic factors that may affect BD incidence in Sichuan province, China. First, we used the space-time scan statistic to detect high-risk spatio-temporal clusters in each year. Then, bivariate spatial correlation and a Bayesian spatio-temporal model were used to examine the associations between socio-economic factors and BD incidence. Spatio-temporal clusters of BD were mainly located in the northern-southern belt of the midwest area of Sichuan province. The proportion of primary industry and the proportion of rural population showed statistically significant positive correlations with BD incidence. The proportion of secondary industry, the proportion of tertiary industry, the number of hospital beds per thousand persons, medical and technical personnel per thousand persons, and per capita GDP showed statistically significant negative correlations with BD incidence. The best-fitting spatio-temporal model showed that medical and technical personnel per thousand persons and per capita GDP were significantly negatively related to the risk of BD.

incidence of BD is related to changes in biology, socio-economic status and environment over space and time 14 . The relationship between meteorological factors and BD has been reported in many studies [11][12][13]15 . But the association between socio-economic variables and BD is still far from clear, with very few studies examining the quantitative relationship between socio-economic factors and BD 16 . This is the first research targeted at the spatio-temporal characteristics of BD in Sichuan province over the last eleven years (2004 to 2014). The authors aimed to explore the spatio-temporal pattern of BD and socio-economic factors that affect the incidence of BD. The spatial, temporal and spatio-temporal analyses were conducted to determine high risk regions of BD, thus to provide information on appropriate allocation of public health resources for better disease control and prevention.

Results
Between January 2004 and December 2014, a total of 201,149 BD cases were reported in Sichuan province. The annual incidence ranged from 9.16 per 100,000 to 38.80 per 100,000 population, with an average annual incidence rate of 22.12 per 100,000 population. Table 1 shows the incidence rate of BD by year. The incidence rate increased from 29.44 per 100,000 in 2004 to 38.80 per 100,000 in 2006, and then declined during 2007-2014. In 2014, the incidence rate was 9.16 cases per 100,000 population, less than one third of the 2004 rate. Of the 201,149 BD cases, 109,071 (54.22%) were male and 92,078 (45.78%) were female, giving a male-to-female ratio of 1.18. Figure 1 illustrates the monthly distribution of BD cases. Clear seasonality was observed, with the seasonal peak occurring between May and September in each year. The spatial distribution of the annual incidence rate of BD is shown in Fig. 2. It clearly indicates that the spatial distribution of BD was heterogeneous at county level. Relatively high incidence rates appeared in the northern-southern belt of the midwest region.
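As a quick arithmetic check, the sex-specific shares and the male-to-female ratio can be recomputed from the case counts quoted above. This is only a sketch; the helper `incidence_per_100k` is a hypothetical name introduced for illustration, since county populations are not listed in the text.

```python
# Case counts quoted in the Results paragraph above.
total_cases = 201_149
male_cases = 109_071
female_cases = 92_078

male_share = 100 * male_cases / total_cases      # percent of all cases
female_share = 100 * female_cases / total_cases  # percent of all cases
sex_ratio = male_cases / female_cases            # male-to-female ratio


def incidence_per_100k(cases, population):
    """Crude incidence rate per 100,000 population (hypothetical helper)."""
    return 100_000 * cases / population


print(f"{male_share:.2f}% male, {female_share:.2f}% female, ratio {sex_ratio:.2f}")
```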
Spatio-temporal cluster analysis identified one most likely cluster and fifteen secondary clusters in 2004. The most likely cluster included 27  In each year, apart from the most likely cluster, several secondary clusters were also detected (Fig. 3). Most of the clusters were located in the northern-southern belt of the midwest area of Sichuan province. Table 3 shows the spatial correlation between the socio-economic factors and the incidence rate of BD at county level in Sichuan province in 2012. The proportion of primary industry and the proportion of rural population were positively correlated with the incidence rate of BD. The proportion of secondary industry and the proportion of tertiary industry were negatively correlated with the incidence rate of BD. A possible explanation is that the level of mechanization in agricultural production in Sichuan province was not high, and the poor sanitary conditions of farmers' living and working environments increased their susceptibility to BD, whereas workers in secondary and tertiary industry have relatively better sanitary conditions and are therefore less susceptible to BD. The number of hospital beds per thousand persons and medical and technical personnel per thousand persons were negatively correlated with BD incidence. These two factors represent the level of medical conditions. This result suggests that better medical conditions might help to reduce disease transmission. Per capita GDP and the incidence rate of BD showed a negative correlation. Per capita GDP is a proxy variable for the level of economic development.
The results indicated that a strong economy was helpful for improving public health conditions and reducing BD transmission. In Fig. 4A, the "high-high" and "low-low" clusters indicated a significant positive spatial correlation between the proportion of primary industry and the incidence rate of BD. The results provided evidence that regions with a higher proportion of primary industry were also more likely to have higher BD incidence rates ("high-high"). Similarly, we observed some regions where both the proportion of primary industry and the BD incidence rate were low ("low-low"). A similar pattern was observed in Fig. 4B: geographic regions marked by a high proportion of rural population had high BD incidence rates. Figure 4C provided evidence that regions that were poor in economic resources were also more likely to have higher BD incidence rates ("high-low"). Similarly, there were some regions that had better economic resources and lower BD incidence rates ("low-high"). Similar patterns were observed in Fig. 4D-G: geographic regions that had a lower proportion of secondary and tertiary industry and were poor in medical resources had high BD incidence rates.
Table 3. Spatial correlation between the socio-economic factors and the incidence of BD, 2012.
Table 4 presents the parameter estimates and the corresponding credible intervals of the Bayesian spatio-temporal model. The best model included two socio-economic variables: medical and technical personnel per thousand persons and per capita GDP. The results indicated that both variables were significantly related to the risk of BD, as their 95% credible intervals lay entirely below zero. The model with socio-economic variables had a better fit (DIC = 42202.64) than the model without socio-economic variables (DIC = 40931.59). Per capita GDP and medical and technical personnel per thousand persons were negatively associated with disease incidence. An increase of 1000 yuan in per capita GDP was associated with a decrease of around 2% in the relative risk of BD. An increase of one person in medical and technical personnel per thousand persons was associated with a decrease of around 1% in the relative risk of BD.
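In a log-linear Poisson model, a regression coefficient β translates into a multiplicative change of exp(β) in the relative risk per unit increase of the covariate. The sketch below illustrates that conversion. The coefficient values are hypothetical, back-calculated from the "around 2%" and "around 1%" figures quoted above; they are not the estimates reported in Table 4.

```python
import math


def percent_change_in_rr(beta, delta=1.0):
    """Percent change in relative risk for an increase of `delta` units
    in a covariate with log-scale coefficient `beta`."""
    return 100.0 * (math.exp(beta * delta) - 1.0)


# Hypothetical coefficients (assumption, not taken from Table 4):
beta_gdp = math.log(0.98)    # per 1000 yuan of per capita GDP
beta_staff = math.log(0.99)  # per one medical/technical person per thousand

print(f"GDP: {percent_change_in_rr(beta_gdp):.1f}% per 1000 yuan")
print(f"Personnel: {percent_change_in_rr(beta_staff):.1f}% per person per thousand")
```

Because the effect is multiplicative, an increase of several units compounds: `percent_change_in_rr(beta_gdp, delta=5)` is slightly smaller in magnitude than five times the one-unit change.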
In our study, all BD cases were clinically diagnosed or laboratory-confirmed and were reported on the basis of hospital diagnosis. Among the total BD cases, 59,734 (29.7%) were laboratory-confirmed. We ran the same spatio-temporal analyses on the laboratory-confirmed cases of BD. The results of space-time cluster detection are shown in Supplementary Fig. S1 and Supplementary Table S1 online. Supplementary Table S2 and Supplementary Fig. S2 show the spatial correlation between the socio-economic factors and the incidence rate of BD at county level in Sichuan province in 2012. Supplementary Table S3 online presents the parameter estimates and the corresponding credible intervals of the Bayesian spatio-temporal model.

Discussion
In our study, the temporal analysis showed that most cases occurred in summer and autumn. This was consistent with the findings in other areas of China. Similar seasonal variations were also found in Changsha city and Jiangsu Province 8,11 . However, there were slight differences. For example, the seasonal peak was observed from June to October in Changsha city, which is one month later than that in Sichuan province. Epidemiologists have long been perplexed by the causes of seasonality in human infectious diseases. Possibly no single theory can explain this phenomenon 17,18 . Environmental changes, especially climate changes, have most often been implicated. Several previous studies demonstrated that temperature played an important role in the seasonality of BD [11][12][13][19][20][21] . During the high epidemic months (May-September), the temperature was higher than in other months. Rising temperature could favor the growth and survival of Shigella in the environment. The optimum temperature for the growth of Shigella is 37 °C 22 . In addition, the housefly population density also increases during warm days 23 .
During the eleven-year study period, BD incidence in Sichuan province increased markedly during 2004-2006 and then decreased during 2007-2014. The spatio-temporal scanning results showed that the centers of the most likely clusters were all located in southwest Sichuan province throughout the study period. In each year, the most likely cluster covered counties located in Panzhihua prefecture, Ganzhi prefecture, Liangshan prefecture and its neighboring areas. The most likely cluster detected in 2012 included 23 counties that were mostly located in Liangshan prefecture. Liangshan prefecture, officially the Liangshan Yi autonomous prefecture, has the largest population of ethnic Yi nationally. It is a less developed area of Sichuan province. This result was consistent with the finding of the spatial correlation analysis, in which economic development played an important role in the incidence of BD.
In this study, we used bivariate spatial correlation to examine the spatial relationship between the socio-economic factors and BD incidence for the geographic regions of Sichuan province. The results indicated that the proportion of primary industry and the proportion of rural population had a positive association with BD incidence, whereas the other five socio-economic factors were negatively associated with BD incidence. Furthermore, the bivariate local Moran's I revealed local variations in the spatial dependence between the socio-economic factors and BD incidence across Sichuan province. The results of the Bayesian spatio-temporal model showed that medical and technical personnel per thousand persons and per capita GDP were significantly negatively related to the risk of BD. Economic progress might help to improve hygiene and provide better access to safe water and food. This was consistent with previous findings. Ferrer et al. reported economic factors as one of the factors determining diarrhea occurrence 24 . Tang et al. reported that people with higher family income had a lower BD incidence rate 8 . In addition, better medical conditions might help to reduce disease transmission. Preventative strategies should be concentrated in areas with poor economic and medical conditions.
A few limitations deserve mention: 1) The underreporting of BD cases is a potential limitation of our study. Some people with BD do not seek health care because they are asymptomatic or their symptoms are mild. The homogeneity of reporting among the counties of Sichuan province is an important parameter when comparing the burden of BD across counties. However, this parameter was not taken into account in this study because the under-reporting rates of the counties in Sichuan province are not available. 2) In our study, we used the county as the geographic unit of spatial analysis. However, finer geographic units may provide more useful information that could help health officials to devise more comprehensive strategies. A smaller spatial unit (e.g. township level) could be used in further research. For instance, the analysis could focus on Muli county and its neighboring counties to detect spatio-temporal clusters with higher burdens of BD at township level. 3) We could not differentiate the pathogens of the BD cases reported to the CISDCP system. Therefore, we were unable to examine the specific impacts of socio-economic indicators on different pathogens. 4) In this study, we only examined the relationship between the incidence of BD and socio-economic variables. In further research, more thorough study of the driving forces and risk factors (climate, geography and environment) that contribute to the transmission of BD is needed.
In conclusion, our study provides a good understanding of the spatio-temporal distribution of BD in Sichuan province. Allocating more resources to high-risk locations at suitable times might help to reduce BD incidence more effectively. Our results provide evidence that socio-economic factors were spatially correlated with the incidence rate of BD. The success of BD intervention strategies could benefit from giving more consideration to local social and economic conditions.

Materials and Methods
Sichuan province is located in south-west China between longitudes 98.31°E and 107.99°E and latitudes 26.40°N and 33.68°N. It is an inland province with a population of approximately 80 million people. Sichuan province covers an area of 485,000 km², which is divided into 21 prefectures and 180 counties.
Records on BD cases in Sichuan Province from 2004 to 2014 were obtained from the China Information System for Disease Control and Prevention (CISDCP), which is a real-time web-based notifiable diseases reporting system. In our study, all BD cases were clinically diagnosed or laboratory-confirmed and were reported on the basis of hospital diagnosis. Demographic information was obtained from the National Bureau of Statistics of China. The socio-economic data from 2004 to 2012 were collected from the Sichuan Statistical Yearbook. All collected data were geographically referenced to the 180 counties of Sichuan province, i.e., 180 spatial units for analysis.
A retrospective space-time scan statistic based on the discrete Poisson model was applied to detect high-risk space-time clusters of BD cases within Sichuan province 25 . The spatio-temporal cluster analysis was conducted using SaTScan software (version 9.2). The space-time scan statistic is defined by a cylindrical window with a circular geographic base and a height corresponding to time. The base of the cylinder indicates the underlying clustering area, and the height represents the time period of the potential cluster. The cylindrical window was moved over the study area and period to detect potential spatio-temporal clusters. For each scanning window, the difference between the incidence inside and outside the window was evaluated by the log likelihood ratio (LLR). Monte Carlo testing was used to determine the statistical significance of clusters. The scanning window with the maximum LLR was considered the most likely cluster, indicating that it was least likely to have occurred by chance. In addition to the most likely cluster, other scanning windows with statistically significant LLR values were identified as secondary clusters and ranked according to their likelihood ratio test statistic.
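The LLR for a single candidate cylinder can be sketched as follows, using the standard form of the Poisson scan statistic for high-rate clusters. This is an illustrative sketch and not SaTScan's implementation; the function name, argument names and the candidate windows are assumptions, and the expected count is assumed to be scaled so that expected counts over the whole study region and period sum to the total number of cases.

```python
import math


def poisson_llr(c, e, total):
    """Log likelihood ratio for one scanning window (high-rate clusters only).

    c     -- observed cases inside the window
    e     -- expected cases inside the window under the null hypothesis
    total -- total observed cases (assumed equal to total expected cases)
    """
    if c <= e:
        return 0.0  # no excess risk inside the window
    inside = c * math.log(c / e)
    rest = total - c
    outside = rest * math.log(rest / (total - e)) if rest > 0 else 0.0
    return inside + outside


# Hypothetical candidate windows: (label, observed, expected)
windows = [("A", 20, 10.0), ("B", 12, 11.0), ("C", 5, 9.0)]
total_cases = 100
scores = {label: poisson_llr(c, e, total_cases) for label, c, e in windows}
most_likely = max(scores, key=scores.get)
print(most_likely, round(scores[most_likely], 4))
```

The window with the largest LLR plays the role of the most likely cluster; windows with fewer cases than expected score zero because only high-risk clusters are of interest here.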
For this analysis, yearly scans were performed to control the time trend and to detect changes in spatio-temporal clustering during the whole study period. For each year, we used 180 counties of Sichuan province as spatial units, and 12 months from January to December as time units. To ensure sufficient statistical power, the number of Monte Carlo replications was set to 999. Statistical significance of the clusters was defined as a p-value less than 0.05.
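The Monte Carlo significance test can be sketched as a rank-based p-value: the observed maximum LLR is ranked among the maxima obtained from data sets simulated under the null hypothesis. The sketch below assumes the simulated maxima are already available as a list; generating them is left out, and the function name is hypothetical.

```python
def monte_carlo_p_value(observed_llr, simulated_llrs):
    """Rank-based Monte Carlo p-value.

    The observed statistic is ranked among the simulated ones; with R
    replications the p-value is rank / (R + 1).
    """
    rank = 1 + sum(1 for s in simulated_llrs if s >= observed_llr)
    return rank / (len(simulated_llrs) + 1)


# Illustrative values only: 999 simulated maxima, none exceeding the observed one.
simulated = [1.0] * 999
print(monte_carlo_p_value(10.0, simulated))
```

With 999 replications the smallest attainable p-value is 1/1000, and a cluster is significant at the 0.05 level only if fewer than 49 simulated maxima reach or exceed the observed LLR.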
The bivariate Moran's I statistic was used to describe the spatial correlation between the socio-economic factors and the incidence rate of BD in Sichuan province 26 . First, a global bivariate spatial correlation analysis was applied to calculate a single measure of spatial correlation between the incidence rate of BD and each socio-economic variable. Then, a local bivariate spatial correlation analysis, based on the local indicators of spatial association (LISA) approach 27 , was used to identify local patterns of spatial association. Standardized first-order queen contiguity was used as the definition of neighbors in our study. The significance of the test statistic was assessed with a Monte Carlo P value generated using 999 random permutations. The bivariate spatial correlation analysis was conducted using GeoDa software (version 1.3.28).
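A minimal sketch of the global bivariate Moran's I with a permutation test is given below. It assumes a row-standardized weights matrix with no isolated units and uses a two-sided permutation p-value for simplicity; GeoDa's own procedure may differ in detail. The four-unit chain and the variable values are made up for illustration.

```python
import numpy as np


def row_standardize(adjacency):
    """Row-standardize a binary contiguity matrix (every unit needs a neighbor)."""
    adjacency = np.asarray(adjacency, dtype=float)
    return adjacency / adjacency.sum(axis=1, keepdims=True)


def bivariate_moran(x, y, w):
    """Global bivariate Moran's I: association of x with the spatial lag of y."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    zx = (x - x.mean()) / x.std()
    zy = (y - y.mean()) / y.std()
    return float(zx @ (w @ zy) / len(zx))


def permutation_p_value(x, y, w, n_perm=999, seed=0):
    """Two-sided pseudo p-value from random permutations of y over the units."""
    rng = np.random.default_rng(seed)
    observed = abs(bivariate_moran(x, y, w))
    extreme = sum(
        abs(bivariate_moran(x, rng.permutation(y), w)) >= observed
        for _ in range(n_perm)
    )
    return (extreme + 1) / (n_perm + 1)


# Hypothetical example: four units in a chain, each adjacent to the next.
adjacency = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]]
w = row_standardize(adjacency)
factor = [1.0, 2.0, 3.0, 4.0]     # e.g. a socio-economic variable
incidence = [1.0, 2.0, 3.0, 4.0]  # e.g. BD incidence rate
print(bivariate_moran(factor, incidence, w))
print(permutation_p_value(factor, incidence, w))
```

A positive value indicates that units with high values of the first variable tend to be surrounded by high values of the second; reversing the sign of one variable flips the sign of the statistic.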
In our study, a Bayesian spatio-temporal model was used to quantify the association between the incidence of BD and socio-economic variables from 2004 to 2012. The Bayesian spatio-temporal model can be expressed as 28,29 :

$$y_{it} \sim \mathrm{Poisson}(e_{it}\,\theta_{it})$$

$$\log(\theta_{it}) = \alpha + \sum_{k} \beta_k x_{kit} + \upsilon_i + v_i + \gamma_t + \phi_t$$

where $i$ is an index for the spatial units (county), and $t$ is an index for the time periods (year); $y_{it}$ is the observed count; $e_{it}$ is the expected count of cases adjusted for age and gender, calculated by applying the provincial crude incidence rate to the county-specific age and gender distributions; $\theta_{it}$ is the spatiotemporal-specific relative risk of disease; $\alpha$ is the intercept quantifying the average Poisson relative risk in the whole of Sichuan province; $x_{kit}$ is the $k$-th socio-economic variable; $\beta_k$ is the regression coefficient of that socio-economic variable; $\upsilon_i$ is the spatially structured random effect for county $i$, accounting for the assumption that geographically close areas are more related than distant areas; $v_i$ is the non-spatial random effect for county $i$; $\gamma_t$ is the temporally structured effect for year $t$; $\phi_t$ is the unstructured temporal effect for year $t$. All random effects were modeled, and default minimally informative priors were specified 30 . First, the spatial effect was modeled using an intrinsic conditional autoregressive structure. Second, the temporally structured effect was modeled dynamically through a time neighboring structure. Finally, the unstructured spatial and temporal effects were both specified by Gaussian models with a mean of zero. A Gamma(1, 0.0005) distribution was chosen as the prior for the precision of the above Gaussian random effects.
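The expected count for a county can be sketched as indirect standardization: reference rates for the whole province are applied to each county's population by stratum and summed. The sketch below assumes stratum-specific provincial rates and uses two made-up strata and two made-up counties; the actual strata and data used in the study are not given in the text.

```python
import numpy as np

# Hypothetical provincial data for two age-gender strata.
province_cases = np.array([300.0, 100.0])
province_population = np.array([1_000_000.0, 1_000_000.0])
province_rates = province_cases / province_population  # reference rate per stratum

# Hypothetical county populations: rows are counties, columns are strata.
county_population = np.array([
    [50_000.0, 50_000.0],
    [20_000.0, 80_000.0],
])

# Expected count per county: sum over strata of population times reference rate.
expected = county_population @ province_rates

# Observed counts divided by expected counts give a crude relative risk,
# the quantity that the model smooths over space and time.
observed = np.array([30.0, 10.0])
crude_relative_risk = observed / expected
print(expected, crude_relative_risk)
```

A crude relative risk above 1 means the county reported more cases than expected given its population structure; the random effects in the model borrow strength from neighboring counties and years to stabilize such ratios.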
Model parameters were estimated using integrated nested Laplace approximations (INLA), a method for approximate Bayesian inference in structured additive regression models with latent Gaussian fields. INLA outperforms the traditional Markov chain Monte Carlo (MCMC) method in terms of computational time, while providing very accurate results 31 . The performance of the models was compared using the deviance information criterion (DIC). The model with the lowest DIC indicates the best trade-off between model fit and complexity 32 . The INLA package in the R software (version 3.1.1) was used for the Bayesian spatio-temporal modeling.
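The DIC combines a measure of fit with a penalty for complexity: it equals the posterior mean deviance plus the effective number of parameters, where the latter is the posterior mean deviance minus the deviance evaluated at the posterior mean of the parameters. The sketch below shows that arithmetic and the selection rule. The model names and deviance values are hypothetical and are not results from this study.

```python
def dic(mean_deviance, deviance_at_mean):
    """Deviance information criterion.

    mean_deviance    -- posterior mean of the deviance (fit)
    deviance_at_mean -- deviance at the posterior mean of the parameters
    """
    effective_parameters = mean_deviance - deviance_at_mean  # complexity penalty
    return mean_deviance + effective_parameters


# Hypothetical candidate models: (mean deviance, deviance at posterior mean).
candidates = {
    "model_1": (1000.0, 960.0),
    "model_2": (990.0, 930.0),
    "model_3": (1010.0, 1000.0),
}
scores = {name: dic(d_bar, d_hat) for name, (d_bar, d_hat) in candidates.items()}
best = min(scores, key=scores.get)
print(scores, best)
```

Note that the model with the lowest mean deviance is not necessarily selected: in this made-up example `model_2` fits best but pays the largest complexity penalty.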