Influential factors and spatial–temporal distribution of tuberculosis in mainland China

Tuberculosis (TB) is an infectious disease that threatens human health. Mainland China has a high incidence of tuberculosis, and the task of tuberculosis prevention and treatment is arduous. This paper studies the impact of seven influencing factors on the relative risk (RR) of tuberculosis in mainland China, together with the spatial–temporal distribution of that risk, using a spatial–temporal distribution model and the INLA algorithm. The relative risks and confidence intervals (CI) corresponding to average relative humidity, monthly average precipitation, monthly average sunshine duration and monthly per capita GDP were 1.018 (95% CI 1.001–1.034), 1.014 (95% CI 1.006–1.023), 1.026 (95% CI 1.014–1.039) and 1.025 (95% CI 1.011–1.040), respectively. The relative risks for average temperature and average air pressure were 0.956 (95% CI 0.942–0.969) and 0.767 (95% CI 0.664–0.875), respectively. Spatially, the two provinces with the highest relative risks were Xinjiang and Guizhou, and the remaining provinces with higher relative risks were mostly concentrated in the Northwest and South China regions. Temporally, the relative risk decreased year by year from 2013 to 2015. It was higher from February to May each year, was most pronounced in March, and decreased from June to December. Average relative humidity, monthly average precipitation, monthly average sunshine duration and monthly per capita GDP had positive effects on the relative risk of tuberculosis. Average temperature and average air pressure had negative effects. Average wind speed had no significant effect. Mainland China should adapt measures to local conditions and develop tuberculosis prevention and control strategies based on the characteristics of different regions and times.

(99.910). The five highest provinces in 2015 were: Xinjiang (179.716), Tibet (137.407), Guizhou (132.626), Qinghai (122.310) and Hainan (97.113). This article studies the relative risk of tuberculosis in mainland China. The results of the study are divided into five aspects: spatial stratified heterogeneity detection, meteorological and social factors that affect the risk of tuberculosis, spatial distribution, temporal distribution and spatial-temporal distribution of relative risk of tuberculosis.
Spatial stratified heterogeneity detection. The factors include: location, time, temperature, relative humidity, precipitation, duration of sunshine, wind speed, air pressure and per capita GDP. Spatial stratified heterogeneity detection for the incidence and influencing factors of tuberculosis from 2013 to 2015 gives p-values below 0.1 for all factors, and the q-statistics of the factors, in the order listed above, are 0.875, 0.061, 0.006, 0.058, 0.021, 0.033, 0.029, 0.218 and 0.201, respectively. This shows that the factors studied in this article are all significant in explaining the distribution of tuberculosis, and that the spatial effect (location) has the strongest explanatory power. The q-statistics and p-values are shown in Table 1.
The posterior means of the regression coefficients of mean temperature and mean air pressure are − 0.045 and − 0.268. The corresponding relative risks are 0.956 (95% CI 0.942-0.969) and 0.767 (95% CI 0.664-0.875). The two have significant negative effects on the incidence of tuberculosis. When the variable increases by one unit, the relative risks reduce by 4.4% (95% CI 3.1-5.8%) and 23.3% (95% CI 12.5-33.6%).
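The link between a regression coefficient and its relative risk can be checked with a short calculation. The sketch below assumes the usual log-linear reading, RR = exp(β), and uses the posterior means quoted above; the function names are illustrative only. Because the reported relative risks are posterior summaries, exponentiating a posterior mean need not reproduce them to the last digit (the air pressure value, for example, may differ slightly).

```python
import math

# Posterior means of the regression coefficients as reported in the text.
coefficients = {
    "mean temperature": -0.045,
    "mean air pressure": -0.268,
}


def relative_risk(beta):
    """Relative risk for a one-unit increase, assuming a log-linear model."""
    return math.exp(beta)


def percent_change(rr):
    """Percentage change in risk implied by a relative risk (negative = reduction)."""
    return (rr - 1.0) * 100.0


for name, beta in coefficients.items():
    rr = relative_risk(beta)
    print(f"{name}: RR = {rr:.3f}, change = {percent_change(rr):+.1f}%")
```

A relative risk below 1 corresponds to a negative percentage change, which is how the reductions quoted in the text should be read.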
The posterior mean of the regression coefficient of average wind speed is − 0.009, and the relative risk is 0.991 (95% CI 0.980-1.002). The 95% confidence interval for the relative risk contains 1, so average wind speed has no significant effect on the incidence of tuberculosis. Note that this interval is conditional on the assumptions of the model; if those assumptions do not match the properties of the population, the interval does not reflect the true error.
Spatial and temporal distribution. Spatial distribution. The relative risk attributable to the spatial effects of an area is $RR_{spatial} = \exp(u + v)$.
The relative risk $RR_{spatial}$ of spatial effects in the 31 provinces is shown in Table 3 and Fig. 1. … Shanghai, and Tianjin. These provinces are mostly in East and Central China, which means that these two regions have a lower risk of tuberculosis. On the whole, the relative risk of tuberculosis shows obvious spatial differences, with a distribution that is heavy in the south and light in the north. In the future, attention should be paid to the spread of tuberculosis in Xinjiang, Guangxi, Hainan, and Heilongjiang, as well as to epidemic monitoring in high-risk areas such as Jiangxi, Chongqing, Henan and Anhui.
Temporal distribution. The relative risk attributable to the temporal effects is $RR_{temporal} = \exp(\gamma + \phi)$. The relative risk $RR_{temporal}$ of temporal effects is shown in Figs. 2 and 3. Figure 2 shows the relative risk $RR_{temporal}$ and its confidence band for the 36 months from 2013 to 2015. The relative risk of tuberculosis has a seasonal periodicity. It is highest from February to May each year, is most pronounced in March, and decreases from June to December. Figure 3 shows the temporal effect curve for each year. Overall, the relative risk of tuberculosis decreases year by year.
Spatial-temporal distribution. Interaction detection shows that there is a nonlinear enhancement between location and time: q(Location ∩ Time) = 1 is greater than the sum of q(Location) = 0.875 and q(Time) = 0.061. The spatial effect and the temporal effect are therefore not independent of each other. The spatial-temporal effect term δ represents a change that cannot be reflected by the spatial and temporal effects alone. Figure 4 shows the relative risk of the spatial-temporal effect, $RR_{spatial\text{-}temporal} = \exp(\delta)$. From the figure, we can see how $RR_{spatial\text{-}temporal}$ changes over time in two adjacent regions. The temporal trend of the incidence risk in two adjacent regions is random. The temporal trend of the regions is also independent of the spatial structure. That is, the impact of unobserved variables on the relative risk of disease does not have a time × space structure, and can be separated into time effects and space effects. The figure shows that the spatial-temporal effect terms of Tibet and Qinghai increased more from 2013 to 2015, indicating that unobserved variables have a greater impact on Tibet and Qinghai. For example, the local medical conditions may not be sufficiently developed.
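The interaction check described above can be written as a small rule. The sketch below is illustrative: the only rule stated in the text is that the joint q exceeding the sum of the single-factor q values indicates nonlinear enhancement. The remaining category names follow the convention commonly used with the geographical detector and are an assumption here, as are the function name and the tolerance.

```python
def classify_interaction(q1, q2, q12, tol=1e-9):
    """Classify the interaction of two factors from their q-statistics.

    q1, q2 : q-statistics of the single factors
    q12    : q-statistic of the two factors combined
    Categories other than "nonlinear enhancement" are an assumed convention.
    """
    low, high = min(q1, q2), max(q1, q2)
    if q12 < low:
        return "nonlinear weakening"
    if q12 < high:
        return "single-factor nonlinear weakening"
    if abs(q12 - (q1 + q2)) <= tol:
        return "independent"
    if q12 > q1 + q2:
        return "nonlinear enhancement"
    return "bivariate enhancement"


# Values reported in the text for location, time and their combination.
print(classify_interaction(0.875, 0.061, 1.0))
```

With the reported values, 0.875 + 0.061 = 0.936 is below the joint value of 1, which is the comparison the text relies on.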

Discussion
This article investigates the influencing factors of the risk of tuberculosis and its spatial and temporal distribution. In general, the number of cases and the incidence of tuberculosis showed a downward trend from 2013 to 2015, which is an encouraging overall result. The article gives a more rigorous analysis from four aspects. The results are expected to be useful for the research and control of tuberculosis in mainland China. Spatially, the relative risk differs between provinces. In existing studies 8,15,23, Xinjiang, Guizhou, Guangxi and Hunan have been identified as high-risk areas and hot spots. The results in this paper show that the risk in Hainan was also high from 2013 to 2015. This may be because Hainan has a tropical marine climate with high humidity throughout the year, high rainfall, and long sunshine hours 24. Therefore, it is necessary to strengthen the prevention and control of tuberculosis in Hainan Province. Early detection and early treatment of tuberculosis patients are necessary, as are disinfection and sterilization in public places and wider dissemination of tuberculosis prevention knowledge.
The relative risk of tuberculosis varies by year, season, and month. Studies have shown that the risk in Zhejiang Province is highest in April 18 and then gradually decreases. The risk of morbidity in Yunnan is also highest in spring 25. Overall, the relative risk of tuberculosis is higher in spring and lower in autumn and winter, so protective measures should be strengthened in spring. The public should be reminded to ventilate rooms frequently and keep indoor air fresh, and to take physical exercise to improve immunity.
An existing study 23 has shown that average temperature and average air pressure have negative effects on tuberculosis and that average relative humidity has a positive effect, and another study 15 has shown that average precipitation has a positive effect; both are consistent with the results for 2013 to 2015 in this article. The results of this paper show that precipitation has a positive effect on tuberculosis, which is consistent with the conclusions of existing studies 7,15. This may be because tuberculosis is a chronic infectious disease caused by Mycobacterium tuberculosis 26. Mycobacterium tuberculosis is more likely to survive in an environment with high humidity and precipitation, but does not survive easily in an environment with high temperature and pressure. The monthly average sunshine duration is particularly significant in promoting the risk of tuberculosis. The ultraviolet light contained in sunlight can harm human skin and eyes, and may cause a decline in human immunity and tuberculosis infection. The results of this study indicate that the duration of sunlight is an important factor affecting the risk of tuberculosis, so it should be considered when studying that risk. The research in this paper shows that monthly GDP per capita has a positive effect on tuberculosis. This may be because improvement of the economic level has made medical treatment more convenient, which helps the diagnosis of tuberculosis. As GDP continues to increase and treatment levels and medical systems become more complete, the incidence of tuberculosis may decrease 8. Tibet is relatively remote, with large temperature differences between day and night and relatively long periods of sunlight. Although the results of this study show that Tibet was not among the five provinces with the highest risk of tuberculosis from 2013 to 2015, more attention is still needed.
The meteorological factors selected in this paper are comprehensive, but the paper still has some shortcomings. First, this article collected data only for the 36 months from 2013 to 2015; data for longer periods can be collected in future research. Second, this article selects only per capita GDP as a socio-economic factor; future work could take into account hidden factors such as medical resources.
In summary, this article gives the influence of meteorological and economic factors on the relative risk of tuberculosis from 2013 to 2015 and analyzes the spatial and temporal distribution characteristics of the relative risk of tuberculosis. It is hoped that this will provide a theoretical basis for the prevention and control of tuberculosis.
The meteorological data from 2013 to 2015 came from the China Meteorological Data Network and comprise six meteorological variables from 826 stations across the country for 36 months. The monthly meteorological data of the 31 provinces from 2013 to 2015 were obtained by the ordinary kriging interpolation method. The total monthly precipitation was then converted into monthly average precipitation, and the total monthly sunshine duration was converted into monthly average sunshine duration. The quarterly GDP data for 2013-2015 came from the National Bureau of Statistics. First, the quarterly GDP was converted into monthly GDP, and then the monthly GDP of each province was converted into monthly per capita GDP.
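The unit conversions mentioned above could be sketched as follows. The text does not state the conversion rules, so both are assumptions made only for illustration: monthly totals are divided by the number of days in the month, and quarterly GDP is split equally over its three months before dividing by population. The function names are hypothetical.

```python
import calendar


def monthly_average(monthly_total, year, month):
    """Daily average from a monthly total (assumed rule: divide by days in month)."""
    days_in_month = calendar.monthrange(year, month)[1]
    return monthly_total / days_in_month


def monthly_per_capita_gdp(quarterly_gdp, population):
    """Monthly per capita GDP (assumed rule: equal split of the quarter)."""
    monthly_gdp = quarterly_gdp / 3.0
    return monthly_gdp / population


# Made-up inputs, used only to show the calls.
print(monthly_average(62.0, 2013, 1))
print(monthly_per_capita_gdp(300.0, 10.0))
```

Other conventions, such as weighting the months of a quarter unequally, would change the monthly GDP values, so the rule actually used should be confirmed against the data source.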

Methods
Model. Bernardinelli et al. 27 proposed a Bayesian model to study the spatial-temporal distribution of disease, also known as a Poisson log-linear model. This model studies the impact of spatial and temporal differences on the relative risk of a specific disease, that is, the deviation of a region from the overall relative risk. The model includes spatial effect and linear time effect terms, and the spatial effect and its corresponding time trend are random effects that reflect the overall relative risk level of a specific region. It also includes a separable space-time effect term, reflecting the temporal trends among regions. Knorr et al. 28 changed the linear time effect term in the Poisson log-linear model to a non-linear one, comprising a structured time effect and an unstructured time effect, and changed the spatiotemporal interaction term to a non-separable one so that the model suits a wider range of disease research. This spatial-temporal distribution model can better describe and explain the spatial and temporal distribution characteristics of relative risk. In the studies 7,23 of tuberculosis in mainland China, the time effect term is linear and the spatiotemporal interaction term is not considered, so the study of the temporal and spatial-temporal distribution is not thorough enough.
The study of the spatial-temporal distribution of disease requires data from multiple regions, multiple times, and multiple influencing factors, so the amount of data is large. Compared with the MCMC method, the INLA algorithm proposed by Rue 29 in 2009 is computationally more efficient without losing accuracy. Therefore, applying the INLA algorithm to the study of the spatial-temporal distribution of diseases 30,31 is an important method in epidemiology. In this paper, the INLA algorithm was used to estimate the parameters of the spatial-temporal distribution model.
Build the following spatial-temporal distribution model:
$$Y_{it} \sim \mathrm{Poisson}(\lambda_{it}), \qquad \lambda_{it} = E_{it}\,\theta_{it},$$
$$\log \theta_{it} = b_0 + u_i + v_i + \sum_{k=1}^{6} \beta_k X_{kit} + \gamma_t + \phi_t + \delta_{it},$$
where $i = 1, 2, \ldots, 31$, $t = 1, 2, \ldots, 36$, $k = 1, 2, \ldots, 6$. $Y_{it}$ is the number of tuberculosis cases in area $i$ and month $t$, following the Poisson distribution with mean $\lambda_{it}$. $\lambda_{it}$ represents the average onset level in area $i$ in month $t$. $E_{it}$ is the expected number of tuberculosis cases in area $i$ and month $t$, which is equal to the product of the number of people in area $i$ and the national incidence rate in month $t$, the latter representing the average national incidence. $\theta_{it}$ is the relative risk, which represents the risk of area $i$ compared to the overall risk of tuberculosis in the country. $b_0$ is the average log relative risk. $u_i$ is the spatial structured effect of area $i$, which represents undefined features of area $i$ that have a spatial structure, and follows the conditional autoregressive distribution. $v_i$ is the spatial unstructured effect of area $i$, which represents undefined features of area $i$ that do not have a spatial structure, and follows a normal distribution. $u_i$ and $v_i$ can be regarded as hidden variables of area $i$ 32, which are related and unrelated to the location of the area, respectively. $X_{kit}$ is the value of the $k$th influencing factor in month $t$ in area $i$. $\beta_k$ represents the effect of the $k$th influencing factor. $\gamma_t$ is the structured effect of month $t$, which represents undefined features of month $t$ that have a temporal structure, and follows the second-order random walk model. $\phi_t$ is the unstructured effect of month $t$, which represents undefined features of month $t$ that do not have a temporal structure, and follows a normal distribution. $\gamma_t$ and $\phi_t$ can be regarded as hidden variables of month $t$, which are related and unrelated to the position of month $t$, respectively. $\delta_{it}$ is the spatial-temporal interaction effect in area $i$ and month $t$. $\delta$ follows the normal distribution with precision matrix $\kappa_\delta K_\delta$, where $K_\delta$ is the structure matrix, $K_\delta = K_v \otimes K_\phi$.
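To make the term definitions concrete, the sketch below computes an expected count and a relative risk for a single area and month. It assumes the log-linear form implied by the definitions, log θ = b0 + u + v + Σ β_k X_k + γ + ϕ + δ, with Poisson mean E × θ. All numerical inputs are made up for illustration and are not estimates from the study; the function names are hypothetical.

```python
import math


def expected_cases(population, national_rate):
    """Expected cases E: area population times the national monthly incidence rate."""
    return population * national_rate


def log_relative_risk(b0, u, v, betas, x, gamma, phi, delta):
    """Log relative risk as the sum of intercept, spatial, covariate,
    temporal and interaction terms (assumed log-linear form)."""
    covariate_term = sum(beta * value for beta, value in zip(betas, x))
    return b0 + u + v + covariate_term + gamma + phi + delta


# Hypothetical inputs for one area and one month.
E = expected_cases(population=5_000_000, national_rate=6.0 / 100_000)
log_theta = log_relative_risk(
    b0=0.0, u=0.10, v=-0.02,
    betas=[-0.045], x=[1.0],      # one covariate, one unit above its reference
    gamma=0.05, phi=0.0, delta=0.01,
)
theta = math.exp(log_theta)
poisson_mean = E * theta          # mean of the Poisson count for this cell

print(f"E = {E:.1f}, theta = {theta:.3f}, Poisson mean = {poisson_mean:.1f}")
```

A relative risk above 1 means the cell is expected to record more cases than its population share of the national incidence would suggest.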
The spatiotemporal interaction effect here represents that the unobserved variables in area $i$ and month $t$ have no structure in time × space. That is, the temporal incidence trend in two adjacent areas is random. The specific distributions of the above variables are as follows:
$$u_i \mid u_{j \neq i} \sim N\!\left(\frac{1}{N_i}\sum_{j=1}^{31} a_{ij} u_j,\; s_i^2\right), \qquad v_i \sim N\!\left(0,\; \frac{1}{\tau_v}\right),$$
where $N_i = \#N(i)$ and $s_i^2 = \frac{1}{\tau_u N_i}$. $N(i)$ is the set of neighbors of area $i$ and $N_i$ is the number of those neighbors. If area $i$ is adjacent to area $j$, $a_{ij}$ is equal to 1; otherwise, $a_{ij}$ is 0. $a_{ii}$ is set to 0. $\tau_u$ is the precision parameter of the spatial structured effect and $\tau_v$ is the precision parameter of the spatial unstructured effect.
Spatial stratified heterogeneity detector. China is huge and diverse in both environmental and socioeconomic determinants of TB prevalence. When analyzing the influence of factors on tuberculosis, it is necessary to detect spatial stratified heterogeneity. This article uses the q-statistic 33 to detect the spatial stratified heterogeneity of tuberculosis and the interaction of spatial and temporal effects. The q-statistic formula is as follows:
$$q = 1 - \frac{\sum_{h=1}^{L} N_h \sigma_h^2}{N \sigma^2},$$
where $h = 1, \ldots, L$ indexes the strata; $N_h$ and $N$ are the number of units in stratum $h$ and in the whole area, respectively; $\sigma_h^2$ and $\sigma^2$ are the variances of the $Y$ values in stratum $h$ and in the whole area, respectively. The value range of the q-statistic is [0,1]. The larger the value of q, the stronger the explanatory power of the factor for the dependent variable; the smaller the value, the weaker. If q is equal to 0, there is no relationship between the factor and the dependent variable. If q is 1, the factor completely controls the spatial distribution of the dependent variable.
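A minimal sketch of the q-statistic can be built from the quantities defined above ($N_h$, $N$, $\sigma_h^2$, $\sigma^2$), assuming the usual geographical detector form q = 1 − Σ N_h σ_h² / (N σ²). Population variances are used so that q stays within [0, 1]; that choice, the function name and the toy data are assumptions for illustration.

```python
from statistics import pvariance


def q_statistic(values, strata):
    """q = 1 - sum_h(N_h * var_h) / (N * var), using population variances.

    values : observed Y values, one per unit
    strata : stratum label of each unit (same length as values)
    """
    if len(values) != len(strata):
        raise ValueError("values and strata must have the same length")
    total_variance = pvariance(values)
    if total_variance == 0:
        raise ValueError("q is undefined when Y has zero variance")

    # Group the Y values by stratum label.
    groups = {}
    for y, h in zip(values, strata):
        groups.setdefault(h, []).append(y)

    within = sum(len(group) * pvariance(group) for group in groups.values())
    return 1.0 - within / (len(values) * total_variance)


# Toy data: strata that separate the values perfectly, then not at all.
print(q_statistic([1, 1, 5, 5], ["a", "a", "b", "b"]))
print(q_statistic([1, 5, 1, 5], ["a", "a", "b", "b"]))
```

In the first call every stratum is internally homogeneous, so the stratification explains all of the variance; in the second the within-stratum variance equals the total variance, so it explains none.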

Ethics declarations. This study does not involve human experiments and uses public data from the Chinese Center for Disease Control and Prevention, so approval by an ethics committee was not required.

Data availability
Tuberculosis surveillance data generated during the current study are available in the Chinese Center for Disease