Ambient air pollution and cardiovascular disease rate an ANN modeling: Yazd-Central of Iran

This study investigated, for the first time, the impact of air pollutants on hospital admission rates of heart patients in Yazd. Modeling was done by time series, multivariate linear regression, and artificial neural network (ANN). Over 5 years, the mean concentrations of PM10, SO2, O3, NO2, and CO were 98.48 μg m−3, 8.57 ppm, 19.66 ppm, 18.14 ppm, and 4.07 ppm, respectively. The total number of cardiovascular disease (CD) patients was 12,491, of whom 57% were men and 43% were women. The highest correlation among air pollutants was observed between CO and PM10 (R = 0.62). The presence of SO2 and NO2 may depend on meteorological parameters (R = 0.48). Although there was a positive correlation between age and CD (p = 0.001), the highest correlation was detected between SO2 and CD (R = 0.4). The annual variation trends of SO2, NO2, and CO concentrations were more similar to the variation trends of the meteorological parameters. Moreover, temperature was an effective factor in the O3 variation rate at lag = 0. SO2 was the most effective contaminant in CD patient admissions to hospitals (R = 0.45). In the monthly database classification, SO2 and NO2 were the most prominent factors in CD (R = 0.5). The multivariate linear regression model also showed that CO and SO2 were significant contaminants for the number of hospital admissions (R = 0.46, p = 0.001) and that both pollutants were a function of air temperature (p = 0.002). In the nonlinear ANN model, 14, 12, 10, and 13 neurons in the hidden layer formed the best structure for PM, NO2, O3, and SO2, respectively. The R all value for these structures was 0.78–0.83. In these structures, according to the autocorrelation of error at lag = 0, the series are stationary, which makes it possible to predict using this model. According to the results, the artificial neural network had a good ability to predict the effect of air pollutants on CD in a 5-year time series.

Data collection. The data were gathered from the hospital database, which includes the number, gender, and age of the patients, as well as the cause of the disease. These data were obtained from two large hospitals in Yazd, which are referral centers for heart patients. Air pollutant concentration data were obtained from the air pollutant monitoring station based in Yazd. The air pollutants studied were PM10, SO2, O3, NO2, and CO. The information covered 5 consecutive years, from 2016 till 2019, and included 1559 data points.
The modeling of the relationship between pollutant and CD. In this study, the relationship between air pollutants and CD was analyzed by several models, both linear and non-linear. The linear models included a linear regression model, a multivariate linear regression model, and correlation. The time series of the data were analyzed with non-linear models such as ANN. Moreover, the significance of the relationship between these variables was assessed by the AMOS method.
The linear modeling. In this study, the relationship between the dependent variable (CD) and the independent variables, including SO2, CO2, PM10, PM2.5, O3, and NO2, was identified by the regression model. The pollutant variation trend and the number of hospital admissions over the 5-year study period were determined by the linear regression model. In this model, the strength of the relationship between variables was determined by R-square.
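As an illustration of the univariate step described above, the following minimal sketch fits a least-squares line and computes R-square with NumPy. The function name `fit_line_r2` and the sample values are hypothetical and are not taken from the study data; the authors' own software is not stated in this section.

```python
import numpy as np

def fit_line_r2(x, y):
    """Least-squares line y ~ slope * x + intercept, plus R-square."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    slope, intercept = np.polyfit(x, y, deg=1)
    residuals = y - (slope * x + intercept)
    ss_res = float(np.sum(residuals ** 2))        # unexplained variation
    ss_tot = float(np.sum((y - y.mean()) ** 2))   # total variation
    return slope, intercept, 1.0 - ss_res / ss_tot

# Hypothetical daily values, for illustration only (not study data).
so2 = [5.0, 7.5, 9.0, 12.0, 15.5, 18.0]
admissions = [6, 7, 9, 9, 12, 13]
slope, intercept, r2 = fit_line_r2(so2, admissions)
print(f"slope={slope:.3f}, intercept={intercept:.3f}, R^2={r2:.3f}")
```

For a single predictor, R-square equals the square of the Pearson correlation coefficient R, so values of R around 0.4 correspond to R-square of roughly 0.16.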
Besides, multivariate regression was used to generate a measure of effect, typically the CD risk ratio due to air pollutants, so that it describes the associations between these variables. Moreover, the synergistic effect of pollutants on the number of admitted patients was determined using a multivariate regression model.
ANN modeling. In the current study, time series, relationships between variables, and outcomes analysis were done using ANN. The algorithm used in this study was Levenberg-Marquardt. Furthermore, the nonlinear autoregressive exogenous model (NARX) series available in ANN was used because of its higher accuracy than other

Results and discussion
In this study, the effect of air pollutants in Yazd city on the hospital admission rate due to CD was investigated using time series modeling. First, the current situation in the city was described, and then the relationship between each pair of variables was investigated. Finally, time series were predicted using ANN. The overall methodology of the study is shown in Fig. 2.
The profiles of air pollutants and admission CD patients over 5 studied years. The profiles of air pollutants and meteorological parameters over the 5 years are shown in Table 1.
According to Table 1, the mean concentration of PM10 was 98.48 ± 50.8 μg m−3, but the maximum was 3002 μg m−3. The results showed that during the 5 years of study, 1% of the samples exceeded 500 μg m−3, while 63% of the data exceeded the standard (100 μg m−3). Also, the concentrations of SO2, O3, NO2, and CO were in the ranges 28.73-0.4, 1224-6.16, 91.94-0.1, and 139.6-0.8 μg m−3, respectively. Additionally, 25% of the data for PM10 and O3 concentrations were in the range of 118-3002 and 23-1224 μg m−3, respectively, because on some days of the year the concentration of PM10 rises due to dust storms.
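The exceedance shares quoted above (the fraction of samples above 100 or 500 μg m−3) can be computed as in the following minimal sketch. The function name `exceedance_share` and the sample values are hypothetical; missing readings are assumed to be stored as NaN and are dropped before counting.

```python
import numpy as np

def exceedance_share(concentrations, threshold):
    """Fraction of valid (non-NaN) samples strictly above a threshold."""
    values = np.asarray(concentrations, dtype=float)
    values = values[~np.isnan(values)]   # assumption: gaps are coded as NaN
    return float(np.mean(values > threshold))

# Hypothetical daily PM10 readings in ug m-3 (not study data).
pm10 = [80.0, 120.0, 95.0, float("nan"), 640.0, 150.0]
print(exceedance_share(pm10, 100.0))   # share above the 100 ug m-3 standard
print(exceedance_share(pm10, 500.0))   # share above 500 ug m-3
```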
Moreover, the correlation coefficients between the concentrations of pollutants over the 5 years are shown in Table S1. The correlation of NO2 concentration with meteorological factors was higher than that of the other pollutants. The linear correlation coefficients between air pollutants and meteorological parameters for PM10, NO2, O3, CO, and SO2 were (R temp = −0.16; R humidity = −0.2), (R temp = 1; R humidity = 0.42), (R temp = −0.19; R humidity = −0.3), (R temp = 0.1; R humidity = −0.22), and (R temp = 0.48; R humidity = 0.39), respectively. In another study, there was a negligible and negative correlation between air pollutants and meteorological parameters 16,17.
Among the index pollutants in Table S1, the highest positive linear correlation was between CO and PM10 (R = 0.62), and the highest negative linear correlation was between CO and SO2 (R = −0.65). In China, the highest correlation was detected between NO2 and PM10, whereas the lowest and negative relationships were observed between O3 and the other pollutants 16. The pattern of CD patients admitted to Yazd hospitals, by sex and by two age groups (under and over 65 years), is shown in Table 2.
Based on Table 2, the total number of CD patients over 5 years was 12,341. Of these, 57% were male (6598) and 43% were female (5749) (p < 0.0001). The mean number of CD patients over 5 years was 9.1, with a maximum of 27 people per day. On the other hand, 2109 CD patients were in the age group less than 65 years and 382 (15% of total CD) were more than 65 years of age, while the population over 65 years old constitutes 16% of the total population. There was a relatively low correlation between the number of CD patients admitted and their age (R = 0.46). Univariate regression between pollutant concentration and CD count showed that this correlation was very low over 5 years. Thus, univariate linear regression was not a suitable model for predicting the time series of the effect of air pollutants on the number of CD hospital admissions. So, for PM10 Table 3.
According to Table 3, the univariate linear regression model showed that increasing SO2 and NO2 concentrations were effective in increasing hospital admissions. With increasing lag, the correlation with NO2 decreased (lag 1 = 0.28) and the correlation with SO2 increased (lag 3 = 0.41). This correlation was reported in China for    19,20 . Moreover, in Arak, Iran, in 2017, the highest effect was observed with NO2 at lag = 0. Besides, the highest effect of air pollutants on CD was related to CO in the single-pollutant model 16. However, the weak relationship may be due to the limitation of the linear regression model in considering the synergistic effects of the pollutants.
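The lagged correlations discussed above can be illustrated with a minimal sketch that correlates the pollutant concentration on day t − lag with the number of admissions on day t. The function name `lagged_correlation` and the synthetic series are hypothetical; the sketch assumes equally spaced daily data without gaps.

```python
import numpy as np

def lagged_correlation(pollutant, admissions, lag):
    """Pearson R between pollutant on day t - lag and admissions on day t."""
    p = np.asarray(pollutant, dtype=float)
    a = np.asarray(admissions, dtype=float)
    if len(p) != len(a):
        raise ValueError("series must have the same length")
    if lag < 0 or lag >= len(p) - 1:
        raise ValueError("lag must leave at least two paired samples")
    if lag > 0:
        p, a = p[:-lag], a[lag:]   # pair day t - lag with day t
    return float(np.corrcoef(p, a)[0, 1])

# Synthetic example: admissions follow the pollutant three days later.
rng = np.random.default_rng(0)
pollutant = rng.normal(size=200)
admissions = np.roll(pollutant, 3) + 0.1 * rng.normal(size=200)
for lag in range(5):
    print(lag, round(lagged_correlation(pollutant, admissions, lag), 2))
```

In this synthetic case the correlation peaks at lag = 3 by construction; with real data, the lag at which R peaks is the quantity of interest.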

Annual concentration of air pollutants and their correlation with CD disease admission.
The annual distribution of air pollutant concentrations and the pattern of hospital admission for CD patients are shown in Table S2. According to Table S2, the variation trends of O3 and PM10 were similar. The maximum mean concentrations of PM10 and O3, both in 2016, were 117.45 ± 69.7 μg m−3 and 22.8 ± 14.7 ppb, respectively. After this period, the concentration of the pollutants increased dramatically. However, the maximum concentrations of PM10 and O3 were 3002 μg m−3 and 1224 ppb, respectively. The maximum standard deviations for the two pollutants were 88.3 and 14.7 μg m−3, respectively. In the study of Ghorbani et al. in 2019 in Mashhad, the mortality rate due to cardiovascular disease increased with increasing concentrations of air pollutants. Remarkably, the highest relative risk was related to CO and SO2 21. Also, in the study of Khajavi et al. in 2019 in Tehran using the AQI model, the highest effect of air pollutants on mortality due to cardiovascular disease occurred at lag = 0 22.
The correlation between concentrations of air pollutants and meteorological parameters is shown in Table S3. The trend of changes detected for air pollutants may be due to fluctuations in weather. Given the maximum mean O3 concentration and air temperature (37.4 °C) per year (Table S3), O3 concentration had a positive correlation with air temperature (R = 0.35-0.88). Because of the high temperature in this city, especially in the warm seasons (48-52 °C), the rate of photolysis in the NOx cycle was high, which produces more O3 in warm air. Additionally, the dissolution of O3 in water droplets in the air is consistent with the negative correlation of O3 with air humidity (R = −0.33 to −0.56) 23. This trend was not observed for other pollutants, including CO, SO2, and NO2, for which the relationship with the meteorological parameters was not specific (Table 5). In general, climate conditions, including temperature and humidity, did not change significantly (p > 0.05) during the 5 years in Yazd, while concentrations of pollutants such as SO2, O3, NO2, and CO fluctuated during this period. Thus, calculations were performed with lag = 1, 2, and 3 (Table 4).
The number of annual admissions of CD patients was variable during the study period (Table 4). In the concentration pattern of the pollutants, the maximum concentrations of SO2, O3, PM10, NO2, and CO were in months 3, 23, 7, and 3, respectively. The correlation between the number of CD diseases and the concentration of It can be said that this upward trend has grown, so that in 2018 the number of these patients increased to 3 times that of 2017. However, there was no linear relationship between the annual concentration of pollutants and the number of CD hospital admissions at lag = 0, 1, and 2. There was also a low correlation for all pollutants (R2 SO2 < 0.2). In addition, according to Table 4, no specific trends were observed for the annual concentration of the pollutants studied. In the study of Sharifi et al. in Tehran, the mortality rate due to cardiovascular disease had a significant relationship with the daily concentration of O3 and suspended particles 24. The study of Khanjani et al. in 2019 in Tehran also showed that the most distinguished effect of air pollutants on mortality due to cardiovascular disease was related to NO2 and PM10 and occurred at lag = 0 25.
According to Fig. 3a, the mean monthly number of CD hospital admissions changed only slightly up to the 50th month of the study. Immediately after this time, however, the number of admissions increased significantly (p < 0.05). Monthly concentrations of NO2 had the highest effect on the number of CD hospital admissions in Yazd. Meanwhile, the upward pattern in the number of CD patients was more consistent with the increase in NO2 concentration over the 50th week. This correlation was not observed for the early months of the study. The correlation between the number of CD patients and the monthly concentration of pollutants is given in Table 5.

The pattern of monthly concentration of pollutants and the number of CD patient hospital admission. The trend of monthly changes in pollutant concentrations and the number of hospital admissions per month for CD patients is shown in Fig. 3.
By comparing Tables 4 and 5, it can be concluded that the monthly concentration trend better predicts the relationship between the index pollutants and the admission of CD patients in Yazd. In the study of Liu et al., monthly results also showed higher accuracy in predicting the association between hospital admissions and air pollutants 26. However, it can be concluded that linear models were not suitable for determining the relationship between air pollutants and CD hospital admissions. Consequently, the prediction was performed using multivariate linear regression models and ANN.

Time series prediction of hospital admission numbers associated with air pollution by ANN and multivariate linear regression model. In this study, the relationship between environmental pollutants and the number of CD patients admitted was determined using the stepwise regression model. A schematic of these relationships in the regression model is shown in Fig. 4.
In this model, with adjusted R2 = 0.21, F = 5.87, and p = 0.001, the age of CD patients, CO, and NO2 had predictive power and were significant (p = 0.001) 19. A study in Tabriz has also shown that most hospital admissions of CD patients were related to ages over 65 years. Also, CO, by forming COHb in the blood, reduces the oxygen-carrying capacity of the blood, which is effective in causing CD and heart failure 27–29. In addition, NO2 has been identified as a pollutant representative of vehicle exhaust, causing excitotoxicity, endothelial and inflammatory responses, and damage to synaptic plasticity in the brain 30.
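To make the adjusted R2 reported above concrete, the following minimal sketch fits a multivariate least-squares model and computes both R2 and adjusted R2. It fits a fixed set of predictors and does not perform the stepwise selection used in the study; the function name `ols_adjusted_r2` and the synthetic predictors are hypothetical.

```python
import numpy as np

def ols_adjusted_r2(X, y):
    """Ordinary least squares with intercept; returns (coef, R2, adjusted R2)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, k = X.shape                                # n samples, k predictors
    design = np.column_stack([np.ones(n), X])     # prepend intercept column
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ coef
    ss_res = float(residuals @ residuals)
    ss_tot = float(np.sum((y - y.mean()) ** 2))
    r2 = 1.0 - ss_res / ss_tot
    # Adjusted R2 penalizes the number of predictors k.
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)
    return coef, r2, adj_r2

# Synthetic predictors standing in for, e.g., age and two pollutants.
rng = np.random.default_rng(42)
X = rng.normal(size=(100, 3))
y = 2.0 + 1.5 * X[:, 0] - 0.5 * X[:, 2] + 0.5 * rng.normal(size=100)
coef, r2, adj_r2 = ols_adjusted_r2(X, y)
print(np.round(coef, 2), round(r2, 3), round(adj_r2, 3))
```

Adjusted R2 is never larger than R2, which is why it is the preferred summary when several candidate predictors are compared.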
Also, NO concentration and temperature had explanatory power over CO (p = 0.002), as the amount of CO produced from homes increases in winter. Temperature and SO2 had explanatory power over NO2 (p < 0.001) because at low temperatures humidity increases, and as moisture increases, the NO3 in the air becomes HNO3, which falls as fine droplets. The temperature effect on NO2 also results from the NOx photolysis cycle. Because nonlinear models such as ANN are more accurate than linear regression models for air quality time series, the effect of index pollutants on the number of CD patient admissions to hospital was determined by ANN with the NARX model 31,32.
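A NARX model predicts the current output from delayed values of the output itself and of an exogenous input. The following minimal sketch only builds that delayed input matrix; it does not train a network, and the function name `narx_design` and the toy series are hypothetical. Here y stands for daily admissions and x for one pollutant concentration.

```python
import numpy as np

def narx_design(y, x, delay):
    """Build NARX inputs: each row holds y[t-1..t-delay] then x[t-1..t-delay].

    Returns (features, targets), where targets[i] is y[t] for the same t.
    """
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    if len(y) != len(x):
        raise ValueError("y and x must have the same length")
    if delay < 1 or delay >= len(y):
        raise ValueError("delay must satisfy 1 <= delay < series length")
    rows = []
    for t in range(delay, len(y)):
        past_y = y[t - delay:t][::-1]   # most recent value first
        past_x = x[t - delay:t][::-1]
        rows.append(np.concatenate([past_y, past_x]))
    return np.array(rows), y[delay:]

# Toy series, for illustration only (not study data).
admissions = np.arange(10.0)
pollutant = 10.0 * np.arange(10.0)
features, targets = narx_design(admissions, pollutant, delay=2)
print(features.shape, targets.shape)
```

Each row of `features` would then be passed through a hidden layer of neurons to predict the matching entry of `targets`; the delay and the hidden-layer size are the two structural choices tuned per pollutant.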
Table 4. Correlation between CD hospital admission and annual concentration of indicator pollutants.
The optimal models derived from the ANN used to predict the number of CD hospital admissions due to index pollutants were determined according to the NARX criteria and are shown in Table S4. According to Table S4, the best models to predict the effects of PM10, NO2, O3, and SO2 had 14, 12, 10, and 13 neurons in the hidden layer, respectively. In these structures, the delays were 6, 5, 9, and 9, respectively. Correlation coefficients for PM10, NO2, O3, and SO2 were 0.78, 0.79, 0.81, and 0.83, respectively. As such, the ANN had high power in predicting the number of hospital admissions for SO2-induced CD 33, because SO2 is a gaseous pollutant in the environment that increases the expression of the proinflammatory enzyme and vasoregulatory pathway and is effective in the development and progression of many CDs 34,35. In contrast, NO2, O3, and NO have been effective factors in increasing the admission of CD patients in Tabriz 36. In addition, the error autocorrelation function plot of the various series in Fig. S1a showed that the optimal models used in this study were stationary, given the maximum correlation at lag = 0 and the 95% confidence limit. Thus, this model can be used to predict hospital admissions due to air pollutants. Fig. S1c also shows that the output curve was distributed on both sides of the response curve. The low error rate in training, testing, and validation showed that the ANN model reliably reflected the time series data. Only in 2019 was this error higher than in previous periods. Other studies have also shown that ANN is a good model for predicting the admission of CD patients to hospitals in Iranian cities and other countries, compared to logistic regression and NARX methods 19,30,37,38.
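The error autocorrelation check mentioned above can be sketched as follows: the autocorrelation of the prediction errors is computed for several lags and compared with an approximate 95% confidence bound of ±1.96/√N. This is a generic illustration under the assumption of equally spaced errors; the function name `error_autocorrelation` and the simulated errors are hypothetical and do not reproduce Fig. S1a.

```python
import numpy as np

def error_autocorrelation(errors, max_lag):
    """Sample autocorrelation of errors for lags 0..max_lag and the 95% bound."""
    e = np.asarray(errors, dtype=float)
    e = e - e.mean()
    denom = float(e @ e)
    acf = np.array([float(e[: len(e) - k] @ e[k:]) / denom
                    for k in range(max_lag + 1)])
    bound = 1.96 / np.sqrt(len(e))   # approximate 95% limit for white noise
    return acf, bound

# Simulated uncorrelated errors, for illustration only.
rng = np.random.default_rng(1)
errors = rng.normal(size=500)
acf, bound = error_autocorrelation(errors, max_lag=20)
outside = int(np.sum(np.abs(acf[1:]) > bound))
print(f"lag-0 value: {acf[0]:.2f}, lags outside the 95% bound: {outside}")
```

By definition the value at lag = 0 is 1; a well-specified model leaves errors whose autocorrelation at the other lags stays mostly inside the bound.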
According to the results of regression modeling and ANN, SO2 has been an effective factor in hospital admission of CD patients in Yazd, which has not received considerable attention in previous studies in the central region of Iran. In addition, this study investigated the effect of single indicator pollutants by ANN; further studies to determine the effect of other specific pollutants and of synergistic effects are suggested. According to the results of the current study, the correlation coefficient in this type of modeling was less than 0.9. However, the relationship obtained from these models was stronger than that from the linear regression models (R = 0.45).

Conclusion
In the current study, the effect of index pollutants in the air of Yazd, in the center of Iran, on the number of hospital admissions of CD patients over a 5-year period was investigated, and the following results were obtained. According to the average concentration of pollutants, air quality in this period was not in good condition. The trend of changes in the concentration of NO2 and SO2 was more dependent on meteorological parameters. Also, most of the cardiovascular patients in this period were men. In this study, several models were examined to determine the relationship between air pollutants and the number of cardiovascular patients admitted to the hospital. According to the linear regression model, the highest correlation was at lag = 4 and for SO2; this correlation was lower than 0.5. Although the results of ANN were appropriate, the authors were limited in predicting these relationships using other algorithms, so other models and algorithms are proposed.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.