Introduction

Air pollution is an important factor in various diseases, including respiratory disease (RD) and cardiovascular disease (CD)1,2. Short-term and long-term exposure to air pollutants has been linked to symptoms of illness, recurrence, mortality, loss of life expectancy, and increased hospitalization of patients. Past studies have extensively addressed this issue3,4,5,6,8. Many models have been proposed to determine the relationship between air pollutants and the number of CD and RD patient admissions7,8. Because of the increasing importance of variables affecting health and the lack of suitable structural models for forecasting, time series modeling has been developed. A time series is a set of chronologically ordered observations, and future values can be predicted from these time-dependent observations9. Many of these studies have investigated correlations and linear and nonlinear relationships between these variables10,11. In the Slama (2019) study, the hospitalization rate of RD patients was estimated using correlation analysis and a distributed lag non-linear model12. However, in Iran and many developing countries, less attention has been paid to this issue. Among time series prediction methods, the Artificial Neural Network (ANN) is reported to be more accurate because it is self-adaptive and data-driven13. ANN also captures nonlinear mathematical relationships, generalizes easily, and can be used for various functions14. In the study by Rahman et al., the superiority of ANN over other linear and numerical methods for predicting hospital admissions due to air pollution was also noted15. Zhou et al. likewise reported the superiority of ANN over other linear and numerical methods for predicting hospital admissions due to air pollution. However, in many cities of Iran, including Yazd, the predictive relationship between air pollutants and CD and RD admissions has rarely been investigated using ANN.
Due to the importance of epidemiological studies on the effects of air pollution on human health and the location of Yazd city in a desert area of Iran with high dust flow, such studies are necessary. The ability of ANN to determine nonlinear relationships between variables affecting the health consequences of air pollution can also be applied in this city. Thus, this study aimed to evaluate the hospital admission rate due to CD using ANN and other statistical analyses and models in Yazd, central Iran.

Materials and methods

Study area

Yazd city is located in central Iran at geographical coordinates 31.8974°N, 54.3569°E. Its population is 656,474 (2016), of which 267.73 are under the age of 14 years, 564,107 are between the ages of 14 and 65, and about 65,594 are over 65 years. It has an area of 76,469 m2, lies 1230 m above sea level, and has a warm and dry climate with extensive deserts. The geographical location of the study area was obtained from Google Earth and is shown in Fig. 1.

Figure 1

The geographical location of the study area. Maps of Iran-Yazd are available at: https://earth.google.com/web/search/iran/@32.48376657,54.07653009,1720.21296492a,2586706.88328266d,35y,-0h,0t,0r/data=CigiJgokCV-5_6ohZkBAEaVzQIn7uz5AGYK1zaKG3kxAIXKfRyghJEpA and https://earth.google.com/web/search/yazd/@31.87955845,54.33676713,1228.81847223a,31608.9548231d,35y,0h,0t,0r/data=Cm8aRRI_CiUweDNmYTYxOTkzMDM1YjJhOTE6MHg1ZDkyYTE5ZGQ3ZDRhMTBjGfKr3oa95T9AIfovxnatLUtAKgR5YXpkGAIgASImCiQJotZsLUxlQ0AR11C4YGjzNUAZMIgmcwbTU0AhwkoZJxXbPEA in Google earth (2020), respectively.

Data collection

The data were gathered from the hospital database and included the number, gender, and age of the patients, as well as the cause of the disease. These data were obtained from two large hospitals in Yazd, which are referral centers for heart patients. Air pollutant concentration data were obtained from the air pollutant monitoring station based in Yazd. The air pollutants studied were PM10, SO2, O3, NO2, and CO. The data covered 5 consecutive years, from 2016 to 2019, and comprised 1559 records.

The modeling of the relationship between pollutant and CD

In this study, the relationship between air pollutants and CD was analyzed with several linear and non-linear models. The linear models included a linear regression model, a multivariate linear regression model, and correlation analysis. In addition, the time series of the data was modeled with non-linear models such as ANN. Moreover, the significance of the relationship between these variables was assessed by the AMOS method.

The linear modeling

In this study, the relationship between the dependent variable (CD) and the independent variables, including SO2, CO2, PM10, PM2.5, O3, and NO2, was identified by the regression model. The trend of pollutant variation and the number of hospital admissions over the 5-year study period were determined by the linear regression model. In this model, the strength of the relationship between variables was determined by R-square.
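As a minimal sketch of the univariate step described above, the following fits an ordinary least-squares line and reports R-square. The function name `fit_line` and the sample values are hypothetical and invented for illustration only; they are not study data, and the original analysis may have been done in a different tool.

```python
def fit_line(x, y):
    """Least-squares fit y ~ intercept + slope * x.

    Returns (slope, intercept, r_squared).
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    syy = sum((yi - mean_y) ** 2 for yi in y)
    if sxx == 0:
        raise ValueError("x has zero variance")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares around the fitted line.
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    r_squared = 1.0 - ss_res / syy if syy > 0 else 0.0
    return slope, intercept, r_squared


# Hypothetical daily values (illustrative assumption, not study data).
so2 = [4.0, 6.0, 8.0, 10.0, 12.0]
admissions = [7.0, 9.0, 8.0, 11.0, 10.0]
slope, intercept, r2 = fit_line(so2, admissions)
print(f"slope={slope:.2f} intercept={intercept:.2f} R2={r2:.2f}")
```

For a single predictor, R-square equals the square of the Pearson correlation coefficient, which is why the text can report either quantity for the univariate models.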

In addition, multivariate regression was used to generate a measure of effect, typically the CD risk ratio due to air pollutants, so that it describes the associations between these variables. Moreover, the synergistic effect of pollutants on the number of admitted patients was determined using the multivariate regression model.

ANN modeling

In the current study, analysis of the time series, relationships between variables, and outcomes was done using ANN. The training algorithm used in this study was Levenberg–Marquardt. Furthermore, the nonlinear autoregressive exogenous (NARX) model available in ANN was used because of its higher accuracy than other series. The independent variables (air pollutants) and the dependent variable (CD admissions) were entered into the model as input and output, respectively. The hidden layers included intermediate variables formed by the ANN, which allow the modeling of complex relationships between variables. The transfer functions in the hidden layer and the output layer were the sigmoid and linear transfer functions, respectively. In this method, the effect of each pollutant on the number of hospital admissions in each of the hospitals was modeled. Of the data, 70% were used for training, 15% for validation, and 15% for testing. Based on experience, the number of neurons and delays were selected in the range of 10 to 18 and 4 to 10, respectively. In total, five models were run for each pollutant. Ultimately, considering the 5 studied pollutants and the number of CD admissions in the hospital, 315 model runs were performed. Finally, based on the mean square error (MSE), error rate, and correlation coefficient, the best ANN structure for each pollutant was determined.
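The data preparation behind a NARX-style model can be sketched as follows: each training row pairs the previous `delay` values of the pollutant and of the admissions with the current admission count, and the rows are then divided 70/15/15. This is only an illustration under stated assumptions. The helper names are hypothetical, the chronological split is an assumption (the text does not say how the data were divided), and the network training itself (Levenberg–Marquardt) is not reproduced here. The grid at the end shows one reading that happens to give 315 runs (5 pollutants × 9 neuron counts × 7 delays); the text does not confirm that this is how the 315 runs were counted.

```python
from itertools import product


def make_narx_rows(exog, target, delay):
    """Build (inputs, output) pairs for a NARX-style model.

    inputs = the previous `delay` values of exog followed by the
    previous `delay` values of target; output = target[t].
    """
    if len(exog) != len(target):
        raise ValueError("exog and target must have the same length")
    rows = []
    for t in range(delay, len(target)):
        inputs = list(exog[t - delay:t]) + list(target[t - delay:t])
        rows.append((inputs, target[t]))
    return rows


def split_70_15_15(rows):
    """Chronological 70/15/15 split (the ordering is an assumption)."""
    n = len(rows)
    n_train = int(n * 0.70)
    n_val = int(n * 0.15)
    return rows[:n_train], rows[n_train:n_train + n_val], rows[n_train + n_val:]


# Pollutant labels as named in the text.
POLLUTANTS = ["PM10", "SO2", "O3", "NO2", "CO"]
# Assumed full grid: neurons 10..18 and delays 4..10 for every pollutant.
grid = list(product(POLLUTANTS, range(10, 19), range(4, 11)))

# Synthetic series, for illustration only (not study data).
exog = [float(v) for v in range(20)]
target = [float(v) for v in range(100, 120)]
rows = make_narx_rows(exog, target, delay=4)
train_rows, val_rows, test_rows = split_70_15_15(rows)
print(len(rows), len(train_rows), len(val_rows), len(test_rows), len(grid))
```

Each candidate (pollutant, neurons, delay) combination would then be trained on the training rows and compared on MSE and correlation coefficient, as the text describes.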

Statistical analysis

In the current study, descriptive and analytical statistics were used for data analysis. Concentrations of pollutants and the number of patients were summarized using means and measures of dispersion. Differences in the data between years were assessed by t test. The correlation between the number of patients and other factors was determined by the Pearson correlation.

Ethical approval and consent to participate

In the present study, data recorded in the hospital archive were used. To access the essential data, the necessary permission was obtained from the Shahid Sadoughi University of Medical Sciences through official correspondence. Given that the data did not include the names of individuals, there was no need for informed consent.

This study was conducted with the approval of the Medical Ethics Committee of Shahid Sadoughi University of Medical Sciences and Health Services (code: IR.SSU.SPH.REC.1397.164). The study was approved by Shahid Sadoughi University of Medical Sciences in Yazd, which stated that there was no need to obtain informed consent from patients. The project was found to be in accordance with the ethical principles and national norms and standards for conducting medical research in Iran. Finally, all authors confirm that all methods were performed in accordance with the relevant guidelines and regulations for human data.

Results and discussion

In this study, the effect of air pollutants in Yazd city on the hospital admission rate due to CD was investigated using time series modeling. First, the current situation in this city was described, and then the relationship between each pair of variables was investigated. Finally, the time series were predicted using ANN. The overall methodology of the study is shown in Fig. 2.

Figure 2

Methodological framework of study.

The profiles of air pollutants and admission CD patients over 5 studied years

The profiles of air pollutants and meteorological parameters over the 5 years are shown in Table 1.

Table 1 Mean, maximum and minimum concentrations of air pollutants over 5 years.

According to Table 1, the mean concentration of PM10 was 98.48 ± 50.8 μg m−3, but the maximum was 3002 μg m−3. During the 5 years of study, 1% of the samples exceeded 500 µg m−3, while 63% of the data exceeded the standard (100 µg m−3). Also, concentrations of SO2, O3, NO2, and CO were 28.73–0.4, 1224.6.16, 91.94–0.1, and 139.6–0.8 µg m−3, respectively. Additionally, 25% of the data for PM10 and O3 concentrations were in the range of 118–3002 and 23–1224 µg m−3, respectively, because the concentration of PM10 rises on some days of the year due to dust storms.

Moreover, the correlation coefficients between the concentrations of pollutants over the 5 years are shown in Table S1. The correlation of NO2 concentration with meteorological factors was higher than that of the other pollutants. The correlation coefficients between air pollutants and meteorological parameters showed that the linear correlations for PM10, NO2, O3, CO, and SO2 were (Rtemp = − 0.16; Rhumidity = − 0.2), (Rtemp = 1; Rhumidity = 0.42), (Rtemp = − 0.19; Rhumidity = − 0.3), (Rtemp = 0.1; Rhumidity = − 0.22), and (Rtemp = 0.48; Rhumidity = 0.39), respectively. In another study, there was a negligible and negative correlation between air pollutants and meteorological parameters16,17.

Among the index pollutants in Table S1, the highest positive and negative linear correlations were between CO and PM10 (R = 0.62) and between CO and SO2 (R = − 0.65), respectively. In China, the highest correlation was detected between NO2 and PM10, whereas the lowest and negative relationships were observed between O3 and the other pollutants16. The pattern of CD patient admissions to hospital in Yazd by sex and by two age groups (less than 65 years, and 65 years and above) is shown in Table 2.

Table 2 The mean, minimum and maximum number of CD patients by sex and age group.

Based on Table 2, the total number of CD patients over 5 years was 12,341. Of these, 57% were male (6598) and 43% were female (5749) (p < 0.0001). The mean number of CD patients per day over 5 years was 9.1, with a maximum of 27 people per day. On the other hand, 2109 CD patients were in the age group less than 65 years and 382 (15% of total CD) were more than 65 years of age, while the population over 65 years old constitutes 16% of the total population. There was a relatively low correlation between the number of CD admissions and age (R = 0.46). Univariate regression between pollutant concentration and the number of CD patients showed that this correlation was very low over 5 years: for PM10, NO2, CO, SO2, and O3, the values were (R = − 0.0444), (R = 0.0749), (R = − 0.0736), (R = 0.1617), and (R = − 0.052), respectively. Thus, univariate linear regression was not a suitable model for predicting the time series of the effect of air pollutants on the number of CD hospital admissions. The correlation between the concentration of pollutants and the number of CD admissions during the 5 years in Yazd is shown in Table 3.

Table 3 The relationship between index pollutants and hospital admission rates of CD in different lags.

According to Table 3, a univariate linear regression model showed that increasing SO2 and NO2 concentrations were associated with increased hospital admissions. With increasing lag, the correlation with NO2 decreased (lag 1 = 0.28), whereas the correlation with SO2 increased (lag 3 = 0.41). This correlation was reported in China for ischemic stroke18. The study of Ghozikali in Tabriz, northwestern Iran, and the study of Rajagopalan in JACC also reported a greater effect of SO2 and NO on hospital CD admissions19,20. Moreover, in Arak, Iran, in 2017, the highest effect was observed with NO2 at lag = 0. In addition, the highest effect of air pollutants on CD was related to CO in the single-pollutant model16. However, the weak relationship may be due to the limitation of the linear regression model in considering the synergistic effects of the pollutants.
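The lagged correlations discussed above can be illustrated with a short sketch that pairs the pollutant value at time t with the admission count at time t + lag and computes the Pearson coefficient. The function names and the sample series are hypothetical and are not study data; the exact lag convention used for Table 3 is not stated in the text, so this pairing is an assumption.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("a sequence has zero variance")
    return sxy / math.sqrt(sxx * syy)


def lagged_correlation(pollutant, admissions, lag):
    """Correlate pollutant[t] with admissions[t + lag] (lag >= 0)."""
    if lag < 0 or lag >= len(pollutant) - 1:
        raise ValueError("lag out of range")
    if lag == 0:
        return pearson_r(pollutant, admissions)
    return pearson_r(pollutant[:-lag], admissions[lag:])


# Synthetic example: admissions respond to the pollutant two steps later.
pollutant = [1.0, 3.0, 2.0, 5.0, 4.0, 6.0, 8.0, 7.0]
admissions = [0.0, 0.0, 3.0, 7.0, 5.0, 11.0, 9.0, 13.0]
for lag in (0, 1, 2, 3):
    print(lag, round(lagged_correlation(pollutant, admissions, lag), 3))
```

Scanning several lags in this way shows at which delay the linear association is strongest, which is the comparison the lag columns of Table 3 summarize.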

Annual concentration of air pollutants and their correlation with CD disease admission

The annual distribution of air pollutant concentrations and the pattern of hospital admission for CD patients are shown in Table S2.

According to Table S2, the trends of O3 and PM10 variation were similar. The maximum mean concentrations of both pollutants in 2016 were 117.45 ± 69.7 μg m−3 and 22.8 ± 14.7 ppb, respectively. After this period, the concentration of the pollutants increased dramatically. However, the maximum concentrations of PM10 and O3 were 3002 and 1224 ppb, respectively. On the other hand, the maximum standard deviations for the two pollutants were 88.3 and 14.7 μg m−3, respectively. In the study of Ghorbani et al. in 2019 in Mashhad, the mortality rate due to cardiovascular disease increased with increasing concentrations of air pollutants. Remarkably, the highest relative risk was related to CO and SO221. Also, in the study of Khajavi et al. in 2019 in Tehran using the AQI model, the highest effect of air pollutants on mortality due to cardiovascular disease occurred at lag = 022.

The correlation between concentrations of air pollutants and meteorological parameters is shown in Table S3. The trend of detected changes for air pollutants may be due to fluctuations in weather. Based on the maximum mean O3 concentration and air temperature (37.4 °C) per year (Table S3), O3 concentration had a positive correlation with air temperature (R = 0.35–0.88). Because of the high temperature in this city, especially in the warm seasons (48–52 °C), the rate of photolysis in the NOx cycle was high, which produces more O3 in warm air. Additionally, the dissolution of O3 in water droplets in the air is consistent with the negative correlation of O3 with air humidity (R = − 0.33 to − 0.56)23. This trend was not observed for other pollutants, including CO, SO2, and NO2, for which the relationship with the meteorological parameters was not specific (Table 5). In general, climate conditions, including temperature and humidity, did not change significantly (p > 0.05) during the 5 years in Yazd, while concentrations of pollutants such as SO2, O3, NO2, and CO fluctuated during this period. Thus, calculations were performed with lag = 1, 2, and 3 (Table 4).

Table 4 Correlation between CD hospital admission and annual concentration of indicator pollutants.

The number of annual admissions of CD patients varied during the study period (Table 4). In the concentration pattern of the pollutants, the maximum concentrations of SO2, O3, PM10, NO2, and CO were in months 3, 23, 7, and 3, respectively. The correlations between the number of CD cases and the concentration of pollutants in these months were (R = 0.42; p = 0.0033), (R = − 0.20; p = 0.026), (R = − 0.026; p = 0.0161), (R = − 0.5; p < 0.0001), and (R = 0.37; p < 0.0001), respectively.

The numbers of annual CD admissions in 2015, 2016, 2017, 2018, and 2019 were 220, 353, 431, 421, and 1066, respectively. This upward trend has grown such that in 2018 the number of these patients increased to 3 times that of 2017. However, there was no linear relationship between the annual concentration of pollutants and the number of CD hospital admissions at lag = 0, 1, and 2. There was also a low correlation for all pollutants (R2SO2 < 0.2). In addition, according to Table 4, no specific trends were observed for the annual concentration of the pollutants studied. In the study of Sharifi et al. in Tehran, the mortality rate due to cardiovascular disease had a significant relationship with the daily concentration of O3 and suspended particles24. The study of Khanjani et al. in 2019 in Tehran also showed that the most pronounced effect of air pollutants on mortality due to cardiovascular disease was related to NO2 and PM10 and occurred at lag = 025.

The pattern of monthly concentration of pollutants and the number of CD patient hospital admission

The trend of monthly changes in pollutant concentrations and the number of hospital admissions per month for CD patients is shown in Fig. 3.

Figure 3

The monthly trend of standard pollutant concentrations and the number of CD hospital admissions.

According to Fig. 3a, the mean monthly number of CD hospital admissions changed only slightly up to the 50th month of the study, but immediately after this time, the number of admissions increased significantly (p < 0.05). Monthly concentrations of NO2 had the highest effect on the number of CD hospital admissions in Yazd. The upward pattern in the number of CD patients was more consistent with the increase in NO2 concentration after the 50th month. This correlation was not observed for the early months of the study. The correlation between the number of CD patients and the monthly concentration of pollutants is given in Table 5.

Table 5 Correlation between concentration of indicator pollutants and the number of monthly CD patients admitted.

By comparing Tables 4 and 5, it can be concluded that the monthly concentration trend gives a better prediction of the relationship between the index pollutants and the admission of CD patients in Yazd. In the study of Liu et al., monthly results also showed higher accuracy in predicting the association between hospital admissions and air pollutants26. However, it can be concluded that linear models were not suitable for determining the relationship between air pollutants and CD hospital admissions. Consequently, the prediction was performed using multivariate linear regression models and ANN.

Time series prediction of hospital admission numbers associated with air pollution by ANN and multivariate linear regression model

In this study, the relationship between environmental pollutants and the number of CD patients admitted was examined using the stepwise regression model. A schematic of these relationships is shown in the regression model in Fig. 4.

Figure 4

The relationship between index pollutants and the number of CD patients admitted in the hospital using multivariate linear regression.

In this model, with adjusted R2 = 0.21, F = 5.87, and p = 0.001, the age of CD patients, CO, and NO2 had predictive power and were significant (p = 0.001)19. A study in Tabriz also showed that most hospital admissions of CD patients were among those older than 65 years. In addition, CO forms COHb in the blood and thereby reduces its oxygen-carrying capacity, which contributes to CD and heart failure27,28,29. Furthermore, NO2 has been identified as a pollutant representative of vehicle exhaust that causes excitotoxicity, endothelial and inflammatory responses, and damage to synaptic plasticity in the brain30.

Also, NO concentration and temperature had explanatory power over CO (p = 0.002), as the amount of CO produced from homes increases in winter. Temperature and SO2 had explanatory power over NO2 (p < 0.001), because at low temperatures humidity increases, and as moisture increases, the NO3 in the air becomes HNO3, which falls as fine droplets. The temperature effect on NO2 also results from the NOx photolysis cycle. Because nonlinear models such as ANN are more accurate than linear regression models for air quality time series, the effect of index pollutants on the number of CD hospital admissions was determined by ANN with the NARX model31,32.

The optimal ANN models used to predict the number of CD hospital admissions due to index pollutants were determined according to the NARX criteria and are shown in Table S4. According to Table S4, the best models to predict the effects of PM10, NO2, O3, and SO2 had 14, 12, 10, and 13 neurons in the hidden layer, with delays of 6, 5, 9, and 9, respectively. The correlation coefficients for PM10, NO2, O3, and SO2 were 0.78, 0.79, 0.81, and 0.83, respectively. As such, the ANN had high power in predicting the number of hospital admissions for SO2-induced CD33, because SO2 is a gaseous pollutant in the environment that increases the expression of the proinflammatory enzyme and vasoregulatory pathway and is effective in the development and progression of many cardiovascular diseases34,35. In contrast, NO2, O3, and NO have been effective factors in increasing the admission of CD patients in Tabriz36. In addition, the error autocorrelation function plots of the various series in Fig. S1a showed that the optimal models used in this study were stationary, because the maximum correlation was at lag = 0, with a 95% confidence limit. Thus, these models can be used to predict hospital admissions due to air pollutants.
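The error autocorrelation check mentioned above can be sketched as follows: the autocorrelation of the prediction errors is computed for each lag and compared with an approximate 95% bound of ±1.96/√N, which is the usual large-sample bound for uncorrelated errors. This is an illustrative assumption about how such a plot is read, not a reproduction of Fig. S1a; the function names and the error series are hypothetical.

```python
import math


def autocorrelation(errors, max_lag):
    """Sample autocorrelation of the errors for lags 0..max_lag."""
    n = len(errors)
    if n < 2 or max_lag >= n:
        raise ValueError("need at least 2 errors and max_lag < len(errors)")
    mean = sum(errors) / n
    centered = [e - mean for e in errors]
    denom = sum(c * c for c in centered)
    if denom == 0:
        raise ValueError("errors have zero variance")
    return [
        sum(centered[t] * centered[t + k] for t in range(n - k)) / denom
        for k in range(max_lag + 1)
    ]


def lags_outside_bounds(errors, max_lag):
    """Nonzero lags whose autocorrelation exceeds the approximate 95% bound."""
    bound = 1.96 / math.sqrt(len(errors))
    acf = autocorrelation(errors, max_lag)
    return [k for k in range(1, max_lag + 1) if abs(acf[k]) > bound]


# Synthetic, strongly alternating errors (illustration only, not study data).
errors = [1.0, -1.0] * 10
print(autocorrelation(errors, 2))
print(lags_outside_bounds(errors, 2))
```

An empty list from `lags_outside_bounds` would indicate that only lag 0 shows a large correlation, which is the pattern the text describes for the selected models; the synthetic series here is deliberately a counterexample.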

In addition, the time-series response plot in Fig. S1c shows that the output curve was distributed on both sides of the response curve. The low error rates in training, testing, and validation showed that the ANN model reliably reflected the time series data. Only in 2019 was this error higher than in the previous period. Other studies have also shown that ANN is a good model for predicting the admission of CD patients to hospitals in Iranian cities and other countries, compared to logistic regression and NARX methods19,30,37,38. According to the results of regression modeling and ANN, SO2 was an effective factor in hospital admission of CD patients in Yazd, which has not received considerable attention in previous studies in the central region of Iran. In addition, this study investigated the effect of single indicator pollutants by ANN; given their possible synergistic effects, studies that determine the effect of other specific pollutants and their synergistic effects are suggested. According to the results of the current study, the correlation coefficient in this type of modeling was less than 0.9. However, the results obtained from the models used showed a stronger relationship compared with the linear regression models (R = 0.45).

Conclusion

In the current study, the effect of index pollutants in the air of Yazd, central Iran, on the number of hospital admissions of CD patients over a 5-year period was investigated, and the following results were obtained:

According to the average concentration of pollutants, air quality in this period was not in good condition. The trend of changes in the concentrations of NO2 and SO2 was more dependent on meteorological parameters. Also, most of the cardiovascular patients in this period were men. In this study, several models were examined to determine the relationship between air pollutants and the number of cardiovascular patients admitted to the hospital. According to the linear regression model, the highest correlation was at lag = 4 and for SO2; this correlation was lower than 0.5. Although the ANN results were appropriate, the authors were limited in predicting these relationships with other algorithms; therefore, the use of other models and algorithms is proposed.