Abstract
Air pollution, the contamination of air by gases, liquids, and suspended solid particles, is a major environmental and public health concern. Particulate matter with a diameter of 10 microns or less (\({\text {PM}}_{10}\)) is an important type of air pollution because particle size is one of the determining factors for human health: it governs how long particles remain in, and how deeply they penetrate, the respiratory system. It is therefore important to monitor and understand the behavior of \({\text {PM}}_{10}\) concentrations so that they do not exceed the established critical levels. In this work, we study the \({\text {PM}}_{10}\) concentrations at all available monitoring stations in the Brazilian state of Minas Gerais. To better understand their behavior, we provide a spatio-temporal visualization of the \({\text {PM}}_{10}\) concentrations. Besides the descriptive and visualization analysis, we consider six standard and advanced time series models to fit and forecast \({\text {PM}}_{10}\) concentrations, with application to three locations: one in Belo Horizonte, the Minas Gerais state capital, and the monitoring stations with the lowest and highest average \({\text {PM}}_{10}\) concentration levels.
Introduction
The human impact on the planet is remarkable, and the need to reduce it is increasingly urgent. Air pollution, for example, directly affects the environment and human health1,2,3. One of the determining factors that affect human health is the size of particles in the atmosphere, due to the degree of permanence and penetration they have in the respiratory system4,5,6,7. As the health impact is directly related to particle size, monitoring the concentrations of \({\text {PM}}_{10}\), particulate matter with a diameter of 10 micrometers or less, is very important8,9. Based on the annual average of \({\text {PM}}_{10}\), the World Health Organization (WHO) ranked Ahvaz in Iran as the most polluted city in the world10, at 372 \(\upmu\)g/m\(^3\).
In Europe, the Apheis project has developed guidelines for collecting and analyzing data on air quality and its public health impacts11. The study presented the health impact in 19 Eastern and Western European cities. The results indicate that reducing long-term \({\text {PM}}_{10}\) exposure by 5 \(\upmu\)g/m\(^3\) could prevent approximately 3300–7700 premature deaths annually. The Apheis project also showed that in urban Europe, current air pollution has a non-negligible impact on public health and that even in cities with low air pollution, preventive measures can reduce damage12. In Brazil, in the Metropolitan Area of São Paulo (MASP), 40% of \({\text {PM}}_{10}\) emissions come from mobile sources13,14,15. In addition, ozone and \({\text {PM}}_{10}\) are the pollutants with the greatest impact on air quality in the MASP16,17. A study carried out in the Jânio Quadros and Maria Maluf tunnels in São Paulo indicates that emissions from heavy-duty diesel vehicles are the major source of fine particulate matter (\({\text {PM}}_{2.5}\))14. Likewise, a study in the metropolitan region of Lima, the capital of Peru, proposed a space-time visualization to analyze \({\text {PM}}_{10}\) levels, showing that the highest concentrations of \({\text {PM}}_{10}\) were recorded near hills, high-traffic roads, and unpaved streets18.
In particular, in Brazil, through Conama Resolution No. 005/1989, the National Air Quality Control Program (Pronar) was created19. This program attempts to build the foundations for a national air quality protection policy20. However, although Pronar is the beginning of a national air quality policy, it has great legal fragility because its legal basis is hierarchically inferior to the already established laws. Furthermore, there is a clear asymmetry between the country’s regions and most of the air quality management instruments are located in southeast Brazil21.
Several studies of air pollution data in Brazil have taken up this challenge, using statistical models to address both model fit22,23,24 and model forecast25,26,27. For example, in Itabira (a city in the Brazilian state of Minas Gerais), an increase of 10 \(\upmu\)g/m\(^3\) of \({\text {PM}}_{10}\) was associated with an increase in emergency room visits for respiratory diseases, leading to the conclusion that an increase in \({\text {PM}}_{10}\) levels has a major impact on the exposed population28. Likewise, a study carried out in the Greater Vitória Region (Vitória being the capital of the Brazilian state of Espírito Santo) used the seasonal autoregressive integrated moving average with exogenous factors (SARIMAX) model to better understand and predict the behavior of \({\text {PM}}_{10}\) concentrations, noting that both wind speed and rainfall were statistically significant and helped to improve the model fit29. On the other hand, a study carried out in the Brazilian state of Rio Grande do Sul reported that the autoregressive moving average with exogenous factors (ARMAX) model, with carbon monoxide and sulfur dioxide as exogenous variables, performed better for \({\text {PM}}_{10}\) prediction than the autoregressive integrated moving average (ARIMA), simple exponential smoothing, and Holt-Winters models29.
Meanwhile, other methods for time series forecasting, including neural networks30,31,32,33 and deep learning have also been used to forecast variables related to air pollution. For example, one study proposed predictive models for \({\text {PM}}_{2.5}\) concentration with a model that combined the fast Fourier transform and long short-term memory neural network (FFT-LSTM) and proved to be superior to the traditional LSTM and extended long short-term memory recurrent neural network (LSTM) models34. Another study used a deep learning algorithm integrating convolutional neural networks (CNNs) and LSTM neural networks to predict \({\text {PM}}_{2.5}\) concentrations35. Cordova et al.36 studied the spatio-temporal behavior of air quality in Metropolitan Lima, evaluated and predicted the \({\text {PM}}_{10}\) concentrations using the recurrent artificial neural network LSTM, based on the past values of this pollutant and three meteorological variables obtained from five monitoring stations. It is important to notice that the \({\text {PM}}_{10}\) concentrations have nonlinear behavior and fluctuate strongly in spatio-temporal scales37 due to the nonlinear character of the atmospheric wind speed38. Consequently, to manage this strong variability, in this paper, we consider different forecasting models.
Although many studies have been made to better understand the behavior and to forecast \({\text {PM}}_{10}\) concentrations, no comprehensive study that includes all monitoring stations in the Brazilian state of Minas Gerais has been made. In this paper, we will analyze the spatio-temporal dynamics of \({\text {PM}}_{10}\) concentrations in all available monitoring stations in the Brazilian state of Minas Gerais. Then, we compare classical parametric models and neural networks to forecast the \({\text {PM}}_{10}\) concentrations, whose results can be useful for governmental agencies and policymakers to decide on specific policies and actions to improve air quality.
The rest of the paper is structured as follows. The following section describes the data collection, data cleaning, and the methods and models used for \({\text {PM}}_{10}\) forecasting. The section “Results and discussion” presents the descriptive analysis and the main findings of this research regarding model fit and model forecasting. Finally, the section “Conclusions” provides the main conclusions of this paper, together with some recommendations for future research.
Materials and methods
Data collection and data cleaning
The data used in this work was collected by the State Foundation for the Environment of the Brazilian state of Minas Gerais. The data was collected hourly and the last five years available were considered (between 2015 and 2019) in all 58 monitoring stations. The data are publicly available per municipality, monitoring station, and year. For each combination municipality/monitoring station/year, the data is available in a csv file that includes the hourly information on pollutant levels such as \({\text {PM}}_{10}\) and \({\text {PM}}_{2.5}\), as well as meteorological data such as temperature, wind direction, rainfall, atmospheric pressure, wind speed, radiation, and relative humidity. The first step of the analysis was to organize, clean, and store the database, which is always a challenging operation when dealing with real data. In this case, the main challenges were:
- The lack of information for some variables in several stations.
- A significant amount of missing values.
From the available 58 monitoring stations, we discarded those with more than \(35\%\) of missing values in the \({\text {PM}}_{10}\) data. In addition, one station that did not have data for 2015 was also discarded. Thus, we proceeded with data from 29 air quality monitoring stations distributed throughout the Brazilian state of Minas Gerais. The locations of the 29 monitoring stations considered in this study can be seen in Fig. 1; there are more stations in areas with higher population density, which results in some overlapping points on the map. Figure 6S of the “Supplementary material” shows a heat map of the missing \({\text {PM}}_{10}\) values in each monitoring station. The next stage was the imputation of missing values, which was done with the function na_kalman of the package imputeTS in the R software39. At the end of the process, we obtained a database of \({\text {PM}}_{10}\) concentrations with 43824 hourly observations (rows) for each of the 29 stations (columns), available in the “Supplementary material”. Table 1S of the “Supplementary material” presents detailed information for each monitoring station, including code, station name, company responsible for the monitoring station, longitude, latitude, and the rate of missing values.
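The station-screening rule described above can be sketched as follows. This is a minimal illustration in Python rather than the R workflow used in the study; the function names and the toy data are hypothetical, and the Kalman-based imputation performed by na_kalman is not reproduced here.

```python
import math


def missing_rate(values):
    """Fraction of entries that are missing (None or NaN)."""
    missing = sum(
        1 for v in values
        if v is None or (isinstance(v, float) and math.isnan(v))
    )
    return missing / len(values)


def keep_stations(series_by_station, max_missing=0.35):
    """Keep only stations whose missing rate does not exceed max_missing."""
    return {
        name: values
        for name, values in series_by_station.items()
        if missing_rate(values) <= max_missing
    }


# Hypothetical toy data, not the real measurements.
toy = {
    "station_a": [20.0, None, 25.0, 30.0],          # 25% missing, kept
    "station_b": [None, None, 18.0, float("nan")],  # 75% missing, dropped
}
kept = keep_stations(toy)
print(sorted(kept))  # ['station_a']
```

The threshold is a parameter so that the sensitivity of the station count to the 35% cut-off could be explored.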
Models for time series forecasting
Time series models are very important and can be useful in many areas of knowledge that collect time-dependent data40,41. They can be used both to understand the underlying process that generated the data and to predict future observations42,43. Predictions can be short term (e.g., 1 h ahead) or long term (e.g., 720 h, or 1 month, ahead). Regardless of the forecasting horizon, forecasting is an important aid to effective planning and policy-making44. In this study, six models for time series forecasting of \({\text {PM}}_{10}\) levels are considered and briefly described below.
Seasonal Naive
The Seasonal Naive (SNAIVE) model is an extension of the NAIVE model that considers a seasonal component of period T in the time series45 and can be written as
\[\widehat{Y}(t+h|t) = Y(t+h-T), \tag{1}\]
where t is the length of the time series, h is the forecasting horizon, T is the seasonal period, \(\widehat{Y}(t+h|t)\) is the prediction h steps ahead, and \(Y(t+h-T)\) is the value observed one seasonal period (T observations) before the time being forecast, \(t+h\). This expression holds for \(h \le T\); for longer horizons, the last observed season is repeated. This means that the seasonal naive model takes as the out-of-sample forecast the last observation at the same seasonal point. When considering \(T=1\), the NAIVE model is obtained. This model was fitted using the snaive function of the package forecast in the software R.
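The seasonal naive rule described above can be illustrated with a short sketch. It is written in Python for illustration only and is not the snaive function of the forecast package; the function name and the toy series are hypothetical.

```python
def seasonal_naive(y, period, horizon):
    """Forecast `horizon` steps ahead by repeating the last observed season."""
    if len(y) < period:
        raise ValueError("need at least one full season of data")
    last_season = y[-period:]
    # Step h reuses the observation at the same seasonal position.
    return [last_season[(h - 1) % period] for h in range(1, horizon + 1)]


# Toy series with a seasonal period of 3 observations.
print(seasonal_naive([1, 2, 3, 4, 5, 6], period=3, horizon=5))  # [4, 5, 6, 4, 5]
# With period 1, the rule reduces to the NAIVE model.
print(seasonal_naive([1, 2, 3, 4, 5, 6], period=1, horizon=2))  # [6, 6]
```

For hourly data, a period of 24 would correspond to a daily cycle; that value is an assumption here, since the seasonal period used in the study is not stated in this section.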
Seasonal Naive + Decomposition
Let us consider the three-part decomposition of the time series Y(t) of length t,
\[Y(t) = T(t) + S(t) + R(t), \tag{2}\]
where T(t) is the trend of the time series, S(t) is the seasonal component, and R(t) is the remainder/residual of the time series. Although several techniques are available to estimate the components in the decomposition, we consider the STL (Seasonal and Trend decomposition using Loess) for its versatility and robustness. The Seasonal Naive + Decomposition model first removes the seasonality S(t) from the time series Y(t),
\[Y(t) - S(t) = T(t) + R(t), \tag{3}\]
and then uses the NAIVE model to forecast the seasonally adjusted time series. This forecast is added to the seasonal component of the last seasonal period of the time series to obtain the final forecast. The decomposition and forecasts can be obtained by using the stl and naive functions of the R software.
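The decompose, forecast, and re-seasonalize steps can be sketched as below. Note the substitution: to keep the example self-contained, the seasonal component is estimated with simple per-position means instead of STL, so this illustrates the structure of the approach and not the stl and naive functions of R. All names are hypothetical.

```python
def seasonal_means(y, period):
    """Additive seasonal component: per-position mean minus the overall mean."""
    overall = sum(y) / len(y)
    component = []
    for k in range(period):
        values = y[k::period]
        component.append(sum(values) / len(values) - overall)
    return component


def naive_plus_decomposition(y, period, horizon):
    """NAIVE forecast of the seasonally adjusted series, re-seasonalized."""
    seasonal = seasonal_means(y, period)
    adjusted = [v - seasonal[i % period] for i, v in enumerate(y)]
    level = adjusted[-1]  # NAIVE: carry the last adjusted value forward
    n = len(y)
    # Add back the seasonal value for the position of each future step.
    return [level + seasonal[(n + h - 1) % period] for h in range(1, horizon + 1)]


# Toy series alternating between 10 and 20 (period 2).
print(naive_plus_decomposition([10, 20, 10, 20, 10, 20], period=2, horizon=3))
# [10.0, 20.0, 10.0]
```

STL additionally estimates a smooth trend with Loess and is robust to outliers, which the seasonal-means stand-in is not.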
Exponential Smoothing + Decomposition
Exponential smoothing is one of the most used and well-known methods for time series forecasting46. The forecast h steps ahead for the simple exponential smoothing can be written as:
\[\widehat{Y}(t+h|t) = \alpha Y(t) + \alpha (1-\alpha ) Y(t-1) + \alpha (1-\alpha )^2 Y(t-2) + \dots , \tag{4}\]
with \(\alpha \in [0,1]\). In this way, the forecasts are obtained as a weighted average of past observations, with the weights decreasing exponentially as we go back in time. Various versions of exponential smoothing have been proposed to deal with trends and seasonality in time series. In this work, we use the exponential smoothing model automatically selected for the seasonally adjusted series. Further details about exponential smoothing algorithms can be found in ref. 46.
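The exponentially decaying weights can be made concrete with the recursive form of simple exponential smoothing. This is a minimal sketch: initializing the level at the first observation is an assumption made for illustration, and the automatic model selection used in the study is not reproduced.

```python
def ses_forecast(y, alpha, horizon=1):
    """Simple exponential smoothing with the level initialized at y[0]."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    level = y[0]
    for obs in y[1:]:
        # New level: weighted average of the latest observation and old level.
        level = alpha * obs + (1 - alpha) * level
    # Simple exponential smoothing gives a flat forecast for every horizon.
    return [level] * horizon


print(ses_forecast([10, 20, 30], alpha=0.5, horizon=2))  # [22.5, 22.5]
print(ses_forecast([10, 20, 30], alpha=1.0))             # [30.0]
```

With alpha equal to 1 the forecast is the last observation (the NAIVE model), while smaller values of alpha spread the weight over older observations.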
SARIMA
The seasonal autoregressive integrated moving average (SARIMA) models are among the most widely used methods for time series forecasting. They are an extension of the autoregressive integrated moving average (ARIMA) model that adds a linear combination of seasonal values and/or forecast errors. Let Y(t) be a time series. The \(SARIMA(p,d,q)(P,D,Q)_s\) model can be written as
\[\Phi (B)\,\Phi (B^s)\,(1-B)^d\,(1-B^s)^D\,Y(t) = \Theta (B)\,\Theta (B^s)\,\varepsilon (t), \tag{5}\]
where B is the lag operator, defined by \(B^kY(t)=Y(t-k)\), \(\Phi (B) = 1 - \phi _1B^1 - \phi _2B^2 - \dots - \phi _pB^p\) is an autoregressive (AR) polynomial function of order p with vector of coefficients \(\Phi '=[\phi _1,\phi _2,\dots , \phi _p]\), \(\Theta (B)=1+\theta _1B^1+\theta _2B^2+\dots +\theta _qB^q\) is a moving average (MA) polynomial of order q with vector of coefficients \(\Theta '=[\theta _1,\theta _2,\dots , \theta _q]\), \(\Phi (B^s)=1-\phi _{s,1}B^s-\phi _{s,2}B^{2s}-\dots -\phi _{s,P}B^{Ps}\) and \(\Theta (B^s)=1+\theta _{s,1}B^s+\theta _{s,2}B^{2s}+\dots +\theta _{s,Q}B^{Qs}\) are seasonal AR and MA polynomial functions of order P and Q, respectively, that satisfy the stationarity and invertibility conditions, s is the seasonal period, d is the number of differences needed to make the series stationary, D is the number of seasonal differences, and \(\varepsilon (t)\) is white noise (WN), defined as a sequence of uncorrelated random variables with zero mean and constant variance over time, \(\varepsilon (t) \sim WN( 0, \sigma ^2_\varepsilon )\). The parameter estimates of the SARIMA model can be obtained with the arima function of the R software.
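The role of the differencing orders d and D can be illustrated with a small sketch of the operators \((1-B)^d\) and \((1-B^s)^D\) applied to a series. Only the differencing step is shown; estimating the AR and MA coefficients is left to the arima function mentioned above. The function names and the toy series are hypothetical.

```python
def difference(y, lag=1, times=1):
    """Apply the operator (1 - B^lag) to the series `times` times."""
    out = list(y)
    for _ in range(times):
        out = [out[i] - out[i - lag] for i in range(lag, len(out))]
    return out


def sarima_differencing(y, d, D, s):
    """Apply (1 - B)^d (1 - B^s)^D to y: seasonal differences, then regular."""
    return difference(difference(y, lag=s, times=D), lag=1, times=d)


# Toy series: a repeating pattern of period 4 plus a linear drift.
y = [1, 2, 3, 4, 2, 3, 4, 5, 3, 4, 5, 6]
print(difference(y, lag=4))                  # [1, 1, 1, 1, 1, 1, 1, 1]
print(sarima_differencing(y, d=1, D=1, s=4)) # [0, 0, 0, 0, 0, 0, 0]
```

Each seasonal difference shortens the series by s observations and each regular difference by one, which is why the output is shorter than the input.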
NNETAR and NNETAR + Decomposition
The Neural Network AutoRegression (NNETAR) model is an artificial neural network (ANN). ANNs are mathematical models inspired by the behavior of the brain that allow for complex nonlinear relationships between the response variable and its predictors44. A neural network comprises an input layer, one or more hidden layers, and an output layer. In the hidden layers, we find the weights (\(W_i\)), the bias (b), and the activation function, which help to convert the input data into the expected output. The weights are the parameters that determine the intensity with which each neuron affects the others. The bias, in turn, is a parameter used to adjust the output along with the weighted sum of the neuron’s inputs. In each neuron, there is an activation process through the z function47. This process is illustrated by Eq. (6):
\[z = f\left( \sum _{i=1}^{n} W_i x_i + b\right) , \tag{6}\]
where \(x_i\), \(i=1,\dots ,n\), are the inputs of the neuron and f is the activation function.
The forecasts using the NNETAR model and the NNETAR in the seasonally adjusted time series using the STL decomposition can be obtained with the nnetar function of the forecast package in the R software. The model receives the last observations up to time t and performs the forecast for time \(t+1\). To obtain more predictions, the same process is repeated iteratively.
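The two mechanisms described above, the activation of a single neuron and the iterative one-step-ahead forecasting, can be sketched as follows. This is an illustration under stated assumptions and not the nnetar implementation: the logistic activation is an assumed choice, and the moving-average predictor is a hypothetical stand-in for a trained network.

```python
import math


def neuron(inputs, weights, bias):
    """Weighted sum of the inputs plus bias, passed through a logistic activation."""
    weighted_sum = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1.0 / (1.0 + math.exp(-weighted_sum))


def recursive_forecast(history, one_step, n_lags, horizon):
    """Iterate a one-step-ahead predictor, feeding each forecast back as input."""
    window = list(history[-n_lags:])
    forecasts = []
    for _ in range(horizon):
        nxt = one_step(window)
        forecasts.append(nxt)
        window = window[1:] + [nxt]  # drop the oldest lag, append the forecast
    return forecasts


def mean_of_lags(window):
    """Hypothetical stand-in for a trained network: average of the lagged inputs."""
    return sum(window) / len(window)


print(neuron([0.0, 0.0], [1.0, 1.0], 0.0))                                # 0.5
print(recursive_forecast([1.0, 2.0, 3.0], mean_of_lags, n_lags=2, horizon=2))
# [2.5, 2.75]
```

Because each forecast becomes an input for the next step, forecast errors can accumulate as the horizon grows.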
Accuracy measures
To evaluate the performance of the models, two accuracy measures are considered, each computed both for the model fit (using the training data) and for the model forecast (using the test data). Equation (7) defines the root mean squared error (RMSE) and Eq. (8) defines the symmetric mean absolute percentage error (SMAPE). In contrast to the mean absolute percentage error, the SMAPE is bounded below and above, taking values between zero and one.
\[\text {RMSE} = \sqrt{\frac{1}{n}\sum _{i=1}^{n}\left( y_i - \widehat{y}_i\right) ^2}, \tag{7}\]
\[\text {SMAPE} = \frac{1}{n}\sum _{i=1}^{n}\frac{|y_i - \widehat{y}_i|}{|y_i| + |\widehat{y}_i|}. \tag{8}\]
In both equations, n is the number of observations (i.e., length of the training or test data), \(y_i\), \(i=1,\dots ,n\) are the observed real values, and \(\widehat{y}_i\) are the estimated or forecast values.
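The two accuracy measures can be computed as in the sketch below. The SMAPE is written in the variant scaled to lie between zero and one, which is an assumption chosen to match the bounds stated above; other common definitions include a factor of 2 or 200. Treating a term with zero denominator as zero error is also an assumption.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


def smape(actual, predicted):
    """SMAPE scaled to [0, 1]: mean of |a - p| / (|a| + |p|)."""
    terms = []
    for a, p in zip(actual, predicted):
        denom = abs(a) + abs(p)
        # Assumption: both values equal to zero count as a perfect prediction.
        terms.append(0.0 if denom == 0 else abs(a - p) / denom)
    return sum(terms) / len(terms)


print(rmse([1, 1], [3, 3]))   # 2.0
print(smape([1, 1], [3, 3]))  # 0.5
```

The RMSE is expressed in the units of the data, whereas the SMAPE is scale-free, so the two measures can rank models differently.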
Results and discussion
Descriptive analysis
The database includes 43824 hourly observations (5 years, from 2015 to 2019) of \({\text {PM}}_{10}\) concentrations at 29 monitoring stations in the Brazilian state of Minas Gerais. Visualizing such a large dataset is challenging. To better visualize and understand the behavior and patterns of the data, several strategies were used. The weekly average in each monitoring station is presented in Fig. 1S of the “Supplementary material”. In addition, boxplots per hour of the day, per day of the month, per month of the year, and per year are also presented in Figs. 2S–5S of the “Supplementary material”, respectively. In these plots, specific trends and patterns are visible, particularly over the day, over the months, and over the years.
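As a quick check of the series length, the number of hours between the start of 2015 and the end of 2019 can be counted directly. The sketch assumes a complete hourly grid with no daylight-saving gaps, which is an assumption about how the timestamps are laid out.

```python
from datetime import datetime

start = datetime(2015, 1, 1, 0)  # first hour of 2015
end = datetime(2020, 1, 1, 0)    # first hour after the end of 2019

days = (end - start).days        # 1826 days: 5 years including leap year 2016
hours = days * 24
print(days, hours)  # 1826 43824
```

The count matches the 43824 rows per station reported for the cleaned database.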
To present further results, without doing an exhaustive analysis, three monitoring stations were selected. The first is located in Belo Horizonte (BH1), the state capital and the most populous city in the state, where the main sources of atmospheric pollution are traffic and industry. To consider the full range of the observed data in the 29 monitoring stations, the other two monitoring stations that were selected are those with the lowest (Itabira4) and the highest (S.J.daLapa2) average concentration of \({\text {PM}}_{10}\) among the available stations. Figure 2 shows the weekly average behavior of these three monitoring stations. The concentrations of BH1 and Itabira4 are very similar; a notable feature is the year 2019, when the BH1 station shows a marked increase in the average weekly concentration of \({\text {PM}}_{10}\). Among the 29 considered monitoring stations, BH1 and Itabira4 are among those with the lowest average pollution levels. São José da Lapa (S.J.daLapa2), located north of the metropolitan region of Belo Horizonte, has \({\text {PM}}_{10}\) concentrations well above the weekly average of the other two stations, which is likely due to lime and crushed stone factories located in the region. The average concentration of \({\text {PM}}_{10}\) in S.J.daLapa2 is 49.9 \(\upmu\)g/m\(^3\) against 25.37 \(\upmu\)g/m\(^3\) and 22.13 \(\upmu\)g/m\(^3\) in BH1 and Itabira4, respectively.
Figure 2. Average weekly concentration of \({\text {PM}}_{10}\) (in \(\upmu\)g/m\(^3\)) between 2015 and 2019 for one monitoring station located in Belo Horizonte, the capital of the Brazilian state of Minas Gerais (BH1, blue), the monitoring station with the lowest average of \({\text {PM}}_{10}\) concentrations (Itabira4, pink), and the monitoring station with the highest average of \({\text {PM}}_{10}\) concentrations (S.J.daLapa2, green).
Figure 3 shows the behavior of the hourly, daily, monthly, and annual concentration of \({\text {PM}}_{10}\) at the BH1 station. The hourly plot shows a higher concentration between 7 and 10 a.m. and between 6 and 10 p.m. In the monthly plot, higher concentrations of \({\text {PM}}_{10}\) are observed between June and October. There is also an increase in concentrations in the years 2018 and 2019. Figure 4 shows the behavior of the hourly, daily, monthly, and annual concentration of \({\text {PM}}_{10}\) at the Itabira4 station. The hourly graph shows a higher concentration between 6 and 9 a.m. and at the end of the day between 6 and 11 p.m. In the monthly plot, higher \({\text {PM}}_{10}\) concentrations are observed between June and October. Figure 5 shows the behavior of the hourly, daily, monthly, and annual concentration of \({\text {PM}}_{10}\) at the S.J.daLapa2 station. The hourly graph shows a higher concentration between 6 and 9 a.m. and between 5 and 11 p.m. In the monthly plot, higher concentrations of \({\text {PM}}_{10}\) are also observed between June and October.
All boxplots for the hourly, daily, monthly, and annual behavior of the 29 monitoring stations can be seen in Figs. 2S–5S of the “Supplementary material”, respectively.
Model fit
The six models defined above were used for model fit, considering the data from the three monitoring stations described in the previous subsection (BH1, Itabira4, and S.J.daLapa2). Table 1 shows the results of the two accuracy measures, RMSE and SMAPE, for the fit of each model to the data from the three monitoring stations. Based on the RMSE, the best fit was obtained by the NNETAR model for the Itabira4 and S.J.daLapa2 monitoring stations, while the best model for BH1 was the NNETAR + Decomposition. When considering the SMAPE, the results for Itabira4 and S.J.daLapa2 do not change, but for BH1, the best model was the Seasonal Naive + Decomposition.
Model forecasting
A procedure similar to the one used for model fit, now considering the test data, was applied for the model forecast. The same six models were used, considering the data from the three monitoring stations. Table 2 shows the results of the two accuracy measures, RMSE and SMAPE, for the forecasts using each of the six models for the data from the three monitoring stations, BH1, Itabira4, and S.J.daLapa2. The accuracy measures were obtained by considering the last 14 days (336 observations) of each time series as test data. From the analysis of Table 2, it can be seen that the best model to forecast the \({\text {PM}}_{10}\) concentrations in BH1 is SARIMA. For the monitoring station with the highest \({\text {PM}}_{10}\) average, S.J.daLapa2, the Exponential Smoothing + Decomposition was the best forecasting model. On the other hand, for Itabira4, the best forecasting model was the Exponential Smoothing + Decomposition based on the SMAPE and the SARIMA based on the RMSE.
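The hold-out scheme described above, with the last 14 days of hourly data reserved for testing, can be sketched as follows. The function name is hypothetical and the toy series only stands in for one station's 43824 observations.

```python
def holdout_split(y, test_days=14, obs_per_day=24):
    """Split a series into training data and the last `test_days` days as test data."""
    n_test = test_days * obs_per_day
    if n_test >= len(y):
        raise ValueError("series is too short for the requested test window")
    return y[:-n_test], y[-n_test:]


# Placeholder series with the same length as one station's record.
series = list(range(43824))
train, test = holdout_split(series)
print(len(train), len(test))  # 43488 336
```

Keeping the test window at the end of the series preserves the temporal order, so the models are always evaluated on observations that come after the data used to fit them.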
Conclusion
The approach presented in this paper provided a spatio-temporal and descriptive analysis of the behavior of the \({\text {PM}}_{10}\) concentrations in 29 monitoring stations in the Brazilian state of Minas Gerais. The use of boxplots per hour of the day, per day of the month, per month of the year, and per year, allowed us to find specific trends and patterns. Besides the seasonal patterns, an increase in the \({\text {PM}}_{10}\) concentrations was visible in BH1 from 2018 and especially at the end of 2019. S.J.daLapa2 is the monitoring station with the highest average concentration of \({\text {PM}}_{10}\), likely due to lime and crushed stone factories located in the region, with an average concentration of 49.9 \(\upmu\)g/m\(^3\) against 25.37 \(\upmu\)g/m\(^3\) and 22.13 \(\upmu\)g/m\(^3\) in BH1 and Itabira4, respectively.
For the modeling and forecast part of the paper, six standard and more advanced models for time series were considered, as well as three monitoring stations: BH1, located in Belo Horizonte, the capital city of the Brazilian state of Minas Gerais, and the monitoring stations with the lowest and highest average \({\text {PM}}_{10}\) concentration levels. The overall best models for model fit were the NNETAR and NNETAR + Decomposition, and the overall best models for forecasting were the SARIMA and Exponential Smoothing + Decomposition. This difference could be because of the small difference in RMSE and SMAPE between several models in the model fit.
Although the methodologies used in this study have been widely used for time series forecasting in general and to forecast \({\text {PM}}_{10}\) concentrations in particular, no comprehensive study including all monitoring stations in the Brazilian state of Minas Gerais has been made. Therefore the results and analyses presented in this paper, both in terms of model fit to better understand the historical behavior and of model forecast to predict the coming hours and days are of great potential relevance for local governments and policymakers to understand the dynamics of the \({\text {PM}}_{10}\) concentrations and take the necessary action to improve the environment and public health.
Some of the limitations of this study that can be considered as future working directions are: (1) the forecasting models discussed in this paper might not fully capture the whole signal in the data, and others, e.g., based on deep learning48,49 and hybrid methods50,51, can be considered for all 29 monitoring stations in the Brazilian state of Minas Gerais to better understand the overall behavior; (2) the modeling and forecasting are based on univariate time series models without geographical information, and could potentially be improved by considering multivariate and spatio-temporal models52; and (3) the influence of climate variables such as temperature, wind speed, radiation, and humidity is not assessed in this paper, but their use as covariates might help to improve the forecasts and the spatio-temporal modeling approach.
Data availability
The data is available as supplementary material for this paper.
References
Martins, L. C. et al. Poluição atmosférica e atendimentos por pneumonia e gripe em São Paulo, Brasil. Revista de Saúde Pública 36, 88–94 (2002).
Goudarzi, G. et al. Health risk assessment on human exposed to heavy metals in the ambient air PM10 in Ahvaz, Southwest Iran. Int. J. Biometeorol. 62, 1075–1083 (2018).
Makri, A. & Stilianakis, N. I. Vulnerability to air pollution health effects. Int. J. Hygiene Environ. Health 211, 326–336 (2008).
Idani, E. et al. Characteristics, sources, and health risks of atmospheric PM10-bound heavy metals in a populated Middle Eastern City. Toxin Rev. 39, 266–274 (2020).
Wang, J., Hu, Z., Chen, Y., Chen, Z. & Xu, S. Contamination characteristics and possible sources of PM10 and PM2.5 in different functional areas of Shanghai, China. Atmos. Environ. 68, 221–229 (2013).
Guarnieri, M. & Balmes, J. R. Outdoor air pollution and asthma. Lancet 383, 1581–1592 (2014).
Anderson, J. O., Thundiyil, J. G. & Stolbach, A. Clearing the air: A review of the effects of particulate matter air pollution on human health. J. Med. Toxicol. 8, 166–175 (2012).
Roy, D., Seo, Y.-C., Kim, S. & Oh, J. Human health risks assessment for airborne PM10-bound metals in Seoul, Korea. Environ. Sci. Pollut. Res. 26, 24247–24261 (2019).
Maesano, C. et al. Impacts on human mortality due to reductions in PM10 concentrations through different traffic scenarios in Paris, France. Sci. Total Environ. 698, 134257 (2020).
Maleki, H., Sorooshian, A., Goudarzi, G., Nikfal, A. & Baneshi, M. M. Temporal profile of PM10 and associated health effects in one of the most polluted cities of the world (Ahvaz, Iran) between 2009 and 2014. Aeolian Res. 22, 135–140 (2016).
Medina, S., Le Tertre, A. & Saklad, M. The Apheis project: Air pollution and health—A European information system. Air Qual. Atmos. Heal. 2, 185–198 (2009).
Medina, S., Plasencia, A., Ballester, F., Mücke, H. & Schwartz, J. Apheis: Public health impact of PM10 in 19 European cities. J. Epidemiol. Community Heal. 58, 831–836 (2004).
Pérez-Martínez, P. J., de Fátima Andrade, M. & de Miranda, R. M. Traffic-related air quality trends in São Paulo, Brazil. J. Geophys. Res. Atmos. 120, 6290–6304 (2015).
Sánchez-Ccoyllo, O. R. et al. Vehicular particulate matter emissions in road tunnels in Sao Paulo, Brazil. Environ. Monitoring Assess. 149, 241–249 (2009).
Ribeiro, H. & de Assunção, J. V. Historical overview of air pollution in São Paulo Metropolitan Area, Brazil: Influence of mobile sources and related health effects. WIT Trans. Built Environ. 52, 10 (2001).
Bravo, M. A. & Bell, M. L. Spatial heterogeneity of PM10 and O3 in São Paulo, Brazil, and implications for human health studies. J. Air Waste Manag. Assoc. 61, 69–77 (2011).
De Freitas, E. D., Martins, L. D., da Silva Dias, P. L. & de Fátima Andrade, M. A simple photochemical module implemented in rams for tropospheric ozone concentration forecast in the metropolitan area of Sao Paulo, Brazil: Coupling and validation. Atmos. Environ. 39, 6352–6361 (2005).
Encalada-Malca, A. A., Cochachi-Bustamante, J. D., Rodrigues, P. C., Salas, R. & López-Gonzales, J. L. A spatio-temporal visualization approach of PM10 concentration data in Metropolitan Lima. Atmosphere 12, 609 (2021).
do Meio Ambiente, C. N. Institutes the national air quality control program. Tech. Rep., Official Journal of the Federative Republic of Brazil (1989).
do Meio Ambiente, C. N. Sets standards of primary and secondary air quality and even the criteria for acute episodes of air pollution. Tech. Rep., Official Journal of the Federative Republic of Brazil (1990).
Artaxo, P. O estado da qualidade do ar no brasil. Work. Pap. WRI Brasil 32 (2021).
Costa, A. F., Hoek, G., Brunekreef, B. & Ponce de Leon, A. C. Air pollution and deaths among elderly residents of Sao Paulo, Brazil: An analysis of mortality displacement. Environ. Health Perspectives 125, 349–354 (2017).
Bravo, M. A., Son, J., De Freitas, C. U., Gouveia, N. & Bell, M. L. Air pollution and mortality in São Paulo, Brazil: Effects of multiple pollutants and analysis of susceptible populations. J. Exposure Sci. Environ. Epidemiol. 26, 150–161 (2016).
Chiarelli, P. S. et al. The association between air pollution and blood pressure in traffic controllers in Santo André, São Paulo, Brazil. Environ. Res. 111, 650–655 (2011).
Ventura, L. M. B., de Oliveira Pinto, F., Soares, L. M., Luna, A. S. & Gioda, A. Forecast of daily PM2.5 concentrations applying artificial neural networks and Holt-Winters models. Air Qual. Atmos. Heal. 12, 317–325 (2019).
Leão, M. L. P., Zhang, L. & da Silva Júnior, F. M. R. Effect of particulate matter (PM2.5 and PM10) on health indicators: Climate change scenarios in a Brazilian Metropolis. Environ. Geochem. Heal. 44, 1–12 (2022).
Habermann, M. & Gouveia, N. Application of land use regression to predict the concentration of inhalable particular matter in São Paulo City, Brazil. Engenharia Sanit. e Ambiental 17, 155–162 (2012).
Braga, A. L. F., Pereira, L. A. A., Procópio, M., André, P. A. D. & Saldiva, P. H. D. N. Association between air pollution and respiratory and cardiovascular diseases in Itabira, Minas Gerais State, Brazil. Cadernos de Saúde Pública 23, S570–S578 (2007).
Pinto, W. D. P., Reisen, V. A. & Monte, E. Z. Previsão da concentração de material particulado inalável, na região da grande vitória, ES, Brasil, utilizando o modelo sarimax. Engenharia Sanitária e Ambiental 23, 307–318 (2018).
Schornobay-Lui, E. et al. Prediction of short and medium term PM10 concentration using artificial neural networks. Manag. Environ. Qual. An Int. J. 30, 414–436 (2018).
Neto, P. S. D. M. et al. Neural-based ensembles for particulate matter forecasting. IEEE Access 9, 14470–14490 (2021).
Albuquerque Filho, F. S. D., Madeiro, F., Fernandes, S. M., de Mattos Neto, P. S. & Ferreira, T. A. Time-series forecasting of pollutant concentration levels using particle swarm optimization and artificial neural networks. Química Nova 36, 783–789 (2013).
Lei, T. M., Siu, S. W., Monjardino, J., Mendes, L. & Ferreira, F. Using machine learning methods to forecast air quality: A case study in Macao. Atmosphere 13, 1412 (2022).
Yu, T. et al. Study on the regional prediction model of PM2.5 concentrations based on multi-source observations. Atmos. Pollut. Res. 13, 101363 (2022).
Li, J., Xu, G. & Cheng, X. Combining spatial pyramid pooling and long short-term memory network to predict PM2.5 concentration. Atmos. Pollut. Res. 13, 101309 (2022).
Cordova, C. H. et al. Air quality assessment and pollution forecasting using artificial neural networks in Metropolitan Lima-Peru. Sci. Rep. 11, 1–19 (2021).
Plocoste, T., Calif, R. & Jacoby-Koaly, S. Temporal multiscaling characteristics of particulate matter PM10 and ground-level ozone O3 concentrations in Caribbean region. Atmos. Environ. 169, 22–35 (2017).
Calif, R. & Schmitt, F. G. Multiscaling and joint multiscaling description of the atmospheric wind speed and the aggregate power output from a wind farm. Nonlinear Process. Geophys. 21, 379–392 (2014).
Hyndman, R. J. & Khandakar, Y. Automatic time series forecasting: The forecast package for R. J. Stat. Softw. 27, 1–22 (2008).
Harvey, A. C. Forecasting, structural time series models and the Kalman filter (Cambridge University Press, 1990).
Zhang, G. P. Time series forecasting using a hybrid ARIMA and neural network model. Neurocomputing 50, 159–175 (2003).
Liao, T. W. Clustering of time series data—A survey. Pattern Recognit. 38, 1857–1874 (2005).
Bell, M. L., Samet, J. M. & Dominici, F. Time-series studies of particulate matter. Annu. Rev. Public Health 25, 247–280 (2004).
Hyndman, R. J. & Athanasopoulos, G. Forecasting: Principles and Practice (OTexts, 2018).
Box, G. E., Hillmer, S. C. & Tiao, G. C. Analysis and modeling of seasonal time series. in Seasonal analysis of economic time series, 309–344 (NBER, 1978).
Sulandari, W., Suhartono, Subanar & Rodrigues, P. C. Exponential smoothing on modeling and forecasting multiple seasonal time series: An overview. Fluctuation Noise Lett. 20, 2130003 (2021).
Rodrigues, P. C., Awe, O. O., Pimentel, J. S. & Mahmoudvand, R. Modelling the behaviour of currency exchange rates with singular spectrum analysis and artificial neural networks. Stats 3, 137–157 (2020).
Sako, K., Mpinda, B. N. & Rodrigues, P. C. Neural networks for financial time series forecasting. Entropy 24, 657 (2022).
Coelho, Leite et al. Statistical and artificial neural networks models for electricity consumption forecasting in the Brazilian industrial sector. Energies 15, 588 (2022).
Sulandari, W., Subanar, S., Lee, M. H. & Rodrigues, P. C. Time series forecasting using singular spectrum analysis, fuzzy systems and neural networks. MethodsX 7, 101015 (2020).
Sulandari, W. et al. Indonesian electricity load forecasting using singular spectrum analysis, fuzzy systems and neural networks. Energy 190, 116408 (2020).
Rodrigues, P. C. & Mahmoudvand, R. The benefits of multivariate singular spectrum analysis over the univariate version. J. Frankl. Inst. 355, 544–564 (2018).
Acknowledgements
P.C. Rodrigues acknowledges financial support from the CNPq grant “bolsa de produtividade PQ-2” 309359/2022-8, Federal University of Bahia, and CAPES-PRINT-UFBA, under the topic “Modelos Matemáticos, Estatísticos e Computacionais Aplicados às Ciências da Natureza”.
Author information
Contributions
All authors participated in the conceptualization, methodology, software, and manuscript writing.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher's note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
da Silva, K.L.S., López-Gonzales, J.L., Turpo-Chaparro, J.E. et al. Spatio-temporal visualization and forecasting of \({\text {PM}}_{10}\) in the Brazilian state of Minas Gerais. Sci Rep 13, 3269 (2023). https://doi.org/10.1038/s41598-023-30365-w
DOI: https://doi.org/10.1038/s41598-023-30365-w