Air quality is a critical issue related to people’s health and livelihoods, and one of the obstacles to regional economic development and social progress. In addition to air quality monitoring and management, air quality forecasting during periods of polluted weather has also become a focus of environmental management. Especially during major events and heavy pollution emergencies, timely and accurate air quality prediction and pollution source analysis can provide a decision-making basis for management departments. If the exhaust gas emissions of enterprises can be determined according to the requirements of regional ambient air indicators and meteorological conditions, and then it could guide enterprises to adjust production processes accordingly. Air pollution caused by unfavorable meteorological factors can be effectively avoided, and enterprises can expand the production of heavy pollution processes when the weather conditions are favorable. Based on air quality prediction and pollution source analysis, it is of great practical significance to make full use of meteorological conditions to coordinate the relationship between air quality and regional development.

Some scholars at home and abroad have conducted qualitative analysis research on factors affecting air quality from the perspective of the environment, society, and economic activity, considering various factors such as waste incineration, vehicle exhaust emissions, population growth, coal combustion, industrial waste gas discharge and industrial flue gas dust. These studies confirmed that air pollution results from environmental degradation that has been majorly generated from urban population growth, industrial activities, and road fleet1. Industrial waste gas discharge is the main cause of air pollution in developing countries2. Thermal power plants and manufacturing industries are the largest sources of urban air pollution3,4. Other studies have selected meteorological factors such as average temperature, relative humidity, visibility, wind force scale, sun exposure, and wind direction to research the correlation between these meteorological factors and air quality. Research results indicated that average temperature, relative humidity, visibility, and wind force scale are the principal factors that affect air quality5. The variation in pollutant emissions affects an area within a hundred-kilometers radius from the source, depending also on local meteorological and geomorphological conditions6.

There is also quantitative analysis of air quality based on pollutant transport, and diffusion processes. Currently, there are two main types of air quality prediction model: mechanism models and non-mechanism models. Mechanism models involve complex physical and chemical processes, which all possess great uncertainty. They require the establishment of a relatively complete emission source inventory, accurate meteorology fields, and related models of physical and chemical processes, such as pollutant transport and diffusion. Non-mechanism models, represented by statistical models and machine learning models, do not require complex pollutant boundary fields or meteorological boundary fields, nor do they need the investigation of complex mechanism processes generated by the results. This approach can determine the trend of pollution at a certain stage only by the extraction of data characteristics. Compared with mechanism models, non-mechanism models are more convenient and practical. The most commonly applied classical statistical methods mainly include linear and nonlinear models7, multiple regression equation8, time series9, etc. Some conventional machine learning methods that are widely used include support vector machines10, decision trees11, Bayesian networks12, artificial neural networks13, backpropagation (BP) neural networks14, etc. With the continuous development of artificial intelligence, deep machine learning models has been successfully implemented to forecast air quality using time series air pollutant and meteorological datasets with excellent performances15.

Some scholars also look forward to the research on air quality prediction, pointing out that the existing research on the impact of industrial waste gas emissions on air quality is qualitative analysis, and the air quality prediction research ignores the emission information of pollution sources1,2,3. Some extensive studies can be further conducted to gasses emission estimating and its impact on the surrounding environments16. Urban air pollution mainly comes from industry, transportation and daily life. Industrial waste gas discharge are the largest sources of urban air pollution17. Traffic and household emissions are relatively stable and can be regarded as constant, with little impact on fluctuations in air quality. From the daily emissions data of industrial waste gas in Zhangdian District in 2018, it can be seen that the daily emissions of industrial waste gas fluctuate greatly (Fig. 1). Air quality prediction results will inevitably be inaccurate if pollutant variable emissions are not taken into account. Individual studies use source emission inventories, which treated industrial pollutant emissions as constant18,19. The emission inventory of pollution sources is compiled based on the base year. According to the technical guidelines for the compilation of air pollutant emission inventory of various industries, it is mainly calculated by the emission coefficient method. The estimation is difficult to be accurate, and generally lags by 2–3 years. The data is constant and cannot be updated dynamically, and cannot reflect the impact of real-time changes in emissions from pollution sources.

Figure 1
figure 1

AQI and industrial waste gas emission statistics in Zhangdian District.

Because the existing air quality prediction research ignores the real-time emission effect of industrial pollutants in the model establishment, and cannot establish a quantitative correlation between air quality and industrial pollution sources, it cannot expand the application value of air quality prediction. This study uses machine learning algorithms to an air quality forecast model by considering real-time industrial waste gas emissions and meteorological factors as variables. The current weather forecast time frame (15 days) is considered a period. During this period, according to the weather forecast, the daily emission limit of industrial pollution is determined by model inversion. By increasing or reducing the output of polluting processes or sections within the enterprise and balancing the intensity of pollution emissions, it not only ensures that the regional air environment quality remains good, but also meets the company’s supporting production requirements. It not only ensures that the environmental quality meets the standard, but also meets the normal operation of the enterprise. Regarding the selection of model algorithms, the random forest algorithm has several advantages compared with other machine learning algorithms. Firstly, the random forest algorithm can evaluate the importance of input variables and accurately predict output variables20 Besides, it has good anti-noise ability and does not easily fall into the problem of overfitting21. Finally, the random forest algorithm is suitable for modelling high-dimensional data and has strong adaptability to data sets22. However, a key problem with the random forest algorithm is that parameters cannot be accurately optimised. In this paper, we use the “RandomizedSearchCV” and “GridSearchCV” functions to solve this issue and realise the precise optimisation of parameters. Next, the BP neural network, decision tree, and least squares support vector machine (LSSVM) are used to compare their model performance with the random forest algorithm. To eliminate long-term cumulative systematic errors caused by factors such as inter-annual fluctuations in the number of motor vehicles, a multi-step sliding window method (using the first 365 days of data to predict the next day’s AQI) was adopted for the training set. By continuously incorporating measured data of air quality, meteorological conditions and industrial exhaust emissions into the training set and updating the training set in real time, the impact of long-term changes in traffic emissions on air quality can be reflected.

Study area and data

Zhangdian District is located in the middle of Zibo City, Shandong Province. It is located in the junction of the Shandong Zhongshan Mountains and the North Shandong Plain. It belongs to the warm temperate monsoon type semi-dry and semi-humid continental climate. Zibo City is one of the five traditional architectural ceramics production areas in China, and its architectural ceramics enterprises are mainly located in Zhangdian District. Zhangdian District has a total of 60 key industrial enterprises above designated size, including 41 non-metallic mineral products (37 building ceramics enterprises and 4 cement production enterprises), 12 chemical products manufacturing enterprises, 3 non-ferrous metal smelting and processing enterprises, and 4 other enterprises. For reference, a relief of the research area is shown in Fig. 2.

Figure 2
figure 2

Relief amplitude of research area. The map was generated with ArcGIS10.2 (

The input variables are meteorological factors and daily emissions of industrial waste gases, while the output variable is the AQI (Air Quality Index). According to the “Ambient Air Quality Index (AQI) Technical Regulations (Trial)” (HJ 633-2012), the air quality index is divided into 0–50, 51–100, 101–150, 151–200, 201–300 and greater than 300 the six levels, corresponding to the six levels of air quality (excellent, good, light pollution, moderate pollution, heavy pollution and serious pollution). Data sources are shown in Table 1. Measured data of the AQI and the daily emissions of industrial waste gases of major polluters were obtained from Zhangdian District Bureau of Ecology and Environment. Meteorological factors (precipitation, air temperature, relative humidity, wind scale, air pressure, total sunshine intensity and precipitation) were taken from the WheatA-Big Data on Agricultural Meteorology.

Table 1 Data sources.


Establishment of the random forest model

The random forest algorithm is a classification and regression algorithm that integrates multiple decision trees through ensemble learning23. First, the random forest algorithm uses the decision tree as the basic random forest classifier. Then, the second random forest classifier bagging method is used to generate the training data set and a random subspace is used to establish the classification of each strategic decision tree. The third random forest classifier randomly selects some attributes, then divides and combines the optimal attributes of each tree. The introduction of double randomisation makes it difficult for the random forest to fall into overfitting. Besides, there is diversity among classifiers, so the random forest has superior classification and regression performance24.

The AQI prediction model is obtained by fitting training samples. The random forest modelling process is as follows:

  1. (1)

    Define the AQI prediction training set, \(X_{i} \to Y_{i} ,\,\left( {i = 298} \right)\). Here, Yi is the real value in the random forest prediction model, which is mapped to the measured AQI value of the ith sample in the data. Besides, Xi represents meteorological factors and industrial waste gas emissions of the ith sample in the data. The established feature vector, \(\left\{ {I_{i1} ,\,I_{i2} , \ldots I_{in} } \right\} \to X_{i}\) represents the ith sample to the nth impact factor.

  2. (2)

    Based on the training set, establish a single regression decision tree. Through the eigenvector X and its corresponding real value Y in the training sample, search for the splitting variables and splitting values. The regression decision tree divides the whole vector space into M partitions \(\left\{ {R_{1} ,\,R_{2} , \ldots R_{m} } \right\}\). Any partition can be mapped to model Cm, and the vector can be divided into two parts by the value of a feature. The expression is:

    $${R}_{1}\left(j,s\right)=\left\{\left(I|{I}_{j}\le s\right)\right\},$$

    In the above equations, j represents an impact factor and s signifies the value when splitting. The objective function of the vector space split variable and split value search is:

    $$z:\underset{j,s}{\mathrm{min}}\left[\underset{{c}_{1}}{\mathrm{min}}\sum_{{x}_{i}\in {R}_{1}\left(j,s\right)}{\left({y}_{i}-{c}_{1}\right)}^{2}+\underset{{c}_{2}}{\mathrm{min}}\sum_{{x}_{i}\in {R}_{2}\left(j,s\right)}{\left({y}_{i}-{c}_{2}\right)}^{2}\right].$$

    Here, z is the minimum variance of the measured AQI value, yi represents the measured value of AQI in the ith sample, xi is the eigenvector of the ith sample, while c1 and c2 denote the mean value of the measured AQI values in the first and second parts.

  3. (3)

    Construct a complete random forest model on the basis of a single decision tree, where the generated model is a multiple nonlinear regression analysis model. The predicted value of the AQI is the average value of all the predicted values of the decision trees.

Since the random forest algorithm cannot accurately find its optimal parameters, in this paper, the model is enhanced through the “RandomizedSearchCV” and “GridSearchCV” functions to find its optimal parameters. Among them, RandomizedSearchCV is used to obtain the best parameters by randomly selecting parameter values and performing assigned times parameter combinations within the assigned parameter range; GridSearchCV is used to obtain the best parameters by exhaustively running through the given parameter values; CV is used for cross -validation, as well as parameter adjustment. Typically, RandomizedSearchCV is used first to obtain the optimal solution with a high probability of parameters, and then GridSearchCV is used to fine-tune the parameters within a certain floating range to obtain the optimal combination of parameters. “Finding Parameters” in the above figure is what RandomizedSearchCV and GridSearchCV need to do, which is to find the optimal combination of parameters. The specific process is described in Figs. 3 and 4, as follows.

Figure 3
figure 3

Schematic diagram of finding the optimal parameters of random forest model.

Figure 4
figure 4

Flow chart of random forest model.

Importance evaluation of variables

The importance evaluation of variables is a vital part of the random forest algorithm. It can evaluate the influence of input variables on output variables by using the mean square residual reduction in the decision-making process of the random forest. It is the result of continuous analysis and optimisation in the training process of the random forest. Based on various permutations, the mean-square residual reduction (%IncMSE) can be used to measure the influence of corresponding independent variables and is the standard for variable importance scoring25. The following is the calculation method of the mean square residual:

  1. (1)

    Establish a regression tree for each training data set and then use this model to predict the OOB (out of bag) error. The mean square residual of b OOBs can be obtained: \(MSE_{1} ,\,MSE_{2} , \ldots MSE_{b}.\)

  2. (2)

    The number of variables selected by the self-help method in the random forest is random. Each variable Xi can be randomly transposed across b OOB datasets. This creates a new set of OOB tests. When the random forest regression model is used to predict the new test set, the mean square residual of the OOB after random replacement can be obtained. The matrix is as follows:

    $$\begin{array}{ccc}{MSE}_{11}& \cdots & {MSE}_{1b}\\ \cdots & \cdots & \cdots \\ {MSE}_{k1}& \cdots & {MSE}_{kb}\end{array}.$$
  3. (3)

    Next, subtract from line of the equation. Then divide the mean by the standard error to obtain the mean square residual of variable, i.e., the variable importance score. The equation is expressed as follows:

    $${VIM}_{i}\left(MSE\right)=\left(\frac{1}{b}\sum_{j=1}^{b}\left({MSE}_{j}-{MSE}_{ij}\right)\right)/{S}_{E},\left(1\le i\le k\right).$$

Evaluation of model prediction accuracy

In this study, the root mean square error (RMSE), mean absolute error (MAE), and coefficient of determination (R2) were used for comparison between the measured and modelled AQI values26,27. These values can be determined as follows:

$${R}^{2}=\frac{\sum_{i=1}^{k}{\left(\widehat{{y}_{i}}-\overline{y }\right)}^{2}}{\sum_{i=1}^{k}{\left({y}_{i}-\overline{y }\right)}^{2}}.$$

In the above equations, \(\widehat{y}_{i}\) represents the AQI forecast of the ith sample, \(y_{i}\) is the measured AQI value of the ith sample, \(\overline{y}\) denotes the average measured AQI value in all samples, and k is the sample size of the corresponding sample (k = 298).


AQI and variation trend analysis

The graphs in Fig. 5 illustrate how meteorological factors, daily industrial waste gas emissions, and AQI varied in Zhangdian District from 1st January 2017 to 31th December 2019. It can be seen that the period with the largest variations in AQI was from December to March, as there are multiple peaks during this time. The minimum value of AQI during these three months was 13 while the maximum AQI value was 313. Between June and August, there were also significant variations in the AQI. In the other months, the range of change was relatively low, with the AQI remaining around 90. The relative humidity fluctuated greatly from February to May, with an average of 46.2%. However, between July and August, the relative humidity only varied slightly, with an average value of 73.7%. The average temperature in February was the lowest, then from March to August it rose slowly, while from August to October it gradually decreased. The wind scale was relatively stable, although in March and April the wind scale was more erratic. In the other months, the wind scale was generally category 1 or 2. Visibility varied greatly throughout the study period. The average value was about 12.5 km, while the maximum value was 29.5 km and the minimum value was 1.0 km. The average pressure in July was the lowest with an average of 98.7 kPa then from August to December it rose slowly, while from December to July it gradually decreased. Total sunshine intensity varied greatly throughout the study period. The average value was about 15.85 J/m2, while the maximum value was 28.59 J/m2 and the minimum value was 0.66 J/m2. The precipitation fluctuated greatly from February to May, with an average of 46.2%. Finally, the average daily emissions of industrial waste gas over the whole study period were 153 million cubic meters, while the maximum value was 270 million cubic meters and the minimum value was 100 million cubic meters. Average daily emissions of industrial waste gas in 2019 were 33 million cubic meters and 85 million cubic meters more than in 2018 and 2017, respectively, but the AQI annual average in 2019 was lower than both 2018 and 2017. Because Shandong Province implemented several air pollutant emission standards (“Emission standard of air pollutants for building materials industry”, Effective January 1, 2019) (“Emission standard of air pollutants for industrial furnace and kiln”, Effective June 1, 2019) in 2019, stricter pollutant emission concentration limits were implemented.

Figure 5
figure 5

Trends of meteorological factors, industrial waste gas emissions, and AQI.

AQI and variation correlation analysis

To verify that meteorological factors and industrial waste gas emissions affect air quality, we conducted a correlation analysis of AQI, meteorological factors, and industrial waste gas emissions in this paper, with the results presented in Table 2. Results indicate that industrial waste gas emissions were positively correlated with AQI, while visibility were negatively correlated with AQI. A rise in industrial waste gas emissions leads to an increase in AQI and the deterioration of air quality. As the amount of particulate matter in the air increases, it leads to the occurrence of haze and reduces visibility. There is a negative correlation between AQI and precipitation in the year and most seasons. This is because raindrops in the cloud can absorb and absorb pollutant particles, and at the same time, rainwater can wash and wash pollutants, resulting in lower pollutant concentrations, improved air quality, and lower AQI values. The correlation in winter is not obvious, which may be due to less precipitation in winter and uneven spatial and temporal distribution. There is a positively correlation between AQI and air temperature in the year and most seasons. From the seasonal scale, there is no obvious correlation between AQI and air temperature in spring. AQI in summer is significantly positively correlated with air temperature, which may have a certain relationship with the activity of cold and warm air masses, because when warm air masses pass through, the temperature will increase and a large amount of pollutants will accumulate. When the cold air passes through, it will reduce the temperature and often accompanied by wind, which is conducive to the diffusion of pollutants. The activities of cold and warm masses often occur frequently in summer. In autumn, atmospheric turbulence activities will intensify with the increase of air temperature, which will dilute and diffuse pollutants in the vertical direction of the lower layer, and further lead to the decrease of AQI. While rises in temperature can cause the temperature inversion phenomenon in winter, and exacerbating the air pollution problem. Ye’s analysis of Fairbanks confirmed that air temperature and AQI were positively correlated, while visibility was negatively correlated with AQI28. Guo studied the correlation between meteorological factors and AQI and also verified that there was a positive correlation between temperature and AQI29. These are consistent with the correlation analysis results obtained in this paper.

Table 2 Correlation between seasonal and annual AQI and meteorological elements from 2017 to 2019.

Results indicate that the correlation of AQI with other meteorological elements (relative humidity, wind level, Air pressure, Total sunshine intensity and precipitation) is not the same on different time scales, because these meteorological elements vary greatly on different time scales. Taking relative humidity as an example, different scholars have studied the relationship between urban air pollution characteristics and meteorological conditions, and found that some cities have a positive correlation between pollutant concentrations and relative humidity30,31,32,33, and some cities have a negative correlation with relative humidity28,34,35,36,37,38. In Zhangdian District, there are different correlations between AQI and relative humidity in different seasons. There is an obvious positive correlation in winter, a negative correlation in summer, and no correlation in spring and autumn. Under low humidity conditions, the growth of condensation nuclei in the atmosphere aggravates pollution, and under high humidity conditions, it will have a scavenging effect on pollutants due to deposition39. On the other hand, relative humidity is negatively correlated with AQI. The reason may be that when the relative humidity is low, it is often accompanied by strong winds, which is easy to cause sand and dust weather and make the air quality worse. It can be seen that relative humidity is not the dominant factor affecting the development of pollution, and comprehensive judgments need to be combined with pollution emissions, meteorological conditions, and chemical processes.

Results of the random forest model

The data samples selected in this paper include meteorological factors (average temperature, wind scale, relative humidity, and visibility), industrial waste gas emissions, and AQI in Zhangdian District. In this study, we obtained a total of 1095 sets of data, among which 1064 sets of data were used as the training data for the AQI prediction model. The final 31 sets were used as test data to verify the model. The prediction process of the random forest model was implemented using the Python programming platform. In the Python program, we used the “RandomizedSearchCV” function to approximate the random forest algorithm parameters. Then, the “GridSearchCV” function was used to accurately search the parameters of the random forest. The optimal parameters that we obtained are presented in Table 3.

Table 3 Optimal parameters of the random forest model.

The random forest model was established after searching the optimal parameters of the random forest. The last 31 sets of original data were used as samples for prediction. The AQI prediction results are displayed in Table 4, which shows that the predicted AQI values are similar to the measured values, indicating that the predicted results are accurate.

Table 4 AQI prediction results of random forest model.

The AQI predicted by the random forest model was compared with the measured AQI. It can be seen from Fig. 6 that the trend of the predicted and measured AQI is fundamentally the same. Figure 7 illustrates that the R2 value is 0.90, and the scatter points are precisely distributed at both ends of the line, indicating that the linear fitting is accurate. We can conclude that in this region it is effective to use meteorological factors and daily emissions of industrial waste gases to predict the AQI.

Figure 6
figure 6

Comparison of modelled and measured AQI values.

Figure 7
figure 7

Linear fitting of predicted and measured AQI values.

Variable importance evaluation

In this study, we used Python to calculate the mean square residual (%IncMSE) in the random forest algorithm and determine the importance of each input variable. The Python code and parameters are presented in Fig. 8. A larger mean square residual reduction value indicates that the input variable has a larger influence on the output variable. As shown in Table 5, industrial waste gas (X1) was the greatest variable affecting AQI, followed by visibility (X5), relative humidity (X4), total sunshine intensity (X8), air pressure (X7), air temperature (X2) and precipitation (X6). The mean square residual value of the wind scale (X1) is the smallest, indicating that the influence of the wind on AQI is negligible compared with the other variables.

Figure 8
figure 8

Python code for importance evaluation calculations.

Table 5 Tanking of importance of variable.

Model prediction accuracy evaluation

By comparing the random forest algorithm with other machine learning algorithms, we can verify the applicability of the random forest algorithm for air quality prediction in Zhangdian District. In this paper, four kinds of machine learning algorithms were used to predict AQI, and their results were compared to ascertain the most appropriate machine learning algorithm. The RMSE, MAE, and R2 measures were used to evaluate the prediction accuracy of the four machine learning algorithms28. For these algorithms, lower RMSE and MAE values indicate higher prediction accuracy, while the closer the R2 value is to 1, the more accurate the prediction is. The results presented in Table 6 confirm that the prediction accuracy of the random forest model is better than the other three machine learning models, indicating that the random forest model is the most suitable algorithm for the AQI prediction model of Zhangdian District.

Table 6 Model prediction accuracy evaluation.

Control of industrial exhaust emissions based on target AQI

It can be seen from Table 7 that the measured AQI value of this region on 9th December was 263 (heavy pollution), and the industrial waste gas emission on that day was 191.9 million m3. The modelled results show that if the daily industrial waste gas emissions were controlled at 72.8 million m3, the air quality of the day could reach an acceptable level (AQI = 100). Conversely, on 18th December, the measured AQI value of the region was 49. The local meteorological conditions were favourable on this day, so the production time of high-polluting manufacturing processes load could be appropriately increased, and the daily industrial waste gas emissions could be increased by 378.9 million m3.

Table 7 Target industrial emissions at AQI of 100.

It can also be seen from Table 6 that the air quality in this region was poor in December 2019. There were 4 days of heavy air pollution, 3 days of moderate air pollution and 9 days of mild air pollution. According to the rationality of meteorological conditions, the air quality in this area could be maintained in good condition (AQI < 100) by increasing or reducing the industrial exhaust emission. It can also be seen from Table 6 that the total allowable exhaust emission in this area would be decreased by 361 million m3 compared with the actual emission in December 2019. The production capacity of enterprises would be decreased, but it would be better than the direct shutdown. According to Zibo City’s Emergency Plan for Heavy Pollution Weather (implemented in 2021), if the air quality index is greater than 200, these 60 key enterprises will directly stop work and production.

There are a large number of ceramic factories in this region, and there are two main sources of exhaust gases in the production of ceramics. The first is dust from crushing, screening, granulation, and spray drying in the manufacture of preformed moulds, glaze materials, and colouring materials. The second is high-temperature flue gas containing S02 and smoke produced in the operation of various kiln firing equipment. Due to the different operating times of each process in the different factories vary, the collective operational load and pollution load of the processes are not balanced. This leads to great fluctuations in the daily emissions of industrial waste gas. By reducing the scale of “firing” processes and appropriately increasing the level of “raw material preparation” or “moulding” in periods of adverse meteorological conditions, the daily emissions of industrial waste gas can be reduced to ensure that the local environmental air quality is maintained at an acceptable level. On the other hand, increasing the operation of “firing” processes in favourable weather can balance the requirements of enterprises, allowing them to reach production targets. Given this, factories could reasonably adjust their production processes depending on the coming meteorological conditions, especially adverse meteorological conditions, to ensure that the regional environmental air quality is preserved in an optimal state.

Feasibility analyze of enterprise process adjustment

Because the production process of the enterprise has the characteristics of multi-section cooperation, multi-machine parallel, and random “fluctuation” and nonlinear interaction between unit sections, the production process network presents great complexity and uncertainty. Production scheduling optimization research has always been a research hotspot. But the current research mainly focuses on the aspects of profit maximization40, time constraints41, capital constraints42, resource constraints43, energy constraints44, and production equipment constraints45. This research provides a new idea for the optimization of production scheduling in industrial enterprises.

The operation time of each production section within the enterprise is different, and the load is not balanced, so the sections that run every day are also different. The production process of some heavy air pollution industries (surface coating, pharmaceuticals, packaging and printing, building materials production, etc.) has certain discrete and intermittent sections, such as magnetic pole smears in motor manufacturers, purification in pharmaceutical companies, and burning in architectural ceramics companies. into the waiting section. These polluting sections have the characteristics of discontinuous intermittent, and the operation time is flexible and adjustable.

Adjustment of polluting processes or sections in the enterprise: (1) Verify the contribution index or scale model of each polluting process or section of the enterprise to the overall emission of the enterprise; (2) Insert the model index into the ERP (Enterprise Resource Planning) self-made parts material scheduling module to convert the process capability; (3) the process capability is adjusted by bringing the environmental prediction index in one cycle into the process capability calculation.


In this study, a random forest model is used to construct an air quality prediction model in Zhangdian District based on the real-time dynamic emission effect of industrial waste gas-meteorological conditions, and to quantify the impact of industrial waste gas on air quality in the region. Using this model, the daily emission limit of industrial pollution can be determined according to the weather forecast inversion, and the air pollution risk caused by unfavorable meteorological factors can be effectively avoided by adjusting the production capacity of the internal production process of the enterprise. This research actively responds to the “Fourteenth Five-Year Plan for National Economic and Social Development of Zhangdian District and the Outline of Vision 2035”: by promoting the implementation of typical production scenarios, empowering actions, focusing on digital industrial applications, using cloud computing, big data and other new-generation information technologies, and guidelines for building a new industrialized strong city in the country. It provides a new idea for Zhangdian District’s “14th Five-Year Plan” to achieve an average annual growth rate of regional GDP of more than 7% and the harmonious development of industry and environment.