Abstract
Solar and wind resources are vital for the sustainable energy transition. Although renewable potentials have been widely assessed in existing literature, few studies have examined the statistical characteristics of the inherent renewable uncertainties arising from natural randomness, which is inevitable in stochastic-aware research and applications. Here we develop a rule-of-thumb statistical learning model for wind and solar power prediction and generate a year-long dataset of hourly prediction errors of 30 provinces in China. We reveal diversified spatiotemporal distribution patterns of prediction errors, indicating that over 60% of wind prediction errors and 50% of solar prediction errors arise from scenarios with high utilization rates. The first-order difference and peak ratio of generation series are two primary indicators explaining the uncertainty distribution. Additionally, we analyze the seasonal distributions of the provincial prediction errors that reveal a consistent law in China. Finally, policies including incentive improvements and interprovincial scheduling are suggested.
Introduction
To realize China’s carbon neutrality goal proposed in 20201, the installed capacity of renewable energy resources should be significantly increased. As China mentioned in the 2020 Climate Ambition Summit, the installation of wind and solar energy should reach no less than 1.2 Terawatt (TW) in 2030, almost 3 times more than that in 20192, becoming the dominant electricity generation resource. However, due to the salient intermittency and volatility, wind and solar energy operation and modeling face the critical challenges of a high degree of uncertainty, which must be considered in energy research3,4,5.
Various studies have investigated the generalized spatial and temporal characteristics of renewable energy resources in regional areas and compiled standardized test datasets, including statistical analysis studies of current wind and solar resources6,7,8,9,10 and important impact factors of renewable energy generation11, current wind and solar energy resource estimation studies using meteorological data and prediction methods12,13,14, and future wind and solar energy resource assessment studies based on wind speed and solar irradiation data15,16,17,18,19. However, renewable energy resources rely on weather conditions and thus are highly unstable, posing great challenges to accurate and reliable prediction. Some studies have examined the uncertainty of solar and wind power equipped with energy storage to assess their potential to meet future electricity demand20. Prediction methods such as linear regression models and eXtreme Gradient Boosting have been utilized to forecast the uncertainty of wind and solar generation in specific regional areas, considering seasonal or yearly analyses21,22. However, limited research has focused on analyzing the spatiotemporal uncertainty distributions of renewable energy23,24. There are research gaps in terms of error analysis benchmarks that consider long-term, high-granularity, and nationwide scales of wind and solar output prediction, particularly within the context of China.
Error-analysis benchmarks for wind and solar output forecasting are of great value in academic research and industry. First, a prediction error database of the wind and solar output should be obtained via benchmark prediction methods, e.g., neural network-based25, data mining26, and regression methods27. Second, a wide variety of studies, e.g., power system planning and operation28,29,30,31, energy scheduling32,33,34, and market operation and mechanism design studies35,36, must consider the intermittency and volatility of renewable energy resources via robust optimization37,38, stochastic programming39,40, and statistical analysis methods41,42. Third, the prediction error of renewable power determines the revenue risk of power generation companies, especially in markets with deviation punishment. In this regard, prediction error analysis can provide an important reference for the decision-making of intermittent renewables.
The motivation of this work is to develop a year-long error-analysis benchmark for hourly wind and solar generation forecasting in 30 provinces of China, which is expected to constitute a valuable resource and toolkit for market operators or planners. To this end, we use a one-year standard dataset including hourly wind and solar output data for 30 provinces of China11. Here, we establish a rule-of-thumb prediction model to conduct hourly predictions of the wind and solar output in a rolling fashion and to obtain basic prediction datasets. The results reveal the nationwide spatial distribution of the wind and solar energy uncertainty through the prediction error. The first-order difference and peak ratio of output data are determined as primary factors of the prediction error. To further analyze provincial forecasting characteristics, we provide the provincial probability distribution function (PDF) of prediction errors and distribution regularities, the influence of power generation intervals on prediction in each province, and the temporal features of uncertainty via seasonal analysis.
Results
Nationwide analysis of the uncertainty of wind and solar generation
We obtain an error-analysis benchmark for the forecasting of hourly wind and solar output potential in 30 provinces of China in 2016 using the autoregressive integrated moving average (ARIMA) model based on installation and hourly generation data retrieved from our previous study11. The spatial distributions of the wind and solar uncertainty across China are analyzed through the prediction error, as shown in Fig. 1a, b, respectively, excluding Taiwan, Hong Kong, and Macau, as well as wind energy in Tibet and solar energy in Chongqing (unsuitable for wind/solar energy construction10 or data limitations). The prediction error is calculated as the predicted value minus the actual value (please refer to Methods). The wind prediction error ranges from 2.1 to 13.6%, with the largest error in Tianjin (TJ) and the smallest error in Yunnan (YN). The overall prediction error of solar energy is smaller than that of wind energy, ranging from 3.9 to 10.0%, and the largest provincial prediction error is observed in Shanghai (SH), while the smallest provincial prediction error comes from Xinjiang (XJ). Detailed error analysis of wind and solar power for each province is shown in Supplementary Figs. 1–3, respectively. We divide the 30 provinces into four groups according to the wind prediction error: (i) >9%, (ii) 7–9%, (iii) 5–7%, and (iv) <5%. Four groups can also be distinguished in terms of solar energy according to the prediction error: (i) >8%, (ii) 7–8%, (iii) 6–7%, and (iv) <6%. The details of each group are provided in the Supplementary Information (SI).
The results demonstrate that, except for Southwest China, the wind prediction error in the other regions is relatively large, especially large in the eastern area, i.e., Shandong (SD), SH, Jiangsu (JS), Anhui (AH), and Henan (HA), and Northern area including Beijing (BJ), TJ, Liaoning (LN), Jilin (JL), Shanxi (SX), and Hebei (HE), ranging from 8.0 to 11.3% and 5.3 to 13.6%, respectively. These two areas account for 25.0% and 27.9%, respectively, of the total prediction error in China. Regarding solar energy, the prediction error is concentrated in the areas of Central China covering Ningxia (NX), Shaanxi (SN), Hubei (HB), Jiangxi (JX), and Hunan (HN), North China, and East China, ranging from 6.2 to 9.0%, 7.2 to 9.3%, and 6.8 to 10.0%, respectively, accounting for 17.5%, 25.0%, and 19.1%, respectively, of the total prediction error in China.
We compare the prediction errors of various methods, including random forest (RF), recurrent neural network (RNN), fully-connected neural network (FCNN), and support vector machine (SVM), for predicting nationwide renewable energy output. The results are presented in Fig. 1c and Supplementary Table 1. Our observations indicate that although each method demonstrates varying prediction error distributions across different provinces, the overall nationwide prediction errors remain similar among all methods, ranging from 6 to 9%. Further details can be found in the SI. Notably, ARIMA and RNN exhibit similar prediction errors and outperform other methods, benefiting from their inherent ability to effectively handle time series data. In the following part of this paper, we focus on the prediction error with the ARIMA model as a benchmark method.
Moreover, we examine the impact of the prediction time scale on the distribution of nationwide prediction errors for both wind and solar energy, as illustrated in Fig. 1d. We observe that prediction error increases with the prediction time scale, with a 2-h prediction resulting in a 3.40% error for solar and a 2.83% error for wind, a 6-h prediction resulting in a 6.14% error for solar and a 6% error for wind, and a 24-h prediction resulting in a 9.25% error for solar and a 10.86% error for wind. A detailed analysis of each hour’s prediction error reveals that the error mainly originates from the ending periods, e.g., during 5–6 h for the 6-h ahead predictions and during 15–24 h for the 24-h ahead predictions.
Key factors affecting prediction errors
Two statistical indicators are proposed to explore the factors impacting prediction errors. Due to the irregular distribution of the wind output and the daily periodicity of the solar output, we use hourly and daily output data to analyze the wind and solar prediction errors, respectively (Methods and Supplementary Fig. 4). We use the coefficient of determination (CoD) \({R}^{2}\), which measures the linear correlation, to quantify the relationship between the prediction error and various factors. The installed capacity is independent of the prediction error, with \({R}^{2}=0.002\) for wind energy (Fig. 2a) and \({R}^{2}=0.076\) for solar energy (Fig. 2b). In addition, the power generation reflected by the bubble size exhibited no correlation with the prediction error (Fig. 2a, b).
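For readers who want to reproduce this kind of check, the following is a minimal sketch of the coefficient of determination for a straight-line fit between an explanatory factor and the provincial prediction error. The function name `r_squared` and the ordinary least-squares fit are illustrative assumptions; the source does not state how the fit was implemented.

```python
def r_squared(x, y):
    """Coefficient of determination of the least-squares line y ~ a + b*x.

    Illustrative helper (not from the source): x could hold a provincial
    indicator such as installed capacity, y the provincial prediction error.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    syy = sum((yi - mean_y) ** 2 for yi in y)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares around the fitted line.
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    return 1.0 - ss_res / syy
```

A value near 0 (as reported for installed capacity) means the line explains almost none of the variation in the error, while a value near 1 indicates an almost perfectly linear relationship.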
As shown in Fig. 2c, d, the first-order difference, i.e., the series of changes from one period to the next, is a major influencing factor of the prediction error. The relationship between the prediction error and first-order difference is approximately linear. Regarding wind power, the relationship between the prediction error and hourly first-order difference yields \({R}^{2}=0.988\) (Fig. 2c), while the daily first-order difference does not impact the wind prediction error (please refer to the bubble size in Fig. 2c). Regarding solar power, the CoD between the prediction error and the daily first-order difference is \({R}^{2}=0.676\) (Fig. 2d). The hourly first-order difference, however, does not reflect the prediction error, as indicated by the bubble size in Fig. 2d. The reason is that wind power prediction is conducted hour-by-hour, and the daily wind power generation is irregular and cannot reflect the hourly wind generation pattern. Solar power generation, in contrast, varies with a daily period, and the characteristics of the hourly first-order difference can be masked by this daily periodicity.
Another significant factor influencing the prediction error is the peak ratio, which reflects the frequency of the tendency changes in the power output series, with CoD \({R}^{2}=0.967\) for the hourly wind output (Fig. 3a) and \({R}^{2}=0.558\) for the daily solar output (Fig. 3c). Similar to the first-order difference, wind and solar energy differ in their hourly and daily features. To further explore the impact of different power generation levels on the prediction error, we evenly divide the installed generation capacity into 10 intervals. We also select a representative province in each wind and solar energy category for detailed analysis. The representative wind energy provinces are TJ, SD, SX, and Gansu (GS); the representative solar energy provinces are BJ, JS, HB, and Inner Mongolia (IM). We express the peak distribution in each power generation interval as a frequency (Fig. 3b for wind energy and Fig. 3d for solar energy). Regarding wind energy, peaks in provinces with a large prediction error (TJ: 13.6% and SD: 8.9%) occur in both higher and lower power intervals, and the frequency in each interval fluctuates around 10%. However, in provinces with a small prediction error (SX: 5.4% and GS: 4.2%), peaks are concentrated in the lower power intervals 1 to 4, which account for 76.76% and 83.48% of the peaks, respectively. In contrast, solar energy peaks are mainly located in higher power intervals, with the peaks in intervals above 4 accounting for 62.59%, 59.38%, 64.90%, and 89.61% in BJ, JS, HB, and IM, respectively.
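The interval bookkeeping described above can be sketched as follows. This is an illustration only: the function names are hypothetical, and the convention that each interval includes its lower bound (with full output assigned to interval 10) is an assumption not stated in the source.

```python
def interval_index(output, capacity, n_intervals=10):
    """Map an output level to an interval number 1..n_intervals.

    Assumption: intervals are equal slices of installed capacity and each
    interval includes its lower bound; output == capacity falls in the last one.
    """
    k = int(output * n_intervals / capacity) + 1
    return min(max(k, 1), n_intervals)


def share_by_interval(outputs, capacity, n_intervals=10):
    """Fraction of the given observations (e.g. peak hours) in each interval."""
    counts = [0] * n_intervals
    for value in outputs:
        counts[interval_index(value, capacity, n_intervals) - 1] += 1
    total = len(outputs)
    return [c / total for c in counts] if total else [0.0] * n_intervals
```

Passing the output levels at which peaks occur yields a peak frequency per interval of the kind plotted in Fig. 3b, d.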
Temporal analysis of provincial prediction errors
We examine the PDF and prediction error in each province within the above 10 power generation intervals to analyze further the spatial characteristics of the prediction error (Fig. 4 and Supplementary Table 2). The results reveal that the more concentrated the PDF is within a certain interval, the smaller the prediction error within this interval. In terms of wind generation, the average prediction error within interval 1 in TJ is small (only 10.6%), and the PDFs within this interval are concentrated from intervals 1–4; in contrast, the prediction error within interval 8 reaches 21.5%, and the PDF within this interval is distributed across almost all intervals. The prediction error within each interval also reflects the variance and fluctuation magnitude within the interval. As shown in Fig. 4a, the average prediction error within interval 8 in TJ is larger than that within interval 1, and the fluctuation range within these two intervals is 0–72.1% with a variance of \(404.2\), and 0–32.9% with a variance of \(134.5\), respectively.
As illustrated in Fig. 4 and Supplementary Table 2, we also find that in most provinces with large prediction errors, the wind and solar prediction errors are concentrated in high power intervals. The proportions of the prediction error arising in intervals above 5 in TJ for wind energy, SD for wind energy, SX for wind energy, BJ for solar energy, JS for solar energy, and HB for solar energy are 64.9%, 64.0%, 60.3%, 61.2%, 56.9%, and 53.4%, respectively. This phenomenon is more obvious for wind energy because solar power never reaches full generation, and there is almost no solar power generation within intervals 9–10. In contrast, the prediction errors in provinces with a small prediction error are distributed almost equally among all intervals, e.g., the wind prediction error within each interval in GS ranges from 8.3 to 22.8%. This occurs because high power generation generally exhibits peak or inflection points, which fluctuate wildly and are difficult to predict. The proportion of peaks within each interval is provided in Supplementary Table 3. Thus, the uncertainty of power generation can be intuitively assessed based on the power generation level.
We also analyze the seasonal characteristics of the generation uncertainty of solar and wind power on a provincial level. Here, we compare the provincial prediction error in spring, summer, autumn, and winter. Nationally, we determine that spring and summer are dominant seasons for wind uncertainty, accounting for 55.48% of the total prediction error (Fig. 5a), and spring and winter are dominant seasons for solar uncertainty, accounting for 57.6% of the total prediction error (Fig. 5c). The provincial characteristics are also similar, as illustrated in Fig. 5b,d. The wind uncertainties in TJ and SD in spring and summer account for 59.9% and 57.4%, respectively, of the total prediction error; the solar uncertainties in BJ, HB, and IM in spring and winter account for 60.4%, 58.0%, and 63.9%, respectively, of the total prediction error. This occurs because solar irradiation in summer and autumn is sufficient with fewer rainy days, resulting in more stable solar power generation and relatively accurate prediction results.
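The seasonal shares quoted above are proportions of the total prediction error. A minimal sketch of that bookkeeping is given below; the function name is hypothetical, and the caller is assumed to supply a season label for every hour, because the source does not state which months define each season.

```python
from collections import defaultdict


def seasonal_error_share(errors, seasons):
    """Share of the summed prediction error attributed to each season.

    errors  : per-hour prediction errors (non-negative values assumed)
    seasons : per-hour season labels of the same length (caller-defined)
    """
    totals = defaultdict(float)
    for err, season in zip(errors, seasons):
        totals[season] += err
    grand_total = sum(totals.values())
    return {season: value / grand_total for season, value in totals.items()}
```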
Discussions
We provide an error-analysis benchmark for hourly wind and solar generation in 30 provinces of China with significance for research, industry, and policy decision-making. The proposed benchmark reveals statistical characteristics of wind and solar uncertainty, which are indispensable for academic research. First, it can help to build the PDF of wind and solar generation, providing a scenario basis for stochastic economic dispatch43. Energy scheduling may also use renewable generation and treat the prediction errors as a probability distribution44. Second, the benchmark is applicable to robust optimization, because the best- and worst-case operating conditions can be obtained from the prediction results. It can also replace assumed prediction errors to generate reasonable probability distributions that are used as expectation terms in optimization formulations45,46. Third, risk assessment can also benefit from the benchmark, as the security region of power systems can be depicted based on the prediction results and errors47. Without such a benchmark, most of these studies rely on assumed renewable generation and prediction errors. In industry, the benchmark serves as a guiding reference for intuitive analysis of resource distributions and fluctuations, which could help to evaluate the investment revenue and risk of renewable projects. If prediction errors are large and renewable generation is unstable, renewable projects carry more risk, and the investment should be reduced. In addition, policy-makers and system planners need the information contained in the benchmark when determining development strategies for cleaner energy systems.
An emerging and valuable issue is the implementation of energy storage devices to mitigate the power balance stress in power systems with an increasing share of renewable resources48,49, and the optimal sizing and setting of energy storage devices rely heavily on the spatial and temporal uncertainties of renewable generation. In this paper, we focus on the inherent uncertainty of renewable generation, and the forecasting errors are obtained merely by time-series analysis. In practice, the prediction errors of renewable generation may be impacted by more complicated factors such as weather forecasting quality and operational curtailment strategies. In some application scenarios, forecasting tools may be deliberately conservative and therefore produce asymmetric errors. For instance, a system operator tends to forecast renewable generation conservatively for the sake of system reliability. These practical factors may lead to deviations in the distribution of the forecasting error, and can be incorporated into the analysis by replacing the benchmark forecasting model with a more realistic one, which deserves an in-depth investigation in the future.
The statistical analysis indicates that the first-order difference and peak ratio of renewable generation are two primary influencing factors of prediction errors, both reflecting fluctuations in power generation. The wind prediction error is affected by the hourly power generation because the prediction model is employed based on the irregular hourly wind output. In contrast, the solar prediction error is affected by daily fluctuations since solar generation exhibits daily periodicity.
Our results reveal the provincial distribution of the uncertainty of wind and solar generation, indicating different priorities for renewable energy development in different areas. Some of the top 10 provinces with the largest wind prediction error are TJ, SH, JS, and AH, with values of 13.6%, 11.3%, 9.6%, and 8.4%, respectively. In contrast, the solar prediction error in these provinces is 9.0%, 10.0%, 7.1%, and 6.8%, respectively, which indicates that JS and AH should prioritize the development of solar energy due to the small prediction errors and fluctuations. SH and TJ are commercial provinces with small areas and are not suitable for wind and solar energy development. YN, Fujian, GS, Zhejiang (ZJ), and Guizhou (GZ) should develop wind energy due to their small prediction errors of 2.1%, 2.6%, 4.2%, 4.9%, and 3.8%, respectively. ZJ, SX, GZ, and SH are some of the top 10 provinces with larger solar prediction errors, namely, 7.1%, 7.2%, 7.4%, and 10.0%, respectively, while the wind prediction errors in ZJ, SX, and GZ reach 4.9%, 5.3%, and 3.8%, respectively, and the potential wind capacity factor for Sichuan and GZ is approximately 15–25%10. Therefore, wind energy development in these provinces is a recommended pathway to reduce the adverse impact of renewable generation on power system operation.
The temporal analysis demonstrates that renewable generation in spring exerts the greatest impact on the power system, requiring the proactive deployment of flexible resources. Combined with the spatial distribution, the solar prediction error in North China in winter is large, ranging from 9.3 to 11.4%, with an average value of 10.4%, which exceeds the year-round provincial prediction error of 3.9–10.0%, with an average value of 6.7%. As the Chinese government has issued the Electric Heating Policy to provide heat in North China in winter, the load demands in the power sector have increased significantly50. The flexibility-adjustable resources and volatility on the power source side exhibit inverse distributions, which have become a central problem in the consumption of renewable energy in these regions. In contrast, Southeast China achieves the smallest prediction error in regard to both wind and solar energy in winter, with average values of 2.8% and 5.1%, respectively. Additionally, existing research has suggested abundant offshore wind power resources in the area, with wind capacity factors higher than 50%, ranking almost at the top in China10,11. Due to the obvious seasonal distribution of offshore wind power, which dominates in spring and winter51, wind power represents a suitable alternative resource to offset the winter load peak in North and Northeast China.
Based on the prediction error analysis, we summarize two policy suggestions for China. First, the government should provide adequate policy support and incentives to encourage wind energy development in the Southwestern and Central areas of China and solar energy development in the areas of Southwest and Northwest China. These areas experience limited fluctuations in wind and solar generation, around 2.1–6.4% and 4.3–7.4%, respectively, reducing the adverse impact on the power system. However, the current installed capacities in these regions are insufficient, even lower than those in the East area, which has less land. Second, the government should plan interprovincial energy transmission in the space dimension to reduce the winter load peak in North China and reduce the adverse impact of renewable energy. As concluded, the wind and solar fluctuations in North China are notable, accounting for 28.1% and 25.0%, respectively, of the total prediction error in China, especially during winter, with proportions of 27.4% and 27.7%. However, during spring and summer, much energy consumption can be satisfied by renewable energy, resulting in an imbalance between seasons and requiring additional energy sources. As such, the government should improve the power system infrastructure, systematically evaluate potential transmission projects, and plan additional power lines according to the resource and load distribution.
Methods
Wind and solar output data
Hourly wind and solar output data for 2016 pertaining to 30 provinces of China are retrieved from previous work11, except for Tibet wind, Chongqing solar, Taiwan, Hong Kong, and Macau. The dataset contains 8760 h of wind and solar output data, and wind and solar installed capacity data for these 30 provinces are included. We denote the hourly wind output as \({W}_{i,t,0}\) and the hourly solar output as \({S}_{i,t,0}\), where i and t are province and time slot indices, respectively, for \(i\in [1,N],t\in [1,T]\), \(N=30\), and \(T=8760\). As previously mentioned, daily wind and solar output data are also required for the analysis, which can be calculated as Eqs. (1)-(2):
where \({S}_{{{{\mbox{Day}}}},i,c,0}\) and \({W}_{{{{\mbox{Day}}}},i,c,0}\) are the daily solar and wind output, respectively, of province i on day c, and c is a day index, for \(c\in \left[1,{C}\right]\) and \(C=365\).
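The aggregation from hourly to daily output can be sketched as below. The sketch assumes that the daily output is the sum of the 24 hourly values of that day, which is one natural reading of the definitions above; the function name `daily_totals` is illustrative.

```python
def daily_totals(hourly, hours_per_day=24):
    """Aggregate an hourly output series into daily totals.

    Assumption: the daily output of day c is the sum of its 24 hourly outputs.
    """
    if len(hourly) % hours_per_day != 0:
        raise ValueError("series length must be a whole number of days")
    return [
        sum(hourly[start:start + hours_per_day])
        for start in range(0, len(hourly), hours_per_day)
    ]
```

Applied to a series of 8760 hourly values, this returns 365 daily values, matching \(T=8760\) and \(C=365\).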
Benchmark prediction model
Time series prediction is based on historical data. The autoregressive (AR), moving average (MA), and autoregressive moving average (ARMA) techniques are typical methods for studying stationary time series and are suitable for a large number of problems. However, the fluctuations in wind and solar energy indicate that their power generation involves a nonstationary time series with a time-varying mean value and variance, which is difficult to study with these methods. Thus, to predict nonstationary sequences, the ARIMA prediction model was introduced by Box and Jenkins. By applying a certain number of differences, the ARIMA prediction model converts the wind and solar power generation series into a stationary series, which is convenient for prediction analysis. In the literature, the ARIMA model is widely used in short-term renewable forecasting and is validated to yield satisfactory results.
In prediction model construction, it is necessary to first determine whether the series is stationary. If the series is not stationary, it should be differenced until it meets the stationarity requirements. Let the real wind and solar power generation series be \({Y}_{t}\) and the differencing order be d; the differencing process can then be expressed as Eq. (3):
where \({X}_{t}\) is the stationary series obtained from the original real data, B is the lag operator, and \({{{{{\rm{ADF}}}}}}{{{{{\rm{test}}}}}}=1\) indicates that the series passes the augmented Dickey-Fuller (ADF) stationarity test. In addition to the differencing order d, the ARIMA model requires the autoregressive order p and moving average order q, and the ARMA model for \({X}_{t}\) can be expressed as Eq. (4):
where \({\varphi }_{i}\) and \({\mu }_{i}\) are the autoregressive parameter and moving average parameter, respectively, \({\alpha }_{t}\) is white noise with a mean of 0, \({\mu }_{0}\) is a deterministic trend quantity greater than 0, and \({B}^{i}\) is the ith power of B. Using the prediction model, we obtain the predicted series \({X}_{{{{{{\rm{predict}}}}}},t}\), which is the differenced series of the predicted wind and solar power generation. Thus, the predicted power generation can be obtained through Eq. (5):
where \({Y}_{{{{{{\rm{predict}}}}}},t}\) denotes the predicted results of the ARIMA-based prediction model, and in this paper, this variable indicates the wind and solar output.
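The two bookkeeping steps around the ARMA model, differencing the observed series d times and converting a forecast of the differenced series back to generation levels, can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: the function names are hypothetical, and the forecast of the differenced series is taken as given (in practice it would come from a fitted ARMA model).

```python
def difference(series, d):
    """Difference a series d times.

    Returns the differenced series and, for each differencing level, the last
    observed value at that level (needed later to undo the differencing).
    """
    tails = []
    x = list(series)
    for _ in range(d):
        tails.append(x[-1])
        x = [b - a for a, b in zip(x, x[1:])]
    return x, tails


def integrate_forecast(diffed_forecast, tails):
    """Turn a forecast of the d-times differenced series into level forecasts."""
    y = list(diffed_forecast)
    for tail in reversed(tails):
        restored, level = [], tail
        for step in y:
            level += step          # cumulative sum starting from the last observation
            restored.append(level)
        y = restored
    return y
```

For example, the history 1, 3, 6, 10 has constant second differences of 1; forecasting two further second differences of 1 and integrating them back yields 15 and 21, the natural continuation of the series.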
There are three major parameters of the ARIMA-based prediction model: differencing order d, autoregressive order p, and moving average order q. Parameter d is determined based on the minimum number of differences required to obtain a stationary time series. The d value is generally smaller than three because the greater the differencing order, the more information is lost52. It should be noted that parameter d is completely determined by the properties of the original sequence, while the selection of p and q should consider the overall prediction effect. In general, p and q should remain within 1/5 of the length of the input data. Due to the large amount of wind and solar power generation data in each province in one year, usually 8760 h, we split the series of each province into multiple prediction windows and use the moving window method to predict wind and solar power generation. At present, the methods for p and q determination usually include the Akaike information criterion (AIC) and Bayesian information criterion (BIC), but they can only provide the optimal parameter configuration for a single prediction window. To unify the prediction models across the different prediction windows in the same province and minimize the prediction error, we randomly select 5 weeks of data throughout the year as a sample and traverse p and q for each province to obtain the best parameters with the minimum prediction error. The detailed parameters for each province are listed in Supplementary Table 4.
Other parameters, such as the autoregressive parameter \({\varphi }_{i}\) and moving average parameter \({\mu }_{i}\), can vary with the input data. These two parameters are determined by the autocorrelation coefficient and autocovariance, respectively, which can be obtained with the Yule–Walker estimation, least squares estimation or maximum likelihood estimation method53. In this paper, we build the ARIMA-based prediction model, and all the parameters except p, d, and q could be automatically generated.
In this paper, we set 6 h as the prediction time scale and 168 h as the input data dimension to predict wind and solar power generation. The reason is that 6-h-ahead forecasts of renewable generation are widely used for power system scheduling and electricity trading in practice. The 6-h-ahead forecast also yields moderate errors that can serve as a benchmark for the uncertainty analysis.
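The moving-window procedure with a 168-h input and a 6-h horizon can be sketched as below. Two points are assumptions rather than statements from the source: the window is advanced by one full horizon after each forecast, and `persistence_forecaster` is only a stand-in that repeats the last observation; a fitted ARIMA model would be plugged in through the same interface.

```python
def persistence_forecaster(window, horizon):
    """Stand-in model (NOT the ARIMA model of the text): repeat the last value."""
    return [window[-1]] * horizon


def rolling_forecast(series, window=168, horizon=6, forecaster=persistence_forecaster):
    """Forecast `horizon` steps from each trailing `window` of observations.

    Returns a dict mapping time index -> predicted value. The window is moved
    forward by `horizon` steps after each forecast (an assumption).
    """
    predictions = {}
    start = window
    while start + horizon <= len(series):
        history = series[start - window:start]
        for k, value in enumerate(forecaster(history, horizon)):
            predictions[start + k] = value
        start += horizon
    return predictions
```

The first 168 hours have no prediction because they only serve as input to the first window.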
Comparative prediction models
In this paper, we compare four prediction methods including RF, FCNN, RNN, and SVM. These four methods are all sample-based prediction approaches. We begin by constructing the samples using 168-h wind and solar generation data as input features and extracting subsequences of 2, 6, and 24 h as output for 2-h, 6-h, and 24-h step predictions, respectively. The RF method employs a tree-based prediction model that builds multiple decision trees during training. The structure of the decision trees is determined by parameters such as tree depth, the number of trees, and the maximum number of features considered when splitting nodes. The FCNN method utilizes a network structure consisting of interconnected perceptrons. Each time slot’s generation data serves as an input feature for the FCNN, and the predicted generation is the output. The network structure is designed based on factors such as regularization, batch size during training, learning rate, and the number of neurons in each layer. The RNN is a neural network structure specifically designed for time series data, incorporating hidden variables to carry information from previous time slots. Similar to the FCNN, the RNN’s network structure is determined by parameters including the number of neurons, batch size, and learning rate. The SVM is a classical machine learning method that solves an optimization problem to find an optimal hyperplane separating the dataset. Key considerations for SVM include regularization parameters, the margin of tolerance around predicted regression values, and the influence attributed to each sample. Further details on the network parameters and the tuning process can be found in the Supplementary Note and Supplementary Table 5.
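The sample construction shared by the four methods can be sketched as a sliding window over the generation series. The function name and the stride of one hour between consecutive samples are assumptions; the source specifies only the 168-h input and the 2-, 6-, or 24-h output.

```python
def build_samples(series, input_len=168, output_len=6):
    """Cut a series into (features, targets) pairs for supervised learning.

    features : input_len consecutive hourly values
    targets  : the output_len values that immediately follow
    Consecutive samples are shifted by one hour (an assumption).
    """
    samples = []
    last_start = len(series) - input_len - output_len
    for start in range(last_start + 1):
        features = series[start:start + input_len]
        targets = series[start + input_len:start + input_len + output_len]
        samples.append((features, targets))
    return samples
```

Calling it with `output_len` set to 2, 6, or 24 gives the training pairs for the three prediction time scales.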
Prediction error calculation
In this paper, the prediction error of wind and solar energy is expressed per megawatt (MW) of installed capacity. Using the ARIMA-based benchmark prediction model, we obtain the predicted wind and solar energy generation, and the prediction error can then be calculated as Eq. (6):
where \({\varepsilon }_{{{{{{\rm{W}}}}}},i,t}\) and \({\varepsilon }_{{{{{{\rm{S}}}}}},i,t}\) are the wind and solar prediction error in province i in time slot t, \({W}_{i,t,*}\) and \({S}_{i,t,*}\) are the predicted wind and solar output, respectively, of province i in time slot t, and \({C}_{{{{{{\rm{W}}}}}},i}\) and \({C}_{{{{{{\rm{S}}}}}},i}\) are the wind and solar installed capacities, respectively, in province i. When determining the prediction error in a given province, we calculate the average value over 8760 h.
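A minimal sketch of the provincial error calculation follows. It assumes that the hourly error is the absolute difference between predicted and actual output divided by installed capacity; the absolute value is an assumption made here because the provincial averages are reported as positive percentages, and the function name is illustrative.

```python
def mean_prediction_error(predicted, actual, capacity):
    """Average capacity-normalized prediction error, in percent.

    Assumption: the hourly error is |predicted - actual| / installed capacity.
    """
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    errors = [abs(p - a) / capacity * 100.0 for p, a in zip(predicted, actual)]
    return sum(errors) / len(errors)
```

With a capacity of 200 MW, predictions of 50 and 70 MW against actual outputs of 40 and 90 MW give hourly errors of 5% and 10%, so the average is 7.5%.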
First-order difference
The first-order difference can be used to assess the variation in discrete time-series data. With the use of the first-order difference, we can obtain the increment in the original data, which can reflect gradient information. In this paper, prediction is conducted hour-by-hour, and the prediction accuracy is primarily determined by the hourly change in the generation data. Thus, in terms of wind energy, we use the first-order difference of hourly wind generation data to measure the hourly change, which can be calculated as Eq. (7):
where \(F_{\mathrm{H},i,t}\) is the hourly first-order difference in province i in time slot t and \(W_{i,t+1,0}\) and \(W_{i,t,0}\) are the real wind energy generation in time slots t + 1 and t, respectively. When evaluating the hourly first-order difference in a province, we calculate the average value over 8760 h.
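A minimal sketch of the hourly first-order difference follows. The per-slot difference is taken directly from the definition above; averaging the absolute values (so that rises and falls do not cancel) is an assumption, since Eq. (7) is not reproduced in this text. The function name is hypothetical.

```python
import numpy as np

def hourly_first_order_difference(wind):
    """First-order difference of an hourly wind generation series.

    diff[t] = wind[t + 1] - wind[t]. The provincial indicator is taken here
    as the mean absolute difference, which is an assumption.
    """
    wind = np.asarray(wind, dtype=float)
    diff = np.diff(wind)
    return diff, np.abs(diff).mean()
```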
Regarding solar energy, power generation exhibits daily periodicity, so we use daily solar energy generation data to measure the fluctuation, which can be expressed as Eq. (8):
where \(F_{\mathrm{Day},i,c}\) is the daily first-order difference in province i on day c. We also calculate the average value over 365 days to evaluate the solar energy fluctuations in a given province.
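The daily variant can be sketched in the same way. The example assumes that daily generation is the sum of the 24 hourly values of each day and that the provincial indicator is the mean absolute day-to-day change; Eq. (8) is not reproduced in this text, so both choices are assumptions, and the function name is hypothetical.

```python
import numpy as np

def daily_first_order_difference(solar_hourly, hours_per_day=24):
    """Day-to-day change in solar generation for one province.

    Assumptions: daily generation is the sum of the hourly values of each
    day, and the indicator is the mean absolute daily difference.
    """
    solar = np.asarray(solar_hourly, dtype=float)
    n_days = len(solar) // hours_per_day
    # Drop any incomplete trailing day, then total each day's generation.
    daily = solar[:n_days * hours_per_day].reshape(n_days, hours_per_day).sum(axis=1)
    diff = np.diff(daily)
    return daily, diff, np.abs(diff).mean()
```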
Analysis and calculation of the peak ratio
In this paper, we use the peak ratio to explain the prediction error. All the prediction methods learn the variation tendency of a given data series to predict future data, and the easier a tendency is to learn, the more accurate the prediction. Thus, we aim to obtain a feature that indicates changes in tendency to better measure the prediction error. The peaks of a data series mark inflection points, with the preceding data exhibiting an upward tendency and the subsequent data exhibiting a downward tendency, which is a key feature reflecting the tendency change.
In regard to wind energy, we use a window of four consecutive time slots to determine hourly peaks and traverse the time series one time slot at a time (i.e., \(t \leftarrow t+1\)) to find all peaks. The power generation in these four time slots should satisfy the following conditions to constitute a peak: generation should increase continuously over the first three hours, the increase over these three hours should exceed 10% of the installed capacity, and generation should decrease in the fourth hour, which can be expressed as Eqs. (9)–(11):
where \(P_{\mathrm{H},i,t}\) denotes the hourly peaks in province i in time slot t, \(P_{\mathrm{N},\mathrm{H},i}\) is the number of hourly peaks in province i, and \(P_{\mathrm{R},\mathrm{H},i}\) is the ratio of hourly peaks in province i. We also calculate the average value over 8760 h to evaluate the wind energy fluctuations in each province.
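The three peak conditions can be sketched as a vectorized check over all four-slot windows. Eqs. (9)–(11) are not reproduced in this text, so the strict inequalities and the choice of dividing the peak count by the total number of time slots are assumptions; the function name `hourly_peak_ratio` is hypothetical.

```python
import numpy as np

def hourly_peak_ratio(wind, capacity, min_rise=0.10):
    """Detect hourly peaks with a sliding four-slot window.

    A window starting at slot t counts as a peak when
      1. generation rises over the first three slots,
      2. that rise exceeds `min_rise` times the installed capacity, and
      3. generation falls in the fourth slot.
    The ratio is taken as peaks / number of slots (an assumption).
    """
    w = np.asarray(wind, dtype=float)
    a, b, c, d = w[:-3], w[1:-2], w[2:-1], w[3:]
    rising = (b > a) & (c > b)
    large_rise = (c - a) > min_rise * capacity
    falling = d < c
    is_peak = rising & large_rise & falling
    n_peaks = int(is_peak.sum())
    return is_peak, n_peaks, n_peaks / len(w)
```

For a full year of hourly data, `len(w)` is 8760, so the returned ratio corresponds to the share of time slots that start a peak window.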
Regarding solar energy, we use daily power generation data to obtain daily peaks. Similar to the hourly peak calculation, four consecutive days are chosen to determine peaks, and similar conditions should be satisfied, which can be expressed as Eqs. (12)–(14):
where \(P_{\mathrm{Day},i,c}\) is the daily peak in province i on day c, \(P_{\mathrm{N},\mathrm{Day},i}\) is the number of daily peaks in province i, and \(P_{\mathrm{R},\mathrm{Day},i}\) is the ratio of daily peaks in province i. The average value over 365 days is also calculated to express the solar energy fluctuations in each province.
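A comparable sketch for daily solar peaks is given below. It assumes that daily generation is the sum of each day's hourly values and that the peak count is divided by the number of days. The text does not state the rise threshold used for daily data, so it is left as an explicit, hypothetical `min_rise` argument in the same energy units as the daily totals.

```python
import numpy as np

def daily_peak_ratio(solar_hourly, min_rise, hours_per_day=24):
    """Detect daily peaks over windows of four consecutive days.

    Assumptions: daily generation is the sum of hourly values; `min_rise`
    is a user-supplied threshold on the rise over the first three days;
    the ratio is peaks / number of days.
    """
    s = np.asarray(solar_hourly, dtype=float)
    n_days = len(s) // hours_per_day
    daily = s[:n_days * hours_per_day].reshape(n_days, hours_per_day).sum(axis=1)
    a, b, c, d = daily[:-3], daily[1:-2], daily[2:-1], daily[3:]
    # Rise over three days, rise large enough, then a drop on the fourth day.
    is_peak = (b > a) & (c > b) & ((c - a) > min_rise) & (d < c)
    n_peaks = int(is_peak.sum())
    return is_peak, n_peaks, n_peaks / n_days
```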
Data availability
The source data underlying Figs. 1–5 and Supplementary Figs. 1–4, including the provincial wind and solar power generation data of the 30 provinces in China, are provided as a Source Data file with this paper. Other data used in this study are available from the authors upon reasonable request.
Code availability
The code used in this study is available from the authors upon reasonable request.
References
China Xinhua News. Xi’s statement at the General Debate of the 75th Session of the United Nations General Assembly (2020). [http://www.qstheory.cn/yaowen/2020-09/22/c_1126527766.htm] (2022).
Climate Ambition Summit. Leaders statements of president Xi Jinping (2020). [http://www.gov.cn/xinwen/2020-12/13/content_5569138.htm] (2022).
PrakashKumar, K. & Saravanan, B. Recent techniques to model uncertainties in power generation from renewable energy sources and loads in microgrids–A review. Renew. Sustain. Energy Rev. 71, 348–358 (2017).
Mahdi, S., Helena, L. & Nilay, S. Integrated renewable electricity generation considering uncertainties: the UK roadmap to 50% power generation from wind and solar energies. Renew. Sustain. Energy Rev. 72, 385–398 (2017).
Salvador, P., Juan, M. & Trine, B. Impact of forecast errors on expansion planning of power systems with a renewables target. Eur. J. Oper. Res. 248, 1113–1122 (2016).
Ayik, A., Ijumba, N., Kabiri, C. & Goffin, P. Preliminary wind resource assessment in South Sudan using reanalysis data and statistical methods. Renew. Sustain. Energy Rev. 138, 110621 (2021).
Kies, A. et al. Critical review of renewable generation datasets and their implications for European power system models. Renew. Sustain. Energy Rev. 152, 111614 (2021).
Rourke, F., Boyle, F. & Reynolds, A. Ireland’s tidal energy resource; an assessment of a site in the Bulls Mouth and the Shannon Estuary using measured data. Energy Convers. Manag. 87, 726–734 (2014).
Han, J., Mol, A., Lu, Y. & Zhang, L. Onshore wind power development in China: challenges behind a successful story. Energy Policy 37, 2941–2951 (2009).
Davidson, M., Zhang, D., Xiong, W., Zhang, X. & Karplus, V. Modelling the potential for wind energy integration on China’s coal-heavy electricity grid. Nat. Energy 1, 16086 (2016).
Lu, X. et al. Challenges faced by China compared with the US in developing wind power. Nat. Energy 1, 16061 (2016).
Gadad, S. & Deka, P. Offshore wind power resource assessment using Oceansat-2 scatterometer data at a regional scale. Appl. Energy 176, 157–170 (2016).
Churio, O., Marley, S., Chamorro, V. & Ochoa, G. Wind and solar resource assessment and prediction using Artificial Neural Network and semi-empirical model: case study of the Colombian Caribbean region. Heliyon 7, e07959 (2021).
Pereira, S., Abreu, E., Iakunin, M., Cavaco, A., Salgado, R. & Canhoto, P. Method for solar resource assessment using numerical weather prediction and artificial neural network models based on typical meteorological data: application to the south of Portugal. Sol. Energy 236, 225–238 (2022).
Weekes, S. et al. Long-term wind resource assessment for small and medium-scale turbines using operational forecast data and measure–correlate–predict. Renew. Energy 81, 760–769 (2015).
Joshi, S. et al. High resolution global spatiotemporal assessment of rooftop solar photovoltaics potential for renewable electricity generation. Nat. Commun. 12, 1–15 (2021).
Abreu, E., Canhoto, P., Prior, V. & Melicio, R. Solar resource assessment through long-term statistical analysis and typical data generation with different time resolutions using GHI measurements. Renew. Energy 127, 398–411 (2018).
Tahir, Z. & Asim, M. Surface measured solar radiation data and solar energy resource assessment of Pakistan: a review. Renew. Sustain. Energy Rev. 81, 2839–2861 (2018).
Sweerts, B. et al. Estimation of losses in solar energy production from air pollution in China since 1960 using surface radiation data. Nat. Energy 4, 657–663 (2019).
Tong, D. et al. Geophysical constraints on the reliability of solar and wind power worldwide. Nat. Commun. 12, 6146 (2021).
Zeng, P., Sun, X. & Farnham, D. J. Skillful statistical models to predict seasonal wind speed and solar radiation in a Yangtze river estuary case study. Sci. Rep. 10, 8597 (2020).
Joshi, S. et al. High resolution global spatiotemporal assessment of rooftop solar photovoltaics potential for renewable electricity generation. Nat. Commun. 12, 5738 (2021).
Yin, J., Molini, A. & Porporato, A. Impacts of solar intermittency on future photovoltaic reliability. Nat. Commun. 11, 1–9 (2020).
Anadón, D. L., Baker, E. & Bosetti, V. Integrating uncertainty into public energy research and development decisions. Nat. Energy 2, 17071 (2017).
Qazi, A. et al. The artificial neural network for solar radiation prediction and designing solar systems: a systematic literature review. J. Clean. Prod. 104, 1–12 (2015).
Colak, I., Sagiroglu, S. & Yesilbudak, M. Data mining and wind power prediction: a literature review. Renew. Energy 46, 241–247 (2012).
Reikard, G. Predicting solar radiation at high resolutions: a comparison of time series forecasts. Sol. Energy 83, 342–349 (2009).
Lu, X., McElroy, M. & Kiviluoma, J. Global potential for wind-generated electricity. Proc. Natl. Acad. Sci. USA 106, 10933–10938 (2009).
Zhang, S. & Chen, W. Assessing the energy transition in China towards carbon neutrality with a probabilistic framework. Nat. Commun. 13, 1–15 (2022).
Schyska, B. U. et al. The sensitivity of power system expansion models. Joule 5, 2606–2624 (2021).
Jeon, S. & Choi, D. Joint optimization of Volt/VAR control and mobile energy storage system scheduling in active power distribution networks under PV prediction uncertainty. Appl. Energy 310, 118488 (2022).
Olauson, J. et al. Net load variability in Nordic countries with a highly or fully renewable power system. Nat. Energy 1, 16175 (2016).
Morstyn, T., Farrell, N., Darby, S. & McCulloch, M. Using peer-to-peer energy-trading platforms to incentivize prosumers to form federated power plants. Nat. Energy 3, 94–101 (2018).
Zhou, D., Al-Durra, A., Zhang, K., Ravey, A. & Gao, F. A robust prognostic indicator for renewable energy technologies: a novel error correction grey prediction model. IEEE Trans. Ind. Electron. 66, 9312–9325 (2019).
Wu, H., Shahidehpour, M., Alabdulwahab, A. & Abusorrah, A. A game theoretic approach to risk-based optimal bidding strategies for electric vehicle aggregators in electricity markets with variable wind energy resources. IEEE Trans. Sustain. Energy 7, 374–385 (2016).
David, M., Boland, J., Cirocco, L., Lauret, P. & Voyant, C. Value of deterministic day-ahead forecasts of PV generation in PV + storage operation for the Australian electricity market. Sol. Energy 224, 672–684 (2021).
Zhang, Y., Gatsis, N. & Giannakis, G. Robust energy management for microgrids with high-penetration renewables. IEEE Trans. Sustain. Energy 4, 944–953 (2013).
Hosseini, S., Carli, R. & Dotoli, M. Robust optimal energy management of a residential microgrid under uncertainties on demand and renewable power generation. IEEE Trans. Autom. Sci. Eng. 18, 618–637 (2021).
Liu, N., Cheng, M., Yu, X., Zhong, J. & Lei, J. Energy-sharing provider for PV prosumer clusters: a hybrid approach using stochastic programming and stackelberg game. IEEE Trans. Ind. Electron. 65, 6740–6750 (2018).
Lu, R. et al. Multi-stage stochastic programming to joint economic dispatch for energy and reserve with uncertain renewable energy. IEEE Trans. Sustain. Energy 11, 1140–1151 (2020).
Constante-Flores, G. E. & Illindala, M. S. Data-driven probabilistic power flow analysis for a distribution system with renewable energy sources using monte carlo simulation. IEEE Trans. Ind. Appl. 55, 174–181 (2019).
Fan, M. et al. Uncertainty evaluation algorithm in power system dynamic analysis with correlated renewable energy sources. IEEE Trans. Power Syst. 36, 5602–5611 (2021).
Wu, H., Shahidehpour, M., Alabdulwahab, A. & Abusorrah, A. Demand response exchange in the stochastic day-ahead scheduling with variable renewable generation. IEEE Trans. Sustain. Energy 6, 516–525 (2015).
Papavasiliou, A., Oren, S. S. & O’Neill, R. P. Reserve requirements for wind power integration: a scenario-based stochastic programming framework. IEEE Trans. Power Syst. 26, 2197–2206 (2011).
Valencia, F., Collado, J., Sáez, D. & Marín, L. G. Robust energy management system for a microgrid based on a Fuzzy prediction interval model. IEEE Trans. Smart Grid 7, 1486–1494 (2016).
Bouffard, F. & Galiana, F. D. Stochastic security for operations planning with significant wind power generation. IEEE Trans. Power Syst. 23, 306–316 (2008).
Lara, J. D., Dowson, O., Doubleday, K., Hodge, B.-M. & Callaway, D. S. A multi-stage stochastic risk assessment with Markovian representation of renewable power. IEEE Trans. Sustain. Energy 13, 414–426 (2022).
Ziegler, M. S. et al. Storage requirements and costs of shaping renewable energy toward grid decarbonization. Joule 3, 2134–2153 (2019).
Hunt, J. D. et al. Global resource potential of seasonal pumped hydropower storage for energy and water storage. Nat. Commun. 11, 1–8 (2020).
Wang, J. et al. Exploring the trade-offs between electric heating policy and carbon mitigation in China. Nat. Commun. 11, 6054 (2020).
Ren, L., Ji, J., Lu, Z. & Wang, K. Spatiotemporal characteristics and abrupt changes of wind speeds in the Guangdong–Hong Kong–Macau Greater Bay Area. Energy Rep. 8, 3465–3482 (2022).
Amini, M. H., Kargarian, A. & Karabasoglu, O. ARIMA-based decoupled time series forecasting of electric vehicle charging demand for stochastic power system operation. Electr. Power Syst. Res. 140, 378–390 (2016).
Wei, W. W. S. Time Series Analysis: Univariate and Multivariate Methods (Addison-Wesley, 1990).
Acknowledgements
We thank the National Key Research and Development Program of China no. 2022YFB2405600 for supporting J.W. and G.H. and the National Natural Science Foundation of China under grant No. 72241429, No. 72271008, No. 72243007, and No. 52277092 for supporting J.S., G.H., X.L., and J.W. We also acknowledge the support of State Grid Corporation of China, State Grid Jiangsu Electric Power Co., LTD. and State Grid Wuxi Power Supply Company.
Author information
Contributions
J.W., X.L., N.L., J.M., J.S. and G.H. conceived and designed the research. J.W., L.C., C.W.T., C.L. and G.H. developed the framework and formulated the theoretical model. J.W., L.C., X.L. and E.D. carried out the data search. L.C., Z.T. and M.S. carried out the simulations. J.W., X.L., G.H. and C.W.T. conducted the prediction-error analysis. J.W., L.C., Z.T., E.D., N.L., J.M., M.S., C.L., J.S., X.L., C.W.T. and G.H. contributed to the discussions on the method and the writing of this article.
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Communications thanks Mingquan Li, and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. A peer review file is available.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Wang, J., Chen, L., Tan, Z. et al. Inherent spatiotemporal uncertainty of renewable power in China. Nat Commun 14, 5379 (2023). https://doi.org/10.1038/s41467-023-40670-7
DOI: https://doi.org/10.1038/s41467-023-40670-7