Background & Summary

Power is a fundamental element of human society. Access to affordable, reliable, and sustainable energy, including access to reliable electricity and power produced by renewable sources, are listed as important aspects of the United Nations Sustainable Development Goals1. Tracking dynamics and status of power production and consumption is of great importance as it reflects the manufacturing, social activities as well as human impacts on the environment. Current power statistics are based on inventories of power production, consumption, trade, etc2. This work usually has a time lag of at least one year3,4,5,6,7. Timely and effective management of the power sector, including monitoring shifts from fossil to low carbon sources, is valuable for effectively mitigating global climate change policy-making8,9. Thus low-latency data on global and national power production with the high-temporal resolution is urgently needed10.

As a result, high-temporal resolution power data is increasingly important and has received an increasing focus from governments, companies, and academic institutes11,12,13. Daily and hourly power data are critical to developing power system models14, or to understanding the patterns of human behaviors15,16. With the increasing awareness of the importance of such datasets, there has been an increase in open access at regional levels. For example, the EU has created an open platform ENTSO-E for electricity generation, load, and transmission data for Europe12. United States’ Energy Information Administration (EIA) also provides free access to data for its electricity generation and consumption13. China’s electricity generation and consumption data are available through its national grid or China’s National Bureau of Statistics17. However, the temporal coverage often varies between datasets. In addition, energy sources are reported or aggregated differently11,17,18,19,20. These inconsistencies have made it challenging to compare and evaluate progress in decarbonizing power systems across countries and regions.

International Agency (IEA) and BP provide well-integrated and unified data for power generated from different sources and cover a wide range of spatial regions2,20. International Renewable Agency (IRENA) also provides reports on global renewable energy installed capacity and generation19. However, those datasets have a time lag of at least several months and have at best a monthly time step. Monthly datasets may not provide sufficient information on power systems’ rapid changes, due to 1) changes in human behavior as the COVID-19 pandemic or the effects of weekends, and holidays21,22,23,24, 2) the impact of climate variabilities such as winter storms, summer heatwaves, and other climate variabilities causing shifts of demand, and intermittency of renewable power supply25, and 3) economic shocks such as abrupt variations of fuel prices or shortfalls of supply since the war between Ukraine and Russia26.

Here we constructed the first global daily and hourly power generation dataset (CarbonMonitor-Power) for the period going from 2016-01 to 2022-07. This dataset can be updated in near-real-time with a latency of between 1 day to a maximum of 1 month, depending on the country/region. The dataset includes daily and hourly power generation data from fossil fuels (coal, natural gas, and oil), nuclear, hydro, wind, solar, geothermal, biomass, and other renewables for 37 countries, which covers around 70% of the global power production and 68% of global power-related CO2 emissions. CarbonMonitor-Power provides a data basis to the Carbon Monitor dataset, to estimate the near-real-time daily CO2 emissions from power generation21,22,23. CarbonMonitor-Power represents a new resource for exploring high-time frequency patterns of the global power system and monitoring monthly to annual changes relevant to emissions reduction pledges (Fig. 1).

Fig. 1
figure 1

Examples of near-real-time source-specific power generation data. (a) Daily dynamics of total power generation and fossil power generation in Russia and (b) in the United States. (c) Effects of holidays on diurnal profile - a strong decline of total power generation in the United States during Thanksgiving Friday in 2021. (d) The averaged diurnal profile of total generation of April 2021 in the United States, shows that power generation on the weekdays is higher especially during peak hours than on weekends. (e) The average April diurnal profile of the energy mix of the United States’ power system shows that during noon time solar power and renewables have a significantly higher share in the power system. The CarbonMonitor-Power dataset not only records the dynamics of power generation data but also provides information on the energy structure at the hourly and daily level, giving insight into the progress of decarbonization of the power system at high frequency.

Methods

We mostly collected data from national grid operators which provide open-access power generation and consumption data at high temporal frequency. In constructing a harmonized database with global coverage, special attention was given to filling the data gaps. Although it is possible to directly acquire high-time-frequency power generation data for the EU for example, such data does not exist for some other countries like China. China’s national grid provides detailed information on installed capacity and utilization hours for major power sources, but for every month.

The framework used to generate the CarbonMonitor-Power dataset is shown in Fig. 2. We acquired raw data from the national grids of the 37 countries/regions listed in Tables 13. Raw data are acquired at the highest possible time resolution (5-minute intervals, hourly, daily, or monthly, depending on the source availability). We then developed national-specific methods for data processing and simulation (details see country-specific method below).

Fig. 2
figure 2

Data acquisition and processing framework.

Table 1 Summary table of power generation data sources, generation types included in original data, and their spatial and temporal resolutions for Australia, China, India and Japan.
Table 2 Summary table of pow er generation data sources, generation types included in original data, and their spatial and temporal resolutions for Brazil, Chile, Mexico and United States.
Table 3 Summary table of power generation data sources, generation types included in original data, and their spatial and temporal resolutions for EU27, UK, Russia and South Africa.

Power generation data acquisition

We firstly collected raw data from 12 regions (or 37 countries, including Australia, Brazil, China, 26 countries in EU27, UK, India, Japan, Russia, South Africa, the United States, Mexico, and Chile) with various energy sources. The raw data are collected from publicly available sources at the national or subnational levels. Data sources used in this study are summarized in Tables 13.

More than two million records of raw data have been collected from these 12 data sources, with nearly two thousand records newly generated and collected per day. Considerable data cleaning preprocess was performed as part of the data processing, due to frequent extreme values and missing values detected from near-real-time data.

To filter out extreme values, we first examine the quality of these high-temporal power generation datasets using an Interquartile Range (IQR) threshold method27 and detect the ‘outliers’. The IQR range is defined as the range between the 75th percentile and the 25th percentile. The upper limit is calculated as adding 0.5*IQR to the 75th percentile. The lower limit is calculated as subtracting 1.5*IQR from the 25th percentile. Values fall beyond the upper and lower limits are labeled as potential ‘outliers’. Afterward, manual processing was applied to evaluate whether each extreme value should be removed or to be kept. As a general rule, we keep extremes in the data set when there was evidence of abrupt social changes (COVID-19 confinements) and/or natural disasters (e.g., storms), which are known to have a strong and sudden impact on the power system. Then the linear interpolation function from the Python Pandas package was used to fill missing values.

After such detections, no outliers or missing values have been found in the raw data from most countries, even some of which may have pre-processed their raw data before releasing them publicly (such as the United States EIA13). In the end, pre-processing of removing outliers and filling missing values was only conducted on the raw data from China and EU27. In summary, there were 6.9% data records missing in the raw data from China, and 3.4% data records missing in the raw data from EU27. For EU27 countries, the data quality varied between different countries, ranged from 0% (no missing data or outliers, Sweden) to 29.1% (highest missing and outlier ratio, Croatia). Only two countries had higher than 10% data missing or identified as outliers. The missing data and outliers in raw data from EU27 mostly were detected at sub-hourly level (one out of two or four records missing for an hour). We filled in the missing data or replaced the outliers by interpolation method to provide a best estimate for the data at hourly time resolution. Data records which are identified as outliers or missing values are labeled as F (Filtered). Others are labeled as N (Normal)

Country-specific data processing

After the data preprocessing, for each country/region, we aggregate and/or dis-aggregate the power generation to daily (or hourly if possible) according to data availability, and to eight categories of power generation sources: coal, gas (natural gas), oil, nuclear, hydro (hydro-power), wind, solar and other renewables (including biomass, geothermal, and power generated from residual industrial heat and from other non-specified sources). In addition to those national data, three additional datasets are used to disaggregate the total generation data into specific generation types when needed: Monthly Electricity Statistics by IEA20, Statistical Review of World Energy by BP2, and the Renewable Energy Statistics by IRENA19.

However, the original data often do not have the same format and do not cover the same energy sources across countries. To establish a harmonized dataset with all sources of power, we collected information from other databases with a lower time frequency and disaggregated them to daily steps (details see country-specific method below).

Australia

The original data is acquired at hourly resolution for the following categories: Wind, Hydro, Solar (Rooftop), Solar (utility), Gas (Waste Coal Mine), Gas (Reciprocating), Gas (OCGT), Gas (CCGT), Gas (Steam), Distillate (Energy source: Diesel), Bioenergy (Biomass), Bioenergy (Biogas), Coal (Black), Coal (Brown), while, CCGT refers to power generated by combined cycle gas turbine, and OCGT refers to power generated by open gas cycle turbine. We aggregate all power generated from all types of power used in this study as following:

$${P}_{coal,h}={P}_{Coal.Black,h}+{P}_{Coal.Brown,h}$$
(1)
$${P}_{gas,h}={P}_{Gas.steam,h}+{P}_{Gas.CCGT,h}+{P}_{Gas.OCGT,h}+{P}_{Gas.Reciprocating,h}+{P}_{Gas.WasteCoalMine,h}$$
(2)
$${P}_{solar,h}={P}_{Solar.Utility,h}+{P}_{Solar.Rooftop,h}$$
(3)
$${P}_{otherrenewable,h}={P}_{Other.Bioenergybiomass,h}+{P}_{Other.Bioenergybiogas,h}$$
(4)

Brazil

The raw power generation data from Brazil is acquired from the Operator of the National Electricity System (http://www.ons.org.br/Paginas/). The data acquisition and download are performed at a daily base, with up to a week of latency (occasional delay caused by site maintenance). The original data is acquired at hourly resolution for the following categories: Wind, Hydro, Nuclear, Solar, Thermal (including Coal, Coal Mineral, Gas, Natural Gas, Combustive Oil, Diesel, Petrol (Gasoline), Biomass, Industrial Residuals). We aggregate power generated from all types of coals to coal power, and similarly, power generated from all gas types to gas power. The further aggregation to the power generated by energy source (s) at each hour (Ps, h) used in this study is as following:

$${P}_{oil,h}={P}_{CombustiveOil,h}+{P}_{Diesel,h}+{P}_{Petrol,h}$$
(5)
$${P}_{otherrenewable,h}={P}_{Biomass,h}+{P}_{IndustrialResiduals,h}$$
(6)

The power generation data is firstly produced at hourly time resolution, then further aggregated to daily resolution.

China

There are two types of core datasets for China’s power generation: Power Generation by Energy Type (P) and Coal Consumption data (CC). The Power Generation by Energy Type data is acquired from China’s Electricity Council (CEC, http://cec.org.cn), which provides information on power generation, consumption and usage on China’s national grid at monthly, seasonal and yearly time steps. The information on power generation is given primarily as installed capacity per energy source (IC), and cumulative utilization hour (CUH) per type of source (s). The power generation for month m from energy source s (Ps, m) is calculated as:

$${P}_{s,m}=I{C}_{m,s}\times \left(CU{H}_{m,s}-CU{H}_{m-1,s}\right)$$
(7)

This allows the direct calculation of power generation from energy sources including thermal, coal, natural gas, nuclear, hydro, wind, solar, biomass and geothermal (the last two with compromised time frequency and latency). Power generation by oil and other non-fossil sources, such as waste, recovery energy from industrial processes is not provided separately by CEC. But they are accounted for in the total thermal power. Therefore, we separated the thermal power from coal and gas production using factors derived from Monthly Electricity Statistics by IEA (PIEA) from the corresponding month of the latest year available:

$${F}_{oil,m}=\frac{{P}_{IEA,Oil,m}}{{P}_{IEA,CombustibleRenewables,m}+{P}_{IEA,OtherCombustibles,m}+{P}_{IEA,Oil,m}}$$
(8)
$${F}_{otherthermal,m}=\frac{{P}_{IEA,CombustibleRenewables,m}+{P}_{IEA,OtherCombustibles,m}}{{P}_{IEA,CombustibleRenewables,m}+{P}_{IEA,OtherCombustibles,m}+{P}_{IEA,Oil,m}}$$
(9)

The power generated by oil and by other renewable energy are then simulated as:

$${P}_{oil,m}=({P}_{thermal,m}-{P}_{coal,m}-{P}_{gas,m})\times {F}_{oil,m}$$
(10)
$${P}_{otherrenewable,m}=\left({P}_{thermal,m}-{P}_{coal,m}-{P}_{gas,m}\right)\times {F}_{otherthermal,m}$$
(11)

The monthly power generation data are then disaggregated to daily values using daily coal consumption (CC) data by power plants from eight coastal provinces in China (https://www.cctd.com.cn/) with the following equation:

$${P}_{s,d}={P}_{s,m}\times \frac{C{C}_{d}}{C{C}_{m}}$$
(12)

Where Ps, d being the power generation by energy source s on day d, Ps,m being the power generation of the month m. CCd is the coal consumption on day d, and CCm is the monthly coal consumption of month m. Day d is within month m. The correlation between China’s power generation and coal consumption was established at monthly time steps, and the detailed method can be found in our previous work21,22.

EU27&UK

For the year between 2016 and 2018, the power generation data for EU28 are acquired from ENTSO-E (https://www.entsoe.eu). From 2018 onwards, data for EU27 was continuously acquired from ENTSO-E, while data for the UK was acquired from BMRS (https://www.bmreports.com/bmrs/) and from Sheffield Solar (for solar power only, https://www.solar.sheffield.ac.uk/). The data acquisition is carried out on a daily basis. The data aggregation method is database specific instead of country specific.

ENTSO-E provides power generation capacity data at time resolution between every 15 minutes to every 60 minutes. Therefore, hourly power generation was firstly computed as

$${P}_{c,s,h}=Capacit{y}_{c,s,h}\times 1hr$$
(13)

The power generation (P) from country c from source s during the hour h is computed as the average power generation capacity during the corresponding time, multiplied by the length of duration (1 hour). Then the power generation data for each country (c) was further aggregated to the eight energy sources used in this study with the following equations at desired time resolution (excluding energy sources where aggregation was not required):

$${P}_{c,coal,h}={P}_{c,Browncoal{\rm{\& }}Lignite,h}+{P}_{c,Coalderivedgas,h}+{P}_{c,Hardcoal,h}$$
(14)
$${P}_{c,oil,h}={P}_{c,Oil,h}+{P}_{c,ShaleOil,h}$$
(15)
$${P}_{c,hydro,h}={P}_{c,PumpedStorage,h}+{P}_{c,Runofriverandpoundage,h}+{P}_{c,Waterreservoir,h}$$
(16)
$${P}_{c,wind,h}={P}_{c,WindOffshore,h}+{P}_{c,WindOnshore,h}$$
(17)
$${P}_{c,other,h}={P}_{c,Biomass,h}+{P}_{c,Geothermal,h}+{P}_{c,Otherrenewable,h}+{P}_{c,Waste,h}$$
(18)

BMRS provides power generation capacity data over the UK for 48 equally distributed time periods per day. The conversion from power generation capacity to hourly power generation follows the same methods as Eq. 13. Two energy sources were aggregated from the database to match the sources defined in this study, with the following equations:

$${P}_{gas,h}={P}_{ccgt,h}+{P}_{ocgt,h}$$
(19)
$${P}_{hydro,h}={P}_{Hydro(Pumpedstorage),h}+{P}_{Hydro(Non-pumpedstorage),h}$$
(20)
$${P}_{other,h}={P}_{Biomass,h}+{P}_{Other,h}$$
(21)

Among which, ccgt refers to power generated by combined cycle gas turbine, and ocgt refers to power generated by an open gas cycle turbine. The sources aggregated hourly power generation data was then further aggregated to daily, monthly and yearly resolution. In the end, we aggregate all countries to EU27 and UK to an aggregated dataset for EU27&UK.

India

The power generation data from India is initially acquired from the Power System Operation Corporation Limited (POSOCO, https://posoco.in/) on a daily basis, with one to two days of latency. The original data is provided for aggregated sources as compared to our required eight sources (Fig. 2): Gas and oil produced power are aggregated and called Gas_Naptha_Diesel; RES aggregates power produced by wind, solar, biomass, and other energy sources. To disaggregate these energy sources, we developed factors (F) based on the reference monthly power generation dataset (PIEA)20 of the corresponding month (m) from the last available year (the current year or the previous year). for oil and gas:

$${F}_{s,m}=\frac{{P}_{IEA,s,m}}{{P}_{IEA,Oil,m}+{P}_{IEA,NaturalGas,m}}$$
(22)

For solar and wind:

$${F}_{s,m}=\frac{{P}_{IEA,s,m}}{{\sum }_{m}({P}_{IEA,Solar}\,,{P}_{IEA,Wind}\,,{P}_{IEA,CombustibleRenewables}\,,{P}_{IEA,Geothermal}\,,{P}_{IEA,OtherRenewables})}$$
(23)

For other renewable:

$${F}_{s,m}=\frac{{\sum }_{m}\left({P}_{IEA,CombustibleRenewables}\,,{P}_{IEA,Geothermal}\,,{P}_{IEA,OtherRenewables}\right)}{{\sum }_{m}\left({P}_{IEA,Solar}\,,{P}_{IEA,Wind}\,,{P}_{IEA,CombustibleRenewables}\,,{P}_{IEA,Geothermal}\,,{P}_{IEA,OtherRenewables}\right)}$$
(24)

With these factors, we disaggregate the daily power production from its original aggregated sources as follows:

$${P}_{gas,d}={P}_{Gas\_Naptha\_Diesel,d}\times {F}_{gas,m}$$
(25)
$${P}_{oil,d}={P}_{Gas\_Naptha\_Diesel,d}\times {F}_{oil,m}$$
(26)
$${P}_{solar,d}={P}_{RES,d}\times {F}_{solar,m}$$
(27)
$${P}_{wind,d}={P}_{RES,d}\times {F}_{wind,m}$$
(28)
$${P}_{otherrenewables,d}={P}_{RES,d}\times {F}_{otherrenewables,m}$$
(29)

With day d being in the month m.

Japan

We acquire Japan’s hourly power generation data from the Organization for Cross-regional Coordination of Transmission Operators (OCCTO, https://www.occto.or.jp/en/), with a latency of one to two months. The data provides fossil power in one aggregated sector. Therefore, we disaggregate fossil power with factors derived from reference monthly power generation data28 for the coal, gas, and oil sectors:

$${P}_{s,h}={P}_{Fossil,h}\times {F}_{s,m}$$
(30)
$${F}_{s,m}=\frac{{P}_{s,m}}{{\sum }_{m}\left({P}_{IEA,Coal}\,,{P}_{IEA,NaturalGas}\,,{P}_{IEA,Oil}\right)}$$
(31)

Hour h being on day d in the month m. While for the hydro, wind, solar and other renewable categories, we apply the following aggregation:

$${P}_{hydro,h}={P}_{Hydroelectric,h}+{P}_{PumpedStorageHydroelectricity,}$$
(32)
$${P}_{solar,h}={P}_{Photovoltaic,h}+{P}_{PhotovoltaicRegulated,}$$
(33)
$${P}_{wind,h}={P}_{Wind,h}+{P}_{WindRegulated,}$$
(34)
$${P}_{other,h}={P}_{Biomass,h}+{P}_{Geothermal,}$$
(35)

Russia

Hourly power generation data from Russia is acquired from the United Power System of Russia (http://www.so-ups.ru/index.php). It provides hourly power generation for Thermal, Nuclear, Hydro, and Renewables. We develop factors (Fs) based on the reference yearly power generation dataset (Pbp, from BP Statistical Review of World Energy2) from the last available year (being previous year or the year before, y). for fossil energy sources (s):

$${F}_{s}=\frac{{P}_{bp,s,y}}{{P}_{bp,Coal,y}+{P}_{bp,Gas,y}++{P}_{bp,Oil,y}}$$
(36)

With these factors, we disaggregate the hourly fossil power generation to coal, gas, and oil power as follows:

$${P}_{s,h}={P}_{{\rm{Thermal}},{\rm{h}}}\times {F}_{s}$$
(37)

For wind and other renewables, we develop Fs based on reference yearly renewable power generation dataset (PIRENA, from International Renewable Energy Agency19) from the last available year (being previous year or the year before, y):

$${F}_{s}=\frac{{P}_{IRENA,s,y}}{{P}_{IRENA,Wind,y}+{P}_{IRENA,Other,y}}$$
(38)

With these factors, we disaggregate the hourly power generation of the category Other to and other renewable power as following:

$${P}_{s,h}={P}_{{\rm{Other}},{\rm{h}}}\times {F}_{s}$$
(39)

United States

Hourly power generation data from the United States acquired from EIA (https://www.eia.gov/beta/electricity/gridmonitor/). As the energy sources provided by EIA match the source categories used in this study, we did not apply further data aggregation steps following data acquisition and data cleaning. The data are acquired at the local time and aggregated to national totals.

South Africa

The hourly power generation data from South Africa is acquired from Eskom (https://www.eskom.co.za/dataportal/supply-side/station-build-up-for-the-last-7-days/). Eskom provides power generation data for the following categories: Coal (labeled as Thermal in the source data), Natural-gas, Oil (labeled as OCGT in the source data), Nuclear, Pumped Water Generation, Hydro Water Generation, Photovoltaic generation (PV), Concentrated Solar Power generation (CSP), Wind, and Other Renewable. We aggregate hydro power and solar power sources with the following equations:

$${P}_{hydro,h}={P}_{PumpedWaterGeneration,h}+{P}_{HydroWaterGeneration,h}$$
(40)
$${P}_{solar,h}={P}_{SouthAfrica,PV,h}+{P}_{CSP,h}$$
(41)

Mexico

The hourly power generation data from Mexico is acquired from Gobierno De Mexico (https://www.cenace.gob.mx/Paginas/SIM/Reportes/EnergiaGeneradaTipoTec.aspx) for the following categories: Coal, Gas, Combined cycle, Internal Combustion (major fuel types including hydrocarbons such as paraffinic, olefinic, naphthenic, aromatic), Conventional Thermal, Nuclear, Hydro, Wind, Solar, Biomass, and Geothermal. We aggregate gas, oil power and other renewable power sources with the following equations:

$${P}_{gas,h}={P}_{Gas,h}+{P}_{Combinedcycle,h}$$
(42)
$${P}_{oil,h}={P}_{InternalCombustion,h}+{P}_{ConventionalThermal,h}$$
(43)
$${P}_{otherrenewables,h}={P}_{Biomass,h}+{P}_{Geothermal,h}$$
(44)

Chile

The hourly power generation data from Chile is acquired from Coordinador Eléctrico Nacional (https://www.coordinador.cl/operacion/graficos/operacion-real/generacion-real/). Coordinador Eléctrico Nacional provides power generation data for the following categories: Coal, Petcoke, Biogas, Natural Gas, Diesel, Fuel Oil, Run-of-river, Storage, Wind, Solar, Geothermal, Biomass, and Cogeneration. The coal, gas, oil, hydro and other renewables power generation are calculated with the equations below:

$${P}_{coal,h}={P}_{Coal,h}+{P}_{Petcoke,h}$$
(45)
$${P}_{oil,h}={P}_{Diesel,h}+{P}_{Fueloil,h}$$
(46)
$${P}_{hydro,h}={P}_{Run-of-river,h}+{P}_{Chile,storage,h}$$
(47)
$${P}_{otherrenewables,h}={P}_{Geothermal,h}+{P}_{Biomass,h}+{P}_{Cogeneration,h}+{P}_{Biogas,h}$$
(48)

Data Records

Currently, there are two data records provided in this dataset, which can be downloaded at https://github.com/KowComical/CM_Power_Data or figshare29. All data are available for 1857 days (from January 1st, 2016 to 30th June, 2022 for countries except for China and South Africa. For China, the data record starts from 1st January 2018, and for South Africa from 1st April 2018):

  • A record of 4,015 records are the daily total and source-specific power generation from 8 power sources (i.e., coal, gas, oil, hydro-power, solar-power, wind-power, other renewables (biomass, geothermal and other renewable sources)) and for 2 individual countries/regions (China (from January 2018), India).

  • A record of 2,415,102 records are the hourly total and source-specific power generation from 8 power sources (i.e., coal, gas, oil, hydro-power, solar-power, wind-power, other renewables (biomass, geothermal and other renewable sources)) and for 35 individual countries/regions (i.e., United States, EU27 & UK, Russia, Japan and Brazil, Australia, Chile, Mexico, South Africa)

  • Two label types are used: N (Normal) and F (Filtered). F stands for data records where the raw data are filtered out for outliers and missing values, then subsequently filled with methods stated in Power Generation Data Acquisition of the Method section. N stands for data records where no outliers and missing values are detected.

Technical Validation

Correlation with reference data

We compared our dataset with the reference database (IEA monthly electricity generation data20 and BP annual electricity data2) over the overlapping time period of 2019 to 2022, and the results show that our data in general agrees well with the reference data (Fig. 3). For most countries, we used the monthly electricity generation as the reference database. For countries that are not covered by IEA (Russia and South Africa), we used the BP Statistical Review of World Energy2. For countries compared to the IEA database, the overlapping period is 2019 to April 2022. For countries compared to BP’s database, the overlapping period is 2019 to 2021. The data are displayed as monthly averaged power generation. In general, for annual total power generation, the CarbonMonitor-Power database shows good agreement with the reference dataset for all countries (R2 > 0.95). This indicates that on aggregated terms, the power generation provided by CarbonMonitor-Power is in line with the reference databases. There are also strong correlations between these two databases for electricity generated by major energy sources, including coal, gas, nuclear, hydro, solar, and wind. These account for about 95.4% of total electricity generation.

Fig. 3
figure 3

Correlation between CarbonMonitor-Power’s power generation data and power generation from other databases (BP2 for Russia and South Africa and IEA20 for the other  countries). Different colors indicate different world regions. The correlation coefficient is shown as squared term R2.

There are two energy sources, however, that show lower performance, namely power generated from oil and other renewables. The lowest correlation coefficient is observed in oil-fired power (R2 = 0.661). The data in Fig. 3 shows that we have a systematic overestimation for China and a systematic underestimation for Brazil for oil-generated power. For other countries, the data points are distributed with a scatter but without a bias. As oil is the smallest energy source in the power sector, it has the largest uncertainty. As for other renewables (“biomass, geothermal and other renewables” in figure), the low correlation coefficient (R2 = 0.682) with other databases is mainly driven by the higher estimation in China by CarbonMonitor-Power. This is mainly related to the method used: subtracting coal and gas-fired power from thermal power and then distributing the non-coal and non-gas thermal power with a disaggregation factor. We have noticed that the non-coal and non-gas thermal power collected by this database is much higher than the data provided by IEA. This led to the result that we estimate higher power produced by oil and by other renewables.

Systematic bias with reference datasets for most countries

The time series of CarbonMonitor-Power are compared to other datasets in Fig. 4. The main result is that the CarbonMonitor-Power data agree well with the reference dataset in terms of the overall trend, yet with significantly shorter latency and higher temporal frequency. For most of the countries, apart from China, CarbonMonitor-Power shows lower values than the other dataset, ranging from 1% in Ember to 7% in IRENA (detailed comparison for different energy sources see Tables 4, 5. This is likely due to differences in input data sources. CarbonMonitor-Power uses national grids as the main data sources. Reference data sources use a combination of data sources, including national census data and self-derived estimation methods. It is very likely that most countries have off-grid power generation, which is accounted for by IRENA19, IEA20 and BP30, and Ember’s18 methods, but not by this study (we focus on the power grid, as stated in the method section). The exception of China is caused by the same reason. For China’s power generation, we acquired our raw data from China’s Electricity Council, which provides all the power distributed by the national electricity grid. The reference databases such as IEA acquired their data from China’s National Bureau of Statistics, which provides power generated by Power plants above the designated size (measured by annual income). Therefore, it is likely that the power generated by small power plants and some distributed photovoltaics are not accounted for by the IEA database. But in general, the discrepancy between IEA and CarbonMonitor-Power for China is not large (except for solar power generation).

Fig. 4
figure 4

Time series of monthly total power generation data from CarbonMonitor-Power (red lines) and reference data. Dashed gray lines for BP’s Statistical Review of World Energy2, dashed light blue lines for The International Renewable Energy Agency (IRENA)’s Renewable Energy Statistics19, solid green grey lines for Ember’s monthly power generation32, and solid dark blue lines for IEA’s monthly electricity statistics20. For data from BP and IRENA, as the original data are provided as an annual total, it is plotted as a monthly average for comparison purposes.

Table 4 Summary of power generation datasets characteristics.
Table 5 Summary of comparison between CarbonMonitor-Power and reference power generation datasets.

Case study: UK’s grid data

Due to the lack of high-temporal resolution data, it is difficult to comprehensively compare all countries’ energy sources with other daily or hourly power generation data. It was nevertheless possible to find one additional data source (UK_ep https://electricityproduction.uk/from/all-sources/?t=10y) for the United Kingdom. Therefore, we compared the two databases for power generation from all eight sources at high time frequency as a case study (Fig. 5). From the comparison, we could find that although with some discrepancies, The hourly profiles agree very well between the two data sources except for Solar power, which is not covered by the UK-ep database. This shows that the CarbonMonitor-Power database also provides reliable information at high time frequency in addition to satisfying accuracy at aggregated time steps.

Fig. 5
figure 5

Comparison between two databases: CarbonMonitor-Power (red) and UK_ep (blue) for the diurnal profiles for eight energy sources: Coal, Gas, Oil, Nuclear, Hydro, Solar, Wind, and Other Renewables. The x-axis denotes the hour of the day and the Y-axis presents the power generated from each source. The figure shows an average hourly profile for the month of June 2022.

Usage Notes

As an expansion of current near-real-time data collection, it is our hope that our newly developed CarbonMonitor-Power data offers new opportunities to reveal the most up-to-date trends and variations of global power sector, especially at hourly to daily scales. We hope that CarbonMonitor-Power will facilitate a wide range of stakeholders with their research and decision making. To facilitate the usage of the CarbonMonitor-Power data, we provide the following examples to serve as a starting point on how the dataset could be used.

Hourly profile of power generation in major countries

The CarbonMonitor-Power database we present offers a wide range of possibilities in analyzing the hourly dynamics, daily patterns and seasonality of power generated in different countries. In addition to the strong holiday effect (Fig. 1c) and weekend effect (Fig. 1d) reflected by the single day hourly profiles and monthly average hourly profiles for the United States, power generation also show unique seasonal and geographical patterns in all major countries (Fig. 6). These patterns could be linked to different social-economical characteristics. For example, Japan (Fig. 6b) and United States (Fig. 6d) both show significantly increased power generation during mid-day time period in the 3rd quarter (July to September) of year 2021. However, such phenomenon is not observed for countries like France and South Africa. The increased power generation during summer mid-day in Japan and United States may be caused by increased consumption of power due to cooling demands, enhanced solar power production, or a combination of both. Our CarbonMonitor-Power dataset also provides source-specific power generation data, which allows further analysis of the energy mix for such distinctive seasonal and regional patterns. This example illustrates the advantages of a near-real-time, global-coverage, source-specific power dataset in facilitating analyses for countries across various geographical locations and with different social-economic status.

Fig. 6
figure 6

Examples of hourly near-real-time total power generation data from major countries. The x-axis denotes the hour of the day and the Y-axis presents the total power generated. The figure shows an average hourly profile for the 1st (Q1), 2nd (Q2), 3rd (Q3) and 4th(Q4) quarters of year 2021 for 1) South Africa, (b) Japan, (c) Russia, (d) United States, (e) Australia, (f) Brazil, (g) France and (h) Germany.

Usage case: increased use of coal as power source in EU27&UK under the impact of Russian’s invasion of Ukraine

The near-real-time and high time frequency features of this CarbonMonitor-Power dataset present a unique opportunity to closely follow the dynamics of the global and regional power system. Here we provide a usage case on how CarbonMonitor-Power may provide policy relevant information in a timely manner.

Short after the Russian invasion of Ukraine on the 24th February, 2022, with no significant increase in power demand (as compared to the same time period in 2021, Fig. 7a), power generation in Europe (EU27&UK) temporarily increased its reliance on coal as energy source (red area highlighted in Fig. 7b). Meanwhile natural gas as power source stayed at a similar level, followed by a further decline in April (Fig. 7c). From 24th February to 28th February, total power generation increased by 4.8% but coal power generation increased by 45.1%, while natural gas generated power decreased by 6.2% (all compared to 2021). From March to April, total power generation decreased by 1.3%. Among which, power generated from natural gas decreased by 6.3%, while power generated from coal increased by 23.1% (all compared to 2021). The immediate decline of gas and rise of coal as energy source in the power system may result from decreased Russian gas import, increased gas price due to speculation and panic buying, or a combination of both. The import of natural gas from Russia decreased sharply following the invasion of Ukraine, caused a further decrease of Europe’s natural gas supply26. A sudden increase of EU gas price was also observed short after the invasion: EU gas price (as shown by the Dutch TTF Natural Gas Calendar price) showed a sudden spike on the 24th February, and stayed at a high plateau with large variability for the following months31.

Fig. 7
figure 7

Usage examples of CarbonMonitor-Power: increased use of coal as power source in EU27&UK under the impact of Russian’s invasion of Ukraine. The figure shows how Europe’s (EU27&UK) daily power generation changes in the first four months of year 2022 for a) Total Power Generation, (b) Coal as Power Source and (c) Gas as Power Source. The x-axis denotes the dates. Power generated from corresponding source in year 2021 is plotted as baseline for comparison. A shaded color of pink indicates an increase in year 2022 as compared to 2021. A shaded color of purple indicates a decrease in year 2022 as compared to 2021. The shaded areas (starting at 24th February, marked by dashed lines) indicate the dates under the impact of Russian invasion of Ukraine.

The disproportional and substantial increase in coal generated power immediately following the Russian invasion of Ukraine may point to a previously undiscussed increase in power associated emission increase in Europe under the impact of Russian’s invasion of Ukraine. This highlights the importance of designing a risk-resilient regional power system.

Extension of the CarbonMonitor-Power dataset

The current CarbonMonitor-Power dataset covers power generation data from three types of fossil sources (coal, gas, and oil), nuclear energy and four groups of renewable energy sources (solar energy, wind energy, hydro energy and other renewables including biomass, geothermal, etc.). The current coverage is from January, 2016 and the spatial coverage is for 37 countries across all continents. In addition to the continuous update of the existing dataset, the CarbonMonitor team is working on further extending the dataset including improved spatial resolution and coverage. To achieve these, we are collecting additional power generation data for regions and countries that are not covered yet. For missing data, we try to use proxy data including fossil fuel consumption data and climate reanalysis data to fill the data gap. In addition, daily and hourly power generation profiles of neighboring countries with similar climate and social-economic conditions could be used as proxy data. Combined with monthly/annual power generation data, it is possible to construct daily/hourly power generation dynamics for countries where no such local data is available.