Consumption of fossil fuel resources leads to global warming and climate change. Apart from the negative impact of greenhouse gases on the climate, the increasing emission of anthropogenic heat from energy consumption also brings significant impacts on urban ecosystems and the surface energy balance. The objective of this work is to develop a new method of estimating the global anthropogenic heat budget and validate it on the global scale with a high precision and resolution dataset. A statistical algorithm was applied to estimate the annual mean anthropogenic heat (AH-DMSP) from 1992 to 2010 at 1×1 km2 spatial resolution for the entire planet. AH-DMSP was validated for both provincial and city scales, and results indicate that our dataset performs well at both scales. Compared with other global anthropogenic heat datasets, the AH-DMSP has a higher precision and finer spatial distribution. Although there are some limitations, the AH-DMSP could provide reliable, multi-scale anthropogenic heat information, which could be used for further research on regional or global climate change and urban ecosystems.
Machine-accessible metadata file describing the reported data (ISA-tab format)
Background & Summary
Anthropogenic heat is wasted heat in the form of sensible and latent heat (http://www.ametsoc.org/amsedu/) that is released to the urban canopy mainly by means of heating and cooling, running appliances, transportation, and industrial processes, which convert energy into anthropogenic heat1,
Therefore, calculating a high spatial resolution global scale anthropogenic heat flux (AHF) dataset is necessary for studying climate impacts. According to the characteristics of anthropogenic heat, three main approaches have been developed: observations, detailed statistical models (building/traffic energy modelling), and inventory approaches. As a classical method, the inventory approach is used globally to estimate global-scale anthropogenic heat12,13, using the assumption that any heat emitted by energy consumption is completely converted to anthropogenic heat. The observation approach includes an energy budget residual approach14,
AH-DMSP shows that the majority of anthropogenic heat is mainly produced in the developed regions-the eastern part of North America, Western Europe, the eastern and southern parts of Asia as a result of the high intensity and magnitude of human activities in these regions. All of the annual maximum values appear in urban regions. Changes in anthropogenic heat between 1990 and 2010 were significant in the metropolis regions of both the developing and developed countries.
DMSP/OLS nighttime light data
Cloud-free composite DMSP/OLS data from 1992 to 2010 were acquired from the National Oceanic and Atmospheric Administration’s (NOAA) National Geophysical Data Center25. Three of four satellites from the DMSP carry the OLS in low-altitude polar orbits to record nighttime data. The DMSP/OLS has the unique capability to detect low levels of visible near infrared (VNIR) radiance at night. With the OLS ‘VIS’ band data, it is possible to detect clouds illuminated by moonlight, plus lights from cities, towns, industrial sites, gas flares, and ephemeral events such as fires and lightning-illuminated clouds. Data values range from 1–63. Areas with no cloud-free observations are represented by the value 255. The stable average nighttime lights detected by DMSP/OLS contain the lights from cities, towns, and other sites with persistent lighting, including gas flares. Ephemeral events, such as fires, were discarded26,27. Background noise was identified and replaced with zero values. In this work, the DMSP/OLS data is regarded as a proxy to the energy consumption intensity in cities and towns, as well as in other persistent lighting sites, including gas flares (Data Citation 1: UCAR/NCAR—Earth Observing Laboratory https://doi.org/10.5065/D6V40S9M).
Normalized difference vegetation index (NDVI) Data
Gap-filled, snow-free, Nadir Bidirectional Reflectance Distribution Function (BRFD) Adjusted Reflectance (NBAR) data derived from the Moderate Resolution Imaging Spectro radiometer (MODIS) MCD43D products were used to generate NBAR-NDVI time series. For each year 2001–2010, the annual mean NDVI was calculated for each pixel27,28. To calibrate DMSP/OLS data, we prefer the annual mean NDVI to the annual maximum NDVI, because the former is more stable and less sensitive to seasonal and intra-annual fluctuations. To keep temporal consistency with the DMSP/OLS data, we extended the annual mean NDVI from year 2001 as constant values back to 1992, to complement the period 1992–2001. This decision is supported by the observation that there is not significant annual variation in the annual mean NDVI for urban regions during 2001–2010 (Data Citation 2: NASA EOSDIS Land Processes DAAC https://doi.org/10.5067/modis/mcd43d61.006).
Gridded population data
We obtained gridded population density data for the year 2000 which are 2.5 arc-minute grid cells of the Gridded Population of the World v3 (GPWv3), provided by the Centre for International Earth Science Information Network (Data Citation 3: NASA Socioeconomic Data and Applications Center (SEDAC) http://dx.doi.org/10.7927/H4FJ2DQN). The population density grids were derived by dividing population count grids by a land area grid, and represent persons per square kilometre29,30.
Statistical energy consumption data
Statistics of national primary energy consumption for 224 countries were obtained from the U.S. Energy Information Administration (EIA)11. The statistics include energy from the four main primary energy sources: coal, petroleum, natural gas, and renewable energy. Chinese urban level energy consumption was obtained from the Urban Statistical Yearbook for China (http://data.stats.gov.cn/easyquery.htm? cn=E0103). Gross national income (GNI) per capita for each of 224 countries was obtained from the World Bank (http://data.worldbank.org/income-level/OEC). A portion of the primary energy sources are used to produce secondary energy (for example, the coal used by a power station to generate electricity) in non-urban areas that is then consumed by urban residents at the end of the energy flow. In this case, the wasting heat in the processing procedure was not taken into consideration since it has little influence on total heat emitted. We therefore use only the four primary energy source classifications in this study.
The algorithm for calculating the global anthropogenic heat product (AH-DMSP) developed in this study is presented in a flowchart (Fig. 1). First, we extracted the urban areas from calibrated DMSP-OLS and MODIS NDVI from 1992 to 2010 (details in step 1 and 3). Second, for each country, we separated the primary energy consumption (coal, petroleum, natural gas, and renewable energy) into urban and non-urban consumption using the urban consumption ratio from the International Energy Agency (IEA) (details in step 2). Third, a strong exponential relationship between urban area and energy consumption (R2=0.9, P=0.039) was found from the available statistical data (Fig. 2). Based on this relationship, the energy consumption in urban areas obtained from Step 2 was inserted in the urban map obtained from Step 3. Most large factories or refineries are located in urban areas. Therefore, the energy consumed in an urban area not only includes the energy consumption for domestic use but also industrial usage. Human metabolism occupied a small share of the total heat released by human activities, and we therefore excluded it. We note that statistical errors in urban energy consumption are a function of urban area and energy consumption. In order to alleviate their effects, we can allocate the total urban energy into patches based on percent-normalized urban energy from Fig. 2. Finally, we downscaled the total urban energy consumption to the pixel-level using the corresponding spatial proxy obtained in Step 1. The rural energy consumption at the country level was allocated directly into each grid. Finally, the energy consumption value was divided by elapsed time, and the resulting AH-DMSP of each grid cell was converted to anthropogenic heat flux (AHF, W.m−2).
The calculation is simplified by the following balance equation: where An is the total anthropogenic heat of country n (1, 224); ki is the conversion rate of type i energy to heat, in which type i refers to the four sub-types of energy consumption: coal, petroleum, natural gas, or renewable energy, and Eni is the energy consumption of type i for country n. The values for k were obtained from the EIA (http://www.eia.gov/beta/international/data).
Step 1: Calibrating DMSP/OLS: Saturation and temporal fluctuation
Noise caused by less important persistent lighting sites such as gas flares, and ephemeral events such as fires in DMSP/OLS, occur mainly in the desert regions far from human habitants, but may affect anthropogenic heat precision. Here we masked out areas with a gridded population density less than 1 per 2.5 arc-minutes with the assumption that they would have no release of anthropogenic heat.
There are two drawbacks in the DMSP/OLS data: inter-annual variation and digital saturation in downtown areas31,
Step 2: Estimating ratios of urban energy consumption to total national consumption
The International Energy Agency (IEA) only calculates the ratios of urban energy consumption to total national consumption for four regions: United States, European Union, Austral-Asia, and China (Data Citation 4: Figshare https://doi.org/10.6084/m9.figshare.c.3647903). In addition to these four regions, we introduce a simple method to estimate the urban energy consumption ratios for other countries or regions. First, 224 countries were grouped into two classes: high-income and low-income countries (including middle-income countries) based on gross national income (GNI) per capita. Second, the average urban energy consumption ratios from the United States, European Union, and Australia were applied to the high-income countries, while the ratio from China was applied to the other countries. The urban and rural energy consumption for each country could then be separated.
Step 3: Extracting urban boundaries by support vector machines (SVM)
The existing global urban datasets have been applied for many studies of climate change and ecology (e.g., the Global Rural Urban Mapping Project or MODIS 500 m urban extent maps)36, but their temporal resolutions do not meet our requirements. Many previous studies indicate that the SVM algorithm31,34,37, which is widely used for classification and regression analysis, performs well at monitoring urban patches. It has the advantage of improving the accuracy of classification by encompassing substantial information, such as economic activities and urban vegetables38. Moreover, this method is not sensitive to the initial training samples, and it has the flexibility of multi-temporal and multi-spatial analysis39.
Here, we use the annual mean NDVI and calibrated DMSP/OLS as input information to the SVM-based algorithm, and extracted the urban boundaries for every year. We set the threshold (excluding water pixels inside the urban boundary) at DMSP/OLS >30 and NDVI <0.3 for developing countries39, and at DMSP/OLS >53 and NDVI <0.3 for developed countries. The Gaussian radial basis function (RBF, exp(−γ|μ−ν|^2)) was used as the kernel function, with the coefficient of cost (−c=2) set to 2 and gamma was 0.02 (−g=0.02). Noise in DMSP/OLS can be caused by a variety of reasons, such as variability in the atmosphere interference, and may cause errors in the boundary extraction output. Due to the stability of urban patches, we assumed that any urban patch detected in an earlier year by DMSP/OLS should be maintained in the results of later urban boundaries. In the end, we generated urban boundary data for each year during the period 2001–2012.
Step 4: Estimating energy consumption for all urban patches
When a city starts expansion, it normally means its urban population grows and energy consumption grows accordingly. However, when a city reaches a mature stage of economic development and infrastructure construction, energy consumption will grow relatively slower than its expansion due to increase of energy efficiency. City energy data are difficulty to find and the difficult of acquiring urban energy data lies in consideration of city boundary issues (Data Citation 4: Figshare https://doi.org/10.6084/m9.figshare.c.3647903). Therefore, we try to interpret energy consumption with city size in this study. The correlation parameters are developed with statistical data in China, which is provided in Supplementary File 1.
There was a strong exponential relationship between the urban area and energy consumption (y=−0.001x2+8.30x−24.25, P=0.039, where y is the urban energy consumption (10 million kg) and x is the urban magnitude (km2) (Fig. 2). This relationship was then applied to extract energy consumption for each urban patch from the total national urban energy consumption calculated in 2.2.2 and 2.2.3.
It is common practice to validate the accuracy of global-scale AHF estimation with anthropogenic heat data using a detailed statistical model, which consists of heat released from three components: the building sector, transportation, and human metabolism. The building sector is always divided into heat released from electricity consumption and from heating fuels, such as natural gas and fuel oil40. The detailed model focuses on the end-user of energy, while the inventory method focuses on primary energy. These methods can improve model accuracy by incorporating better information about individual cities20. For this study, the annual mean AHF estimated using our method was validated by AHF data estimated using the detailed model at multiple scales, from the city to province level. These data were estimated using the detailed statistical model and were compared with zone averaged AHF from the AH-DMSP for specific cities or provinces. Meanwhile, inter-validations of spatial patterns and single points for specific cities were also performed with previously created global gridded datasets. These annual mean global datasets were calculated by the inventory approach, and single points for specific AHF data were zone averaged.
The algorithm is coded in Matlab and C. AHF.m is the main procedure to calculate AH-DMSP. The global urban dataset is produced by Global_urban.m and its core parts are opened source code which are provided by Chang and Lin (http://www.csie.ntu.edu.tw/~cjlin/libsvm). Areas of global urban patches are calculated by Recursion.c and only binary files can be inputted and outputted in the procedure (write_binary.m and write_tif.m are designed for converting file format between binary and tiff). ArcGIS and Matplot or Python are exploited to draw figures. This code is available alongside the dataset at figshare (Data Citation 4: Figshare https://doi.org/10.6084/m9.figshare.c.3647903).
AH-DMSP is a global gridded dataset of annual mean anthropogenic heat and its spatial resolution for the entire planet is 1×1 km2. Tagged image file format (TIFF) and network common data form (NetCDF) of AH-DSMP are provided in the repository (Data Citation 4: Figshare https://doi.org/10.6084/m9.figshare.c.3647903). For the TIFF formats of AH-DMSP, its tfw format file is about spatial information of longitude and latitude for corresponding file. These data can be processed by GIS software, Matlab, Ncl or R etc. The uploaded data includes AH-DMSP at 1992,2001 and 2010 and is stored in uint16, which can be converted to annual mean anthropogenic flux (W.m−2) by a factor 0.1 (Data Citation 4: Figshare https://doi.org/10.6084/m9.figshare.c.3647903).
Spatial and temporal patterns of AH-DMSP
As shown in Fig. 3, spatial distributions of annual mean AHF are mainly in the developed regions, i.e., the eastern part of North America, Western Europe, and the eastern and southern parts of Asia. Obviously, the northern hemisphere contributes significantly more annual mean AHF than the southern hemisphere, which is consistent with the distribution of intensity and magnitude of human activities. All of the highest annual values appear in urban regions. Pixels with annual mean AHF greater than 1.59 W.m−2 occur in densely populated regions, and their numbers increase gradually with increasing global urbanization. The differences in spatial distribution of annual mean AHF between 2010 and 1992 are significant for the urban regions in rapidly developing countries, as well as in regions where there has been ongoing economic depression (for example, in Russia, Poland, and Ukraine).
Figure 4 shows a comparison between the annual mean AHF from AH-DMSP and AHF data estimated with a detailed statistical model (AHF-stat) from the city to province scales. The statistical tests at the city and province scales between AH-DMSP and AHF-stat were P=0.008 and P=0.65, respectively. At the city level, AH-DMSP has good estimates of anthropogenic heat for cities whose AHF was less than 30 W.m−2. However, with urban developments, the differences between AH-DMSP and AHF-stat became increasingly significant. Root mean square error (RMSE: W.m−2) between the AH-DMSP and AHF-stat both the city and province levels were 12.01 and 1.07, respectively. At the province scale, due to limited data from AHF-stat, the statistical test was insignificant.
To assess the performance of AH-DMSP at the global scale, we made a comparison with two other global scale datasets from the studies by Chen and Flanner studies (Fig. 5)20,21,41. There are significant differences between the AH-DMSP, Chen, and Flanner data, both in the value ranges and the spatial patterns. Spatial patterns are quite similar between AH-DMSP and Chen’s data, since both studies use DMSP/OLS nighttime data, but the two have quite different value ranges. Most areas had values lower than 0.44 W.m−2 or higher than 1.31 W.m−2 in AH-DMSP, but the Chen data had values in the range from 1.02 to 1.31 W.m−2. This implies a significant influence from saturation, and intra- and inter-annual fluctuations of the DMSP/OLS data. Conversely, the Flanner dataset, which is currently used by the climate modelling community, had a larger area with AHF values, most of which are lower than 0.44 W.m−2.
We make comparisons of the annual means from previous datasets of AHF and AH-DMSP using the AHF-stat results as the real values. Since it is difficult to access annual mean AHF-stat data corresponding to the study period, only the Chen and AH-DMSP datasets were parallel compared in this section. From Fig. 6, with a growth in city magnitude, there is a trend in biases between the global datasets and AHF-stat data, which shift from positive to negative values and become more significant (city names in Table 1). For small cities, although the AHF values from global datasets is underestimated, its values are similar to the AHF-stat (real) values. For mega cities, values are significantly overestimated. From Fig. 6, AH-DMSP is closer to the real value than the Chen data.
Although anthropogenic heat is only about 0.3% of the total energy transported to the extra-tropics by atmospheric and oceanic circulations, it could disrupt normal atmospheric circulation patterns and warm surface temperatures at the local and global scales10. Therefore, incorporating anthropogenic heat into climate models could improve the performance of simulations of surface climate warming6,10 and are beneficial for studying the impacts of increased urban heat on the concentration of precursor species (e.g., NOx and CO) and resident health, such as morbidity and mortality risk, in cities7,31.
However, detailed evaluation of the impact on climate is deterred by the limited precision and spatial and temporal resolutions of anthropogenic heat data. The DMSP/OLS data has provided an opportunity to produce an improved precision and resolution AHF dataset. The applied methodology has been validated by a detailed anthropogenic heat model at multiple scales, but there are still some needed improvements. The underlying assumption that all energy is converted into anthropogenic heat is unreasonable, since some energy consumed will be stored by buildings and converted to other forms of energy. This assumption could therefore lead to overestimation of anthropogenic heat. Due to difficulties in observation; it is difficult to establish ratios for converting energy consumption to anthropogenic heat.
The observation approach includes in situ eddy covariance observations13 and the energy budget residual approach14,
In order to keep temporal consistency with DMSP/OLS data, the annual mean NDVI for year 2001 was extended back as constant values to 1992, in order to complement the period 1992–2001. The simplified method produces less influence on the method performance, because the spatial resolution of MODIS instruments (1×1 km) has a limited ability for detecting variations in urban vegetation and cannot detect decade-scale variations35,39. Gridded population density data for the year 2000 were used to mask out noises in DMSP/OLS during the study periods. We assumed that areas with a gridded population density value lower than 1 have no release of anthropogenic heat and that the boundary of human habitations had no significant changes during the study period.
Validation revealed that the values of AH-DMSP were underestimated for certain mega-cities. The underestimation might relate to the selected ratio of urban energy consumption to the national, and an allocating function that distributes the total urban energy proportionally to each urban patch. Due to limitations in available statistics for urban sizes and energy use around the world, only four regions have ratios of urban energy consumption—United States, European Union, Austral-Asia, and China. The ratios for other countries were calculated based on gross national income (GNI) per capita, which could have negative influences on the accuracy of estimated anthropogenic heat for mega cities in these countries. Previous studies have argued that the SVM-based algorithm is not sensitive to initial training samples. Therefore, threshold values of NDVI and DMSP/OLS were used for extracting global urban patches42, and uniform training samples and thresholds of NDVI and DMSP/OLS were utilized around the world during the study period. This is another reason for underestimation in mega cities. However, separating total energy consumption into two types, urban and non-urban, then further downscaling urban energy to each city in each country is a step forward for calculating a robust global AHF product.
How to cite this article: Yang, W. et al. A new global anthropogenic heat estimation based on high-resolution nighttime light data. Sci. Data 4:170116 doi: 10.1038/sdata.2017.116 (2017).
Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Yang, W. M. Figshare https://doi.org/10.6084/m9.figshare.c.3647903 (2017)
This study was supported by the National Basic Research Development Program of China (Grant no. 2015CB953602, 2016YFA0602500), and the National Science Foundation of China (Grant no. 41661144006, 91637104). The authors had full access to all the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.