A high-resolution map of reactive nitrogen inputs to China

To feed an increasingly affluent population, reactive nitrogen (Nr) inputs to China's lands and waters have substantially increased over the past century. Today, China's Nr emissions account for over one third of the global total, leading to serious environmental pollution and health damage. Quantifying the spatial variability of Nr inputs is crucial for identifying intervention points to mitigate Nr pollution, yet this variability is not well known. Here, we present a database describing Nr inputs to China for the year 2017 at a 1 km × 1 km resolution, broken down by land use and Nr source and compiled using the CHANS model. Results show that the North China Plain, the Sichuan Basin and the Middle-Lower Yangtze River Plain are hotspots of Nr inputs, where per-hectare Nr input is an order of magnitude higher than in other regions. Cropland and surface water bodies receive much higher Nr inputs than other land use types. This unique database will provide basic data for research on environmental health and global change modelling.

Background & Summary
China produced the world's largest amount of reactive nitrogen (Nr) through Haber-Bosch nitrogen (N) fixation (HBNF), around 40 Tg Nr in 2017 1. Substantial inputs of Nr from agricultural and other activities have caused a range of adverse effects on human health and environmental quality, including loss of biodiversity, soil acidification, and eutrophication 2. Although strict pollution control measures introduced in recent years have brought early successes in mitigating Nr pollution, it remains an area of widespread concern in China 3. To better understand the pathways of Nr loss and identify mitigation options, it is crucial to assess the current status and characteristics of N inputs to China's lands and water bodies. Previous studies on Nr inputs to China typically have a spatial resolution down to the province or county level. However, this is not sufficient for a detailed estimation of the health or environmental effects of Nr use 4,5. Meanwhile, a growing number of studies use complex process-based simulation models that require high-resolution gridded data, but high-resolution maps of Nr input have so far been lacking 6.
Crop production has markedly increased over the past 40 years in China, due to the substantial increase in the use of synthetic N fertilizers 7. In 2017, the total N fertilizer applied to Chinese croplands amounted to 29 Tg N (173 kg/ha for rice, 213 kg/ha for wheat and 183 kg/ha for maize), accounting for some 30% of global fertilizer use on only 9% of global cropland, while producing far lower crop yields compared to the global average 8. Meanwhile, China raises around 40% of global livestock, and manure N has become an important source of N inputs to lands and water bodies. These features lead to a much higher Nr loading in China than in other global regions. However, because a comprehensive, high-resolution dataset on livestock distribution is lacking, spatially allocating these Nr inputs to different land areas and catchments is a major challenge. Furthermore, livestock distribution changes annually with market conditions and policy regulations, such as the pig relocation program 9, which complicates the compilation of robust spatial datasets. Besides agricultural sources of Nr inputs, nitrogen oxide (NOx) emissions from fossil fuel combustion also contribute to Nr deposition to land and water. For instance, Nr deposition can contribute over 30% of the total Nr input to Lake Tai in East China 10, and NOx accounts for a large share of that deposition.
Previous studies have separately quantified the spatial distribution of some of these Nr fluxes, such as emissions from fertilizer use 11 or atmospheric Nr deposition. However, few studies integrate all relevant N inputs at high spatial resolution to allow the key drivers of adverse effects on human and environmental health to be identified. To fill this knowledge gap and at the same time update the quantification of N fluxes to the most recent year, 2017, we first calculated the overall N budget for China using the Coupled Human And Natural System (CHANS) model (https://person.zju.edu.cn/en/bjgu#930811). CHANS includes 14 different subsystems covering all the natural and anthropogenic sources of Nr input, recycling and losses. We then extracted all the N input fluxes to land and water bodies and estimated their spatial distribution at a resolution of 1 km × 1 km using spatial indicators such as land use or population distribution. This results in a unique database that will help users to explore spatial patterns of Nr inputs in China, to assess whether N inputs to sensitive ecosystems remain within safe boundaries, and to support the development of mitigation strategies for Nr pollution in China.
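The allocation of a regional N flux to grid cells by a spatial indicator can be illustrated with a minimal sketch. It assumes a simple proportional rule, in which each cell receives a share of the total equal to its share of the indicator (for example population); the text does not state that CHANS-SD uses exactly this rule, and the function name `allocate_by_proxy` and the example values are hypothetical.

```python
def allocate_by_proxy(total, proxy_grid):
    """Split `total` across grid cells in proportion to a non-negative proxy.

    `proxy_grid` is a list of rows (for example population per cell).
    Assumption: simple proportional weighting, used here for illustration only.
    """
    proxy_sum = sum(sum(row) for row in proxy_grid)
    if proxy_sum <= 0:
        raise ValueError("proxy grid must contain at least one positive value")
    return [[total * cell / proxy_sum for cell in row] for row in proxy_grid]


# Hypothetical example: 100 units of N spread over a 2 x 2 grid by population.
population = [[1, 1], [2, 0]]
n_input = allocate_by_proxy(100.0, population)
```

By construction the allocated cells sum back to the regional total, so the gridded map stays consistent with the budget it was derived from.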

Database structure. The Coupled Human And Natural Systems Nitrogen Cycling Model Spatial Distribution (CHANS-SD) 1.0 database consists of three files (Fig. 1). The 'data file' provides N inputs for 6 land use types: cropland, forest, grassland, water, built-up area and unused land. The 'readme file' explains the abbreviations used in the 'data file' and 'source file', and provides the units of all variables (Fig. 1). The 'source file' includes the full references and input data used in the database (Fig. 1).

Data compilation.
We applied the CHANS model to calculate N inputs by land use type. The CHANS model calculates all identifiable nitrogen (N) fluxes, together with the linkages among subsystems, within a country, state (province), city or watershed, following a mass balance principle. The system is divided into 14 subsystems: cropland, grassland, forest, livestock, aquaculture, industry, human, pet, urban green land, wastewater treatment, garbage treatment, atmosphere, surface water, and groundwater. N cycling starts when reactive N (Nr) enters the system, either through activation from N2 or as a direct Nr input from outside, and terminates when Nr is transformed to N2 or lost from the system. N input to the cropland subsystem is the largest component of N input to China and includes N fertilizer, atmospheric N deposition, N from irrigation, N from livestock manure, N from human excretion, cropland biological N fixation (BNF) and straw recycling. N input to the forest subsystem, which includes forest BNF and N deposition, is the second largest component given the large forest area. N input to the surface water subsystem, which includes Nr runoff from cropland, livestock and forest, human wastewater discharge, industrial wastewater, WTP (Water Treatment Plant) effluent and N deposition, is the third largest component (Fig. 2). Details of the CHANS model, including all the code and parameters of N cycling and the protocols for calculating all related N fluxes, can be found at https://person.zju.edu.cn/en/bjgu#930811.
The National Bureau of Statistics of China 1 provides data on cropland N fertilizer use, planting area, livestock and population in the statistical yearbook of each province. Taiwan, Hong Kong, and Macao were not included owing to data limitations. Gridded datasets of land use and GDP are derived from the Resource and Environment Data Cloud Platform 12, and data on China's hydro-basin distribution are collected from the FAO website GeoNetwork 13.
In the cropland subsystem, livestock manure input to cropland ($\mathrm{MANURE}_{an}$, kg/yr) was calculated from animal population ($\mathrm{POP}_{an}$, head), excretion factor ($\mathrm{EXCRE}_{an}$, kg N/head/yr) and the rate of livestock manure applied to cropland ($\mathrm{RE}_{an}$, %) according to Eq. (1) (Fig. 3a),

$$\mathrm{MANURE}_{an} = \mathrm{POP}_{an} \times \mathrm{EXCRE}_{an} \times \mathrm{RE}_{an} \qquad (1)$$

Amounts of chemical N fertilizer consumption at county scale were obtained from the statistical yearbooks of counties (Fig. 3b). Human manure to cropland ($\mathrm{MANURE}_{hu}$, kg/yr) was calculated from urban population ($\mathrm{POP}_{ur}$, person), rural population ($\mathrm{POP}_{ru}$, person), excretion factor ($\mathrm{EXCRE}_{hu}$, kg N/person/yr), the rate of urban excretion returned to cropland ($\mathrm{RE}_{ur}$, %) and the rate of rural excretion returned to cropland ($\mathrm{RE}_{ru}$, %) according to Eq. (2) (Fig. 3c),

$$\mathrm{MANURE}_{hu} = (\mathrm{POP}_{ur} \times \mathrm{RE}_{ur} + \mathrm{POP}_{ru} \times \mathrm{RE}_{ru}) \times \mathrm{EXCRE}_{hu} \qquad (2)$$

Cropland BNF (CBNF, kg/yr) was calculated from planting area (area, ha) and N fixation rate ($r_{fix}$, kg N/ha/yr) according to Eq. (3) (Fig. 3d),

$$\mathrm{CBNF} = \mathrm{area} \times r_{fix} \qquad (3)$$

Straw recycling and N input from irrigation were calculated using nationally uniform values (Table 1). N input to the grassland subsystem includes N fertilizer, manure recycling, N deposition, grassland BNF and irrigation for artificial grassland. N input to the forest subsystem includes N fertilizer to artificial forests, N deposition and forest BNF.
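As a worked illustration of Eqs. (1) to (3), the sketch below assumes that each quantity is the plain product of the factors named in the text, with the return rates expressed as fractions (0 to 1) rather than percentages. The function names and all numerical values are hypothetical and are not taken from the CHANS-SD database.

```python
def livestock_manure_to_cropland(pop_an, excre_an, re_an):
    """Eq. (1): livestock manure N applied to cropland (kg/yr).

    pop_an: animal population (head)
    excre_an: excretion factor (kg N/head/yr)
    re_an: share of manure applied to cropland, as a fraction (assumption)
    """
    return pop_an * excre_an * re_an


def human_manure_to_cropland(pop_ur, pop_ru, excre_hu, re_ur, re_ru):
    """Eq. (2): human excretion N returned to cropland (kg/yr)."""
    return (pop_ur * re_ur + pop_ru * re_ru) * excre_hu


def cropland_bnf(area_ha, r_fix):
    """Eq. (3): cropland biological N fixation (kg/yr)."""
    return area_ha * r_fix


# Hypothetical inputs for a single county, chosen only to show the arithmetic.
manure_an = livestock_manure_to_cropland(pop_an=1000, excre_an=10.0, re_an=0.5)
manure_hu = human_manure_to_cropland(
    pop_ur=2000, pop_ru=3000, excre_hu=4.0, re_ur=0.25, re_ru=0.5
)
cbnf = cropland_bnf(area_ha=100, r_fix=30.0)
```

If the return rates are supplied as percentages, as listed in the variable definitions, they would need to be divided by 100 before use.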
In the surface water system, N inputs to each watershed include runoff from cropland, livestock, and forest areas, human wastewater discharge, industrial wastewater, WTP effluent and atmospheric N deposition. Cropland runoff was calculated using the spatial distribution of cropland N input (Fig. 4a), and livestock runoff using the spatial distribution of livestock manure (Fig. 4b). Forest runoff was calculated by applying the spatial distribution of forest areas. Human wastewater discharge was calculated based on the spatial distribution of the human population, including urban and rural populations (Fig. 4c). Industrial wastewater was calculated using the spatial distribution of GDP, since industrial output is highly correlated with GDP on a regional scale (Fig. 4d). Similarly, the distribution of WTPs and their effluent Nr is highly correlated with urban population on a regional scale; WTP effluent was therefore calculated using the spatial distribution of urban population.
Fig. 4 Important components of N inputs to water. (a) N input from cropland runoff to water, which shows a distribution similar to that of N input to cropland; (b) N input from livestock runoff to water, which shows a distribution similar to that of N input from livestock manure to cropland; (c) N input from human wastewater to water, which shows a distribution similar to that of population; (d) N input from industrial wastewater to water, which shows a distribution similar to that of GDP. Data for Taiwan are absent. Base map is applied without endorsement from GADM data (https://gadm.org/).
(2020) 7:379 | https://doi.org/10.1038/s41597-020-00718-5 | www.nature.com/scientificdata
For N deposition, satellite observations of NO2 columns were derived from GOME-2, which provides daily global coverage of NO2 with a variable ground spatial resolution of 80 km × 40 km. We used the monthly TEMIS NO2 product at a spatial resolution of 0.25° latitude × 0.25° longitude, downloaded from the website of the Tropospheric Emission Monitoring Internet Service 14. Satellite observations of NH3 columns were derived from IASI onboard the meteorological platforms MetOp-A and MetOp-B 15, with an elliptical footprint of 12 × 12 km up to 20 × 39 km depending on the satellite viewing angle. The daily NH3 columns were downloaded from the IASI Portal 16. We aggregated the daily data into monthly mean NH3 columns at a horizontal resolution of 0.25° latitude × 0.25° longitude using the arithmetic mean 17,18. N deposition for every subsystem was obtained from the national N deposition spatial distribution map using the 'spatial join' tool of ArcGIS 10.6.
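The aggregation of daily satellite columns into monthly means on a 0.25° grid can be sketched as follows. This is a minimal illustration of arithmetic-mean binning, not the authors' processing chain: the tuple layout `(month, lat, lon, value)`, the function name `monthly_grid_mean` and the sample values are assumptions, and real retrievals would normally also be filtered by quality flags.

```python
import math
from collections import defaultdict

CELL = 0.25  # grid spacing in degrees, as stated in the text


def monthly_grid_mean(observations, cell=CELL):
    """Average daily column values per (month, grid cell).

    observations: iterable of (month, lat, lon, value) tuples (assumed layout).
    Returns {(month, lat_index, lon_index): arithmetic mean}; cells are
    indexed by floor(lat / cell) and floor(lon / cell).
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for month, lat, lon, value in observations:
        if value is None or math.isnan(value):
            continue  # skip missing retrievals
        key = (month, math.floor(lat / cell), math.floor(lon / cell))
        sums[key] += value
        counts[key] += 1
    return {key: sums[key] / counts[key] for key in sums}


# Hypothetical daily NH3 column values (arbitrary units).
daily = [
    ("2017-01", 30.10, 120.10, 2.0),
    ("2017-01", 30.20, 120.20, 4.0),  # same 0.25 degree cell as the first record
    ("2017-01", 30.30, 120.10, 6.0),  # next cell to the north
    ("2017-02", 30.10, 120.10, float("nan")),  # missing value, ignored
]
monthly = monthly_grid_mean(daily)
```

Cells with no valid observation in a month are simply absent from the result, which leaves the choice of gap-filling method to the user.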

Data Records
The data are available in a single dataset 19, which consists of three files. The 'data file' (CHANS-SD 1.0 Data File) is the main file and includes N inputs for all 6 land-use types. The 'readme file' (CHANS-SD 1.0 Read Me) explains the abbreviations and units, and the 'source file' includes the full references used in the database (Fig. 1). The CHANS-SD 1.0 database is the most comprehensive and up-to-date measurement-based dataset of Nr inputs to different land-use types (e.g., cropland, forest, grassland, built-up area and unused land) in China.

Technical Validation
All data included in the CHANS-SD 1.0 database are derived from published statistical yearbook data of 293 prefecture-level cities, including 2,311 counties, and represent the best available dataset for China in 2017. Thorough quality assurance and control (QA/QC) was conducted: each data record was checked for possible errors, and extreme values were excluded. The data on N input to the cropland system (e.g., N fertilizer) are less uncertain than the data on N input to the water system, since N input to cropland is calculated directly from governmental statistical data, while N input to water is calculated from data of other subsystems (e.g., cropland, livestock, forest, human, wastewater and industry). The stability of the CHANS model has been validated by international peers, and we have published many papers using the data and methods of this model 6,20–22; the CHANS model data also show spatial distributions similar to those of other studies 23,24. Overall, the CHANS-SD 1.0 database provides high-quality open-access information on N input to China's land and water.

Code availability
No custom computer code was used to generate the data described in the manuscript. A CHANS Excel Calculator describing the data we used is available in the figshare data record (https://doi.org/10.6084/m9.figshare.12637391.v5).