Background & Summary

Croplands cover about 11% of the total land area and are responsible for most of the 60% of anthropogenic nitrous oxide (N2O) emissions that are attributed to agriculture and 11% of the anthropogenic methane emissions from rice production1, summing up to 4.5% of the total anthropogenic greenhouse gas emissions2. Croplands are subject to climate change impacts3, land-use change4,5 and climate mitigation strategies6,7; they interact directly with the climate system8, consume large quantities of human freshwater withdrawals9 and are connected to various sustainable development goals10. Understanding and quantifying cropland dynamics is thus an integral research question for Earth System Science.

Future agricultural production faces several challenges that need to be understood in scope and implications: (1) growing and increasingly wealthier populations are projected to demand more and different compositions of food commodities11,12, (2) climate change impacts3,13 will require adaptation14,15,16,17,18, and (3) the environmental impact of agricultural production needs to be reduced, including pollution19,20, water consumption9, land consumption21,22 and greenhouse gas emissions23,24,25. The potential to address these challenges is often explored with computer simulation models of agricultural productivity or outputs of such agricultural productivity models in combination with other models, e.g. economic models of the agricultural sector5 or hydrology models26.

AgMIP (Agricultural Model Intercomparison and Improvement Project, see Table 1 for a complete list of abbreviations) was initiated to help improve agricultural modeling capacities across scales and aspects (soils, different crops, economics etc.) by intercomparing models in simulation experiments using common protocols27. The general idea of AgMIP is that different modeling groups around the world can participate, contributing data to the ensemble dataset. Protocols are developed to clearly describe important aspects in the configuration of the modeling experiments and all prescribed inputs are supplied to interested modelers. AgMIP analyses typically start out by describing the range of model results and thus quantifying the uncertainty embedded in the choice of the crop model used and its parameterization. This source of uncertainty has previously received little attention. Different groups using the same model are explicitly invited to participate, which allows for analyzing how important modelers’ choices are28 beyond model configurations as specified in the protocol or where protocol instructions were not implemented correctly.

Table 1 Acronyms used in article.

Data of the GGCMs provided by AgMIP27 in the framework of ISIMIP29 have been used to assess impacts of climate change3,30 and to study sources of uncertainties6,31, and have been used in combination with other data for cross-sectoral impact analyses26,32. The term “cross-sectoral” in the ISIMIP context refers to analyses using data from different impact categories, referred to as “sectors”. The data were also used for economic assessments of climate change impacts on agricultural production systems5,33,34,35,36. However, this first global-scale simulation ensemble of AgMIP that was conducted as part of the ISIMIP project3 revealed a broad range of GGCM results under different climate change scenarios and in response patterns6,31. This high level of uncertainty motivated the subsequent GGCMI phase 1, which set out to assess model performance and to identify general fields of model improvement. The protocol for GGCMI phase 137 thus designed a comprehensive modeling exercise aiming to understand GGCM skill in reproducing observed historic crop yield patterns in space and time. Besides inviting a broad group of modeling teams and models, the GGCMI phase 1 protocol also adds variants of management harmonization and weather input datasets. Different assumptions on growing seasons across GGCMs had rendered the initial GGCM simulations3 difficult to compare. The broad availability of different historic weather data products, which also differ substantially in parts37, motivated the inclusion of different weather data products to address this source of uncertainty. A comprehensive initial model evaluation study based on the GGCMI phase 1 dataset38 showed that GGCMs typically have better skill than statistical models to explain observed crop yield variability but also have little explanatory power in regions where yield variability is mostly driven by changes in management or pest outbreaks, rather than weather variability. None of the GGCMs proved to be generally superior to the others but differences in model skill were reported38. The output data from this set of simulations are described here.

The GGCMI phase 1 dataset provides an unprecedentedly large dataset of crop model simulations covering the global ice-free land surface. While the dataset has already served various analyses on crop yields38,39,40,41, many aspects remain unexplored; variables apart from crop yields have hardly been assessed so far, with the exception of actual growing season evapotranspiration by Wartenburger, et al.42. The multi-dimensionality of the dataset (14 GGCMs, 11 input datasets, 3 harmonization levels, 4 crops, 14 output variables, time, and space) allows for further analyses and can also serve as an input to other models, the quantification of model uncertainties, and crop model emulation43,44. With the publication of the GGCMI phase 1 dataset we aim to promote further analyses and understanding of crop model performance, potential relationships between productivity and environmental impacts (e.g. water, nutrients), and eventually insights on how to further improve global gridded crop model frameworks and configurations.

Methods

The GGCMI phase 1 dataset37 consists of the model output of 14 GGCMs (Table 2) for up to 11 weather datasets covering various time frames and up to 3 harmonization levels (default, fullharm, harmnon) for the 4 priority one (P1) crops (maize, wheat, rice, soybean), as well as for any number of additional crops (priority two, P2). Not all models have been able to simulate all P1 crops and a number of models provided several P2 crop simulations (see Table 3). The GGCMI phase 1 dataset has been compiled by 14 different crop modeling groups that have followed the protocol instructions37 to achieve as much consistency as possible. It has to be noted, however, that not everything was or could have been harmonized across models. The focus of harmonization was on weather datasets to drive the models and on a few core crop management settings: the growing period and fertilizer inputs. Many other aspects of crop management have not been specified by the protocol37. This is in part owing to the complexity of this task, as the models have very different capacities to represent management options and thus require different sets of parameters. As such, we acknowledge the importance of soil parameters for crop model simulations45 but were unable to harmonize on these here. However, the lack of harmonization in various management and soil aspects can also be considered an asset of the analysis. For most regions in the world, the management systems are not documented and typically quite diverse so that the diversity of the assumptions made in the ensemble may better reflect this than a single harmonization target. Folberth et al.46 indeed show that assumptions made by the different EPIC modeling teams affect the models’ performances and sensitivities.

Table 2 GGCMs that provided data to the phase 1 dataset and their main characteristics.
Table 3 Crops simulated by the GGCMI phase 1 ensemble, abbreviations, and simulation priorities.

Modeling protocol

The overall modeling protocol is described by Elliott, et al.37 and we provide here only a summary of the main features. Modelers were asked to supply a minimum set of simulations, but could include additional simulations, addressing different directions of analysis. Online-only Table 1 lists all inputs used by the modeling groups. Modelers were asked to provide data for all simulated crops for all grid cells, even if crops are not currently cultivated in these areas. For these non-cultivated areas, input data were provided, but simulations could be skipped if no plausible assumptions on growing seasons could be made for that location37. According to the modeling protocol as described by Elliott, et al.37, modelers provided simulations for the default setup of their model (default), for harmonized growing seasons (i.e. prescribed grid-cell- and crop-specific sowing and maturity dates) and fertilizer inputs (fullharm) and for the same harmonized growing seasons but with unlimited nutrient supply (harmnon, also referred to as harm-suffn in some publications, e.g. Müller, et al.38). All simulations were conducted for purely rainfed (noirr) and for fully irrigated (firr) conditions as separate datasets so that crop yield could be aggregated with given crop- and irrigation masks to larger spatial entities in the post-processing38,47,48.
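The aggregation of separate rainfed and irrigated yield fields mentioned above can be illustrated with a minimal sketch. It assumes simple area weighting per grid cell; the function name, argument layout and example numbers are hypothetical and do not reproduce the post-processing code of the cited studies.

```python
import math


def aggregate_yield(yield_noirr, yield_firr, area_noirr, area_firr):
    """Area-weighted mean yield over a set of grid cells (illustrative only).

    yield_noirr, yield_firr: simulated yields (t/ha) per grid cell for the
        rainfed and fully irrigated runs; NaN marks cells without simulation.
    area_noirr, area_firr: harvested areas (ha) per grid cell taken from a
        crop and irrigation mask (assumed to be given).
    """
    production = 0.0
    total_area = 0.0
    for yields, areas in ((yield_noirr, area_noirr), (yield_firr, area_firr)):
        for y, a in zip(yields, areas):
            if math.isnan(y):
                continue  # skip cells that were not simulated
            production += y * a
            total_area += a
    return production / total_area if total_area > 0 else float("nan")


# Two grid cells; the second cell has no irrigated simulation.
regional_yield = aggregate_yield(
    [2.0, 4.0], [6.0, float("nan")], [100.0, 100.0], [50.0, 10.0]
)
```

Keeping noirr and firr as separate inputs mirrors the protocol's choice to deliver them as separate datasets, so that any crop and irrigation mask can be applied afterwards.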

Global Gridded Crop Models

The GGCMs contributing to the GGCMI phase 1 data archive differ in model structure, input requirements, and processes covered and thus have implemented the simulations in different ways. We here first describe each individual GGCM with central references and a short description of the model setup and of how the simulations were conducted. In the following section, we provide tabular overviews of these narratives.

CGMS-WOFOST

General description

CGMS-WOFOST (European Crop Growth Monitoring System with the WOrld FOod STudies crop simulation model) is a spatially distributed version of the WOFOST crop simulation model49,50,51. WOFOST is a mechanistic crop growth model that describes plant growth using light interception and CO2 assimilation as growth driving processes, and crop phenological development as a growth controlling process. The model can be applied using the following two modes: (1) a potential mode, in which crop growth is driven solely by temperature and solar radiation, and in which no growth limiting factors are taken into consideration; and (2) a water-limited mode, in which crop growth is limited by the availability of water. The difference in yields between the potential and water-limited modes can be interpreted as the effect of drought. Currently, no other yield-limiting factors (e.g. nutrients, pests, weeds, farm management) are taken into consideration.

WOFOST has been embedded in the European CGMS that was developed within the framework of the MARS (Monitoring Agricultural ResourceS) project of the Joint Research Centre of the European Commission. CGMS allows the regional application of WOFOST by providing a database framework that handles model input (e.g. weather, soil and crop parameters), model output (crop indicators such as total biomass and leaf area index), aggregation to statistical regions and yield forecasting.

Model setup and protocol

The planting and harvest dates were taken from Elliott, et al.37 for the different crops. For each crop and grid cell a pre-run was executed in order to determine the temperature sum requirements (phenological heat units: PHU) from planting to harvest based on the 30-year AgMERRA (Agriculture Modern-Era Retrospective Analysis for Research and Applications) weather forcings52. This total PHU was then divided into a PHU from sowing to emergence, from emergence to anthesis and from anthesis to maturity based on the ratio of PHU values in the original WOFOST crop parameter files. As a result each grid cell receives its own variety definition in terms of temperature sum requirements for each of the 14 crops. The remaining model parameters were taken from the default WOFOST parameter files for each crop.
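The two-step PHU derivation described above can be sketched as follows. This is an illustration under stated assumptions: the temperature sum is computed as simple degree-days above a base temperature, which is a simplification and not necessarily the temperature response used by WOFOST, and the stage names and default values are hypothetical placeholders for the entries of a crop parameter file.

```python
def total_phu(daily_tmean, t_base):
    """Temperature sum from planting to harvest (assumed: plain degree-days)."""
    return sum(max(0.0, t - t_base) for t in daily_tmean)


def split_phu(phu_total, default_stage_phu):
    """Partition a total PHU over stages in proportion to the default values."""
    default_sum = sum(default_stage_phu.values())
    return {
        stage: phu_total * value / default_sum
        for stage, value in default_stage_phu.items()
    }


# Hypothetical default temperature sums per development stage.
defaults = {
    "sowing_to_emergence": 100.0,
    "emergence_to_anthesis": 900.0,
    "anthesis_to_maturity": 1000.0,
}
grid_cell_variety = split_phu(1500.0, defaults)
```

Because the split preserves the default ratios, the stage values always add up to the grid-cell total derived in the pre-run.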

The cropping calendars provided by GGCMI are derived from regional sources (e.g. FAO), which describe a static growing season for an entire region. However, many of those areas for which a growing season is defined do not have soil types that are suitable for crops. Therefore, grid cells for which a cropping calendar was defined but where soils are unsuitable were not simulated by CGMS-WOFOST. In practice, grid cells were excluded mainly in Northern Africa, Central Australia and Siberia.

Simulations with CGMS-WOFOST were carried out for the AgMERRA and WFDEI.GPCC weather forcings for all years within the range of the forcing set. The crop simulations for the irrigated scenario were executed with the WOFOST model running for the potential production scenario, assuming that crops were irrigated to the extent that no water stress occurs. The crop simulations for the rainfed scenario were executed with the model running in the water-limited production scenario. For the latter scenario, large spin-up periods are not necessary as the model does not include simulation of carbon or nutrient pools. In all simulations, the water balance calculation was started 90 days before the crop sowing date, with soil water initialized at 50% of the water holding capacity (the range between wilting point and field capacity). This allows some time for the water balance to accumulate water as a result of rainfall. Finally, all 14 crops defined in GGCMI were simulated with CGMS-WOFOST for the two weather forcings mentioned above.
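The water balance initialization described above amounts to two small calculations, sketched here for illustration. The function names and the volumetric soil moisture values are assumptions and are not taken from the CGMS-WOFOST code.

```python
from datetime import date, timedelta


def initial_soil_water(wilting_point, field_capacity, fraction=0.5):
    """Soil moisture at a fraction of the range between wilting point and field capacity."""
    return wilting_point + fraction * (field_capacity - wilting_point)


def water_balance_start(sowing_date, lead_days=90):
    """Date on which the water balance calculation begins."""
    return sowing_date - timedelta(days=lead_days)


# Hypothetical soil with wilting point 0.10 and field capacity 0.30 (m3/m3).
start_moisture = initial_soil_water(0.10, 0.30)
start_date = water_balance_start(date(2001, 4, 1))
```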

CLM-Crop

General description

The CLM crop model was developed to improve the fully-coupled simulations of the Community Earth System Model (CESM1) and to help begin answering questions about changes in food, energy, and water resources in response to changes in climate, environmental conditions, and land use within the CESM modeling framework. CLM crop was initially incorporated into the CLM4CN model53 by replacing the unmanaged C3 crop plant functional type (using the C3 photosynthesis pathway), which represented all crops globally, with a small number of interactive managed crops over temperate northern hemisphere latitudes54. The CLM4CN crop model introduced the managed crop types of maize, soybean, and spring wheat (which represented more generally temperate cereals). The new crops reside on their own soil column, independent of the remaining natural vegetation that shares a single soil column. These crops were chosen based on the availability of their corresponding algorithms in the crop model AgroIBIS (agricultural version of the Integrated Biosphere Simulator)55. The main change to CLM4CN was the addition of the AgroIBIS crop phenology and allocation algorithms to the existing algorithms used for natural vegetation. In the CLM version 4.5 of the CLM crop model the standard CLM calculation of the parameter Vcmax25 (photosynthetic capacity at common temperature of 25 °C) was reintroduced to crops, along with new fertilizer management and nitrogen fixation by soybeans56. The fertilizer functionality adds a central U.S. annual crop specific amount of nitrogen directly to the soil mineral nitrogen pool for each crop. In the CLM post-4.5 version of the crop model used in the AgMIP GGCMI studies, additional tropical crops were added, namely sugarcane, rice, cotton, tropical maize, and tropical soybean, using the CLM version 4.5 parameterizations with modified parameter values from Badger and Dirmeyer57. Specifically for sugarcane and tropical maize, the functional form of temperate maize is used because all three are C4 plants (i.e. they use the C4 photosynthesis pathway). For tropical soybean the functional form of temperate soybean is used and for rice and cotton the spring wheat functional form is used.

Model setup and protocol

CLM Crop simulations were configured following the experimental design and initial conditions generated in the recent CLM Crop model investigations by Levis, et al.58. In those simulations the CLM post-4.5 model was spun up for 1050 years with repeated 1900–1920 meteorological forcing and an atmospheric CO2 mixing ratio of 299.7 ppm generated from a previous 20th Century CESM simulation contributed to the CMIP5 (Coupled Model Intercomparison Project Phase 5) effort59. Following the spin up period, a 20th Century simulation was performed from 1901–2005 using transient meteorological forcing and atmospheric CO2 generated from the same CESM simulation as used for the spin up. For AgMIP GGCMI all simulation configurations were started in 1978 with initial conditions provided for CLM crops from the 1978 state in the 20th Century simulation of Levis, et al.58. The CLM Crop AgMIP simulations were performed over the 1978–2010 period for rainfed and irrigated versions of cotton, maize, rice, sugarcane, soy and wheat crops. Temperate and tropical versions of maize and soy were separated by latitude, with tropical versions from 30°S–30°N and temperate versions outside of those latitudes. Meteorological forcing was generated for CLM on a 6-hour time step from the daily values provided by the AgMERRA and WFDEI datasets. The diurnal cycle for each of the reference height forcing variables (downwelling solar radiation, temperature, precipitation, pressure, specific humidity, and wind) was prescribed from the CLM CRU-NCEP (Climate Research Unit and National Centers for Environmental Prediction) 6-hour forcing time series for the same period. For the harmonization simulations the annual nitrogen fertilizer applied was taken from spatially-varying crop-specific values provided by the AgMIP GGCMI protocol rather than the U.S. annual crop specific amount of the default model. Attempts to modify the planting dates and crop phenology were unsuccessful and so were not included in the study.
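The latitude rule that separates tropical and temperate versions of maize and soy can be written as a short sketch. The function name is hypothetical, and treating the 30° boundaries as inclusive is an assumption, since the text does not state how cells exactly on the boundary were handled.

```python
def clm_crop_variant(crop, latitude):
    """Return 'tropical' or 'temperate' for maize or soy at a latitude in degrees north.

    Assumption: cells exactly at 30 degrees N or S count as tropical.
    """
    if crop not in ("maize", "soy"):
        raise ValueError("only maize and soy have tropical and temperate versions")
    return "tropical" if -30.0 <= latitude <= 30.0 else "temperate"


variant_near_equator = clm_crop_variant("maize", 10.0)
variant_high_latitude = clm_crop_variant("soy", -45.0)
```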

EPIC-BOKU

General description

EPIC-BOKU is a global grid-based modelling system based on the EPIC version 0810 model60. It contains routines for simulating crop growth and yield, hydrological, nutrient and carbon cycles, soil temperature and moisture, soil erosion and a wide range of crop management options. EPIC operates on a daily time step and can be used for long-term assessments. Potential plant growth is calculated based on intercepted solar radiation, conversion of CO2 to biomass and vapor pressure deficit. The potential growth is decreased by stresses imposed by temperature, nutrient deficit, salinity, aluminum toxicity, soil strength or aeration deficiency. Temperature stress occurs each day when average temperature exceeds the optimum temperature or falls below the base temperature, and water stress when soil water supply is insufficient to meet the potential plant evapotranspiration. Nutrient stress is calculated based on N (nitrogen) and P (phosphorus) deficit compared to optimal supply. Phenological development is based on daily heat unit accumulation that determines leaf area growth, canopy height, nutrient uptake, harvest index and, optionally, date of harvest. Crop yield is calculated from above-ground biomass and harvest index. EPIC incorporates equations that adjust radiation-use efficiency and evapotranspiration for elevated atmospheric CO2 concentration. The Penman-Monteith method was used to compute potential evapotranspiration.

Model setup and protocol

Global EPIC-BOKU was constructed within the “Global earth observation - benefit estimation: now, next and emerging” project of the Sixth Framework Programme of the European Commission to support integrated land-use modelling at global scale. It is run on a 5 arc-minutes grid by combining Geographic Information System layers on soils, relief, administrative units and a 0.5 arc-degrees (°) weather grid using the approach by Skalský, et al.61. Global crop simulations are performed on total cropland cover (GLC2000) stratified by homogeneous response units at 5 to 30 arc-minutes grid resolution61, resulting in about 103,000 spatial modeling responses for cropland. The spatial modeling responses of crops and crop management variants are integrated in global economic land use models such as GLOBIOM62,63. The crops can be simulated for three management/input systems, which allows the three GGCMI phase 1 configurations to be carried out: (1) automatic nitrogen fertilization, with N-fertilization rates based on crop-specific N-stress levels (N-stress-free days in 90% of the crop growing period) and an upper limit of N application of 200 kg ha−1 a−1; (2) automatic nitrogen fertilization and irrigation, with N and irrigation rates based on crop-specific stress levels (N- and water-stress-free days in 90% of the crop growing period) and upper limits of 200 kg ha−1 a−1 for N and 500 mm a−1 for irrigation; and (3) subsistence farming, with no N fertilization and no irrigation. All crops and management variants are simulated on total global cropland cover.

The GGCMI phase 1 protocol is applied to aggregate the spatial modeling responses of maize, rice, soybeans, and wheat to 0.5° × 0.5° grids. Sowing dates and the length of the growing season were obtained from Sacks, et al.64. Planting and harvesting dates are considered as the earliest possible dates. The planting and harvest operations were automatically postponed if the required PHU had not been accumulated on the given day. PHUs were estimated based on Princeton (default) and WATCH (fullharm and harmnon) historical weather data. The amounts of fertilizer (N, P) as well as the planting and harvesting dates were harmonized according to the GGCMI phase 1 data and protocol.
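The postponement rule for harvest described above can be illustrated with a minimal sketch: the prescribed date acts as the earliest possible date, and the operation waits until the required PHU have accumulated. The function and argument names are hypothetical and the daily heat units are assumed to be given; this is not the scheduling code of EPIC.

```python
def harvest_day(daily_heat_units, earliest_harvest_index, required_phu):
    """Index of the first day, on or after the earliest harvest day, with enough PHU.

    daily_heat_units: heat units per day, starting on the planting day.
    Returns None if the required PHU are never reached within the series.
    """
    accumulated = 0.0
    for day, heat_units in enumerate(daily_heat_units):
        accumulated += heat_units
        if day >= earliest_harvest_index and accumulated >= required_phu:
            return day
    return None


# With 10 heat units per day, 80 PHU are reached on day index 7,
# so a harvest prescribed for day index 4 is postponed by three days.
postponed = harvest_day([10.0] * 10, 4, 80.0)
```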

EPIC-IIASA

General description

EPIC-IIASA is a global grid-based modelling system based on the EPIC version 0810 model60, described above for the EPIC-BOKU model. In contrast to EPIC-BOKU, EPIC-IIASA used the Hargreaves method to calculate potential evapotranspiration, static computing of field water capacity and wilting point, and the Cesar Izaurralde denitrification method65, and did not include water erosion. A detailed description of other differences in the parameterization of fundamental biophysical routines in EPIC-IIASA and EPIC-BOKU is provided by Folberth et al.46.

Model setup and protocol

Sowing dates and the length of the growing season were obtained from Sacks, et al.64 for the default setup, and from the data provided by GGCMI phase 1 for the harmonized simulations. Harvesting dates are considered as the earliest possible dates of harvest. The harvest operations were automatically postponed if the required PHU had not been accumulated on the given day. PHUs were estimated based on Princeton (default) and WATCH (fullharm, harmnon) historical weather data.

The regions of spring and winter wheat were identified based on observed data by Sacks, et al.64, if available. Otherwise, the same rules as in Liu, et al.66 were applied, assuming that spring wheat is grown between 30°S and 30°N and winter wheat in regions with greater latitudes. For maize, a low-yielding cultivar with a harvest index of 0.35 was used in sub-Saharan Africa, while a harvest index of 0.5 was used for other regions. Maize with an optimum temperature of 22.5 °C and a base temperature of 6 °C was used for temperate and colder regions in Europe and Russia, while an optimum temperature of 25 °C and a base temperature of 8 °C was used elsewhere. For rice, the harvest index and biomass-energy ratio were regionally modified based on Xiong, et al.67. All other crop growth parameters were left at the default values, which are based on literature.

In the default setup, crop-specific annual N and P fertilizer application rates were obtained from Balkovič, et al.68 and Mueller, et al.69 for European and other countries, respectively. P fertilizer was applied as a fixed amount together with tillage, while N dosing was triggered automatically based on plant requirements until the annual N application rate was fulfilled70. Irrigation was estimated based on the MIRCA2000 database using the automatic irrigation trigger in EPIC to supply water when the water stress exceeded 10% on a given day, with a maximum annual amount of 2000 mm.

EPIC-TAMU

General description

The EPIC version 1102 model is a further development of EPIC version 0810 described above for the EPIC-BOKU and EPIC-IIASA models, with the same fundamental routines for mechanistically simulating soil-plant-atmosphere dynamics. Additional model capabilities include improved soil water balance methods71, denitrification methods72, perennial crop growth routines73, and soil health impacts of biochar74.

Model setup and protocol

Data were computed for 40,500 pixels for which appropriate weather, soil and crop calendar data were available. A one-year spin up period was simulated in addition to the simulated years of each weather dataset. Planting occurs on the first day following the prescribed sowing date on which soil temperature is at least 2 °C above the 8 °C base temperature. Harvest occurs once the specified heat units are reached. Heat units to maturity were calibrated from the prescribed crop calendar data, and were limited to values between 900 and 3800 to ensure reasonable bounds75,76. Pixels with no prescribed harvest dates were provided a “fast-maturing” crop that would reach maturity at 900 heat units. Fast-maturing pixels were harvested one year following planting if maturity was not reached beforehand. For simulations with full irrigation, a high (0.99) threshold plant stress trigger was used, with any single application of water ranging from 25–100 mm. N, P and K (potassium) were applied on the sowing date based on the prescribed values for each site. For the harmnon runs, additional N was applied when a 0.99 threshold was triggered during the growing season. Additional N was added in increments of 25 kilograms per hectare. The Penman-Monteith method was used to compute evapotranspiration.
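The planting rule and the bounds on heat units described above can be sketched as follows. The function names are hypothetical, and interpreting "following the prescribed sowing date" as strictly after that date is an assumption, not something the text confirms.

```python
def clamp_heat_units(calibrated_phu, lower=900.0, upper=3800.0):
    """Limit calibrated heat units to maturity to the stated bounds."""
    return min(max(calibrated_phu, lower), upper)


def planting_day(soil_temperature, prescribed_sowing_index,
                 base_temp=8.0, margin=2.0):
    """First day after the prescribed sowing day with soil temperature >= base + margin.

    soil_temperature: daily soil temperatures in degrees C.
    Returns None if the condition is never met within the series.
    """
    for day in range(prescribed_sowing_index + 1, len(soil_temperature)):
        if soil_temperature[day] >= base_temp + margin:
            return day
    return None


# Soil first reaches 10 degrees C on day index 3 after sowing is prescribed for day 0.
first_suitable_day = planting_day([12.0, 9.0, 9.5, 10.0, 11.0], 0)
```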

GEPIC

General description

GEPIC is based on the field-scale model EPIC v0810, which calculates potential biomass increase for each day of a defined growing season based on leaf-area index (LAI) and solar radiation, and subsequently reduces the potential to an actual biomass increase using the maximum stress out of water, temperature, nutrients, salinity and aeration as a correction factor. Key parameters for crops are base temperature, maximum temperature, maximum leaf area index and development of LAI over time, as well as an energy-biomass conversion coefficient describing the efficiency of photosynthesis. Besides plant growth, EPIC estimates changes in soil properties and nutrient cycles based on plant and soil management.

Model setup and protocol

Planting dates were estimated similarly to the approach of Waha, et al.77, but with a simplified classification. Grid cells with <5 °C in the coldest month are defined as temperature limited, all other grid cells as precipitation limited. In precipitation-limited grid cells, crops are planted on the first day of the month after which four consecutive months provide the highest precipitation throughout the year. In temperature-limited grid cells, planting dates for summer crops are defined by cumulating PHU starting from the coldest month of the year until a crop-specific germination threshold is met. For winter crops, the same process is carried out backwards from the coldest month of the year.
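The simplified classification and the rule for precipitation-limited grid cells can be illustrated with a short sketch. It assumes that the planting month is the first month of the wettest run of four consecutive months, with the run allowed to wrap around the end of the year; the function names are hypothetical.

```python
def is_temperature_limited(monthly_tmean, threshold=5.0):
    """True if the coldest monthly mean temperature is below the threshold (degrees C)."""
    return min(monthly_tmean) < threshold


def wettest_window_start(monthly_precip, window=4):
    """0-based index of the month that starts the wettest run of consecutive months."""
    if len(monthly_precip) != 12:
        raise ValueError("expected 12 monthly values")
    totals = [
        sum(monthly_precip[(month + k) % 12] for k in range(window))
        for month in range(12)
    ]
    # On ties, the earliest month in the calendar year is returned.
    return max(range(12), key=totals.__getitem__)


# Rainy season from May to August: the planting month is May (index 4).
precip = [10, 10, 10, 10, 50, 80, 90, 70, 20, 10, 10, 10]
planting_month = wettest_window_start(precip)
```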

PHU were estimated from reported current sowing and harvest dates according to Sacks, et al.64 for the default setup and from the datasets provided by GGCMI phase 1 for the harmonized runs based on long-term climate averages.

We found that it is necessary to simulate depletion of soil nutrients in low-input regions like sub-Saharan Africa in order to represent current reported yields satisfactorily, as soils are usually highly depleted under such conditions due to continuous cultivation without sufficient nutrient replenishment and decreasing fallow durations78. An appropriate time-scale was found to be about 30 years. In order to represent this current state of soil depletion in the crop model, each decade was simulated separately with a run-time of 40 years, out of which only the last 10 years are used as a simulation result.

The choice of spring wheat and winter wheat is based on temperature thresholds published by Stehfest, et al.79: winter wheat is planted in grid cells with a minimum temperature in the coldest month of the year of >−10 °C and <5 °C based on decadal monthly means. Winter wheat and spring wheat sowing areas change dynamically throughout the simulation period in each decade.

For maize, a dataset of national human development index for the year 2000 (retrieved from http://geodata.grid.unep.ch) is used for distinguishing low- and high-input countries. Maize with a high potential harvest index (corresponding to current hybrids) of 0.55 is planted in countries with a Human Development Index larger than 0.8 and maize with a lower harvest index of 0.35 (corresponding to local conventional varieties80) in all other countries. For rice, the harvest index and maximum leaf area index were modified based on current literature81. All other crop growth parameters were left at the default values, which are based on literature and field trials76,80,82.
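The threshold rules that GEPIC uses to choose wheat type and maize harvest index can be condensed into a small sketch. The function names are hypothetical, and assigning spring wheat to every grid cell outside the winter wheat range is an assumption inferred from the text.

```python
def wheat_type(coldest_month_tmean):
    """Wheat type from the decadal mean temperature of the coldest month (degrees C).

    Winter wheat for values above -10 and below 5; spring wheat otherwise (assumed).
    """
    return "winter" if -10.0 < coldest_month_tmean < 5.0 else "spring"


def maize_harvest_index(human_development_index):
    """Potential harvest index of maize chosen by national Human Development Index."""
    return 0.55 if human_development_index > 0.8 else 0.35


cell_wheat = wheat_type(0.0)
country_hi = maize_harvest_index(0.9)
```

Because the wheat rule is evaluated on decadal monthly means, re-applying it for each decade gives the dynamically changing sowing areas described above.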

LPJ-GUESS

General description

The model is the crop-enabled version of LPJ-GUESS (Lund Potsdam Jena General Ecosystem Simulator), described in Lindeskog, et al.83. Its implementation bears similarities to LPJmL as described in Bondeau, et al.84, but differs in several important aspects, including not being calibrated to observed country-level yields, a new phenology scheme, and a dynamic calculation of the PHU required for a crop to achieve maturity. Sowing dates are calculated dynamically following Waha, et al.77. The PHU sum needed for full development of a crop in a particular grid cell is calculated using a 10-year running mean of heat unit sums accumulated from the sowing date to the end of a sampling period (ranging from 190 to 245 days) derived from default sowing and harvest limit dates83. There is no differentiation between varieties other than PHU, except for wheat for which either spring or winter sowing varieties are selected, based on prevailing climate. Crops are harvested upon full development. This dynamic variation of PHU to climate effectively assumes a perfect adaptation of crop cultivar to the prevailing climate. N limitation is not explicitly accounted for in this version of the model, which precedes Olin, et al.85.
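The dynamic PHU requirement can be illustrated with a running mean over annual heat unit sums. This sketch assumes a trailing window that is shorter during the first years of a series; how LPJ-GUESS handles the start of the series, and how the annual sums are accumulated over the sampling period, is not specified here, and the function name is hypothetical.

```python
def running_mean_phu(annual_phu_sums, window=10):
    """Trailing mean of annual heat unit sums, one value per year.

    For the first years, fewer than `window` values are averaged (assumption).
    """
    means = []
    for i in range(len(annual_phu_sums)):
        recent = annual_phu_sums[max(0, i - window + 1): i + 1]
        means.append(sum(recent) / len(recent))
    return means


# A warming trend raises the PHU requirement with a lag set by the window length.
requirements = running_mean_phu([1000.0, 1200.0, 1400.0], window=2)
```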

Model setup and protocol

Outputs have been computed for 59,191 pixels covering the entire ice-free land surface. Spin-up was for 30 years using the first 30 years of the input-data timeseries. Spin-up only influences the initialization of the sowing date and dynamic PHU algorithms. A full spin-up of soil carbon pools, as required for standard LPJ-GUESS simulations86 was not required as they do not feedback on crop yields in this model version. Simulations ran uninterrupted for the whole timeseries. Simulations did not consider nitrogen limitation explicitly in this simulation set, so data for the fullharm setting are not available but only the default and harmnon settings.

LPJmL

General description

Simulations with LPJmL (Lund Potsdam Jena managed land) used the latest version available at that time as described previously77,84,87, with the expanded soil implementation as described by Schaphoff, et al.88. The model computes daily gross primary production and autotrophic respiration as a function of intercepted radiation, air temperature and water stress in a mechanistic way and allocates assimilates to the different organs as a function of phenological stage and water stress. Nitrogen dynamics are not considered explicitly in this version, which precedes von Bloh et al.89.

Model setup and protocol

Data have been computed for 67,420 pixels (CRU land mask) with LPJmL from 1951-2099 in a transient simulation run, using a 200-year spin-up to initialize soil water and to bring soil temperatures into equilibrium (natural vegetation and soil carbon pools are neglected here, so the model spin-up simulation could be short), recycling the first 30 years of that time series for the spin-up phase.

National cropping intensities of the default runs have been calibrated to FAO statistics (1996–2000) as described by Fader, et al.87 but with a linear LAI-FPAR model for maize90 and maximum intensity levels for maize at a maximum LAI of 5. The harmnon runs were conducted without calibrated intensity settings but using a maximum LAI of 5 everywhere for all crops except wheat and sugarcane (maximum LAI = 7). The minimum root-to-shoot ratios at maturity were set to 10% based on insights from the AgMIP wheat91,92 and maize93 pilot studies. Sowing dates were computed as described by Waha, et al.77, but were kept constant after 1951. The model decides internally whether to grow winter or spring wheat on wheat areas. It has a preference for winter wheat, but if winters are too long, it will grow spring varieties84.

Simulations with LPJmL did not consider nitrogen limitation explicitly in this simulation set, so data are available only for the default and harmnon settings but not for the fullharm setting. For the harmnon simulations, prescribed planting days were used directly as input. To compute PHU requirements for the parameterization of observed maturity dates per crop and grid cell, the model was run once with the WATCH climate data, prescribing varieties that never mature and recording accumulated PHU on the prescribed maturity date. From this run, the average 1972–2001 PHU requirements were extracted and prescribed in the harmnon runs, so that maturity dates vary between years but on average are consistent with prescribed maturity dates. Variety traits other than those determining the growing season length were not varied in space or time.
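Why a fixed PHU requirement leads to maturity dates that vary between years can be shown with a small sketch. It assumes a simple degree-day accumulation above a base temperature and a growing season that ends within the supplied daily series; the function name and parameters are hypothetical and do not reproduce the LPJmL implementation.

```python
def maturity_doy(daily_tmean, planting_doy, phu_required, t_base):
    """Return the first day of year on which accumulated heat units reach phu_required, else None."""
    accumulated = 0.0
    for doy in range(planting_doy, len(daily_tmean) + 1):
        # Only temperatures above the base temperature contribute heat units.
        accumulated += max(0.0, daily_tmean[doy - 1] - t_base)
        if accumulated >= phu_required:
            return doy
    return None  # requirement not reached within the series
```

With the same requirement, a warm year reaches maturity earlier than a cool year, which is the behavior described for the harmnon runs.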

ORCHIDEE-crop

General description

Simulations with ORCHIDEE-crop (Organising Carbon and Hydrology In Dynamic Ecosystems crop model) have been conducted using an improved version from Wu, et al.94. The improvements include an allocation scheme resolving the source-sink regulation on biomass and yield, an irrigation scheme, and a fertilizer scheme. These updated developments are documented in Wang95.

Model setup and protocol

Simulations were performed over global land grids according to the land-sea mask of each climate input dataset. A one-year spin-up was performed to balance the soil water budget. The length of the spin-up was selected after testing the turnover time needed to balance the water budget of the 11-layer soil hydrology module in the model.

No calibration was made for GGCMI simulations. However, during development of ORCHIDEE-crop, the wheat and maize parameters were evaluated and calibrated against several agricultural eddy flux sites over Europe94. The rice phenology parameters were calibrated against phenological observation networks in China96.

For the fullharm scenario, all input data instructed in the protocol were used. For the default simulation, the nitrogen fertilizer map was derived from a combination dataset of FAO and the International Fertilizer Association97, which is static and crop-specific. For the harmnon scenario, an over-saturation rate of 500 kgN ha−1 was applied in order to eliminate nitrogen constraints.

pAPSIM

General description

The pSIMS platform98 leverages high-performance computing resources at the University of Chicago and Argonne National Lab. It comprises an assortment of survey-based and geospatial data sources, and field-scale crop models, including those based on APSIM99 (referred to as pAPSIM, the parallel version of the Agricultural Production Systems sIMulator), to simulate food, fiber and biomass production systems at high spatial resolution and continental or global extents.

Model setup and protocol

The default set-up was based on fixed planting dates for each grid cell from the Sacks et al. crop calendar64, with additional detail in the conterminous US provided by crop calendar data of the US Department of Agriculture100. All crops were first simulated using a range of cultivar phenology parameters and the cultivar which best reproduced the harvest dates from the Sacks et al. crop calendar64 was selected to be used in the default set-up. For maize, grid cells described in the SPAM2000 (spatial allocation model) dataset101 as “rainfed high input” or “irrigated” were assumed to use high-yielding hybrid cultivars, parameterized with 50% higher max grain number and 10% higher grain filling rate. Fertilizer levels in the default setting were the same as those used in the harmonized scenario (fullharm)37, with half applied at planting and half applied 40 days later. Wheat cultivar groups were selected based on mega-environments102 and then phenology parameters were calibrated as with other crops. Soybean cultivars were selected based on standard maturity groups and were then calibrated to reproduce Sacks et al.64 harvest dates3. In the fullharm and harmnon settings, growing periods of all crops were calibrated in the same manner to reproduce given GGCMI growing periods37.
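The cultivar selection and the split fertilizer application described above can be sketched as follows. This is an illustration only: the function names, the cultivar labels in the usage example and the handling of days of year as plain integers (without wrapping at the end of the year) are assumptions and not part of pSIMS.

```python
def select_cultivar(simulated_harvest_doy, calendar_harvest_doy):
    """Pick the cultivar whose simulated harvest day is closest to the crop calendar."""
    return min(
        simulated_harvest_doy,
        key=lambda name: abs(simulated_harvest_doy[name] - calendar_harvest_doy),
    )


def split_fertilizer(total_kg_ha, planting_doy, delay_days=40):
    """Half of the fertilizer at planting and half delay_days later (day counts may exceed 365)."""
    half = total_kg_ha / 2.0
    return [(planting_doy, half), (planting_doy + delay_days, half)]
```

For example, `select_cultivar({"early": 230, "medium": 250, "late": 275}, 255)` returns `"medium"`, and `split_fertilizer(120.0, 100)` returns two applications of 60 kg ha−1 on days 100 and 140.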

The simulation period was reinitialized each year on January 1st assuming a 50% full soil water profile in each location. Soil dynamics typically stabilized at expected levels before planting, though some caution must be taken for locations with planting very early in the calendar year (e.g. before the end of January). All other crop growth parameters were left at default values.

pDSSAT

General description

The pSIMS platform98 leverages high-performance computing resources at the University of Chicago and Argonne National Lab. It comprises an assortment of survey-based and geospatial data sources, and field-scale crop models, including those based on the Decision Support System for Agrotechnology Transfer (DSSAT) framework (CROPGRO103 and CERES (Crop Environment Resource Synthesis104)) (referred to as pDSSAT, the parallel version of the Decision Support System for Agrotechnology Transfer), to simulate food, fiber and biomass production systems at high spatial resolution and continental or global extents.

Model setup and protocol

The model setup and protocol are identical to those of pAPSIM described above, as both models are run by the same group in the pSIMS environment98.

PEGASUS

General description

PEGASUS (Predicting Ecosystem Goods And Services Using Scenarios Model) combines a radiation use efficiency model to estimate daily photosynthesis and annual net primary production with a surface energy and soil water budget model. In addition, the model uses a dynamic allocation scheme to assign daily biomass production to the different organs of the crop. Thus, crop yield is eventually derived from the amount of carbon contained in the storage organs at harvesting date105. PEGASUS 1.1 simulates crop response to elevated CO2 and effects of extreme temperature events occurring at crop anthesis. A specific heat stress factor is calculated as a function of intensity and duration of extreme temperature events during crop anthesis according to crop specific temperature thresholds106. Farm management practices represented in PEGASUS include irrigation and fertilizer application, decision of planting dates and choice of crop cultivars105.

For the GGCMI phase 1 simulations, PEGASUS version 1.1 was used106.

Model setup and protocol

PEGASUS was calibrated to match average crop yields around the year 2000 of the Monfreda et al. dataset107, using a subset of the WATCH data (6 years from 1997 to 2002). Note that the calibration procedure in PEGASUS entails tuning only one global parameter, the light-use-efficiency coefficient (ε) as described in Deryng, et al.105.

For the default simulations, the calibrated version of PEGASUS from the ISIMIP fast-track3 was used, making use of PEGASUS’ internal algorithm to simulate planting date decision and choice of crop cultivars, as well as fertilizer data as referenced in Deryng, et al.105. This means that the default configuration allows for progressive adaptation of planting dates and choice of cultivars according to annual mean climate conditions.

For simulations of the fullharm and harmnon settings, a newly calibrated version was used, with the same WATCH dataset as climate input and average crop yields around the year 2000 from the Monfreda et al. dataset107, but using the harmonized crop calendar dataset and the harmonized fertilizer application rates as specified by Elliott, et al.37. However, this calibrated version differs only for wheat, for which ε was set to 0.029 mol C m−2 s−1 APAR, instead of 0.027 mol C m−2 s−1 APAR. APAR (mol quanta m−2 s−1) represents the daily average absorbed photosynthetically active radiation. For maize (ε = 0.035 mol C m−2 s−1 APAR) and soybean (ε = 0.011 mol C m−2 s−1 APAR), the same values were used for both default and fullharm versions.

For this set of simulations, climate data were provided in one time-slice so that PEGASUS was run continuously over each time-period, including an initial 4-year spin-up. PEGASUS was run with downwelling longwave radiation input from the WFDEI dataset for AgMERRA and AgCFSR (Agriculture Climate Forecast System Reanalysis) simulations.

PEPIC

General description

PEPIC (Python based EPIC model) is a grid-based EPIC model compiled under the Python environment108. The EPIC model was initially introduced by Williams, et al.109 to evaluate the impacts of soil erosion on soil productivity. EPIC can be used to simulate a large number of soil-water-climate-management processes, for example, weather, hydrology, erosion, pesticide, nutrient, plant growth, tillage, soil temperature, and environmental control109. EPIC simulates crop growth at a daily step based on the concept of energy-biomass conversion. Daily potential biomass increase is the product of intercepted solar radiation and a crop-specific biomass-energy ratio. Several crop growth stresses (water, nutrient, temperature, aeration, and salinity) are considered to reduce the potential biomass to actual biomass. The crop grain yield is estimated by the product of the harvest index and actual biomass accumulation60.
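The energy-biomass conversion concept can be summarized in a few lines. The sketch below is a simplification under stated assumptions: stress factors are taken to range from 0 (full stress) to 1 (no stress), only the most limiting factor is applied on a given day, and all function and parameter names are illustrative rather than taken from the EPIC or PEPIC code.

```python
def daily_potential_biomass(intercepted_radiation, biomass_energy_ratio):
    """Potential daily biomass increase: intercepted radiation times biomass-energy ratio."""
    return intercepted_radiation * biomass_energy_ratio


def daily_actual_biomass(potential, stress_factors):
    """Reduce potential growth by the most limiting stress (assumption: factors in 0..1)."""
    return potential * min(stress_factors.values())


def grain_yield(accumulated_biomass, harvest_index):
    """Yield as the product of harvest index and actual biomass accumulation."""
    return harvest_index * accumulated_biomass
```

In this sketch, a day with a potential increase of 40 units and a water stress factor of 0.5 contributes 20 units to the accumulated biomass, and a harvest index of 0.5 converts 1000 units of biomass into 500 units of yield.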

Model setup and protocol

In PEPIC, the whole study domain is first categorized into a number of subareas depending on the study purposes (e.g. administrative boundaries, climate regions, watersheds). Input data need to be specified for each grid cell with a spatial resolution of 30 arc-minutes. After simulations are completed for all grid cells, PEPIC extracts the results and presents the spatial distribution of desired variables for a given time period. Irrigated and rainfed crop cultivations are simulated separately. To get combined outputs for each grid cell, values from irrigated and rainfed cultivation were aggregated using an area-weighted averaging method.
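The area-weighted combination of irrigated and rainfed results for one grid cell can be written as a short function. The names are hypothetical; the sketch only assumes that the two areas are given in the same unit and returns no value where the crop has no area.

```python
def combine_irrigated_rainfed(value_irrigated, value_rainfed, area_irrigated, area_rainfed):
    """Area-weighted average of irrigated and rainfed values for one grid cell."""
    total_area = area_irrigated + area_rainfed
    if total_area == 0:
        return None  # no cropland of this crop in the cell
    return (value_irrigated * area_irrigated + value_rainfed * area_rainfed) / total_area
```

A cell with 25 ha irrigated at 8 t ha−1 and 75 ha rainfed at 4 t ha−1 would thus be reported with a combined yield of 5 t ha−1.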

Potential heat units are calculated with a PHU calculator from the SWAT (Soil Water Assessment Tool) website (https://swat.tamu.edu/software/), with input of planting date, growing season length, and monthly minimum and maximum temperature. In the simulation, different PHUs have been computed for each weather forcing dataset. For default setup, crop calendar data (planting and harvesting dates) were derived from Sacks, et al.64, and N and P fertilizer from FertiSTAT (database for statistics on fertilizer use by crop)110 were used. For the harmonized setups (fullharm and harmnon), crop calendar, N, P and K fertilizer from GGCMI were used37.

For the simulation forced by each weather forcing dataset, 20 years were treated as model spin-up period. Automatic irrigation was used for irrigated cultivation with sufficient water supply (maximum value of 1000 mm). For default and fullharm scenarios, P was applied directly prior to planting and N was applied three times based on input data: first time before planting, second time one month after germination, third time two months after germination. One third of N inputs were applied for each application. For the harmnon scenario, N was applied automatically based on crop N requirement, with a stress trigger of 0.99 and sufficient N inputs. Similar to N fertilization under the harmnon scenario, P inputs were also determined by the model without limitation.

For cultivars of wheat and maize, PEPIC adopted the same approach as GEPIC (Section 2.2.6) to distribute cultivars globally. Rice and soy used the default parameters from the EPIC model.

PRYSBI2

General description

The PRYSBI2 model (Process-based Regional-scale crop Yield Simulator with Bayesian Inference version 2.1) is a semi-process-based large-area crop model for major crops: maize, soybeans, wheat, and rice. Daily crop biomass growth and resulting crop yields are calculated for each global grid (1.125° in latitude and longitude). The daily biomass growth is calculated according to photosynthetic carbon assimilation based on the enzyme kinetics model (i.e. Farquhar model111). A sun/shade model112 is used for the calculation of intercepted solar radiation. The soil water balance is calculated by the SWAT model113. The crop development is calculated via PHU, as in the EPIC model. Daily temperature affects crop growth through mainly the changes in phenology, photosynthetic rate, and evapotranspiration rate. Daily precipitation affects crop growth through water stress calculated according to the SWAT model. Crop yield is calculated from above-ground biomass and a harvest index. The model (version 2.0) is described by Sakurai, et al.114.

We refer to this model as “semi-process-based” because the model parameters relevant to the past technological trend (i.e. past changes in nutrient input, crop variety, degree of irrigation etc.) were inversely estimated using historical crop yield data115 for each spatial grid cell using Markov Chain Monte Carlo methods. As such, the processes of fertilizer input and irrigation are not explicitly included but are part of the inverse parametrization.

The model version used here is 2.1. Compared with version 2.0114, mainly the following processes were changed:

1. The big leaf model was replaced by a sun/shade model112.

2. The calibrated technological factor no longer affects final biomass, but now affects daily biomass growth.

3. The estimated parameter set has been re-calibrated.

The PRYSBI2 model used here should not be confused with PRYSBI1 (an older version)116 which has a fundamentally different model structure.

The PRYSBI2 output was interpolated to the requested 0.5° resolution from its original 1.125° resolution at which simulations were conducted. This means that the output of a 0.5° grid was the same as the 1.125° grid in which the 0.5° grid was included. If a 0.5° grid straddled multiple 1.125° grids, the average value of these 1.125° grids was used. PRYSBI2 data do not distinguish irrigated and rainfed production as irrigation is subsumed in the technology factor as described above.
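The regridding rule described above can be sketched for a regular grid. The sketch assumes that both grids share the same origin and are aligned with the axes, which may not hold for the actual PRYSBI2 grid; it also ignores missing values. All names are illustrative.

```python
import math

COARSE = 1.125  # native resolution in degrees
FINE = 0.5      # requested output resolution in degrees


def overlapping_indices(fine_index):
    """Indices of the coarse cells overlapped by one fine cell along one axis."""
    lower = fine_index * FINE
    upper = lower + FINE
    first = math.floor(lower / COARSE)
    last = math.ceil(upper / COARSE) - 1
    return list(range(first, last + 1))


def regrid_value(coarse, fine_row, fine_col):
    """Value of a fine cell: the containing coarse cell, or the plain mean of all straddled cells."""
    rows = overlapping_indices(fine_row)
    cols = overlapping_indices(fine_col)
    values = [coarse[r][c] for r in rows for c in cols]
    return sum(values) / len(values)
```

A fine cell that lies entirely within one coarse cell inherits its value unchanged, whereas a fine cell straddling two or four coarse cells receives their unweighted average.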

Model setup and protocol

The parameters relevant to the technological factor, including the temporal change rate of the technological factor and irrigation, were inversely estimated using historical crop yield data117 for each grid cell and crop using the DREAM (DiffeRential Evolution Adaptive Metropolis) algorithm118. The number of Markov Chain Monte Carlo steps was set to 50,000 for each grid cell. This large amount of calculation (about 3 × 10^9 simulations in total) was executed on the supercomputer system of the Japan Agency for Marine-Earth Science and Technology (JAMSTEC).

In PRYSBI2, the parameter values for grid cells without reference data were extrapolated using the relationship between (1) the parameter values estimated at the grid cells for which reference data exist and (2) environmental factors, such as elevation, harvested area, latitude, longitude, irrigated area119, planting day64, and the value of gross domestic product.

The dataset of Sacks, et al.64 was used for parametrizing the planting date in the default setup. The parameter set that has the maximum likelihood to reproduce observed yield dynamics115 for each grid and each crop was used for the default run. The simulation was set up to include one spin-up year before the first year of the simulation, using the weather data of the first simulation year. No other spin-up procedure was conducted, which was the same setting as in the calibration procedures (to reduce calculation time).

GGCM configurations, calibration and evaluation

We distinguish two GGCM types: (i) site-based process models, and (ii) ecosystem models (Table 2). In addition to the models’ main characteristics (Table 2), Tables 4,5,6 provide overviews of agricultural practices and inputs used (Online-only Table 1), the most important biophysical processes implemented (Online-only Table 2), and calibration procedures (Table 4). Site-based models have typically been calibrated at field-scale level in previous model applications. Some of the site-based models were also calibrated at national scale, especially those EPIC models used to provide data to global or national economic analyses (EPIC-BOKU62,120,121, EPIC-IIASA122). Ecosystem models were either calibrated at national scale (LPJmL, PEGASUS) or not at all. An exception is PRYSBI2, which was extensively calibrated at grid-cell level. Generally, calibration of global-scale crop models is complicated by the lack of high-quality data and the absence of data on any aspect other than yield. Furthermore, calibration does not substantially improve model skill in global-scale applications, other than improving the reproduction of spatial patterns by imposing management-driven differences in yield levels in the calibration process38.

Table 4 Model calibration, evaluation, parameters, scale and methods.
Table 5 Weather datasets used to drive simulations in GGCMI phase 1.
Table 6 Filename conventions for standardized model outputs.

GGCMs have been evaluated in various forms: individually at field and global scale (see examples in Table 4, but note that this list is far from exhaustive) or in model intercomparison exercises also at field91,92,93 and global scale38. Aspects other than yield have not been evaluated in GGCMI, even though some models have been assessed also for other output variables65,72,83,94,123,124,125,126,127,128,129,130.

Input data

All input data that have been supplied to modelers for the simulations have been described by Elliott, et al.37 and are available for download at http://www.rdcep.org/research-projects/ggcmi. In addition to the nine weather datasets listed by Elliott, et al.37, modelers also supplied simulations for an updated version of the Princeton data (PGFv2) that spans 1901 to 2012 as well as the GSWP3 (http://hydro.iis.u-tokyo.ac.jp/GSWP3/) dataset, which has been supplied by ISIMIP phase2a (Online-only Table 1). The complete set of weather datasets to drive the crop models thus comprises eleven historical datasets that are based on retrospective datasets and nine of these have been bias-corrected against different observation-based products, including CRU and GPCC (Global Precipitation Climatology Center). This broad set of input data is meant to cover the uncertainty introduced from different reanalysis products and different bias-correction methods. An analysis of the role of different weather input data for GGCMs’ skill to reproduce historic yield variability is still pending.

All weather variables are bias-corrected individually, and against different data products. The 2-m temperature is typically bias-corrected against different versions of the CRU dataset, but precipitation can be bias-corrected against CRU, GPCC or other targets. The WFDEI bias-correction provides 2 sets, in which only the precipitation bias-correction differs131, denoted as WFDEI.CRU and WFDEI.GPCC, respectively. No other subversions of weather forcing datasets are included or used here (Table 5). With this approach to bias-correct individual weather variables, the physical consistency between variables is not necessarily maintained. As such, it also seems acceptable to supplement weather variables from one dataset to another, if not supplied by the latter. This is the case for downwelling long-wave radiation, which is not used by all GGCMs but only by some (Table 4) and which is also not supplied by all datasets (Table 5). Additionally, not all weather variables have been bias-corrected in the different weather datasets. For some, bias-correction targets are non-existent (e.g. wind speed); for others, the authors of the bias-corrected datasets decided not to bias-correct all variables, such as 2-m temperatures in the WATCH dataset, which were only corrected for elevation after interpolation132. In contrast, in the WFDEI datasets 2-m temperatures were corrected to CRU temperature averages and diurnal ranges. All bias-correction was applied at the monthly level.

Soil data were not supplied to modelers, who were requested to use their own soil input data. Acknowledging the importance of soil information for crop yield simulations45, it was not possible to harmonize soil parameters within phase 1 across the different GGCMs, given the diversity in soil input requirements (number of layers; chemical, physical, biological and specific parameters or variables per layer).

Data Records

Data format

Data come in netCDF4 (network Common Data Form 4) files, with a naming convention as in Elliott, et al.37, using only lowercase letters in file names, but properly capitalized letters in subfolders. Each file contains only a single output variable. Files are named following the GGCMI convention37 (Table 6):

[model]_[climate]_[clim.scenario]_[sim.scenario]_[variable]_[crop]_[timestep]_[start-year]_[end-year].nc4

In the data archive, each model has its own subfolder (proper capitalization of model names), which includes a subfolder for each climate dataset simulated, which again contain subfolders for each simulated crop, using the long crop name rather than the abbreviation used in the file name (Table 3).
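A file name following this convention can be split into its elements with a few lines of code. The sketch below is not part of the GGCMI tooling. It reads the first three and the last five elements from both ends of the name and treats whatever remains in the middle as the simulation scenario, which is an assumption made so that scenario labels containing an underscore would also be handled. The file name in the usage note is a hypothetical example.

```python
LEFT_FIELDS = ("model", "climate", "clim_scenario")
RIGHT_FIELDS = ("variable", "crop", "timestep", "start_year", "end_year")


def parse_filename(name):
    """Split a GGCMI-style file name into a dictionary of its elements."""
    if not name.endswith(".nc4"):
        raise ValueError("expected a file name ending in .nc4")
    parts = name[:-len(".nc4")].split("_")
    if len(parts) < len(LEFT_FIELDS) + 1 + len(RIGHT_FIELDS):
        raise ValueError("too few elements in file name")
    info = dict(zip(LEFT_FIELDS, parts[:3]))
    # Assumption: everything between the fixed elements is the simulation scenario.
    info["sim_scenario"] = "_".join(parts[3:-5])
    info.update(zip(RIGHT_FIELDS, parts[-5:]))
    info["start_year"] = int(info["start_year"])
    info["end_year"] = int(info["end_year"])
    return info
```

For the hypothetical name `lpjml_agmerra_hist_default_yield_mai_annual_1980_2010.nc4`, the function returns the model `lpjml`, the simulation scenario `default`, the crop abbreviation `mai` and the years 1980 and 2010 as integers.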

Data availability

Data are available at https://zenodo.org/ (see Online-only Table 4). The GGCMs have provided different output sets, covering different climate datasets, crops (Online-only Table 3), and output variables (Table 7). The overall GGCMI phase 1 dataset at https://zenodo.org/ is structured by GGCM and crop, which have been published as 86 individual packages (Online-only Table 4). Given that models have submitted data for different crops, there are 13 individually published datasets for CGMS-WOFOST135,136,137,138,139,140,141,142,143,144,145,146,147, 6 for CLM-Crop148,149,150,151,152,153, 4 for EPIC-Boku154,155,156,157, 4 for EPIC-IIASA158,159,160,161, 2 for EPIC-TAMU162,163, 4 for GEPIC164,165,166,167, 15 for LPJ-GUESS168,169,170,171,172,173,174,175,176,177,178,179,180,181,182, 13 for LPJmL183,184,185,186,187,188,189,190,191,192,193,194,195, 4 for ORCHIDEE-crop196,197,198,199, 4 for pAPSIM200,201,202,203, 6 for pDSSAT204,205,206,207,208,209, 3 for PEGASUS210,211,212, 4 for PEPIC213,214,215,216, and 4 for PRYSBI2217,218,219,220. All data are published under the Creative Commons Attribution 4.0 International (CC BY 4.0) license.

Table 7 Output variables supplied by GGCMs for all simulation sets that these have provided (Online-only Table 3).

Technical Validation

All data submitted to the GGCMI phase 1 were tested by a set of quality check scripts. Data were tested for compliance with data formats, checking units (Table 7), variable naming (Table 7), file naming (Table 6), and space and time dimensions. Formatting errors led to rejection from the database. Statistics on data ranges and on spatial coverage with valid data points were reported to modeling groups, so that they could check and decide whether the simulation data needed fixing.
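The kind of summary statistics reported back to modeling groups can be illustrated with a minimal sketch. It does not reproduce the actual quality check scripts; the function name, the treatment of `None`, NaN and an optional fill value as missing, and the flat list of values are assumptions.

```python
def range_and_coverage(values, fill_value=None):
    """Minimum, maximum and fraction of valid entries; None, NaN and fill_value count as missing."""
    valid = [v for v in values
             if v is not None and v != fill_value and v == v]  # v == v is False for NaN
    coverage = len(valid) / len(values) if values else 0.0
    if not valid:
        return {"min": None, "max": None, "coverage": coverage}
    return {"min": min(valid), "max": max(valid), "coverage": coverage}
```

Users of the archive may find a similar check useful as a first plausibility test before analyzing a variable.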

Usage Notes

The GGCMI phase 1 simulations were conducted with the objective of achieving as much global spatial coverage for all crops as possible. As such, crops are also simulated in many regions where these crops are not currently grown or cannot currently be grown. Growing season data were supplied for as large an area as possible37, with the intention to harmonize across models but not necessarily to suggest that cropping is possible during these periods. Moreover, management, soil and/or weather data at any given site may differ from conditions assumed for the corresponding grid cell, and results should mainly be analyzed for larger spatial entities rather than individual sites. Any aggregation or analysis of these data should consider this caveat and either mask currently cropped areas with crop- and irrigation-specific masks133,134 or handle and interpret these data with the necessary caution. Since aggregation masks can affect results47, these should be selected carefully to fit the intended purpose.
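Masking and aggregation with a crop-specific area dataset can be sketched as follows. The sketch assumes that simulated yields and harvested areas are available as two equally ordered lists of grid-cell values; it is one possible aggregation and not the procedure used in any particular GGCMI analysis. The function name is hypothetical.

```python
def masked_mean_yield(yields, harvested_area):
    """Area-weighted mean yield over cells with valid data and a positive harvested area."""
    weighted_sum = 0.0
    total_area = 0.0
    for y, area in zip(yields, harvested_area):
        if y is None or area is None or area <= 0:
            continue  # cell is masked: no simulation output or crop not currently grown
        weighted_sum += y * area
        total_area += area
    return weighted_sum / total_area if total_area > 0 else None
```

Because cells without harvested area receive zero weight, simulated yields in regions where the crop is not currently grown do not influence the aggregate.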

Almost all data analyses already conducted focused on crop yields for which models have been evaluated individually and jointly38. All other output variables have not been evaluated in this context. Generally, all data from the GGCMI phase 1 archive should be subjected to plausibility checks. Analyses that are sensitive to outliers should test for extreme values that are likely to exist in rare cases. It is also advisable to generally assess the range of simulated data when conducting analyses with these data, which can provide an indication of the embedded uncertainty.

Despite the semi-automated quality control scripts that tested spatial coverage and data ranges of values in submitted files, not all errors in the output files provided by the modelers could be identified and/or corrected. All issues that were identified after utilization of the data in other publications as well as the corrections applied are described here. Simulation outputs of LPJ-GUESS and LPJmL were initially reported with an erroneous grid definition, in which all grid cells were shifted. In the LPJ-GUESS results, all pixels were shifted one grid cell eastward and northward; in the LPJmL data, all pixels were shifted one grid cell northward. These erroneous data were used in the analyses of Müller, et al.38 and Porwollik, et al.47, but corrected versions were used for Frieler, et al.39 and Schauberger, et al.41. The data from pAPSIM and pDSSAT do not cover the full land surface, as the simulations were conducted with an incomplete land mask, missing part of the eastern coastlines. Data for the missing grid cells are not available and could not be supplied at a later stage. The output variables on growing season weather conditions (sumt, gsrsds and gsprec, Table 7) were not sufficiently clearly defined in the protocol37 and have thus been reported in an inconsistent manner. Outputs of pAPSIM and pDSSAT report average daily values, whereas the other models report total growing season sums. LPJ-GUESS and LPJmL results for sumt have not included negative temperatures (°C) but only reported values above the crops’ base temperatures, whereas the other models included all values. Users are advised to compile their own growing season climate indicators using the weather input data (Table 5) and data on sowing and maturity dates (Table 7). CGMS-WOFOST provided wrong file names and dimensions for WFDEI.GPCC: the files run until 2012 instead of 2010 and contain 2 empty elements for the last 2 time steps.
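A consistent set of growing season indicators can be compiled from daily weather data along the following lines. The sketch assumes continuous daily series, a 0-based index of the sowing day and a growing season length in days derived from the reported sowing and maturity dates. The optional `t_base` argument reflects one possible reading of the base-temperature convention (summing degrees above the base temperature). All names are illustrative.

```python
def growing_season_indicators(tmean, precip, sowing_index, season_days, t_base=None):
    """Growing season sums and daily means of temperature and precipitation."""
    end = sowing_index + season_days
    temps = tmean[sowing_index:end]
    rain = precip[sowing_index:end]
    if t_base is None:
        sumt = sum(temps)  # all daily values, including negative temperatures
    else:
        sumt = sum(max(0.0, t - t_base) for t in temps)  # degrees above base only
    return {
        "sumt": sumt,
        "gsprec": sum(rain),
        "mean_t": sum(temps) / len(temps),
        "mean_prec": sum(rain) / len(rain),
    }
```

Reporting sums and daily means side by side makes the two conventions found in the archive directly comparable, since a sum equals the daily mean multiplied by the growing season length.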