A Bio-Economic Crop Yield Response (BECYR) Model for Corn and Soybeans in Ontario, Canada for 1959–2013

This paper presents estimates of the effects of changing climate on crop yields for grain corn and soybeans in Ontario, Canada, for 1959–2013. We were able to use a database that is more comprehensive with respect to explanatory variables than some previous efforts had available. Our model includes climate variables, prices, land quality, groundwater level, CO2 concentration, and a time trend. Our results indicate that trends in temperature and precipitation during our study period have not yet resulted in appreciable threats to crop yields in the region.

till July/August. Unfortunately, measurements of CO 2 concentration in Ontario are only available for the period from 2005 to 2016. However, CO 2 data in Mauna, Hawaii, are available back to 1958. But, the seasonality of the Hawaiian data is different from the Ontario data. The CO 2 in Hawaii peaks in May (Fig. 1) and drops from June to October. Then, it increases slowly from October to June. And the range of CO 2 variation in Ontario is larger than in Hawaii. The difference in seasonal variation might be due to the local climate. Hawaii is located in the tropics and the temperature is more uniform through the year compared to Ontario 15 . So, photosynthesis happens all the year round in Hawaii but is limited during the winter in Ontario.  Table S6 for the estimated results and Fig. S2 for the residual plot in the CO 2 regression model.
In the absence of Ontario data for our study period and given the lack of seasonal synchronization of the Hawaiian and Ontario data, we elected to construct a proxy of historical Ontario CO 2 concentration for the period from 1959 to 2004. Figure 1 shows the actual CO 2 concentration in Ontario and Hawaii as well as the proxy of CO 2 concentration in Ontario. The proxy of CO 2 concentration and actual CO 2 concentration in Ontario has a similar pattern of seasonal variation. Figure 2 shows the residuals from the regression. Those residuals are generally small, so the fitted CO 2 presents the actual CO 2 well.
Most studies treat the effect of elevated CO 2 on crop yields as linear [16][17][18][19] , while recent studies find a nonlinear effect of elevated CO 2 on plant biomass [20][21][22] . Idso 23 reported that the CO 2 effect on sugar cane yield is nonlinear. But no study has found a nonlinear effect of CO 2 on corn and soybean yields (Table 1). This is important to investigate because the prediction of future crop yields might be biased if CO 2 effect is not correctly modeled. To compare the linear and nonlinear CO 2 effects, we apply three versions of our models: 1) No CO 2 effect, 2) Linear CO 2 effect, and 3) Quadratic CO 2 effect.

Results and Discussion
crop yield response to a pest dummy variable. There was an outbreak of soybean aphids in Ontario in 2001. To measure the effect of this pest event on yields, our soybean models include a dummy variable for the year 2001. Table 2 shows the estimated effects of weather, physical, and economic variables on grain corn and soybean yields. Table 2 shows that the dummy of 2001 has a statistically significant negative effect on soybean yields as expected. Soybean yield was reduced by about 30% in 2001 due to the outbreak.
Crop yield response to technology and CO 2 . Most studies treat the effect of elevated CO 2 on crop yields as linear [16][17][18][19] . Table 3 summarizes some of these results for corn and soybeans and compares them with our results  from Table 2. For corn, we find that yield increases by 0.04% per ppm (±0.8% per ppm at 95% C.I.) of CO 2 . Past studies 17,18 report that the changes in yields for corn range from −0.10% to 0.27% per ppm of CO 2 . For soybeans, we find that yield increases by 0.32% per ppm (±1.26% per ppm at 95% C.I.) of CO 2 . This increase is higher than previously reported results, which range from −0.00012% to 0.28% per ppm of CO 2 16-19 . But our confidence interval covers this range. We also find that the linear CO 2 effect on soybean is larger than that for corn. This is consistent with current crop physiology theory, which suggests that increasing CO 2 is more beneficial for C 3 crops than C 4 crops 18,20 .
Our results from a quadratic CO 2 effect, however, tell a more complex story. We find that corn yield decreases until CO 2 reaches 353 ppm, which was the average concentration during the growing season that occurred in 1994. Then, corn yield starts to increase at an increasing rate with CO 2 concentration . Reich et al. 20 also find similar nonlinear CO 2 effect. However, our quadratic CO 2 effect on corn yield is larger than the effect reported by Reich et al. 20 . The decrease of yields with higher CO 2 could be a results of multicollinearity problem between the time trend and CO 2 variables. Even though the nonlinear effect of CO 2 on crop yields is found in recent studies, the nonlinear effect of CO 2 is still a question in the field of plant physiology. Reich et al. 20 indicated that the plant biomass response to elevated CO 2 might be related to the soil N availability, but the explanation of this possible relationship is unknown yet. More researches are needed to answer this question.
For soybeans, we find that yield decreases until CO 2 reaches 360 ppm, which is the level that occurred in Ontario in 1998. Then, soybean yield starts to increase at an increasing rate with CO 2 concentration. Previous literature has reported inconsistent findings of a nonlinear effect of CO 2 on soybeans 20,21 . So, we have elected to use a linear CO 2 effect model in our results.
We acknowledge that collinearity between our proxy variable for CO 2 and our time trend means that it is difficult to separate the individual contributions of these variables on crop yields. The correlation coefficient of these two variables is 0.99. For corn, the Variance Inflation Factor (VIF) of time trend is 19 and the VIF of CO 2 is White circles represent the fitted monthly CO 2 in Ontario. Colored circles represent the actual monthly CO 2 in Ontario. The distance between the white and colored circles are the residuals in the CO 2 regression in Table S6, that is, the difference between actual CO 2 and fitted CO 2 . The color of the colored circles represents the value of the residuals. Red means positive, and purple means negative. Larger residual has more bright color. www.nature.com/scientificreports www.nature.com/scientificreports/ 33. For soybeans, the VIF of time trend is 18 and the VIF of CO 2 is 33. All the above values indicate that there is a multicollinearity problem between time trend and CO 2 variables. So, we cannot compare the individual contributions of our proxy variable for CO 2 and the time trend on crop yields.
Crop yield response to weather variables. Our results ( Table 2) indicate that corn and soybean yields respond to the precipitation before the growing season in different ways. Corn yield decreases with more precipitation before the growing season while soybean yield increases. Precipitation before the growing season can impair the timeliness of tillage operations and reduce soil temperature 24 . This can delay corn planting and germination 24 . Soybeans can be planted later than corn in Ontario without incurring a yield penalty. So excess moisture in the spring can shift planted area away from corn and toward soybeans and it can reduce yields in the land that is planted with corn. In average, the increasing precipitation before the growing season reduces 0.00045% of corn yield per year and increases 0.0017% of soybean yield per year.
A longer growing season leads to an increase in precipitation during the growing season and also an increase in degree days during the growing season because there are more days for which precipitation and temperature are measured. Degree days during the growing season is the accumulation of the positive difference between daily average temperature and base temperature (10 °C) during the growing season. According to the plant physiology theory, when plants received either insufficient or excess moisture and heat, the physiology processes of plant would be impeded [24][25][26] . So, when crops receive insufficient moisture or heat, we expect crop yields increase as more precipitation and degree days during the growing season. But, when crops receive excessive moisture and heat, we expect crop yields decrease as more precipitation and degree days during the growing season. We found that corn and soybean yields increased at a decreasing rate as the precipitation and degree days during the growing season increased (Table 2). Given the quadratic functional form in our model, these results imply that there is a level of growing season precipitation and growing season degree days at which yields reach a maximum with respect to each variable. We call these maxima peak yields. Peak yields with respect to water and heat vary across years due to interaction effects ( Table 2). To explore if recent precipitation and degree days had exceeded the water and heat levels associated with peak yields, we elected to study the time period from 2009-2013.
For our linear CO 2 effect model, the peak yield with respect to precipitation is reached when the precipitation during the growing season reaches 599 mm per season for corn and 655 mm per season for soybeans. The peak yield with respect to degree days is achieved when degree days during the growing season reaches 1507 °C days for corn and 1651 °C days for soybeans. The average precipitation and degree days during the growing season for 2009-2013 were 491 mm and 1269 °C respectively. For the most recent 5 years in our data set, it appears that climate has not yet resulted in appreciable threats to yields of corn and soybeans in Ontario. In average, the increasing precipitation and degree days during the growing season increase 0.084% per year in corn yield per   www.nature.com/scientificreports www.nature.com/scientificreports/ year and increase 0.13% per year in soybean yield (Table 4). Climate projections for Ontario suggest a wetter and warmer growing season along with significant changes in return rates for extreme events for 2020-2070 12,27-29 . More frequent extreme events could be an important driver of future yields. Our ongoing research is investigating this possibility through spatial stochastic simulation modeling.
The literature on the effects of climate change on crop production in Ontario is inconclusive. Brklacich and Smith 30 reported that increased future variation in climate might reduce crop production in Southern Ontario. However, Weber and Hauer 31 argued that Ontario agriculture would benefit from projected changes in climate. Cabas et al. 9 found that increased variation in temperature and precipitation might decrease crop yields in Ontario for 1981-2006. But they indicated that the longer growing season might offset the negative effect and result in higher crop yields. Our results suggest that projected changes in climate do not currently pose a threat to corn and soybean yields in Ontario. In the United States, Schlenker and Roberts 1 found that climate change threatens corn, soybean and cotton yields for 1950-2005. Miao et al. 3 found that climate change negatively affects corn and soybean production in the United States for 1977-2007. They also indicated that the omission of economic variables might have led to the overestimation of the effect of climate change on corn and soybean yields. Recently, Butler et al. 14 found that the longer growing season and decreased peak temperatures during the growing season have been beneficial for corn yields in the U.S since 1981. crop yield response to economic variables. Agricultural economists have argued that crop yields are influenced by output and input prices 8 . However, price variables have been omitted from several recent studies on the effects of climates on crop yields ( Table 1). The effect of the crop price on yields is theoretically ambiguous. Higher expected crop prices might induce farmers to use purchased inputs like fertilizer more intensively, increasing yields. But a higher expected crop price might also induce farmers to increase area planted to that crop, expanding production onto less suitable soils, reducing yields. A higher fertilizer price would result in less intensive fertilizer use and lead to a reduction in yields 3,8 . However, increased fertilizer prices might cause farmers to shift planted area having lower corn yield away from corn and into soybeans 3 . Thus, the remaining land with high corn yield might increase the average corn yield 3 . Therefore, the effect of fertilizer price on yields is theoretically ambiguous.
As corn and soybeans are planted in the spring, farmers do not know the crop prices that they will receive until after they plant. We used the crop price lagged one-year in our models to represent farmers' expectations. We find that corn and soybean yields increase at a decreasing rate as crop prices increase ( The biomass of C 4 grasses was not markedly raised at elevated CO 2 (+180ppm) in the first 12 years, but markedly raised in the subsequent 8 years.
The biomass of C 3 grasses was markedly raised at elevated CO 2 (+180ppm), but not in the subsequent 8 years  www.nature.com/scientificreports www.nature.com/scientificreports/ that decreasing of crop price lagged one-year decreases 0.22% per year in corn yield and decreases 0.33% per year in soybean yield (Table 4).
We used the current year index of fertilizer price as the input price in our models. We find both positive and negative effects of fertilizer price on crop yields for our study period. We find that when the fertilizer price index is lower than approximately 160 (100 = dollars in year 2002), corn and soybean yields decrease at a decreasing rate as the fertilizer price goes up ( Table 2). When the fertilizer price index is higher than approximately 160 (100 = dollars in year 2002), crop yields increase at an increasing rate as the fertilizer price gets higher ( Table 2). In average, decreasing fertilizer price increases 0.022% per year in corn yield and 0.065% per year in soybean yield (Table 4).

conclusion
This paper builds on previous research on the effects of climate on corn and soybean yields in Ontario, Canada. We constructed a database that is more comprehensive with respect to explanatory variables included in the estimation. Our model includes climate variables, prices, land quality, groundwater level, CO 2 concentration, and a time trend. We use a unique annual panel dataset for 29 counties in Ontario for the time period from 1959 to 2013. Our results indicate that historical trends in temperature and precipitation have not yet resulted in appreciable threats to crop production in Ontario, Canada. However, the prospect of more frequent extreme climate events calls for further research on the effect of climate on crop yields. We are undertaking spatial stochastic simulation modeling to investigate this possibility.
A growing body of evidence suggests that changes in climate are or will constitute a threat to crop production in some major crop producing regions of the world 11,32,33 , including the United States 1-3 . But the effect of changing climate on crop production is regional specific. Our research as well as previous work at the national level in Canada 31 suggest that historical trends in temperatures and precipitation have not yet had an adverse effect on crop yields in Canada. Weber and Hauer 31 found that all provinces in Canada would benefit from climate change. The above discussion suggests that for a temperate climate country like Canada, threats to crop production from climate have yet to materialize. This suggests the possibility that climate is causing changes in regional or national comparative advantage in agriculture. This may have important implications for future agricultural trade and global food security.
Past studies 11,34,35 use a fixed fertilization effect of CO 2 from experimental studies to predict future crop production. However, the fertilization effect of CO 2 may change by time period, crops, and studied locations. Long-term CO 2 -plant experimental data are not easy to obtain. Our BECYR model constructs a synthetic CO 2 proxy. This method contributes the estimation of crop yield model when actual long-term CO 2 data are unavailable.

Method and Data
We model crop yield as a function of precipitation and degree days during the growing season, precipitation before the growing season, cropland quality, groundwater level, crop price, fertilizer price, local CO 2 concentration and a time trend. Grain corn and soybeans are the two most valuable field crops in Ontario. Our panel data extend from 1959 to 2013 for 29 counties in Ontario, Canada. A quadratic functional form is used for every variable except the dummy variables to test for non-linear effects. We estimated two types of model: a pooled regression model and a fixed county effects model. The time-invariant variables, cropland quality and groundwater, are included in the pooled regression model but not in the fixed county effects model. The advantage of a Fixed Effects model is that it to capture the effect of county-level variation in local factors, such as soil type, groundwater, topography and others. Due to poor data quality for these local factors, the pooling regression models are less likely to capture their effects. In addition, the F-test of fixed effect model is higher than that of pooling regression model. So, our study focusses on the results from county level fixed effect models. The Fixed Effects model allows the intercept term to vary across the 29 counties. Note that, the time-specific effect is not discussed here since we focus on the specific effects across counties and we have included the variable of time trend. This model assumes that county-specific effect is unique for each county but constant over time.
Allowing the intercept to vary across counties is useful to capture the effects of other factors not included in each www.nature.com/scientificreports www.nature.com/scientificreports/ county. So, parameters α c vary across counties but not over time, and parameters β i , γ i and θ are constant for all counties and years. The model also includes an individual error term e t c , , which varies across both counties and times. Equation (1)  where Y t c , is the annual grain corn yield in year t and county c; D c (c = 1, 2, …, 29) is the dummy variable for each of the 29 counties; PBGS t c , is the precipitation before the growing season, which in this context is the 3 months prior to growing season in year t and county c; PRECI t c , is the total precipitation during the growing season in year t and county c; DD t c , is the degree days during the growing season with 10 °C base temperature in year t and county c; − P t 1 is the corn price in Ontario for the previous year t-1; PF t is the fertilizer price index in Canada for the current year t; T t is the time trend; CO t 2 is the proxy of CO 2 concentration in Ontario for the year t; PRECI is the mean of PRECI t c , for all studied counties and years; DD is the mean of DD t c , for all studied counties and years.
The model considers the linear and quadratic effect of each factor as well as the centering interaction between precipitation and degree days. The centering interaction is used to reduce the correlation between the variables and their interaction term. The estimated results are robust for heteroskedasticity and autocorrelation.
For validation, we did hold-out validation and out-of-sample simulation. The results from both methods indicate that our crop yield models for grain corn and soybeans are reliable. For details of the validation, please see the Supplementary Information.
Many counties in Ontario experienced changes in county boundaries from 1959 to 2013. There is a concern that yield data at the county level significantly change due to the historical county boundary changes. Xu 28 had determined the effect of county boundary changes on yield data and adjusted county boundaries. We used the adjusted county boundaries from her work for our crop yields and climate data.
Data on corn yields (Bushels/Acre) and soybean yields (Bushels/Acre) were collected from 1959 to 2013. The yield data were obtained from Agricultural Statistics for Ontario. Due to county boundary changes described above, we did retroactive adjustment to the original crop yields data for the years prior to the boundary changes for each county. For the counties with county boundary adjustment, the harvest area weighted average crop yields data are calculated for the new county boundary. Equation (2) shows the formula of the harvest area weighted average crop yields: where Yield New is the harvest area weighted average crop yields in the new combined county; is the crop yield in county k; Area k is the crop harvest area in county k.
For soybeans, yield data are not available for all counties over the whole study period. From the 1950s to 1970s, the planting of soybeans was restricted to southern Ontario due to that region having the longest and warmest growing season in Canada 36 . Advanced breeding technology allowed farmers to grow soybeans outside of Southern Ontario starting in the 1980s 36 . For this reason, very few soybeans were planted outside of Southern Ontario prior to that period. As a result, the data for soybean production and harvest area is very small in the early period and missing in some years. There is a concern that these data for small-scale soybeans production cannot well represent the soybean production in Ontario and therefore widens the variance of soybean yields. Therefore, we excluded soybean yields data for counties with harvest area less than 1,500 acres per year. In addition, to keep the continuity of soybean yields data, we excluded the soybean yields data in the early period until no data were missing. So, the Soybean yield data are missing in some years in some counties until the year of 1996. We estimated soybean yield model for two different periods: 1) the period of 1959-2013, which is unbalanced panel; and 2) the period of 1996-2013, which is balanced panel. The results from these two tests are similar. The estimated coefficients show the same sign and level of significance. Since the results are similar, we used the model estimated the period of 1959-2013, as it is based on more observations which allow us to better measure the effect of changing climate on crop yields over time.
Climate data were obtained from Natural Resources Canada (NRCan) 12,27 . The data includes historical precipitation before the growing season (mm) (PBGS), precipitation during the growing season (mm) (PRECI) and degree days during the growing season (°C) (DD) for the period.
Economic factors include real crop prices and fertilizer price index. We use the Consumer Price Index (CPI) from Statistics Canada, CANSIM Table 326-0021 to convert nominal crop and fertilizer prices to real prices to adjust inflation. As corn and soybeans are planted in the spring, farmers do not know the crop prices that they will receive until after they plant. So, the 1-year lag of crop price is used as the expected crop price in the models for corn and soybeans. Corn price data (dollars per metric ton) in Ontario from 1949 to 2012 were obtained from Statistics Canada CANSIM Table 001-0010 and Table 002-0043 for 1949 to 1984 and 1985 to 2012, respectively. Soybean price data (dollars per metric ton) in Ontario from 1949-2012 were obtained from Statistics Canada CANSIM Table 001-0010, Publication 22-007 and CANSIM Table 002-0043, for 1949 to 1984, 1985 to 1991, 1992 to 2012, respectively. The monthly price data are averaged to obtain annual data.
The composite fertilizer price index in Canada is used as the nominal fertilizer price (PF t ). The fertilizer price index (index 1998=100) in Canada from 1959 to 2013 was obtained from Statistics Canada Publication 62-004, CANSIM Table 328-0001, CANSIM Table 328-0014 and CANSIM Table 328 -0015, for 1959 to 1961, 1961 to 1998, 1998 to 2002, and 2002 to 2013. The quarterly fertilizer price index is averaged to obtain annual observations.
Technology development plays an important role in crop production, such as the development in plant breeding, and management practices in production systems. A time trend is used as a proxy of technology in this study for each crop. The time trend is a number sequence from 1 to 55.
The monthly CO 2 data in Egbert, Ontario for 2005-2016 is collected from The World Data Center for Greenhouse Gases (WDCGG). The monthly CO 2 data in Mauna, Hawaii is collected from the National Oceanic and Atmospheric Administration (NOAA). We used the following procedure to construct a proxy of historical Ontario CO 2 concentration for the period from 1959 to 2004. We first calculated the correlation coefficients for monthly average CO 2 concentration in Ontario and Hawaii for 2005-2016. The 12 correlation coefficients are all above 0.85. Then, we regressed the monthly Ontario CO 2 concentration on monthly Hawaii CO 2 concentration for 2005-2016. We used a linear monthly fixed effects model. We found that the adjusted R-square is 0.97, which shows a high goodness of fit. In addition, the estimated coefficients for Hawaii CO 2 concentration and monthly fixed effects are all statistically significant. We then used the estimated coefficients to construct a proxy for Ontario CO 2 concentration for 1959-2004. For the details of the methods and estimation results, see the Support Information (Tables S5-S6).

Data availability
The data and R code used in this paper are available at the University of Guelph data depository system (https:// doi.org/10.5683/SP2/6OJSHE). All the other method and materials are present in the paper or supplementary information.