Introduction

The world currently faces a major food security challenge to which crop genetic improvements can make a significant contribution. According to the FAO, world food production will need to increase by 70% by the year 2050 in order to meet growing demand1. Since rice is one of the world's three main staple foods, productivity gains in rice are essential to achieve this goal, especially for Asia where there is no room for land expansion. Major gains in productivity were achieved in the 1970s with Green Revolution technologies targeted at the better-endowed areas under irrigation2. Two issues have subsequently challenged scientists. The first is that maximum yields achieved in the better-endowed areas have stagnated at around 11 tons per ha, with little prospect of significant improvement3. Technological innovations for these areas have basically been directed at maintenance research and at compensating for losses in soil fertility and declines in water availability. The second issue is that the Green Revolution has remained elusive for rainfed areas, particularly those affected by flash flooding and drought4,5,6. Major efforts have been undertaken by rice scientists around the world to bring productivity gains to rainfed areas prone to these environmental stresses.

We focus here on the development of flood-tolerant rice through genetic improvements. Precision in the transfer of genes conferring flood tolerance has been greatly facilitated by marker-assisted selection (MAS), allowing for minimization of undesirable properties such as low yield under non-stress conditions7,8. Two alternative paths have been followed. One consists of introducing into modern rice varieties quantitative trait loci that contain the SNORKEL1 and SNORKEL2 genes, which induce elongation in traditional deepwater rice varieties, allowing them to survive during periods of prolonged flooding9. The second path involves the SUB1 quantitative trait locus that responds to ethylene buildup by restricting elongation, the natural response of the rice plant10,11. A particular advancement is the introduction of the SUB1 locus into Swarna (IRTP12715), India's most popular rice variety7,10. The resulting Swarna-Sub1 (IR05F102) maintains all the properties of Swarna while acquiring submergence tolerance. An MAS approach was used to introgress the SUB1 locus into Swarna10.

Swarna-Sub1 has been shown to withstand floods of up to 17 days in agricultural trials12,13. The potential impact of the technology in India alone is large. Of the country's 44 million hectares under rice, rainfed lowlands where flash flooding is common and Sub1 varieties would be beneficial account for approximately 12–14 million hectares, or about 30 percent of the cultivated rice area. On an additional 3–4 million hectares, water may remain stagnant for extended periods and even Sub1 varieties will not fare well. Swarna is cultivated on an estimated 30–40% of the area in rainfed lowlands. Swarna-Sub1 could thus beneficially replace Swarna on 9 to 12 percent of India's rice area.
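
The 9 to 12 percent figure follows from multiplying the two shares quoted above. A minimal check of that arithmetic, with illustrative variable names that are not from the original analysis:

```python
# Shares quoted in the text; the variable names are illustrative only.
flash_flood_share = 0.30            # flash-flood-prone rainfed lowlands / total rice area
swarna_share_range = (0.30, 0.40)   # Swarna's share of the rainfed lowland area

# Share of India's rice area where Swarna-Sub1 could replace Swarna.
replaceable = tuple(round(flash_flood_share * s, 2) for s in swarna_share_range)
print(replaceable)
```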

We report on the first randomized controlled trial (RCT) where a stress-tolerant variety was taken to farmers' fields. We experimentally measure the yield effects of Swarna-Sub1 under field conditions in 128 villages of Orissa's flood-prone lowlands. Randomized experiments in the field to measure productivity impacts of new crop varieties are essential because the management strategies used by farmers often vary significantly from the agronomic practices used in trials on experimental plots. The experience with Bt cotton in India shows how management choices of farmers can limit the effectiveness of new technologies14. The differences between conditions in farmers' fields and experimental plots are evident in our data. Average yield of Swarna under non-flood conditions in our sample is approximately 3.3 tons per ha. Agronomic trials with the exact same variety produce yields of 5–6 tons per ha12.

This analysis goes a step further in trying to understand who will be the beneficiaries of flood-tolerant rice. The incidence of benefits across the social spectrum of the country's farmers depends on who cultivates lands more or less exposed to flooding. These results have policy significance as Swarna-Sub1 is expected to entirely replace Swarna in eastern India's rainfed lowlands. Our results suggest that Swarna-Sub1 will increase yields significantly more for the most historically disadvantaged group of farmers as they disproportionately cultivate flood-prone lands.

Results

Yield effects

The yield advantage of Swarna-Sub1 over Swarna by duration of submergence is depicted graphically in Figure 1. The Methods section gives a description of the statistical estimation. As can be seen in Figure 1a, the yield gain of Swarna-Sub1 is clearly increasing in the duration of flooding. While the estimated yield of Swarna-Sub1 is slightly lower than that of Swarna under non-flood conditions, there is a noticeable yield advantage that increases in the number of days flooded, up to 12–13 days. Figure 1b displays the treatment effect as the estimated yield advantage of Swarna-Sub1 (vertical difference between the two lines in Fig. 1a) along with its 95% confidence interval (±1.96*SE). The estimated yield decrease of 180 kg per ha when plots are not flooded – an approximate 5.3% decline – is not statistically different from zero. The yield advantage of Swarna-Sub1 increases as flood severity worsens, with a maximum advantage of around 718 kg per ha, an approximate 66% increase, occurring at approximately 13 days of flooding.

Figure 1

Impact of flood-tolerant rice (Swarna-Sub1) on yield during the 2011 wet season.

(a) Relationship between yield (kg/ha) and flood duration (days). (b) Estimated yield advantage of Swarna-Sub1 relative to Swarna, as a function of duration of flooding. Solid black line is treatment effect and shaded area represents 95% confidence interval. Dots are impact estimates from 133 data points collected from published agronomic field trials and unpublished data from NGO trials on farmers' fields. (c) Distribution of flood duration. (d) Cumulative distribution of flood duration.

The result suggests that the yield advantage of Swarna-Sub1 under flooding is not achieved at the cost of a yield penalty in non-stress conditions. The SUB1 locus reduces damages due to flooding without reducing yields in the absence of flooding. While the magnitude of the impacts may depend on how fields are managed and conditions of floodwaters, evidence points to success of MAS in breeding a variety that not only reduces yield variability but also increases expected yield.

The impacts measured under heavy flooding with the RCT method are noticeably smaller than those from published agronomic trials in experimental stations and NGO-supported trials in a nonrandom set of farmers' fields (Fig. 1b). Estimates from a range of agronomic trials and observational comparisons suggest impacts of 1–2 tons per ha under prolonged submergence of more than one week15. This large difference in estimates could be due to a number of factors including timing of flooding, management techniques of farmers and turbidity of flood water. The rice plant is more sensitive to submergence earlier in the growing season – likely because of decreased root development, limited carbohydrate reserves and reduced ability for underwater photosynthesis16,17. Most agronomic trials induced submergence around 3 weeks after transplanting, when sensitivity is high. This compares to the experimental results presented here where natural flooding occurred around 75 days after planting – a time when carbohydrate stores are highest18. The latter two explanations are also relevant because farmers in our sample received no management guidance with the new technology and floodwaters in farmers' fields tend to be quite turbid, which affects survival of even tolerant varieties19. If management practices drive the differences, then with learning and adoption of improved management techniques, future observations of the yield advantage of Swarna-Sub1 may become even larger.

The distribution of days flooded shows substantial variation in flood duration for plots cultivated with Swarna and Swarna-Sub1 (Fig. 1c). A total of 1,960 plots were cultivated with Swarna and 314 plots were cultivated with Swarna-Sub1. The variation in flood exposure suggests that the estimated impacts at a given flood duration are generated by large numbers of observations, at least until 12 days. The variation in flood duration therefore provided a natural experiment that when combined with the randomized experiment allowed for estimation of productivity effects across the distribution of flood duration. Importantly, approximately 35% of plots were flooded for 7–14 days, thus falling in the range where the productivity benefits of Swarna-Sub1 are largest (Fig. 1d).

OLS regression results are largely consistent with estimates in Figure 1. For completeness, all variety types are included in the OLS regressions (see Supplementary Table 1 for frequency of other varieties). On average, Swarna-Sub1 offered a yield benefit of 232 kg per ha, which represents a 10.5% improvement (column 1, Table 1). The estimate is significant at the 8% level (p = 0.077, t-test). The mean yield benefit of 10.5% is an average across the various flooding intensities displayed in Figure 1c. Most notably, approximately 40% of plots are located in areas where flooding was not severe and the benefits were close to zero. The average treatment effect of 232 kg per ha reflects the distribution of flooding during the sample year and not necessarily the distribution of flooding during every year in Orissa.

In column 2 of Table 1 the yield advantage of Swarna-Sub1 is specified as a linear function of the duration of flooding. Yields are estimated to be lower by 118 kg per ha when a plot is not flooded. This difference is not statistically distinguishable from zero (p = 0.47). Importantly, the statistically significant and positive estimated coefficient on the interaction of the Swarna-Sub1 indicator and the duration of flooding indicates that the yield advantage of Swarna-Sub1 is increasing in the duration of flooding. For each additional day of flooding, we estimate that the yield gain increases by 64 kg per ha (p = 0.004). Following Figure 1b, in column 3 we specify the yield advantage of Swarna-Sub1 to be a piecewise linear function of days of flooding with a kink at 12 days. Our estimate of the yield advantage is $-196.3 + 84.3 \cdot \mathit{Daysflood} - 169.1 \cdot (\mathit{Daysflood} - 12) \cdot \mathbf{1}[\mathit{Daysflood} > 12]$, where $\mathbf{1}[\cdot]$ is the indicator function. This estimate is consistent with the impacts in Figure 1 and therefore represents our preferred OLS specification. The results in Table 1 are also robust to controlling for plot characteristics (Supplementary Table 2).
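
The piecewise linear specification in column 3 can be evaluated directly from the three point estimates quoted above. The sketch below is illustrative only: the function name is an assumption, and the extra slope term is applied beyond the 12-day kink, as the specification describes.

```python
def yield_advantage(days_flooded: float, kink: float = 12.0) -> float:
    """Estimated Swarna-Sub1 yield advantage (kg per ha), column 3 point estimates."""
    intercept, slope, slope_change = -196.3, 84.3, -169.1
    # The additional slope term only applies to days beyond the kink.
    days_beyond_kink = max(days_flooded - kink, 0.0)
    return intercept + slope * days_flooded + slope_change * days_beyond_kink

for d in (0, 7, 12, 14):
    print(d, round(yield_advantage(d), 1))
```

Under these point estimates the advantage turns positive after roughly 2.3 days of flooding (196.3 / 84.3) and declines by about 85 kg per ha for each day beyond the kink (84.3 − 169.1).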

Table 1 Impact of flood-tolerant rice (Swarna-Sub1) on yield during 2011 wet season

Increased exposure of low caste groups to flooding

Understanding which social groups are most likely to realize the benefits of flood-tolerant rice requires identifying who cultivates land exposed to flooding. Caste remains a marker of social status in India. Members of the scheduled castes and tribes (SC/ST) typically have high levels of poverty and marginality20,21,22,23. SC/ST's in our sample have lower consumption expenditures per household member as well as smaller agricultural landholdings. Within blocks (a geographic unit above villages), consumption per capita is lower by approximately 21% for SC/ST's. Average landholdings of SC/ST's are smaller by approximately 0.25 hectares (Supplementary Fig. 2).

We use regression analysis to estimate whether plots cultivated by farmers belonging to lower caste groups are exposed to more flooding. For this analysis, the sample is limited to farmers in Bhadrak district since the 2011 floods in the sample occurred primarily in that district. The main results are summarized graphically in Figure 2a, while full regression results are in Supplementary Table 3. The average duration of flooding on plots cultivated by SC or ST farmers is longer by 1.83 days, compared with farmers who belong to the highest caste group in our sample (Model 1). The difference in flood exposure is statistically significant at the 3% level (p = 0.026). This estimate represents a 21% increase in the length of flooding for plots cultivated by the lowest caste farmers. We also use a specification that flexibly controls for unobservable confounders at the village level by estimating a separate intercept for each village. This changes the interpretation of the estimate: when comparing two plots in the same village, plots cultivated by SC or ST farmers are expected to be flooded for 1.35 days longer (p = 0.055, Model 2). Adding control variables characterizing cultivators results in no notable change (Model 3). The strong correlation between caste and flooding suggests that the land allocation both across and within villages is likely to result in low-lying plots being cultivated by lower caste farmers.

Figure 2

Relationship between caste and flooding.

(a) Predictions from OLS regression of days plot flooded on caste of cultivator. Bar heights represent predicted days of flooding. Whiskers are 95% confidence intervals (±1.96*SE). Model 1 includes caste indicators only. Model 2 additionally includes village fixed effects (village-specific intercepts). Model 3 includes village fixed effects and household covariates. Standard errors in all models allow for spatial correlation within villages. (b) Estimated probabilities of flooding using satellite data for villages in three districts of Orissa. Bar height is estimated probability from OLS regressions and whiskers are 95% confidence intervals (±1.96*SE). Gray bars are for a village whose population is entirely higher caste (OBC/general). Red bars are for a village whose population is entirely lower caste (SC). Estimated probabilities are for the most heavily flooded district (Kendrapara). *** P < 0.01, ** P < 0.05, * P < 0.10 compared to higher caste.

We use RADARSAT satellite imagery on flooding in September 2008 and geocoded village-level census information to further investigate the correlation between caste and flooding in a broader set of villages in the Bhadrak, Kendrapara and Jajpur districts of Orissa (see Supplementary Fig. 5 for map). The main results are summarized in Figure 2b (see Supplementary Table 4 for complete regression results). Villages where a larger share of the population are SC's are more likely to be flooded for 1–12 days. The estimated probability of being flooded for 1–7 days (September 18 observation only) is higher by 0.15 (p = 0.007) in a village where the entire population is SC's. This increase in probability is relative to a village that is entirely occupied by people belonging to higher castes. The estimated probability of a village being flooded for 8–12 days (observed as flooded on both September 18 and 25) is higher by 0.09 (p = 0.018) for a village with entirely SC's. This represents an approximate 48% increase for villages in the most heavily flooded district. The magnitude of this estimate is therefore large and particularly important since our previous results suggest that areas flooded for approximately one to two weeks are those likely to benefit the most from Swarna-Sub1. The share of the population that is SC is not significantly associated with the village being flooded for more than 12 days (observed as flooded on September 18, 25 and 30). It is in these villages where water likely remains stagnant and therefore the yield benefits of Swarna-Sub1 are reduced (Fig. 1 and Table 1).

As a further verification, we match village locations with additional satellite data from a flood event in August 2013. Regression results using these data are similar and provide further support for the positive correlation between the location of SC's and flooding (Supplementary Table 5).

In conclusion, the survey data from 2011 and the satellite imagery from 2008 and from 2013 show a strong positive association between flooding and habitation by scheduled castes. The main factors determining flood risk of a particular area, such as elevation and proximity to rivers, are largely time invariant. Therefore, the increased exposure of SC's to flooding in coastal Orissa is unlikely to change from year to year.

Simulated impacts of Swarna-Sub1

As a final exercise to measure distributional effects, we simulate the impact of replacing Swarna with Swarna-Sub1. This simulation is policy relevant because full replacement is a target of some state governments in the country's most flood-prone areas. We use the regression results from Column 3 in Table 1 to predict production (in kg) for each plot if it were cultivated with Swarna. We aggregate separately for SC/ST farmers and by district before computing the predicted percentage increase in total production for a scenario where all Swarna land is cultivated with Swarna-Sub1. The results show that universal replacement of Swarna with Swarna-Sub1 during the 2011 floods in our sample of farmers in Bhadrak would have resulted in a predicted increase in total rice production of 24.8% for higher caste farmers and 39.6% for SC/ST farmers (Fig. 3a). The difference in predicted impact of 14.8 percentage points is significantly different from zero (p = 0.003, delta method t-test). The intensity of flooding is varied by adding (or subtracting) a day of flooding to the observed 2011 flood duration for each plot and then repeating the simulation exercise. The projected impacts remain large and statistically significant for a flood event that is shorter by two days. Projected impacts in Balasore are noticeably smaller and statistically insignificant since the 2011 floods in this area were far less severe (Fig. 3b).

Figure 3

Projected impacts of adoption of flood-tolerant rice (Swarna-Sub1) on rice production.

(a) Bhadrak district and (b) Balasore district. Graphs display forecasted percentage difference in total production between a scenario where all Swarna plots are cultivated with Swarna-Sub1 and a scenario where the plots remain cultivated with Swarna. Differing flood severity is simulated by adding (subtracting) an additional day of submergence for each plot to simulate a flood that is 1 day more (less) severe than 2011 floods. Dots and triangles represent point estimates and whiskers are 95% confidence intervals. Predicted difference in impact between SC/ST and OBC/General farmers is displayed as triangles. Regression estimates from Column 3 of Table 1 are used to generate predictions.

One concern with a policy of completely replacing Swarna with Swarna-Sub1 is that there may be some risks associated with relying so heavily on a single rice variety. By replacing only Swarna with Swarna-Sub1, the simulation explicitly holds constant the overall varietal diversity in the sample. Further, other than the SUB1 locus, Swarna-Sub1 is genetically equivalent to Swarna – minimizing the likelihood that Swarna-Sub1 confers a negative trait not conferred by Swarna10. To the extent that any crop diseases or insect attacks occurred during the sample year, the estimated treatment effects also account for any unknown differences in tolerance to these stresses between Swarna and Swarna-Sub1.

The simulated distributional impacts assume that any benefits of planting Swarna-Sub1 are fully captured by the farmer cultivating the land. This may not be the case if tenancy is widespread and land rental rates adjust to reflect the potential productivity of land. However, only 1% of the plots in the sample are cultivated by renters. Of the remaining plots, 89% are owner cultivated and 10% are cultivated under sharecropping arrangements. Not surprisingly, lower caste farmers rely more heavily on sharecropping. 8.2% of plots cultivated by higher caste farmers are sharecropped, while 18.3% of plots cultivated by lower caste farmers are sharecropped. The difference is highly statistically significant (p < 0.001, t-test). If owners are more likely to come from higher castes and flood-prone plots are more likely to be cultivated under sharecropping arrangements with lower caste farmers, then some of the benefits of Swarna-Sub1 would shift towards higher castes. An additional simulation exercise was performed where all sharecropped plots cultivated by ST/SC's were assumed to be owned by higher caste farmers and that half (the customary allocation) of the output of these plots is collected by higher castes. The results are similar under these conservative assumptions (Supplementary Fig. 3). The dominance of cultivation by landowners makes it unlikely that land tenure arrangements will influence the distribution of benefits.

The results are focused on cultivators and do not measure the impact of the technology on landless laborers. The impact on this group depends on whether farmers in flood-prone areas rely on hired labor and whether the increased yields during flooding result in increases in labor demand. It is important to emphasize that since our statistical analysis is limited to cultivators, we do not measure the spillover impacts on the landless.

Discussion

The experimental evidence reported here suggests that Swarna-Sub1, a recently released flood-tolerant rice variety, is expected to have positive impacts on yields when fields are submerged in the range of 7 to 14 days. Yields of Swarna and Swarna-Sub1 are not noticeably different when fields are not flooded or are flooded only briefly. We estimate that widescale adoption of Swarna-Sub1 prior to the 2011 floods in one of our sample districts would have resulted in an approximate increase in rice production of 26.5%. The impacts were measured in a randomly selected group of farmers' fields and can therefore be interpreted causally. While the impact estimates are lower than those from agricultural trials, the results confirm the importance of the SUB1 locus in increasing rice productivity during flooding. Many parts of the world are predicted to experience more frequent floods as a result of global climate change24. Breeding of submergence tolerance is one successful method to reduce the impact of these events on agricultural productivity.

The key benefit of the methodology is that it allows for statistically rigorous measurement of technological impacts in a real-world agricultural setting. This of course comes at a cost of not being able to observe all possible different flood conditions in a single year. Although the spatial variation in flooding intensity provided the opportunity to estimate impacts for a wide variety of flooding lengths, including no flooding at all, factors such as turbidity or timing of inundation could not be controlled in a field experiment. Our estimates are therefore specific to the conditions that occurred during the 2011 floods in Orissa. Floods of similar magnitude occurred in the sample area during four of the previous nine years, suggesting that the intensity of flooding observed in the experiment is not abnormal (Supplementary Fig. 4)25. Also, the benchmark data from agronomic trials in experimental stations and trials in farmer's fields (Fig. 1b) come from multiple years and locations. These results also suggest large benefits of Swarna-Sub1 under various flood conditions, making it less likely that our conclusions are due to anomalous flood conditions in the sample region. Nonetheless, the average effects and impact projections should be interpreted as being representative of the flooding conditions during our sample year.

The SUB1 locus has been inserted into commonly grown varieties in Bangladesh, the Philippines and Indonesia7. The benefits of these varieties should therefore extend beyond eastern India. Past research on adoption of risk-related technology suggests that the spread of these varieties could be especially rapid where exposure to flooding is more frequent26,27.

The distributional impacts of stress-tolerant varieties depend on the spatial location of the population. Variability in exposure to different stresses suggests that benefits across the farming population are likely to be uneven. We provided additional analysis showing that members of the scheduled castes in Orissa are likely to benefit substantially more from adoption of Swarna-Sub1. Within villages, scheduled caste farmers cultivate plots that are exposed to longer floods. Across villages, those where a greater share of the population is scheduled caste are more likely to be flooded for 1–12 days. The scheduled castes have been historically disadvantaged and are the focus of numerous poverty alleviation and affirmative action policies. Our results suggest that the historical process of land allocation and settlement has been unfavorable to scheduled caste people, pushing them disproportionately into flood-prone areas. This observation combined with our result that Swarna-Sub1 outperforms the main popular rice variety under flood conditions indicates that scheduled castes are likely to be a major beneficiary from the spread of Swarna-Sub1 in India. This advancement in agricultural biotechnology thus presents a unique opportunity to provide benefits to the poorest group of farmers without requiring social welfare handouts.

Methods

Experimental design

The experiment took place in eight blocks of the Bhadrak and Balasore districts of Orissa. Balasore is on the northern end of the state, neighboring the state of West Bengal. Bhadrak is next to Balasore on the south (see Supplementary Fig. 5 for a map). Half of the 128 villages were randomly assigned to treatment, with this village-level randomization being stratified by block. During late May and early June of 2011, enumerators visited each village and met with a local village leader to identify 25 farmers who cultivate Swarna and have land that is prone to flooding. Five farmers in each of the 64 treatment villages were then randomly selected to receive a five-kilogram minikit of Swarna-Sub1 seed. This randomization succeeded in generating experimental groups that are similar on observable characteristics (Supplementary Table 6). The minikits were delivered to each selected farmer prior to sowing time in mid-June. The minikits contained only the seed and a short two-page description of the variety and its properties. They did not contain any fertilizer or other inputs. We can therefore be confident that our results are due to the seed itself and not to other inputs. Of the 320 selected farmers, 318 accepted the seeds and indicated intention to cultivate Swarna-Sub1. The amount of seed in each minikit is sufficient to cultivate approximately 0.1–0.2 ha. Importantly, the results are robust to limiting the sample to plots that are 0.2 ha or less (Supplementary Table 2 and Supplementary Fig. 1).
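
The two-stage assignment described above (villages randomized within blocks, then five recipients drawn from the 25 listed farmers in each treatment village) can be sketched as follows. This is a minimal illustration, not the procedure actually used: the function name, the seed, the identifiers and the equal split of 16 villages per block are all assumptions.

```python
import random

def assign_treatment(villages_by_block, farmers_by_village, seed=0):
    """Stratified village-level randomization, then farmer-level selection."""
    rng = random.Random(seed)
    treated = set()
    # Stage 1: within each block (stratum), assign half of the villages to treatment.
    for block in sorted(villages_by_block):
        villages = sorted(villages_by_block[block])
        treated.update(rng.sample(villages, len(villages) // 2))
    # Stage 2: in each treatment village, pick 5 of the listed farmers for a minikit.
    recipients = {v: rng.sample(farmers_by_village[v], 5) for v in sorted(treated)}
    return treated, recipients

# Hypothetical layout: 8 blocks x 16 villages, 25 listed farmers per village.
villages_by_block = {b: [f"b{b}_v{v}" for v in range(16)] for b in range(8)}
farmers_by_village = {
    v: [f"{v}_f{i}" for i in range(25)]
    for vs in villages_by_block.values()
    for v in vs
}
treated, recipients = assign_treatment(villages_by_block, farmers_by_village)
print(len(treated), sum(len(r) for r in recipients.values()))
```

Stratifying by block guarantees that each block contributes equally to the treatment and control groups, which is what allows block fixed effects to be used in the yield regressions.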

Heavy rains in September 2011 brought flooding to many parts of central and northern Orissa, including much of Bhadrak district. The floods occurred approximately 60–70 days after transplanting during the end of the vegetative growth stage when panicles were beginning to initiate. The timing of the floods therefore contrasts to experimental trials where submergence was initiated approximately two weeks after transplanting12,13,28. Damages caused by flooding can vary depending on the stage of the growth cycle when flooding arrives. Enumerators returned to all villages following harvest and post-harvest production practices (in March 2012). In each treatment village the five original recipients as well as ten randomly selected non-recipients were sought for surveying. Five randomly selected farmers in each control village were also surveyed.

Regression analysis

The curves in Figure 1a are estimated using nonparametric locally weighted least squares regressions of yield on flood duration29. The support of flood duration is first divided into a grid of 50 points. For each point on the grid, a weighted regression is estimated where all of the observations within five days of flooding receive a positive weight according to the quartic kernel. The slope and intercept from each weighted regression are used to calculate the predicted values of yield for each grid point. The figure displays the predicted values for each point. The standard errors of the treatment effects in Figure 1b adjust for clustering at the village level by using a cluster bootstrap: 100 samples of 127 villages were drawn with replacement and the nonparametric estimates were generated for each bootstrapped sample. The half-width of the 95% confidence interval is calculated as 1.96*σd, where σd is the standard deviation across bootstrapped samples of the treatment effect for a flood duration of d days.

The impact estimates from agronomic trials were obtained directly from published studies12,13,30. The trials were carried out in 2005 and 2006 at the International Rice Research Institute in the Philippines and during 2005–2007 at the Central Rice Research Institute in Cuttack, Orissa. See the published studies for more details. The data on NGO-supported trials were obtained as part of the Stress-Tolerant Rice for Africa and South Asia (STRASA) project of the International Rice Research Institute. The trials occurred during 2011 in Puri district of Orissa. All of the data from agronomic trials were aggregated and Lowess regressions were used to estimate the relationship between average yield and flood duration for each variety. The impact estimates were calculated as the difference between the fitted curves for Swarna and Swarna-Sub1.
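
The local regression and cluster bootstrap used for Figure 1 can be sketched in a few lines of standard-library Python. This is an illustration under stated assumptions rather than the code behind the figure: the data are synthetic (every village grows both varieties, and the yield slopes and noise level are invented), and all function and field names are hypothetical. Only the quartic kernel, the five-day bandwidth, the 50-point grid and the 100 village-level resamples come from the description above.

```python
import random
import statistics

def quartic(u):
    """Quartic (biweight) kernel: positive weight only for |u| < 1."""
    return (15 / 16) * (1 - u * u) ** 2 if abs(u) < 1 else 0.0

def local_linear(x, y, x0, bandwidth=5.0):
    """Fitted value at x0 from a kernel-weighted linear regression of y on x."""
    w = [quartic((xi - x0) / bandwidth) for xi in x]
    sw = sum(w)
    if sw == 0:
        return float("nan")  # no observations within the bandwidth
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx if sxx > 0 else 0.0
    return my + slope * (x0 - mx)

def treatment_effect(plots, d):
    """Vertical gap between the fitted Swarna-Sub1 and Swarna curves at d days."""
    fit = {}
    for variety in ("sub1", "swarna"):
        rows = [p for p in plots if p["variety"] == variety]
        fit[variety] = local_linear([p["days"] for p in rows],
                                    [p["yield"] for p in rows], d)
    return fit["sub1"] - fit["swarna"]

def cluster_bootstrap_sd(plots, d, reps=100, seed=0):
    """Std. dev. of the effect at d days across village-level resamples."""
    rng = random.Random(seed)
    by_village = {}
    for p in plots:
        by_village.setdefault(p["village"], []).append(p)
    villages = sorted(by_village)
    draws = []
    for _ in range(reps):
        # Resample whole villages with replacement, keeping all of their plots.
        picked = rng.choices(villages, k=len(villages))
        draws.append(treatment_effect([p for v in picked for p in by_village[v]], d))
    return statistics.stdev(draws)

# Synthetic data: 40 villages, both varieties observed at 0, 2, ..., 14 days.
rng = random.Random(1)
plots = [
    {"village": v, "variety": variety, "days": d,
     "yield": base - slope * d + rng.gauss(0, 200)}
    for v in range(40)
    for variety, base, slope in (("swarna", 3000, 150), ("sub1", 2900, 80))
    for d in range(0, 15, 2)
]
grid = [i * 14 / 49 for i in range(50)]              # 50-point grid over the support
curve = [treatment_effect(plots, g) for g in grid]   # effect at each grid point
effect, sd = treatment_effect(plots, 10), cluster_bootstrap_sd(plots, 10)
print(effect - 1.96 * sd, effect + 1.96 * sd)        # 95% confidence interval
```

Resampling villages rather than plots preserves the within-village correlation in flooding and yields, which is why the resulting intervals are wider than a plot-level bootstrap would give.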

OLS regressions are used to estimate impacts in Table 1. The corresponding regression equation for column 1 of Table 1 is,
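
Written out from the definitions that follow, a specification consistent with column 1 would take the form below. This is a sketch under assumptions: the assignment of β2 and β3 to the other-modern and traditional-variety indicators follows the order in which those variables are defined, the intercept is taken to be absorbed by the block fixed effect, and Daysflood is left out because the text does not state whether it enters column 1 as a control.

```latex
\[
\mathit{Yield}_{pivb} = \beta_1\,\mathit{Sub1}_{pivb}
  + \beta_2\,\mathit{OtherModern}_{pivb}
  + \beta_3\,\mathit{TV}_{pivb}
  + \alpha_b + \varepsilon_{pivb}
\tag{1}
\]
```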

where Yieldpivb is the rice yield on plot p, cultivated by farmer i, in village v and block b, Daysfloodpivb is the duration of flooding of the plot, Sub1pivb is an indicator for plots cultivated with Swarna-Sub1, OtherModernpivb is an indicator for plots cultivated with other modern varieties, TVpivb is an indicator for plots cultivated with traditional varieties, αb is a block fixed effect and εpivb is an unobserved error term. Since the omitted variety in the regression is Swarna, the estimates of β1, β2 and β3 all represent the average yield differences between Swarna and the other variety types. Standard errors are clustered at the village level, which is the first tier of randomization. Our estimate of β1 in equation (1) represents the average yield effect of Swarna-Sub1 across different flood durations. Estimates in columns 2 and 3 of Table 1 allow for the impact of Swarna-Sub1 to depend on the duration of flooding.

The regression equation used to estimate the relationship between caste and flood duration (Fig. 2a) is,
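
A form consistent with the definitions that follow is given below. It is a sketch: the coefficient vector γ on the farmer-level covariates and the error term are assumed notation, and the dependent variable reuses the Daysflood label from the yield regressions.

```latex
\[
\mathit{Daysflood}_{pivb} = \beta_1\,\mathit{SCorST}_{ivb}
  + \mathbf{x}_{ivb}'\gamma
  + \alpha_{vb} + \varepsilon_{pivb}
\tag{2}
\]
```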

where SCorSTivb is an indicator for whether the farmer belongs to a scheduled caste (SC) or a scheduled tribe (ST), xivb is a vector of farmer-level covariates and αvb is a village fixed effect, i.e. a dummy variable for each village. Standard errors are again clustered at the village level. The estimate of β1 in equation (2) measures the expected difference in flood duration for SC/ST farmers within a village, when controlling for factors in x, i.e. the vertical difference between gray and red bars in Figure 2a. The full regression results are displayed in Supplementary Table 3.

The regression used to generate the relationship between village-level flooding and caste population shares (Fig. 2b) is,
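
One linear probability specification consistent with the definitions that follow is sketched below. Only the pairing of β2 with the SC population share is pinned down by the text. The remaining coefficient labels, the error term and the treatment of Bhadrak as the omitted district are assumptions.

```latex
\[
\mathit{Flooded}_{v} = \beta_0 + \beta_1\,\mathit{ShareST}_{v}
  + \beta_2\,\mathit{ShareSC}_{v}
  + \beta_3\,\mathit{Jajpur}_{v}
  + \beta_4\,\mathit{Kendrapara}_{v}
  + \varepsilon_{v}
\tag{3}
\]
```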

where Floodedv is an indicator equal to 1 if village v was flooded, ShareSCv and ShareSTv are the shares of the population that are SC and ST, respectively and Jajpurv and Kendraparav are dummy variables for villages in Jajpur and Kendrapara. Our estimate of β2 measures the within-district expected difference in the probability of flooding for villages that are entirely SC's, when compared to villages that are entirely higher caste people. We used RADARSAT images (100 m resolution) of the flood during September 2008 in the Bhadrak, Kendrapara and Jajpur districts to identify villages that were flooded. The spatial location of village centers as well as demographic characteristics are from the 2001 census of India. The distance between each village center and the nearest flooded area was calculated using ArcGIS 10. We observe the centroid of each village and not the approximate boundaries. We account for the fact that farmers cultivate land away from the center of the village by considering a village to be flooded if its center was within 500 m of a flooded area. The results are robust to distance cutoffs of 250 m or 750 m. The standard errors reported in this estimation adjust only for heteroskedasticity and not for spatial correlation. We estimated Conley's spatial standard errors31 using a cutoff of 0.1 decimal degrees (approx 11 km). Statistical significance of our main results is not affected when spatial standard errors are used.
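
The distance-based flood classification can be illustrated with a great-circle calculation. The original distances were computed in ArcGIS 10. The sketch below swaps in the haversine formula on a spherical Earth, represents flooded areas by a list of points (for example flooded pixel centers), and uses hypothetical coordinates and function names throughout.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius; spherical approximation

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

def is_flooded(village_center, flooded_points, cutoff_m=500):
    """A village counts as flooded if any flooded point lies within the cutoff."""
    return any(haversine_m(*village_center, *pt) <= cutoff_m for pt in flooded_points)

# Hypothetical village center and one flooded pixel about 330 m to its north.
village = (21.050, 86.500)
pixels = [(21.053, 86.500)]
for cutoff in (250, 500, 750):
    print(cutoff, is_flooded(village, pixels, cutoff_m=cutoff))

# Rough size of the 0.1 decimal degree cutoff used for the spatial standard errors.
print(round(haversine_m(21.0, 86.5, 21.1, 86.5) / 1000, 1), "km")
```

Repeating the classification at 250 m and 750 m mirrors the robustness check described above. A village near the edge of a flooded area, like the hypothetical one here, changes status between cutoffs.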

Impact simulation

The regression equation used to generate aggregate impact estimates (Fig. 3) is,

The impact on aggregate production is therefore,
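
In words, the aggregate impact is the sum over Swarna plots of the difference in predicted production between the two scenarios. A sketch in assumed notation, where $S_g$ is the set of Swarna plots in district/caste group $g$ and $\widehat{Q}_p^{\mathrm{Sub1}}$ and $\widehat{Q}_p^{\mathrm{Swarna}}$ are the predicted production (kg) of plot $p$ under each variety:

```latex
\[
\widehat{\Delta}_g = \sum_{p \in S_g}
  \left( \widehat{Q}_p^{\mathrm{Sub1}} - \widehat{Q}_p^{\mathrm{Swarna}} \right)
\tag{5}
\]
```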

The estimated percentage increase in total production due to Swarna-Sub1 is the estimate in equation (5) divided by the predicted total production when all Swarna plots continue to be cultivated with Swarna. That is,
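
A sketch of this ratio in assumed notation, where $S_g$ is the set of Swarna plots in district/caste group $g$ and $\widehat{Q}_p^{\mathrm{Sub1}}$ and $\widehat{Q}_p^{\mathrm{Swarna}}$ are the predicted production (kg) of plot $p$ under each variety. The denominator is summed over Swarna plots only, which is an assumption: if plots under other varieties count toward total production, their predicted output would be added to the denominator.

```latex
\[
\widehat{\%\Delta}_g = 100 \times
  \frac{\sum_{p \in S_g} \left( \widehat{Q}_p^{\mathrm{Sub1}} - \widehat{Q}_p^{\mathrm{Swarna}} \right)}
       {\sum_{p \in S_g} \widehat{Q}_p^{\mathrm{Swarna}}}
\tag{6}
\]
```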

The impact estimate in equation (6) is estimated separately for each district/caste group pair and displayed in Figure 3 of the main text. The estimated standard errors of the impact projections are calculated using the delta method, which is appropriate for calculating standard errors of nonlinear combinations of parameter estimates.
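
The simulation logic, including the one-day shifts in flood duration shown in Figure 3, can be sketched as follows. The yield advantage uses the column 3 point estimates reported in the Results. Everything else is a labeled assumption: the Swarna yield curve is hypothetical (the corresponding coefficients are not reported in the text), the plots are invented, totals are taken over Swarna plots only, and the delta-method standard errors are not reproduced.

```python
def yield_advantage(days, kink=12.0):
    """Column 3 point estimates of the Swarna-Sub1 advantage (kg per ha)."""
    return -196.3 + 84.3 * days - 169.1 * max(days - kink, 0.0)

def swarna_yield(days):
    """HYPOTHETICAL Swarna yield (kg per ha) as a function of days flooded."""
    return max(3300 - 150 * days, 500)

def percent_gain(plots, shift_days=0):
    """Predicted % increase in production if Swarna plots switch to Swarna-Sub1.

    plots: iterable of (area_ha, days_flooded) pairs for plots grown with Swarna.
    shift_days: added to every plot's flood duration to mimic a longer or shorter flood.
    """
    base = extra = 0.0
    for area_ha, days in plots:
        d = max(days + shift_days, 0)
        base += area_ha * swarna_yield(d)       # production if Swarna is kept
        extra += area_ha * yield_advantage(d)   # additional production from Sub1
    return 100 * extra / base

# Invented plots: the second group is flooded for longer than the first.
group_a = [(0.2, 4), (0.3, 6)]
group_b = [(0.2, 8), (0.1, 10)]
for shift in (-1, 0, 1):
    print(shift, round(percent_gain(group_a, shift), 1), round(percent_gain(group_b, shift), 1))
```

Because the advantage grows with flood duration up to the kink while Swarna's own production falls, groups whose plots are flooded for longer show a larger percentage gain, which is the mechanism behind the caste differential in Figure 3.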