Main

Damaging floods are increasing in severity, duration and frequency, owing to changes in climate, land use, infrastructure and population demographics7,12,13,14. An estimated $651 billion (USD) in flood damages occurred globally from 2000 to 2019 (ref. 2). Investments in flood adaptation reduce mortality and asset losses3,15. Yet, only 13% of disaster funds are allocated to preparedness, mitigation and adaptation16. Fundamental to prioritizing disaster mitigation efforts is quantifying global changes in flood hazard, exposure and vulnerability. We use the IPCC17 definitions of ‘flood hazard’ as the frequency and magnitude of events and of ‘exposure’ as the people, livelihoods, ecosystems and assets located where a hazard has occurred or could occur. ‘Vulnerability’ is defined as the propensity for loss of lives, livelihoods and property and for other aspects of wellbeing to be adversely affected18. Previous global flood exposure and vulnerability studies have relied on modelled flood hazard7,8,10,19. One study estimates that population growth in 100-year floodplains (areas with a 1% annual flood probability) outpaced total population growth by 2.6% from 1970 to 2010 in 22 countries7. Vulnerability influences a wide range of adverse outcomes from flood events, including death, disease, psychological trauma, migration, property loss and poverty20. Studies enabled by global flood models reveal trends including declines in loss of life and reduced property damage when controlling for hazard size3,10,11. Sub-Saharan Africa, where urban flooding has been growing and is expected to continue to do so7,8, is the only region with increasing flood mortality rates since 1990 (ref. 10).

Flood exposure and vulnerability assessments are limited by the uncertainty in hazard models, which stems from the challenges of incorporating rapid anthropogenic change, from inadequate calibration data and from poor-quality topographic data21. Humans modify land use and rivers, shifting flood water and reshaping exposure5,22. Differences in modelling assumptions lead to high disagreement between population and area exposure estimates across models9,21. Unlike models, satellite-based remote sensing can directly observe inundation23,24, implicitly accounting for changes in climate, land use and infrastructure that are not reflected in modelled flood extents.

Here we measure global flood exposure from earth-observing satellites. We developed the Global Flood Database to systematically map the maximum observed surface-water extent during 913 large flood events documented by the Dartmouth Flood Observatory (DFO) from 2000 to 2018. The Global Flood Database (http://global-flood-database.cloudtostreet.info/) complements existing surface-water products that consist of monthly25 or daily26 observations by providing a geospatial event catalogue to aid model calibration and intercomparison27. With a spatial resolution of 250 m, the moderate-resolution imaging spectroradiometer (MODIS; a multispectral optical instrument mounted on NASA’s Terra and Aqua satellites that each image the globe daily28) resolves large, slow-moving flood events, but has limited ability to resolve urban floods. We estimate exposure trends using methods similar to a previous study7, by comparing the proportion of the observed flood-exposed population in 2015 to that in 2000 for each country (equation (6), Methods). Owing to uncertainty in population data29, exposure is estimated as a range across two datasets: the global human settlement layer (GHSL)30 and the high-resolution settlement layer (HRSL)31. We then estimate change in flood exposure for the near future (2030), using flood hazard extents from the Global Flood Risk with Image Scenarios (GLOFRIS) model, which is based on present-day emissions scenarios32 and socioeconomic trends33 for large events (the 100-year return period). We compare observations from the recent past (2000–2015) to modelled estimates in 2030 to identify countries on slowing, continuing or increasing flood exposure trajectories. This analysis may enable prioritization of adaptation measures where flood exposure has been growing or may grow faster than the total population under a changing climate.

Satellite-observed inundation

We analyse 12,719 MODIS images from 2000 to 2018 to produce 913 flood maps (Fig. 1a, Extended Data Fig. 1). We detected surface water at 250-m spatial resolution by applying empirically derived and Otsu-optimized thresholds34 (Extended Data Fig. 2) to the short-wave-infrared, near-infrared and red bands (bands 7, 2 and 1) from MODIS. Results were validated using 30-m-spatial-resolution Landsat scenes coincident with the day of maximum inundation (n = 123 events) for 30,685 points, yielding a mean accuracy of 83% (s.d. = 15%) for empirical thresholds and 80% (s.d. = 12%) for Otsu thresholds (Extended Data Fig. 3). Errors of commission (greater than 65%) were concentrated in northern latitudes, where low sun angle on dark soil causes low reflectance that mimics water35. Errors of omission show no geographic pattern (Extended Data Figs. 4, 5).

Fig. 1: Summary statistics of the Global Flood Database.

a, Number of flood events in the Global Flood Database per country (colour scale), along with the centroid locations and area of each flood event (circles). Countries with no observations are shaded grey (NA, not available). b, Total (cumulative over 2000–2015) exposed population (circles) and exposed area (colour scale) per country (Supplementary Table 6). c, Estimates of annual global population (right axis, red shading; upper bound, GHSL; lower bound, HRSL) and area inundated (left axis, blue line). The 913 flood events represent those for which high-quality data were available (Methods, ‘Flood map quality control’). Population and area exposure to floods are lower in 2000 and 2001 until a second satellite (MODIS Aqua) was launched, increasing the likelihood of mapping a flood. Base maps: Natural Earth, tmap R package51.

Of the 3,054 flood events in the DFO catalogue (compiled largely from news reports), we successfully mapped 913 events with mostly cloud-free MODIS observations. We found no temporal bias of events over time due to increased news-media-reporting trends in the DFO catalogue when compared to another database (the Emergency Events Database36; Extended Data Fig. 9a). MODIS could not detect floods in 2,141 events because of persistent cloud cover (n = 495 events), small or flash floods (n = 300 events), inaccurate catalogue locations (n = 94), complex terrain (for example, dense forest, cities; n = 44 events) or other reasons (n = 1,208; Extended Data Fig. 6, Supplementary Table 9). Event maps may underrepresent the maximum flood extent, owing to the aforementioned uncertainty, and damaging floods underrepresented by the media may be absent from the DFO catalogue.

Most events in the Global Flood Database occurred in Asia (n = 398; 52 in China and 85 in India), followed by the Americas (n = 223; 98 in the US), Africa (n = 143), Europe (n = 92) and Oceania (n = 57; Fig. 1a). Many flood events occurred across multiple countries, giving rise to 2,617 single-country events observed by MODIS. We estimate that 255–290 million people (about 3% of the global population) have been exposed to at least one observed event since 2000 and three flood events on average (735–892 million total exposures; Fig. 1c). Consistent with flood model estimates37, 90% of exposure is concentrated in south and southeast Asia. Most flood events were caused by heavy rainfall (n = 751), followed by tropical storms or surges (n = 97), snow or ice melt (n = 52) or dam breaks (n = 13). The largest cumulative global inundated area occurred in 2003 and 2007, with highest population exposure in 2007 and 2010. We highlight notable events with high human and socioeconomic losses in Fig. 2a–d.

Fig. 2: Observed inundation and flood duration for selected extreme events.

a, b, Observed inundation exceeding permanent water (from the Joint Research Centre25) for the events in the Global Flood Database with the highest mortality (a; cyclone Nargis, Burma, 2008; roughly 100,000 people) and with the most expensive recovery (b; hurricane Katrina, USA, 2005; $60 billion (USD)). c, d, Flood duration exceeding permanent water for the events in the Global Flood Database with the highest estimated exposure (c; India and Bangladesh, 2004; 27 million people exposed) and with the largest area (d; Russia, 2003; 98,000 km2). Base maps: Light Gray Canvas, Esri, HERE, Garmin, INCREMENT P, OpenStreetMap contributors and the GIS user community.

Flood-exposed population 2000–2015

The total global population increased by 18.6% from 2000 to 2015, compared with 34.1% in areas of observed inundation. Between 2000 and 2015, 58–86 million people, or 23%–30% of the total population exposed, were newly residing in areas where inundation was observed at least once. The change in the proportion of the population exposed to large flood events (equation (6), Methods) represents a global-mean increase of 20%–24% (s.d. = 53%) across 119 countries. Increased flood exposure was concentrated in low- and middle-income countries (Fig. 4a). Flood exposure trends are probably underestimated in rapidly urbanizing countries, because urban floods are underrepresented in the Global Flood Database. We excluded 15 countries from the trend analyses because the uncertainty in the population estimates was larger than the estimated trends (Supplementary Discussion). The proportion of the population exposed to floods increased across all flood types, but was highest in regions with floods caused by dam breaks, where it increased by 177% (Supplementary Table 8). Increased exposure near flood mitigation infrastructure (such as dams) could be due to the levee effect38.
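
The change in the proportion of the population exposed can be illustrated with a short worked example. Equation (6) is not reproduced in this section, so the sketch below assumes it is the ratio of the exposed share of the population at the end of the period to the share at the start; the function name and the country figures are hypothetical. The pooled global ratio computed from the two growth rates quoted above is a different quantity from the mean of per-country changes, so the two need not agree.

```python
def multiplicative_change(exposed_start, total_start, exposed_end, total_end):
    """Ratio of the flood-exposed share of the population at the end of the
    period to the share at the start (values above 1 mean the share grew)."""
    share_start = exposed_start / total_start
    share_end = exposed_end / total_end
    return share_end / share_start


# Hypothetical country: 1.0 of 50 million people exposed in 2000,
# 1.5 of 60 million exposed in 2015 (illustrative numbers only).
country_change = multiplicative_change(1.0e6, 50.0e6, 1.5e6, 60.0e6)

# Pooled global figures quoted in the text: population in observed inundation
# areas grew by 34.1% while the total population grew by 18.6%. Only growth
# factors are given, so the ratio of shares reduces to a ratio of factors.
pooled_change = 1.341 / 1.186

print(f"hypothetical country: {country_change:.2f}")
print(f"pooled global: {pooled_change:.3f}")
```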

The proportion of the population in inundated areas increased by more than 2% in 70 countries and by more than 20% in 40 countries. Example locations with large population growth in observed inundation areas include Guwahati, India and Dhaka, Bangladesh (Fig. 3c, d). Countries with increased flood exposure were concentrated in Asia and sub-Saharan Africa. Large basins in south and southeast Asia (Indus, Ganges-Brahmaputra and Mekong) had the largest absolute numbers of people exposed (17.0–19.9 million, 107.8–134.9 million and 20.2–32.8 million, respectively) and increased proportions of the population exposed to inundation (36%, 26% and 11%, respectively; Supplementary Table 7).

Fig. 3: Population dynamics per pixel (250-m resolution) in observed inundated areas, 2000–2018.

a, New Orleans, USA, loses population after hurricane Katrina. b, Manaus, Brazil, shows no population change. c, Dhaka, Bangladesh, exhibits increasing population in inundated areas in the peri-urban zone. d, Guwahati, India, an urbanizing town on the Brahmaputra River, has repeatedly been exposed to flooding over the past two decades. Base maps: Google, 2015.

In 21 countries, there was little change in the proportion of the population exposed to floods (between −3% and 2% growth), especially where populations have declined in eastern Europe and Russia39. Population growth in floodplains was heterogeneous within countries as well as across them. For example, in Brazil, flood exposure increased on average, but little to no population growth was observed in inundated areas recorded in the city of Manaus (Fig. 3b).

In 28 countries, the proportion of the population exposed to floods decreased by more than 3%. For example, in the US, the flood-exposed population decreased in New Orleans after hurricane Katrina (Fig. 3a)40. Our data indicate that the flood-exposed population decreased in Sri Lanka, potentially because nearly 500,000 people41 were displaced after the 2004 tsunami, as a result of policies that required residents to relocate 100 m from the shoreline. In the Yangtze basin, the proportion of the population exposed to floods decreased by 7%. MODIS probably did not capture increases in urban flood exposure in at least eight countries with rapid urbanization (for example, with annual urbanization greater than 3%; Angola, Afghanistan, Cambodia, Namibia, Chad, Senegal, Sierra Leone and Oman42, see double asterisks in Supplementary Table 5).

Estimated flood exposure 2010–2030

We calculated the population that will be exposed to floods in the near future (2010–2030) in countries with sufficient MODIS observations (n = 119 countries), using the World Resources Institute flood-risk analyser Aqueduct37. Across these countries, the flood model (GLOFRIS) estimates that 580 million people were exposed to a 100-year-return-period flood in 2010. By 2030, the World Resources Institute37 estimates that up to 758 million people will be exposed in the 100-year flood zone, with the additional 179.2 million people being exposed as a result of demographic shifts (116.5 million people) or climate change (50.3 million people; assuming representative concentration pathway (RCP) 8.5), and synergistic climate–land use interactions (12.4 million people). The proportion of the population exposed to floods is expected to increase globally by 2030, but with variation across countries (Fig. 4b; global-mean increase of 4%, s.d. = 90%) and no sensitivity to the return period (Extended Data Fig. 8). In 57 countries, the increase in flood exposure is expected to outpace future population growth, especially in Asia and Africa7. Although we are already halfway towards these projections, they remain uncertain because of uncertainty in climate43 and future population models. The difficulty in predicting changes in migration patterns and country-specific urban development means that increases in future flood exposure could be underestimated where urbanization is rapidly increasing.

Fig. 4: Change in the proportion of the population exposed to floods observed from 2000 to 2015 and predicted for 2030 per country.

a, Multiplicative change from 2000 to 2015 in the proportion of the population exposed to observed inundation (equation (6), Methods). b, Multiplicative change from 2010 to 2030 in the proportion of the population exposed to floods (equation (7), Methods). The change ranges (coloured shading) in a and b are from ref. 7 to facilitate comparison. c, Countries where the proportion of the population exposed to floods: (1) grew from 2000 to 2015 (multiplicative change >1.02 in a and ≤1.02 in b; pink; ‘decreasing’ flood exposure); (2) is expected to grow from 2010 to 2030 (multiplicative change >1.02 in b and ≤1.02 in a; blue; ‘new’ flood exposure); (3) grew from 2000 to 2015 and is expected to grow from 2010 to 2030 (multiplicative change >1.02 in a and b; purple; ‘continuously increasing’ flood exposure); and (4) is expected to remain constant or decrease (multiplicative change ≤1.02 in a and b; orange; ‘never increasing or little change’ in flood exposure). Countries shown in grey had insufficient flood observations or population uncertainty. Base maps: GADM (Global Administrative Areas) 2018, version 3.6.

We compare the change in the proportion of the population exposed to observed large flood events in the recent past (between 2000 and 2015) and predicted to be exposed in the near future (between 2010 and 2030) for the 106 countries with robust population data, using equation (7) (Methods). We identify countries for which a change in flood exposure greater than the population growth is new or continuously increasing, and for which the change in flood exposure relative to population growth is decreasing or the same. For this classification, ‘new’ signifies flood exposure increasing more rapidly than population only in the future period; ‘decreasing’ signifies flood exposure increasing more rapidly than population only in the past period; ‘continuously increasing’ signifies flood exposure increasing more rapidly than population growth in both time periods; and ‘never increasing or little change’ signifies flood exposure increasing more rapidly than population growth in neither time period.
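
The four-way classification described above can be sketched as a small decision function. This is an illustration only: it assumes a single cut-off of 1.02 on the multiplicative change in both periods, so that the four classes are exhaustive, and the function and constant names are hypothetical.

```python
GROWTH_THRESHOLD = 1.02  # assumed cut-off for "exposure outpaces population"


def classify_exposure(past_change, future_change, threshold=GROWTH_THRESHOLD):
    """Assign a country to one of the four trajectory classes.

    past_change:   multiplicative change in exposed share, 2000-2015 (observed)
    future_change: multiplicative change in exposed share, 2010-2030 (modelled)
    """
    past_up = past_change > threshold
    future_up = future_change > threshold
    if past_up and future_up:
        return "continuously increasing"
    if future_up:
        return "new"
    if past_up:
        return "decreasing"
    return "never increasing or little change"


# Illustrative values, not taken from the supplementary tables.
for past, future in [(1.25, 1.10), (0.98, 1.60), (1.30, 1.00), (1.00, 0.95)]:
    print(past, future, "->", classify_exposure(past, future))
```

Note that ‘decreasing’ here describes the trajectory of the trend, not the sign of the change: exposure outpaced population growth in the past but is not projected to do so in the future.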

Nine regions and 32 countries, spread across four continents, have ‘continuously increasing’ flood exposure (Fig. 4c, Supplementary Tables 4, 5). Five of these countries (four in Africa plus India) exhibit high continuing increases (more than 20%) in the proportion of the population exposed to floods. Five regions and 25 countries will have ‘new’ flood exposure, concentrated in Europe and North America, with the highest increases (more than 50%) in the flood-exposed proportion in Oman and Sudan. Although 3 regions and 29 countries have ‘decreasing’ flood exposure, models still estimate that 2.2 million additional people will be exposed to 100-year-return-period floods by 2030 in those countries. Three regions (Melanesia, Central Asia, and Western Asia (the Middle East)) and 20 countries have seen ‘never increasing or little change’ in flood exposure.

Discussion

Our results provide evidence from satellite observations that increases in flood exposure are higher (20%–24% from 2000 to 2015) than previously estimated (2.6% from 1970 to 2010)7. We find that the proportion of the population exposed to floods increased in 70 countries, across all continents. This finding is in contrast to previous studies that report increases in only 22 or 55 countries, concentrated in sub-Saharan Africa and Asia7,10. We identify additional increases in flood exposure in southern Asia, southern Latin America and the Middle East. Our estimates are higher than previous ones probably because our observations capture floods caused by dam breaks, pluvial events and snowmelt, which are not included in global models. In addition to increased flood exposure in the recent past, we identify 57 countries where exposure is predicted to grow, indicating flood-prone development patterns that place lives and livelihoods at risk.

There are four limitations in our analysis: (1) the incomplete event record, which does not include smaller yet impactful flood events44; (2) the limited ability of MODIS to map urban floods; (3) the uncertainty in the spatial population distribution; and (4) the uncertainty in predicting climate extremes. The Emergency Events Database36 estimates that more than 1.1 billion people were exposed to flood events from 2000 to 2018, 159–208 million more people than estimated by our database. Our study probably underestimates flood exposure trends in rapidly urbanizing countries, owing to uncertainty in satellites and population growth models. The population data used in this study tend to overestimate observed flood exposure29, with uncertainties too large to reliably estimate a flood trend for 15 countries (Supplementary Discussion).

Future work could improve flood-exposed population estimates by: (1) incorporating more events (for example, through social media45) and satellites over longer time periods or at higher resolution; (2) modelling event extents where satellite temporal coverage is insufficient (for example, flash floods); (3) assigning return periods to compare trends from observations to models; and (4) improving spatial estimates of the past, present and future global population.

The Global Flood Database provides a catalogue of global spatial flood event data at 250-m resolution, available for public download. These data could aid calibration of flood models and comparison to improve modelled flood hazard and exposure estimates. Identifying human settlement growth in areas of observed inundation could inform adaptation strategies such as mitigation and managed retreat46. Flood observations may affect the pricing of financial instruments such as municipal bonds47 and insurance48, and may aid planning for a changing (or already changed) tax base. Population growth in the observed inundated areas is largely due to increased economic development and migration to floodplains. Floodplains may be expanding because of increasing impervious surface area49 and climatic changes14. Increasing flood exposure is also rooted in historical and political processes that produce conditions that may make settling in floodplains the only option for vulnerable populations50. Vulnerability analyses, together with the improved flood exposure estimates presented here, should drive investment in flood adaptation directed to the people and places that need it most.

Methods

Flood event catalogues

We used the DFO flood event catalogue as the source for identifying dates and approximate locations of 4,712 major flood events since 1985 (as of 31 December 2018; Extended Data Fig. 1a). Other publicly available global flood event catalogues, such as the Emergency Events Database (Em-Dat)36, provide location data only at the country level. Mapping entire countries when an event occurs in a small area or crosses borders introduces computational challenges and errors. The DFO database provides spatial estimates of flood locations (for example, points and polygons), not available in Em-Dat, that allow us to filter satellite imagery repositories in focused areas for application of flood detection algorithms. DFO also lists the main flood cause, which we simplified into four categories: dams, heavy rain, snow or ice melt, and tropical storms and surges (Supplementary Tables 6, 8).

The inclusion criteria for DFO (primarily large-media-coverage events, including those covered by FloodList (http://floodlist.com/)) and Em-Dat (10 or more flood-related deaths or at least 100 people affected) differ and possibly introduce bias. We compared DFO and Em-Dat events temporally and spatially at the country level to assess the differences. We matched the DFO and Em-Dat events over the study period (2000–2018; during the satellite data record) by using country names and overlapping date periods, using the fuzzyjoin R package52.
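
The matching rule described above (same country, overlapping date periods) can be sketched in plain Python. The original matching used the fuzzyjoin R package; the stand-in below assumes exact country-name equality, and the record fields and event identifiers are hypothetical.

```python
from datetime import date


def periods_overlap(start_a, end_a, start_b, end_b):
    """True if two closed date intervals share at least one day."""
    return start_a <= end_b and start_b <= end_a


def match_events(dfo_events, emdat_events):
    """Pair events that share a country name and have overlapping dates."""
    matches = []
    for dfo in dfo_events:
        for emdat in emdat_events:
            same_country = dfo["country"] == emdat["country"]
            if same_country and periods_overlap(
                dfo["start"], dfo["end"], emdat["start"], emdat["end"]
            ):
                matches.append((dfo["id"], emdat["id"]))
    return matches


# Hypothetical records for illustration only.
dfo_events = [
    {"id": "DFO-1", "country": "India",
     "start": date(2004, 7, 1), "end": date(2004, 8, 15)},
]
emdat_events = [
    {"id": "EM-1", "country": "India",
     "start": date(2004, 8, 10), "end": date(2004, 9, 1)},
    {"id": "EM-2", "country": "India",
     "start": date(2005, 6, 1), "end": date(2005, 6, 20)},
    {"id": "EM-3", "country": "Bangladesh",
     "start": date(2004, 7, 10), "end": date(2004, 8, 1)},
]
print(match_events(dfo_events, emdat_events))
```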

The number of total flood events in the DFO from 2000 to 2018 (n = 3,195) is greater than that in Em-Dat before 2009 (n = 3,010), but less than that in Em-Dat after 2009 (Extended Data Fig. 9a). The number of flood events per year in DFO and Em-Dat is positively and significantly correlated over time (Pearson correlation r = 0.591, P < 0.01), consistent with previous results for 1985–2019 (r = 0.636, P < 0.001)14. Spatial comparison reveals that DFO reports more floods than Em-Dat in the US (192 more events), Australia (79 more events) and Russia (31 more events), but fewer events in South America (36 fewer events), Central America (30 fewer events), the Caribbean (20 fewer events) and Africa (166 fewer events; 94 fewer in west Africa; Extended Data Fig. 9b). This comparison between the databases suggests that the DFO represents trends in major flood events over time, but may underrepresent floods in Africa and South America.

Satellite data and inundation detection algorithm

For historical flood observation, we use the MODIS instrument onboard NASA’s Terra and Aqua satellites. MODIS is an optical sensor commonly used for inundated area mapping26,53,54,55,56,57; its data are freely available, with consistent daily coverage since February 2000 and twice-daily coverage since 2002, when Aqua was launched. The DFO contains 3,127 eligible flood events that co-occurred with MODIS imagery (Extended Data Fig. 1b).

We used the Google Earth Engine platform58 to preprocess and apply water detection algorithms to the MODIS images. The polygon areas provided by DFO represent approximate areas affected by the events. Therefore, we selected all HydroBASINS Level 4 (refs. 59,60) watersheds that intersect with the DFO event polygon as our mapping unit (region of interest) for each event. For each event in the database, we collected and analysed every MODIS image acquired over the selected watersheds during the event date range provided in the DFO. In total, we analysed 12,719 individual MODIS tiles across the 3,127 events (Extended Data Fig. 1).

Terra (MOD09GA/GQ) and Aqua (MYD09GA/GQ) MODIS images used in this study were corrected for atmospheric scattering and absorption to provide estimates of surface reflectance at resolutions of 250 m and 500 m61. MODIS data provide reflectance values (stored as digital numbers scaled by 10,000) in the visible (457–670 nm) and near-infrared (841–1,250 nm) wavelengths at 250-m resolution; short-wave-infrared (1,628–2,155 nm) wavelengths commonly used to identify surface water are provided at 500-m resolution. We pan-sharpened the short-wave-infrared band to 250 m using an adapted version of the corrected reflectance algorithm62 to match the resolution of other bands.

Estimates of inundation extent were produced at 250-m resolution using thresholding approaches based on an existing algorithm53. We produced inundation maps for every event using four versions of the algorithm: 3-day standard, 2-day standard, 3-day Otsu and 2-day Otsu.

The ‘standard’ versions of the algorithm identify water using fixed threshold values on stored reflectance values (digital numbers) of the short-wave-infrared (SWIR) band (band 7; 2,105–2,155 nm) and an index, B2B1ratio, defined as

$${{\rm{B2B1}}}_{{\rm{ratio}}}=\frac{{{\rm{DN}}}_{{\rm{NIR}}}+13.5}{{{\rm{DN}}}_{{\rm{red}}}+\mathrm{1,081.1}},$$
(1)

where DNNIR and DNred are the digital numbers of the near-infrared (band 2) and red (band 1; 621–670 nm) bands. A pixel is classified as water via the following:

$${{\rm{pixel}}}_{{\rm{water}}}={{\rm{DN}}}_{{\rm{red}}} < C\wedge {{\rm{B2B1}}}_{{\rm{ratio}}} < {K}_{1}\wedge {{\rm{DN}}}_{{\rm{SWIR}}} < {K}_{2}.$$
(2)

In the standard algorithm, K1 = 0.7, K2 = 675 and C = 2,027. The constants in equations (1) and (2) were determined empirically (R2 = 0.91) by regression against discharge data from river reaches gauged by the US Geological Survey (USGS)53.
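
A per-pixel version of the standard test can be sketched as follows. This is a minimal illustration rather than the production Earth Engine code: it assumes that a pixel must satisfy all three conditions at once (logical AND), uses the constants quoted above, and takes scaled digital numbers as plain Python numbers. The function names and sample pixel values are hypothetical.

```python
# Constants quoted in the text for the standard algorithm.
NIR_OFFSET = 13.5
RED_OFFSET = 1081.1
K1 = 0.7     # upper limit on the band ratio
K2 = 675     # upper limit on the SWIR digital number
C = 2027     # upper limit on the red digital number


def band_ratio(dn_nir, dn_red):
    """Offset ratio of near-infrared (band 2) to red (band 1), equation (1)."""
    return (dn_nir + NIR_OFFSET) / (dn_red + RED_OFFSET)


def is_water(dn_red, dn_nir, dn_swir, k1=K1, k2=K2, c=C):
    """Equation (2), assuming the three conditions are combined with AND."""
    return dn_red < c and band_ratio(dn_nir, dn_red) < k1 and dn_swir < k2


# Hypothetical pixels (digital numbers = reflectance x 10,000).
print(is_water(dn_red=500, dn_nir=300, dn_swir=200))    # dark in all bands
print(is_water(dn_red=400, dn_nir=3000, dn_swir=1500))  # bright in NIR and SWIR
```

Passing `k1` and `k2` as arguments allows the same function to be reused with event-specific thresholds.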

The ‘Otsu’ versions of the algorithm adjust the thresholds by estimating K1 and K2 (equation (2)) adaptively for each flood event34,63. Otsu thresholding requires a bimodal distribution, in our case representing spectral reflectance of water and non-water, to determine a threshold that maximizes interclass variance (that is, minimizes misclassification). We extracted a sample of 2,500 water and non-water pixels (1,250 sampled for each class) from a median composite of MODIS images for each flood event, with clouds removed using the internal cloud-state band. Water and non-water pixels for each flood event were differentiated by matching the flood event year to the permanent water classification for that year from the Joint Research Centre global surface-water yearly history dataset25. From our sample, interclass variance was calculated as the between sum-of-squares (BSS):

$${{\rm{BSS}}}_{T}=\sum _{k}{({\bar{{\rm{DN}}}}_{T,k}-{\bar{{\rm{DN}}}}_{T})}^{2},$$
(3)

where DNT,k is the mean surface reflectance (provided as a digital number) in band T and class k defined by a selected threshold. BSST was calculated iteratively across each bin of a bimodal histogram, representing candidate threshold values, for B2B1ratio and DNSWIR. The threshold with the maximum BSST, and thus the maximized interclass variance, was selected for both B2B1ratio and DNSWIR, and then applied to equation (2) (Extended Data Fig. 2a, b). Using the flood events that passed quality control (see below), the average Otsu thresholds for B2B1ratio and DNSWIR were K1 = 0.77 and K2 = 599, respectively (Extended Data Fig. 2c, d). Compared to the standard thresholds, the Otsu method provides threshold estimates that represent global water conditions as opposed to USGS gauge data. Although the Otsu method estimates event-optimized thresholds, the fact that the median Otsu thresholds approximate the standard thresholds suggests that the standard thresholds perform consistently on a global basis.
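
The threshold search can be sketched as a scan over candidate values that keeps the one with the largest BSS. The example follows equation (3) as printed, in which the two class means are not weighted by class size (the classic Otsu formulation does weight them), and it takes the overall mean over all sampled pixels. The bin count, function names and sample values are hypothetical.

```python
from statistics import fmean


def between_sum_of_squares(values, threshold):
    """Equation (3): squared distance of each class mean from the overall mean.
    Returns None when the threshold leaves one class empty."""
    low = [v for v in values if v < threshold]
    high = [v for v in values if v >= threshold]
    if not low or not high:
        return None
    overall = fmean(values)
    return (fmean(low) - overall) ** 2 + (fmean(high) - overall) ** 2


def otsu_threshold(values, n_bins=28):
    """Scan equally spaced candidate thresholds and keep the first one that
    attains the maximum BSS."""
    lowest, highest = min(values), max(values)
    step = (highest - lowest) / n_bins
    best_threshold, best_bss = None, float("-inf")
    for i in range(1, n_bins):
        candidate = lowest + i * step
        bss = between_sum_of_squares(values, candidate)
        if bss is not None and bss > best_bss:
            best_threshold, best_bss = candidate, bss
    return best_threshold


# Hypothetical SWIR digital numbers: a dark (water) and a bright (land) mode.
sample = [200, 250, 300, 350, 400, 1400, 1450, 1500, 1550, 1600]
print(otsu_threshold(sample))
```

Any threshold that falls in the gap between the two modes separates them equally well; the sketch returns the first such candidate.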

We use equation (2) to classify each MODIS image over a region of interest and period of a flood event. After classifying each MODIS image, using either standard or Otsu versions, we calculate multiday composites to reduce false detections. Using 3-day composites, a pixel maintains a water classification if at least three observations out of a possible six (at least 50%) were classified as water; 2-day composites require two observations out of four (at least 50%). Reducing images to multiday composites removes misclassifications due to cloud shadow, a common misclassification with water64. We did not mask clouds with the MODIS 1-km internal cloud-state band, because it removed large portions of flooded area detectable under thin or cirrus cloud conditions. To prevent confusion between water and terrain shadows, areas with slopes greater than 5° were masked out of the final classification using a digital elevation model65, similarly to other water detection studies57.
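
The compositing rule can be sketched as a simple vote over the observations in the window. The detection counts are the ones stated above (three of up to six observations for 3-day composites, two of up to four for 2-day composites); the function names and the nested-list grid layout are hypothetical.

```python
# Minimum number of water detections required within each window:
# 3-day window = 3 of up to 6 observations, 2-day window = 2 of up to 4.
MIN_DETECTIONS = {3: 3, 2: 2}


def composite_pixel(water_flags, window_days):
    """Keep the water label if enough observations in the window agree."""
    return sum(water_flags) >= MIN_DETECTIONS[window_days]


def composite_grid(observations, window_days):
    """Apply the vote to every pixel of a stack of equally sized 2D grids
    (each grid is a list of rows of booleans, one grid per observation)."""
    n_rows = len(observations[0])
    n_cols = len(observations[0][0])
    return [
        [
            composite_pixel([obs[r][c] for obs in observations], window_days)
            for c in range(n_cols)
        ]
        for r in range(n_rows)
    ]


# Six observations of a 1 x 2 grid: the left pixel is wet three times,
# the right pixel only once (for example, a passing cloud shadow).
stack = [
    [[True, True]],
    [[True, False]],
    [[True, False]],
    [[False, False]],
    [[False, False]],
    [[False, False]],
]
print(composite_grid(stack, window_days=3))
```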

Inundated pixels are defined as those classified as water following the 3- or 2-day compositing and that lie outside of permanent water defined by the Global Surface Water dataset25. In the Global Surface Water dataset, pixels are identified as permanent water when the Landsat observations in 1985–1999 and in 2000–2016 have water presence. After post-processing, each flood event has four data products (3-day standard, 2-day standard, 3-day Otsu and 2-day Otsu), each of which contains four bands: (1) the maximum extent of inundation; (2) the number of days inundated; (3) the number of clear observations; and (4) the proportion of clear observations.
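
Taken together, the masking rules amount to a per-pixel test that combines the composite water label with the permanent-water and slope masks. The sketch below is illustrative only; the function name and inputs are hypothetical, and the 5° slope limit is the one given in the preceding paragraph.

```python
MAX_SLOPE_DEG = 5.0  # steeper pixels are masked to avoid terrain shadow


def is_inundated(composite_water, permanent_water, slope_deg):
    """A pixel counts as inundated if the multiday composite labels it water,
    it lies outside permanent water and its slope does not exceed the limit."""
    return composite_water and not permanent_water and slope_deg <= MAX_SLOPE_DEG


print(is_inundated(True, False, 1.2))   # floodplain pixel
print(is_inundated(True, True, 0.0))    # river channel (permanent water)
print(is_inundated(True, False, 12.0))  # steep slope, likely terrain shadow
```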

Evaluating the inundation detection algorithm

To assess the accuracy of the Global Flood Database, we identified 123 flood events with coincident Landsat 5, 7 and 8 imagery at 30-m resolution available within 24 h of the day of maximum inundation and less than 20% cloud cover. Maximum inundation dates were estimated by selecting the day (between the start and end dates for each event) with the largest inundated area estimated by the flood detection algorithm. The 123 flood events used for accuracy assessment span 15 biomes, representing diverse landscape conditions66 (Extended Data Fig. 3).

The number of sampling points selected in remote sensing analysis can affect map accuracy60. We conducted sensitivity analysis to determine the number of validation points required to minimize the variance in precision, recall and overall accuracy. We sampled 500 points for 10 flood events, stratified as 25% in permanent water, 50% in flood water and 25% in non-water regions. Points were randomly subsampled, without replacement, to assess accuracy from 0 to 500 points (Extended Data Fig. 4a). We found that the standard deviation in accuracy fell below 0.1 when 250 or more points were sampled, and therefore chose to sample 250 points per flood event for the remainder of the dataset.
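
The subsampling experiment can be sketched as repeated draws without replacement at each sample size, followed by the standard deviation of the resulting accuracies. The number of repeats, the seed, the function name and the 80%-correct synthetic event are assumptions made for illustration, and the sketch ignores the stratification by water class.

```python
import random
from statistics import pstdev


def accuracy_spread(correct_flags, sample_size, n_repeats=200, seed=0):
    """Standard deviation of overall accuracy across repeated subsamples
    of the validation points, drawn without replacement."""
    rng = random.Random(seed)
    accuracies = []
    for _ in range(n_repeats):
        subsample = rng.sample(correct_flags, sample_size)
        accuracies.append(sum(subsample) / sample_size)
    return pstdev(accuracies)


# Hypothetical event: 500 interpreted points, 400 of which the map got right.
flags = [True] * 400 + [False] * 100

for size in (10, 50, 250, 500):
    print(size, round(accuracy_spread(flags, size), 4))
```

Because the draws are without replacement, the spread shrinks to zero when all 500 points are used.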

Interpretation of validation points was undertaken by a team of analysts who identified each point as water, non-water or no data, totalling 30,685 validation points. These analysts had access to Landsat images visualized in natural colour, false-colour infrared and two indices that highlight water (the normalized difference vegetation index and the modified normalized difference water index)56,67 to decide whether each pixel was at least 50% dry or wet. Each validation point was assessed by three separate analysts, with the majority vote determining the class of the validation point.
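
The majority-vote rule can be sketched in a few lines. The text does not say how a three-way split between the labels was resolved, so returning `None` in that case is an assumption; the function name is hypothetical.

```python
from collections import Counter


def majority_label(votes):
    """Return the label chosen by more than half of the analysts,
    or None when no label has a majority (assumed handling)."""
    label, count = Counter(votes).most_common(1)[0]
    return label if count > len(votes) / 2 else None


print(majority_label(["water", "water", "non-water"]))
print(majority_label(["water", "non-water", "no data"]))
```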

Classification agreement and errors were calculated by comparing per pixel classes from the produced flood maps to the validation data (Supplementary Table 1). Errors of omission (εom) and commission (εcom)56 were calculated as follows:

$${\varepsilon }_{{\rm{om}}}=\frac{{f}_{{\rm{n}}}}{{t}_{{\rm{p}}}+{f}_{{\rm{n}}}},$$
(4)
$${\varepsilon }_{{\rm{com}}}=\frac{{f}_{{\rm{p}}}}{{t}_{{\rm{p}}}+{f}_{{\rm{p}}}},$$
(5)

where tp is the count of true positives, fn is the count of false negatives and fp is the count of false positives.
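
Equations (4) and (5) translate directly into code. The counts in the example are hypothetical and serve only to show the arithmetic.

```python
def omission_error(true_pos, false_neg):
    """Equation (4): share of actual water points that the map missed."""
    return false_neg / (true_pos + false_neg)


def commission_error(true_pos, false_pos):
    """Equation (5): share of mapped water points that were not water."""
    return false_pos / (true_pos + false_pos)


# Hypothetical confusion counts for one flood event.
tp, fn, fp = 80, 20, 10
print(f"omission: {omission_error(tp, fn):.3f}")
print(f"commission: {commission_error(tp, fp):.3f}")
```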

The 3-day standard algorithm performed best, with an overall accuracy of 83%; 43% of floods had accuracies of more than 90% and 65% had accuracies above 75% (Supplementary Table 2). The standard version of the algorithm was more consistent than the Otsu version. Although Otsu thresholds reduced false detections and increased accuracy in some events, they overpredicted flood extent in others (Extended Data Fig. 4b). Errors of omission had no clear geographic pattern, whereas errors of commission were inflated at higher latitudes (Extended Data Fig. 5).

Flood map quality control

To create the final library of flood maps, every map underwent a quality control process to eliminate poor-quality maps and choose the best map between the two thresholding methods. Because the 3-day composite versions of the algorithms had higher accuracy than the 2-day composite versions on average, all final maps were chosen from the 3-day composite results (Extended Data Fig. 4c). Each flood map (n = 3,195) was visually inspected to assess whether the map was a suitable representation of flooding. We used a quality control procedure similar to that for the NASA flood detection algorithm35. Quality control was completed by analysts, using the metrics summarized here (see Supplementary Table 3 for a complete list). Analysts recorded: (1) whether a flood map mapped area additional to permanent water (from water masks25,68 or Google Earth); (2) whether a flood map was obscured by clouds; and (3) which version of the algorithm (standard or Otsu) best matched visible water from MODIS imagery for the maximum inundation date. They determined the product to be a useful representation of the flood event if it mapped inundation beyond permanent water and was not largely obscured by clouds. To make quality control decisions, analysts viewed the DFO polygon, all original MODIS imagery for the flood event, the standard flood map, the Otsu flood map, underlying high-resolution satellite imagery from Google Earth and a hyetograph of the 95th percentile of precipitation in the region of interest estimated by the PERSIANN data product69.

279 flood events were assessed by at least two separate analysts to calculate intercoder reliability. Analysts agreed on classifying the flood event as “a useful representation of the flood event” (Supplementary Table 3, question 3) for 203 events, representing 73% intercoder reliability. Flood events marked as ‘maybe’ or for which analysts disagreed were quality checked by B.T. to make final decisions.

Some floods of low quality may be present in the database that should not have passed quality control, and local knowledge of any area should be leveraged when using these global data. We encourage users to pair our online catalogue of events and flood dates with the MODIS worldview tool (https://worldview.earthdata.nasa.gov/) to visually examine whether additional flood extent could be mapped by downloading individual MODIS images where water is present in only 1–2 observations and therefore underestimated in the 3-day composites. Supplementary Table 9 includes the quality control information for each flood event.

Quality control results yield 913 flood maps determined to be useful representations of flooding (29.4% of all the DFO events that were mapped). For 124 flood extents (13.6%), Otsu thresholding captured flood extent better than the standard threshold, but most flood maps (789 flood extents; 86.4%) used the standard threshold. The Global Flood Database produced by this study therefore includes 789 maps using the standard threshold and 124 maps using event-specific Otsu thresholds.

A large proportion of flood events from the DFO (2,212 events; 43.1% of events mapped) did not reveal areas of widespread flooding and failed quality control (Extended Data Fig. 6). The top three reasons noted for failing quality control are extreme cloud cover (n = 495; 16% of events), no standing water beyond existing permanent water (n = 300; 10% of events) and unmapped floods in urban areas (n = 44; 1.5% of events). MODIS may fail to capture: (1) rapid, flash flood events; (2) small channels of water below 250-m resolution (for example, flooded streets in urban areas70); (3) inundation below dense canopy cover (for example, greater than 60%)71; and (4) maximum inundation if the event catalogue start and end dates are not inclusive of the peak flood day.

Estimating observed flood exposure, recent past

To examine flood exposure trends, we use the multiplicative change in the proportion of the population exposed to floods between 2000 and 20157:

$${{\rm{change}}}_{{\rm{fe}}}^{2000\mbox{--}2015}=\frac{{p}_{{\rm{fe}}}^{2015}/{p}_{{\rm{tot}}}^{2015}}{{p}_{{\rm{fe}}}^{2000}/{p}_{{\rm{tot}}}^{2000}}.$$
(6)

Therefore, \({{\rm{change}}}_{{\rm{fe}}}^{2000\mbox{--}2015}=1.35\) is equivalent to a 35% increase; \({{\rm{change}}}_{{\rm{fe}}}^{2000\mbox{--}2015}=1\) (that is, no change) occurs when the total population and flood-exposed population increase or decrease at the same rate.

Each country’s statistic is calculated individually (Supplementary Table 5) and the global mean is an average across countries (weighting each country equally). We also estimate the change in the proportion of the population exposed to floods for distinct flood types (Supplementary Table 8) and for the five basins with the largest total population exposed to floods in our archive (UK, Indus, Ganges-Brahmaputra, Mekong and Yangtze; Supplementary Table 7).
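
Equation (6) and the equal-weight average across countries can be illustrated with a short sketch. The country names and population figures are hypothetical and chosen only so that one country reproduces the 1.35 (35% increase) example and the other shows no change; none of the values are taken from the Supplementary Tables.

```python
import math


def exposure_change(p_fe_start, p_tot_start, p_fe_end, p_tot_end):
    """Multiplicative change in the proportion of the population exposed to
    floods, as in equation (6)."""
    return (p_fe_end / p_tot_end) / (p_fe_start / p_tot_start)


# Hypothetical inputs: (flood-exposed 2000, total 2000, flood-exposed 2015, total 2015)
countries = {
    "Country A": (1.00e6, 10.0e6, 1.62e6, 12.0e6),  # exposed share 10.0% -> 13.5%
    "Country B": (2.00e6, 20.0e6, 2.20e6, 22.0e6),  # exposed share 10.0% -> 10.0%
}

change = {name: exposure_change(*values) for name, values in countries.items()}

# Global mean with each country weighted equally, regardless of population size
global_mean = sum(change.values()) / len(change)

print({name: round(value, 3) for name, value in change.items()})
print(round(global_mean, 3))
```

Country B grows in both total and flood-exposed population at the same rate, so its ratio stays at 1 even though more people are exposed in absolute terms.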

To estimate the global flood-exposed population, we calculated the maximum inundated area across the Global Flood Database between 2000 and 2018. This observed inundated area was intersected with the GHSL for years 2000 and 2015 to calculate the flood-exposed population. GHSL was selected because of its global availability over time with a consistent method, matching resolution of MODIS (250-m pixels) and better accuracy compared to other globally available population data72,73,74,75. GHSL allocates population from census data (within several years of 2000 and 2015) according to the intensity of built-up-area estimates from Landsat (for approximately 2000 and 2015). Other global gridded population datasets either use inconsistent methods over time (for example, Landscan)76 or inflate estimates in rural areas because population is not allocated on the basis of built-up area29,77.

GHSL population estimates have multiple sources of error, including census estimates, incorrect estimation of built-up area or failing to distribute population in rural areas where forest cover obscures built-up area. Unfortunately, neither the Gridded Population of the World (GPW) nor GHSL datasets provide uncertainty estimates72. To understand potential sources of error, we conducted a sensitivity analysis on the flood-exposed population in 2015 estimated by GHSL compared to a higher-resolution dataset, HRSL (Supplementary Discussion). HRSL was selected as a second dataset for its high resolution (30-m) and near global representation (n = 183 countries).

We found that the global-mean bias of HRSL to GHSL was 0.67 (s.d. = 0.40), indicating that GHSL systematically predicts higher flood-exposed populations. The bias of GHSL to predict more exposed population was not constant by region; bias was three times as high in Africa compared to Europe (Supplementary Discussion, Extended Data Fig. 7). We estimate all absolute numbers of exposed population in a range, using upper and lower bounds estimated from the two population datasets. Countries (n = 15) for which the potential population error spread was higher than the flood trend are not included in trend analyses (Supplementary Table 5, single asterisks).

Owing to the potential noise of scattered singular flood pixels, especially along coastlines, which contain mixed pixels at the ocean–land interface, we removed isolated pixels (not connected to at least two other pixels) for area and population calculations. This reduced the population exposure count globally by approximately 20 million people, but did not change the results of the comparison to global flood models or trends.
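
One way to implement this filter is sketched below, under two assumptions that are not stated above: "connected to at least two other pixels" is read as belonging to a connected group of at least three flooded pixels, and connectivity includes diagonal neighbours (8-connectivity). The function name and the example grid are hypothetical.

```python
from collections import deque


def remove_isolated_pixels(grid, min_size=3):
    """Keep flooded pixels (1) only if their 8-connected component has at
    least min_size pixels; smaller components are set to 0."""
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if not grid[r][c] or seen[r][c]:
                continue
            # Breadth-first search over one connected component
            component = []
            queue = deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                component.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and grid[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            if len(component) >= min_size:
                for y, x in component:
                    out[y][x] = 1
    return out


# Hypothetical flood mask: one 3-pixel group, one 2-pixel group, two single pixels
flood = [
    [1, 1, 0, 0, 0],
    [0, 1, 0, 0, 1],
    [0, 0, 0, 0, 0],
    [1, 0, 0, 1, 1],
]
cleaned = remove_isolated_pixels(flood)
for row in cleaned:
    print(row)
```

Only the three-pixel group in the upper left survives; the pair and the two single pixels are removed.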

The flood-exposed population was estimated per country by summing populations residing in the observed floodplain for years 2000 and 2015. Countries with a ratio of flood maps to total known flood events from the DFO of less than 0.13 (the 50th percentile across all countries) were marked as having insufficient data (n = 86 countries; Extended Data Fig. 6c), leaving 119 countries for this trend analysis. Country estimates of the population inundated in at least one observed flood event from 2000 to 2018 are significantly correlated with flood exposure estimates from GLOFRIS78 for the 100-year return period (r = 0.89, P < 0.001; Extended Data Fig. 10). These results suggest that the distribution of the flood-exposed population recorded in the Global Flood Database is consistent with results from a flood model, and that the data may be used to compare past and future trends.

Estimating modelled flood exposure, near-term future

Estimates of the population exposed to future floods are taken from Aqueduct37, for each country available in the Global Flood Database and for which robust population data are available based on uncertainty analyses (n = 106). These data are from the output of GLOFRIS, which uses an average of five climate model outputs coupled to a hydrologic and hydraulic model. We used the RCP 8.5 climate model results and the 100-year flood zone, which is consistent with other flood exposure studies7,10,11,79, and the SSP2 socioeconomic pathways scenario (which predicts future growth will follow historical patterns)33. Flood exposure estimates for 2030 remain uncertain because climate models exhibit high uncertainty for extreme events and often disagree on precipitation trends43. The multiplicative change in the proportion of the population at risk of flood exposure is

$${{\rm{change}}}_{{\rm{fe}}}^{2010\mbox{--}2030}=\frac{{p}_{{\rm{fe}}}^{2030}/{p}_{{\rm{tot}}}^{2030}}{{p}_{{\rm{fe}}}^{2010}/{p}_{{\rm{tot}}}^{2010}},$$
(7)

where \({p}_{{\rm{fe}}}^{2030}\) assumes RCP 8.5 and SSP2, and \({p}_{{\rm{tot}}}^{2030}\) assumes SSP2. We assessed the sensitivity of our choice of the 100-year flood zone and found little variation in trends in the population at risk of flood exposure across return periods (Extended Data Fig. 8a–d, Supplementary Discussion).

We use estimates of flood-exposure population from Aqueduct80 and summarize the methods here. The 2010 flood-exposure population data in this product were estimated by intersecting the GLOFRIS inundated area with the Landscan 2010 gridded population81 corrected by the SSP2 2010 population cell estimates. Population estimates for 2030 in rural areas downscale SSP2 country estimates in proportion to the 2010 Landscan distribution. Urban population estimates for 2030 downscale SSP2 country projections, using projected urban land use from the Netherlands Environmental Assessment Agency 2UP model82 and local suitability for population growth. This 2030 projection does not take into account urban–urban or urban–rural migration patterns, or differentiate population growth suitability per country in protected areas or flood zones (which could be high in the Global South)83. Future flood-risk estimates may be overestimated in rural areas and underestimated in urban areas, which would mean that flood exposure trends reported here are probably underestimated in rapidly urbanizing regions.