Environmental DNA signatures distinguish between tsunami and storm deposition in overwash sand

Sandy onshore deposits from tsunamis are difficult to distinguish from storm deposits, which makes it difficult to assess coastal hazards from the geological record. Here we analyse environmental DNA from microbial communities preserved in known tsunami and storm-deposited sediments and intercalating soils and non-marine sediments near Cuddalore, India, and Phra Thong Island, Thailand. Both sites were impacted by the 2004 Indian Ocean Tsunami and a subsequent storm flooding event (2011 Cyclone Thane at Cuddalore and a 2007 storm at Phra Thong Island). We show that the microbial communities in the overwash deposits are significantly different from soil and sediments that are not derived by overwash processes at both locations. Our method also successfully discriminates between modern tsunami deposits and storm deposits. We suggest molecular techniques have the potential to accurately discriminate overwash deposits from catastrophic natural events. Overwash deposits from tsunamis and storm surges in Thailand and India can be differentiated from each other and from terrestrial and marine sediment through differences in their soil microbial communities, according to DNA meta-barcoding of sediments.

T he rapid growth of coastal populations significantly increases the exposure of people and assets to coastal hazards such as tsunamis and storm surges 1,2 . Therefore, assessing coastal hazard risks lies at the core of policy planning and sustainable coastal development 3 . One key piece of information to assess flooding risks in coastal areas is to examine the frequency and magnitude of past events 3,4 . The magnitude and impact of modern events are well captured by instrumental and observational data 3 . However, the information that these technological advances provide is insufficient as coastal hazard scientists attempt to extend the knowledge of hazard intensities into the historical past to derive recurrence rates of large, infrequent catastrophic events 3,5 . To address the inherent limitation that the insufficient timespan of observational data provides, the occurrence of extreme wave events is commonly inferred from sedimentary deposits in the geological records [5][6][7][8][9] .
Tsunami and storm hazards in the geological record. Significant debate exists within the geoscience community about how to differentiate between the sediment deposits left behind by tsunamis and storms [10][11][12][13] . Tsunami and storm surge deposits are frequently indistinguishable due to similarities in sedimentary characteristics, even though they are generated by different mechanisms 12,14,15 . For this reason, multiproxy approaches have been developed by combining different diagnostic features of tsunami and storm deposits. However, each technique has inherent limitations. For example, grain size analysis can be inconclusive 9,16 ; microfossils such as diatoms, foraminifera, and ostracods may only be preserved for a relatively short time in tropical settings as compared to temperate locations [17][18][19] ; and chemical elemental signals present in the geological record might be misleading if they have been modified or removed by natural processes such as precipitation 20 and microbial activities 21 , or if the elemental sources are ambiguous 16 . In addition, compiling multiple proxies might not unequivocally show that deposits were laid down by overwash processes 22 .
To date, several studies have focused on microbial community changes after a tsunami flooding event (Table 1). A number of studies utilize conventional culturing techniques that target the culturable fraction of the microbial communities [24][25][26][27] . However, the culturable fraction of the community is unlikely to contain a detailed signature of the type of overwash since only a few microorganisms (<5%) within natural microbial communities are culturable 34,35 . Somboonna et al. 29 , and Asano et al. 21,28 were pioneers in using metabarcoding to study tsunami deposits. They reported that microbial communities in the sediment samples on the coastal zone vary before and after tsunami events, implying that tsunami deposits may contain a distinctive microbial signature. While all the studies listed in Table 1 focused on either comparing between tsunami deposits and unflooded terrestrial sediment samples, or marine and intertidal sediment before and after the flooding event (except 36,37 ), none investigated the microbial community changes within the geological record at the same location. Moreover, there are no studies that have investigated the differences in the microbial community between tsunami and storm deposit.
In this study, we applied molecular techniques to examine the microbial signatures of pulsed overwash deposits in two locations and attempt to answer some key questions in coastal hazard studies: (i) Do microbes differ in respective locations and environments? If yes, can we identify individual tsunami and storm deposits from intercalating soil and sediments? (ii) If we can differentiate overwash deposits from intercalating soil and sediment, can we distinguish between modern tsunami and storm deposits? (iii) Is it possible to identify the source of overwash deposits using molecular techniques? (iv) Can we detect a global microbial indicator for overwash deposits?
To answers all these questions, we collected samples from one site each in India and Thailand (Fig. 1) that were both impacted by the 2004 India Ocean Tsunami (2004 IOT). At the Thailand site, the 2004 IOT deposits occurred in a swale, and the storm overwash formed during an unnamed tropical depression in 2007, and was deposited behind the modern berm on a sandy beach ridge (Fig. 1c). At the Indian site, the 2004 IOT deposits were deposited behind a sandy beach dune, that was subsequently inundated by Tropical Cyclone Thane in December 2011 (Fig. 1f). We selected the Thailand site because the 2004 IOT deposits is clearly demarcated by a sandy layer between two organic mud layers 5,9 , and which differ from the sandy sediments coastward of the swale (Fig. 1c). The Thailand site serves as a good site to benchmark our microbial metabarcoding approach to investigate geological records. Next, we examined sediment samples from the India site, in which different units are sandy and display limited sedimentary differences (Fig. 1f). The Indian site allows us to explore our microbial metabarcoding approach to compare with conventional sedimentological and stratigraphic methods and data, such as sediment grain-size characteristics (Fig. 1), and organic geochemistry, such as C, N, and S concentrations (Supplementary Fig. 4). We applied a simple, next-generation highthroughput microbial metabarcoding approach (Fig. 2) that utilizes the genetic material extracted directly from modern environmental samples to characterize the microbial assemblages that are present (both alive and dead organisms) in the deposits 36 . Knowing "what is there" in the geological record allows the investigation of whether microbial diversity and abundance are underpinned by geological (e.g., deposition, weathering) or ecological (e.g., carbon cycle, nitrogen cycle) processes, and whether the microbial community differences can be related to the type of deposit or tied to the deposits source.
Study sites description. We collected sediment samples from two well-researched study sites: Phra Thong Island, Thailand 5,9,16 and Cuddalore, Tamil Nadu, India [38][39][40] (Fig. 1). Both sites were impacted by the 2004 Indian Ocean Tsunami (2004 IOT) and by different storm events; a minor tropical depression in 2007 at Phra Thong Island, and a Category 2 tropical cyclone-Cyclone Thane in 2011 at Cuddalore. Thus, these two sites provide a rare opportunity to directly compare: (i) known tsunami and storm events that have impacted the same coastline and (ii) the same tsunami event that impacted two different sites with markedly different coastal morphologies (Fig. 1, Supplementary Fig. 1). The Phra Thong site contains a ridge and swale setting with a variety of environmental and sedimentological features including sandy beach ridges and dunes and organic-rich, muddy swales (Fig. 1d, Supplementary Fig. 1) 5 . In contrast, the Cuddalore system has lower environmental diversity and is dominated by the sandy sediments characteristic of a coastal beach-dune and associated • Tsunami-affected land had a higher pH in the surface soil layer compared to subsurface as opposed to land not affected by tsunami.
• Soil microbes in unaffected land had a higher culturable microbial population compared to tsunamiaffected land.
• Area that was once submerged by tsunami had a higher culturable microbial population in the soil after the water had permanently receded compared to areas that were permanently submerged. 25 2004

Indian Ocean Tsunami
India, southeast coast between Vanagiri and Nagoor coastal sediments.
NA Cultured-based isolation of microbes and fungi.
• Pre-tsunami sediment samples had higher culturable microbial diversity than post-tsunami samples.
• Based on potential metabolic functions in the microbial communities using MetaGenomics-Rapid Annotation using Subsystems Technology (MG-RAST), the microbial community metabolic system in both locations were different. S1 was predominantly advanced metabolic subsystems of regulating and cell signaling, cell wall and capsule, protein metabolism, sulfur metabolism and carbohydrate whereas S2 carried high metabolic potentials for pathways of respiration, photosynthesis, and drug and bioactive compound production.
• The habitat prediction based on percent of species indicators for marine, brackish, freshwater and terrestrial niches indicated that S1 largely consisted of marine-habitat indicator species.  • Tsunami-affected soil has more strains isolated than the unaffected soil.
• The isolated strains were found to belong to a single genus, Arthrobacter. This genus has a losses gene that responds to adaptation to an environment with high-iron concentration and the tsunamiaffected soil were found to be rich in iron. Metagenomic data show that tsunamiaffected soil samples have an overrepresentation of denitrificationrelated gene as well as the presence of pathogenic and marine bacterial genera and salt tolerant bacteria. 27 backshore system (Fig. 1g, Supplementary Fig. 1) 38,39 . Detailed geomorphological description is included in the methodology section.

Results
Microbial community composition and diversity within samples. We analyzed the microbial community present in two sediment cores, two beach sand, two intertidal sand, and two marine sediments (total 21 sediment samples) collected from Phra Thong Island and one sediment core, two beach sand, two lagoon sediments, and 6 marine sediments (total 36 sediment samples) and 16 water samples from Cuddalore (Supplementary Data 1). All samples were analyzed with metabarcoding of the SSU rRNA gene which is a universal genetic marker. The metabarcoding generated an average of 84,011 DNA sequences that were grouped into a total of 25,034 amplicon sequence variants (ASVs). Microbial community diversity at each site was calculated after the dataset was rarefied to the minimum number of sequences 42,412 to simulate an even number of sequences per sample. We applied the Shannon (H') and Simpson's (D) diversity indices to measure the abundances and evenness of the microbial community present within each sample 41,42 . Both diversity indices show that the overall microbial community diversity on Phra Thong Island (H′ = 6.3283, D = 0.9952; Fig. 3 Overall, the bacterial communities as determined from 16S rRNA present in all the samples were predominantly of the phylum Proteobacteria ( Supplementary Fig. 2). At Phra Thong Island, there was a large variety of Proteobacteria classes, including Gammaproteobacteria, Deltaproteobacteria, Alphaproteobacteria, and Betaproteobacteria ( Supplementary Fig. 2). Conversely, and likely reflecting the lower diversity, the phylum Proteobacteria at Cuddalore were mainly from two classes: Gammaproteobacteria and Alphaproteobacteria (Supplementary Fig. 2). Within the kingdom Archaea, the phylum Thaumarchaeota were represented by order Nitrososphaerales and order Nitrosopumilales at both locations, with a much • Unflooded soil had higher microbial diversity compared to the flooded soil.
• Hierarchical clustering showed that the community structure of the soil bacteria in flooded soil (both 2 weeks and 2 months after) was clearly different from the unflooded soil.
• The effects of the tsunami on soil bacteria in agriculture fields may have lasted at least 1 year.
A 1.5 m-long sediment core was retrieved and sampled in the field. Seven samples from palaeotsunami sandy layers and from intercalating peat layers were collected. Three additional samples were taken from the surface sandy beach and a nearby natural wetland.
• 172 diatom species were identified with limited fossilized foraminifera tests and radiolaria detected from the samples.
• Foraminifera DNA a sequences were detected in all modern samples and most of the palaeo-deposits.
• The majority of the foraminifera DNA a belonged to brackish and marine taxa. • There is limited microbial differences between field flooded for 2 months and unflooded field.
• There were higher number of sulfur-oxidizing bacteria (SOB) and halotolerant SOB in the field flooded for 2 months.
• Tsunami flooding created a unique environment that house halotolerant bacteria that are not found in marine sediments or agricultural soils. 21 a DNA is deoxyribonucleic acid, here refer as genetic material extracted from the environmental samples that were later sequenced to generate the DNA sequences data b rRNA is ribosomal ribonucleic acid, it is commonly used in microbiology studies as a tool to characterize unknown microbial organisms lower proportion on Phra Thong Island ( Supplementary Fig. 2). The eukaryotic community as determined from 18S rRNA in both locations was dominated by the phylum Ciliophora, class Colpodea and class Spirotrichea, and the phylum Cercozoa, class Filosa ( Supplementary Fig. 2). As Phra Thong Island and Cuddalore harbored very different archaeal and bacterial communities, we adopt a consistent approach but analyzed the microbial community structure at each location independently.
Microbial community dissimilarity between environments. We applied principal coordinate analysis (PCoA) with a Bray-Curtis dissimilarity matrix to illustrate the difference between samples collected from (i) Phra Thong Island and Cuddalore and (ii) different environments at each site; marine sediments collected offshore, beach and intertidal sand collected onshore, lagoon sediment samples, and pit and core samples from backdunes and swales for both 16S and 18S rRNA data ( Fig. 1  In subsequent analyses, we excluded water samples and lagoon sediment samples from the Cuddalore dataset to standardize the testing variables for both locations. Subsequent PCoA with the smaller sample dataset ( Supplementary Fig. 3b) shows that at both locations, the microbial communities from marine, beach and intertidal sediments ( Supplementary Fig. 3b, filled shape) were dissimilar from samples collected from the backdunes and swales ( Supplementary Fig. 3b, outlined circle and triangles; Axis 2 = 8.1% of the variance). We observed that overwash deposits have a high similarity with the microbial communities in the overlying and underlying soil ( Supplementary Fig. 3b). Likewise, the PCoA calculated based on the 18S rRNA database ( Supplementary Fig. 4) shows that the marine water samples were clustered separately from sediment samples ( Supplementary  Fig. 4a, Axis 1 = 15% of the variance), and the microbial dissimilarity between sediment samples was separated based on the sampling locations ( Supplementary Fig. 4a, Axis 2 = 10.5% of the variance). As we remove water samples from the analysis and focuses on examining the microbial dissimilarity of the eukaryotic community in the sediment samples ( Supplementary Fig. 4b), we Fig. 1 The study sites at Phra Thong Island (Thailand) and Cuddalore (India) that were both impacted by the 2004 Indian Ocean Tsunami (2004 IOT). Subsequently, in 2007, an unnamed tropical depression caused a storm overwash in the Thailand and in December 2011, the Indian site was inundated by a very severe tropical cyclone Thane. Location map ((a) is created in QGIS v3.10 while (b) and (e) are satellite images courtesy of Google Earth Pro v7.3.3.7786, Image data © 2019 Terra Metrics, Maxar Technologies) of Phra Thong Island (a, b(i), b(ii)) and Cuddalore, India (a, e(i), e(ii)). The stratigraphy (created in Adobe Illustrator v25.2.1) and the microbial sample collection depths (white dots) from a backdune pit and sediment cores from two swales (Swale X and Swale Y) are illustrated in (c) (top panel). f (bottom panel) shows the stratigraphy and microbial sample depths (white dots) from the backdune pit at Cuddalore. The grain size analysis (Mean (phi) and sorting) of the sediments are included on the right side of the respective stratigraphic diagrams. Environmental samples such as beach, intertidal, lagoon (only for Cuddalore), and marine sediment samples were also collected from both sites (Phra Thong Island: b(ii), d; Cuddalore: e(ii), g). Marine sediment and water samples were collected at 2 m (M1) and 20 m (M2) water depths offshore of Phra Thong Island, and marine sediment and water samples from offshore of Cuddalore were collected at 5 m (m1), 9.5 m (m2), and 15 m (m3) water depths. A schematic cross-section of the study site (created in Adobe Illustrator v25.2.1) is included on the bottom panel for Phra Thong Island (d) and Cuddalore (g). A detailed sample description is listed in Supplementary Data 1. Fig. 2 Schematic diagram of the environmental DNA metabarcoding workflow. Microbial metabarcoding approaches using genetic material extracted from environmental samples. A universal genetic marker (i.e. primer was used to amplify a small subunit of ribosomal ribonucleic acid (SSU rRNA)). SSU rRNA is stable and conserved for all organism genetic material that is functionally and evolutionarily similar, yet highly variable across different species. The amplified deoxyribonucleic acid (DNA) (also referred to as amplicons) were then sequenced through Next-Generation Sequencing (NGS) facility that generates high-throughput DNA sequence data. The data are processed through a bioinformatic pipeline that: (i) filters and trims low-quality sequences, (ii) corrects errors generated during sequencing (DADA2) and (iii) clusters the sequence variances (referred to as amplicon sequencing variants (ASVs)) based on the error model. These ASVs are used to infer the biological identity of members of the microbial communities. The differences between communities in the environmental samples and overwash sediments are then visualized using ordination techniques and tested statistically.
observed that the eukaryotic community in the Phra Thong Island samples were clustered together. Even though there is a faint separation between samples collected from the sandy backdunes (outlined shapes) and marine sediments and beach samples (filled shapes) in Cuddalore dataset, multivariate analysis on the 18S rRNA database shows no significant differences between marine sediments, beach, overwash deposits and intercalating soil and terrestrial samples ( Supplementary Fig. 4).
Effects of geochemistry and grain size on microbial dissimilarity? To assess whether the dissimilarity between the microbial communities was affected by local geochemistry or nutrient availability in the sediments, we measured the three most important chemical elements in soil-specifically carbon, nitrogen, and sulfur 43 from each sediment sample. We performed linear regression analysis to determine the relationship between each chemical variable with each ordination axes and projected this relationship onto the PCoA plot ( Supplementary Fig. 3b). The results revealed that the chemical data were significantly correlated (Supplementary Table 1; p-value <0.02) to the dissimilarity between sampling sites but not between sample types at each site ( Supplementary Fig. 3b).
On Phra Thong Island, soil samples extracted from the cores collected from muddy Swale X and Swale Y (further inland) had the highest total organic carbon (TOC) content (0.63-8.74%), whilst sandy backdune pit samples (nearer to coast where storm samples were collected) had a relatively lower TOC content (0.24-0.54%; Supplementary Fig. 5). A similar trend was observed for total nitrogen (TN) and total sulfur (TS) data on Phra Thong Island, where the core samples from Swale X and Swale Y had higher TN (0.01-0.75%) and TS (0.01-0.26%) compared to the backdune samples that had minimal amounts of TN (≤0.02%) and TS (≤0.01%) (Supplementary Fig. 5). In the sedimentary Fig. 3 Boxplot of Shannon and Simpson's diversity indices with 95% confidence interval error bars, characterizing the richness and evenness of ASVs that were present on Phra Thong Island and at Cuddalore. Shannon and Simpson diversity indices indicate that the richness and evenness of ASVs on Phra Thong Island were higher than at Cuddalore. Marine sediments have the highest diversity compared to other sample types (intertidal, beach, overwash deposits and intercalating soil and terrestrial samples) at both locations, followed by the beach and intertidal sand. Backdune samples from Cuddalore and swale samples from Phra Thong have relatively lower diversity than the marine and beach samples at both locations. On Phra Thong Island, the 2007 storm and 2004 IOT deposits have relatively higher diversity compared to the soils above and below the two event deposits. In contrast, Cyclone Thane and the 2004 IOT deposits at Cuddalore have the lowest diversity compared to other sediment samples collected at the same study location.  Fig. 5).
The Cuddalore system had a much simpler geochemical profile as TN or TS were not detected in the Cyclone Thane deposits, 2004 IOT deposits and the majority of the intercalating sand layers (Supplementary Fig. 5). The one exception was the post-2004 IOT aeolian deposit (refers to sediment that was transported and deposited by the wind) which had TN of approximately 0.01% (Supplementary Fig. 5). The TOC in the Cuddalore sediment samples was also consistently low (0.03-0.11%), with the exception of marine sediments that had a slightly higher amount of TOC ranging from 0.25% to 1.53% ( Supplementary  Fig. 5).
To investigate grain-size variation (Fig. 1) and the possible link to ASVs we ran an Envfit analysis with the inclusion of grain size distribution data (mean and sorting) shown in Supplementary Table 2. Supplementary Table 2 shows that there is a significant correlation (p-value <0.05) that links the environmental DNA to grain-size mean or sediment sorting. As we focus on understanding the microbial dissimilarity between each stratigraphic unit within the sediment profiles, i.e. examining the overwash deposits versus intercalating soil and terrestrial samples, we observed that there was no correlation between microbial dissimilarity and mean grain size (Supplementary Table 2).
Distinguishing between terrestrial soil, tsunami, and storm deposits. In order to resolve the complex relationship between storm and the 2004 IOT deposits, and intercalating soil and sand layers, we performed a constrained ordination using distancebased redundancy analysis (dbRDA) to estimate the effects of each explanatory variables 44 . We performed dbRDA using Bray-Curtis dissimilarity distance separately for each study site using the same variables at each site. Our dbRDA produced a total of four constrained axes and 29 unconstrained axes, the first two constrained axis (referred to as canonical axis, CAP) are shown in Fig. 4. The results show that on Phra Thong Island, the marine sediment samples, intertidal and beach sands clustered separately (16% of the variance) from the backdune pit samples and swale core samples (canonical axis 1 in Fig. 4a; ANOVA F 1,15 = 4.0117, p-value = 0.001 in Supplementary Table 3). A similar cluster was observed in Cuddalore samples, where the marine sediment samples and beach sands were clustered separately from the sandy backdune pit samples, supported with 25.8% of the variance (canonical axis 1 in Fig. 4b; ANOVA F 1,29 = 13.4769, p-value = 0.001 in Supplementary Table 3).
As we focus the analysis to overwash deposits and the intercalating sediment samples, we observed that on Phra Thong Island, the first canonical axis ( Fig. 6).
A similar result was observed in Cuddalore. Cyclone Thane deposits were significantly different from the 2004 IOT, and overlying and underlying sediments, supported with 16% of the variance (canonical axis 1 in Fig. 4d; ANOVA F 1,23 = 4.5135, pvalue = 0.001; Supplementary Table 3). The second canonical axis in Fig. 4d marginally separates the 2004 IOT deposits from the intercalating samples supported with 6.1% of the variance (ANOVA F 1,23 = 1.6988, p-value = 0.081, Supplementary Table 3). Nonetheless, multivariate analyses on community composition for samples from Cuddalore (Supplementary Table 4) showed that all sample types differed significantly from each other, except for tsunami deposits and the overlying and underlying aeolian and reworked sediments (PERMANOVA composition: pseudo-F 4,29 = 5.674, p < 0.0001). It must be noted that the lower dissimilarity between tsunami deposits and intercalating soil may be due to the significantly higher dispersion in aeolian deposits (>50%) than marine sediment, beach, and overwash deposits (see Supplementary Material). On average we find that the tsunami deposits were <30% similar to storm deposits and <25% similar to aeolian deposits ( Supplementary Fig. 6).
Temporal variability of microbial dissimilarity between sample type. We sampled the deposits preserved on Phra Thong Island in consecutive years (2014 and 2015) at adjacent sites (<1 m away). That data showed that the differences in microbial community composition among sample types were consistent through time (type by year interaction: F 2,13 = 0.8520, p = 0.7337; year: F 1,13 = 1.0434, p = 0.4236; Supplementary Table 4). The microbial population remained relatively unchanged despite our sampling sites being in a tropical setting with frequent rainfall and significant fluctuation of groundwater levels 5,9,16,17 .
Characterization of the deposition source with specific microbial markers. Differential analysis using a negative binomial generalized linear models (DESeq2 package; 43) revealed that within the 527 ASVs identified on Phra Thong Island samples and 92 ASVs identified in Cuddalore samples (Fig. 5 -left dendrogram), there were substantial differences in the microbial communities preserved in the storm deposits and the 2004 IOT deposits. At both study sites, storm deposits were clustered together (except one storm deposits labeled as 77cmS2 at Cuddalore), and separated from tsunami and intercalating sediment samples (Fig. 5-top dendrogram). Likewise, tsunami deposits on Phra Thong Island are clustered together (Fig. 5a-top dendrogram). Nonetheless, the differences between tsunami and intercalating aeolian sediments at Cuddalore are less clearly defined (Fig. 5b-top dendrogram).
In general, ASVs that were prominent in the 2007 storm and Cyclone Thane deposits were from the class Gammaproteobacteria and class Actinobacteria (Fig. 5). ASVs that were unique to the 2007 storm deposits were from the family Blastocatellaceae (Subgroup 4), family Subgroup 6, family Holophagae, family Nitrososphaeraceae, family Nitrosomonadaceae, family Microscillaceae, and family Acetobacteraceae (Fig. 5a). ASVs that were unique to the Cyclone Thane deposits were from the family Parcubacteria, family Chromobacteriaceae, family Rubinisphaeraceae, family Burkholderiaceae, family Micromonosporaceae, family Bacillaceae, family Nocardioidaceae, family Sporichthyaceae, family Caulobacteraceae and family Sericytochromatia (Fig. 5b). Notably, family Chitinophagacea and family Thermoplasmata appeared to be taxa that were unique across both storm deposits (Fig. 5) and not found in other samples at each site. There were no taxa that were only present within the Phra Thong Island 2004 IOT deposits. Nonetheless, we did identify family Clostridia and family Paenibacillaceae ASVs that were unique to the Cuddalore 2004 IOT deposits.

Discussion
Do microbial communities differ between the respective study sites and samples? Our study shows that the microbial communities from the Phra Thong Island and Cuddalore samples were significantly different from each other and that these differences also existed between the microbial communities in the offshore, onshore, backdune and swale environments (Supplementary Fig. 3). We also observed that the microbial communities preserved on Phra Thong Island were significantly dissimilar to those at Cuddalore (Supplementary Fig. 3). These large differences in community composition likely correspond to differences in the site geomorphology and different sedimentary processes that build the sedimentary profiles. Phra Thong Island has organic-rich swales with fine to medium-grained sandy beach ridges 5,16 , while Cuddalore's sediment is mainly sand with limited differences in grain size distribution and characteristically low organic and carbonate content 39 (Fig. 1, Supplementary Fig. 1). The microbial community structure, diversity, and distribution at both sites are primarily affected by sediment physicochemical characteristics followed by geomorphological characteristics such as grain size and sediment type 45 .
Is it possible to identify individual tsunami and storm deposits from overlying and underlying soils and sediments? Our first focus was to investigate the possibility that microbial metabarcoding approaches can identify individual tsunami and storm deposits from the overlying and underlying soils or sediments. First, we examined the storm deposits microbial communities, and the result shows that our approach can distinguish the 2007 storm deposit on Phra Thong Island and Cyclone Thane deposits at Cuddalore from the soil and sediments (p-value = 0.0001; Fig. 4, Supplementary Table 4). Notably, our approach is capable of identifying sediment deposited from relatively minor storm events. This finding highlights the clear potential for Fig. 4 Distance-based redundancy analysis (dbRDA) using Bray-Curtis dissimilarity. dbRDA shows that microbial communities in marine sediments, beach, and intertidal samples were different from backdune pit and swale core samples (a, b). c and d show that at Phra Thong Island and Cuddalore, storm deposits and 2004 IOT deposits have distinct microbial communities that are dissimilar to soil and terrestrial samples. The ellipses (dashed line) bound each cluster by sample type at an 80% confidence interval. The significant values of respective canonical axes are reported in Supplementary Table 2  applying microbial approaches to aid historical coastal hazard study on identifying tsunami and storm deposits.
The next research question examined whether it is possible to identify/differentiate 2004 IOT tsunami deposits from the soil and sediments above and below it. The microbial community dissimilarity between tsunami deposits and soil and terrestrial samples is significant (p-value < 0.0001; Fig. 4, Supplementary Table 4). Our observation is in agreement with the studies summarized in Table 1 that microbial communities were different between tsunami deposited sediments and non-tsunami soils and sediments. Similar to Nayak et al. 25 , Asano et al. 28 , Godson et al. 26 , and Somboonna et al. 29 , we observed that tsunami deposits have relatively lower diversity as compared to nontsunami soils and sediments (Fig. 3). Likewise, Ramesh et al. 24 reported that marine sediment has higher microbial diversity immediately after a tsunami flooding event and the microbial community diversity gradually decreased in subsequent days. We hypothesize that an overwash deposit transported from offshore to onshore could transport material from various environments, and thus, contains a richer microbial diversity. The abrupt changes during the flooding event, and the recovery of the landscape to the pre-event condition may have reduced the diversity over time. Nonetheless, our 2014 and 2015 sampling in Phra Thong Island suggests that during a normal year, there is little microbial variation in any of the samples (Supplementary Table 4). Hiraoka et al. 27 is the only study reported that tsunami flooded soil contains a rich microbial community as compared to unaffected soil, however, they are only referring to one genus that was successfully isolated from the samples.
When compared to the storm deposits, the microbial signature in the Cuddalore 2004 IOT deposits was more subtle as the tsunami deposits have about 30% similarity with the overlying and underlying soil layers (Supplementary Fig. 6). We observed that microbial communities in aeolian sediments post-2004 IOT are grouped with microbial communities in the upper part of the 2004 IOT deposits. Meanwhile, the microbial communities from the lower portion of the 2004 IOT deposit are grouped with tidal sediment pre-2004 IOT (Fig. 5b). Our observation may suggest mixing of underlying sediments into the 2004 IOT deposits during deposition followed by a subsequent mixing of tsunami deposits in the overlying reworked sands. The mixture between sediment units occurred as a tsunami inundation may erode material from the underlying substrate or land surface 46 . It could also affect the incorporation of underlying sediments during tsunami backflow to the sea 4 .
Can we distinguish between modern storm and 2004 IOT deposits based on ASVs? When we compare modern storm and 2004 IOT deposits based on ASVs, our microbial metabarcoding approach can reliably distinguish between tsunami and storm deposits. Our multivariate analysis shows that the differences in microbial communities between tsunami and storm deposits, as well as in soil and terrestrial and offshore sediments, were statistically significant ( Fig. 4; Supplementary Table 4). Therefore, our approach can diagnose an overwash event that is different from typical soil microbial composition for both sites. Tsunami and storm have distinct sedimentary deposition mechanisms 45 but both extreme waves can transport sediment from the nearshore marine environment 47,48 . Some tsunami waves have been reported to contain offshore sediments from inner continental shelf environments 49,50 that were scoured from 30 to 50 m water depth. It is likely that such sediment would contain distinctive microbial taxa that are unlikely to be found in the intertidal and beach sediments that dominate the source of storm deposits, along the same coastline. Furthermore, all locations may be modified by post-event weather conditions including strong winds and extreme precipitation that can potentially alter the soil microbial communities 51 either directly (e.g. transporting unique microbial communities) or indirectly (e.g. altering the soil chemistry) 52,53 .
Can we use a eukaryote community-level approach to identify overwash deposits? We extended our metabarcoding approach to include the 18S rRNA gene, which is a universal genetic marker for eukaryote 54 . The analysis of eukaryotic community shows no significant differences in discriminating storm, tsunami, and soil and terrestrial samples (Supplementary Fig. 4). This may be due to the low efficiency in our primer sets to amplify eukaryotic rRNA gene, as eukaryotic rRNA genes tend to have long variable region, gene region that demonstrates considerable sequence diversity among different eukaryotes 54 , while next-generation sequencing technology has restricted sequencing length. Instead of using a community-level approach, we suggest a metabarcoding approach targeting a specific group of eukaryotes.
Can we identify the source of the overwash deposits? Since we can identify overwash deposits from intercalating soils and sediments, the next challenge is to identify the source of the overwash deposit. Our study found no similarity between 2004 IOT deposits and marine sediment (Supplementary Fig. 6). Signature taxa that can identify storm and tsunami deposits live in a vast range of habitats. This result possibly suggests that microorganisms with the capacity to tolerate a broad range of environmental conditions increased in relative abundance after the pulse disturbance of the overwash as opposed to "habitat specialists" 55 , that would have found themselves at a competitive disadvantage. Our finding is in line with other tsunami microbial studies (Table 1) who report microbial community changed before and after a tsunami inundation event, and these community changes after the flooding disturbance remained up to 10 years after the event 21 .
Is there a global microbial signature for overwash deposits? We identified 527 ASVs on Phra Thong Island and 92 ASVs at Cuddalore that signify the differences between tsunami and storm deposits despite the ubiquity of most of the taxa found in this study. In particular, the family Chitinophagaceae and the family Thermoplasmata are found only in storm deposits at both locations. While we find this a promising step in the search for definitive signatures, we have limited evidence to conclude that these taxa are a global signature for storm deposits. Furthermore, we did not identify any ASVs that are only present in the 2004 IOT deposit at either location.
Our work addressed six primary questions on the use of microbes to investigate extreme wave events. Earlier attempts to use microbial communities to identify tsunami deposits (Table 1) commonly suffered from limitations in identifying and characterizing the microbial communities due to methodological restrictions. The experimental design in these earlier studies was also not able to adequately examine the potential of using microbial communities to tackle the struggle in historical coastal hazard study-to identify overwash deposit from the geological records. Our work overcomes these limitations by applying advanced molecular techniques and robust statistical testing. The microbial metabarcoding approach that we adopt clearly demonstrates that microbial communities do differ between recent tsunami and storm deposits preserved in similar geomorphic settings at two distinct locations. We acknowledge that our study focuses only on modern tsunami and storm flooding events, and thus it will be interesting to investigate older or palaeo-overwash deposits from other sites. Such work will facilitate an examination of the potential of microbial signatures existing in overwash deposits over long time frames. Nevertheless, we concede that our work could not confidently identify the factor/s that caused the community structure differences between tsunami and storm deposits. Future work involving detailed chemical analysis such as oxygen and heavy metal ions present in the sediments will facilitate the understanding of what caused the changes and how microbes respond to disturbance induced by coastal flooding. We present for the first time, unambiguous evidence for discriminating tsunami and storm sedimentary deposits using the metabarcoding approach. Our study addresses a key challenge in the analysis of coastal hazards that could support improved risk assessment for coastal regions.

Methods
Study site description and sample collection. Phra Thong Island is approximately 125 km north of Phuket on the west coast of southern Thailand in the Andaman Sea (Fig. 1a, b(i), b(ii)). This island was impacted by the 2004 Indian Ocean Tsunami (2004 IOT) when the Sumatran megathrust ruptured. Since the formation of beach ridge sequences and a series of swales on the western side of Phra Thong Island are geomorphologically favorable for preserving tsunami deposits 5,16 , the site was used in over 10 studies examining tsunami events 9 . The 2004 IOT deposits are preserved in the marshy swale-Swale X (9°8.1742′ N, 98°1 5.916′ E) and Swale Y (9°7.917′ N, 98°15.744′ E) in Fig. 1b(ii), c, d; Supplementary Data 1), underneath an organic soil layer 5 . We obtained local permission from Chulalongkorn University to conduct research at this site and received approval from the land-owner to perform sample collection. To avoid crosscontamination between samples and interference of modern DNA to the targeted samples, all tools such as coring tube, Van Veen grab, hacksaw, hand trowel and Van Dorn bottle were pretreated with 20% bleach solution and rinsed between samples. Powder-free surgical gloves were worn when handling the samples. Sediment cores from Swale X and Swale Y were collected using a plastic push-core that was pretreated with a 20% bleach solution. Each stratigraphic layer was subsampled using a sterile 50 mL conical tube. Six sediment samples were collected from Swale X, three sediment samples were collected from Swale Y and 5 sediment samples were collected from a backdune pit (Supplementary Data 1). Aside from collecting sediment samples from the geological record, we also collected environmental samples; beach and intertidal sediments using sterile 50 mL conical tube, as well as marine sediments from 2m and 20m water depths using a Van Veen grab ( Fig. 1 b(i), b(ii), d; Supplementary Data 1). We used a Van Dorn bottle to collect marine water from 2m and 20m water depth. The sedimentary profile, i.e. grain size characteristics and mineralogy of the environmental samples, varies between offshore, onshore sediment and overwash deposits 16 .
Our study site in India is located along the coastline of Cuddalore, Tamil Nadu, southeast India (Fig. 1a, e(i), e(ii)). The southern part of India was affected by the 2004 IOT, with the coast of Tamil Nadu experiencing the highest death toll and most severe damage 56 . Anna University provided permission and field assistance in sample collection from the study site. We pretreated all the tools with 20% bleach solution, rinsed the tools in between collecting each sample to prevent crosscontamination between samples. We worn dust-free surgical gloves during sampling to avoid the potential introduction of modern DNA into the samples. We collected 30 sediment samples from a backdune pit in Devanampattinam, Cuddalore (11°45.198′ N 79°47.334′ E; Fig. 1e(i), e(ii); Supplementary Data 1), as they contain both the 2004 IOT deposits and the 2011 Cyclone Thane deposits [38][39][40] . The backdune pit is made up primarily of quartz with little organic and carbonate content. Underneath the surface aeolian sediment is the 2011 Cyclone Thane deposit, and below the storm deposits is reworked aeolian sand followed by the 2004 IOT deposits (Fig. 1f, g; Supplementary Data 1). Beneath the 2004 IOT deposits is a fineto medium-grained sand unit laminated with heavy minerals (Fig. 1f, g; Supplementary Data 1). We also collected sediment samples from the beach, a nearby lagoon that is rich in organic matter, and marine sediment samples (at 5 m, 9.5 m, and 15 m water depths; Fig. 1e(i), g; Supplementary Data 1) using a Van Veen grab pretreated with 20% bleach solution. All samples were collected using sterile 15 mL conical tubes. A Van Dorn bottle pretreated with 20% bleach solution is used to collect lagoon and marine water samples. The water samples were filtered through a 0.20 uM sterivex filter to capture the water community. Upon collection, all the samples (both from Phra Thong Island and Cuddalore) were transported using a portable liquid nitrogen dry shipper and stored in the Ultra-Low Freezer facility at −80°C at the Asian School of the Environment, Nanyang Technological University, Singapore immediately upon arrival.
Grain-size analysis. Each overwash deposits and its overlying and underlying samples were treated with 15% Hydrogen Peroxide (H 2 O 2 ). This chemical treatment is to remove organic materials that are present in the samples to determine the granulometry of the clastic component. The samples were washed with deionized water at least three times before performing grain size analysis using the Malvern Mastersizer 3000. The Malvern Mastersizer 3000 uses laser obscuration to measure particle size distribution. Each sample was subjected to 1 min sonication to disaggregate the sediment before taking the measurement. The raw data was analyzed using GRADISTAT version 9.1 57 . This program calculates grain size statistic using the Folk and Ward and moments method 58 , and generate statistics including the mean(phi) and sorting that were reported in Fig. 1.
Chemical analysis. Dried bulk sediment samples were sent to the Stable Isotope Laboratory at Hong Kong University to be analyzed for TOC, TN, and TS. Briefly, 30 mg of sediment samples were weighed in a 5 mm × 9 mm silver capsule (Sercon) and directly acidified with 6 N of hydrogen chloride (HCl) to remove residual carbonate before analysis. All samples were dried overnight at 60°C and combusted and analyzed using an Elemental Analyzer attached to an Isotope Ratio Mass Spectrometer (EA-IRMS).
DNA extraction from sediments. We extracted total DNA from 250 mg sediment samples using the DNeasy PowerSoil Kit (Qiagen, Hilden, Germany). We started the extraction by adding 200 μL of Phenol: Chloroform: Isoamyl-alcohol (25:24:1) 59 before adding Solution C1. After centrifuging at 28,000g for 1 min, we transferred the supernatant to a new tube and added 100 μL Solution C2 and 100 μL Solution C3. The mixture is briefly vortexed and incubated at 4°C for 5 min. We then centrifuged the mixture at 28,000g for 1 min and transferred the supernatant to a new tube, adding Solution C4 in the ratio of 1:1 with the supernatant, and 650 μL of 100% molecular grade ethanol. For the remaining steps, we followed the manufacturer's protocol. The extracted DNA was cleaned with OneStep™ PCR Inhibitor Removal Kit (ZYMO Bioresearch, Irvine, CA) to remove any potential impurities such as polyphenols, humic acids, and fluvic acids from the soil that can inhibit downstream amplification reactions. The DNA concentration was determined using the Qubit ® Fluorometer (Thermo Fisher Scientific, USA).
Construction of amplicon sequencing library. We performed a polymerase chain reaction (PCR) on the extracted total DNA to amplify a targeted fragment from the genetic material using a short single-stranded DNA primer set. We used two different primers sets: 926wF (AAA CTY AAA KGA ATT GRC GG) and 1392R (ACG GGC GGT GTG TRC) is a universal primer that amplifies the 16S rRNA gene V6-V8 regions in archaea, bacteria and eukaryotes 60,61 ; while 574F (GCG GTA ATT CCA GCT CCA A) and 1132R (CCG TCAA TTH CTT YAA RT) amplifies the 18S rRNA gene V4-V7 regions in eukaryotes 54 . Each reaction is comprised of 12.5 μl of 2x KAPA HiFi Hotstart Ready Mix (KAPA Biosystems, Cape Town, South Africa), 10 μl of 1 μM of primer, and 2.5 μl of 5 ng/ml of total DNA, with each sample done in triplicate. A PCR consists of three steps: 1. separation of the double-stranded DNA into single strands at high temperatures (denaturation), 2. binding of the DNA primers to the edge of the targeted region (annealing) 3. building the complementary strand of the targeted fragment through polymerization (extension). We started by denaturing the double-stranded DNA at 95°C for 3 min, then steps (1) through (3) for 20 times: (1) denaturation at 95°C for 30 s, (2) primer annealing at 55°C for 30 s, and (3) extension of the complementary strand at 72°C for 30 s. This is followed by a final extension at 72°C for 5 min. The PCR was performed in a ThermoFisher SimpliAmp Thermal Cycler. To prevent any potential bias in this PCR-based method, we used the minimum amplification cycle and maintained the same PCR setting and reaction mixture for all the samples. All the PCR products were purified with Agencourt AMPure XP beads (Beckman Coulter, Singapore) following the manufacturer's protocol. These PCR products were sent to the Macrogen Asia Pacific Pte. Ltd. sequencing facility to generate millions of DNA sequences using an Illumina MiSeq machine with MiSeq Reagent Kit v3 chemistry (2 × 300 bp).
Sequence analysis. The raw sequences generated from Illumina Miseq were initially processed by removing primers from the sequences using cutadapt version 2.10 62 . Next, the sequences were filtered by truncating those that are shorter than 280 in forward reads (R1), and 230 in reverse reads (R2). We also removed sequences that have a quality score of less than or equal to 2 and have an expected error rate (sum (10^(-quality score/10)) of higher than 2 (R1) and 5 (R2) using the filterAndTrim function in DADA2 package in R. We applied DADA2 63 that used a novel algorithm to calculate the error introduced during sequencing to construct an ASVs table. This ASV is analogous to the traditional operational taxonomic units (OTUs), it infers the true sample composition as oppose to OTUs that cluster sequences with a fixed dissimilarity threshold i.e. 97% similarity 64 . Taxonomy classification was performed using SILVA version 132 database 65 as a reference to identify bacteria and archaea for the 16S rRNA dataset, and PR 2 database version 4.12.0 66 (https://github.com/pr2data-base/pr2database) as a reference to identify eukaryote using RDP naïve Bayesian classifier as implemented in R DADA2 package for the 18S rRNA dataset. ASVs that have less than 10 counts were removed, followed with a bootstrap value lower than 90% at the supergroup/ phylum level were discarded. The final ASV table contained a total number of 25,034 ASVs generated from 926F-1392R primers and 5840 ASVs from 574F-1132R primers.
Statistical analysis. Statistical analysis was performed in R version 3.6.1 67 using the phyloseq package. All relevant R code is available from the associated GitHub Repository (see section Data and Code availability). The number of reads in each sample was normalized using median sequencing depth. The richness of the samples was calculated using the Shannon index (H') and Simpson index (D) available in the vegan package 68 to accounts for both ASVs abundance and evenness in the dataset. These samples were rarefied to the minimum number of ASVs across all samples to simulate an even number of reads per sample before calculating the ASVs richness. Ordination analysis was performed to understand how the community differs from one sample from the other. The number of ASVs was transformed with square-root to minimize the effect of abundances toward composition analyses. We applied PCoA calculated using the Bray-Curtis dissimilarity index available in the vegan package to measure the similarity and dissimilarity between samples type. We included the result from the Envfit analysis (also available in the vegan package) 68 into the PCoA plot to understand whether the dissimilarity between different samples type is driven by changes in the environment chemistry. The Envfit result was calculated with 9999 permutations generated using TOC, TN, and TS. In subsequent analysis, we separated the dataset based on the study site and removed water samples and lagoon sediment samples from the dataset to standardize the testing variables. No water samples and lagoon sediment samples were collected from Phra Thong Island. We performed a constrained ordination with distance-based redundancy analysis (dbRDA) that is available in the phyloseq package ordinate function to visualize the dissimilarity between sample types and their driving explanatory variables. The ellipses included in the dbRDA plots were calculated with 80% confidence intervals.
We used dissimilarity-based permutational multivariate analyses of variance (PERMANOVA 69 ) to examine differences in microbial community structure (square-root transformed, normalized median sequences) between sample types in each location. Analyses were undertaken in PRIMER v7 (PRIMER-E, UK). Bray-Curtis similarities were calculated for all pairs of samples. Sample type was a fixed factor, with five levels. For the Phra Thong Island dataset, we also included year as a fixed, orthogonal factor (two levels: 2014, 2015) to determine whether the effects of sample type were consistent in time. p-values were calculated using 9999 unrestricted permutations of raw data (Cuddalore; appropriate for one-factor analyses) or restricted permutations of residuals (Phra Thong Island; appropriate for multi-factorial analyses 70 ). A posteriori pairwise comparisons were done to determine which sample types differ from which, with p-values calculated using Monte Carlo simulations. Permutational multivariate analyses of dispersion were used to examine differences in dispersion among sample types at each location. pvalues were calculated using 9999 permutations.
We used the DESeq2 package 71 to perform differential analysis to determine which ASVs respond to the differences between 2004 IOT deposits and storm deposits at the respective study sites. DESeq2 is a negative binomial generalized linear models and uses the Wald test for significance testing. An adjusted p-value (p-value corrected for multiple hypothesis testing 72 ) of 0.001 was used as the cutoff to select the ASVs that were significantly different present in either 2004 IOT deposits or storm. The differentially present ASVs were extracted from the dataset and visualized using a heatmap. A hierarchical cluster analysis was performed on these ASVs using Euclidean distance to group ASVs based on the similarity between sample type using Ward's minimum variance method 73 .
Reporting summary. Further information on research design is available in the Nature Research Reporting Summary linked to this article.

Data availability
Raw sequence data that support the finding of this study have been deposited in National Center for Biotechnology Information (NCBI) GenBank under the BioProject ID PRJNA343068. All statistical results are reported in https://github.com/slimelab/ Tsunami-microbes.