Comparative mangrove metagenome reveals global prevalence of heavy metals and antibiotic resistome across different ecosystems

The mangrove ecosystem harbors a complex microbial community that plays crucial role in biogeochemical cycles. In this study, we analyzed mangrove sediments from India using de novo whole metagenome next generation sequencing (NGS) and compared their taxonomic and functional community structures to mangrove metagenomics samples from Brazil and Saudi Arabia. The most abundant phyla in the mangroves of all three countries was Proteobacteria, followed by Firmicutes and Bacteroidetes. A total of 1,942 genes were found to be common across all the mangrove sediments from each of the three countries. The mangrove resistome consistently showed high resistance to fluoroquinolone and acriflavine. A comparative study of the mangrove resistome with other ecosystems shows a higher frequency of heavy metal resistance in mangrove and terrestrial samples. Ocean samples had a higher abundance of drug resistance genes with fluoroquinolone and methicillin resistance genes being as high as 28.178% ± 3.619 and 10.776% ± 1.823. Genes involved in cobalt-zinc-cadmium resistance were higher in the mangrove (23.495% ± 4.701) and terrestrial (27.479% ± 4.605) ecosystems. Our comparative analysis of samples collected from a variety of habitats shows that genes involved in resistance to both heavy metals and antibiotics are ubiquitous, irrespective of the ecosystem examined.

The mangrove ecosystem harbors a complex microbial community that plays crucial role in biogeochemical cycles. In this study, we analyzed mangrove sediments from India using de novo whole metagenome next generation sequencing (NGS) and compared their taxonomic and functional community structures to mangrove metagenomics samples from Brazil and Saudi Arabia. The most abundant phyla in the mangroves of all three countries was Proteobacteria, followed by Firmicutes and Bacteroidetes. A total of 1,942 genes were found to be common across all the mangrove sediments from each of the three countries. The mangrove resistome consistently showed high resistance to fluoroquinolone and acriflavine. A comparative study of the mangrove resistome with other ecosystems shows a higher frequency of heavy metal resistance in mangrove and terrestrial samples. Ocean samples had a higher abundance of drug resistance genes with fluoroquinolone and methicillin resistance genes being as high as 28.178% ± 3.619 and 10.776% ± 1.823. Genes involved in cobalt-zinc-cadmium resistance were higher in the mangrove (23.495% ± 4.701) and terrestrial (27.479% ± 4.605) ecosystems. Our comparative analysis of samples collected from a variety of habitats shows that genes involved in resistance to both heavy metals and antibiotics are ubiquitous, irrespective of the ecosystem examined.
Mangroves are estuarine ecosystems composed of saline tolerant plants and are found in 60-70% of the coastal areas, exclusively in tropical and subtropical regions 1 . They are exposed to fresh and oceanic water, experiencing a wide variation of salinity throughout the tidal cycles 2 . Mangroves are important as they are a rich reservoir of microbial diversity and act as a buffer zone between land and sea. Furthermore, mangroves are also a source of novel enzymes and small biomolecules such as LipA-like lipase 3 , aspergilumamide-A peptide 4 , pyrrolizidine alkaloid penibruguieramine-A 5 , GH44 family endoglucanase 6 , pullularins E, F peptides 7 and salt-tolerant endo-β-1, 4-glucanase Cel5A 8 . They also serve as a potential phytostabilizer to absorb heavy metal pollutants in industrial areas 9 . In addition, recent studies have shown that mangroves can enhance fish abundance 10 20 for analysis. The pipeline, in brief, joins the mate pairs and trims off low quality regions using SolexaQA 21 followed by dereplication and artificially duplicated reads (ADRs) analysis 22 using DRISEE (Duplicate Read Inferred Sequencing Error Estimation) 23 . Sequences showing similarity to fly, mouse, cow and humans were removed using Bowtie 24 . Annotation was done against the RefSeq 25 and Subsystems database 26,27 for diversity and functional analysis, respectively.
Comparative analysis of metagenomes across distinct mangrove areas. The metagenomic data from this study was compared to samples from Brazil 17 and Saudi Arabia 19 , both of which are available at MG-RAST server (Table 1). Brazilian mangrove data 17 consisted of four different samples that had different anthropogenic impacts. The BrMgv01 and BrMgv02 samples were obtained from two different sites that had experienced an oil spill in 1983. The first sample did not show any strong effects from the oil spill but the second sample still show oil effects. BrMgv3 was collected in a site near an urban area while the last sample, BrMgv04, was isolated from what was determined to be pristine conditions. A second dataset collected in Saudi Arabia also had a total of four samples (RSMgr01, RSMgr02, RSMgr03 and RSMgr04) collected from the rhizosphere of Avicennia marina, commonly known as the grey mangrove, in the Red Sea 19 . All samples are available at MG-RAST.
Comparative analysis of resistance to antibiotics and heavy metals in various ecosystems. For resistome analysis, the twelve mangrove datasets as described above and four mangrove datasets from our previous study 18 were included. In addition, datasets collected from soil samples in four agricultural and adjacent grassland samples from Sweden 28 were added. Six different forest soil samples from Puerto Rico 29 and USA 30 were included, as were sixteen oceanic soil samples from the Global Ocean Sampling Expedition 31,32 ( Table 1).
Analysis of metagenomic data. Venn diagrams were generated using the Venny 2.0 program 33 . Normality testing using the Shapiro-Wilk test and the Kruskal-Wallis "Nemenyi" tests was performed to evaluate whether the OTU's and functional genes abundances were different within and between the ecosystems using R 34 and PAST 35 . The dataset was standardized dividing each OTU abundance value by the sum of all abundances in each sample. A Principal Component Analysis (PCA) was used to compare the microbiota of each site using the vegan package 36 . Multiple linear regressions were conducted with the first two PCs obtained in the PCA analysis and the site as an independent variable. All the analyses were performed using R software 34 . Availability of data and materials. The

Results
The sequencing from the four India datasets resulted in a total of 9 GB, of which 32,080,253 reads were obtained with an average length ranging from 252 ± 9 bp to 409 ± 139 bp. After quality control pipeline, 13 ± 3.08% reads were assigned to ribosomal RNA genes, 38.56 ± 4.41% to predicted proteins with known functions, and 48.62 ± 6.45% to predicted proteins with unknown function (hypothetical proteins) ( Table 2).
Analysis at domain level. All the samples had sequences that map to the Bacteria, Archaea, Eukarya and Viruses. A small percentage of the sequences (0.011 to 0.75%) were not assigned to any organism. Bacteria were the most abundant domain recovered from all the mangrove datasets, ranging from 94.8 to 99.2% of the total. Regardless of the low sequence proportion compared to other domains, the number of sequences affiliated with viruses was the highest in Saudi Arabia samples (Fig. 1B). The first two components in the PCA explained more than 98% of variation and there was a clear separation among the samples ( Fig. 2A). To determine if the separation among mangrove samples isolated from the different countries were statistically significant, the scores of the first two PC (Principal Component) were used as dependent variables in the multiple linear regression. The clustering effect in the first PC was due to the community abundance at domain level from India. On the other hand, all communities were different when analyzed by the second PC (Table 3). Although the reads mean frequency of Bacteria was not statistically different among the countries, the higher proportion in Indian samples (96.7-99.2%) could explain the separation of this country from the other two (Brazil: 95.1-96.7%; Saudi Arabia: 94.7-96.2%) in the first PC.
Analysis at phylum level. A total of 66 phyla were recovered from all samples. The richness at phylum level was quite similar across the geographic localities, except for the following Eukarya phyla: Annelida, Brachiopoda, Chytridiomycota, Echiura, Entoprocta, Glomeromycota and Xenoturbellida, which were found exclusively in the Indian samples. The Kruskal-Wallis comparison of the reads abundances among the countries indicated that Brazil (Fig. 1A) had 23 and 11 phyla statistically different from Saudi Arabia and India (Fig. 1C), respectively. Only 5 phyla were statistically different between Saudi Arabia and India. Twenty-eight bacterial phyla were retrieved from all mangroves of the three different countries. The most abundant phylum among the samples was Proteobacteria, which accounted for 50.7 to 64.28% of the sequences in Brazil, 62.6 to 64.2% in Saudi Arabia and 56.7 to 90.5% in India. Firmicutes and Bacteroidetes were the second or the third most frequent bacterial phyla recovered. The PCA using only the frequency of the bacteria phyla showed a clear separation of India from Brazil and Saudi Arabia mainly due to Proteobacteria and Bacteroidetes reads abundance (Fig. 2B).
Five archaeal phyla (Crenarchaeota, Korarchaeota, Thaumarchaeota and Nanoarchaeota) were recovered from all samples in all countries (Fig. 3A). The Crenarchaeota abundance was statistically higher in Brazilian samples than in the other countries (BR-SA: P = 0.043; BR-SA: P = 0.021) while Euryarchaeota (P = 0.021) and Korarchaeota (P = 0.043) abundances were statistically lower in India than in Brazil. Euryarchaeota was the most abundant among all archaeal phyla between and within all samples. Overview of the dominant bacterial and archaeal genera. A total of 593 bacterial and 61 archaeal genera were recovered from all the collection sites (Supplementary Data 1). Most of the bacterial genera were present in all samples (Fig. 4), and all archaeal genera were obtained from all of the geographic locations.

Comparative functional analysis of mangrove sediments.
There were also three genes related to antibiotics resistance and toxic compounds: Cation efflux system protein CusA (0.311% ± 0.101), Acriflavine resistance protein (0.428% ± 0.0843) and Topoisomerase IV subunit A (EC 5.99.1.3) (0.136% ± 0.073). Cation efflux system protein CusA is involved in resistance to copper and silver while  Acriflavine resistance protein and Topoisomerase IV subunit A are involved in resistance to antiseptic Acriflavine and fluoroquinolones antibiotics respectively, which are both clinically relevant [37][38][39] .
Overall, the DNA-directed RNA polymerase beta subunit (EC 2.7.7.6) (0.471% ± 0.131) was the most abundant gene followed by Acriflavine resistance protein (0.42% ± 0.08). Within the resistome, Acriflavine resistance   Comparative functional analysis across different ecosystems. The high abundance of acriflavine resistance protein and the widespread presence of genes related to fluoroquinolone resistance in all the mangroves samples was intriguing and resulted in a deeper examination of the resistance to antibiotics and toxic compounds in the mangroves and other ecosystems. Publicly available metagenomes of marine and terrestrial (forest, grassland and agricultural soil samples) ecosystems were compared to the mangroves sediments (  (Fig. 6A). However, a deeper look into the functional level of Multidrug Resistance Efflux Pumps shows acriflavine resistance protein to be highly enriched in all the samples irrespective of ecosystem and anthropogenic activities. The relative abundance of the acriflavine resistance was similar in mangroves and terrestrial which were in turn significantly different from Ocean (p < 0.005) (Fig. 6B). Similarly, other clinically relevant antibiotic resistance genes (ARGs) such as those related to resistance to fluoroquinolones and beta-lactamase were significantly higher (p < 0.005) in ocean (28.178% ± 3.619, 9.913% ± 2.208) compared to mangroves (9.82% ± 3.776, 5.489% ± 0.742) and terrestrial (11.18% ± 8.327, 10.247% ± 5.826). Methicillin resistance was found to be statistically different in all the ecosystems although the relative percentage was more similar between mangrove (3.034% ± 0.808) and terrestrial (2.159% ± 0.682) as compared to Ocean (10.776% ± 1.823) respectively. In addition, resistance genes related to heavy metals such as cobalt, zinc and cadmium were significantly (p < 0.015) different in Ocean, Mangroves, and Terrestrial ecosystems (Supplementary data 4).

Discussion
In this study, whole-metagenome of Indian mangrove samples were sequenced to examine the community structure and functional content using the Illumina technology. These were compared to samples isolated from mangroves in Brazil 17 and Saudi Arabia 19 , which were sequenced by a different platform (Roche 454). Although the three datasets were generated by two different NGS platforms, our analysis showed that they had similar taxonomic diversity in their microbial communites 40 .
Bacterial and archaeal diversity in the mangrove sediments. The bacterial phyla, Proteobacteria (61.2% ± 10.82), was the most abundant in the samples examined from each of the geographic locations ( Fig. 1A-C). Proteobacteria had previously been noted as being highly abundant in mangrove samples 41 . This phylum has a high metabolic diversity, with a wide distribution in marine environments, playing an important role in nutrient cycling 42 . The second most frequently found phyla within the mangrove metagenomics samples we examined was Archaea. Members belonging to this phylum inhabit extreme environments, playing important roles in the biogeochemical cycles. However, our knowledge as to the niche they occupy, and the role they play in the mangrove microbial community is still limited 43 . Within the archaeal kingdom, Euryarchaeota (81.29% ± 7.993) were found most frequently within and between all the samples when compared to other Archaeal phyla (Fig. 3), and were also found to be highly abundant in mangrove sediments of Sundarbans (India) 43 and other ecosystems such as marine sediment (North Sea of Atlantic Ocean) 44 and German bight 45 (a shallow region of the North Sea that borders Germany). The archaeal community we found in the mangroves had many methanogen genera, including Methanothermobacter, Methanocaldococcus, Methanococcus, Methanosarcina, Methanococcoides, Methanospirillum, Methanoculleus, Methanosaeta, and Methanoregula. The diversity and presence of these specific genera is an indication that methane metabolism is important in mangrove ecosystems. The archaeal ammonia oxidizer Nitrosopumilus (0.15% ± 0.2) and Cenarchaeum (0.03% ± 0.04) of the phylum Thaumarchaeota was also observed in all samples. At the bacterial genus level, Pseudomonas (2.61% ± 1.7) was found to be most abundant in MG_BNH (8.06%) (India). Pseudomonas spp. thrives in many diverse environments that range from individual humans to the rhizosphere 46 . Pseudomonas has been found to influence plant growth 19,46,47 by releasing siderophores, antibiotics, biosurfactants and solubilization of potassium into forms that are accessible for plants, and has also been noted for its critical nitrogen fixation role in the mangrove ecosystem 48,49 . In addition, they are also tolerant to aromatic hydrocarbons, organic and heavy metal contaminants 50,51 . Another aromatic hydrocarbons degrader, Geobacter, was found at a similar frequency in the samples from Brazil (2.65% ± 0.08) and Saudi Arabia (2.05% ± 0.6). Its presence in the Indian samples (1.03% ± 0.53) differed statistically (p 0.02) from the samples of the other two countries. Geobacter has also been found in petroleum contaminated environments  and pristine deep aquifers capable of Fe (III) reduction 52 and are also strong candidates for immobilization uranium 53 (U (VI) ) and bioremediation of aromatic hydrocarbon contaminants. Neptuniibacter, a copiotrophic microorganism which can degrade aromatic hydrocarbon such as carbazole 54 , was the 3 rd most abundant genus in all the samples (1.61% ± 4.61). The high standard deviation was due to the overwhelmingly high frequency in MG_BNH (16.91%) (Indian sample) that could be due to this region having higher anthropogenic activity when compared to the other samples. In addition, Marinobacter, a ubiquitous marine aromatic hydrocarbon degradation genus 55 of the Proteobacteria phylum, was significantly (p 0.0124) abundant in the Indian (3.15% ± 2.23) samples compared to Brazil (0.44% ± 0.04) and Saudi Arabia (0.82% ± 0.35). Our previous study also showed a dominance of Marinobacter in mangrove samples 18 . Genome analysis of Marinobacter has revealed its potentiality to survive in oil-polluted water 55 and has suggested that it could be used for bio-monitoring of oil spills in mangroves 56 . Two important genera from the phylum Actinobacteria, Streptomyces and Mycobacterium, were found. Mycobacterium were found at more significant frequency (p 0.0209) in Brazil (1.17% ± 0.34) than in Saudi Arabia (0.35% ± 0.03) or India (0.35% ± 0.16). Interestingly, a pristine Brazilian sample (BRMgv-1: 1.72%) and one that was highly impacted by human activity (BRMgv-2: 0.98%) had a higher frequency of Mycobacterium compared to the samples associated with anthropogenic activity (BRMgv-3: 1.19%), or to the sample from pristine (BRMgv-4: 0.79%) environment. It is interesting that Mycobacterium spp., such as M. chlorophenolicum, M. farcinogenes and M. austroafricanum, were observed in samples from mangrove sediments that were contaminated with PAH (Polycyclic aromatic hydrocarbon) 57 . Seven genera belonging to order Desulfobacterales were found in all samples and showed similar frequency across all samples. Desulfotalea is a psychrophilic genus 58 . Desulfovibrio is known to be aerotolerant 59 . Geobacter is recognized as an aromatic hydrocarbons degrader 60 and Pelobacter is an iron and sulfur-reducing mesophilic anaerobe 61 . These were significantly less abundant (p 0.0209, 0.034, 0.026 and 0.037 respectively) in Indian samples when compared to Brazil and Saudi Arabia. An ammonia oxidizing bacteria, Nitrosococcus (0.97% ± 0.28), was present in all the samples with no significant difference within or between the groups. Samples collected from mangroves in each of these countries shared a total of 97.9% of the OTUs, which accounted for 99.97% of the total reads abundance (Fig. 4). Similar results were obtained when compared to forest and vineyard soils 62 . The high number of shared OTUs between the mangroves corroborates the functional genes statistical analysis (Supplementary data 3) between the samples.
Resistance to antibiotics and heavy metals in various ecosystems. High abundance of fluoroquinolones and acriflavine resistance proteins were found in the mangrove samples of India, Brazil and Saudi Arabia irrespective of the collection site. In order to examine the consistency of these resistance genes in other ecosystems, whole metagenomic datasets from others studies, including 18 , ocean 31,32 , forest 29,30 agricultural and grassland soil samples 28 were compared, specifically targeting marine and terrestrial environments with and without anthropogenic activity (Table 1). Heavy metal resistome in diverse ecosystems. The presence of genes involved in heavy metal resistance in rivers, activated sludges, aquaculture farm sediments, etc. 63,64 have been previously described. In our study, genes that play a role in the resistance to antibiotics and toxic compounds were found across all of the ecosystems. Cobalt-zinc-cadmium resistance were found in all the samples, but the percentage of reads that mapped to these genes was found to be significantly lower (p < 0.01) in the samples collected from the ocean (5.713% ± 2.589) compared to those collected from mangroves (23.495% ± 4.701) or terrestrial (27.479% ± 4.605) ecosystems (Fig. 6). Among the genes that determine cobalt-zinc-cadmium resistance, the cation efflux system protein CusA was the most abundant gene. CusA and the cation efflux system provide bacteria with resistance to copper and silver. Although copper is an essential element, it can be lethal to plants even at low concentrations 65 and can lead to several ill effects such as chlorosis, yellow coloration, and retardation of growth 66 . This copper resistance symbiotic bacterium is associated with plants found in mine tailings 67 . Metal ion solubility generally increases with decreasing pH 68 , and the presence of CusA has been found to be associated with soil types with low pH 68 . Marine samples showed significantly lesser enrichment of CusA (5.37% ± 3.55), which could be due to the higher pH of marine water 69 and the lack of plants in this ecosystem. Another annotated function that was seen involved copper homeostasis, but all the ecosystems exhibited comparable level of this particular functionality (7.068% ± 1.154, 6.058% ± 1.343 and 7.238% ± 2.116 for marine, mangroves and terrestrial, respectively). Arsenic resistance genes were also found consistently across all the samples (Fig. 6) 70 demonstrated a similar presence of genes involved in arsenic metabolism in paddy soil, with the authors concluding that these genes play an important role in avoiding arsenic risk through biotransformation.
Antibiotic resistance genes (ARGs) patterns across ecosystems. Genes involved in antibiotic resistance have been observed in distinct patterns across different ecosystems 71 . In our study, the Multidrug Resistance Efflux Pumps functional category was the most abundant drug resistance function across all the ecosystems (Fig. 6A). Interestingly, among the subtypes within this functional category, Acriflavine resistance proteins were significantly abundant in every sample (Fig. 6B). Acriflavine has antibacterial properties and is used as an antibiotic 72 . It has been shown to have antiviral and antitumor activities through its topoisomerase inhibition properties 73,74 . The widespread prevalence of acriflavine resistance was also observed in the past from clinical samples in 11 Asian countries 75 . In the recent years, metagenomic analyses showed that acriflavine resistance genes were highly abundant in South China paddy soil 76 and aerobic activated sludge and anaerobically digested sludge 77 . In our analysis, we found that the acriflavine resistance genes were widespread in aquatic and terrestrial ecosystems that had significant human activity or were from pristine environments ( Fig. 6A and B). Fluoroquinolone drugs, which target DNA gyrase and topoisomerase IV, are widely used as the first line for nosocomial infections 78,79 . Fluoroquinolone as well as methicillin resistance gene were found to be significantly higher in marine (28.178% ± 3.619 and 10.776% ± 1.823 respectively, p < 0.001) as compared to mangroves (9.82% ± 3.776 and 3.034% ± 0.808, respectively) and terrestrial (11.18% ± 8.327 and 2.159% ± 0.682, respectively) ecosystems.
Beta-lactamase was highly abundant in the marine (9.913% ± 2.208) and terrestrial (10.247% ± 5.826) ecosystems. Within the terrestrial samples, forest (4.38% ± 1.66) had similar abundance comparable to mangroves (5.489% ± 0.742) while the agricultural and grassland samples were found to be highly enriched (14.64% ± 3.54). Antibiotic-resistant genes have been found to be similarly abundant in soils that contain or lack manure 80 . Similarly, our result showed a high abundance of β-lactamases genes in agricultural and adjacent grasslands samples with comparable frequency while the forest and mangroves samples had relatively lesser abundance. The high abundance of β-lactamases in agricultural soil have been demonstrated in a recent functional metagenomic study by Lau et al. 81 who identified 34 new antibiotic resistance genes that were related to multi-drug efflux systems, indicating a potential high-level resistance towards aminoglycosides, sulfonamides, and a broad range of beta-lactams. As β-lactamases have been hypothesized to play a vital role in the survival of the bacteria in its natural habitat 82 , the presence of the genes involved in this resistance have been noted in metagenomic samples from a variety of habitats. For instance, a proficient β-lactamase enzyme was isolated from Oceanobacillus iheyensis in the ocean sediments at a depth of 1050 meters 83 . Recently, a novel β-lactamase gene was discovered from Pelagibacterium halotolerans B2T, which was isolated from the East China Sea 84 . The high abundance of β-lactamases in oceans also indicates the rich diversity of enzymes and the promising prospects of novel antibiotic discoveries.
Antibiotics and antibiotic resistance genes have been found in diverse environments that include deep terrestrial subsurface, glacier ice core and samples collected from deep sea that have not been in contact with humans 83,85-88 , but they are mostly present at non-inhibitory concentrations [89][90][91] . The antibiotic resistance genes were dominant in the resistome having significant differences among the ecosystems with the ocean having highest relative abundance compared to mangrove and terrestrial ecosystems (Supplementary data 4). It has been hypothesized that the function of such resistance genes in the natural environment could be related to some basic physiological processes such as biosynthesis of the cell wall 92,93 , trafficking of signaling molecules, detoxification of metabolic intermediates 88 or antibiotic detoxification 88,[94][95][96] . Untouched environments can have novel antibiotic resistance genes 97 that can give rise to more multidrug resistant strains via horizontal transfer when human activities encroach upon them. For instance, when soil samples from pre-and post-antibiotic areas were compared, plasmids from the earlier era had fewer antibiotic resistance genes 97,98 and this was followed by a significant rise in their presence in later sampes 99 . The notion of clinically relevant pathogens acquiring resistance genes from the environment is a likely possibility 97

Conclusion
We have analysed the metagenomic profiles of mangrove sediments across India and compared them with publicly available samples from Brazil and Saudi Arabia mangrove. Distinct patterns unique to the Brazilian and Saudi Arabian mangroves were observed which differentiated them from samples collected in India. Although there were differences, a significant number of microbial genera were found to be present across all of the three geographic regions. Proteobacteria and Euryarchaeota were the most abundant phyla within and between all the mangroves for bacteria and archaea, respectively. A functional analysis that compared the mangroves samples with metagenomic sample taken from ocean, forest, agriculture and grassland showed the presence of highly enriched acrylflavine, copper, fluoroquinolone, β-lactamase and methilicin resistant genes distributed consistently in patterns throughout all the examined ecosystems. Further, our study showed that heavy metals and antibiotic resistance genes are founnd in microbial populations from mangroves and the other ecosystems, including both pristine areas and environments that experience significant human activity. The widespread existence of antibiotic resistance genes could be a warning bell, indicating a source of new genes that could further increase the rise in antimicrobial resistance that could have clinical significance.