Antibiotic Resistome Biomarkers associated to the Pelagic Sediments of the Gulfs of Kathiawar Peninsula and Arabian Sea

Antibiotic resistance has been one of the most persistent global issue. Specifically, marine microbiomes have served as complex reservoirs of antibiotic resistant genes. Molecular advancements have enabled exploration of the uncultured microbial portion from hitherto difficult to sample niches such as deeper oceans. The Gulfs of Kathiawar Peninsula have been known for their unique properties like extreme tidal variations, different sediment textures and physicochemical variations. Pelagic sediment cores across four coordinates each of the Gulf of Kutch, Gulf of Khambhat and an open Arabian Sea were collected, processed for metagenomic sequencing and assessed for antibiotic and metal resistome. The dominant genes were mostly resistant to macrolides, glycopeptides and tetracycline drugs. Studied samples divided into three clusters based on their resistome with carA, macB, bcrA, taeA, srmB, tetA, oleC and sav1866 among the abundant genes. Samples from creek of Gulf of Kutch and mouth of Gulf of Khambhat were most diverse in resistance gene profile. Biomarkers observed include gyrA mutation conferring resistance gene in the Arabian Sea; Proteobacteria species in Gulf of Kutch and Arabian sea; while Aquificae, Acidobacteria and Firmicutes species in the Gulf of Khambhat. Region-wise differentially abundant 23 genes and 3 taxonomic biomarkers were proposed for antibiotic resistance monitoring.

Metagenomics data analysis. The sediment samples were shotgun sequenced and 160 to 222 million of paired-end reads were obtained with ~97% raw reads having a Q-score more than 20. Each sample was De Novo Assembled and the statistics of the assembled sequences are as described in Table 1. The statistics for the A and GOCS1-4 were not shown here to avoid redundancy of the results. The reads for the GOKS1-4 were submitted to EBI metagenomics under the project accession number PRJEB26615.
Sediment antibiotic and metal resistome. The resistome analysis revealed that considering all the nine samples, we observed hits against a 2367 ARGs that included the predicted genes encoded by different organisms for antibiotic resistance as well as the validated mutations in genes that conferred resistance to the microbe. Among them, hits to 2355 ARGs were encoded by the sediment microbiota in the four samples from GOK; 2353 by the GOC microbiota and the A had hits against a set of 2292 ARGs. The ARGs had a length between 18 to 400 amino acid with an average length of 200 amino acid in the studied samples. On the whole, it was observed that GOK was most diverse and the Gulfs were relatively more diverse than the A sample. In GOK, AFM samples were found to be more diverse with highest in the ARGs observed in the GOKS4 sample followed by GOKS3 > GOKS1 > GOKS2. Among all the studied samples, the diversity observed was GOKS4 > GOKS3 > GOCS4 > A > GOCS3 > GOCS2 > GOKS1 > GOKS2. The maximum hits against the bacteriocidal and metal resistance genes was observed in the GOKS3 sample with near similar proportions in all sample except GOKS1 and GOKS2 which had relatively lesser proportion of hits (~25% lesser hits) ( Supplementary Fig. 2). In contrast to ARGs, diversity of metal resistant genes was more in the GOC and A compared to most of samples of GOK with GOKS3 as an exception. When looking into the three sites, Khambhat had few genes corresponding to resistant to Copper, Arsenic and Cetyl trimethyl ammonium bromide (CTAB) significantly varying from other two sites with slightly higher proportion of the same, while genes against multiple metals were lesser in proportion (Fig. 1a). In the Kutch samples, genes against the Cobalt and Nickel were significantly higher compared to other sites (Fig. 1b). On the whole, Iron chelation related genes were enriched in the Gulfs with genes against organo-sulfate and acid molecules being significantly varying between the TMS compared to AFM and open ocean regions (Fig. 1c,d). On the whole, considering the cluster comparison, genes conferring resistance to Arsenic, Copper and Silver were most dominant and significantly varying between the clusters ( Fig. 2 and Supplementary Fig. 3). From the total genes, ~3-6% of the genes were those with MDR like predicted efflux pump related functions and 7-10% were efflux transport mechanism based functioning genes corresponding to Acriflavine, Arsenic (As), Cetrimide (CTM), Cetyl trimethyl ammonium bromide (CTAB), Chlorhexidine, Ethidium Bromide, Hydrochloric acid (HCl), Lead (Pb), Nickel (Ni), Silver (Ag), Sodium Cholate, Sodium Deoxycholate (SDC), Triclosan, Triton X-100, Zinc (Zn) and other similar multiple compounds reflecting the effect of these compounds in the environment in co-selection of antibiotic resistance.
The mode of actions by which the microbiota confer the antibiotic resistance were similar in trend for all the studied sediment samples (Fig. 3a). The antibiotic efflux was the most dominant mode of action (>45%) in all the samples, followed by the antibiotic target alteration in the GOK and A samples; whereas the GOC samples had almost equal proportions of ARGs using the antibiotic target alteration and antibiotic inactivation in half of the samples. Almost 5% of the predicted resistance genes were classified under those that are reported to employ multiple mechanisms of action to confer resistance. ARGs conferring resistance by antibiotic target protection and target replacement based mechanism were among the lower proportions in the studied samples.
The samples formed three significant clusters when assessed for the antibiotic type that the ARGs were resistant to (Fig. 3b). The A and GOK-TMS sediment samples formed a separate cluster from the rest of the samples. Among the rest of the samples, GOCS1 and GOCS4 formed the cluster 2, while the remaining four samples www.nature.com/scientificreports www.nature.com/scientificreports/ formed the cluster 3. The ARGs against multiple drugs were almost >40% in all the samples (Data not shown), followed by Glycopeptide resistant ARGs with >10% of the total ARGs, with relatively higher abundance in the A, GOKS3 and GOCS2 samples. The latter samples had similar trend in many of the drug class groups. Macrolide and tetracycline were the next most abundant drug classes in all the samples. The hits corresponding to the ARGs against the remaining class of antibiotics including carbapenem, fosfomycin, fusidic acid, murpirocin, nitrofuran, phenicol, phenolic, pleuromutilin, pyrazinamide, rifamycin and streptogramin were in similar proportions across the samples with the set of genes in each class corresponding to the niche and microbiota encoding them. GOCS1 and GOCS4 had lower abundance of diaminopyrimidine, while relatively higher of cephamycin. GOKS1 and GOCS2 had similar proportions of few groups such as relatively higher ARGs against nitroimidazole and macrolides. On the whole, ARGs were distributed against 36 different drug classes.
There were 63 genes that had a percent abundance of >0.5 in at least one of the studied nine samples (Fig. 4). Among these the carA gene was most abundant in all the samples, followed by macB. Apart from these, bcrA, taeA, srmB, tetA, oleC and sav1866 were also relatively abundant and varied in proportion between the samples. While the rest of the genes were tightly within an equal range in the samples as for example abcA gene proportion ranged between 0.5% to 0.7% in the studied samples. The genes apart from these 63 accounted for 49-55% (shown as others in Fig. 4) of the total ARGs abundance in each sample.
Further, when the absent ARGs from the total ARG diversity of all studied samples (2367 genes) were assessed for five sample/groups viz., A, GOK-TMS, GOK-AFM, GOC-TMS, GOC-AFM; we observed that maximum unique ARO categories were absent in the GOK-TMS samples, while no unique ARO was absent in the GOK-AFM samples ( Supplementary Fig. 4). There were 30 such ARGs that were commonly absent in A and the rest of the groups. GOC-AFM group had 45 unique ARGs absent and GOC-TMS had five. www.nature.com/scientificreports www.nature.com/scientificreports/ Marine microbial community corresponding to the resistome. The contigs having positive ORFs with the resistance genes were assessed for taxonomy using the MarRef database. Proteobacteria was the most dominant phylum in all the studied samples, followed by second most abundant phylum: Euryarchaeota in GOKS2, GOCS1, GOCS2; Actinobacteria in A, GOKS1, GOCS2, GOCS3 and Bacteroidetes in GOKS3 (Supplementary Table 2). While GOKS4 had Firmicutes as second most abundant phylum with Actinobacteria and Bacteroidetes in almost similar proportions (Supplementary Table 2). As observed in the ARG analysis, based on the resistant drug class 3 major clusters (Cluster 1: A, GOKS1, GOKS2; cluster 2: GOCS1, GOCS4; cluster 3: GOKS3, GOKS4, GOCS1, GOCS2) were formed among the nine sample (Fig. 3b). The taxonomic abundance profiles were checked for www.nature.com/scientificreports www.nature.com/scientificreports/ significant difference at phylum level between the clusters. The results revealed that the Cluster 2 had the highest number of significantly varied phyla (n = 18 at p < 0.05) compared to the other clusters ( Fig. 5a), wherein the proportion of the commonly abundant Proteobacteria and the Actinobacteria was relatively lesser in the samples than the other clusters. Firmicutes, Thermotogae and few of the archaea phyla were slightly higher in proportion in the Cluster 2 samples. For the Cluster 1 samples, Acidobacteria proportion was significantly low, while the Cluster 3 had lower proportions of Candidatus and Crenarchaeota phyla (Fig. 5a).
Further, at genus level, the samples showed similar trend to the phyla level assessment, however, showed slightly different trend compared to the drug class analysis, wherein the GOKS2 (which was falling under Cluster 1 as per drug class analysis) was observed to be falling with the samples GOCS1 and GOCS4 i.e. Cluster 2 (Fig. 5b). Most of the significantly active genus (n = 15) observed in the three clusters were extremophiles, with nearly equal proportions of gram-positive and gram-negative bacteria. Thermotoga, Geobacillus, Thermosediminibacter, Desultomaculum and Thermaerobacter were among the higher proportions.

Region specific biomarkers.
We assessed the presence of specific ARO as well as microbial community at different classification levels for the two Gulfs and Open sea using LEfSe. In total, 23 ARO were found to be having LDA >2.0 between the GOC and A samples (Fig. 6a), as the difference between GOC and A was more than that with either of them with GOK, thus specific marker for GOK was not observed. Only the ARO:3003942 was higher in abundance in GOC compared to both GOK and A samples, which is an ATP-binding cassette (ABC) antibiotic efflux pump, viz., abcA, a multidrug resistant ABC transporter that confers resistance to methicillin, daptomycin, cefotaxime, and moenomycin. All the other ARO that were less abundant in the GOC belonged to two main groups viz., the resistance-nodulation-cell division (RND) antibiotic efflux pump conferring multi-drug resistance and the gyrA genes resistant to fluoroquinolone, nybomycin and triclosan ( Table 2). Among these, ARO:3003374, ARO:3003374 and ARO:3003940 were also significantly higher in GOK compared to GOC (Individual biomarker proportions shown in Supplementary Fig. 5).
Total 24 microbial taxa were having an LDA >2.0 among the three regions under study (Fig. 6b). However, there were 3 significant markers (p < 0.05) falling under the Proteobacteria phyla. Among others with lower significant impact were biomarkers from the genus Hirchia falling under the Proteobacteria phyla; one species from the Bacteroidetes and remaining 14 biomarkers in the GOC were from three phyla viz., Acidobacteria, Aquificae and Firmicutes, and this region was exception with no biomarker reported from the Proteobacteria phylum.

Discussion
Since past few years, researchers have been witnessing that microbiota not only attain antibiotic resistance on exposure to antibiotics that are a part of normal households and health centers, but they also gain their antibiotic resistome from natural environment via horizontal gene transfers or in response to different kinds of stress imposed on them by their niche 7 . Recently, with the advancement of molecular techniques for microbiome exploration, studies have been assessing relation between metals and antibiotics which has revealed that antibiotic resistance in bacteria involves co-resistance to metals 13 . In context to this, we performed resistome analysis of pelagic sediment microbiota across four coordinates in the Gulf of Kutch and compared it with similar observations in the other Gulf of Kathiawar peninsula i.e. the Gulf of Khambhat and a sample from the open sea (Arabian sea). We extended the interpretations to propose microbial and functional biomarkers specific to the three regions. www.nature.com/scientificreports www.nature.com/scientificreports/ As the two Gulfs and the coastline of Arabian sea from the Kathiawar peninsula are known for their unique attributes in terms of sedimentation, salinity and on-shore activities, the physicochemical parameters of the samples were assessed. The outcomes revealed a specific trend for percentage salinity which was relatively higher in the AFM samples of the GOK (Supplementary Fig. 1). Similar proportions have been observed and reported earlier and the higher salinity could be attributed due to its inverse estuary property 11 . Also, owing to its characteristic for harboring rich biodiversity, the %TOC of the GOK sample was slightly more than GOC and the sample  The present study, to the best of our knowledge, provides a preliminary metagenomics outlook in the deeper marine sediments of the sampled region, hence the main objective was to observe the proportion of coding genes in this metagenomes against reported antibiotics and metals. Therefore, though recent studies have used a minimum cut-off of 80%, in our study, the genes with percent identity 50 as minimum were considered so that any signatures of resistant genes are not missed 14,15 . Previously, it was reported that at lower identity cut-off the sensitivity for true positives reduced, but still there was at least half the proportion was truly predicted 14 . It can be hypothesized to be more sensitive in case of marine niche which is well known for novel and rapid changes in coding genes and highly prone to horizontal gene transfers. Further, PCR based validation and confirmation would enhance the knowledge of the newly acquired mutations in the studied metagenomes and their role in the resistance. Among the mechanisms employed by the ARGs against the antibiotic target molecules, the antibiotic efflux was almost used by 48-54% of the genes in all the nine studied samples (Fig. 3a) 9 . The antibiotic efflux has been observed to be the dominant mode used in earlier resistome studies from similar niche, as the mode allows removal of target molecules from the intracellular compartments of the microbes 2,16 . The efflux pump related genes may be involved in co or cross-resistance with metal resistance. However, the second most abundant mechanism used was different from the former mentioned study, wherein they observed antibiotic target replacement  www.nature.com/scientificreports www.nature.com/scientificreports/ as dominant which was quite less in percentage in our samples. Possible reasons for this varied observation could include different sequencing platforms used, difference in the database or database versions, difference in the sites of study, etc. The results were somewhat in line with those observed in the Antartica samples, wherein they reported varied antibiotic inactivation modes 17 , however, we also observed antibiotic target alteration as the second most abundant mode except for the GOCS1 and GOCS4 samples.
Unlike the GOC samples 9 and similar to previous reports 18 , there was a trend observed in the GOK samples as we moved from AFM to the TMS samples, in terms of both the drug classes and the ARGs (Fig. 3b). The trend observed could be owing to the relatively less discharge rate by rivers in the GOK compared to the GOC, leading to a more stable core microbiome from creek to mouth side based on its niche parameters and nearby on-shore activities. Also, similar to most of sediment studies, the multi-drug resistant (MDR) genes were most abundant 9,19 followed by the glycopeptide, tetracycline, macrolide resistant. The results also support the high occurrence of the MDR efflux pump mechanism in the samples. Among the metal resistant genes, Cu, Ni, Co, As related were significantly varying between the sites revealing the probable presence of these metals in the sediments. Also, the most hits were observed against Cu, Hg, As, Ni, Fe with few genes related to Tungsten (W) and Tellurium (Te), similar trend was earlier reported in case of sediment samples, however, Ni and As were not among the dominant genes 13 .
As seen previously, the macrolide, tetracyclin and glycopepetide resistant genes were dominant in all the samples. The group of genes encoding resistance against glycopeptides vanRA-RN were among the genes with >0.5% proportion, however, the proportion of macrolide resistant carA and macB genes was highest in the samples (Fig. 4). Such results have also been reported previously in deep sediment samples 9 . Our observations were similar, wherein the abundance of macB was co-related to that of bcrA gene, these respectively acting against the macrolides and peptide that interfere with important bacterial biosynthesis 20 .
The absence of more ARGs was observed in the GOK-TMS samples which may reflect the higher ARG diversity in the GOK-AFM samples owing to their high salinity ( Supplementary Fig. 4). It has been recently reported that high salt stress condition increased the efflux of antibiotics in the E. coli by increasing the ARG expressions 21 . This could be one of the reasons for higher ARG diversity in the GOK-AFM samples and also the GOCS1 sample. Also, contaminants from anthropogenic activity tend to increase the ARGs and MDRs, which was also reflected in the GOCS1 and GOCS4 samples that have extensive activities as mentioned earlier 9 .
Total of 24 phyla observed in the sediment samples were further assessed for statistically significant variation between the three clusters observed in the Fig. 5b. As the samples were acting against specific groups of antibiotics, we assessed the corresponding microbial phyla variation in the same (p < 0.05) (Fig. 5a). The Cluster 1 which was supposedly more pristine had lower proportions of Acidobacteria. The most dominant taxa in sediments viz., Protobacteria was also the most abundant corresponding to encoding the ARGs 22 , however, its proportion was significantly lower in the Cluster 2 while the proportion of Firmicutes that is linked with human gut related ARGs was higher in Cluster 2 14 . This may suggest the mixing of human faeces in the samples and also the influence of human activities, as the cluster had huge number of significantly varying phyla. Further, similar analysis for genus level revealed significant features (those varying with p < 0.05 between the clusters) with highest abundance of Thermotoga (Fig. 5b). Most of the varying genus were extremophiles. The variations reflect the community of the concerned niche and thus their relevance to acquiring ARGs.
There were 22 biomarker ARGs (BM-ARGs) observed in the A and one in the GOC (Fig. 6a) i.e. abcA efflux pump acting against methicillin, daptomycin, cefotaxime, moenomycin drugs which are majorly antibiotics against the gram-positive microbes. This may reveal the in-flow of these in the GOC niche, however, these have not been abundantly reported from sediments in earlier reports. Also, the BM-ARGs in the A mostly belonged to the gyrA mutations conferring resistance to fluoroquinolones ( Table 2). Open sea is hypothesized to be having different profile from the Gulfs, as the latter undergo constant sedimentation in varied way compared to the open sea. These BM-ARGs serve a preliminary point for assessing and monitoring the possible ARGs contaminants in the niche. Also, 24 biomarkers of bacterial community (BM-BC) were observed in the three regions (Fig. 6b). The GOK and A regions had BM-BC belonging to the Proteobacteria phyla with A having species belonging to alpha and gamma Proteobacteria, while GOK from gamma Proteobacteria along with one genus Formosa, from the Bacteroidetes phyla. Acidobacteria, Proteobacteria and Bacteroidetes have been commonly observed in ARG rich sediment samples owing to their high abundance 17 . The BM-BC observed in these are unique to marine and have fewer community members and thus may serve to be potential biomarkers. In GOK, Enterobactericia were among the predicted biomarkers which could owe to the hypothesis that wastewater source could be leading to carbepenem and β-lactam resistance in the pathogenic family, which has remained a challenge in the marine sediments as earlier observed 13,23 . In GOC, the observed BM-BC belonged to Aquificae, Acidobacteria and Firmicutes. The region stood unique as per earlier observations with different biomarkers and also those belonging to Firmicutes, which were abundant in the GOC samples, rather than any BM from the most dominant Proteobacteria phyla.
The present study, revealed the resistome of the pelagic sediments across the two Gulfs and compared it to an open sea coastline of Arabian sea of Kathiawar peninsula. The overall outcomes clearly showed the influence of both the niche environment and the on-shore activities on the ARG profile of the sediment microbiota. The most dominant ARGs included carA and macB acting against macrolides, followed by 61 other ARGs with a proportion of >0.5% in at least one of the studied nine samples. These belonged to the ARGs against drug classes viz., glycopeptides, macrolides, and tetracycline. The nine samples seemed to form three major clusters with respect to ARGs and the proposed reason behind it seem to be the salinity variations and nearby on-shore anthropogenic activities. Efflux pump was the dominant mechanism of action in the observed ARGs. The corresponding bacterial community was dominated by Proteobacteria followed Firmicutes in one cluster, and Actinobacteria in the rest of the clusters. Significant variations in the clusters at both phyla and genus level revealed a core microbiome in the specific niche. Differentially abundant features among the samples from each region revealed 23 gene and 24 taxonomic biomarkers, which may be potentially validated further for their application in monitoring the specific/ similar niche for ARGs and antibiotic related contaminations. Metagenomic DNA extraction. Metagenomic DNA was extracted from each GOK sediments (4 sites × 3 core sections = 12 samples) using the PowerSoil DNeasy Kit (Qiagen, Germany) as described earlier 9 and was quantified using Qubit Fluorometer v3.0 with Invitrogen ™ Qubit ™ dsDNA High Sensitivity assay kit (ThermoFicher Scientific).
High-throughput sequencing of metagenomic DNA. The high-quality DNA from the three corer sections viz., T, M and B from each sampling point of GOK were pooled in equal concentrations and processed for metagenomic shotgun sequencing by the Trueseq Nano kit as per the manufacturer's protocol. The library was assessed for quality and mean base-pair length followed by sequencing on the HiSeq. 4000 (Macrogen Inc., Seoul, Rep. of Korea).
Metagenomic data analysis. The generated paired-end raw reads were quality checked by FastQC 24 and filtered for adaptor sequences, short read removal by the Scythe (v0.994) and Sickle 25,26 . The filtered sequences (almost 96.5% of the total raw reads on an average were retained) from each sample were processed for De Novo assembly by the CLC Genomics Workbench v.11 (Qiagen, USA), all the parameters except a high k-mer value of 31 were kept default. The obtained scaffolds were then subjected to ORF prediction using MetaProdigal 27 . The predicted ORFs in each of the four GOK samples were individually blasted (e-value threshold of 1e-5) against the protein sequences reference file, that comprised of 2,893unique ARG sequences downloaded from the Comprehensive Antibiotic Resistance Database (CARD, https://card.mcmaster.ca/) 28 . The samples were normalized by computing the percentage of ARG hits to understand the complexity of the sample based on the number of unique ARGs under each sample. Similarly, the samples from the studied sites were assessed for metal resistance against the BacMet database 29 . The contigs showing hits for ARGs were extracted and checked for the marine microbial taxa by performing blastn against the MarRefDB 30 . The five samples from GOC and A sites were also processed by same pipeline 9 and their raw results were considered for comparison between the three regions for antibiotic resistance mechanisms, drug classes against which the ARGs were observed, ARG diversity, ARG enriched microbiota profiling at phylum and genus levels and biomarkers prediction. Statistical analysis. Significant variation (p < 0.05) in the ARG abundance as well as the corresponding microbiota analysis was assessed with following variables: sites, samples near the creek (referred as AFM: Away From Mouth) and those near the mouth (referred as TMS: Towards Mouth Side) of Gulfs, using two-way Welch's t-test with a Benjamin-Hochberg FDR correction using STAMP (Statistical Analysis of Metagenomic Profiles) v1.07 software package 31 . To identify niche-specific biomarkers by LEfSe analysis, the Kruskal-Wallis test (alpha value of 0.05 for ARG; alpha value of 0.1 and 0.05 for microbial taxonomy) was performed with LDA score of >2 at multiple taxonomic levels using the linear discriminant analysis (LDA) effect size (LEfSe) 5 . For input, the microbiota abundance profiles were computed at each taxonomic levels from phylum to species and the ARG abundance profile was assessed based on only the ARO (Antibiotic Resistance Ontology) category hits.

Data availability
The metagenomic sequence of the four GOK samples in the study has been deposited to the EBI metagenomics under the BioProject PRJEB26615.