Microbial diversity in Mediterranean sponges as revealed by metataxonomic analysis

Although the Mediterranean Sea covers approximately a 0.7% of the world’s ocean area, it represents a major reservoir of marine and coastal biodiversity. Among marine organisms, sponges (Porifera) are a key component of the deep-sea benthos, widely recognized as the dominant taxon in terms of species richness, spatial coverage, and biomass. Sponges are evolutionarily ancient, sessile filter-feeders that harbor a largely diverse microbial community within their internal mesohyl matrix. In the present work, we firstly aimed at exploring the biodiversity of marine sponges from four different areas of the Mediterranean: Faro Lake in Sicily and “Porto Paone”, “Secca delle fumose”, “Punta San Pancrazio” in the Gulf of Naples. Eight sponge species were collected from these sites and identified by morphological analysis and amplification of several conserved molecular markers (18S and 28S RNA ribosomal genes, mitochondrial cytochrome oxidase subunit 1 and internal transcribed spacer). In order to analyze the bacterial diversity of symbiotic communities among these different sampling sites, we also performed a metataxonomic analysis through an Illumina MiSeq platform, identifying more than 1500 bacterial taxa. Amplicon Sequence Variants (ASVs) analysis revealed a great variability of the host-specific microbial communities. Our data highlight the occurrence of dominant and locally enriched microbes in the Mediterranean, together with the biotechnological potential of these sponges and their associated bacteria as sources of bioactive natural compounds.


Molecular characterization.
BLAST similarity search corresponded with the morphological identification achieved with two (S.spi and E.dis) of the three sponge samples collected in the Faro Lake (Sicily; see Tables S1-S2). In particular, molecular analyses confirmed S.spi as Sarcotragus spinosulus and E.dis as Erylus discophorus. ITS region displayed the best alignment for the identification of S.spi specimen with 98% of identity. Concerning the sample O.per, collected in the same site, the sequence of the species Oceanapia cf. perforata, identified by morphological analysis was not available in GenBank. Nevertheless, 18S and 28S rRNA primer pairs allowed a partial identification at the genus level, with high similarity to Oceanapia isodictyiformis, plus several hits annotated as Oceanapia sp. at low percentage of identity (Tables S1-S2).
In the case of sponges collected from "Porto Paone" in the Gulf of Naples, CO1 and 18S rRNA were the best molecular markers since they allowed the identification of sponges at the species level. Concerning T.aur sample, the species Tethya aurantium was identified (95% of identities) by CO1, whereas the sample A.dam, collected in the same site, was well identified as Axinella damicornis by 18S rRNA primers, producing a high percentage of sequence similarity (99%) (Tables S1-S2).
Molecular analysis confirmed the samples A.acu and A.oro, as Acanthella acuta and Agelas oroides, respectively, with 28S and 18S rRNA reporting the highest percentage of sequence similarity (100%) (Tables S1-S2). Alignments are reported in Figures S1-S7. Diversity analysis. Alpha rarefaction on the observed features and three diversity indices (Chao1, Shannon and Simpson), were used to determine whether each sample was sequenced up to a sufficient depth. Rarefaction curves indicated that the majority of taxa was captured, since all samples under analysis reached a plateau ( Fig. 1;  Figure S8). Alpha diversity within each sample, measured through diversity indices at the family, genus and species levels, revealed a considerable bacterial diversity for all sponges under analysis, particularly when the species were considered (Table S3). When the impact of the abiotic features (temperature, pH and salinity) was examined, no statistical differences were observed between the groups. Overall, PERMANOVA analysis revealed that the environmental features did not affect the species composition or abundance in the symbiotic community. Further, a principal component analysis (PCA) was performed to reveal the bacterial species that greater contributed to the clustering of samples (Fig. 2). PCA showed seven components, with eigenvalues of 87. 57, 21.42,19.12, 11.49, 10.82, 8.48, and 8.11. The first two components, including around 65.2% of the inertia of data, were used for further analyses, in order to detect the most interesting patterns. Firstly, PCA displayed three major groups, with 167 bacterial species that particularly influenced the clustering (Fig. 2).
Taxonomic profiling. ASVs analysis was performed on those reporting a confidence percentage ≥ of 75%.
In sponges T. aurantium and A. damicornis, both collected in the Porto Paone, in the Gulf of Naples, 76 and 98 ASVs were detected, respectively. The most abundant bacterial classes in T. aurantium were Alphaproteobacteria, Gammaproteobacteria, Acidimicrobiia and Dadabacteriia, whereas bacterial community of A. damicornis was dominated by Gammaproteobacteria, Nitrospira and Deltaproteobacteria (Fig. 3).
As reported above, the sponges O. cf. perforata, S. spinosulus, A. oroides and G. cydonium revealed a similar composition in bacterial species distribution. In fact, a high abundance of Cloroflexi and Proteobacteria was observed in these species ( Fig. 4; Figure S9).
The absolute abundance of each bacterial phylum retrieved from the eight sponge samples was reported in Table S12.

Discussion
In this study we analyzed the biodiversity of marine sponges in the Mediterranean, focusing on four sampling sites: one in the Messina Strait (North-East of Sicily) and three in the Gulf of Naples. This work represents an important step forward in the investigation of the Mediterranean, being considered as a biodiversity hotspot 25 . Long-term variations of biodiversity are significant signs of environmental change. Concerning the Mediterranean sea, data are available to compare possible variations in the species richness and faunal compositions, which are responsible of loss or turn-over of biodiversity 18,19,26 . Moreover, enclosed saline coastal basins, such as the case of the Faro Lake, represent good models of aquatic system to study temporal variation of sponge biodiversity 26 .
Firstly, we identified seven sponges, complementing the traditional identification by morphological features using a molecular approach, based on DNA sequencing of 28S and 18S rDNA, ITS and CO1. Our results www.nature.com/scientificreports/ demonstrated that none of the molecular markers alone was able to define the sponges under analysis up to the lowest taxonomic level. Indeed, molecular markers were found to be suitable depending on the species of sponge to be classified. However, it must be considered that the barcoding analysis could be negatively affected by the lack of curated sequences collection. Among these used molecular markers, 28S and 18S rRNA are characterized by sufficiently heterogeneous regions useful to address phylogeny at different levels 27,28 . Because of their rapid evolution, ITS regions are considered markers at high resolution 29 . The COI mitochondrial DNA locus, despite the high variability at the sequence level, it is easy to amplify for its conservation across multicellular animals and abundant in eukaryotic DNA 30,31 . In fact, it resulted to be the most successful molecular marker to discriminate sponges at various taxonomic levels 32,33 . According to these literature data, no single marker exists for all sponge species, having each marker its strength and limitations 34 . This difficulty is also linked to the incomplete sequences annotated in database, so limiting phylogeny-based molecular taxonomic approaches that are commonly used for species identification. For this reason, a multi-locus-based molecular approach is recommended for the reliability in the case of sponge identification 34 . This was in complete agreement with our experimental strategy for the identification of sponges under analysis.
An important finding achieved by this study regarded the fact that, among the three sponges collected at Faro Lake, only E. discophorus was recorded in 2013 during a survey on the long-term taxonomic composition and distribution of the shallow-water sponge fauna from this meromictic-anchialine coastal basin 26 . The other two species, O. cf. perforata and S. spinosulus, were not reported so far, suggesting them to be new colonizers of this lake. Recently, the significant number of first reports of species from several biogeographic regions found in the Faro Lake 35-38 is probably related to the import of bivalves from Atlantic and Mediterranean sites, for aquaculture activities. All three sponges usually live on rocks, coralligenous concretions and marine caves in the Mediterranean. In addition, O. cf. perforata is a rare species in the Mediterranean. Concerning the other sponges collected in the Gulf of Naples, T. aurantium, A. damicornis, A. acuta and A. oroides, represent typical species for the Mediterranean, as well as, G. cydonium.
Furthermore, through metataxonomic analysis, we also investigated the bacterial diversity among these Mediterranean sites. Recent advances in molecular ecology techniques, such as the sequencing of bacterial 16S rRNA gene, led to a clear picture of the taxonomic and functional composition of marine microbiota, including associated symbionts 39 .
Our results showed that sponges under analysis host diverse bacterial communities. Surprisingly, sponges collected in the Faro Lake were characterized by a more diversified composition of phyla in comparison to those collected in the Gulf of Naples ( Fig. 4; Figure S9). Moreover, G. cydonium revealed a little sequencing depth (Fig. 1), probably related to the uniqueness of the collecting site (Table 1), which has strictly influenced the symbiotic community by selecting a few species of well adapted bacteria. In fact, Secca delle Fumose represents a www.nature.com/scientificreports/ good case study, due to the variations in seawater pH and gas-rich hydrothermal vents 40 . As reported in literature, extreme environments are well-known to inhabit a macro-and micro-biota with high biotechnological value [41][42][43][44] . Moreover, PCA analysis suggested interesting results for the sponges collected from Punta San Pancrazio (Ischia Island). In fact, A. oroides revealed considerable similarities to the sponges retrieved in the Strait of Messina, since they clustered in the same group (Fig. 2). On the other hand, A. acuta separated from the other sponges under analysis, revealing a completely different symbiotic community that needs to be taken into consideration (Fig. 2).
The phylogeny of sponges must also be considered in our analysis, because it probably influenced the community structure. In fact, S. spinosolus, A. oroides, E. discophorus belonging to Dictyoceratida, Agelasida, Tetractinellida orders, respectively, were recorded as High Microbial Abundance (HMA) species, while T. aurantium, A. damicornis and A. acuta were instead indicated as Low Microbial Abundance (LMA) species [45][46][47] . HMA sponges hosted a more diversified symbiont community than LMA, which were discovered to be extremely stable over seasonal and inter-annual scales 46 . The correlation of sponge taxonomy to the abundance and diversification of microbial communities was evident in the heat-map, since the HMA species displayed higher values of bacterial features (Fig. 3). Moreover, these considerations could justify the clustering obtained through the PCA analysis, where T. aurantium and A. damicornis separated from S. spinosolus, A. oroides and E. discophorus (Fig. 2).
Many studies reported about the sponge associated-bacteria as good candidates for the isolation of natural compounds, useful in biotechnological applications. This study represents a first evaluation of the biotechnological potential of the aforementioned sponges. For this reason, we will further discuss the known bioactivities of the most abundant bacterial phyla identified in the considered sponges, according to the available literature.
Dehalococcoides and Anaerolineae (a class of the phylum Chloroflexi) seem to be peculiar species of both collection sites, being detected in S. spinolosus, E. discophorous, and A. oroides (Figs. 3, 4; Figure S9). This was an interesting finding, because both bacterial groups were isolated for the first time from Mediterranean sponges. In fact, Anaerolineae were found most abundant in Aaptos suberitoides and Xestospongia testudinaria collected in South East Misool, Raja Ampat, West Papua (Indonesia) 72,73 . No data were reported so far for marine biotechnology applications. In contrast, the anaerobic Dehalococcoides showed surprising capabilities to transform various chlorinated organic compounds via reductive dechlorination. For this reason, Dehalococcoides were extensively used for the restoration of environments contaminated by chlorinated organics, which are normally released through industrial and agricultural activities 74,75 . ASVs analysis showed a peculiar abundance of the phylum Verrucomicrobia (class Verrucomicrobiae) in the sponge O. cf. perforata (Figs. 3, 4; Figure S9). Little information was reported on the abundance and ecology of aquatic Verrucomicrobia, being prevalent in lakes characterized by nutrient abundance and phosphorus availability 76,77 . These bacteria play an important role in global carbon cycling, processing decaying organic materials and degrading various polysaccharides [78][79][80][81] . It was found that the sponge-symbiotic Verrucomicrobiae bacteria (e.g. Petrosia ficiformis) exhibited enrichment of the toxin-antitoxin (TA) system suggesting the hypothesis that these bacteria use these systems as a defense mechanism against antimicrobial activity deriving from the abundant microbial community co-inhabiting their host 77 . Rubritalea squalenifaciens (strain HOact23T; MBIC08254T) is a rare marine bacterium belonging to the phylum Verrucomicrobia, isolated from Halichondria okadai (collected in Japan), from which a novel acyl glyco-carotenoic acids, diapolycopenedioic acid xylosyl esters A, B, and C, were isolated as red pigments with a potent antioxidative activity 82 .
Furthermore, Nitrospirae (class Nitrospira) was the most abundant bacterial phylum in the three sponges from the Gulf of Naples, A. damicornis, G. cydonium and A. acuta, as well as, in O. cf. perforata and E. discophous from the Faro Lake (Figs. 3, 4; Figure S9). The first described Nitrospira species was N. marina, isolated by Watson et al. 83 from water collected in the Gulf of Maine. In particular, Nitrospira spp. play pivotal roles in nitrification as anaerobic chemolithoautotrophic nitrite-oxidizing bacterium 84 . These bacteria also have been found in several sponge species such as Theonella swinhoei and Geodia barretti 85,86 . Concerning their biotechnological potential, very little information is available so far. A recent work, using BLASTp search against the Integrated Microbial Genomes (IMG) database, identified a Pseudoalteromonas luteoviolacea gene encoding for a L-amino acid oxidase (LAAO) with antimicrobial properties in a strain belonging to the phylum Nitrospinae 87 .
Summarizing, our data pointed out the attention on the species biodiversity of the Mediterranean Sea and on 16S rRNA sequence datasets, which allowed to the detection of several signature resident microbial fauna. In addition, data reported on the biotechnological potential of the bacteria identified in the eight sponges under analysis, suggest the need for further validations through bioassay-guided fractionation to identify novel metabolites useful for the pharmaceutical, cosmeceutical and nutraceutical fields. The site Faro Lake (0.263 Km 2 ) is the deepest coastal lake in Italy located within the Natural Reserve of "Capo Peloro" (NE Sicily). The Faro Lake is characterized by a funnel-shape profile, with a steep sloping bottom reaching the maximum depth of 29 m in the central area and a wide nearshore shallow waters area. In the deepest part, the Faro Lake shows typical features of a meromictic temperate basin, with an oxygenated mixolimnion (the upper 15 m) and a lower anoxic and sulphidic monimolimnion 88 . Two channels, a northern and a northeastern, connect the lake to the Tyrrhenian Sea and the Strait of Messina. Salinity ranges are from 26 to 36 PSU, temperature from 10 to 30 °C and pH ranges from 7.0-8.6 26  www.nature.com/scientificreports/ (depth = 20 m; 40°49ʹN, 14°5ʹE, Fig. 5D). All collecting sites were selected on the basis of some data reporting on the great biodiversity and, in some cases, the presence of alien species 26,89,90 . Collected samples were immediately washed at least three times with filter-sterilized natural seawater. A fragment of each specimen was preserved in 70% ethanol for taxonomic identification; another fragment was then placed into individual sterile tubes and kept in RNAlater © at − 20 °C used for molecular analysis. Details on sampling were reported in Table 1.

Morphological analysis of the sponges.
For the taxonomic analysis, the spicules of each sponge specimen, spicule complement and skeletal architecture, were examined under light microscopy following published protocols 91,92 . Taxonomic decisions were made according to the classification present in the World Porifera Database (WPD) 14 . The sponge samples were all identified at the species level.
DNA extraction and PCR amplification. About 10 mg of tissue was used for DNA extraction by QIAamp® DNA Micro kit (QIAGEN), according to the manufacturer's instructions. DNA quantity (ng/μL) was evaluated by a NanoDrop spectrophotometer. PCR reactions were performed on the C1000 Touch Thermal Cycler (BioRad) in a 30 µL reaction mixture final volume including about 50-100 ng of genomic DNA, 6 µL of 5X Buffer GL (GeneSpin Srl, Milan, Italy), 3 µL of dNTPs (2 mM each), 2 µL of each forward and reverse primer (25 pmol/µL), 0.2 µL of Xtra Taq Polymerase (5 U/µL, GeneSpin Srl, Milan, Italy) as follows (for primer sequences, see also PCR products were separated on 1.5% agarose gel electrophoresis in TAE buffer (40 mM Tris-acetate, 1 mM EDTA, pH 8.0) using a 100 bp DNA ladder (GeneSpin Srl, Milan, Italy) and purified by QIAquick Gel Extraction Kit (Qiagen) according to the manufacturer's instructions. PCR amplicons were then sequenced in both strands through Applied Biosystems (Life Technologies) 3730 Analyzer (48 capillaries). Sequences produced were ~ 650 bases long in average with more than 97.5% accuracy, starting from PCR fragments. Each 18S, 28S and CO1 PCR products were aligned to GenBank using Basic Local Alignment Search Tool (BLAST) and then aligned with highly similar sequence using MultiAlin (http:// multa lin. toulo use. inra. fr/ multa lin/, see Figures S1-S7).   After 16S amplification, a PCR clean-up was done to purify the V3-V4 amplicon from free primers and primer dimer species. This step was followed by another limited-cycle amplification step to add multiplexing indices and Illumina sequencing adapters by using a Nextera XT Index Kit. A second step of clean-up was further performed and then libraries were normalized and pooled by denoising processes (Table S14), and sequenced on Illumina MiSeq Platform with 2 × 300 bp paired-end reads. Taxonomy was assigned using "home made" Naive Bayesian Classifier trained on V3-V4 16S sequences of SILVA 132 database 102 . Frequencies per feature and per sample are shown in Figures S10-S11. QIIME 2 (Quantitative Insights Into Microbial Ecology) platform 103 was used for microbiome analysis from raw DNA sequencing data. QIIME 2 analysis workflow was performed by demultiplexing, quality filtering, chimera removal, taxonomic assignment, and diversity analyses (alpha and beta).
Taxonomy BarPlot ( Figure S9) was generated through a R version 4.1.1 104 using Cairo graphics library 105 . Species diversity was estimated by i. Chao 1 index 106 , which is qualitative species-based method; ii. Shannon 107,108 and iii. Simpson 109 indices, which are quantitative species-based measures. All these indices were estimated at three taxa levels (Level 5 = Family, Level 6 = Genus, Level 7 = Species). For alpha and beta diversity, significant differences were assessed by Kruskal-Wallis test and pairwise PERMANOVA analysis, respectively. Moreover, Bray-Curtis and "un-, weighted" UniFrac metrics were used to calculate a distance matrix between each pair of samples, independently from the environmental variables.

Data availability
The full dataset of raw data was deposited in the SRA database (Submission ID: SUB8692761; BioProject ID: PRJNA683751). www.nature.com/scientificreports/