Molecular diversity of benthic ctenophores (Coeloplanidae)

Coeloplanidae, the largest family of benthic ctenophores, comprises 33 species, all described based on traditional morphological characteristics, such as coloration, length, and number of aboral papillae, which are highly variable and can be affected by fixation methods and environmental conditions. Thus, there is a need for reliable genetic markers to complement the morphological identifications at the species level. Here, we analyzed 95 specimens from 11 morphologically distinct species of benthic ctenophores from the Red Sea and Sulu Sea, and tested selected regions of four genetic markers (ITS1, 18S rRNA, 28S rRNA and COI) for their ability to differentiate between species. We show that the barcoding region of the mitochondrial gene cytochrome oxidase subunit I (COI) is highly variable among species of Coeloplanidae and effectively discriminates between species in this family. The average Kimura-2-parameter (K2P) distance between species-level clades was 10%, while intraspecific variation was ~30 times lower (0.36%). The COI-based phylogeny supported the delineation of four recently described new species from the Red Sea. The three nuclear markers tested were too conserved to separate species. We conclude that COI is a potential molecular barcode for the family Coeloplanidae and suggest testing it in pelagic ctenophores.

18S rDNA interspecific sequence divergences (p-distances) of ctenophore species from the family Coeloplanidae. Each species is represented by a single sequence, except C. bannwarthi, C. fishelsoni, C. lineolata and V. multiformis, which are represented by two individuals each. *C. lineolata and C. punctata were considered to belong to the same clade.
rDNA, the Internal Transcribed Spacer 1 (ITS1) and the barcoding regions of the mitochondrial gene COI. We tested the validity of these genes as molecular markers using various species from the Gulf of Aqaba (Red Sea) that were recently described 2 , as well as two additional unidentified species collected off northern Borneo (Sulu Sea). Despite its relatively low genetic divergence rate in ctenophores, the nuclear marker 18S rDNA was first analyzed allowing us to verify that our specimens are indeed ctenophores from the Coeloplanidae family. We chose to amplify the C1-D2 domain of the nuclear 28S rDNA because this marker has been found to be phylogenetically informative in other animal groups (e.g., Porifera) 12 . The hyper-variable ITS1 had already been sequenced for a few specimens 7,13 and is commonly used as an alternative barcoding marker in several groups 14 , when higher resolutions of genetic relationships are examined (i.e., species delineation, population genetics). The last marker tested was the mitochondrial COI gene, which is known as the universal barcode, commonly used in molecular systematic studies.

Results
Phylogenetic analyses of the nuclear 18S rDNA marker. The final alignment of the 18S rDNA sequences contained 1,746 positions, of which 1,712 were constant, 32 were variable and 27 were parsimony-informative. The average p-distance between Coeloplana species was 0.03 ± 0.007% SE, ranging between 0.0-0.21%. This marker could not differentiate between species from the genus Coeloplana (Fig. 1, Table 1). The average p-distance between genera (i.e., Coeloplana vs. Vallicula) was 1.5 ± 0.03% SE.
Phylogenetic analyses of the nuclear 28S rDNA marker. The final alignment of the C1-D2 domain of the 28S rDNA sequences contained 721 positions, of which 647 were constant, 70 were variable and six were parsimony-informative. The average p-distance between Coeloplana species was 0.44 ± 0.056% SE, ranging between 0-0.84%. The average p-distance between genera (i.e., Coeloplana vs. Vallicula) was 9.5 ± 1% SE. Although the 28S rDNA marker was more variable than the 18S rDNA, it could not differentiate between several Coeloplana species (Fig. 2, Table 2).
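The alignment summaries reported for each marker (constant, variable and parsimony-informative positions) can be illustrated with a minimal Python sketch. This is not the procedure used in the study; the function name `classify_sites` is hypothetical, and the handling of gaps and ambiguous bases (ignored column by column) is an assumption. Parsimony-informative positions are counted here as a subset of the variable ones.

```python
from collections import Counter


def classify_sites(alignment):
    """Count constant, variable and parsimony-informative columns.

    alignment: list of aligned sequences (strings of equal length).
    Assumption: gaps and ambiguous bases are ignored within each column.
    """
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("sequences must be aligned to the same length")

    constant = variable = informative = 0
    for column in zip(*alignment):
        # Tally only unambiguous nucleotides in this column.
        counts = Counter(b.upper() for b in column if b.upper() in "ACGT")
        if len(counts) <= 1:
            constant += 1
            continue
        variable += 1
        # Informative: at least two states, each present in >= 2 sequences.
        if sum(1 for n in counts.values() if n >= 2) >= 2:
            informative += 1
    return {"constant": constant, "variable": variable,
            "informative": informative}
```

For a toy alignment such as `["AAGT", "AAGC", "ATGC", "ATGT"]`, the second and fourth columns are both variable and parsimony-informative, while the first and third are constant.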
Phylogenetic analyses of the nuclear ITS1 marker. The final alignment of the ITS1 sequences contained 397 positions, of which 243 were constant, 130 were variable and 94 were parsimony-informative. The average p-distance between Coeloplana species was 4.7 ± 1.05% SE, ranging between 0-10%. The average p-distance between genera (i.e., Coeloplana vs. Vallicula) was 28.7 ± 3% SE. Although ITS1 is considered to be a hyper-variable marker, it could not differentiate between several valid Coeloplana species. For example, C. lineolata, C. fishelsoni and C. bannwarthi have identical ITS1 sequences (Fig. 3). This marker, however, differentiated well between the two coeloplanid genera (Fig. 3, Table 3).
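The p-distances reported for the nuclear markers are simply the proportion of aligned sites at which two sequences differ. The following Python sketch illustrates the calculation; the function name `p_distance` is hypothetical, and skipping sites with gaps or ambiguous bases (pairwise deletion) is an assumption rather than a documented setting of the analysis.

```python
def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences.

    Assumption: sites where either sequence has a gap or an ambiguous
    base are skipped (pairwise deletion).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    compared = differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a != b:
            differences += 1

    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared
```

Multiplying the returned value by 100 gives the percentage form used in the text; two identical sequences, as reported for some Coeloplana species at ITS1, yield a distance of 0.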
Phylogenetic analyses of the mitochondrial COI marker. The final alignment of the barcoding region of the COI gene contained 657 positions, of which 375 were constant, 282 were variable, and 259 were parsimony-informative. All COI sequences were heavily AT-biased, with an average A+T content of 72.25 ± 2.25%. Both maximum likelihood (ML) and Bayesian analyses yielded similar tree topologies with 15 well-supported clades (Fig. 4). The distinction between Coeloplana punctata Fricke, 1970 and C. lineolata Fricke, 1970, both associated with soft corals of the genus Sarcophyton, was not supported by our molecular data. The average Kimura-2-parameter (K2P) distance between Coeloplana species was 10 ± 0.36% SE. The minimal interspecific distance was 2.3% (between C. loyai from mushroom corals in the Red Sea and Coeloplana sp. 3 collected from a mushroom coral in the Sulu Sea, Fig. 5), while the maximal intraspecific distance was 0.36%. This mitochondrial marker successfully differentiated between all Coeloplana species analyzed, except for C. punctata and C. lineolata (see Discussion; Fig. 4, Table 4). The average K2P distance between the genera, Coeloplana and Vallicula, was 32.1% ± 2.2% SE.
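For readers unfamiliar with the K2P distance, it corrects the observed proportion of differences for multiple substitutions while treating transitions (P) and transversions (Q) separately: d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)]. The Python sketch below illustrates this standard formula; the function name `k2p_distance` is hypothetical, and the pairwise deletion of gapped or ambiguous sites is an assumption.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a, seq_b):
    """Kimura 2-parameter distance between two aligned sequences.

    Assumption: sites with gaps or ambiguous bases are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    compared = transitions = transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        # A<->G and C<->T are transitions; everything else is a transversion.
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1

    if compared == 0:
        raise ValueError("no comparable sites")

    p = transitions / compared
    q = transversions / compared
    term1 = 1 - 2 * p - q
    term2 = 1 - 2 * q
    if term1 <= 0 or term2 <= 0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log(term1 * math.sqrt(term2))
```

Because of the correction, the K2P distance is always at least as large as the uncorrected p-distance for the same pair of sequences, and the difference grows with divergence.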

Discussion
Though debated due to several limitations [15][16][17] , DNA barcoding is a useful tool for species identification and the discovery of new species 8,18,19 , especially when integrated with morphological taxonomy 17,[20][21][22] . We show that COI is a variable marker within Coeloplanidae and can thus be used to identify species in this family. Moreover, our results support the designation of four new Red Sea Coeloplana species that were recently described by Alamaru et al. 2 .

Suitability of the various genetic markers to distinguish between Coeloplanidae species. The 18S rDNA sequences and the C1-D2 domain of the 28S rDNA were not variable enough and failed to differentiate among species within the family Coeloplanidae, while successfully separating the two Coeloplanidae genera (Coeloplana and Vallicula), in agreement with the 18S rDNA phylogeny presented by Podar et al. 7 for the phylum Ctenophora. However, the insufficient number of ctenophore 28S rDNA sequences currently available in public databases did not allow us to test the utility of this marker for other groups from this phylum. The ITS1 marker, usually considered to be hyper-variable 7, 14, 23 , was not variable enough to differentiate between some Coeloplana species, in agreement with the recent work of Simion et al. 24 , who found that the ITS1 region of ctenophores is relatively conserved and can be easily aligned, even between distantly related ctenophore taxa. In addition, because ITS1 in some species includes more than one microsatellite region, it is challenging to sequence using the Sanger sequencing method. Indeed, these polymeric repeats induced in vitro slippage errors during amplification and sequencing reactions, thus hampering the determination of the sequences. Furthermore, in a few cases, intra-individual variation appears to occur, affecting the reliability of this marker for identification of species, as paralogous copies may be compared rather than orthologous ones 25 . We thus conclude that the ITS1 marker is not suitable for large taxonomic surveys.
In contrast, we show that the mitochondrial COI sequences have a higher divergence rate than the hyper-variable ITS1 marker, in agreement with the extremely fast evolution rate of the mitochondrial genomes of the pelagic ctenophores Mnemiopsis leidyi A. Agassiz, 1865 9 and Pleurobrachia bachei A. Agassiz, 1860 10 . Our results show that benthic ctenophores of the family Coeloplanidae also present a high mitochondrial evolution rate, resulting in an average K2P distance of 10% between species. This high interspecific variation, along with ~30-times lower mean intraspecific variation (0.36%), emphasizes COI as an effective DNA barcode in ctenophores. Moreover, a barcoding gap 15 was observed for all species analyzed in this study, except for the C. punctata and C. lineolata clade. The suitability of the COI gene should be further verified by analyzing additional COI sequences from various ctenophore species and populations and by considering independent nuclear markers. The only exception to this high interspecific divergence is the 2.3% distance between C. loyai collected from the mushroom coral Ctenactis echinata in the Gulf of Aqaba and an unidentified ctenophore (Coeloplana sp. 3) collected from another mushroom coral, Cycloseris costulata, off northern Borneo in the Sulu Sea 26 (Fig. 5). Due to the poor state of preservation of the Sulu Sea sample, we could not identify it based on its morphology, and therefore cannot conclude at this stage whether the sample collected from C. costulata represents a different species or a member of a different population of C. loyai. The species analyzed in the framework of this study show very low intraspecific genetic variability for the mitochondrial COI marker. Asexual reproduction, which is known to occur among benthic ctenophores 27 , may thus play an important role in their life history 28, 29 .
Molecular support for the identification of four recently described Coeloplana species. Alamaru
Cryptic diversity in benthic ctenophores. One of the samples originally identified as C. fishelsoni based on its morphology 2 (sample 2011-3 collected from Xenia) showed more than 7% sequence divergence compared to the other five C. fishelsoni samples. This sample (accession number KT885976) clustered closer to C. anthostella and C. huchonae in the COI-based phylogeny, rather than with other C. fishelsoni sequences. This pattern was also observed in the ITS1 tree (accession number KT885963), though this should be considered with caution since the presence of intra-individual variation may affect the phylogenetic results. The same was observed for a sample originally identified as C. bannwarthi (sample number 2000-11 collected from the sea urchin D. setosum). Based on COI sequences (accession number KT886018), this specimen presented more than 3% sequence divergence compared to other C. bannwarthi specimens, and thus belongs to a different clade, a pattern also observed in the ITS1 tree (accession number KT885962). As the two diverging samples of C. fishelsoni and C. bannwarthi displayed the morphology of the described valid species, we attribute the molecular differences to possible cryptic species diversity. These samples were thus considered as separate clades in all genetic analyses and were labeled as C. fishelsoni var. and C. bannwarthi var. in the phylogenetic trees. We also found that an unidentified specimen with green dots (Coeloplana sp. 4) formed a distinct clade in the COI tree, suggesting that it is another unidentified Coeloplana species. Although this specimen exhibited some similarities to C. punctata (i.e., identical host and pattern of multiple dots across the entire body), it differed in the color of the dots (green versus brown). These results are currently supported by a single mitochondrial marker (COI), and, even in the absence of a stop codon, we cannot exclude the amplification of a nuclear mitochondrial DNA segment (numt). Additional genetic markers, as well as careful morphological evaluation, will therefore be necessary to substantiate these cryptic species.
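The barcoding gap referred to in the COI discussion above is present when the smallest distance between specimens of different species exceeds the largest distance between specimens of the same species. A minimal Python sketch of such a check follows; the function name `barcoding_gap`, the input layout (a dictionary of specimen labels and a dictionary of pairwise distances keyed by sorted specimen pairs) and any values passed to it are illustrative assumptions, not the data or software used in this study.

```python
from itertools import combinations


def barcoding_gap(labels, distance):
    """Return (max_intraspecific, min_interspecific, gap_present).

    labels:   dict mapping specimen id -> species label.
    distance: dict mapping (id_a, id_b), with id_a < id_b, -> distance.
    Both layouts are hypothetical and chosen for illustration only.
    """
    intra, inter = [], []
    for a, b in combinations(sorted(labels), 2):
        d = distance[(a, b)]
        if labels[a] == labels[b]:
            intra.append(d)
        else:
            inter.append(d)

    if not inter:
        raise ValueError("at least two species are required")

    max_intra = max(intra) if intra else 0.0
    min_inter = min(inter)
    # A gap exists when the two distributions do not overlap.
    return max_intra, min_inter, min_inter > max_intra
```

Under this definition, divergent specimens carrying the same morphological label (such as the C. fishelsoni and C. bannwarthi variants described above) would close the gap unless they are treated as separate clades, which is how they were handled in the analyses.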

Synonymy of the species Coeloplana lineolata and Coeloplana punctata. Our analysis of COI sequences supported the majority of previously designated species based on morphological features (i.e., classical taxonomy). However, COI does not differentiate between C. lineolata and C. punctata, which are currently accepted as valid species originally described by Fricke 30 from Madagascar. Sequences of these two species cluster into a well-supported clade (Fig. 4) with a very small K2P distance of 0.15%, well within the range of COI intraspecific divergence observed for coeloplanids (Table 4). In contrast, ITS1 sequences of C. lineolata and C. punctata were extremely variable. However, as the support for the ITS1 phylogeny was generally low, these results are not reliable.
In some cases, it was challenging to morphologically differentiate C. lineolata from C. punctata. Indeed, the parallel lines in contracted individuals appeared as dots in relaxed individuals (Fig. 6). When species identification was ambiguous, the specimens were categorized as "brown dots" or Coeloplana sp. 1, which in later COI molecular analysis clustered with the C. lineolata and C. punctata clade (Figs 4 and 6). Our combined phylogenetic analysis and morphological observations suggest that there is no support for the designation of two different species. Additional molecular data using different markers and samples from the type locality will be needed to validate this conclusion.
Sulu Sea species. The samples collected from the sea star Echinaster sp. in the Sulu Sea clustered into one clade with 100% bootstrap support. Based on photos of the sampled specimens (Fig. 7), we suggest that this clade represents C. astericola Mortensen 1927. The average K2P distance between Coeloplana specimens sampled from Echinaster sea stars and Coeloplana specimens sampled from Sarcophyton corals was 5.3%, suggesting that these two clades belong to different species. Coeloplana sampled from Sarcophyton corals off Borneo (Coeloplana sp. 2) may either belong to one of the two un-sequenced species known to live on Sarcophyton (C. wuennenbergi Fricke, 1970 or C. mellosa Gershwin, Zeidler and Davie, 2010) or constitute a completely new species. Unfortunately, the ethanol fixation of the samples for molecular analysis caused major morphological deformities, thus precluding a morphological description, as well as assignment to either a valid or a new species. Further sampling and inspection of live material would resolve their taxonomic status.

Conclusions
Based on our results, we conclude that COI is a suitable barcode for benthic ctenophores from the family Coeloplanidae. We suggest testing the utility of this mitochondrial marker on other families and orders in the phylum Ctenophora. COI may prove to be especially useful in the delineation of pelagic ctenophore species known to be very fragile and challenging to preserve. It is possible, however, that the traditional Folmer primers 31 might not be suitable because ctenophores seem to have extremely high rates of mitochondrial evolution. Our results support the designation of four new Coeloplana species recently reported 2 . Based on the COI phylogenetic reconstruction and on previous morphological descriptions, we suggest that C. punctata and C. lineolata might belong to the same species. We conclude that the Coeloplana sampled from Sarcophyton corals off Borneo could constitute a new, undescribed species. Comprehensive morphological examination of this species requires further sampling using adjusted fixation protocols. Our molecular results suggest the presence of cryptic benthic ctenophore species in the Red Sea. Finally, we found no indication of cospeciation between Coeloplana species and their hosts. Our molecular study indicates that Coeloplana is a highly diverse genus, which can be effectively differentiated into species using the COI marker. As this group is cryptic and poorly studied, we assume that many species remain to be described.

Methods
Collection and observation. Benthic 32 . In the Red Sea, sampling was done mainly at night, as most benthic ctenophore species were easier to locate due to their extended tentacles and better contrast with the background water, whereas off Borneo, ctenophores were sampled during daytime dives. Some specimens were collected together with their hosts, and dislodged from them later in the lab using a pipette with a gentle stream of sea water. In other cases, the ctenophores were dislodged from their hosts in situ using a small pipette. Each ctenophore encountered was photographed in situ. For each specimen collected, the date, site, depth, and host were recorded. Because morphological examination of fixed material is inherently difficult, each collected live specimen was inspected in the lab and photographed using a high-resolution camera mounted on a stereoscope. Field circumstances did not allow for this procedure to be followed off Borneo. The species identification was conducted based on all existing Coeloplanidae literature, as previously reviewed 2 . Following identification and documentation, whole specimens were preserved in 95% EtOH for molecular analysis.
DNA sequencing. Genomic DNA was extracted from individual ctenophores preserved in 95% EtOH using the Qiagen Blood & Tissue kit (Venlo, Netherlands) according to the manufacturer's instructions. Genomic DNA was used for PCR amplification of four genetic markers (for details and primer sequences see Table S2).
All PCR reactions were performed on a TProfessional Basic (Biometra, Göttingen, Germany) in 25 µl total reaction volume containing 2 µl of DNA template (~50 ng), 2.5 µl of 10X ExTaq TM buffer, 2 µl of dNTPs supplied with ExTaq kit (2.5 mM each), 0.2 µl of TaKaRa ExTaq TM polymerase (5 units/
Sequence alignment. New sequences generated in this study were aligned with available sequences of platyctenids from public databases (Table S1). For the rRNA and ITS datasets, alignment was performed under the L-INS-I algorithm of MAFFT v7.017 33 as implemented in Geneious 6.1.8 (www.geneious.com). For the COI dataset, a translation alignment was performed with the available CDSs using the same algorithm and program.
Phylogenetic analysis. Phylogenetic analyses were performed for each gene dataset separately using both the ML and the Bayesian criteria. ML analyses were performed with RAxML v8.0.26 34 under the GTRGAMMA model. Specifically, the tree searches were conducted with 100 runs. Branch supports were computed based on 1,000 slow bootstrap replicates. In addition, the ML analysis of the COI gene was performed using a codon partition.
Bayesian analyses were performed with MrBayes version 3.2.6 35 under the GTRGAMMAI model. For each dataset, two runs with four chains each were conducted, with default temperatures and default prior distributions. The chains were run for 10,000,000 generations and sampled every 100 generations. Model parameters were allowed to be optimized independently for each codon position partition. Convergence was achieved before 2,500,000 generations for all markers (i.e., standard deviation of split frequencies was verified to have reached 0.009). The first 2,500,000 generations were thus discarded for all markers (burn-in), and the Bayesian consensus was computed based on 150,000 trees.
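The number of trees entering the Bayesian consensus follows directly from the run settings given above. The short Python sketch below reproduces this arithmetic; the function name `retained_trees` is hypothetical, and the calculation assumes that the tree sampled at generation 0 is not counted.

```python
def retained_trees(generations, sample_every, burnin_generations, runs):
    """Number of post-burn-in trees pooled across independent runs.

    Assumption: the initial sample at generation 0 is ignored.
    """
    if burnin_generations > generations:
        raise ValueError("burn-in cannot exceed the chain length")
    samples_per_run = (generations - burnin_generations) // sample_every
    return samples_per_run * runs


# Settings taken from the text: 10,000,000 generations, sampling every
# 100 generations, 2,500,000 generations of burn-in, two runs.
total = retained_trees(10_000_000, 100, 2_500_000, 2)
print(total)  # 150000
```

Each run therefore contributes 75,000 sampled trees, which matches the 150,000 trees stated for the consensus.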
Inter- and intraspecific genetic variabilities were computed using MEGA6 36 . For the rRNAs and the ITS datasets, pairwise p-distances were computed between each pair of sequences (each species was represented by a single sequence) (Tables 1-3). For the COI gene, both average pairwise K2P distances and average p-distances were calculated (Table 4). Variance estimates were computed using 1,000 bootstrap replicates.
Data Availability. All data generated or analyzed during this study are included in this published article and its Supplementary Information files.