Genetic evidence against monophyly of Oniscidea implies a need to revise scenarios for the origin of terrestrial isopods

Among the few crustacean taxa that managed to inhabit terrestrial environments, Oniscidea includes the most successful colonizers in terms of species richness and abundance. However, neither morphological traits nor molecular markers have definitively resolved phylogenetic relationships among major Oniscidea clades or established the monophyly of the taxon. Herein, we employed the highly conserved, nuclear protein-coding genes Sodium-Potassium Pump (NAK) and Phosphoenolpyruvate Carboxykinase (PEPCK), along with the traditionally used 18 s and 28 s ribosomal RNA genes, in an attempt to clarify these questions. Our dataset included sequences representing all major Oniscidea clades and closely related aquatic taxa, as suggested by previous studies. We applied Bayesian Inference and Maximum Likelihood methods and produced a robust and fully resolved phylogenetic tree that offers strong evidence against the monophyly of Oniscidea. The amphibious genus Ligia appears to be more closely related to representatives of marine suborders, while the phylogenetic pattern of the remaining Oniscidea implies a complex history of the transition from the marine environment to land. With the exception of the basal clade, all other established major clades have been recovered as monophyletic, even though relationships within these clades call for a revised interpretation of morphological characters used in terrestrial isopod taxonomy.


Results
Extracted DNA concentration was >15 ng/μl in all cases, with the A260/A280 purity rate over 1.5. Attempts to amplify and sequence all targeted loci were successful for almost all samples. The final compiled aligned dataset after Gblocks treatment consisted of 1,984 base pairs (bp). The initial alignment lengths and numbers of conserved, variable and parsimony-informative sites are shown in Table 1 for all sequenced loci separately. Among the tested models, the highest Akaike weight values, indicating the best fit to data, were exhibited by TIM2ef + I + G for 18 s, TIM3 + G for 28 s, TIM2 + I + G for NAK, and GTR + G for PEPCK.
Prior to calculation of genetic divergence, available sequences were grouped at the suborder level and those of Oniscidea were further grouped into the five known major subclades. Ligia specimens were grouped separately from the rest of the Diplocheta, as they appear to form a separate clade on the produced phylogenetic tree (Fig. 5). Genetic distances between examined taxa appeared to be constantly higher for ribosomal genes compared to the protein-coding ones. Genetic variation ranged between 6.6-30.2% in the case of 18 s, 33.3-71.6% for 28 s, 16.7-30.6% for NAK and 19.3-29.5% for PEPCK. The minimum and maximum genetic divergence values were not constantly found between the same groups for all genetic markers. More specifically, the maximum genetic distance was found between Tylida-Crinocheta, Sphaeromatidae-Crinocheta, Asellota-Valvifera and Asellota-Crinocheta, whereas the minimum values were identified between Asellota-Phreatoicidea, Tylida-Mesoniscus, Ligia-Sphaeromatidae and Valvifera-'Diplocheta' (excluding Ligia) in the case of 18 s, 28 s, NAK and PEPCK genes, respectively. All within-and between-group p-distances are given in Supplementary Material.
The Bayesian Inference (BI) and Maximum Likelihood (ML) trees exhibited largely congruent topologies. Nevertheless, in some cases, high BI posterior probabilities did not coincide with high ML bootstrap values (>80). This can be attributed to the fact that, in contrast to BI, the ML method implemented in available softwares (e.g. RAxML, PhyML, IQ-TREE) perceives gaps (−) and missing data (given as N or? in DNA alignments) as  Table 1. Aligned bases length, before and after GBlocks treatment (for ribosomal genes), conserved, variable and parsimony-informative sites for all genes used in this study. unknown characters that do not provide additional information for the resolution of phylogenetic relationships. Two out of four targeted loci are coding rRNAs whose three-dimensional structure is dependent on highly conserved regions which are interrupted by variable regions accumulating mutations, including indels. These regions are not under strong evolutionary pressure and, hence, mutations can explain the occurrence of gaps in final alignments. On the other hand, the BI approach takes into account insertion and deletion events that contain phylogenetically useful information. Therefore, only the BI tree is presented herein (Fig. 5). Holoverticata (sensu Schmidt 1 ) is recovered as a well-supported clade, containing the traditionally recognised sub-clade structure: Crinocheta and Synocheta form two well-supported, monophyletic sister clades, and Microcheta is the intermediate clade of these and the more basal, monophyletic Tylida. Nevertheless, Diplocheta (hence, also Ligiidae) appear to be polyphyletic, with Ligia being the sister taxon of Valvifera + Sphaeromatidea, and the genera Ligidium Brandt, 1833, Tauroligidium Borutzky, 1950 and Typhloligidium Verhoeff, 1918, traditionally grouped in Ligiidae, forming a well-supported monophyletic group, as the sister clade of Holoverticata. The monophyly of Oniscidea as currently defined is questioned, and could be saved if Ligia is excluded from the taxon. The basal position of Colubotelson Nicholls, 1944 (Phreatoicidea) and Asellus Geoffroy, 1762 (Asellota), as well as the statistically supported retrieval of Valvifera and Sphaeromatidae within the 'Onisicdea' clade, indicates the closer relationship of terrestrial isopods with these two suborders. Phylogenetic relationships inside Crinocheta also show some interesting patterns with important implications for oniscidean taxonomy. Porcellionidae form a well-supported clade with Trachelipodidae and part of Agnaridae (as the latter appear to be polyphyletic), while Armadillidiidae, traditionally considered sister-group of the Porcellionidae, is grouped with representatives of other families (e.g., Cylisticidae and part of Agnaridae). Also, Platyarthrus Brandt, 1833 and Trichorhina Budde-Lund, 1908, presently included in the family Platyarthridae, do not seem to be related, and the representative of the most diverse family Armadillidae appears in a more basal position within Crinocheta.
Within Synocheta, the monophyly of Trichoniscidae is not supported, as Styloniscus Dana, 1852, type-genus of Styloniscidae, seems to fall within the former. Moreover, no support for the monophyly of the subfamilies Trichoniscinae and Haplophthalminae could be found.

Discussion
This is the first time that nuclear protein-coding genes are used to resolve phylogenetic relationships among major groups of Oniscidea. The fact that this study is so far the only one that produced a fully resolved and robust molecular phylogeny of all five major oniscidean clades, proves the advantages of using these markers. NAK has been used before 25 in terrestrial isopod phylogenetics, but at a lower taxonomic level. Of course, given the depth of phylogeny attempted herein, the use of mitochondrial genes, with their high mutation rates and, hence, saturation effects, is not appropriate 26 . Also, the use of untreated nuclear ribosomal genes sequences, such as of 18 s and/or 28 s, might have led to biased or insufficiently supported results, as they contain regions that evolve at very different rates. Gblocks treatment was recruited to overcome possible issues that may arise due to the properties of these regions. Herein, we managed to produce a robust and sufficiently inclusive phylogeny of terrestrial isopods www.nature.com/scientificreports www.nature.com/scientificreports/ using a more reliable data set of nuclear DNA markers. This phylogeny has important implications for oniscidean systematics, as it undermines the validity of several morphological characters traditionally used in terrestrial isopod taxonomy. The transition of isopods from the marine to the terrestrial environment might also need to be revisited in light of the new evidence.
A number of unique adaptations to terrestrial life have led authors to assume that Oniscidea underwent only one transition from marine to land 2,6,27 . However, the low number of studies using molecular data in the past failed to confirm the monophyly of Oniscidea 15,16 , but also failed to provide a consistent phylogenetic pattern 28,29 . According to the results of our analysis, the monophyly of Oniscidea, as currently defined, is not supported, since the genus Ligia, generally considered as con-familiar with Ligidium and a small number of other related taxa, none of which exploit littoral environments, appears to be a closer relative of a group of marine isopods, such as the Valvifera and Sphaeromatidae. The monophyly of Oniscidea could be saved if Ligia is excluded. The assumed synapomorphies of 'Ligiidae' , such as the residual maxillipedal segment at the back of the cephalon, are rather symplesiomorphies, as has been previously suspected 1 . Ligidium and related genera of the polyphyletic family Ligiidae could be assigned to a new family (we propose Ligidiidae, from the most speciose genus Ligidium) that can be more safely defined by more reliable synapomorphies, such as the shape of the uropods with the endopod inserted distally compared to the exopod (cf. Figs. 1B and 2B). The genus Ligidioides Wahrberg, 1922 (not included in our analysis) has a uropod more similar to that of Ligia, i.e., with the insertions of the endopod and exopod at the same level 30 , and might remain in the family Ligiidae, but this has to be investigated by a future molecular analysis that also includes this genus. Lins et al. 16 came to similar conclusions regarding the relationships of Ligia with marine taxa, but these authors did not include other Ligiidae in their analysis, so they could not discuss the monophyly of the family. A common evolutionary history of the mitochondrial genomes of Ligia and Idotea Fabricius, 1798 was highlighted also by Kilpert and Podsiadlowski 31 . The high genetic divergence between Ligia and Ligidium was also evident from their distant position in the phenetic tree presented by Michel-Salzat & Bouchon 15 . Our findings are in agreement with all of these studies, a fact that further corroborates our hypothesis.
In view of the new phylogeny, the critical question regarding the transition from the marine environment to land should be addressed by taking into account the ecology of species in the major clades and, most importantly, the fact that the relevant event(s) happened sometime in the middle or even lower Mesozoic 27 , so that a large number of crucial forms might have been extinct without leaving any fossils of ancestral lineages. In fact, the oldest fossil Oniscidea are much younger and consist of highly derived forms 32 , while coastal marine or amphibious forms of animals that do not have hard skeletons, shells or teeth, are rarely fossilized anyway.
Considering that: (a) the most basal clade (Diplocheta, excluding Ligia) consists of freshwater-related taxa, (b) the subsequent clade (Tylida) includes taxa mostly living along marine coasts (even though the genus Helleria is fully terrestrial), and with a divergent morphology compared to other Oniscidea (at least regarding the form of cephalon, the distinct epimera on most thoracic segments, and the unique type of respiratory structures on pleopods, not connected to those of other taxa, see Fig. 3), and (c) Microcheta are fully terrestrial (albeit dependent on very high humidity) and they exhibit an overall morphology closer to that of the more derived Oniscidea (see Fig. 4), one might consider revisiting scenarios regarding the transition of isopods form the marine environment to land. Even though most Ligia species are amphibious, there are some species that live inland [33][34][35][36] . This means that we might envision a similar but independent transition that led to the common ancestor of 'Ligidiidae' , given that this group consists today of species mostly living in close connection to freshwater. On the other hand, Tylidae might represent another transition, since they exhibit many characters that are difficult to recreate via a plausible transformation series from Diplocheta-type characters (cf. Figs. 1, 2 and 3). If this proves true, the next clade, Microcheta, which is basal to all Orthogonopoda, connected to very humid, freshwater-related habitats and with a more differentiated morphology than Tylida in many characters (cf. Figs. 3 and 4), would represent a third invasion to land, maybe using a freshwater path. Of course, this would undermine the actual monophyly of Oniscidea.
On the basis of current evidence, this is only a tentative hypothesis that has to be evaluated through careful elaboration of physiological traits and, hopefully, further fossil findings. Obviously, the very old origins of the Oniscidea 27 , coupled with the difficulty of fossilization of these organisms, might have led to the permanent loss of crucial information from several basal clades representing possible direct ancestors of terrestrial forms. The phylogenetic reconstruction based on modern forms cannot recover such extinct clades, except in the case of some exceptional, but highly unlikely, fossils being found in the future.
The monophyly of Crinocheta and Synocheta seems to be unambiguous. The hypothesis by Tabacaru and Danielopol 11 that Synocheta is a sister taxon with Mesoniscidae cannot be supported. The phylogenetic relationships inside the two major clades reveal that certain morphological characters that have been considered important in oniscidean taxonomy, such as the type and form of pleopodal lungs, the ornamentation of tergites or the shape of uropods, might not be very useful. In particular, Porcellionidae and Armadillidiidae, even though they seem to share a similar type of pleopodal lung, at least in comparison with that in Trachelipodidae, appear to belong to distant clades; the former related to Trachelipodidae and part of Agnaridae (the monophyly of which is not supported), and the latter to Cylisticidae and other families. This is in agreement with the recent findings by Dimitriou et al. 25 . In turn, Cylisticidae appears to be closer to Armadillidiidae, even though they have styliform uropods. Within Synocheta, the traditional distinction between Trichoniscinae and Haplophthalminae, based largely on the presence of ornamentation on tergites, does not seem to be supported since Calconiscellus Verhoeff, 1927, a member of Haplophthalminae, appears to be the sister-taxon of Caucasonethes Verhoeff, 1932 and nested within other genera of Trichoniscinae. Furthermore, the status of Styloniscidae as a separate family from Trichoniscidae is also undermined. More detailed analyses, using more extensive taxonomic sampling inside these clades, are necessary to clarify these issues.
The closer relationship of terrestrial isopods with Valvifera and Sphaeromatidae than with Asellota or Phreatocidea, revealed by our analysis, agrees with the hypothesis of Brusca and Wilson 10 . www.nature.com/scientificreports www.nature.com/scientificreports/ In conclusion, Oniscidea should not be considered monophyletic. Systematics in this very old group, which presents an amazing case of animal invasions to land, are in urgent need of extensive revision, taking into account robust molecular evidence. New techniques, such as whole genome sequencing, transcriptomics and ultra-conserved elements, should be applied to the whole range of terrestrial isopod taxa, in order to resolve the complete phylogenetic history of the group and shed light on crucial questions regarding the evolution of terrestriality in this taxon. Modern terrestrial isopoda is probably the only animal taxonomic group lower than Class that includes representatives of most steps of the transition from aquatic environments to almost all terrestrial environments, despite the presumed large number of extinct forms 37 . Furthermore, considering the fact that these animals have evolved structures analogous to the complex organs of terrestrial vertebrates, such as lungs (pleopodal lungs) and the placenta 4 (marsupial, egg-feeding 'cotelydons'), a detailed phylogenetic reconstruction can provide invaluable information on many exciting aspects of evolutionary biology, but also physiology, behaviour, ecology, and several other fields.

Methods
Sample collection. Using both field collecting, deposited and loaned material, we compiled a data set including 34 Oniscidea species, representing 30 genera and 14 families. Moreover, non-Oniscidea specimens of Valvifera (Idotea), Sphaeromatidea (Sphaeroma Bosc, 1801) and Asellota (Asellus) were also included. Colleagues that kindly sent us material are mentioned in the Acknowledgements. Freshly collected specimens, as well as the majority of available museum specimens were placed in 96% ethanol until further laboratory procedures, but we also managed to retrieve genetic data from specimens preserved in 70% alcohol for a relatively long period. Detailed information about specimens is given in Table 2.
Amplification of targeted loci. Total genomic DNA was extracted from available specimens using a DNeasy Blood and Tissue Kit (Qiagen, Hilden, Germany), following the manufacturer's proposed protocol. Quality and quantity control of extracted DNA was performed with NanoDrop 2000/200c (Thermo Fisher Scientific Inc., USA). The final concentration was measured in ng /μl and purity was verified with A260/A280nm absorption ratio.
The non-coding nuclear genetic markers 18 s and 28 s, and the protein-coding Sodium-Potassium Pump (NAK) and Phosphoenolpyruvate Carboxykinase (PEPCK) genetic loci were targeted with common PCR procedures using gene specific primers. Desired regions were successfully amplified using 18Aimod/700 R primer pair for 18s 38 , 28sa/28 sb for 28s 39 , NAK for-b/NAK rev 2 or NAK for-b/NAK 638 R for NAK 24,25 and PEPCKfor/ PEPCKrev 24 and the newly designed PEPCK 545 R (5′-CCRAAGAANGGYSTCATNGC-3′) for PEPCK. All PCR reactions were carried out in a Veriti thermal cycler (Applied Biosystems, USA). Taking into account the genetically diverse samples, we used a touchdown PCR approach to eliminate aspecific products and save time, opposed to using multiple reactions, specific for different taxa. This way we managed to increase specificity, sensitivity and yield 40 . In each case, the final reaction volume was adjusted to 20 μl, including 0.5 U of Kapa Taq DNA Polymerase, 3 mM MgCl 2 , 1X of Kapa PCR buffer A, 0.3 mM dNTP (Kapa) 0.3 µM of each primer and >20 ng of DNA template. The reactions' thermal profile followed Dimitriou et al. 25 . Amplicons were purified with a Qiaquick Purification Kit (Qiagen, Germany) following the proposed instructions. The final products were sent for sequencing of both DNA strands at Macrogen facilities (Amsterdam, The Netherlands). Data processing. CodonCode Aligner (v. 3.7.1; CodonCode Corp., USA) was used to manually inspect chromatograms, generate assemblages and make edits, where necessary. Our final dataset also included sequences of additional Ligia spp. and Colubotelson thomsoni Nicholls, 1944 (Phreatoicidea) retrieved from NCBI GenBank. The latter was included to serve as an additional outgroup. In the case of the genus Ligia, apart from the data generated in the framework of the present study, a chimeric sequence combining data from all targeted genes from the congeneric species L. oceanica (Linnaeus, 1767), L. hawaiensis (Dana, 1853) and L. exotica Roux, 1828 was included in our analyses. In this way, we manage to verify the phylogenetic position of the genus in the produced tree in a robust way. Accession numbers of all sequences used herein are given in Table 2. Sequences from each targeted gene were separated in different files and multiple sequence alignments were performed using MAFFT v.7 41 . MEGA v.6 42 was used to calculate genetic distances for each alignment. Relatively longer sequences with no overlapping fragments for the majority of the samples were trimmed prior to further data elaboration.
Given that ribosomal genes consist of multiple conserved and flanking hypervariable regions, related to their functional three-dimensional structure after gene expression, alignment might be challenging 43 . In order to test the sensitivity of produced alignments and remove possible poorly aligned regions for 18 s and 28 s genes, we used Gblocks v0.91b 44 through the Gblocks server available at http://molevol.cmima.csic.es/castresana/Gblocks_ server.html. The analysis was run allowing smaller final blocks, less strict flanking and gap positions. The positive effects of removing divergent and ambiguously-aligned blocks in phylogenies are discussed by Talavera and Castresana 45 . phylogenetic analyses. The optimal nucleotide substitution model for each loci was selected according to Akaike's Information Criterion (AIC) 46 48 and RAxML-NG web server 49 respectively.
The concatenated data set was fed as partition blocks to MrBayes. Bayesian Inference analysis was run with the selected model of nucleotide evolution for each gene, under the default settings for within-partition among-site rate variation, allowing rate heterogeneity between partitions. BI, applying Metropolis-coupled Markov Chain Monte Carlo algorithms, was set to run four independent times with eight chains per run for 20 million generations and a sampling frequency of 100. Stationarity and convergence among runs, were ensured by monitoring the average standard deviation of split frequencies of the four simultaneous and independent runs in MrBayes.  Table 2. Species, locality of origin and GenBank accession numbers of individuals used in the molecular phylogenetic analyses. (√ will be replaced with accession numbers when available).