Ancient mitogenomics clarifies radiation of extinct Mascarene giant tortoises (Cylindraspis spp.)

The five extinct giant tortoises of the genus Cylindraspis belong to the most iconic species of the enigmatic fauna of the Mascarene Islands that went largely extinct after the discovery of the islands. To resolve the phylogeny and biogeography of Cylindraspis, we analysed a data set of 45 mitogenomes that includes all lineages of extant tortoises and eight near-complete sequences of all Mascarene species extracted from historic and subfossil material. Cylindraspis is an ancient lineage that diverged as early as the late Eocene. Diversification of Cylindraspis commenced in the mid-Oligocene, long before the formation of the Mascarene Islands. This rejects any notion suggesting that the group either arrived from nearby or distant continents over the course of the last millions of years or had even been translocated to the islands by humans. Instead, Cylindraspis likely originated on now submerged islands of the Réunion Hotspot and utilized these to island hop to reach the Mascarenes. The final diversification took place both before and after the arrival on the Mascarenes. With Cylindraspis a deeply divergent clade of tortoises became extinct that evolved long before the dodo or the Rodrigues solitaire, two other charismatic species of the lost Mascarene fauna.

Sequences of representatives of all extant testudinid taxa from Madagascar and Aldabra plus a 405-bp-long sequence of the extinct Malagasy species Aldabrachelys grandidieri from Austin et al. (2003) were included. For comparative purposes, representatives of two other island radiations of extant and extinct giant tortoises were added (Aldabrachelys from Madagascar, extinct, and Aldabra; Chelonoidis spp. from the Bahamas, extinct, and Galápagos plus their extant continental South American congeners): For individual lengths and voucher numbers of the Cylindraspis sequences, see Table S5. Accession numbers with asterisks indicate sequences produced for the present study.

Amplicon sequencing
For fresh samples with high molecular weight DNA, amplicon sequencing of the mitochondrial genome was conducted. For each sample, two long-range PCR reactions were performed (LR1 and LR2) yielding amplicons with an overlap of at least 106 bp and an individual length of approximately 7,100-10,450 bp, depending on the primer combination. For each long-range PCR, a 50 μl volume was used, containing 3.6-32.0 ng of DNA and 1 unit of TaKaRa LA Taq DNA Polymerase, Hot-Start Version (Clontech Laboratories Inc., Mountain View, CA, USA), and the reaction mixture recommended by the manufacturer. PCR conditions comprised initial denaturation at 93°C for 3 min, followed by 30-40 cycles of 93°C for 20 sec, 50-55°C for 30 sec, 68°C for 12 min, and a final elongation step at 68°C for 20 min. For primer sequences, amount of DNA template, number of repetitive PCR cycles, annealing temperatures, and fragment lengths see Tables S6 and S7. PCR products were visualised and, if necessary, excised from a 2% agarose gel and purified using the NucleoSpin Gel and PCR Clean-up Kit (Macherey-Nagel GmbH & Co. KG, Düren, Germany). The combined long-range PCR products covered most of the mitochondrial genome from tRNA-Phe (situated before 12S) to tRNA-Thr (situated after cyt b), missing out tRNA-Pro and the control region.
The authenticity of the long-range PCR products was verified by Sanger-sequencing part of the 12S and cyt b genes with well-established internal primers (Table S6) following standard procedures (Fritz et al. 2014). For cycle sequencing, the total reaction volume of 10 μl contained 2 μl sequencing buffer, 1 μl premix, 0.5 μM of the respective primer, 1 μl DNA template, and ultrapure H 2 O. Using the ABI Prism Big Dye Terminator v.3.1 Cycle Sequencing Kit (Applied Biosystems, Foster City, CA, USA), 25 cycles were performed at 96°C for 10 sec, 50°C for 5 sec and 60°C for 4 min. Reaction products were purified by gel filtration using the Performa DTR V3 96-Well Short Plate Kit (EdgeBio, Gaithersburg, MD, USA) and 400 µl of a 5% Sephadex solution (GE Healthcare, Munich, Germany). Sequencing was performed inhouse on an ABI 3730 Genetic Analyser (Applied Biosystems).
Authenticity of mtDNA sequenced for the present study For fresh samples, long-range PCRs and subsequent amplicon sequencing were conducted to minimize the risk of sequencing nuclear copies of mitochondrial DNA (numts; Bensasson et al. 2001;Fritz et al. 2012;Cui et al. 2013). During mitogenome assembly of Cylindraspis and fresh samples, a strict mismatch threshold of 2 was selected to prevent the integration of divergent reads of possible nuclear origin. Base frequencies of all sequences obtained for the present study, as well as their ratios of synonymous and non-synonymous substitutions, corresponded to expectations for mtDNA. In addition, protein-coding genes contained no internal stop codons, and nucleotides successfully translated into amino acids. Therefore, we are confident to have sequenced authentic mtDNA and not numts.

Annotation of the alignment of mitogenomes (45 sequences)
To facilitate partitioning (Table S8), a somewhat shorter alignment was used for phylogenetic analyses: (1) stop codons of coding genes were excluded as these do not code for any amino acid; (2) gene overlap was deleted in seven instances because these short regions could not be identified with a single gene and may have evolved differently; where necessary, adjacent codon positions also had to be deleted in order to maintain an intact reading frame; (3) alignment positions that cause frameshifts in coding regions were removed (once in ND1, twice in COI, once in ND3, once in ND4, and once in ND6); and (4) 23 intergenic spacer regions were excluded. Those regions ranged from 1-16 bp, with the exception of a 159-bp-long indel between ND5 and ND6 present only in the outgroup.
In addition, GenBank sequence JN999704 was slightly modified before usage. This partial mitogenome was assembled by 454 pyrosequencing (Lourenço et al. 2011) and contains several stretches of unknown positions (stretches of N), which were not recovered by PCR-based Sanger sequencing, as was the case for other samples in the original publication. Three issues emerged which were considered by us to be nothing more than sequencing/assembling artefacts, which consequently have been altered manually for calculations: (1) At the beginning of COI, the sequence is one triplet too short, without producing any internal stop codons. However, in accordance to the general picture of the alignment, this triplet can be divided into three individual indels, which were filled up with Ns ( Fig.  S3).
(2) There is a 37-bp-long deletion in the DNA coding for tRNA-Arg, which was filled with Ns, as the specimen would lack this tRNA otherwise. (3) At the end of cyt b, the sequence had a unique internal stop codon (4th last codon), which was amended to match the common pattern (Fig. S4).

Phylogenetic analyses
Phylogenetic relationships of the mitogenomes were inferred using RAxML 8.0.0 (Stamatakis 2014) and MrBayes 3.2.6 (Ronquist el al. 2012). The best evolutionary models (Table S9) and partitioning schemes were determined with PartitionFinder2 (Lanfear et al. 2016) and the Bayesian Information Criterion. Three different partition schemes were examined: (1) unpartitioned-1 partition; (2) gene-partitioned, i.e., 13 coding genes, 12S, 16S, and tRNAs combined-16 partitions; and (3) codon-partitioned, i.e., 13 times 3 codon positions extra, 12S, 16S, and tRNAs combined-42 partitions. For Maximum Likelihood (ML) and Bayesian Inference (BI), the codon-partitioned data set was selected. For ML, five independent searches were carried out using the GTR+G substitution model, different starting conditions, and the rapid bootstrap option. Subsequently, 1000 non-parametric thorough bootstrap replicates were calculated and the values plotted against the best tree. For BI, two parallel runs (each with four chains) were performed with 10 million generations (burn-in 0.25; print frequency 1000; sample frequency 500). Calculation parameters were analysed using Tracer 1.7.1 (Rambaut et al. 2018). The cyt b alignment was explored only with RAxML for a codonpartitioned alignment, as suggested by PartitionFinder2, using the GTR+G model. As alternatives, an unpartitioned and a 'third-codon-position-extra' scheme were tested. In addition, uncorrected p distances were calculated in MEGA7 (Kumar et al. 2016) using the pairwise deletion option.

Divergence time analyses
Molecular dating relied on the uncorrelated lognormal relaxed clock models implemented in BEAST 1.8 (Drummond et al. 2012) using a Yule tree with the HKY substitution model and four rate categories. MCMC chains ran for 20 million generations, with parameters and trees sampled every 20,000 generations. Tracer 1.6 (Rambaut & Drummond 2007) served to check for convergence of the runs using Effective Sample Sizes (ESS) of parameters, resulting in ESSs over 200 after discarding 10% of the initial trees as burn-in. Trees were summarized using TreeAnnotator 1.8 and the maximum clade credibility tree and mean node height options. Four nodes were calibrated using priors under lognormal distributions (Table S10). Following Joyce et al. (2013), the total clade of Testudinidae (= crown Testuguria) was constrained with a minimum at 50.3 Ma (i.e., the top of the Wasatchian North American Land Mammal Age) and a maximum at 100.5 Ma (i.e., the base of the Late Cretaceous). This constraint was based on the occurrence of the unambiguous pan-testudinid Hadrianus majusculus Hay, 1904 in sediments referred to the Wasatchian North American Land Mammal Age. The phylogenetic position of H. majusculus as a stem testudinid, not a crown testudinid, was recently confirmed (Vlachos & Rabi 2018). Following Kehlmaier et al. (2017) the node formed by the extant Chelonoidis carbonarius and C. denticulatus was calibrated based on the fossil C. hesternus (Auffenberg, 1971), which was collected from sediments referred to the Laventan South American Land Mammal Age. However, we establish the minimum at 11.8 Ma, not 12.55 Ma, based on the minimum published age for the Laventan, and implemented a maximum of 33.9 Ma corresponding to the base of the Oligocene, as no tortoises have been reported from South America prior to the Oligocene (de la Fuente et al. 2018).
The minimum age of crown Testudinidae was constrained at the top of the Eocene (33.9 Ma) using the late Eocene (Priabonian) Cheirogaster maurini Bergounioux, 1935 andGigantochersina ammon (Andrews, 1904), which were recently hypothesized to be nested deeply within Testudinidae (i.e., crown Geochelona) by Vlachos & Rabi (2018). Although the presence of a supracaudal scute (i.e., fused marginal scutes XII) is not necessarily a synapomorphy of this clade (see Crumly 1985versus Vlachos & Rabi 2018, it is notable that this derived character unique to tortoises has a broad presence in Europe by the end of the Eocene, which further supports the assertion that crown Testudinidae was established by then. As the maximum for this clade, the base of the Tertiary (66.0 Ma) was used because no "tortoise," stem or crown, has as of yet been reported from the Mesozoic.
Finally, the minimum age of crown Testudininae (i.e., the clade formed by all extant tortoises to the exclusion of Manouria and Gopherus) was constrained by reference to the late Eocene (Priabonian) C. maurini, which, as noted above, was recently hypothesized to be nested within the clade Geochelona (Vlachos & Rabi 2018). In contrast to other, potential, geochelonans from the late Eocene, C. maurini already exhibits the absence of a cervical scute, a character that is uniquely found in geochelonans among cryptodires. As no tortoises with derived characters have been reported from prior to the late Eocene, the maximum for this clade was established at 47.8 Ma.

Biogeographic analyses
Ancestral ranges were inferred using the Maximum Likelihood framework implemented in the R package BioGeoBEARS (Matzke 2013) and run in RASP 4 (Yu et al. 2015). BioGeoBEARS allows for estimating ancestral ranges of taxa using and comparing alternative models: The LAGRANGE dispersal-extinction-cladogenesis (DEC) model, a likelihood version of dispersalvicariance model (DIVALIKE) and a likelihood version of the BAYAREA model (BAYAREALIKE). These models make different assumptions about anagenetic and cladogenetic processes impacting the results (Matzke 2014). The best-fit model was selected based on Akaike Information Criterion scores, resulting in the selection of the DIVALIKE model. Statistics from BioGeoBEARS analysis for the three models are shown in Table S11.
For BioGeoBEARS, the time-calibrated BEAST tree and a matrix of extant geographic distributions in presence-absence format corresponding to the following 11 biogeographic areas were used to follow the geological boundary of extant continents and islands, not recent biogeographic realms: (A) Asia, without India; (B) Europe; (C) India; (D) Africa, including Arabia; (E) Madagascar; (F) North America; (G) South America; (H) Galápagos; (I) Caribbean; (J) Mascarenes; and (K) Aldabra + Seychelles. Paleogeographic differences were acknowledged using six different time bins: (i) early Eocene (> 47.8 Ma); (ii) middle + late Eocene (33.9-47.8 Ma); (iii) Oligocene (23.0-33.9 Ma); (iv) early Miocene (16.0-23.0 Ma); (v) middle + late Miocene (5.3-16.0 Ma); and (vi) Plio-Pleistocene (< 5.3 Ma). Dispersal probabilities between areas were scaled from 0 (e.g., for the dispersal to areas that were not yet formed) to 1 (connected land masses). A weighted probability matrix was established to assess the likelihood of dispersal between areas: 1 (land masses connected), 0.75 (land masses poorly connected by land or separated by minor oceanic barriers equivalent to the distance of the Galápagos Islands from South America or of Aldabra to Madagascar), 0.5 (land masses separated by large terrestrial barriers or by intermediate oceanic barriers along the currents), 0.1 (land masses not connected by land, but separated by extensive oceanic barriers along the currents), 0.01 (land masses not connected by land or direct oceanic currents), and 0 (dispersal impossible, as one of the two land masses does not exist). The values for each time bin were then scored by explicit reference to global paleogeographic reconstructions (Scotese 2013) in combination with the ages of various islands and plateaus found throughout the Western Indian Ocean (Duncan et al. 1989). The data matrix is available through Dryad https://doi.org/10.5061/dryad.08kprr4xz.

NGS data for Cylindraspis
Compared to shotgun sequencing, hybridization capture increased the content of endogenous mtDNA for the Cylindraspis sample NHM(UK) 2000.49 by two orders of magnitude, i.e., from 0.02% (readpool: 1.72 million reads) to 2.83% (readpool: 1.89 million reads; Table S12). Using hybridization capture, the historic tissue sample and seven subfossil bone samples of Cylindraspis produced mitochondrial DNA data of high quality that allowed assembling nearcomplete mitogenomes. The remaining eleven bone samples yielded patchy contigsequences with considerably lower coverage and many ambiguous sites. Yet, cyt b data of four additional bone samples were of sufficient quality to be examined. These cyt b sequences were aligned with the cyt b data of the eight Cylindraspis mitogenomes, with stop codons excluded, resulting in a length of 1,143 bp. This allowed us to compare our data for these Cylindraspis samples with cyt b data (405 bp) for 11 specimens from Austin & Arnold (2001) produced by Sanger sequencing (Table S1). For 10 specimens (NHM [UK] 2000[UK] .47-2000[UK] .49, 2000[UK] .52-2000, R3992, R4021, NMW 1461), our sequences were identical with those from Austin & Arnold (2001). However, for the specimen (NHM [UK] 2000.51) with the lowest sequence quality, 15 differences were observed. Therefore, our sequence was discarded and the GenBank sequence AF371254 from Austin & Arnold (2001) used for further analyses. For another sample morphologically identified as C. inepta (NHM[UK] R3991), Austin & Arnold (2001) could not generate a cyt b sequence. Our NGS approach yielded the complete cyt b gene for this tortoise and unambiguously identified it as a C. triserrata.

Phylogeny of Testudininae based on mitogenomes
Our phylogenetic analyses revealed five deeply divergent clades within Testudininae, one of which corresponded to Cylindraspis (Figs 1 and S5). A well-supported clade comprised of Indotestudo (Asia, India), Malacochersus (Africa), and Testudo (Africa, Europe, Asia) constitutes the sister taxon of the remaining four deeply divergent clades. The clade corresponding to Cylindraspis is sister to the remaining three deeply divergent clades. The successive sister is a clade consisting of the African genera Chersina, Chersobius, Homopus, Psammobates, and Stigmochelys. This clade is sister to the crown group formed by another two deeply divergent clades. One contains Centrochelys (Africa), Chelonoidis (South America, Galápagos, extinct: Caribbean), Geochelone (India), and Kinixys (Africa). The other clade comprises taxa from the Western Indian Ocean (Aldabra, Madagascar), i.e., Aldabrachelys, Astrochelys, and Pyxis.
The branching patterns within the clade containing Indotestudo, Malacochersus, and Testudo are not well resolved, with weak support for some nodes. Testudo is not monophyletic, as in a previous analysis of mitogenomes (Parham et al. 2006). However, two other studies using mitochondrial and nuclear DNA sequences (Le et al. 2006;Fritz & Bininda-Emonds 2007) and another one using mitogenomes (Kehlmaier et al. 2017) found Testudo monophyletic, albeit with weak support. It is obvious that resolving this contradictory pattern is beyond the scope of the present study and that further investigations are needed.
A cyt b phylogeny of giant tortoises Our cyt b data set included all available 19 sequences for Cylindraspis from the present study and Austin & Arnold (2001), i.e., three sequences for C. inepta, five for C. indica, four for C. peltastes, three for C. triserrata, and four for C. vosmaeri. The alignment included also sequences for all extant tortoise species from the Western Indian Ocean (Aldabrachelys gigantea, Astrochelys radiata, A. yniphora, Pyxis arachnoides, P. planicauda), the extinct Aldabrachelys grandidieri from Madagascar, and for another genus (Chelonoidis) with extant and extinct giant tortoise species from the Bahamas (C. alburyorum, extinct) and Galápagos (C. niger complex) plus their continental South American congeners C. carbonarius, C. denticulatus, and C. chilensis. Gopherus berlandieri served for tree rooting. In comparison to the trees derived from the mitogenomes (Fig. S5), some nodes of the ML tree based on cyt b alone (Fig. S6) were only weakly supported, and the branching pattern for the tortoise taxa from the Western Indian Ocean (Madagascar, Aldabra) was only weakly resolved. However, the monophyly of Cylindraspis received reasonable bootstrap support of 76. Le & Raxworthy (2017) doubted the authenticity of the sequences of C. triserrata produced by Austin & Arnold (2001) because they clustered in their re-analyses with Chelonoidis. However, the results of our phylogenetic analyses and the comparison of the sequences with ours provide unambiguous evidence that the short cyt b sequences produced by Austin & Arnold (2001) are authentic and that C. triserrata is a deeply divergent species of Cylindraspis (see also main text). We abstain from speculations about the results reported in Le & Raxworthy (2017). Within Cylindraspis, the placement of our 1,143-bp-long cyt b sequence of the historic specimen of C. vosmaeri (NMW 1461) clearly differed from that of three sequences from Austin & Arnold (2001), rendering C. vosmaeri paraphyletic with respect to C. peltastes. Unfortunately, the three other specimens of C. vosmaeri used by Austin & Arnold (2001) were either not available for study or yielded no results. The deep divergence of the nearcomplete mitogenomes of C. peltastes and C. vosmaeri (Fig. S5) suggests that missing information, due to the short lengths (less than 50%) of the sequences from Austin & Arnold (2001) compared to our cyt b sequence of C. vosmaeri, is responsible for the paraphyly of C. vosmaeri. Nevertheless, this situation warrants further research involving broader sampling, especially because the occurrence of two distinct giant tortoise species (C. peltastes, C. vosmaeri) on a small island like Rodrigues is unexpected and unique in comparison to other fossil and extant giant tortoises.
Compared to Cylindraspis, the sequence divergences between giant tortoise species from Galápagos (Chelonoidis niger complex) are shallow and also distinctly lower than the divergence between two subspecies of Testudo graeca (Figs S5-S8). The divergences among Galápagos tortoises resemble those observed within individual Cylindraspis species, supporting recent doubts to whether the species status is justified for the distinct giant tortoise populations from Galápagos or not (Loire & Galtier 2017;Galtier 2019).

Colonization history
The BioGeoBEARS analysis selected the DIVALIKE model as best supported (Table S11). The resulting ancestral area estimation ( Fig. S9) revealed Africa as the ancestral range of Cylindraspis (node 75) and clearly discarded Aldabra, Madagascar, the Seychelles Plateau or India as source regions (ML probability of ancestral ranges: Mascarene-Africa = 0.88; Africa = 0.1). The extant and extinct tortoises from Madagascar, the granitic Seychelles, and Aldabra represent another clade (node 60), suggesting that these land masses were not stepping stones for the dispersal process of Cylindraspis because then subsequent extinction and replacement would have to be postulated.
The place of origin for the Testudinidae (node 87) remained unresolved (Fig. S9). This is not surprising, as the testudinid lineages have a broad distribution across the Northern and Southern Hemispheres. However, the basal tortoise lineages were widely distributed only over the Northern Hemisphere (Vlachos 2018;Vlachos & Rabi 2018). Given the preponderance of basal testudinoids in the Late Cretaceous to the Paleocene of Asia, independent paleontological evidence suggests an Asian origin for the clade (e.g., Sukhanov 2000) with dispersal in the Eocene to Africa, Europe, and North America (e.g., Joyce et al. 2016) and in the Oligocene from Africa to South America (this study; see also Kehlmaier et al. 2017). On the other hand, Africa was clearly suggested as the ancestral region for the clade Testudininae (node 84).
The biogeographic history for the clade comprising Indotestudo, Malacochersus, and Testudo (= clade Testudona of Parham et al. 2006), corresponding to node 83, was not well resolved, even though Europe and Asia were inferred as most likely source regions. This is not surprising, as the clade mostly developed in the Neogene when these land masses had fully merged to form a single functional continent. The fossil record of the group is mostly restricted to these land masses as well (e.g., Sukhanov 2000;de Lapparent de Broin 2001;Danilov 2005;Brinkman et al. 2008) but is in dire need of revision.
The clade that opposes Testudona was recently named Geochelona by Vlachos & Rabi (2018), but the phylogenetic definition was worded in such a way that Cylindraspis is excluded. As these turtles can be considered recent, because they only went extinct during historic times, we here slightly modify the definition of Geochelona to refer to the most inclusive crown clade of tortoises that includes Geochelone elegans (Schoepff, 1795), but not Testudo graeca Linnaeus, 1758, corresponding to node 75 in Figure S9.
Africa and the Mascarenes were inferred by the analysis as possible ancestral areas for Geochelona, but this must be an artefact of the method, as the two land masses were never connected. Geochelona therefore likely originated in Africa. From there, the clade colonized India, Madagascar with Aldabra and the granitic Seychelles, South America with the Caribbean and the Galápagos Islands, and the Mascarenes. Except for the Mascarenes, this is in broad accordance with biogeographic models developed over the course of the last twenty years based on molecular (e.g., Le et al. 2006) or fossil (e.g., Joyce et al. 2016) evidence.
The land mass "Mascarenes" implemented in our biogeographic analysis includes all islands that were formed by the Réunion Hotspot during the last 65 Ma. The unambiguous result that Cylindraspis originated on this land mass in the Oligocene is therefore not contradicted per se by the young age of the current Mascarene Islands. The basal divergence of Cylindraspis giving rise to the C. triserrata lineage occurred approximately 28 Ma ago ( Fig. 1), which postdates the formation of the Nazareth Plateau by about 8 Ma. As this is the land mass closest to the extant Mascarenes, the radiation of the group is likely to have commenced there. The lineages of C. inepta + C. indica and C. vosmaeri + C. peltastes diverged approximately 15.5 Ma ago ( Fig. 1), which still precedes the formation of Mauritius by 7 Ma (all ages used herein taken from Duncan et al. 1989 andHargraves 1990). We therefore suggest that this divergence occurred on Nazareth as well. Thus, at least three lineages dispersed independently and subsequently from Nazareth to the Mascarenes. The oldest island, Mauritius, emerged from the sea floor about 8 Ma ago and it seems likely that it served as a stepping stone for the colonization of the much younger islands of Réunion and Rodrigues. Both are thought to have formed over the course of the last 2 million years, but the marine ridges that underlie the present-day islands are often much older. It cannot be excluded that additional small land masses were exposed on which the individual species originated. We believe that the evolution of five distinct giant tortoise species on the Mascarenes reflects a complicated interplay of multiple overseas dispersals from Nazareth to a dynamic island system in the region of the present-day Mascarenes and within the Mascarenes. The three oldest lineages, i.e., C. triserrata and the ancestral lineages of C. inepta + C. indica and C. vosmaeri + C. peltastes most likely arrived already from Nazareth, while the divergence of the latter two species pairs is thought to reflect dispersal and subsequent vicariance within the Mascarene island system.
It remains unclear whether Saya de Malha and Nazareth were reached by the ancestor of Cylindraspis via the Seychelles Plateau or directly from Africa (Fig. 2). Today Aldabra is inhabited by another genus of giant tortoises (Aldabrachelys), which once occurred also on Madagascar and the granitic Seychelles (TEWG 2015;TTWG 2017), suggesting either that the Seychelles Plateau was circumvented during the colonization process or, less likely, that Cylindraspis became later extinct there. Austin & Arnold (2001) discussed and disregarded the possibility that Cylindraspis dispersed to the Mascarenes from Southeast Asia across the entire Indian Ocean. Overseas dispersal from Southeast Asia has been considered for the Round Island boas (Bolyeriidae; Hawlitschek et al. 2017), and an origin in the Eastern Indian Ocean has been inferred for the ancestors of Mascarene stick insects (Bradler et al. 2015) and some Mascarene lizards (Nactus, Leiolopisma; Austin & Arnold 2006;Arnold & Bour 2008). An origin in Southeast Asia is made plausible for Cylindraspis by the presence of the giant fossil tortoises of the genus Megalochelys in the Plio-Pleistocene of Indonesia, the Philippines, and continental South and Southeast Asia (TEWG 2015). Megalochelys is characterized by strongly forked gulars (Setiyabudi 2009), a character also known to occur in Cylindraspis triserrata but not in the remaining Cylindraspis species (Gadow 1894;Bour et al. 2014). However, the firm placement of Megalochelys as sister to Centrochelys sulcata based on morphological evidence (Vlachos & Rabi 2018) implies that this feature evolved homoplastically in these two lineages. The sister group relationship of Centrochelys and Megalochelys also rules out the argument that the latter is related to the ancestor of Cylindraspis because then a sister group relationship of Cylindraspis and Megalochelys would be expected. However, instead the two genera belong to two deeply divergent testudinine clades (clade 1 and 3 in Fig. 1). Figure S1. Example for an assembly artefact in KJ489403-KJ489405. Shown is a 58-bp-long insertion within the cyt b gene identical for all three sequences and of presumable bacterial origin.             1  2  3  4  5  6  7  8  9  10  11  12  13  14  15  16  17  18  19  20  21  22  23  24  25  26  27  28 30  31  32  33  34  35  36  37  38  39  40  41  42  43  44  45  46  47  48  49  50  51 Table S1. Material of Cylindraspis analysed in the present study by NGS methods and by Austin & Arnold (2001)