Introduction

In the context of the worldwide mass extinction of biodiversity, the hope of rediscovering or reviving extinct wildlife has become a popular scientific fantasy1,2. So-called “de-extinction” programs, like the “mammophant” (mammoth × elephant in vitro hybrid) or the Lazarus project (cloning of the extinct platypus frog from frozen specimens), are taking this fantasy to the experimental level, following recent advances in cloning biotechnology3,4,5,6,7. Here we present a new opportunity: the preservation of the genome of an extinct species through hybridogenesis, a special reproductive mode of hybrids. In this mode, the parental genomes do not recombine in F1s: one is eliminated before meiosis, and only the other is transmitted, in a clonal manner8 (Fig. 1). Such semi-sexual reproduction is rare, but is found in several amphibians and fishes9. It is best known from European water frogs, where pool frogs (P. lessonae sensu lato) hybridize with marsh frogs (P. ridibundus s. l.) to form hybridogenetic hybrid species, also known as kleptons (P. kl. esculentus s. l.), in which the P. lessonae s. l. genome is excluded from the germline8. This group thus makes a fascinating model system to study hybridogenesis, which can have far-reaching consequences: invasive marsh frogs are currently replacing pool frogs across Western Europe10, and human-mediated introductions have led to the formation of novel hybridogenetic systems from naturally allopatric taxa, involving new klepton species11. As such, and although they benefit from a specific status, kleptons usually have poor conservation value, as they act as genetic parasites. In this report, we propose that hybridogenesis has perpetuated the lineage of an extinct clade of water frogs in Italy, and that it could be regarded as a potential means of resurrecting such extinct species.

Figure 1

Hybridogenesis in the P. kl. hispanicus complex. Pelophylax kl. hispanicus originates from hybridization between P. bergeri (B genome) and a new lineage, P. n. t. 1 (E genome, in reference to “Extinct”). It eliminates the P. bergeri genome from its germline and thus only produces P. n. t. 1 gametes. This klepton requires P. bergeri to reproduce, and their crosses yield klepton offspring only. These natural mechanisms allow the P. n. t. 1 germline to be preserved in the wild.
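The transmission rules summarized in Fig. 1 can be sketched in a few lines of Python. This is only an illustration and not part of the original analyses: genomes are reduced to single letters (B for P. bergeri, E for P. n. t. 1), the function names are hypothetical, and leaky recombination and offspring inviability are ignored.

```python
from itertools import product

def gametes(genotype):
    """Return the genome types a frog transmits to its gametes.

    Genotypes are two-letter strings: 'BB' = P. bergeri, 'BE' = klepton,
    'EE' = pure P. n. t. 1. Simplifying assumption: the klepton always
    discards its B genome and transmits the E genome clonally.
    """
    if set(genotype) == {"B", "E"}:   # hybridogenetic hybrid
        return {"E"}
    return set(genotype)              # pure parent: a single genome type

def offspring(mother, father):
    """Return the set of offspring genotypes expected from a cross."""
    return {
        "".join(sorted(a + b))
        for a, b in product(gametes(mother), gametes(father))
    }

print(offspring("BE", "BB"))  # klepton x P. bergeri -> {'BE'}
print(offspring("BE", "BE"))  # klepton x klepton    -> {'EE'}
```

Under these toy rules, klepton × P. bergeri crosses regenerate kleptons only, while klepton × klepton crosses would in principle yield pure E-genome offspring.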

The Apennine Peninsula (Italy and Swiss Ticino) is widely inhabited by pool frogs and their associated kleptons: P. lessonae with P. kl. esculentus north of the Apennine Mountains and its sister species P. bergeri with P. kl. hispanicus in the Apennines and in Sicily12 (Fig. 2). Yet, marsh frogs have never been reported over the Peninsula12,13, suggesting that either they did not contribute to its colonization, or they have since become extinct. The origin and nature of the marsh frog genomes carried by these kleptons have thus remained a mystery. By analyzing nuclear sequence data from two independent intronic regions (abbreviated SAI-1 and CMI-2) together with microsatellite data, we characterized these ancestral marsh frog germlines in a comprehensive phylogeographic framework including most Pelophylax taxa known from the Western Palearctic. We made three striking discoveries.

Figure 2

(A) Nuclear phylogeny of Western Palearctic water frogs based on two nuclear loci combined (~1,600 bp, SAI-1 + CMI-2). Branch support is shown for internal nodes. (B) Distribution of Italian lineages from our sampling. (C) Distribution of water frog species, according to the IUCN Red List and our study (dashed line: extinct range in Italy). Pelophylax c. f. bedriagae is a species complex involving multiple cryptic taxa summarized as P. bedriagae s. s., cf. 1 and cf. 2, based on SAI-1 and mtDNA variation18. For a mitochondrial phylogeny, see refs11,16. For information on the reference sequences, see Table S1 and references therein. Nearby Swiss Ticino localities were pooled for visibility. The map was built using ArcGIS 9.3 (http://www.esri.com/arcgis/about-arcgis).

First, we identified the marsh frog ancestor of P. kl. hispanicus as a new monophyletic lineage that we refer to as Pelophylax n. t. 1 (Fig. 2). This “ghost” species, widespread south of the Apennines, is supported by the SAI-1 phylogeny, which distinguishes all Western-Palearctic species (Fig. S1), and is represented by specific haplotypes for the shorter CMI-2 (Fig. S2). Some of the latter were also sampled north of the Apennines, probably as a result of past hybridization or incomplete lineage sorting among marsh frogs for this less polymorphic marker (Fig. S2; as also illustrated by unresolved P. ridibundus/P. cypriensis alleles). Potential signs of ancient admixture also arise from the microsatellite data: P. n. t. 1 is distinguished from the North-Italian marsh frog germline (see below), but a few individuals show intermediate clustering (Fig. S3). Phylogenetic analyses show that P. n. t. 1 was most closely related to Eastern-Mediterranean water frogs, namely the Cyprus endemic P. cypriensis, as well as Western Anatolian lineages of the P. c.f. bedriagae complex (Fig. 2), from which it diverged during the Pleistocene (1.5 Mya, 95% HPD 0.6-2.6; Fig. S4). This timing parallels the split between the pool frogs P. lessonae and P. bergeri north and south of the Apennines during the Calabrian (estimated at ~1.3 Mya14), a Pleistocene period characterized by extreme cold and desiccation. Nowadays, P. n. t. 1 appears to be extinct as such, and its genome only persists in the hybridogenetic hybrid P. kl. hispanicus (Fig. 1).

Second, we identified the marsh frog ancestor of the kleptons P. kl. esculentus from N-Italy/S-Switzerland as the European marsh frog P. ridibundus, which currently inhabits most of the Western Palearctic12,13 (Fig. 2, Fig. S1, Fig. S2). The presence of different marsh frog lineages south and north of the Apennines is in clear accordance with well-established biogeographic paradigms, notably for amphibians, and especially pool frogs14.

Third, we unexpectedly found a new cryptic clade of pool frogs restricted to NW-Italy and S-Switzerland (Ticino; Fig. 2, Fig. S1, Fig. S2), which diverged as early as the Late Pliocene (3.0 Mya, 95% HPD 1.2-4.9) and which we refer to as Pelophylax n. t. 2. It appears deeply admixed and panmictic with P. lessonae, as also suggested by the microsatellites (Fig. S3). This lineage might stem from long-term isolation in remote Alpine valleys (as in other herps15), before secondary contact with and introgression by P. lessonae. Alternatively, it could be of hybrid origin, resulting from leaky hybridogenesis between pool and marsh frogs, which would explain its intermediate phylogenetic position (Fig. 2), together with the absence of reciprocal mitochondrial divergence16.

Although our sequence markers discriminate all known species17,18, the current data are insufficient to accurately reconstruct the biogeography of marsh frogs of the Apennine Peninsula. Notably, it is unclear whether Pelophylax n. t. 1 speciated in the Italian Peninsula or colonized it from Central/Eastern Europe, where it would have become extinct as well. It is however unlikely that the divergences took place solely in the kleptons after hybridogenesis was initiated: P. n. t. 1 would otherwise cluster as a sister species of the P. ridibundus found north of the Apennines, which we can reject from the phylogeny (Fig. 2). The contrasting genetic structures of P. bergeri and P. kl. hispanicus (and their marsh frog germlines) also argue for an independent evolution of P. n. t. 1 (Fig. S3). However, clonality may have accelerated the evolutionary rate of this lineage, which would lead us to overestimate its divergence time. In addition, the intronic markers used could yield biased phylogenetic signals due to specific features, like retrotransposons17. Given the complex evolutionary history of Western Palearctic water frogs, involving frequent events of hybridization between species19,20, robust nuclear phylogenies and admixture analyses will require high-throughput genomic data.

If it was ever present in Italy, why this taxon disappeared remains a mystery. Water frog fossils from the Middle Pleistocene to the Holocene are found throughout Italy21, but distinguishing between potential Pelophylax residents is challenging with current biometric methods22. Yet, many of these records were attributed to marsh frogs “P. ridibundus”23 and could thus belong to P. n. t. 1.

The persistence of the genetic legacy of an extinct taxon provides exciting insights. By carrying this lineage, P. kl. hispanicus can be considered a “semi-living fossil” and deserves a high conservation value. The same is true for P. bergeri, on which P. kl. hispanicus relies for reproduction. Beyond its fascinating mechanism of genetic preservation, this system fuels the fantasy of de-extinction. Thanks to the hybridogenetic context, we envisage simple ways in which P. n. t. 1 could be resuscitated without genetic engineering. Theoretically, crosses between P. kl. hispanicus individuals would produce pure P. n. t. 1 offspring. However, most of these crosses must be sterile or inviable, otherwise marsh frogs would be naturally found in Central and South Italy. As in other water frog hybridogenetic systems, the P. n. t. 1 germline has likely degenerated following many generations of clonal transmission (ref.24 and references therein). Nevertheless, intensive controlled crossing experiments may yield a few viable offspring, i.e. those whose parental germlines have fixed different deleterious mutations. Although audacious, such efforts might be rewarding given the flexibility of amphibian development; e.g. eggs can sometimes still develop under artificial haploid parthenogenesis (notably in Pelophylax25). An alternative approach would be to cross P. kl. hispanicus with its closest living relatives, e.g. the Anatolian lineage P. bedriagae c. f. 2. Subsequent backcrossing of the resulting F1 P. n. t. 1 × P. c. f. bedriagae hybrids with P. kl. hispanicus might dilute the P. c. f. bedriagae genome while simultaneously purging deleterious mutations accumulated in the P. n. t. 1 germline. Selective breeding is a common de-extinction practice, e.g. envisaged to resurrect the aurochs26 and applied to recreate the quagga (an extinct subspecies of zebra), although the resulting individuals could not be genetically identical to the extinct form27.
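The dilution expected from such backcrossing can be illustrated with simple arithmetic. The sketch below rests on assumptions that the text does not establish: the F1 hybrids are taken to undergo ordinary recombinant meiosis, the klepton is taken to contribute only P. n. t. 1 genome at every generation, and selection is ignored. The function name is hypothetical.

```python
def donor_fraction(n_backcrosses):
    """Expected fraction of P. c. f. bedriagae genome after n backcrosses.

    Assumptions (not from the study): the F1 carries one donor genome
    (fraction 0.5); each backcross parent (the klepton) contributes a
    gamete with no donor genome; hybrids recombine normally.
    """
    fraction = 0.5                        # F1 hybrid
    for _ in range(n_backcrosses):
        fraction = (fraction + 0.0) / 2   # mean of hybrid and klepton gametes
    return fraction

for n in range(6):
    print(n, donor_fraction(n))
```

Under these assumptions the expected donor fraction halves with each generation, falling below 2% after five backcrosses.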
The implementation of such approaches would contribute to the ethical debate on reviving long-extinct species. De-extinction is questionable when it merely pursues a technological achievement28. While this would not be the case here, the responsibility of mankind in restoring a biodiversity that “naturally” vanished prior to the sixth mass extinction remains a topic of extreme controversy28. More generally, the de-extinction possibilities offered by scientific advances should not detract from the drastic measures required to protect wildlife on the verge of extinction.

This study calls for the tracking of “ghost” vertebrate lineages that have remained hidden and protected in other species of hybrid origin, through hybridogenesis or other mechanisms with potentially similar consequences (e.g. allopolyploidisation29,30). Unknown lineages underlie the genetic makeup of some polyploid plants31 and fishes32. Such taxa may be more frequent than imagined in vertebrates, especially in amphibians and fishes, where these reproductive modes have been well documented9,33.

Methods

DNA Sampling

We included 77 individuals sampled across Italy and S-Switzerland (Ticino), representing 23 localities (Table S1). Eleven samples from the different Western-Palearctic species of water frogs were also included for sequencing, to complement available data (see below). DNA was obtained from non-invasive buccal swabs or ethanol-preserved tissues, and was extracted with the Qiagen Blood & Tissue DNA extraction kit (Qiagen, Netherlands). Procedures were approved by the local and national ethics committees for animal experiments (karch Ticino) and performed in accordance with their guidelines and regulations. The mitochondrial lineage of all but three individuals was available from ref.16.

Microsatellite analyses

Given the difficulty of identifying water frogs by morphology16, individuals from the study area were genotyped at microsatellite loci with diagnostic alleles to distinguish pool frogs (here P. bergeri/lessonae) from marsh frogs (P. ridibundus s. l.), as well as from their kleptons (here P. kl. esculentus/hispanicus)10,16,34. This also confirmed the expected absence of marsh frogs (P. ridibundus s. l.).

To this end, and to infer population structure, we analyzed nine polymorphic microsatellites in 72 individuals of pool and edible frogs from the study area (loci and methods as in ref.16). Furthermore, we were able to parse the genotypes of 8 markers in 26 kleptons and reconstruct their marsh frog haplotypes, as previously done in Western Switzerland10. Both datasets (pool frogs/kleptons and phased marsh frog haplotypes) were analyzed by Principal Component Analyses (PCA) on individual genotypes, using the ade4 and adegenet packages in R. The differences between pool frogs and their associated kleptons stem from their marsh frog germlines. The genetic structure of kleptons should thus mirror the genetic structure of pool frogs if their marsh frog germlines only co-evolved in the kleptons after hybridogenesis was initiated, but not necessarily if the divergence predates hybridogenesis.
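One plausible way to reconstruct a marsh frog haplotype from a klepton genotype is to set aside, at each locus, the allele known from pool frogs and keep the remaining one. The sketch below illustrates this logic only; it is not the authors' pipeline, and the locus names, allele sizes and function name are hypothetical.

```python
def marsh_haplotype(klepton, pool_alleles):
    """Infer the marsh frog allele at each locus of a klepton.

    klepton:      {locus: (allele_1, allele_2)} for one hybrid individual
    pool_alleles: {locus: set of alleles observed in pool frogs}

    A locus is phased only when exactly one of its two alleles is absent
    from the pool frog allele set; otherwise it is left as None.
    """
    haplotype = {}
    for locus, alleles in klepton.items():
        non_pool = [a for a in alleles if a not in pool_alleles[locus]]
        haplotype[locus] = non_pool[0] if len(non_pool) == 1 else None
    return haplotype

# Hypothetical allele sizes, for illustration only
klepton = {"locus1": (102, 118), "locus2": (200, 204)}
pool_alleles = {"locus1": {102, 104}, "locus2": {200, 204}}
print(marsh_haplotype(klepton, pool_alleles))
# -> {'locus1': 118, 'locus2': None}
```

Loci where both alleles are shared with pool frogs cannot be phased this way and would have to be excluded or resolved by other means.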

Sequencing of nuclear introns

We amplified and sequenced intronic portions of the nuclear genes Serum albumin (intron 1, abbreviated SAI-1) and Cellular myelocytomatosis (intron 2, abbreviated CMI-2) in 45 (34 pool frogs and 11 kleptons) and 60 (49 pool frogs and 11 kleptons) individuals from the study area, respectively. These two markers were chosen as they have been widely used for phylogeography of water frogs and were consistently shown to discriminate between currently recognized taxa16,18,35. PCR conditions and primers are available in Table S2. We first attempted direct sequencing and then cloned heterozygous individuals. For SAI-1, we developed a set of primers specific to P. bergeri/lessonae alleles and a set of primers specific to the alleles of other taxa (Fig. S5). This allowed us to independently amplify and sequence the two alleles of kleptons that harbored a P. bergeri or a P. lessonae allele (i.e. P. bergeri/P. n. t. 1 and P. lessonae/P. ridibundus). However, kleptons with a P. n. t. 2 allele, as well as heterozygous pool frogs, had to be cloned. Cloning was performed with the TOPO-cloning kit (Invitrogen), and a minimum of eight clones per individual were sequenced.
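The benefit of sequencing at least eight clones can be illustrated with a short calculation. It assumes, as a simplification not stated in the source, that each clone independently carries either allele of a heterozygote with equal probability; amplification or cloning bias would raise the figure. The function name is hypothetical.

```python
def prob_single_allele(n_clones, p=0.5):
    """Probability that all n clones carry the same allele of a heterozygote.

    p is the per-clone probability of picking the first allele
    (0.5 under the no-bias assumption).
    """
    return p ** n_clones + (1 - p) ** n_clones

print(prob_single_allele(8))  # 0.0078125
```

Under these assumptions, fewer than 1% of heterozygotes would be mistaken for homozygotes after eight clones.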

Phylogenetic analyses

We complemented our new data with published sequences available for these two genes. For SAI-1, this included the haplotypes found across the ranges of all species16,17,18 (see Table S1 for details). For CMI-2, this included haplotypes from P. lessonae, P. bergeri and a P. ridibundus × P. bedriagae c. f. 1 hybrid from E-Turkey35. GenBank accession numbers of haplotypes analyzed in this study are provided in Table S3. Sequences were manually aligned in Seaview36. Phylogenetic analyses were performed on 59 SAI-1 and 31 CMI-2 haplotypes from 15 taxa (including P. n. t. 1 and P. n. t. 2), with the Asian P. nigromaculatus and the early-diverged European P. perezi as outgroups. Bayesian phylogenetic inference was performed with MrBayes37, with 10 million iterations and 2 chains. Evolutionary models were chosen according to jModelTest38 (SAI-1: GTR + G; CMI-2: HKY + G). We also computed haplotype networks using TCS39 to visualize haplotype variation, including indels. In addition to marker-specific trees, we built a phylogeny based on concatenated sequences representative of the variation of each species for the two markers.
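Concatenating the two markers amounts to joining the aligned sequences of each taxon end to end while recording where each marker starts and stops, so that each can keep its own substitution model. The following is a minimal sketch of that step; how representative sequences were selected is not shown, and the function name, taxon labels and toy sequences are hypothetical.

```python
def concatenate(alignments):
    """Concatenate per-marker alignments into a single matrix.

    alignments: {marker: {taxon: aligned sequence}}, in the desired order.
    Taxa lacking a marker are padded with '?' (missing data).
    Returns (matrix, partitions), where partitions maps each marker to its
    (start, end) columns, 1-based and inclusive.
    """
    taxa = sorted({taxon for aln in alignments.values() for taxon in aln})
    matrix = {taxon: "" for taxon in taxa}
    partitions = {}
    position = 0
    for marker, aln in alignments.items():
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError(f"{marker}: sequences differ in length")
        length = lengths.pop()
        for taxon in taxa:
            matrix[taxon] += aln.get(taxon, "?" * length)
        partitions[marker] = (position + 1, position + length)
        position += length
    return matrix, partitions

# Toy data, for illustration only
toy = {
    "SAI-1": {"taxonA": "ACGT", "taxonB": "ACGA"},
    "CMI-2": {"taxonA": "GG-", "taxonC": "GGT"},
}
matrix, partitions = concatenate(toy)
print(matrix)      # {'taxonA': 'ACGTGG-', 'taxonB': 'ACGA???', 'taxonC': '????GGT'}
print(partitions)  # {'SAI-1': (1, 4), 'CMI-2': (5, 7)}
```

The recorded column ranges are what a partitioned analysis would use to apply, for instance, GTR + G to the SAI-1 block and HKY + G to the CMI-2 block.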

We performed molecular dating by calibrating the phylogeny to the known split of P. cretensis at the end of the Messinian salinity crisis (~5.3-5.5 Mya)40,41. To this end, we first estimated the mutation rates of our nuclear markers from the evolution of the mitochondrial genomes published for ten species of European and Asian water frogs, which provide a well-resolved phylogeny42. Analyses were conducted in BEAST using a Yule-process prior, an uncorrelated lognormal relaxed clock43 and different partitions for each sequence set. This calibration involved a normally distributed prior at 5.4 Mya (95% HPD 4.8-5.9) for the split of P. cretensis. Then, we applied the molecular clock to the nuclear dataset using the estimated mutation rates with normally distributed priors covering their 95% HPD (SAI-1: 0.0056 (0.0027-0.0112); CMI-2: 0.0113 (0.0026-0.0286)). We ran two independent chains of 20 million iterations each (with the first 2 million excluded as burn-in) and used the Tracer module to check for convergence and effective sample size of parameters. We built a time-calibrated phylogeny from the BEAST runs with the module TreeAnnotator and the R package phyloch44.
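To illustrate how an interval such as the one given for the P. cretensis calibration translates into the parameters of a normal prior, the sketch below derives a mean and standard deviation from the interval bounds. It is not the authors' BEAST configuration: it assumes the interval is the central 95% of a symmetric normal distribution, and the function name is hypothetical.

```python
from statistics import NormalDist

def normal_from_interval(lower, upper, coverage=0.95):
    """Return (mean, sd) of the normal distribution whose central
    `coverage` interval is [lower, upper]."""
    z = NormalDist().inv_cdf(0.5 + coverage / 2)  # about 1.96 for 95%
    mean = (lower + upper) / 2
    sd = (upper - lower) / (2 * z)
    return mean, sd

mean, sd = normal_from_interval(4.8, 5.9)
print(round(mean, 2), round(sd, 2))  # 5.35 0.28
```

The interval midpoint (5.35 Mya) is slightly below the stated central value of 5.4 Mya, so the reported bounds are not exactly symmetric around the prior mean; a standard deviation of roughly 0.28 My should therefore be read as an approximation.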