Introduction

Hybridization creates exceptional challenges in conservation biology. It is known that hybridization can have an important role in the evolution of many organisms; however, it can also cause irreversible damage, namely the extinction of species (Allendorf et al., 2001). Usually these two contrasting effects are associated with direct or indirect human interference with natural processes. One of the main consequences of hybridization is the introgression of genes from one species to another, which can result in the extinction of native gene pools. The amount of introgressive hybridization can be exceptionally increased when hybridization is mediated by humans, namely when contacts between wild and domesticated counterparts are promoted because of habitat loss of the wild species.

The current situation of the European wildcat (Felis silvestris silvestris) is a remarkable example of the consequences of anthropogenic hybridization. The survival and conservation of indigenous populations of the European wildcat might be locally threatened by introgressive hybridization with feral domestic cats (Felis silvestris catus). Over the last decade, the genotyping of highly polymorphic molecular markers (specifically microsatellites, also known as short tandem repeats, STRs) and partial mitochondrial DNA sequences, combined with new Bayesian statistical tools, has radically improved the knowledge on wildcat population genetics and admixture with the domestic cat (for example, Beaumont et al., 2001; Randi et al., 2001; Pierpaoli et al., 2003; Kitchener et al., 2005; Lecis et al., 2006; Oliveira et al., 2008a, b; O’Brien et al., 2009; Hertwig et al., 2009; Eckert et al., 2010; Mattucci et al., 2013). Domestic cats were domesticated from African wildcat (F. s. lybica) ancestors ~10 600 years ago (Vigne et al., 2012), and since then wild and domesticated forms have remained fully interfertile (Robinson, 1977; Ragni and Possenti, 1996). Hybridization between wildcat subspecies is thought to have begun when feral domestic cats started their expansion across the range of wildcats (Driscoll et al., 2009), and has thus perhaps been occurring for several thousands of years in some regions. In some areas, where taxa boundaries are probably maintained and wildcat populations are less fragmented, introgression may be minimal (for example, Pierpaoli et al., 2003; Kitchener et al., 2005; Lecis et al., 2006). However, in particular historical or ecological conditions, widespread admixture might produce hybrid swarms, likely leading to the genetic extinction of the wildcat parental populations (Allendorf et al., 2001; Beaumont et al., 2001; Brumfield, 2010; Fitzpatrick et al., 2010). European wildcats have apparently experienced both extremes. Wildcats in Scotland and Hungary show widespread hybridization and deep genetic introgression with domestic cats (Beaumont et al., 2001; Lecis et al., 2006), whereas only sporadic hybridization or no detectable introgression has been observed in Italy, Iberia and northeast France (Pierpaoli et al., 2003; Lecis et al., 2006; Oliveira et al., 2008a, b; O’Brien et al., 2009).

Although wildcat and domestic cat hybridization has been addressed in several studies, detecting hybrids and introgressed individuals, and understanding the causes limiting or favoring introgression, are still complex and controversial issues affecting wildcat research and conservation. The recent and intricate domestication of cats may imply an overall shallow differentiation between the domestic and wild subspecies, and thus the detection of hybridization between these forms is expected to be demanding. Combinations of markers, such as microsatellites and mitochondrial DNA (mtDNA; Driscoll et al., 2011), have improved hybrid detection; however, resolution remains limited, particularly when several generations of backcrossing are involved. The development of a larger suite of molecular tools, applicable to both invasive and noninvasive samples, is essential to increase the power of admixture analysis, which is mandatory for adequate conservation planning for European wildcat populations.

High-throughput technologies have improved genomic resources, such as single-nucleotide polymorphism (SNP) arrays and sequence assemblies, and have enabled the genome-wide genotyping of several species, notably domesticated species and their wild relatives (for example, wolf, vonHoldt et al., 2010; bison, Pertoldi et al., 2010; and bighorn sheep, Poissant et al., 2010). The European wildcat is an example of such a ‘genome-enabled’ taxon (Kohn et al., 2006), benefitting from the cross-species applicability of domestic cat data. Specifically, the recent sequencing of the domestic cat genome (Pontius et al., 2007; Mullikin et al., 2010; Montague et al., 2014; Tamazian et al., 2014), which has included SNP discovery in the African wildcat subspecies (Felis silvestris cafra), provides useful reference data for the discovery of new nuclear markers for assessing the introgression of domestic cat genes into their wild counterparts. Nussberger et al. (2013) recently described a set of 48 nuclear SNPs for identifying European wildcats, domestic cats and their admixed progeny. However, that work used a limited number of SNPs and reference samples solely from Switzerland.

Here we examined the power of anonymous SNPs ascertained in the domestic cat to estimate the depth of introgression in conspecific wildcats sampled from several European populations. We identified the minimum number of highly divergent SNPs needed for accurate admixture analyses, hybrid identification and individual assignment to the wild or domestic parental populations. We expect that the proposed panel of SNPs, in combination with those already available in the recent literature, will provide an easier, standardized, cheaper and more accurate methodology to assess hybridization between domestic cats and European wildcats. Accurate estimates of introgression and of the level of hybridization are crucial for prioritizing conservation efforts for European wildcat populations.

Materials and methods

European wildcats and domestic cats

Morphologically identified wildcats (Schauenberg, 1969, 1970; Ragni and Possenti, 1996) were selected from the ISPRA (Istituto Superiore per la Protezione e la Ricerca Ambientale) and CIBIO/UP (Centro de Investigação em Biodiversidade e Recursos Genéticos, Universidade do Porto) tissue bank collections (F. s. silvestris, n=130), taking into consideration the natural distribution of the European wildcat and the fragmentation of its populations in Europe (Pierpaoli et al., 2003). Sampling was performed across diverse European geographic localities by randomly selecting a few available samples from each location (Table 1; Figure 1). Five known wildcat × domestic cat hybrids obtained in captivity were included in the analyses (Pierpaoli et al., 2003). Domestic cats (F. s. catus, n=139) living in regions sympatric to the wildcats but in urban areas were also genotyped (Table 1; Lipinski et al., 2008; Kurushima et al., 2012). DNA from tissue samples was extracted as described by Pierpaoli et al. (2003). Buccal swabs from domestic cats from Cyprus were obtained from the Cyprus Malcolm Cat Sanctuary and were prepared as previously described (Kurushima et al., 2012). DNA from putative European wildcats and captive-bred hybrids was whole-genome-amplified according to the manufacturer’s recommendations using the REPLI-g Midi Kit (Qiagen Inc, Hilden, Germany).

Table 1 Information and number (N) on the cats used for SNP analyses of introgression
Figure 1 Sampling locations of putative European wildcats. Shaded areas correspond to the approximate current distribution of F. silvestris in Europe (adapted from Grabe and Worel, 2001).

SNP genotyping

A total of 158 SNPs from 18 cat autosomes (n=154) and the X chromosome (n=4) were selected and used to genotype all cats. Most of the SNPs (134) were randomly selected from Kurushima et al. (2012), including 48 in intragenic regions, and were opportunistically applied to this study. Two phenotypic SNPs associated with cat coat colors, TYR (Siamese points) and TYRP1 (brown), were also examined (Lyons et al., 2005a, b). An additional 22 SNPs in intragenic regions were genotyped, including the following: (i) nine that revealed at least one polymorphic position between European wildcats and domestic cats (Johnson et al., 2006); (ii) one for which high variability was known among domestic cats (CCR2; Esteves et al., 2007); (iii) one considered to be leopard cat (CAT; Prionailurus bengalensis) species-specific; (iv) four on the X chromosome; and (v) seven randomly selected intragenic SNPs (Supplementary Table 1). The latter seven random intragenic SNPs were selected at the same time and with the same criteria as in Kurushima et al. (2012). Briefly, SNPs were previously selected from the 1.9 × coverage sequence of the Abyssinian cat (Pontius et al., 2007), being heterozygous in the Abyssinian, within nonrepetitive regions and dispersed across the chromosomes. Only the SNPs that had strong Phred-like scores (>25) and were ~8 Mb apart in the genome assembly were chosen. Further selection of SNPs included proper and robust design for the GoldenGate assay using Illumina design tools (Gentrain score >0.55). Finally, each SNP had a call rate >80% and a minor allele frequency >5% across the entire data set, which is considered the inherent error rate in the assay efficiency.
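The final call-rate and minor-allele-frequency filter can be sketched as follows. This is only an illustration of the two thresholds quoted above, not the BeadStudio workflow used in the study; the genotype encoding (0, 1 or 2 copies of one allele, None for a failed call), the function names and the toy data are assumptions, and the sketch treats every locus as diploid (autosomal).

```python
from typing import Optional, Sequence

# Assumed encoding: number of copies of one allele (0, 1, 2); None = failed call.
Genotype = Optional[int]


def call_rate(genotypes: Sequence[Genotype]) -> float:
    """Fraction of individuals with a successful genotype call."""
    return sum(g is not None for g in genotypes) / len(genotypes)


def minor_allele_frequency(genotypes: Sequence[Genotype]) -> float:
    """Frequency of the rarer allele among the called genotypes."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return 0.0
    freq = sum(called) / (2 * len(called))
    return min(freq, 1.0 - freq)


def passes_qc(genotypes: Sequence[Genotype],
              min_call_rate: float = 0.80,
              min_maf: float = 0.05) -> bool:
    """Apply the two thresholds from the text: call rate >80% and MAF >5%."""
    return (call_rate(genotypes) > min_call_rate
            and minor_allele_frequency(genotypes) > min_maf)


# Toy loci (hypothetical data, not from the study).
snp_ok = [0, 1, 2, 1, 0, 1, 2, 0, 1, None]               # 9/10 called, both alleles common
snp_rare = [0] * 19 + [1]                                 # MAF = 1/40 = 0.025
snp_failed = [0, 1, None, None, None, 2, 1, None, 0, 1]   # call rate = 0.6

print([passes_qc(s) for s in (snp_ok, snp_rare, snp_failed)])
```

Only the first toy locus passes both thresholds; the second fails on allele frequency and the third on call rate.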

GoldenGate assay amplification and BeadXpress reads were performed following the manufacturer’s protocol (Illumina Inc, San Diego, CA, USA) on 50–500 ng of DNA or whole-genome-amplified product (Kurushima et al., 2012). The BeadStudio software v. 3.1.3.0 with the Genotyping module v. 3.2.23 (Illumina Inc.) was used to analyze the data. This software provides automated genotype calling and quality-control features to assess reproducibility and Mendelian consistency.

Statistical analysis

Summary statistics were used to describe levels of genetic variability and differentiation in the wild and domestic subspecies. Minor allele frequency was calculated with FSTAT v. 2.9.3.2 (Goudet, 2001). To avoid any bias resulting from the inclusion of hybrid genotypes among the representatives of European wildcats, we performed comparative analyses with all domestic cats against the 82 putatively purest European wildcats (Table 1). All wildcats from Hungary and Scotland were excluded from these first analyses because of their high admixture proportions, presumed from morphology and inferred from genetic data (Beaumont et al., 2001; Daniels et al., 2001; Pierpaoli et al., 2003; Lecis et al., 2006). The excluded individuals (48, see Table 1) were afterwards added to the data set for hybridization analyses. Significance of deviations from Hardy–Weinberg equilibrium, and the observed (HO) and expected (HE) heterozygosities (unbiased, Nei, 1978), were calculated for all locus–population combinations using Markov chain exact tests in ARLEQUIN 3.5.1.2 (Excoffier and Lischer, 2010), with a chain length of 100 000 and 3000 dememorization steps. FSTAT 2.9.3.2 was used to compute the Wilcoxon signed-rank test to evaluate differences in HE between wild and domestic cats, accounting for differences in sample size (Goudet, 2001). Allelic richness (Ar) was computed for each group following a rarefaction method that compensates for uneven sample sizes, as implemented in the software HP-Rare 1.0 (Kalinowski, 2005). ARLEQUIN 3.5.1.2 (Excoffier and Lischer, 2010) was used to perform an analysis of molecular variance (AMOVA) of pairwise FCT (wildcat versus domestic groups) for each polymorphic locus, testing the null hypothesis of no differentiation by permuting genotypes between populations (10 000 replicates; P<0.001). AMOVA was also used for testing the HE difference between the wild and domestic cat groups. Average values were calculated for autosomal SNPs alone.
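As an illustration of the heterozygosity measures named above, the following sketch computes HO and the sample-size-corrected HE for one biallelic locus. It is not ARLEQUIN's implementation: the genotype encoding (0, 1 or 2 copies of one allele), the function name and the toy data are assumptions, and the correction factor 2n/(2n-1) is the usual form of the unbiased estimator attributed to Nei (1978).

```python
def heterozygosities(genotypes):
    """Return (H_O, unbiased H_E) for one biallelic locus.

    genotypes: list of 0/1/2 copies of one allele per individual
    (assumed encoding, no missing data).
    """
    n = len(genotypes)
    p = sum(genotypes) / (2 * n)          # frequency of the counted allele
    q = 1.0 - p
    h_obs = sum(g == 1 for g in genotypes) / n
    # Expected heterozygosity with the small-sample correction 2n / (2n - 1).
    h_exp = (2 * n / (2 * n - 1)) * (1.0 - p**2 - q**2)
    return h_obs, h_exp


# Toy locus (hypothetical data): 8 individuals, allele frequency 0.5.
toy_locus = [0, 0, 1, 1, 1, 2, 2, 1]
h_obs, h_exp = heterozygosities(toy_locus)
print(round(h_obs, 3), round(h_exp, 3))
```

For the toy locus, half of the individuals are heterozygous, and the corrected HE is slightly above the uncorrected value of 0.5 because of the small sample.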

The 158 SNPs were ranked for hybridization diagnostic value by computing (i) In (informativeness for assignment); (ii) Ia (informativeness for ancestry coefficients); and (iii) optimal rate of correct assignment, using INFOCALC (Rosenberg et al., 2003; Rosenberg, 2005). For each locus, average ranking values were determined. Moreover, the probability of identity was estimated with a correction for small sample size (PIDunbiased; Paetkau et al., 1998) and the equivalent probability for a pair of siblings (PIDsib; Waits et al., 2001) with GenAlEx 6.41 (Peakall and Smouse, 2006). These values were used to estimate the minimum number of loci required for describing unique individual genotypes.
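The ranking by informativeness for assignment can be sketched as below. This is a minimal illustration of the In statistic as commonly written following Rosenberg et al. (2003), with equal weight for each source population and natural logarithms; it is not the INFOCALC code, and the locus names and allele frequencies are hypothetical.

```python
import math


def xlogx(x):
    """x * ln(x), with the convention 0 * ln(0) = 0."""
    return x * math.log(x) if x > 0 else 0.0


def informativeness_for_assignment(freqs_by_pop):
    """I_n for one locus.

    freqs_by_pop: one allele-frequency list per population, all in the
    same allele order, e.g. [[0.9, 0.1], [0.2, 0.8]] for two populations.
    """
    k = len(freqs_by_pop)
    n_alleles = len(freqs_by_pop[0])
    total = 0.0
    for j in range(n_alleles):
        mean_p = sum(pop[j] for pop in freqs_by_pop) / k
        total += -xlogx(mean_p) + sum(xlogx(pop[j]) for pop in freqs_by_pop) / k
    return total


# Hypothetical loci: [wildcat frequencies, domestic cat frequencies].
loci = {
    "locus_fixed": [[1.0, 0.0], [0.0, 1.0]],
    "locus_divergent": [[0.9, 0.1], [0.2, 0.8]],
    "locus_shared": [[0.5, 0.5], [0.5, 0.5]],
}
ranked = sorted(loci,
                key=lambda name: informativeness_for_assignment(loci[name]),
                reverse=True)
print(ranked)
```

With two populations the statistic ranges from 0 (identical frequencies) to ln 2 (a fixed difference), so loci with strongly divergent frequencies rise to the top of the ranked list.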

Individual assignment and admixture analyses

To assign cats to populations and to test for admixture, 158 SNP genotypes from 274 cats were evaluated using two Bayesian clustering procedures. Assuming two main populations, European wildcat and domestic cats (K=2), 10 independent runs of the Bayesian-based software STRUCTURE 2.3.3 (Pritchard et al., 2000; Falush et al., 2007; Hubisz et al., 2009) were computed. For each run, the average proportion of membership (Q) of the sampled populations and the distribution of individual membership proportions (qi) to the two inferred clusters, with their 90% credibility intervals (CIs) were assessed. All computations were performed using the admixture model with correlated allele frequencies either without prior nongenetic information or considering the domestic cats as reference samples. Runs consisted of a burn-in of 10^5 cycles and 10^6 Markov chain Monte Carlo iterations, and were averaged using CLUMPP version 1.1.1 (Jakobsson and Rosenberg, 2007) with the FullSearch algorithm and the G′ pairwise matrix similarity statistics. Average assignments were plotted using DISTRUCT 1.1 (Rosenberg, 2004). The Bayesian model-based method implemented in the software NEWHYBRIDS (Anderson and Thompson, 2002) was further applied to classify cats into discrete hybrid classes. NEWHYBRIDS estimates the posterior probability that individuals fall into each of six genotypic classes corresponding to hybrid categories (Hi): parental subspecies (domestic or wild), F1, F2 and the backcrosses. Uniform priors were chosen to downweight the influence of an allele that might be rare in one species and absent in the other. Ten independent runs were performed to test for stability.

The power of all SNPs to detect different hybrid classes was assessed by analysing the assignment accuracy obtained for simulated genotypes. One hundred multilocus genotypes of each parental (wildcat × wildcat; domestic cat × domestic cat), F1 (wildcat × domestic cat), F2 (F1 × F1) and backcross (F1 × wildcat; F1 × domestic cat) category were generated with the software HYBRIDLAB v1.0 (Nielsen et al., 2006, but see also Oliveira et al., 2008b) and afterwards analysed using STRUCTURE and NEWHYBRIDS under the same settings as the admixture analysis described above. The qi threshold values for all analyses were established as the minimum value for which all parental domestic cats could be correctly assigned. A complementary analysis was performed using a combined data set that included the simulated genotypes (600) plus the observed genotypes that displayed admixed genetic assignments, or for which molecular assignments contradicted their prior morphological identifications, in the hybridization analyses of STRUCTURE and NEWHYBRIDS. Analyses of the observed genotypes, the simulated genotypes and both kinds combined prompted the elimination from all subsequent analyses of 18 putative wildcat samples, which were most likely included in the wildcat sampling group because of incorrect morphological identifications. Accordingly, the preliminary Bayesian inferences were re-run for the new data set of 256 cats, including 139 random-bred domestic cats, 112 putative European wildcats and five known hybrids. In addition, Bayesian analyses of simulated genotypes were performed for the best estimated minimum number of SNPs (n=35) that allows accurate evaluation of hybridization in individual cat samples (n=256).
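The logic of generating parental, F1, F2 and backcross genotypes can be sketched with a simple random-mating simulation. This is an illustrative stand-in and not HYBRIDLAB itself, whose sampling scheme may differ in detail; the function names, the genotype encoding (0, 1 or 2 copies of one allele per locus), the assumption of unlinked loci and the toy parental genotypes are all assumptions made for the example.

```python
import random


def gamete(genotype, rng):
    """Draw one allele per locus from a diploid genotype (loci assumed unlinked)."""
    return [1 if g == 2 else 0 if g == 0 else rng.randint(0, 1) for g in genotype]


def cross(pop_a, pop_b, n, rng):
    """Simulate n offspring, each from one random parent in pop_a and one in pop_b."""
    offspring = []
    for _ in range(n):
        gamete_a = gamete(rng.choice(pop_a), rng)
        gamete_b = gamete(rng.choice(pop_b), rng)
        offspring.append([a + b for a, b in zip(gamete_a, gamete_b)])
    return offspring


rng = random.Random(1)  # fixed seed so the sketch is reproducible

# Toy parental groups (hypothetical): four loci with fixed differences.
wild = [[2, 2, 2, 2]] * 5
domestic = [[0, 0, 0, 0]] * 5

f1 = cross(wild, domestic, 100, rng)      # wildcat x domestic cat
f2 = cross(f1, f1, 100, rng)              # F1 x F1
bx_wild = cross(f1, wild, 100, rng)       # F1 x wildcat
bx_dom = cross(f1, domestic, 100, rng)    # F1 x domestic cat

print(f1[0], f2[0], bx_wild[0], bx_dom[0])
```

With fixed differences, every F1 is heterozygous at all loci, whereas backcross genotypes mix heterozygous loci with loci homozygous for the recurrent parent's allele. This is why backcrosses are harder to separate from parentals when only a few loci are typed.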

Results

SNP variability

The SNP genotype call rate was more than 80% per individual cat in all analysed cat samples (n=274). Descriptive statistics are presented in Supplementary Tables 1 and 2. All SNPs were polymorphic among domestic cats (minor allele frequency >5%, to ensure resolution within the domestics and avoid those unique to the Abyssinian). However, 22 SNPs (13.92%) were monomorphic among the wildcats, including the two phenotypic SNPs for TYR and TYRP1. Significant deviations from Hardy–Weinberg equilibrium, following Bonferroni correction (P<0.00016), were detected in 16 SNP loci, 11 among the domestic population and 5 in the wildcat group. Although none of the 158 loci had alternative private alleles, a large proportion of SNP variability was significantly partitioned between wildcat and domestic cats (average FCT=0.427; AMOVA P<0.001), with single-locus FCT pairwise values ranging between 0 (ChrA3_159537633; ChrF2_78303221) and 0.891 (ChrE2_34027888). European wildcats proved to be significantly less variable than domestic cats, both at average values of expected heterozygosity (HE(FCA)=0.340; HE(FSI)=0.107; P<0.001) and Ar (Ar(FCA)=1.738; Ar(FSI)=1.250; P<0.001). Exceptions to the lower variability of wildcats were found at 18 SNPs, for which wildcats exhibited higher HE than domestic cats (Supplementary Tables 1 and 2). Ten SNPs had two to four times higher heterozygosity in wildcats, and five of the ten showed significant deviations from Hardy–Weinberg equilibrium.
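The reported Bonferroni threshold and the proportion of monomorphic SNPs can be checked with simple arithmetic. The nominal significance level of 0.05 and the count of 316 tests (158 loci in each of two groups) are assumptions inferred from the reported value rather than stated in the text.

```python
# Assumed inputs: nominal alpha and number of locus-population tests.
alpha = 0.05
n_tests = 158 * 2                  # 158 loci x 2 groups (assumption)

bonferroni_threshold = alpha / n_tests
monomorphic_share = 22 / 158 * 100  # SNPs monomorphic among wildcats, in percent

print(f"{bonferroni_threshold:.5f}")
print(f"{monomorphic_share:.2f}%")
```

Under these assumptions the corrected threshold rounds to the 0.00016 quoted above, and 22 of 158 SNPs corresponds to the stated 13.92%.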

The average informativeness scores of each locus (INFOCALC—Supplementary Table 2) revealed that SNPs with lowest values of HE in both groups displayed the highest values of genetic differentiation and top rank numbers, as they represented high frequencies of the two possible alternate variants. For increasing SNP combinations based on the loci ranked list, P(ID)unbiased and P(ID)sibling at P<0.001 were simultaneously obtained using 35 loci (Table 2). These 35 top-ranked SNPs had an average pairwise FCT=0.74 (P<0.001).
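The idea of adding ranked loci until the probability of identity drops below 0.001 can be sketched as follows. The per-locus formulas are the standard uncorrected expressions for a biallelic locus (for unrelated individuals and for siblings), not the sample-size-corrected estimator computed in GenAlEx; multiplying across loci assumes the loci are independent, and the function names and toy allele frequencies are hypothetical.

```python
def pid_unrelated(p):
    """Probability of identity for two unrelated individuals at a biallelic locus."""
    q = 1.0 - p
    return p**4 + q**4 + (2 * p * q) ** 2


def pid_sibs(p):
    """Probability of identity for two full siblings at a biallelic locus."""
    q = 1.0 - p
    s2 = p**2 + q**2
    s4 = p**4 + q**4
    return 0.25 + 0.5 * s2 + 0.5 * s2**2 - 0.25 * s4


def loci_needed(allele_freqs, pid_func, target=0.001):
    """Number of loci (taken in the given order) needed for cumulative PID < target.

    Returns None if the target is not reached with the loci supplied.
    """
    cumulative = 1.0
    for count, p in enumerate(allele_freqs, start=1):
        cumulative *= pid_func(p)
        if cumulative < target:
            return count
    return None


# Toy example (hypothetical): 40 loci, each with allele frequency 0.5.
toy_freqs = [0.5] * 40
print(loci_needed(toy_freqs, pid_unrelated), loci_needed(toy_freqs, pid_sibs))
```

Because siblings share alleles by descent, the sibling probability falls more slowly, so it is the more conservative of the two criteria and typically determines the panel size.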

Table 2 Genomic SNP panel of top ranked loci to detect European wildcat and domestic cat introgression

Detection of hybridization

Bayesian analyses with and without prior information for domestic samples yielded globally identical results (data not shown). Hence, all results presented were obtained without prior nongenetic information. Assuming two major populations in STRUCTURE (K=2), all domestic cats were clearly assigned to their expected cluster according to genetic variation at the 158 SNPs (Figure 2). However, as noted in the Materials and methods, 18 putative European wildcats showed qi values to the domestic cluster above 0.92 and very narrow CI ranges (0.745–1.00): seven from Portugal, four from Spain, four from Italy, one from Scotland and two from Hungary (Table 3; Figures 2 and 3). No sign of subdivision was detected in the studied wildcat populations (STRUCTURE from K=1 to 15, data not shown).

Figure 2 Average plot of the Bayesian admixture analyses performed in 10 independent STRUCTURE runs for K=2, using 158 SNPs on 139 known random-bred domestic cats (FCA) and 132 putative European wildcats (FSI). Each individual is represented by a single vertical bar divided into two genetic clusters, according to the proportion of their genome estimated to descend from each of the possible groups. Black vertical lines divide geographic groups of domestic and wild populations, which are labeled above the figure (Ger=Germany (5); IT=Italy (6); Tur=Turkey; Cyp=Cyprus; PT=Portugal (1); SP=Spain (2); Scot=Scotland (3); Bel&Lux=Belgium and Luxembourg (4); Slov&Bos=Slovenia and Bosnia (7); Hung=Hungary (8); BulRom=Bulgaria and Romania (9); HYB=Known hybrids (10)).

Table 3 Individual membership proportions (qi) of presumably misclassified and putatively admixed cats according to the Bayesian analyses performed in STRUCTURE and NEWHYBRIDS
Figure 3 Individual membership (qi) values obtained using 158 SNPs under Bayesian-model computations. (a) STRUCTURE plot of 100 simulated domestic (FCA), wild (FSI), F1, F2 and backcross (BxFCA; BxFSI) genotypes and 47 real individuals for which genetic data refute their straightforward allocation to the European wildcat subspecies (?); (b) NEWHYBRIDS assignment of the same 47 dubious individuals to the different hybrid categories. Each individual is represented by a single vertical bar coloured according to the proportion of their genome descending from each of the inferred clusters (a) or hybrid classes (b).

According to the Bayesian analyses of the SNP variability, 23 putative wildcats show genetic evidence of admixed ancestry in both STRUCTURE and NEWHYBRIDS computations (Table 3). The only exception was one cat from Scotland (ID 101), which was identified as a possible hybrid in STRUCTURE (qFSI=0.768) and as a European wildcat in NEWHYBRIDS (qFSI=0.970). Most of the 23 admixed cats found in the random sampling came from Scotland (n=7) and Hungary (n=8). The other putative hybrid cats were recognized in Portugal (n=1), Germany (n=1), Italy (n=5), and Bosnia and Herzegovina (n=1). Known captive-bred hybrids clearly displayed signatures of admixture, with individual qi ranging from 0.289 to 0.734 in the wild genetic group (Table 3). Moreover, they were mostly assigned to their known hybrid category with high posterior probabilities (qi>0.90): ID 57 as F1, ID 60 as BxFSI, ID 61 as BxFCA and ID 63 as BxFSI (Table 3; Figure 3b). Overall, 89 of 130 (68.46%) putative wildcats are inferred to have no introgression from domestic cats. This number excludes the five known hybrids and counts Scottish cat 101 among the possibly admixed individuals.

SNP simulations for admixture analysis

The analyses, both in STRUCTURE and NEWHYBRIDS, of the combined data set including all 256 observed genotypes and the 600 simulated genotypes (generated with HYBRIDLAB) globally revealed the same presumed misclassifications as obtained with real genotypes alone. A summary of the misclassifications expected in six simulated hybridization categories is presented in Table 4. Bayesian analyses of the simulated genotypes revealed that all parental, F1, F2 and backcrossed individuals could be correctly identified by the STRUCTURE algorithm using the 158 SNPs. Moreover, posterior probabilities of assignment to the different simulated categories of hybridization proved to be sufficiently discriminatory, because only 1% of F2, 28% of BxFCA and 14% of BxFSI properly assigned genotypes displayed CI values outside the expected range (Table 4). Assignment values for NEWHYBRIDS proved to be equally accurate for parental and first-generation hybrids; however, 4% of F2, 3% of BxFCA and 1% of BxFSI genotypes were allocated to their own hybrid category with qi values lower than 0.85 (Table 4). Nevertheless, none of these genotypes was significantly (qi>0.85) allocated to one of the other hybrid categories, preventing any case of misclassification. The simultaneous analysis of simulated and true genotypes confirmed the results observed for real data alone (Table 4), regarding both the probable misclassification of cats according to morphology and the detection of hybrids (Figure 3).

Table 4 Average membership proportion (Q) of simulated genotypes in the Bayesian analysis performed using STRUCTURE and NEWHYBRIDS

The performance of the 35 top-ranked SNPs (Table 2) in detecting hybridization was evaluated by simulations on the modified data set using STRUCTURE and NEWHYBRIDS (Table 5). The aim of this analysis was to see whether good discrimination could be obtained more economically by using only strongly differentiated SNPs. The high level of genetic differentiation in these SNPs allowed an overall clear distinction of simulated parental and hybrid genotypes, as most individuals were assigned to their expected cluster with high posterior probabilities (qi>0.80) using the reduced set of SNPs. STRUCTURE’s misassignments were exclusively obtained for lower percentages of admixture, namely 8% of the simulated BxFCA and 4% of the simulated BxFSI. For only a few of the parental and first-generation hybrid genotypes were the CI ranges outside the expected values; unsurprisingly, less discriminatory CIs were observed for backcrosses. For example, only 38% of BxFCA and 42% of BxFSI had CI ranges that never overlapped parental genotypes (Table 5). NEWHYBRIDS’ clustering also proved to be highly efficient, with all parental, 98% F1, 90% F2, 90% BxFCA and 96% BxFSI being correctly allocated to their category with high posterior probabilities (Table 5). Only one of the unclassified genotypes (one BxFCA) would be incorrectly assigned to its corresponding parental group, with all of the other cases representing broad partitions among hybrid classes. Exceptionally, two simulated BxFSIs were identified as F1 and F2, whereas two F2 were classified as F1 and BxFSI.

Table 5 Power to detect wildcat–domestic cat hybrids with 35 SNPs

Discussion

Introgression of domestic cat genes is a significant concern for the conservation of European wildcat populations. Hybridization can be either a widespread or localized event in wildcat populations. Hence, more precise detection of introgression levels is essential to prioritize habitats for wildcat preservation and to design efficient conservation strategies. Previous studies clearly show that the development of more powerful tools is still critical to accurately identify parental and hybrid individuals of this species because of the high similarity in morphology and genomes of wild and domestic forms. Although microsatellites have been the dominant markers in wildcat genetic studies (for example, Beaumont et al., 2001; Randi et al., 2001; Pierpaoli et al., 2003; Lecis et al., 2006; Germain et al., 2008; Eckert et al., 2010; O’Brien et al., 2009), and recently mtDNA diagnostic SNPs have been suggested (Driscoll et al., 2011), the increasing availability and numerous advantages of nuclear SNPs make them an appealing alternative and/or a complement to maternal and paternal lineage markers.

SNPs have been attracting growing interest in a wide range of evolutionary applications and are becoming efficient tools in wildlife conservation-oriented studies (Brumfield et al., 2003; Morin et al., 2004; Seddon et al., 2005; Morin et al., 2009). Although they offer less variability per locus than STRs, SNPs provide a substantial number of advantages, namely: (i) reduced propensity for homoplasy due to lower mutation rates; (ii) higher density and more uniform distribution in genomes; (iii) suitability for successful high-throughput genotyping and straightforward comparability and transportability across laboratories and detection protocols; and (iv) highly successful application in fragmented DNA samples, for example, noninvasive and historical DNA (see Brumfield et al., 2003; Morin et al., 2004; Garvin et al., 2010 for reviews). Nonetheless, the successful application of genome-wide batteries of nuclear SNPs in studies of wild populations is still limited to a few cases such as wolf-like species for studying their evolutionary history (vonHoldt et al., 2011), wild sheep for detecting population structure and linkage disequilibrium (Miller et al., 2011) and wild Atlantic salmon for the differentiation of farmed and wild individuals (Karlsson et al., 2011). Recently, Monzón et al. (2013) used species-diagnostic SNPs to quantify the relative contributions of parental populations and better understand the complex hybrid ancestry of the northeastern coyote. Here, we provide an analysis of nuclear SNPs in wildcats from a broad European range for applications in European wildcat conservation. Our main motivation was to improve molecular tools for detecting and quantifying hybridization, and to test a smaller set of the most informative SNPs for its potential use in noninvasive genetic samples.

Population variability

Genetic diversity, including SNP Ar and HE, showed marked differences between European wildcats and domestic cats. The wildcats, which were sampled from a broad proportion of their distribution across Europe, showed significantly lower genetic diversity. Generally, genetic variability is expected to be lower in domesticated forms relative to their wild counterparts, because of bottlenecks caused by low numbers of founder individuals and restricted gene flow imposed by human constraints (Doebley et al., 2006). However, the selected SNPs were ascertained from the 1.9 × genome sequence of an Abyssinian domestic cat and are highly polymorphic across all breeds of cats (Kurushima et al., 2012). Thus, these SNPs cannot represent the spectrum of variability presumably present in the studied European wildcat populations and likely suffer from ascertainment bias. Variable SNP loci detected in the European wildcat samples will probably represent widespread ancestral polymorphism, and chances to identify population-specific alleles will be limited. Yet, we cannot exclude that extant European wildcat populations, which probably underwent repeated cycles of demographic fluctuations due to Pleistocene climate changes (Mattucci et al., 2013), and have suffered recent population declines and fragmentation because of anthropogenic pressures, are actually less variable than domestic cats. Moreover, results obtained using other molecular markers, for example, STRs, also suggest that domestic cats may have higher genetic diversity than wildcats (for example, Pierpaoli et al., 2003; Lecis et al., 2006; Oliveira et al., 2008a, b).

The pre-screening of SNPs for inclusion on arrays for this project implies that SNPs are polymorphic in the species in which they were ascertained. This could be the reason for not detecting fixed alleles between the domestic and the wildcat representatives in this study, notwithstanding the 22 monomorphic SNPs observed in wildcats. Although only a very small subset of the species genome was analyzed, a similar result could be expected for a larger number of SNPs. Among dogs and wolves, no fixed SNPs have been detected in a 48-K panel from the Affymetrix Canine Mapping SNP 2.0 array (vonHoldt et al., 2012). Even so, when two populations are subjected to different selective pressures, some levels of selection are expected to cause divergence in different parts of their genomes. Native California tiger salamanders (Ambystoma californiense) provide an excellent example of the benefit of SNPs in uncovering patterns of admixture (Fitzpatrick et al., 2010). These authors were able to determine that only 3 out of 68 studied markers spread rapidly into native genomes, whereas the other 65 showed little evidence of introgression beyond the region where introductions of non-native barred tiger salamanders (A. tigrinum mavortium) occur. By demonstrating substantial evidence of heterogeneity in introgression rates among loci, this work highlighted the potential problems faced by studies that use only a few neutral markers to detect hybridization (Allendorf et al., 2010).

Bayesian clustering

The Bayesian clustering of the 274 individuals (139 random-bred cats, 130 putative European wildcats and 5 known hybrids) immediately revealed the higher discriminative power of genotypes over phenotypes in the identification of wildcats. Eighteen putative wildcats were allocated with high posterior probabilities to the domestic cluster and therefore were excluded from the analysis. This was in agreement with previous reports for the species (for example, Oliveira et al., 2008a, b), suggesting that morphological identification of European wildcats and domestic cats might not be as straightforward as some authors advocate (Ragni and Possenti, 1996; Daniels et al., 1998; Kitchener et al., 2005; Puzachenko, 2002; Yamaguchi et al., 2004a, b; Krüger et al., 2009; Platz et al., 2011). A variety of issues could lead to misclassification, including the following: (i) dead animals might have been highly degraded at the time of collection, so that discrimination of obvious morphological characters was not possible; (ii) cats might descend from past generations of admixture, so that marked diagnostic traits are no longer expressed; (iii) samples might have been collected noninvasively (for example, scats and hairs), so that morphological discrimination was not possible; (iv) morphological features overlap between the two forms; and (v) conservation biologists and naturalists might be biased in their morphological evaluation toward the collection of wild specimens. The fact that most, if not all, backcrosses remained undetected under morphological evaluation further confirms the higher efficiency of genotypes over phenotypes in identifying past-generation hybrids. The set of markers defined in this study should effectively circumvent many cases of wrong pre-classification and identify the origin of most unknown samples.

SNP power for admixture analysis

Any inference of ancestry must strike a balance between economic, technical and statistical concerns (Rosenberg et al., 2003). Ideally, the identification of recently introgressed hybrids, such as F1, F2 and first backcrosses, could be achieved with a minimum number of loci if the allele frequencies at these loci are sufficiently differentiated between the populations (Vähä and Primmer, 2006). The remarkable resemblance between European wildcats and domestic cats, together with the intricate history of sympatry and introgression that most probably influenced both the domestication (Driscoll et al., 2007) and the expansion of domestic populations worldwide, might have created one of the most complicated frameworks in which to genetically discriminate parental groups of wild and domestic relatives. In the context of wildcat conservation, genomic resources should be used to select the most ancestry-informative markers among the huge number of loci available as DNA variants. Limited panels of 48 or 96 informative SNPs would be enough to design efficient and affordable applications, especially in cases of noninvasive sampling, or when analyses are performed to solve practical problems, such as the assignment of unknown samples to parental categories, rather than complex population/introgression inferences. The identification of highly informative SNP loci from larger panels has already been proposed as a powerful approach to identify wolf (Canis lupus lupus) × dog (Canis lupus familiaris) hybrids, with 24 loci proving informative for assignment to recent hybrid classes (vonHoldt et al., 2012). If allocations are not definitive, a subsequent analysis of 100 loci has been suggested (vonHoldt et al., 2012). In humans, subsets of informative SNPs delineate genetic relationships at the individual, parentage and population levels, namely for detecting human geographic structure (Liu et al., 2005; Lao et al., 2006).
Similar studies have also been conducted in other species, such as European bison (Bison bonasus; Tokarska et al., 2009), Atlantic salmon (Salmo salar; Glover et al., 2010), red fox (Vulpes vulpes; Sacks and Louie, 2008) and chicken breeds (Gärke et al., 2012). Recently, Nussberger et al. (2013) developed a diagnostic marker set containing 48 SNPs that allows the identification of wildcats, domestic cats, their hybrids and backcrosses, and demonstrated their accurate genotyping in single hairs (Nussberger et al., 2014). However, these authors used a restricted set of reference samples, and the choice of highly differentiated traits/loci from a small panel of parental individuals has been considered to possibly overlook population differentiation (Brumfield et al., 2003; Schlötterer, 2004; Morin et al., 2009). This is a concern among European wildcat populations because the genetic partition of the populations is still poorly known, and central European wildcats might not be as fragmented as those in other regions (Mattucci et al., unpublished). Studying just a reduced panel of parental individuals from very narrow areas might, then, under-represent wildcat variability, at least in that specific population, and may overestimate the level of genetic differentiation between wild and domestic cats that truly exists there. Considering that the knowledge of the wildcat's genetic partition in Europe is still growing, the most accurate approach is to treat European wildcats as a single population that needs to be genetically differentiated from domestic cats, and to try to find the most ancient variants that distinguish the two forms. This would most likely limit the appearance of new variants when more samples are added to the analyses in the future.
Therefore, to obtain the most powerful genetic tool for the analysis of hybridization/introgression dynamics, a combination of wide geographical samples with different types of markers from the entire genome should be evaluated (Driscoll et al., 2011), which preferably should represent both neutral and non-neutral variations (for example, Teeter et al., 2008).

To provide a similarly efficient panel of diagnostic markers for wildcat hybridization, the SNPs were ranked according to their utility in discriminating between wildcats and domestic cats. As few as 35 of the most differentiating SNPs provided correct admixture evidence for 99% of the cases, with as little as 8% of BxFCA and 4–5% of BxFSI remaining unclassified in STRUCTURE-based inferences. Therefore, the statistical power achieved with the Bayesian clustering based on 35 loci suggests that individuals can be confidently partitioned as European wildcats, domestic cats or first- and second-generation hybrids (F1 and F2), whereas more cautious interpretations should be made when outlining admixed individuals (backcrosses). Even so, an underestimation of admixture rates in natural populations is not expected, as the only case of missing hybrid identification was observed for a single simulated BxFCA. Although the 35 SNPs revealed outstanding success in hybridization inferences, a complete definition of all admixed cats in the different hybrid categories was obtained only with the entire set of 158 SNPs, even though 20% of loci had FCT<0.10.
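
The ranking step described above can be illustrated with a minimal sketch. It is not the procedure used in this study: it swaps in a simple two-population FST computed per locus from allele frequencies (with equal population weights) as a stand-in for the FCT values reported here, and all function names, locus names and genotypes are hypothetical. Genotypes are coded as the number of copies (0, 1 or 2) of an arbitrary reference allele, with None for missing calls.

```python
def allele_freq(genotypes):
    """Reference-allele frequency from 0/1/2 genotype codes (None = missing)."""
    called = [g for g in genotypes if g is not None]
    if not called:
        raise ValueError("no called genotypes at this locus")
    return sum(called) / (2 * len(called))


def locus_fst(p1, p2):
    """Two-population FST for one biallelic locus, equal population weights.

    FST = (HT - HS) / HT, where HT is the expected heterozygosity of the
    pooled population and HS the mean within-population heterozygosity.
    Assumption: a simplified stand-in for the FCT statistic in the text.
    """
    p_bar = (p1 + p2) / 2
    ht = 2 * p_bar * (1 - p_bar)
    if ht == 0:  # locus monomorphic in both populations
        return 0.0
    hs = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    return (ht - hs) / ht


def rank_loci(wild, domestic, top_n):
    """Return the top_n (locus, FST) pairs, most differentiated first.

    wild, domestic: dicts mapping locus name -> list of genotype codes.
    """
    scores = {
        locus: locus_fst(allele_freq(wild[locus]), allele_freq(domestic[locus]))
        for locus in wild
    }
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top_n]


# Hypothetical toy data: three loci, four individuals per group.
toy_wild = {"snpA": [2, 2, 2, 2], "snpB": [1, 1, 2, 0], "snpC": [2, 2, 1, 2]}
toy_domestic = {"snpA": [0, 0, 0, 0], "snpB": [1, 1, 0, 2], "snpC": [0, 1, 0, 0]}

for locus, fst in rank_loci(toy_wild, toy_domestic, top_n=2):
    print(locus, round(fst, 4))
```

In such a scheme, a reduced panel would correspond to the first N loci of the ranked list; the assignment power of that subset would still need to be checked by simulation and against known hybrids.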

Detection of hybridization in natural populations using SNPs

The inclusion of five known hybrids provided further evidence of the high accuracy of the assignment tests performed with the entire set of 158 loci, as all were assigned to their correct hybrid category. These results corroborated those obtained by simulation. However, the expectation of 100%, 96%, 97% and 99% identification of F1, F2, BxFCA and BxFSI hybrids, respectively, using NEWHYBRIDS might decrease with real genotyping data. The panel of 158 SNPs successfully detected a putative hybrid class for all but one (ID 211) of the admixed cats identified by the same panel. However, seven of the hybrids were assigned with qi values between 0.60 and 0.78. These results confirm the high accuracy levels predicted by simulation analyses, but raise some doubt about the precise identification of true hybrid genotypes. Overall, these findings suggest that, although simulating hybrid classes might be a useful and indicative strategy for selecting informative loci and estimating the power of hybridization analyses, the inferences of introgression in natural populations of European wildcats may be better refined by the inclusion of real genotypes of known hybrid categories in Bayesian clustering models. Simulation cannot account for novel and low-frequency alleles that could be discovered with additional sampling, and might provide an incomplete reflection of the true assignment power of our marker panel. Ideally, each inference should include simulated genotypes and several known hybrid individuals from different geographical locations and hybrid categories.
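
As an illustration of how hybrid classes can be simulated from parental allele frequencies, the following minimal sketch draws genotypes for parental, F1, F2 and backcross classes. It is not the simulation pipeline used in this study: it assumes unlinked biallelic loci and random mating, and the function names, the "wild"/"domestic" class labels and the allele frequencies are hypothetical. Each individual is built from two gametes, and an F1 gamete carries, at each locus independently, an allele of wild or domestic origin with equal probability.

```python
import random

# Gamete sources for each class: "W" = wildcat, "D" = domestic cat, "F1" = F1 parent.
HYBRID_CLASSES = {
    "wild": ("W", "W"),
    "domestic": ("D", "D"),
    "F1": ("W", "D"),
    "F2": ("F1", "F1"),
    "BxFSI": ("F1", "W"),  # backcross to wildcat
    "BxFCA": ("F1", "D"),  # backcross to domestic cat
}


def draw_gamete(source, freqs_wild, freqs_dom, rng):
    """Draw one allele (0 or 1) per locus from the given gamete source."""
    gamete = []
    for p_wild, p_dom in zip(freqs_wild, freqs_dom):
        if source == "W":
            p = p_wild
        elif source == "D":
            p = p_dom
        else:
            # F1 parent: transmits its wild- or domestic-origin allele with
            # equal probability (loci assumed unlinked).
            p = p_wild if rng.random() < 0.5 else p_dom
        gamete.append(1 if rng.random() < p else 0)
    return gamete


def simulate_genotype(hybrid_class, freqs_wild, freqs_dom, rng):
    """Return per-locus genotype codes (0, 1 or 2 reference-allele copies)."""
    source_a, source_b = HYBRID_CLASSES[hybrid_class]
    gamete_a = draw_gamete(source_a, freqs_wild, freqs_dom, rng)
    gamete_b = draw_gamete(source_b, freqs_wild, freqs_dom, rng)
    return [a + b for a, b in zip(gamete_a, gamete_b)]


# Hypothetical reference-allele frequencies at five loci.
rng = random.Random(2024)
freqs_wild = [0.95, 0.90, 0.85, 0.99, 0.80]
freqs_dom = [0.10, 0.05, 0.20, 0.15, 0.30]
for name in HYBRID_CLASSES:
    print(name, simulate_genotype(name, freqs_wild, freqs_dom, rng))
```

Because the draws rely only on the allele frequencies supplied, such simulated genotypes cannot contain alleles or frequency shifts absent from the reference samples, which is the limitation of simulation noted above.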

The highly discriminating loci discovered in this study may bring new insights to the study of European wildcat populations, specifically by providing a powerful and efficient tool to detect and quantify hybridization with domestic cats. Further genotyping of additional populations should help to validate the selected SNPs. In addition, combining the SNPs described in our study with those developed by Nussberger et al. (2013) could maximize hybrid detection. Nevertheless, the new throughput technologies under development for domestic cats will soon allow the evaluation of the entire genome of F. silvestris species, supporting the identification of more diagnostic loci and potentially indicating areas of the genome involved in domestication (Montague et al., 2014; Tamazian et al., 2014). Only a limited number of X-linked SNPs were evaluated in this study and, because of their transmission pattern, X-linked genes are good candidates for selection during domestication and deserve further investigation. SNPs have already demonstrated the potential to equal or even outperform microsatellites for specific questions such as individual ancestry (Lao et al., 2008), population assignment (for example, Seddon et al., 2005; Narum et al., 2008; Smith and Seeb, 2008; Coates et al., 2009) and pedigree studies (Santure et al., 2010; Hauser et al., 2011), and have proved to have large allele frequency differences among populations (Freamo et al., 2011). The inclusion of SNPs associated with specific known domestic cat phenotypes, particularly recessive traits such as melanism, hair types and gloving (for review, see Lyons, 2010, 2012), would likely increase the power to detect domestic cat introgression into wildcats. Combined repertoires of autosomal SNPs, X- and Y-linked markers and mtDNA variants should all help decipher the domestication of the cat and the dynamics of wildcat and domestic cat populations around the world.
Ultimately, we consider that SNPs are the molecular markers of choice for hybridization studies, as they can provide an easier, cheaper and standardized method to be implemented in conservation programs.

Data Archiving

Data available from the Dryad Digital Repository: http://dx.doi.org/10.5061/dryad.2q1qv.