Phylogeographical structure and demographic expansion in the endemic alpine stream salamander (Hynobiidae: Batrachuperus) of the Qinling Mountains

The Qinling Mountains of China provide an excellent study area for assessing the effect of Pleistocene climatic oscillations and paleogeological events on intraspecific diversification. To assess genetic diversity of an endemic stream salamander, Batrachuperus tibetanus, for its conservation, a phylogeographical survey was performed based on mitochondrial DNA and morphological data. The mitochondrial data revealed three lineages of B. tibetanus in the Qinling Mountains. A lineage present in the northwestern Qinling Mountains groups with the Tibet lineage of B. tibetanus, and the remaining Qinling populations are eastern and western lineages that separated ~3–4 million years ago (Ma). The eastern and western Qinling lineage delineation is supported by three morphological variables (snout length, eye diameter and axilla-groin length). The divergence of the two major lineages was likely caused by orogenesis of the Qinling Mountains during the late Cenozoic, and the two lineages were subsequently affected at different levels by Pleistocene climatic oscillations showing different signals of demographic expansion. A large suitable area of B. tibetanus through the Qinling Mountains since the last glacial maximum (LGM) indicated the adaptation of this species to the climatic changes. However, low genetic diversity within populations indicate the urgency of preserving the vulnerable populations and endemic lineages.

For all samples (281 sequences), estimates of both haplotype diversity (h = 0.861) and nucleotide diversity (π = 0.0242) were high, indicating high genetic diversity of B. tibetanus in the Qinling Mountains. The genetic diversity of the four populations possessing haplotypes from both the eastern and western lineages was generally larger than the other populations ( Table 1). The populations of B. tibetanus in the Qinling Mountains generally displayed significant genetic differentiation (Supplementary Table S1). Simulation results from the Spatial Analysis of Molecular Variance (SAMOVA) indicated that F CT increased greatly from K = 2 to K = 5 and then reached a plateau at K = 6 ( Supplementary Fig. S2). Considering the single population clusters, we selected K = 3 as the optimal number of population groups (F CT = 0.849; P < 0.001). The western group consisted of the population 1-6, the eastern group contained the population 8-20, and the third group only contained the QLC population (Fig. 1). The Mantel test revealed a significant correlated relationship (r 2 = 0.681, P < 0.001) between genetic distance and geographical distance ( Supplementary Fig. S3), indicating the significant effect of isolation by distance on population structure.
Demographic history. For the eastern lineage, we detected significant evidence of demographic expansion with significantly negative values for Tajima's D and Fu's Fs statistics ( Table 2). The mismatch distribution of the eastern lineage did not deviate significantly from the expected distribution under a sudden expansion model (Fig. 3). In contrast, the western lineage did not display any significant signals from the two summary statistics and mismatch distribution. For the two lineages, none of the statistical comparisons between the observed and simulated distributions rejected the sudden expansion model based on the raggedness index and the sum of squared deviations (SSD). In addition, positive population growth parameter (90.811 and 200.950 for the western and eastern lineages, respectively) revealed demographic expansion.
The coalescence-based Bayesian skyline plot (BSP) provided additional details on demographic changes through time. The western and eastern Qinling lineages of B. tibetanus displayed similar demographic history  Table 1). The optimal number of population groups is three revealed by the Spatial Analysis of Molecular Variance (SAMOVA   Supplementary Fig. S1), and are considered to be outgroup to explore phylogenetic relationship among the Qinling lineages. The Qinling lineages were previously described as B. taibaiensis 35 . Numbers next to nodes of ML tree indicate bootstrap values and Bayesian posterior probabilities. Numbers in network represent the mutation steps. Sizes of circles are proportional to the haplotype frequencies. Colours for populations: green, the western group; blue, the eastern group; yellow, populations possessing haplotypes from both lineages; black, the QLC population.
indicating a good model performance. The predicted current suitable area was markedly similar to the currently known distribution of B. tibetanus (Fig. 4a). The most important environmental variables when used alone were bio1 (Annual Mean Temperature), bio8 (Mean Temperature of Wettest Quarter) and bio10 (Mean Temperature of Warmest Quarter), according to the jackknife test. Compared to the current suitable area, the predicted past suitable area revealed a larger range in the Qinling Mountains and smaller in the eastern edge of the Tibetan Plateau at the last glacial maximum (LGM, 21 ka) (Fig. 4).
We not only included the distribution area of the Tibetan lineage of B. tibetanus, i.e. the eastern edge of the Tibetan Plateau, in our ENM analyses, but also tested the differences in habitat suitability between the Qinling and Tibetan lineages as well as the eastern and western Qinling lineages using ENMTools 40 . The identity test indicated that the predicted habitat suitability of the Tibetan and Qinling lineages of B. tibetanus exhibited statistically significant ecological differences (P < 0.01). The observed values for I (0.62) and D (0.32) were significantly lower than those generated from replicates. However, there was not an environmental transition between the distribution ranges of the Tibetan and Qinling lineages revealed by the range-break test (P < 0.68). For the western and eastern lineages in the Qinling Mountains, the differences between the habitat suitability were not significant (P = 0.07) with the observed values for I and D being 0.59 and 0.31, respectively.  Table 2. Genetic diversity and demographic statistics for mitochondrial cytb haplotypes of B. tibetanus in the Qinling Mountains (*P < 0.05; **P < 0.01). Standard deviation of h, π and g estimates are shown in parentheses. n, the number of individuals; N, the number of haplotypes; h, haplotype diversity; π, nucleotide diversity; g, population growth parameter; SSD, sum of square deviation (goodness-of-fit to a simulated population expansion); Raggedness, raggedness index; NA, not applicable. Analysis of morphological data. A morphological comparison (analysis of covariance) between different categories using 13 variable features is illustrated in Table 3. Three morphological variables (snout length, eye diameter and axilla-groin length) showed significant differences in mean values between the western and eastern Qinling Mountains when corrected for body size (snout-vent length as the covariate), while four morphological variables (tail length, trunk length, tail height and forelimb length) displayed significant differences between sexes. Note that the differences between the two population groups detected in this study do not greatly modify the results of the analysis of covariance. Interestingly, the factor analysis showed different trends between the four mixed populations (DP, HSP, ZH and NSG) possessing the haplotypes from the two Cytb lineages determined by phylogenetic methods and the other populations ( Supplementary Fig. S4).

Discussion
Our study revealed three lineages of B. tibetanus in the Qinling Mountains. The two major lineages were distributed in the eastern and western Qinling Mountains, respectively, and co-occurring in the middle of the Qinling Mountains ( Fig. 1). We also detected the Tibetan lineage of B. tibetanus in the northwestern Qinling Mountains ( Fig. 1, Supplementary Fig. S1), which has deep divergence from the two Qinling lineages (10.37% of the mean pairwise difference). The clear genetic divergence was consistently supported by the phylogenetic tree, network and SAMOVA analyses using mtDNA (Fig. 2). Moreover, three morphological variables, i.e. snout length, eye diameter and axilla-groin length, showed significant differences between the individuals collected from the eastern and western Qinling Mountains. The morphological variables may represent adaptations to variation in environment and climate throughout their distribution regions. The possible relationship between the two morphological variables i.e. eye diameter and axilla-groin length, and the habitats of different populations was also detected in the other salamanders 41 .
Thus, the hypothesis of the drainage system's influence on genetic patterns of B. tibetanus in the Qinling Mountains was not supported by molecular data. The drainage systems of some rivers flow from west to east (e.g. Wei River, Han River and Dan River) and therefore would contribute to north-south phylogeographical structure, which we did not observe. Instead, rivers do not appear to have acted as barriers to gene flow among the Qinling Scientific RepoRts | 7: 1871 | DOI:10.1038/s41598-017-01799-w lineages of B. tibetanus, which is consistent with the previous result that divergence of Batrachuperus species was not correlated with river drainage system 25,42 .
Also, the distinct climates and habitats at the south and north slopes of the Qinling Mountains seem to have little impact on the divergence of the species of B. tibetanus. Note that there were some lineages grouped with the Qinling lineages ( Supplementary Fig. S1). These lineages were distributed in the other mountains of the Gansu and Sichuan provinces 25 , and thus had shallow divergence with the Qinling lineages due to geographical isolation. However, we focused on the Qinling Mountains in the Shaanxi province in this article.
Given that the main phylogeographical patterns are along the west-east axis, we speculate that historical isolation due to orogenesis of the Qinling Mountains during the late Cenozoic might have caused the genetic divergence between the two Qinling lineages of B. tibetanus. The contiguous geographical areas from the west to east that the species of B. tibetanus occupied comprised the eastern edge of the Tibetan Plateau and the Qinling and Daba Mountains. The most intense uplift of the Tibetan Plateau likely has profoundly changed the topographic features of the western Qinling Mountains 43 , giving rise to the high topographic gradient along the contiguous geographical areas. According to our field survey and the description of sampling locations in prior study 25 , the Tibet lineages of B. tibetanus were generally distributed in the high altitude localities in the west from 1800 to 3700 m above sea level, and the Qinling lineages were in the east from 1300 to 2300 m. In such circumstances, vicariance and geographical barriers are generally the significant factors yielding intraspecific and interspecific differentiation. It is reasonable that high-and low-elevation topographies and significant ecological differences between the eastern edge of the Tibetan Plateau and the Qinling Mountains (P < 0.01) together contributed to the deep split between the Tibet and Qinling lineages of B. tibetanus, probably occurring during the late Miocene. In contrast, orogenesis of the Qinling Mountains played a key role on the divergence of the two Qinling lineages of B. tibetanus, considering statistically non-significant ecological differences between the habitat suitability of the two Qinling lineages (P = 0.07) and the large suitable area since the LGM. It probably occurred during the Pliocene at 3.5 Ma, which corresponded to the orogenic period of the Qinling Mountains during the late Cenozoic. As barriers to gene flow, the Taibai Mountain (3767 m), the highest peak of the Qinling Mountains, might have contributed to the split between the two Qinling lineages.
Similar west-east and/or west-central-east breaks in the Qinling Mountains have already been commonly reported in plants [27][28][29] and animals [19][20][21] , indicating the prevalent trend of west-east and/or west-central-east   Fig. 3), but had different responses to Pleistocene climatic changes. The coalescence-based Bayesian skyline plot (BSP) showed that the eastern Qinling lineage experienced a long period of constant population size (0.23 Ma-0.01 Ma) followed by a postglacial expansion (about 10 kyr), which was also detected for the other amphibian species living in the Qinling Mountains, such as frogs 19,20 . The postglacial expansion was also indicated by the most abundant haplotype (92 individuals) occurring in 9 of the 13 eastern localities and its star-like topology in the network. It suggested that the Pleistocene climatic changes might have had more impact on the demographic changes of the eastern Qinling lineage. In contrast, the western Qinling lineage experienced a long period of constant population size (0.25 Ma-0.05 Ma) followed by a recent expansion (about 50 kyr) during the last glacial epoch, indicating that the western Qinling lineage was not severely suppressed by the LGM (~21 ka). Considering restricted distribution of most haplotypes, the expansion of the two Qinling lineages probably resulted in the secondary contact in four localities with high genetic diversity, i.e. DP, HSP, ZH and NSG (Table 1).
We  (Fig. 4). In contrast, the ENM analyses revealed similar suitable areas in the Qinling Mountains since the LGM (21 ka) in the case that the two Qinling lineages show obvious signals of demographic expansion. It might be because there were not suitable montane stream habitats for expansion surrounding the Qinling and Daba Mountains, and there was significant limitation in dispersal to high elevation localities of the Tibet Plateau.
Currently active climatic changes have significant influence on the distribution of species 45,46 . Although the amphibians (frogs, salamanders and caecilians) survived through the past great mass extinctions, many species are currently at risk globally due to various factors such as rapid ecological and climatic changes [47][48][49][50] . Habitat loss, fragmentation and over-hunting might cause reductions in population size of B. tibetanus, and then decrease genetic diversity within and among populations 51 . In this circumstance, it is important to assess genetic diversity of B. tibetanus for designing conservation strategies. B. tibetanus in the Qinling Mountains has maintained a high level of total genetic diversity with h and π being 0.861 and 0.0242, respectively, and exhibited surprisingly obvious signals of recent demographic expansion (Table 2, Fig. 3). Our ENM analyses indicate that B. tibetanus in the Qinling Mountains has the adaptability in the face of environmental changes (Fig. 4). These results imply a favourable status of this species for conservation and generally healthy natural environment in the Qinling Mountains. By contrast, more than 43% of amphibian species across the globe are in a state of decline 47 . However, our study revealed low genetic diversity within populations and high genetic differentiation among populations ( Table 1, Supplementary Table S1), possibly resulting from habitat fragmentation, long-term and persistent human disturbance or the fluctuations in population size during the evolutionary history. It indicates the importance and urgency of preserving the populations of this species.
The past and current connectivity between populations of B. tibetanus revealed by genetic analyses provide vital information for conservation planning in the Qinling Mountains. Considering the significant genetic divergence and morphological differences between the two Qinling lineages of B. tibetanus, we suggest that the two main lineages should be considered as separate evolutionary significant units (ESU) in future conservation strategies. Our intraspecific units identified here do not meet Moritz's strict criteria for ESU, which require both significant divergence at nuclear loci and reciprocal monophyly of mtDNA 52 . Given that the allozyme data seem to lack sufficient variation to identify the lineages corresponding to mtDNA units 25 , additional research employing polymorphic nuclear DNA markers, such as non-coding nuclear sequences 53 , may clarify it. However, the two genetic units of B. tibetanus are supported by our morphological data and meet Fraser and Bernatchez's criterion of a lineage demonstrating highly restricted gene flow from other such lineages within the higher organizational levels (lineages) of the species 54 . The three lineages in the Qinling Mountains generally have low genetic diversity and thus deserve more surveillance for protection, especially for the QLC population and the eastern lineage ( Table 2) Mountains where the two lineages co-occur also warrant protection to increase genetic variation and to maintain adaptation and evolutionary process 55 .

Methods
The methods were performed in accordance with the guidelines of the American Society of Ichthyologists and Herpetologists 56 . All experimental procedures and animal collection were conducted under the permits (No. IOZ14001) approved by the Committee for Animal Experiments of the Institute of Zoology, Chinese Academy of Sciences, China.
Sample collection, DNA extraction and sequence amplification. Two hundred and eighty one individuals of B. tibetanus were collected at 20 localities of Shaanxi province across its main distribution range in the Qinling Mountains between 2008 and 2010 (Fig. 1). For each location, we collected 2 to 37 samples depending on population size (Table 1). We were limited to no more than ten individuals for six localities because of natural rarity. We also downloaded previously published sequences of B. tibetanus 25 for the subsequent phylogenetic analysis.
After measurements of morphological variables, the tail of each salamander individual was removed and stored in 95% ethanol for DNA analysis. Genomic DNA was isolated from muscle tissues of the tail using standard phenol-chloroform procedures 57 . The fragment of mitochondrial cytochrome b (Cytb) gene was amplified using designed primers (pn1F: 5′-ATTCGAAAAACTCACCCATTA-3′ and pn1R: 5′-AATGTTAAGCTGCGTTGTT-3′) based on the complete mitochondrial genome of B. tibetanus (Genbank sequence NC_008085). Amplifications were performed in 30 μl reactions containing 15 ul Taq Polymerase Enzymes mixture (Takala, Da Lian), 0.3 μM of each primer, and 100 ng of the template DNA. PCR involved 5 minutes at 94 °C, then 35 cycles of 94 °C for 30 s, 50 °C for 30 s and 72 °C for 45 s, and finally an extension at 72 °C for 10 minutes. The reaction mixture was amplified using an MBS Satellite thermal cycler (Thermo Electron Corp., USA). The amplified DNA products were purified, and automated DNA sequencing was performed on an ABI3730 with an ABI PRISM BigDye terminator Cycle Sequencing Ready Reaction Kit (Perkin Elmer Biosystems). All individuals were sequenced from both strands. Sequence data were carefully inspected by eye using BioEdit 7.1.3.0 58 , and sequences immediately next to the sequencing primer were excluded.

Phylogenetic analyses and divergence time estimation.
We reconstructed phylogenetic topologies of mtDNA haplotypes using maximum-likelihood (ML) and Bayesian inference (BI) methods. The best-fitting substitution model was determined under the Bayesian information criteria (BIC) using jModelTest 2.1.1 59 . The best-fit model of nucleotide substitution for Cytb fragment of B. tibetanus was TPM1uf + G. We used PhyML 3.0 60 to perform ML analyses. The pruning and regrafting (SPR) option was used to estimate better ML tree topology. The non-parametric bootstrap with 1000 replicates was used to estimate the support for each internal branch of the ML phylogeny. We used the program MrBayes 3.2.1 61 to perform Bayesian phylogenetic inference. Markov chains were run for 10 7 generations with two independent runs including one cold and three heated chains. We examined two convergence diagnostics, i.e. the average standard deviation of split frequencies and the potential scale reduction factor (PSRF). As runs converged, the average standard deviation of split frequencies should be lower than 0.01 and PSRF should approach 1.0. Trees were sampled every 1000 generations, and the initial 25% of trees were discarded as burn-in. The remaining trees (15002) were used to estimate the majority-rule consensus tree and the Bayesian posterior probabilities (BPP). We performed three replicate analyses. The Qinling lineages were previously described as B. taibaiensis 35 . Thus, haplotypes from QLC population, which were grouped with the Tibetan lineage of B. tibetanus ( Supplementary Fig. S1), were used as outgroup in the phylogenetic analyses. After pruned into a 725 base-pair fragment, we also performed the phylogenetic analyses for a combined dataset containing 33 sequences from Fu & Zeng 25 to explore phylogenetic relationship among the samples from the Qinling Mountains and the Tibetan Plateau. The best-fit model for this dataset was TPM1uf + I + G.
The divergence time between the Cytb lineages was estimated by a mean substitution rate method in BEAST 38 . An uncorrelated lognormal clock model was used, with a mean value of 0.0062 substitutions per site per million years based on the evolutionary rates of Cytb estimated in salamanders 37 . We used a coalescent tree prior modelling constant population size through time that was appropriate for sequences sampled from the same population and/or species 38 . Markov chains were run for 10 9 generations with a sampling interval of 10 5 generations, and the first 10% of the trees were discarded as burn-in. Convergence of the parameters sampled was checked using Tracer 1.6 62 .
Levels of gene flow between the cytb lineages were estimated by the coalescence approach in Migrate. The Bayesian search strategy comprised one long chain with 10 5 recorded steps, a sampling interval of 100 and a burn-in of 10 8 steps. Static heating scheme was used with four chains and temperatures that are (1, 1.5, 3 and 10 6 ). Runs were repeated three times to examine consistency of estimates, and convergence was also checked using effective sample size (ESS). The effective number of immigrants per generation (N e m) of the most likely migration model was calculated by estimated mutation-scaled population sizes (θ) and immigration rates (M), N e m j→i = θ i × M j→i .
Population genetic diversity and structure. Genetic diversity within populations was estimated by haplotype (h) and nucleotide diversity (π) using DnaSP 5.10.01 63 . Pairwise sequence differences among the haplotypes were calculated using Arlequin 3.5.1.2 64 . The exact test of population differentiation was performed using Arlequin. Isolation by distance (IBD) was examined by testing the relationship between pairwise population genetic distances and geographical distances using Mantel test in Arlequin. Genetic distances between populations used in the Mantel test were calculated by MEGA 5.2 65 . To explore general pattern of variation, a median-joining network of haplotypes was constructed using NETWORK 4.6.1.0 66 .
Scientific RepoRts | 7: 1871 | DOI:10.1038/s41598-017-01799-w We used SAMOVA 2.0 67 to define the number of population groups based on geographical and genetic distance. Without a priori structure parameters, the method was based on a simulated annealing procedure to maximize the proportion of total genetic variance (F CT ) due to differences between population groups. For each user-defined number of groups (K), F CT was estimated and evaluated to select the optimal number of genetic groups. The number of initial conditions was set to 100 with K = 2-15. Statistical significance was tested using 1000 random permutations of the data.

Historical demography. Historical demography of the Cytb lineages was investigated by summary statistics
Tajima's D 68 and Fu's Fs 69 using DnaSP. Significantly negative values of Tajima's D and Fu's Fs are evidence for an excess of rare variations potentially due to population growth. We performed the mismatch distribution analysis to detect recent population expansion by comparing the distribution of the number of pairwise differences between haplotypes and their theoretical distribution expected under a model of sudden (stepwise) demographic expansion. The SSD and raggedness index 70 were used to test goodness of fit. We also used a coalescent-based method in LAMARC 2.1.8 71 to estimate the exponential growth rate (g) for each Cytb lineage and total samples. The Bayesian search strategy comprised 10 short chains of 5000 steps and two long chains of 200,000 steps, and a sampling interval of 20. Runs were repeated ten times to examine consistency of estimates. Finally, we estimated past population dynamics using coalescence-based Bayesian skyline plots 72 , as implemented in BEAST. The best-fit model with an uncorrelated lognormal clock was used. Markov chains were run for 10 8 generations with a sampling interval of 10 4 generations, and the first 10% of the trees were discarded as burn-in. The Tracer was used to summarize the results and check convergence of the Markov chain.
Ecological niche modelling. We performed ecological niche modelling (ENM) to evaluate the present suitable area of B. tibetanus and to infer its potential distribution during the Quaternary (21 kya) using climate data and known sampling localities of this species. We used two models for the LGM (21 kya tibetanus, distributed in the eastern edge of the Tibetan Plateau, was included in our ENM analyses. We performed a correlation analysis between the nineteen climatic variables with ArcGIS 9.3 (ESRI) in the defined area, and then selected eleven uncorrelated variables (r < 0.85): bio1-Annual Mean Temperature, bio6-Min Temperature of Coldest Month, bio8-Mean Temperature of Wettest Quarter, bio9-Mean Temperature of Driest Quarter, bio10-Mean Temperature of Warmest Quarter, bio11-Mean Temperature of Coldest Quarter, bio12-Annual Precipitation, bio13-Precipitation of Wettest Month, bio14-Precipitation of Driest Month, bio17-Precipitation of Driest Quarter, bio19-Precipitation of Coldest Quarter. Species occurrence data consisted of 62 sampling localities, 32 of which were described 25 . The ENM analyses were performed using the maximum entropy algorithm in Maxent 3.3.3 73 due to its appropriateness for presence-only data and good performance 74 . Maxent was run for twenty replicates with default parameters except for using 25% of occurrence data to test the model. We used the area under the curve (AUC) of the receiver operating characteristic (ROC) plot to evaluate model performance.
We used ENMTools 1.4.3 to perform the identity test and range-break test for testing the differences between the Qinling and Tibetan lineages as well as the Qinling lineages. The identity test is used to ask whether the predicted habitat suitability for the lineages is more different than expected if they are generated at random from a pooled data set. The range-break test used the same pooled data set to ask whether geographic boundary between the lineages is associated with an environmental transition 75 . The I statistic and Schoener's D were used to measure overlap of predicted habitat suitability of the lineages. Both I and D ranged from 0 (completely discordant niches) to 1 (identical niches). Statistical significance was tested using 100 replicates simulated in ENMTools.
Morphological data. The following fourteen morphological variables were examined on 120 collected specimens (60 males and 60 females) with vernier calipers to the nearest 0.1 mm: body mass, snout-vent length, tail length, total length, head length, head width, snout length, trunk length, interocular space, eye diameter, tail height, forelimb length, hind limb length, axilla-groin length. Measurements were taken from adults, as determined by the body size and weight. These morphological variables, with snout-vent length being a covariate variable to avoid the impact of body size, were analysed using analysis of covariance by SPSS 13.0. This method examined whether means of a variable are equal across different groups taking into account variability of other variables. Our analysis of covariance considered two categorical variables, i.e. population group and sex, and one covariate (snout-vent length). Here, we considered the differences between the two population groups determined by phylogenetic methods and SAMOVA, respectively.
We also used a multivariate technique, i.e. factor analysis, to describe variability among the fourteen morphological variables. It was used to visualize the distance matrix generated from morphological data. We used principal-component analysis to compute a distance matrix, and varimax rotation to minimize the complexity of the components.