Phylogeographic Patterns in Africa and High Resolution Delineation of Genetic Clades in the Lion (Panthera leo)

Comparative phylogeography of African savannah mammals shows a congruent pattern in which populations in West/Central Africa are distinct from populations in East/Southern Africa. However, for the lion, all African populations are currently classified as a single subspecies (Panthera leo leo), while the only remaining population in Asia is considered to be distinct (Panthera leo persica). This distinction is disputed both by morphological and genetic data. In this study we introduce the lion as a model for African phylogeography. Analyses of mtDNA sequences reveal six supported clades and a strongly supported ancestral dichotomy with northern populations (West Africa, Central Africa, North Africa/Asia) on one branch, and southern populations (North East Africa, East/Southern Africa and South West Africa) on the other. We review taxonomies and phylogenies of other large savannah mammals, illustrating that similar clades are found in other species. The described phylogeographic pattern is considered in relation to large scale environmental changes in Africa over the past 300,000 years, attributable to climate. Refugial areas, predicted by climate envelope models, further confirm the observed pattern. We support the revision of current lion taxonomy, as recognition of a northern and a southern subspecies is more parsimonious with the evolutionary history of the lion.

Phylogenetic data of lion populations indicate that current taxonomy does not sufficiently reflect the genetic diversity within the African lion [14][15][16][17][18][19][20][21] . Notably, lion populations from West and Central Africa have a distinct phylogenetic position, with a nested position of the Asiatic subspecies (P. leo persica) 14,15,19,21 . The validity of the subspecies status of the Asiatic lion, nowadays confined to a single population in India, is thereby challenged. Previous studies lacked comprehensive sampling of the West and Central African region and based their results on relatively small sample sizes [14][15][16][17]19,21 . As a consequence, the position of West and Central African lion populations and their relation to the Asiatic subspecies in the phylogenetic tree remained largely unresolved 14,19,21 .
The present study aims to provide a more complete overview of genetic diversity within the African lion and to compare the results with phylogeographic patterns and taxonomy in a range of African savannah mammals. In addition, we estimate the dates of the major splits in the phylogenetic tree and relate the observed patterns to the dynamic climate history of Africa. Discordances between phylogeographic patterns derived from mitochondrial DNA (mtDNA) and nuclear loci have been reported in a limited number of species 22,23 . However, previous studies on lions have shown that mtDNA loci produce phylogenies that are compatible with phylogenies based on autosomal data 18,24 , supporting the use of mtDNA as an appropriate marker for the current study. For sixteen lions the complete mitochondrial genome was analysed; 1454 base pairs (bp) of the mtDNA were analysed for an additional 178 lions throughout the complete geographic range (see Fig. 2 and Supplemental Table 1 for sampling locations). This included samples from each of the Lion Conservation Units (LCUs) in West and Central Africa in which the persistence of lion populations has recently been confirmed 25,26 . To reconstruct the evolutionary history of the West and Central African lion, museum samples from extinct populations in North Africa and Asia, representing a historical connection between the African and the Asiatic subspecies, were obtained and processed using an approach suitable for ancient DNA (aDNA). This is the first study in which phylogeographic patterns of a large number of savannah mammals from different trophic levels are compared and put into a context of climatic changes over the past 300,000 years (300 kyr). We have assessed inter- and intraspecific distinctions for all orders of large mammals with a pan-African distribution (the orders Chiroptera, Insectivora, Lagomorpha and Rodentia were excluded). As a model taxon, the lion is used to generate a high resolution map of the distribution of haplogroups.
The results contribute to a better understanding of the evolutionary forces that shaped the phylogenetic patterns observed among numerous savannah mammals on the African continent, including humans 27 . Results should be translated into recommendations for the management of different populations and species. In light of our findings, we challenge current lion taxonomy that recognizes only the African and the Asian subspecies, and we investigate options for a taxonomic revision that is more parsimonious with the newly revealed evolutionary history of the lion.

Results
Bayesian, Maximum Likelihood (ML) and Maximum Parsimony (MP) trees were constructed from three different alignments: 1) cytochrome b and control region (hereafter cytB + ctrl reg.), 2) the complete mitogenome, and 3) a combination of both datasets. All showed identical topology, and trees including the complete mitogenome showed strongly significant support for a basal split separating lions in the northern part of their range (North group: West Africa, Central Africa, and North Africa/Asia) and lions in the southern part of their range (South group: North East, East/Southern, and South West Africa) ( Fig. 3a; mitogenome tree shown in Supplemental Fig. 1). Within the North group, a clade that included all Asiatic lions and aDNA sequences from North Africa and Iran was significantly supported, as was a clade with Central African lions and the clade with West African lions. Lions from Central Africa and the North Africa/Asia clade are grouped together on a well-supported branch. In the South group, three major groups can be distinguished: a South West group, an East/Southern group and a North East group. All three clades are significantly supported, as is the branch combining the East/Southern and North East group. The same structure can be seen in the haplotype network based on cytB + ctrl reg. (Fig. 4). The observed groups are indicated together with the sample location in Fig. 2. In only two cases did we observe haplotypes from distinct phylogenetic groups in the same geographic region: in Ethiopia we found haplotypes from the Central Africa group as well as from the North East group, and in the Republic of South Africa (RSA) we found haplotypes from East/Southern and the South West group (Fig. 2, shaded areas).

Figure 1. Examples from eight species for which a dichotomy between West/Central African populations and populations in
Analysis of diversity indices for each of the main phylogenetic groups indicates that the East/Southern Africa clade is the least diverse, both in terms of number of haplotypes and distance between haplotypes (Supplemental Table 2). The South West Africa clade is the most diverse, with notably many differences between haplotypes. Furthermore, pairwise differences within the North group (i.e. haplogroups West Africa, Central Africa, and North Africa/Asia) are relatively few compared to those within the South group, or to the differences between the North and South groups. A Bayesian Skyline plot indicates that population size has been constant throughout the past 300 kyr.
The most recent common ancestor of all modern lions was estimated to have existed around 245,000 years before present (245 ka) (95% Highest Posterior Density (HPD): 120-385 ka; ESS: 216). The split within the South group is older than that within the North group, estimated at 189 ka (95% HPD: 90-300 ka; ESS: 253) and 142 ka (95% HPD: 60-239 ka; ESS: 273), respectively. For all major clades (nodes of all haplogroups ESS > 280) the date of the most recent common ancestor was estimated and compared to results from previous publications (Supplemental Table 3, Fig. 3a,b).
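The 95% HPD intervals quoted above are summaries of posterior samples. As a minimal sketch of the general idea (not the procedure implemented in BEAST or Tracer), an HPD interval can be approximated as the narrowest window that contains the requested share of sorted samples. The function name `hpd_interval` and the toy values are hypothetical and are not data from this study.

```python
import math

def hpd_interval(samples, mass=0.95):
    """Return the narrowest interval containing `mass` of the samples."""
    if not samples:
        raise ValueError("need at least one sample")
    xs = sorted(samples)
    n = len(xs)
    # Number of samples the interval must contain; the small epsilon guards
    # against floating-point noise in mass * n.
    k = max(1, math.ceil(mass * n - 1e-9))
    # Slide a window of k consecutive sorted samples and keep the narrowest.
    best_lo, best_hi = xs[0], xs[k - 1]
    for i in range(1, n - k + 1):
        lo, hi = xs[i], xs[i + k - 1]
        if hi - lo < best_hi - best_lo:
            best_lo, best_hi = lo, hi
    return best_lo, best_hi

# Toy, right-skewed set of node ages in ka (invented values for illustration).
toy_ages = [100 + i for i in range(95)] + [400, 450, 500, 550, 600]
print(hpd_interval(toy_ages))
```

Because the window is chosen by width rather than by symmetric tail cut-offs, a skewed posterior yields an interval that is not centred on the mean, which is one reason the reported intervals are asymmetric around the point estimates.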

Discussion
In this paper, we describe the phylogeographic patterns shared among several orders of African savannah mammals, and examine the phylogenetic relationships of lion populations throughout their entire geographic range. We have analysed 194 lion sequences of cytB + ctrl reg., including 30 aDNA sequences and 16 complete mitochondrial genomes. This approach has produced strong support for a basal dichotomy between lion populations from the northern part of their range and those from the southern part. Within the basal dichotomy, six major phylogenetic groups are identified: West Africa, Central Africa and North Africa/Asia (North group) and North East, East/Southern and South West (South group). This study included samples from 22 countries, including all LCUs with a confirmed lion population in West and Central Africa, as well as extinct populations, representing a comprehensive overview of the historical geographic range of the modern lion. All zoo and museum samples used in our study included decisive information on the origin of the individual or its free-ranging ancestors (for additional information see Supplemental Information 1). Based on the available data points, a proposed range of the haplogroups was generated, showing two areas of admixture between distinct lineages (Fig. 2). Although the Rift Valley has been proposed as a barrier for gene flow in lions [14][15][16]19,28,29 , our denser sampling of the region between the North and South groups found that the Rift Valley does not completely prevent a mixture of haplotypes between the two basal branches in the phylogenetic tree (haplotypes 9, 12-14). The second admixture zone is located around Kruger National Park (NP) and Limpopo-Venetia National Reserve (NR), RSA, in which we detect haplotypes from the South West group (haplotypes 20 and 22) in addition to haplotypes from the East/Southern group (haplotype 15). Since lions from other parts of RSA and the southern range of Botswana and Namibia also cluster with the East/Southern group, it is likely that the mixture of haplotypes in the Kruger/Limpopo area is the result of human-induced translocations. Lions from Etosha have frequently been used in translocations to South Africa. Further, some private reserves adjacent to Kruger NP that were initially fenced off are now connected to the park 30 .
[...] Fig. 4. Support is indicated as posterior probability (Bayesian analysis)/bootstrap support (ML analysis). Branches with a single haplotype have been collapsed to improve readability. Support for these branches is indicated by a black triangle at the tip of the branch (support shown in the label). Nodes which have been included for divergence time estimates are indicated with letters and 95% HPD node bars. Distance to outgroup and nodes without dated splits are not in proportion to divergence time. (b) Global oxygen isotope (δ18O) record showing two full interglacial-glacial cycles 7-6 and 5-2 (each of ca. 100 kyr duration), and the present interglacial 1, mainly related to global reorganisations of ocean and atmospheric temperature. In the African (sub)tropics temperature amplitude is lower than expressed in this (global) graph; however, precipitation changes are higher than in the temperate and arctic areas and occur in 21 kyr precession cycles. Five maxima at ca. 21 kyr distance in time can be identified in each full interglacial-glacial cycle; precipitation maxima do not necessarily coincide with these temperature maxima 81 . (c) Divergence estimates in thousand years ago (ka) and 95% HPD from BEAST analysis, also indicated as error bars in Fig. 3a.
Scientific Reports | 6:30807 | DOI: 10.1038/srep30807
The pattern we describe for the lion is highly congruent with phylogeographic data from different taxonomic groups occupying a range of trophic levels, implicating the environment as an evolutionary driver. The most basal dichotomy, distinguishing a northern and a southern lineage, is found in numerous savannah mammals, which we briefly review in Table 1. Several phylogeographic studies on African savannah mammals have described three main clades: West/Central Africa, East Africa and Southern Africa, suggesting that there may have been important refugial areas in these regions during the more recent part of the Pleistocene climatic cycles 4,5 . These three clades are clearly distinguishable in the lion based on mtDNA (nodes c, d and g in Fig. 3a) and autosomal data 24 . A model-based study on the habitat suitability for mammals and birds during the last glacial maximum (LGM) suggests the existence of five possible refugia in sub-Saharan Africa: Upper Guinea, Cameroon Highlands-Congo Basin, Ethiopian Highlands, Angola-Namibia, and East/Southern Africa 31 . These areas coincide with the five sub-Saharan lion groups, described in this study as West, Central, North East, South West, and East/Southern Africa, respectively. In addition to the most basal dichotomy, shown for other species in Table 1 and Fig. 1, the South West clade, which harbors lion populations from Angola and Namibia, is also represented in giraffe (Giraffa camelopardalis) 9,32 , zebra (Equus zebra) 33 , impala (Aepyceros melampus) 34 , greater kudu (Tragelaphus strepsiceros) 34 and sable antelope (Hippotragus niger) 35 . Within East Africa, the North East clade is also found in kob (Kobus kob) 36 , oryx (Oryx beisa) 37 , impala (Aepyceros melampus) 34 and greater kudu (Tragelaphus strepsiceros) 34 . Finally, the distinction we find between the West and the Central African lion is also seen in the phylogeographic pattern of roan antelope (Hippotragus equinus) 38 , potentially as a result of the lower Niger River acting as a permanent barrier for gene flow in these species. Climatological events have also heavily influenced migration of early humans 39,40 and as a result, similar major clades and phylogeographic patterns are found in human datasets [41][42][43] .
[...] Fig. 3a. Haplotype size is proportional to its frequency in the dataset. Hatch marks represent a change in the DNA sequence. The connection to outgroup species is indicated by "OUT".
Phylogenetic variation within the six geographic groups of the modern lion appears to have mainly emerged within the last c. 100 kyr, including the cool last glacial period (Marine Oxygen Isotope Stage (MIS) 4, 3 and 2) and two warmer periods (MIS 5 and 1) 44,45 . Phylogenetic structure that had evolved in regional lineages during the previous glacial-interglacial cycles mostly disappeared by c. 100 ka, attributable to various events, including genetic bottlenecks involving expansions and contractions from/to regional refugia 31,46,47 . Since the HPD intervals are relatively large, we add a palaeoclimatic context in order to propose a possible scenario that has contributed to the current phylogeographic pattern in the lion and in other sub-Saharan mammals. The two major vegetation zones on the African continent that likely influenced lion distribution through exclusion are dry desert and dense rain forests 48,49 , representing both hydrological extremes. In the tropics, the hydrological cycle that results in barriers and connective zones for lion dispersal is mainly driven by latitudinal migrations of the intertropical convergence zone (ITCZ), associated with the 21 kyr precession cycle of orbital climate forcing. This 21 kyr precession cycle occurs five times within a c. 100 kyr eccentricity cycle, which equates to a full interglacial-glacial period 45,50 . The last coalescence between the North and South lineages (node a in Fig. 3a) in the lion is estimated at ~245 ka, positioned at the start of interglacial MIS 7 (243-191 ka) (ages after Imbrie et al. 51 ). Following the southward expansion of the Sahara, the last coalescence is characterized by a maximum monsoon index that allowed the dense wet forest to expand maximally northwards along an east-west axis in lower latitude Africa 46,52-57 . Such a vegetation pattern likely reduced or possibly eliminated the connection between northern and southern lion populations. Other species show similar dates for divergence between northern and southern populations, e.g. baboon 58 , giraffe 9 and hyena 12,59 , although broader 90-95% confidence intervals overlap in the case of wild cat 60 and cheetah 13 . It is likely that the cyclic character of orbital forcing, and the concomitant distribution of rain forest and desert in Africa, repeatedly created isolated refugia and suture zones for lions and co-distributed species.
The second oldest split, between the South West group and the East/Southern & North East groups (node b in Fig. 3a), occurred around 189 ka, a moment positioned at the end of interglacial MIS 7 and the transition to the first cool interval of the following glacial MIS 6. During this interval the monsoon index was still high 57 and an extension of the Zambesian rain forest may have presented a barrier to lion populations in the South West. Simultaneously, lions belonging to the East/Southern group were distributed in a large area across East and Southern Africa 14,55 . More recent radiation of the South West group (node g in Fig. 3a), estimated to have occurred ~92 ka, coincides with vegetation change in the Early Glacial (MIS 5.2), following a period of droughts in which suitable habitat was reduced in Southern Africa, notably in the Kalahari region 61 . The splits between East/Southern and North East Africa (node c in Fig. 3a), located in present-day Kenya, and between West Africa and Central & North Africa/Asia (node d in Fig. 3a), located in present-day Nigeria, appear to have occurred during MIS 6 (186-128 ka) when two periods of dry and cool conditions prevailed 57,62 . The splits in the North group are likely due to the periodically maximum north-south extension of the Sahara desert 14,47,55,57,[63][64][65] . A connection between the North Africa/Asia group and the Central Africa group may have persisted during the short periods in which the monsoon front reached high latitudes, explaining the close genetic relationship of the Central African populations to the North Africa/Asia clade 57,64,65 . The West African population possibly became isolated and reduced in numbers by the significant southwards expansion of the Sahara during MIS 4 (71-59 ka) 40,57,61,64,65 , with radiation beginning around 50 ka.
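Because the HPD intervals are wide, a single node can be compatible with more than one isotope stage. The following sketch, assuming only the three stage boundaries that are quoted explicitly in the text (MIS 7, MIS 6 and MIS 4), lists the stages that a given HPD interval intersects. The names `MIS_BOUNDS_KA` and `stages_overlapping` are hypothetical, and stages without quoted boundaries are deliberately left out rather than guessed.

```python
# Stage boundaries in ka (older, younger), exactly as quoted in the text.
# Other stages are omitted because the text gives no boundaries for them.
MIS_BOUNDS_KA = {
    "MIS 7": (243, 191),
    "MIS 6": (186, 128),
    "MIS 4": (71, 59),
}

def stages_overlapping(hpd_low, hpd_high, bounds=MIS_BOUNDS_KA):
    """List the stages whose age range intersects [hpd_low, hpd_high] (ka)."""
    if hpd_low > hpd_high:
        raise ValueError("hpd_low must not exceed hpd_high")
    hits = []
    for name, (older, younger) in bounds.items():
        # Two closed ranges intersect when each starts before the other ends.
        if hpd_low <= older and hpd_high >= younger:
            hits.append(name)
    return hits

# 95% HPD intervals reported in the Results (ka).
print(stages_overlapping(120, 385))  # most recent common ancestor of all lions
print(stages_overlapping(90, 300))   # South group
print(stages_overlapping(60, 239))   # North group
```

The point estimates are therefore best read as one plausible placement within the climatic record, which is why the scenario above is presented as a possible rather than a definitive sequence of events.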
There are no indications from our data that the current lion population in India was sourced or reinforced by introductions from sub-Saharan African lions, as was recently hypothesized 66 .
The estimates for the time to the most recent common ancestor (TMRCA) presented in this study deviate from the results from Barnett et al. 21 . Exclusion of taxa or truncation of sequences, as well as different settings for the runs (prior settings, used substitution model and substitution rate), did not sufficiently explain the observed deviation. Antunes et al. 18 used an alternative approach to calculate a substitution rate, resulting in estimates for which the 95% CIs completely overlap with our 95% CIs for all but one split, which has an overlap of 63% (Supplemental Table 3).
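The text does not state how the percentage overlap between two 95% intervals was computed. One simple convention, shown here purely as an assumption, is the share of one interval's width that is covered by the other. The function name `overlap_fraction` and the example intervals are hypothetical and are not taken from Supplemental Table 3.

```python
def overlap_fraction(a, b):
    """Fraction of interval `a` (low, high) that lies inside interval `b`."""
    a_low, a_high = a
    b_low, b_high = b
    width = a_high - a_low
    if width <= 0:
        raise ValueError("interval a must have positive width")
    # Length of the intersection, or zero when the intervals are disjoint.
    shared = max(0.0, min(a_high, b_high) - max(a_low, b_low))
    return shared / width

# Invented intervals in ka, for illustration only.
print(overlap_fraction((100, 200), (150, 300)))  # partial overlap
print(overlap_fraction((120, 385), (100, 400)))  # a fully inside b
```

Note that this measure is asymmetric: swapping the two arguments generally gives a different value, so the reference interval should be stated whenever such a percentage is reported.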
The deep ancestral split within the African lion and the topology of the phylogenetic tree, along with the nested position for the Asiatic subspecies, clearly illustrate and support the contention that the current taxonomic division does not reflect the evolutionary history of the lion. Consequently, it hampers priority setting for lion conservation, particularly in West and Central Africa. Because the distinct genetic lineages within the African lion are further supported by nuclear data 18,24 and morphological data 67,68 , we suggest recognizing a northern subspecies, including West Africa, Central Africa and North Africa/Asia, and a southern subspecies, including the North East, East/Southern and South West lineages, in line with the proposed revision by Barnett et al. 21 . In the absence of conflicting conclusions based on other genetic markers, the distinct phylogeographic clades within the proposed two subspecies should be managed as Evolutionarily Significant Units (ESUs), sensu Moritz 69 . Data from more nuclear loci, and from sampling locations at the geographical borders of the proposed haplogroup ranges, may provide additional insight, but are not likely to change the main pattern described in this paper.
Our study shows a fine-scale phylogeographic pattern for the lion, with strongly significant support for a basal north-south dichotomy, as is also observed in other African savannah mammals. By analysing samples from a larger range of localities, the phylogenetic position of the Asiatic subspecies was resolved and it was possible to propose ranges and connectivity zones for six major phylogenetic clades: West Africa, Central Africa and North Africa/Asia (North group) and North East, East/Southern and South West (South group). In the context of the presented time estimates, our results contribute to understanding the evolutionary forces that shaped the genetic make-up of several African savannah mammals, and of the African lion in particular.

Materials and Methods
A total of 194 samples from lions of 22 different countries were analysed, including samples previously described in Bertola et al. 19,24 and Barnett et al. 21 (Supplemental Table 1 and Fig. 2). Blood, tissue or scat samples were collected from free-ranging individuals or captive lions with proper documentation of their breeding history, in accordance with relevant guidelines and in full compliance with specific permits (CITES and permits related to national legislation in the countries of origin). Blood and tissue samples were taken by a vet and stored in biobanks as standard procedure after sedating animals for other research purposes. No animals were sedated specifically for this study.
A total of 16 museum specimens, with collection dates ranging from 1831 to 1967, were added to the dataset. Samples from museum specimens were taken from the maxilloturbinal bone, unless another sample was more readily available. For details on sample storage and processing, see Supplemental Information 2, Supplemental Tables 4 and 5.
For all available samples, analyses were performed on three alignments: a region containing cytochrome b, tRNAThr, tRNAPro and the left domain of the control region (cytB + ctrl reg.) (1454 bp, 202 sequences); the complete mitogenome (16,756 bp, excluding RS-2 and RS-3, 23 sequences); and an alignment including all sequence data, where ambiguous nucleotides were added to create sequences of equal length. Bayesian analyses were performed using MrBayes v.3.1.2 70,71 , using a GTR substitution model with gamma-distributed rate variation across sites and a proportion of invariable sites, and a flat Dirichlet distribution for the substitution rate priors and the state frequency priors, as was determined by MrModeltest2 (v.2.3) 72 . The Markov chain Monte Carlo search was continued for 5,000,000 generations, sampling every 100 generations and discarding the first 25% as burnin. ML analyses were done in Garli 73 , using the same settings as used for MrBayes, and support of internal nodes was assessed by 100 bootstrap replications in four independent runs. Branches receiving > 0.95 PP in Bayesian analysis and/or 70 bootstrap support in ML and MP analysis are considered to be significantly supported. A haplotype network was created for cytB + ctrl reg. using the median-joining algorithm in Network 4.6.1.1 (available from www.fluxus-engineering.com) with equal weighting of all characters. In addition, we calculated haplotype diversity, nucleotide diversity and pairwise differences within and between the six main haplogroups (i.e. West Africa, Central Africa, North Africa/Asia, North East Africa, East/Southern Africa, and South West Africa) using Arlequin 74 . Bayesian skyline reconstructions were produced in BEAST v.1.7.5 75 , using the estimated TMRCA for modern lions as derived from earlier runs (i.e. 0.245 Ma, stdev. 0.08).
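The diversity indices named above have simple textbook definitions. The sketch below uses the standard estimators attributed to Nei (haplotype diversity as n/(n-1) times one minus the sum of squared haplotype frequencies, and nucleotide diversity as the mean number of pairwise differences per site). It assumes aligned sequences of equal length without gaps or ambiguity codes and is not a reproduction of Arlequin's implementation; all function names and the toy sequences are hypothetical.

```python
from collections import Counter
from itertools import combinations

def haplotype_diversity(seqs):
    """Nei's haplotype (gene) diversity for a list of aligned sequences."""
    n = len(seqs)
    if n < 2:
        raise ValueError("need at least two sequences")
    counts = Counter(seqs)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1 - sum_sq)

def mean_pairwise_differences(seqs):
    """Average number of differing sites over all pairs of sequences."""
    pairs = list(combinations(seqs, 2))
    if not pairs:
        raise ValueError("need at least two sequences")
    total = sum(sum(x != y for x, y in zip(a, b)) for a, b in pairs)
    return total / len(pairs)

def nucleotide_diversity(seqs):
    """Mean pairwise differences divided by the alignment length."""
    return mean_pairwise_differences(seqs) / len(seqs[0])

# Toy alignment of four 4-bp sequences (invented, not lion data).
toy = ["AAAA", "AAAA", "AAAT", "AATT"]
print(round(haplotype_diversity(toy), 4))
print(round(mean_pairwise_differences(toy), 4))
print(round(nucleotide_diversity(toy), 4))
```

For the toy alignment there are three haplotypes among four sequences and seven differences summed over six pairs, which gives a haplotype diversity of about 0.83 and a nucleotide diversity of about 0.29.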
BEAST v.1.7.5 75 was used to obtain estimated values for the TMRCA to date splits in the lion tree. Five independent runs of 100 million iterations were performed, discarding the first ten percent of each run as burnin, and using the same model as was used for Bayesian analysis and a relaxed molecular clock setting. Fossil evidence for the origin of P. leo (including P. leo spelaea) was used for calibration and set to 0.55 Ma (stdev. 0.025) 29,76-79 , with a lognormal distribution for the calibration prior. Convergence of the runs was assessed in Tracer. Logcombiner, Treeannotator and Figtree (available from http://tree.bio.ed.ac.uk/software/figtree/) were used to visualize the results.
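As a quick check on the chain lengths quoted in the Methods, the following sketch works out how much of each analysis remains after burn-in. It is plain arithmetic on the stated settings; the function name `retained` is hypothetical, any extra sample a program may record at generation zero is ignored, and the BEAST sampling frequency is not given in the text, so for BEAST only the number of post-burn-in iterations is computed.

```python
def retained(total, burnin_percent):
    """Number of items left after discarding the first burnin_percent."""
    if not 0 <= burnin_percent < 100:
        raise ValueError("burnin_percent must be in [0, 100)")
    return total - total * burnin_percent // 100

# MrBayes: 5,000,000 generations, sampled every 100, first 25% discarded.
mrbayes_samples = 5_000_000 // 100
mrbayes_kept = retained(mrbayes_samples, 25)

# BEAST: five runs of 100 million iterations, first 10% of each discarded.
beast_kept_per_run = retained(100_000_000, 10)
beast_kept_total = 5 * beast_kept_per_run

print(mrbayes_samples, mrbayes_kept)
print(beast_kept_per_run, beast_kept_total)
```

Under these assumptions the MrBayes analysis retains 37,500 of 50,000 samples, and the combined BEAST runs contribute 450 million post-burn-in iterations.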