Evolution of Old World Equus and origin of the zebra-ass clade

Evolution of the genus Equus has been a matter of long debate with a multitude of hypotheses. Currently, there is no consensus on either the taxonomic content nor phylogeny of Equus. Some hypotheses segregate Equus species into three genera, Plesippus, Allohippus and Equus. Also, the evolutionary role of European Pleistocene Equus stenonis in the origin of the zebra-ass clade has been debated. Studies based on skull, mandible and dental morphology suggest an evolutionary relationship between North American Pliocene E. simplicidens and European and African Pleistocene Equus. In this contribution, we assess the validity of the genera Plesippus, Allohippus and Equus by cladistic analysis combined with morphological and morphometrical comparison of cranial anatomy. Our cladistic analysis, based on cranial and postcranial elements (30 taxa, 129 characters), supports the monophyly of Equus, denies the recognition of Plesippus and Allohippus and supports the derivation of Equus grevyi and members of the zebra-ass clade from European stenonine horses. We define the following evolutionary steps directly relevant to the phylogeny of extant zebras and asses: E. simplicidens–E. stenonis–E. koobiforensis–E. grevyi -zebra-ass clade. The North American Pliocene species Equus simplicidens represents the ancestral stock of Old World Pleistocene Equus and the zebra-ass clade. Our phylogenetic results uphold the most recent genomic outputs which indicate an age of 4.0–4.5 Ma for the origin and monophyly of Equus.

The Old World Equus Datum is a widely recognized biochronological event by geochronologists, correlative with the beginning of the Pleistocene, 2.58 Ma  . It is traditionally considered a significant event in the evolution of Plio-Pleistocene Old World mammalian faunas, represented by the immigration of the Pliocene North American Equus simplicidens into Eurasia across the Beringia land bridge [3][4][5][6][8][9][10][11]17,18,[21][22][23][24] . This intercontinental dispersal is correlated with strong paleoclimatic variation documented in the terrestrial and marine records, driven by the beginning of a major glaciation pulse in the northern hemisphere 17,25 (Fig. 1).
In the last century, evolution of the genus Equus was actively debated by biologists and paleontologists alike proposing a multitude of hypotheses [1][2][3][4][5][6][7][8][9][10][11][12] . Although most of the authors consider the North American E. simplicidens as the possible ancestor of the genus Equus [1][2][3][4][5][6]11,18,22 , there is no current consensus on either the taxonomic content nor phylogeny of Equus. In fact, Equus' traditional taxonomy was upended when some investigators proposed segregating the genus into three genera 7 : North American Pliocene Plesippus 13 (type species E. simplicidens), Pleistocene Eurasian and African Allohippus 14 (type species E. stenonis) older than 1 Ma, and previously recognized species of Equus less than 1 Ma as being the sole members of the genus. The segregation into these three genera was based on cranial morphology and proportion 7 . Notably, some studies used ten metric characters to distinguish Plio-Pleistocene species and extant Equus 2 , whilst more recent studies used only a single character (size of the cranium, brain-box) to distinguish among Plesippus, Allohippus and Equus 7 . The morphology of the dentition and postcranial elements was never taken into account. The validity of North American Plesippus and the European Allohippus was supported by a recent morphological qualitative cladistic analysis 12 , whereas another cladistic study supported the hypothesis of E. simplicidens as possible common ancestor for species of the genus Equus 20 . During the last decades, the Chinese species Equus E. qingyangensis was included within the genus Plesippus (P. qingyangensis) 26 , whereas the Chinese species Equus sanmeniensis (Allohippus sanmeniensis) 27 , the European Early Pleistocene species Equus livenzovensis and Equus senezensis (Allohippus livenzovensis and Allohippus senezensis) [27][28][29] , and the African Early Pleistocene Equus koobiforensis (Allohippus koobiforensis) 29 [17][18][19][20]22 and the Equus FAD in Africa 19 . Dinohippus mexicanus localities are referred from recent studies 21,34 . The Old World Equus FAD at the base of the Pleistocene is correlated with the O 18 isotopic global trend, marking a progressive environmental aridity since the beginning of the Pleistocene, within the paleoclimatic pulse recorded in terrestrial and marine strata, related to the initiation of a major glaciation pulse in the northern hemisphere [17][18][19]22,30

Results
Phylogenetic analysis. The cladistic analysis includes 30 Operative Taxonomic Units (OTU, with 4 outgroups and 26 ingroups) and 129 characters and it has produced one most parsimonious tree (Fig. 2) (Tree Length = 398 steps, Consistency Index = 0.472, Retention Index = 0.705; Homoplasy Index = 0.528). The characters have been coded by direct observations on fossil collections combined with other published fossil specimens (see Methods below). The present phylogenetic tree clusters the family Equidae by node 57 with 13 unambiguous synapomorphies (Appendix 1) and, furthermore, the Miocene tridactyl genera Merychippus and Cormohipparion are segregated from the monodactyl genus Pliohippus by 6 unambiguous synapomorphies (Appendix 1). The species referred to the genera Merychippus, Hippidion and Dinohippus are clustered together as dichotomies, with Cormohipparion being sister to Merychippus.
The phylogenetic tree reveals outcomes for the Plio-Pleistocene species from North America and the Old World. Genus Equus is modeled as a single clade with node 52 being supported by 18 unambiguous synapomorphies, and 13 of these have a CI ≥ 0.500. The complete list is reported in Table 1.
The bootstrap tree supports the Equus clade with 99/100 replications (Appendix 1, bootstrap tree and UPGMA tree). The species previously included within the genera Plesippus and Allohippus are not clustered from the Equus clade, identified by the node 52 ( Fig. 2). This evidence is strongly supported by the bootstrap resampling analyses and tree, where the species included in the clade Equus are grouped as polytomies, except for 4 small clades (Appendix 1, bootstrap tree These subclades do not represent other genera in the Equus clade (node 52), but may indicate no relevant morphological difference between these species being scored with similar character states (Table S1). An analogous result was already highlighted in recent research applying Geometrics Morphometrics on cranial elements in extant species 31 .
Notably, E. simplicidens and E. qingyangensis differ only by a single character, the shape of palatine process (slender in the former and flat in the Chinese species), raising a question about the validity of E. qingyangensis. Another small clade including E. stenonis and E. senezensis in the parsimonious tree (node 36, Fig. 2) is not supported by the bootstrap resampling (Appendix 1).
Furthermore, the UPGMA tree based on the qualitative and quantitative characters described in the Table S1 may share new insights on the possible species relationships included in the Equus clade. The UPGMA well clusters the Equus clade from the other fossil genera of North and South America since the node 59 and, remarkably, separate the caballine horses (including E. przewalskii-E. ferus) from all other stenonine species (Appendix, UPGMA, node 58). This last cluster includes the entire fossil species from the New and the Old World and the extant zebras and asses. Noteworthy, the morphometric analyses based on the skull morphology show a similar result. The small clades within Equus evidenced by the most parsimonious tree and the bootstrap resampling are also present in the UPGMA tree (E. simplicidens-E. qingyangensis; Equus sp. Dmanisi-E. oldowayensis (Olorgesaile); E. hemionus-E. kiang; E. przewalskii-E. ferus) thus supporting their morphological similarities. The Early Pleistocene Chinese species E. eisenmannae is grouped with the E. simplicidens-E. qingyangensis clade, whereas, as reported in Fig. 2, E. livenzovensis and E. stenonis are the closest relatives of E. koobiforensis (Appendix 1, UPGMA tree). These relationships are reflected in the morphometric results on skulls, wherein E. koobiforensis is found to be closely related to E. sanmeniensis and E. stenonis (Fig. 3, S. text and Fig. S2). Equus sp. from Dmanisi (Equus aff. E. altidens 35 ) and E. oldowayensis are still closely related 11 .
In our parsimonious tree ( Fig. 2), E. quagga is regarded as a sister species of the E. hemionus-E. kiang clade. This result seems to be in contrast with the most recent molecular phylogenies 32-34 , wherein plain zebras and wild Asian asses are well clustered as distinct clades, indicting a remarkable difference in their genome sequences. Our outcomes may indicate a close morphological similarity in cranial and postcranial elements of the skeleton, which can be the results of multiple evolutionary or ecological factors and which will deserve future considerations and investigations.
Eventually, our analyses allow to support that the Plio-Pleistocene North American, Eurasian and African species can be all grouped within the genus Equus, and no distinction can be recognized for the genera Plesippus and Allohippus.
Morphometric analysis. PCA results are based on total (basal and lateral) selected measurements ( Fig. 3a,b), and basal measurements of the skull (Fig. 3c,d; see S. text for skull measurement references). Considering basal and lateral skull measurements (Fig. 3a), PC1 and PC2 accounts for most of the variance with 71.3% (PC1 = 54.0% and PC2 = 17.3%). The loadings' distribution is shown in Figure S3 and reported in Table S2 Table S3). The Chinese species E. eisenmannae and E. huanghoensis appear to be more closely related to E. simplicidens than E. stenonis. PC1 and PC3 account the 63.4% of the total variance (PC1 = 54.0% and PC3 = 9.4, Fig. 3b; the loadings' distribution is shown in Figure S3b and reported in Table S2). PC1 separates species by M4 in positive values and maximum lengths (M6 and M23) in negative values (more to less elongated), whereas PC2 mostly clusters species by M5 and M31 with negative values (more to less elongated). In this diagram, E. simplicidens, E. stenonis and E. grevyi are closely clustered, overlapping some portions of their morphospaces. Equus koobiforensis plots between E. stenonis, E. sanmeniensis and E. eisenmannae, whereas E. huanghoensis is well separated from the entire sample by its reduced M5 and its elongated M2. The PCA results on the basal skull measurements (Fig. 3c,d) do not include maximum skull length (M6). We have excluded this measurement in order to investigate the evolution of the basal skull morphology. PC1 and PC2 account for most of the variance with 74.7% (PC1 = 43.2% and PC2 = 31.5%, Fig. 3c; the loadings distribution is shown in Figure S3c and it is reported in Table S2). PC1 separates species by M1 (ventral length of the muzzle) and M2 (palatal length), from negative to positive values (more to less elongated), whereas PC2 mostly clusters species by M3 in positive and M4-M5 in negative values. These results are congruent with the previous clustering pattern (Fig. 3a). Equus simplicidens and E. huanghoensis are clustered by their longer M3 length, whereas E. grevyi and E. stenonis show higher values for M4. Nevertheless, E. stenonis overlaps with E. grevyi's morphospace, providing additional support of the evidence shown in the Log10 Ratio diagram (Figure S2b). Equus koobiforensis is placed closer to E. stenonis and extant E. grevyi, whereas the Chinese species E. eisenmannae is the largest horse of the entire sample and E. sanmeniensis is placed between E. eisenmannae and E. stenonis. PC1 and PC3 account the 60.6% of the total variance (PC1 = 43.2% and PC3 = 17.4%, Fig. 3d; the loadings' distribution is shown in Figure S3d and reported in Table S2). PC1 separates species by M4 with positive values and M1 and M3 with negative values (more to less elongated), whereas PC3 clusters species by M2 with positive values and M5 with negative values (more or less elongated). Equus stenonis and E. grevyi overlap extensively in their morphospaces which likewise include E. koobiforensis. Also, the E. simplicidens sample is placed close to E. stenonis, even if separated by the latter by its longer vomerine length (M3), and it includes E. sanmeniensis in its morphospace. Equus eisenmannae is more closely related to E. simplicidens than E. stenonis, supporting observations of the Log10 ratios diagrams ( Figure S2a). Equus huanghoensis still remains separated from the entire sample by its reduced M5 and its elongated M2.

Discussion
Origin and early evolution of the Equus grevyi clade. Our phylogenetic and morphometric analyses, within the systematic position of Equus stenonis, provide novel insights into the phylogenetic relationships of the Old World Equus and the origin of the zebra-ass clade based on paleontological evidence [2][3][4]7,11,12,15,[18][19][20][21][22]36,37 . As reported in the outcomes shown by the morphometric analyses, the Early Pleistocene Chinese species E.  (Fig. 4l). The E. simplicidens skull is characterized as having a longer vomer length and a reduced post vomerine length (Fig. 4a,c), whereas E. stenonis, E. koobiforensis and E. grevyi have a reduced vomer length and a longer post vomerine length (Figs. 4e,g,i,k). In lateral view, E. simplicidens (Fig. 4b,d) has an elongated skull with a linear dorsal outline and a deep incision of the narial notch, whereas the skulls of E. stenonis (Fig. 4f,h) and E. koobiforensis (Fig. 4j) have a concave dorsal skull outline, akin to E. grevyi (Fig. 4l). Furthermore, this skull development could be related to the mandibular profile. Equus simplicidens has the mandibular ramus angled posteriorly, whereas that of E. stenonis is vertically oriented (Fig. 5a-d). There is no mandible associated with E. koobiforensis. Equus grevyi has a mandible shaped more like E. stenonis, with a steep vertical ramus and a large and round posterior angle of the mandible (Fig. 5e-f).
The preorbital fossa (POF) underwent progressive reduction in Equus species related to the increase in cheek tooth crown height 7,[11][12][13][14] . Figure S4 summarizes POF evolution in E. simplicidens, E. eisenmannae, E. stenonis, E. koobiforensis and E. grevyi, in the lateral morphology of the skull by its perpendicular maximum height (M35), by distance between POF and the facial maxillary crest (M36) and by its height of the back of the POF above  (a, b) show the PC1 versus PC2-PC3 outcomes on the skull basal and lateral measurements, whereas (c, d) show the PC1 versus PC2-PC3 results for the skull basal measurements. The analyses have been developed by the Software R using the packages prcomp() and ggplot2(). The original database is given in Table S3, whereas the loadings plots are shown in Figure S3.  Figure S4). Recent research 38,39 cites Equus' distinction as having the greatest crown height of all Equidae. In turn, increased hypsodonty is hypothesized to be an adaptation to more arid environments in the Early Pleistocene (a generalized Neogene trend in ungulates 40 ). Horses became more adapted to grazing during the Pleistocene with a higher degree of hypsodonty and, as a consequence, increased hypsodonty also affected both the development of the lateral shape of the skull and the expression of the POF. Such evidence is provided by the evolution of the preorbital fossae ( Fig. 3 and Figure S4), which is strongly reduced in E. grevyi and E. koobiforensis when compared to E. simplicidens.
Furthermore, there are important and consistent traits of the cheek tooth dentitions in Equus. The lingual margin of the protocone has a shallow depression on its medial aspect in E. simplicidens and E. stenonis, and it is more evident in E. koobiforensis and E. grevyi (S. text and Figure S5a-d; Ch. 55 in Table S1). In the lower cheek tooth row, the typical V-shaped linguaflexid and stenonine metaconid-metastylid morphology is precociously  Table S1). However, the remarkable squared morphology of the lingual margin of the metastylid is found in E. stenonis, E. koobiforensis and extant E. grevyi, and it is present, even if less clear, in E. simplicidens (S. text and Figure S5f-h; Ch. 92 in Table S1).
Evolutionary remarks. The various analyses provided herein have shown that the evolution of the head morphology occurred in the evolutionary steps E. simplicidens-E. stenonis-E. koobiforensis-E. grevyi + zebra-ass clade (Fig. 6) including: (i) the reduction of the vomerine length; (ii) elongation of the post vomerine length; (iii) reduction of the length of the naso-incisival notch; (iv) progressive reduction of the POF, from a large and more developed structure to a reduced and shallow morphology, which is still present in the extant zebras; (v) progressively more vertically oriented mandibular ramus; (vi) a more derived morphology of the lingual margin of the protocone; (vii) the advanced squared shape of the metastylid and persistent V-shaped linguaflexid . Following Azzaroli and Voorhies 5 , we find that North American Pliocene Equus simplicidens represents the likely ancestral stock for the origin of Eurasian stenonine horses and ultimately African E. grevyi, and the zebra-ass clade [1][2][3][4][5][6]9,11,18,41 . Our phylogenetic results support Equus as being a single clade, with Dinohippus as the sister taxon. Our results do not support Plesippus and Allohippus at either the generic or subgeneric ranks. Our phylogenetic outcomes support the most recent genomic outputs 9,33 , which have found evolutionary rates for the Equus most common recent ancestor living 3.6-5.8 Ma. Ancient DNA analyses have shown slower mutation rates in horses than humans 9 implying a minimal date of 4.07 Ma for Equus' most common recent ancestor, proposing an age of 4.0-4.5 Ma for the origin of Equus. The concurrent evidence of our phylogenetic and the genomic results 9 can be correlated with the most recent paleontological findings in Central America 21,36 , which have proved the occurrence of the primitive Equus morphologies 36 at the Hemphillian-Blancan boundary at ca. 4 Ma 19,21,36 correlative with the onset of Pliocene global warming 11,19,36,42 .  Table S1). The complete www.nature.com/scientificreports/ sample of the specimens coded in the cladistic matrix is reported in S. text. The matrix used in this cladistic analysis includes 68 novel characters, combined with the most recent matrices for equids and perissodactyl cladistic analyses 12,20,[43][44][45][46] (Table S1). The characters have been coded by direct observations. 24 characters are ordered (2,5,9,10,11,12,13,14,16,22,23,42,43,52,78,91,92,113,114,116,118,121,123,129) and 105 characters unordered. All characters were equally weighted. The phylogenetic analysis was undertaken using PAUP 4.0β10 47 , under parsimony using Heuristic Search with the TBR (tree bisection reconnection) branch-swapping algorithm, 1000 bootstrap replications with additional random sequence, gaps treated as missing.

Methods
Morphometric analysis. We have undertaken statistical analyses (Log10 ratio diagrams, PCA, and boxplots) on selected skull, mandible and dental morphologies and measurements (S. text) to evaluate our cladistic analysis of Plio-Pleistocene Holarctic and African Equus evolution. We use international equid measurement standards 48,49 . Plots were generated primarily using the R Studio Software v.1.4.103 2020 50 packages ggplot2() v.3.3.3 and prcomp() v.3.6.2, using the function (scale = T). Following previous analytical studies 7 , we have selected skull length measurements, avoiding medio-lateral deformation which would adversely influence our results. The complete database used in the Morphometric analyses is reported in Table S3, including personal and other published available data 53 -56 . www.nature.com/scientificreports/