The genetic ancestry of American Creole cattle inferred from uniparental and autosomal genetic markers

Cattle imported from the Iberian Peninsula spread throughout America in the early years of discovery and colonization to originate Creole breeds, which adapted to a wide diversity of environments and later received influences from other origins, including zebu cattle in more recent years. We analyzed uniparental genetic markers and autosomal microsatellites in DNA samples from 114 cattle breeds distributed worldwide, including 40 Creole breeds representing the whole American continent, and samples from the Iberian Peninsula, British islands, Continental Europe, Africa and American zebu. We show that Creole breeds differ considerably from each other, and most have their own identity or group with others from neighboring regions. Results with mtDNA indicate that T1c-lineages are rare in Iberia but common in Africa and are well represented in Creoles from Brazil and Colombia, lending support to a direct African influence on Creoles. This is reinforced by the sharing of a unique Y-haplotype between cattle from Mozambique and Creoles from Argentina. Autosomal microsatellites indicate that Creoles occupy an intermediate position between African and European breeds, and some Creoles show a clear Iberian signature. Our results confirm the mixed ancestry of American Creole cattle and the role that African cattle have played in their development.

Phylogenetic relationships. Geographic and breed distribution of maternal haplogroups are depicted in Fig. 1 and Supplementary Table S1. Overall, the most frequent maternal lineages were the European T3, namely in Iberian (over 86%), British (99%) and Continental European (~98%) cattle, but also in Creole (~71%) and Indicine (~50%) breeds. T and T2 lineages were somewhat residual, while the more distinct Q-haplogroup was found exclusively in Creole and Iberian cattle (less than 3%). Most African cattle belong to the T1 haplogroup (~83%) which is also found within the Creole (~16%) and Iberian (~9%) breed groups. Interestingly, the T1c-haplogroup is equally observed in Creole and African breeds (~8%) while it is mostly residual in Iberian cattle or absent in other European breeds. Note that the Indicine breeds of the Americas sampled in our study essentially carry taurine mitochondrial DNA (including the African T1 and T1c lineages, at frequencies of ~12% and ~36%, respectively) and only one animal of the Guzerat breed had the I haplogroup which is typical of Indian cattle.
In the Factorial Correspondence Analysis obtained with autosomal microsatellite genotyping data, the first and second axes accounted for about 13% and 4% of the total variability, respectively. The two-dimensional plot of breed coordinates defined by the first two major axes is in Fig. 3, excluding the two Cuban Creoles and the Spanish Sayaguesa, as these were outliers in the distribution. The 106 breeds represented spread along the first axis according to their continent of origin, with a remarkable separation between indicine and taurine breeds. The Creole breeds showed no discontinuity relative to Iberian breeds on one side and to African breeds on the other. The African Gabú and Bafata breeds from Guinea-Bissau, the Muturu from Nigeria and the Baladi and Menoufis from Egypt revealed a closer relationship with the Creole breeds, in particular with the Surinam Creole, the Velasquez, Caqueteño and Chino Santandereano from Colombia, the Pantaneiro from Brazil, the Pilcomayo from Paraguay and the Guabalá from Panama. On the other hand, some Creole breeds showed a closer relationship with Iberian breeds, particularly the Creoles from Argentina and Chile, the Romosinuano, Lucerna, Costeño con Cuernos and Blanco Orejinegro from Colombia, the Criollo Lechero Tropical from Mexico, the Limonero from Venezuela and the Pineywoods from the United States. The second axis in the Factorial Correspondence Analysis accounted for nearly 4% of the variance and essentially resulted in the spreading of European breeds along this axis, with a visible separation of Iberian relative to the Continental and British cattle breeds. One interesting exception was the Jersey, which clearly separated from the other British breeds.
Table 2. Number of breeds/animals analyzed and genetic diversity indicators for the various breed groups, inferred from mitochondrial DNA (mtDNA), Y-chromosome (Ychr) and autosomal microsatellite (MS) data. Details on the breeds included in each geographic group are in Table 1. For mitochondrial DNA, the total number of haplotypes and haplotype diversities were estimated for a 700 bp D-loop region, and animals/breeds with incomplete sequence data were only used for haplogroup assignment. Genetic diversity indicators for autosomal microsatellites correspond to expected heterozygosity (He), mean number of alleles/locus (Na) and effective number of alleles/locus (Ne), with standard deviation in parentheses.
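As an illustration of the microsatellite indicators named in the Table 2 note, the following minimal sketch computes Na, He and Ne for a single locus. It assumes the textbook definitions He = 1 - Σp² (without small-sample correction) and Ne = 1/Σp²; the estimators actually used for Table 2 are not stated here, and the function names and allele sizes are hypothetical.

```python
from collections import Counter


def allele_frequencies(alleles):
    """Return a dict mapping each allele to its frequency in the sample."""
    counts = Counter(alleles)
    total = sum(counts.values())
    return {allele: count / total for allele, count in counts.items()}


def diversity_indicators(alleles):
    """Compute Na, He and Ne for one locus from a list of allele copies.

    Assumed definitions (not confirmed by the source text):
      Na = number of distinct alleles
      He = 1 - sum(p_i^2)   (uncorrected expected heterozygosity)
      Ne = 1 / sum(p_i^2)   (effective number of alleles)
    """
    freqs = allele_frequencies(alleles)
    homozygosity = sum(p * p for p in freqs.values())
    return {
        "Na": len(freqs),
        "He": 1.0 - homozygosity,
        "Ne": 1.0 / homozygosity,
    }


# Hypothetical allele sizes (bp) observed at one microsatellite locus
example_locus = [180, 180, 182, 184, 184, 184, 186, 186]
indicators = diversity_indicators(example_locus)
print(indicators)
```

Per-breed or per-group values such as those in Table 2 would then be averages of these per-locus quantities across all loci.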
www.nature.com/scientificreports
The neighbor-joining tree representing DA genetic distances between breeds (Fig. 4) reveals a continental clustering of the breeds evaluated, with Creole breeds placed between the African and European (including Iberian) clusters. The confidence levels of the relationships between breeds were inferred from the bootstrap values (Supplementary Fig. S1), which were generally low for the nodes close to the root but tended to be higher when smaller groups of breeds were evaluated.
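For readers unfamiliar with the DA distance underlying the tree, the sketch below shows one way it can be computed from allele frequencies, assuming the standard definition DA = 1 - (1/r) Σ_loci Σ_alleles sqrt(x·y) for r loci. The function name and the toy frequencies are hypothetical, and this is not the software pipeline used in the study.

```python
import math


def nei_da(pop_x, pop_y):
    """Nei's DA distance between two populations.

    pop_x, pop_y: lists with one dict per locus, each mapping
    allele -> frequency (frequencies at a locus sum to 1).
    """
    if len(pop_x) != len(pop_y) or not pop_x:
        raise ValueError("both populations need the same, non-zero number of loci")
    total = 0.0
    for freqs_x, freqs_y in zip(pop_x, pop_y):
        # Only alleles present in both populations contribute to the sum
        shared = set(freqs_x) & set(freqs_y)
        total += sum(math.sqrt(freqs_x[a] * freqs_y[a]) for a in shared)
    return 1.0 - total / len(pop_x)


# Hypothetical two-locus example
breed_a = [{"180": 0.5, "182": 0.5}, {"95": 1.0}]
breed_b = [{"180": 0.5, "184": 0.5}, {"95": 1.0}]
distance_ab = nei_da(breed_a, breed_b)
print(distance_ab)
```

A matrix of such pairwise distances is what a neighbor-joining algorithm takes as input; bootstrap support is typically obtained by resampling loci with replacement and rebuilding the tree.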
Using autosomal microsatellites, the vast majority of the Iberian breeds clustered together, but a few showed signs of exotic admixture (Ramo Grande, Minhota, Bruna de los Pirineos, Parda de Montaña and Serrana de Teruel) and grouped with the corresponding British or Continental European breeds. Nearly all the British breeds clustered together, and the same happened with Continental European breeds. The Creole breeds essentially clustered according to their geographic origin, with clades broadly corresponding to: (1) North American and Mexican breeds; (2) breeds from Argentina and Uruguay; (3) the majority of Colombian breeds; (4) breeds from Brazil and Panama; (5) breeds from Cuba; (6) one large cluster representing diverse geographic origins, including Chiapas, Ecuador, Paraguay and some Colombian breeds. African breeds had a common root diverging from Creoles, and several clusters could be identified, mostly reflecting the geographic origin of the breeds analyzed. These clusters included: (1) breeds from the West Coast of Africa (Bafata and Gabú from Guinea-Bissau and Muturu from Nigeria); (2) remaining breeds from Nigeria (Kuri, Sokoto Gudali and Red Bororo) and the Pokot and Eastern Shorthorn Zebu from Kenya; (3) Baladi and Menoufis breeds from Egypt; (4) a cluster of breeds from the southern part of Africa, including the Ankole-Watusi, Sanga Tonga from Zambia, Landim from Mozambique and taurine cattle from Angola. The last major branch in the dendrogram grouped all the zebu breeds represented in our study, which showed an important genetic differentiation from each other and diverged from the group of African breeds.
The low bootstrap values observed for large breed-groups could be anticipated, as this is the pattern expected when many populations are analyzed 21, particularly if they represent closely related breeds 9,22-24. Nevertheless, the general clustering of breeds from Nei's genetic distances was consistent with the results obtained with other methodologies such as Factorial Analysis of Correspondence and the Bayesian approach implemented with Structure. When the tree of genetic distances between breed groups is considered (Supplementary Fig. S2)
Model-based clustering/Genetic structure. A Bayesian clustering approach was used to assess breed structure and relationships using autosomal microsatellite data, assuming that the observed genetic diversity results from the genetic contributions of a variable number of ancestral populations (K). Contributions of the Supplementary Fig. S3). When K = 2 was assumed, the European breeds (Spanish, Portuguese, British and Continental) separated from the Indicus group, and most African breeds revealed proximity with the latter group. On the other hand, the American Creoles showed evidence of mixed ancestry from various sources, even though for most Creole breeds the major contribution was from the European group.
When the number of ancestral populations was assessed at K = 7 (i.e., the number of breed groups considered in our data set), the Indicine and African groups separated clearly, even though there were signs of indicine admixture in some African breeds, especially those from Kenya and Nigeria. The British breeds represented a very homogeneous group, with the exception of the Jersey, which was rather different and showed some similarity with a few of the Spanish breeds. The continental breeds mostly clustered together, but a few of them revealed some proximity with British breeds and some had similarity with a few Spanish and Creole breeds. The Portuguese breeds were fairly homogeneous, with the exception of two breeds (Ramo Grande and Minhota) which had clear signs of admixture with Continental European breeds. The Spanish breeds clustered in two groups, the first corresponding essentially to the group of Red breeds which share a common origin, while the second group includes the majority of the southern Spanish breeds and shows some similarity with Continental European breeds. The Creole group is the one showing the most diversified background, with contributions from all the other groups represented in nearly all Creoles. Nevertheless, most Creole breeds share a distinct common ancestry, which spreads across the Americas and is mostly perceptible in Creole breeds from the United States, Mexico, Panama, Colombia, Venezuela, Brazil, Uruguay, Paraguay and Argentina. All Creole breeds displayed a minor relationship with African cattle, particularly noticeable in Caribbean cattle (Senepol and Siboney). On the other hand, an indicine contribution was detectable in many Creoles, especially the Caqueteño and Velasquez breeds from Colombia, the Creole breeds from Cuba and Suriname, and the Mexican Criollo from Chiapas.
Admixture with British breeds was detectable in the Pineywoods from the United States, Lucerna from Colombia, Pampa Chaqueño from Paraguay and the Creoles from Uruguay and Chile. The sharing of ancestry with the Portuguese group was detectable in all Creoles, but more noticeable in the Chihuahua and Nayarit from Mexico, the two Creoles from Panama and the Blanco Orejinegro, Sanmartinero and Costeño con Cuernos from Colombia. The Creoles from the United States, Ecuador, Bolivia, the Criollo Lechero Tropical from Mexico and the majority of the Colombian Creoles shared an influence, which could be either from cattle breeds from southern Spain or from the other Continental European breeds.
The most likely number of ancestral populations, assessed by the method of Evanno et al. 25, was K = 41. When this large number of ancestral populations was evaluated, very heterogeneous results were obtained for the majority of the breeds studied, even though some interesting details could be identified. The indicus breeds were quite homogeneous, with little indication of introgression from other breeds, while the majority of the breeds of the African group did not reveal major signs of admixture with indicus. The African group was clearly subdivided, with one cluster made up by the two breeds from Guinea-Bissau (Bafata and Gabú) and the Muturu from Nigeria, another cluster formed by Ankole-Watusi and the breeds from Kenya and Nigeria, and one last cluster represented by the breeds of Egypt, Angola, Mozambique and Zambia. The Continental European breeds resulted, in general, from the contribution of several ancestral populations, with the exception of the Holstein which essentially represented one ancestral population. In the British group, the Hereford and the Jersey were isolated from the other breeds, while the remaining breeds were grouped in two clusters, one formed by Angus and White Cattle, and the other by Dexter and Shorthorn. The Portuguese breeds had heterogeneous contributions from various ancestral populations, which generally differed from one breed to another. Nevertheless, some breeds that are known to have a common phylogenetic relationship or a close geographic distribution displayed some similarity. For the group of Spanish breeds, a few of them showed a diversified and heterogeneous ancestry, but most breeds clustered independently, possibly reflecting the influence from a specific ancestral population, and this pattern was found both in highly threatened breeds (such as the Palmera and Menorca breeds) as well as in breeds with large census (such as the Retinta and the Lidia).
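The ΔK statistic behind the choice of K can be sketched as follows. This is a minimal illustration, assuming the commonly used form ΔK = |L(K+1) - 2L(K) + L(K-1)| / sd[L(K)] computed from the mean log-likelihood of replicate runs at each K; the input values and function name are hypothetical, and the number of replicates and exact variant used in the study are not given in this section.

```python
from statistics import mean, stdev


def delta_k(lnp_by_k):
    """Compute Evanno-style delta-K values.

    lnp_by_k: dict mapping K -> list of ln P(D) values from replicate runs.
    Returns a dict mapping K -> delta-K for every K that has both
    neighbours (K-1 and K+1) and a non-zero standard deviation.
    """
    means = {k: mean(values) for k, values in lnp_by_k.items()}
    result = {}
    for k in sorted(lnp_by_k):
        if k - 1 not in means or k + 1 not in means:
            continue  # second-order difference is undefined at the ends
        sd = stdev(lnp_by_k[k])
        if sd == 0:
            continue  # avoid division by zero when replicates are identical
        second_diff = abs(means[k + 1] - 2 * means[k] + means[k - 1])
        result[k] = second_diff / sd
    return result


# Hypothetical ln P(D) values from three replicate runs per K
runs = {
    1: [-1000, -1002, -998],
    2: [-900, -901, -899],
    3: [-890, -895, -885],
    4: [-888, -880, -896],
}
dk = delta_k(runs)
best_k = max(dk, key=dk.get)
print(dk, best_k)
```

Because ΔK needs both neighbours, it cannot be evaluated at the smallest or largest K tested, which is one reason the tested range usually extends beyond the expected number of groups.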
For Creole breeds the pattern was generally very distinct from one breed to another, strongly supporting the uniqueness of the various Creole breeds. For example, the Costeño con Cuernos, Limonero, Caracú, Yacumeño, Uruguayo, Argentino, Patagónico, Chileno and Senepol all had strong individuality, with a major contribution of each one's own ancestral population. On the other hand, a few breeds with a close geographic distribution showed common ancestry, and this was the case for the Brazilian cluster (Crioulo Lageano, Curraleiro, Mocho Nacional and Pantaneiro), the Panamanian group (Guabalá and Guaymí) and the North American group (Texas Longhorn and the majority of Mexican breeds). At this high level of K, no clear evidence could be found in American Creoles of admixture with any of the other breed groups evaluated, possibly with the exception of an indicine contribution to a few Creole breeds.

Discussion
We investigated the genetic diversity, uniqueness and population structure of Creole cattle using molecular markers. The results of our combined analysis of uniparental (mitochondrial DNA and Y-chromosome) markers and autosomal microsatellite data are highly consistent in showing the heterogeneous origins of Creole cattle from the Americas. They also support the view that Creole breeds are distinct entities, which calls for in-depth research to better understand their characteristics. Historic admixture is reflected in their extremely high genetic diversity for maternal (H = 0.972; No. Haplotypes = 248), paternal (H = 0.884; No. Haplotypes = 21) and autosomal (He = 0.809; Na = 15.5; Ne = 5.8) estimates. These results are consistent with previous studies in smaller subsets of Creole breeds using classic genetic markers 8,10,15,26-29.
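To make the haplotype diversity values (H) quoted above concrete, the sketch below uses the widely cited unbiased estimator H = n/(n-1) · (1 - Σp²), where n is the number of sampled animals and p the haplotype frequencies. Whether this exact estimator was used for the reported values is an assumption, and the haplotype labels are hypothetical.

```python
from collections import Counter


def haplotype_diversity(haplotypes):
    """Unbiased haplotype diversity, H = n/(n-1) * (1 - sum(p_i^2)).

    haplotypes: list with one haplotype label per sampled animal.
    """
    n = len(haplotypes)
    if n < 2:
        raise ValueError("at least two sampled haplotypes are required")
    counts = Counter(haplotypes)
    sum_sq = sum((count / n) ** 2 for count in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)


# Hypothetical samples: a mixed population and one fixed for a single lineage
mixed = ["hapA", "hapA", "hapB", "hapC"]
fixed = ["hapA", "hapA", "hapA", "hapA"]
h_mixed = haplotype_diversity(mixed)
h_fixed = haplotype_diversity(fixed)
print(h_mixed, h_fixed)
```

Under this definition a population fixed for a single haplotype has H = 0, while a sample in which every animal carries a different haplotype has H = 1.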
The distribution of genetic diversity varies widely among Creole breeds from the different countries. In general, Creole breeds from Mexico (e.g. Cr. Chiapas, Cr. Chihuahua, Cr. Nayarit) and USA (e.g. Florida Cracker) showed high genetic diversity across markers, whereas breeds from the Caribbean region (e.g. Senepol, Guaymí, Guabalá) had lower values. This scenario may reflect the threatened status of some Creole cattle populations, the former due to dilution from intensive crossbreeding and the latter as a consequence of isolation and abandonment. Creole cattle represent an enormous reservoir of genetic diversity for the species, despite the fact that many of these breeds are on the brink of disappearance 30 . There is now an increased interest in maintaining these important animal genetic resources. Within Red Conbiand and the BioBovis consortium, researchers have contributed significantly to increase awareness on this matter and several Creole breeds now have a herdbook managed by producers' associations or are under conservation programs with significant expansion in various countries, as has been recently reported in a survey carried out in the framework of a FAO-CONBIAND agreement 31 .
African cattle also retain high genetic diversity, probably due to less intensive management. In particular, we identified 14 and five unique Y2 and Y3 lineages, respectively. The majority of the novel Y2-diversity was found in the Landim cattle from Mozambique, as well as in the Gabú and Bafatá breeds from Guinea-Bissau, while the Eastern Shorthorn Zebu from Kenya accounted for four of the five new Y3-haplotypes (Supplementary Table S2). Interestingly, the sharing of the Y2-249-158-102-130-149 haplotype between six Creole animals from Argentina (4 Cr. Argentino and 2 Cr. Patagónico) and one animal from the Landim breed from Mozambique, a former Portuguese colony, suggests a direct male influence from Africa in some Creole breeds, particularly in the southern region of South America. Additionally, other studies have shown that mitochondrial DNA sequence variation also provides support for an African maternal influence in Creole cattle of the Americas 8,10,11,32,33, and our results suggest that T1c-lineages, which are very scarce in Iberia, may have been introduced directly from Africa. Specifically, we observed these lineages in cattle from Guinea-Bissau and Angola. If cows from these two countries were the direct sources of the T1c haplogroup detected in American Creoles in our study, this would indicate that cattle may have been taken aboard transatlantic slave ships, since these regions were of major historical importance as departure points of slave trade routes 34. Furthermore, T1-lineages, which have been shown to exist in Iberia at least since Roman times 35,36, could have been introgressed into Creole breeds either by the Iberian founder cattle during the early stages of colonization of the Americas or directly from African animals, or both.
The indicine animals included in our study represent the most common zebu breeds that expanded through the Americas over the 20th century. According to historic information, bulls from these breeds were introduced from India and backcrossed with local Creole cows 37 . The matrilines represented in American indicine cattle are expected to correspond essentially to the female population that was the foundation of this systematic backcrossing system rather than to the matrilines present in India. Thus, it is not surprising that the maternal diversity found in our indicine samples had a very scarce representation of the I-haplogroups 38 typical of zebu cattle from India. Indeed, with the exception of one animal of the I haplogroup, we could only detect the taurine matrilines of African and Iberian origins, confirming the findings of Curi et al. 39 and Ribeiro et al. 40 who have reported that the vast majority of indicine cattle in Brazil carry taurine mitochondrial lineages.
Our results from autosomal microsatellites revealed a transition across continents, with more distant groups corresponding to the indicus and European clusters, and with the African group in an intermediate position. On the other hand, Creole breeds showed their own identity in most cases, but also sometimes showed detectable influences from the three groups above which differed between Creole breeds. This is in agreement with results reported for some Creole breeds in studies where SNP chip arrays were used 17,19,41 .
The indicine group of breeds essentially shared a common ancestry, even though the breeds analyzed differed considerably from each other for the panel of microsatellites studied. The African group of breeds was very diverse, with a breed structure and relationships largely reflecting their geographic distribution. Most African breeds showed some extent of indicine admixture, which was however less pronounced in the breeds from Guinea-Bissau (Bafatá and Gabú) and the Muturu from Nigeria. These breeds from the West Coast of Africa showed a close relationship. They belong to the N'Dama taurine group, which is recognized for its high resistance to trypanosomiasis, thus allowing their maintenance in tsetse-infested areas where other breeds are unable to survive 42. Another cluster included the Ankole-Watusi, Pokot and Eastern Shorthorn Zebu from Kenya, and the Sokoto Gudali and Red Bororo from Nigeria, which present a pronounced indicine admixture, as has been shown previously 42-44. The last group of breeds presents a different identity and occupies the Zambezian region (cattle from Angola, Landim from Mozambique and Sanga Tonga from Zambia) but it also clusters with the two breeds from Egypt (Baladi and Menoufis), possibly reflecting a common ancestry for the two groups. Some of the African breeds studied here have not been genetically characterized in the past (breeds from Angola, Mozambique, Guinea-Bissau, and Egypt), and further studies are needed to better understand their origins and relationships.
Figure 5. Population structure of 109 cattle breeds inferred by using the STRUCTURE software and based on microsatellite data. Each breed is represented by a single vertical bar divided into K colors, where K is the number of assumed ancestral clusters, which is graphically represented for K = 2, 7 and 41. The colored segment shows the breed's estimated membership proportions in a given cluster. Breed numerical codes are as defined in Table 1. Ancestral contributions for other values of K ranging from K = 2 to 40 are shown in Supplementary Fig. 2.
The Creole group was the main focus of our work, and it presented some peculiar features, such that most Creole breeds had their own identity or grouped with a few other Creoles with a nearby geographic distribution. In agreement with previous reports 16, we detected a considerable diversity among the various Creole breeds analyzed, where some Creoles showed important influence from indicus (especially breeds raised in tropical areas such as the Creoles from Cuba and Suriname and some of the Creole breeds from Colombia and Mexico) while other Creoles did not. The results from the Factorial Correspondence Analysis (Fig. 3) show that the Creole breeds occupy the center of the distribution plot, between the African and Iberian breeds. These results point in particular to a possible influence of African cattle on Creole breeds from Panama, Mexico, Colombia and Brazil, with a more likely contribution of cattle originating from Western Africa (Guinea-Bissau, Nigeria) and Northern Africa (Egypt). Other Creole breeds, especially those from Panama and Colombia, revealed signs of Iberian influence. The analyses with the model-based clustering procedures implemented by Structure (Fig. 5) assuming K = 7 confirm that most Creoles essentially have a common identity separate from the other breed groups, even though some Creole breeds reveal limited contributions from the other groups. When many ancestral populations are assumed (K = 41) some Creoles show mixed contributions from various ancestral populations, but most Creole breeds remain uniquely linked to their own single cluster or share a common ancestry with breeds in the same geographical vicinity. This was the case, for example, for the groups of breeds from Brazil, Panama and North America, which formed distinct clusters.
These results strongly support the idea that Creole breeds have their own identity and deserve to be adequately managed and conserved. Our comprehensive sampling of Creole cattle allowed us to clearly infer the influence of African and European founders, confirming observations from previous studies 7,19, but also to better understand how breeding strategies shaped their genetic composition. In some Creole breeds the analysis with microsatellites indicates that there are still signs of an African and Iberian influence, but these signatures are not as strong as when uniparental markers are investigated. In particular, here we could identify complex patterns of male-mediated gene flow through the presence of Y1, Y2 and Y3 lineages in Creole breeds. Our results also confirm that more intensively managed cattle populations are typically fixed for a single patriline 45,46; thus haplotype diversity was null in many British, Continental European and Indicine transboundary commercial breeds, but also in many local breeds from Iberia. Even though Creoles have likely originated mostly from Iberian cattle, with some additional influences from African and British cattle, the small size of the founder populations 2 and a long process of genetic drift and adaptation to the conditions of the New World have led to the divergence of Creoles from their ancestors, resulting in populations which are currently quite distinct in most cases. These results support further analyses at the genome level to infer adaptation/selection to specific environmental and breeding conditions, and additional studies using genomic approaches are warranted, even though biased SNP chips designed for commercial breeds may be inadequate for Creoles.

Conclusions
Our findings combining three types of genetic markers in a broad representation of cattle breeds sampled in various continents, integrated with historical information, indicate that Creole breeds have their own identity and a fingerprint unique to this group. These breeds need to be studied in greater depth to better assess their integration in sustainable rural development. The genetic legacy of Iberian cattle is still represented in Creoles, but other influences could also be detected, even though in most cases Creoles remain well differentiated. The African contribution to the genetic composition of Creoles is clear in our work, and while in some cases this may occur by an indirect path through Iberian breeds, the direct influence of African breeds on Creole cattle is undoubtedly demonstrated by their sharing of unique maternal and paternal lineages. Programs for the genetic management of Creole cattle breeds, aimed at the characterization, conservation and valuation of these unique genetic resources, are urgently needed. With this goal, efforts must be made to overcome the gap existing between the state-of-the-art genomic tools currently available and their application to local breeds, especially in the case of undervalued breeds kept in marginal regions such as Creoles 47.

Methods
Ethics statement. Biological samples were collected during routine veterinary checkups in the framework of official health control programs and with the agreement of breeders.
Sample collection and microsatellite genotyping. We studied a total of 4,658 animals from 114 cattle breeds, including 1,480 Creole from 40 breeds, 1,930 Iberian from 39 breeds, 556 African from 18 breeds, 271 British from 6 breeds, 229 Continental European from 6 breeds, and 192 Indicine from 5 breeds (Table 1). The sampling strategy was designed in the context of the BioBovis Consortium (https://biobovis.jimdo.com/) to cover the wide geographic range of dispersal of Creole cattle. Breeds from other regions were included as well to capture