Introduction

Population genetics has emerged as a powerful tool for unraveling the history of cattle domestication. Unlike mitochondrial (Loftus et al., 1994; Bradley et al., 1996) and autosomal (MacHugh et al., 1997) DNA markers, the Y-chromosome microsatellite is inherited patrilineally and provides evidence of paternal origins. Polymorphism at the Y-chromosome microsatellite can therefore reveal another perspective in the study of cattle domestication and breed development. Cattle Y-chromosome DNA polymorphism studies appeared in the early 1990s (Bradley et al., 1994; Gwakisa et al., 1994; Kemp and Teale, 1994). Surveys of variation in the Y-chromosome DNA to explore differentiation between taurine (Bos taurus) and zebu (B. indicus) breeds quickly followed (Hanotte, 1997; Edwards et al., 2000). To date, only the INRA124 microsatellite polymorphism has been examined for the Y-chromosome in indigenous African cattle breeds (Hanotte et al., 2000). Of the five Y-chromosome microsatellites studied here, three loci (INRA124, INRA189 and BM861) displayed the putative taurine- and zebu-specific alleles, which can be useful indicators of male-mediated gene flow in hybrid populations (Edwards et al., 2000).

The African continent is home to more than 150 different cattle breeds/populations classified into four broad categories of taurine, zebu, sanga and zenga (zebu-sanga) (Rege, 1999; Rege and Tawah, 1999). The modern African cattle breeds represent a unique genetic resource for improving livestock productivity on a global basis (Hanotte et al., 2002). Despite the earliest cattle in Africa being undoubtedly B. taurus in nature (Bradley et al., 1998), waves of immigration of humped B. indicus have profoundly changed the genetic landscape of African cattle (MacHugh et al., 1997; Hanotte et al., 2000, 2002). Zebu cattle are more recent immigrants in Africa ca. 2000–3000 BP (Clutton-Brock, 1989) with their major influence being centered in East Africa, around Ethiopia and neighbouring countries (Rege, 1999; Hanotte et al., 2002). Thus, as one of the African regions with the strongest influence of intercontinental migrations and dense populations of cattle, north Ethiopian cattle populations may have a genetic structure shaped by several introductions of zebu, as well as by introgression of the taurine from the Near-East. It is generally agreed that the first zebus were brought to Ethiopia through Somalia by Semitic peoples from Arabia. Subsequent interbreeding of the zebus with the local taurine longhorns produced the present-day sanga cattle (Rege, 1999). The second introduction of zebu led to the emergence of zenga breeds and their different strains adapted to the diverse ecological environments in the East African highlands. Even though there are no institutionalized schemes for genetic improvement of the breeds, exchange of superior bulls among closely related herdsmen is a common practice (Zerabruk and Vangen, 2005). The classification of breeds into zebu, sanga and zenga is by physical characteristics, geographical location and adaptive attributes (Rege, 1999).

The present study consists of one European Holstein-Friesian reference breed and seven cattle populations from north Ethiopia, representing the four broad categories described by Rege (1999) and by Zerabruk and Vangen (2005) (Table 1). We studied the variation of five Y-chromosome microsatellites, describing the distributions of haplotype diversity within each population. In addition, to better understand population relationships in a context of the traditional broad categories, we investigated patterns of affinity and diversification among the seven north Ethiopian cattle populations. The results may provide a genetic basis for the formulation of programs aimed at the conservation and sustainable use of unique cattle genetic resources in the region.

Table 1 Population information and allele frequencies of five Y-chromosome microsatellites in the seven North Ethiopian and one European Holstein-Friesian cattle populations

Materials and methods

Samples

DNA samples from 130 male cattle belonging to seven populations from north Ethiopia were analysed. Five of the seven populations were definitely allocated to three of Rege's (1999) four broad African cattle categories as follows: the humped zebu (Begait), the zenga (Arado, Fogera) and the sanga (Afar, Raya). In addition, a recent on-farm investigation reported that the Abergelle and Irob cattle are classified as zenga type by the local farmers (Zerabruk and Vangen, 2005). Sample sizes and geographical distributions are given in Table 1 and Supplementary Figure S1. As a representative modern European taurine breed, the Holstein-Friesian was added to the analysed breeds as a reference. In all cases, particular efforts were made, using both pedigree information and the knowledge of local herdsmen, to ensure that animals were unrelated.

Y-chromosome microsatellite typing

Five Y-chromosome microsatellites (INRA124, INRA126, INRA189, BM861 and BYM-1) were described previously by Edwards et al. (2000) and Ward et al. (2001). The structures of the loci are shown in Supplementary Figure S2. They were typed using 20–30 ng of DNA in PCR protocols on an MJ Research PTC 200 Peltier Thermal Cycler (MJ Research Inc., MA, USA) with a hot lid: INRA124 (Vaiman et al., 1994), INRA126 (Vaiman et al., 1994), INRA189 (Kappes et al., 1997) and BM861 (Bishop et al., 1994) were typed according to Edwards et al. (2000), and BYM-1 (Matthews and Reed, 1991) according to Ward et al. (2001). The size characterization of PCR products was carried out on a Pharmacia ALF-express Automated Sequencer for loci INRA124, INRA126, INRA189 and BM861, and on an Amersham MegaBACE 500 DNA Sequencer for locus BYM-1.

Statistical analyses

Allele frequencies for all loci were determined by standard gene counting. In an initial pairwise analysis, allele frequencies at the five loci were compared between all pairs of populations using the Fisher exact test-based, genic-comparison option included in GENEPOP version 1.2 (Raymond and Rousset, 1995). Haplotype assignments were based on typing results. Haplotype diversities and their standard errors were calculated as Nei's (1987) unbiased estimates.
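The gene-counting and diversity estimates described above can be illustrated with a minimal sketch. It assumes the usual form of Nei's (1987) unbiased estimator, h = n/(n−1) × (1 − Σp_i²), where n is the number of chromosomes sampled and p_i is the frequency of the i-th haplotype. The function names and the example sample are hypothetical and are not taken from the study data; the standard error is not computed here.

```python
from collections import Counter


def allele_frequencies(alleles):
    """Estimate frequencies by direct gene counting (one allele per male)."""
    n = len(alleles)
    return {allele: count / n for allele, count in Counter(alleles).items()}


def haplotype_diversity(haplotypes):
    """Nei's unbiased haplotype diversity: n/(n-1) * (1 - sum(p_i^2))."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("at least two chromosomes are required")
    freqs = allele_frequencies(haplotypes)
    sum_sq = sum(p ** 2 for p in freqs.values())
    return n / (n - 1) * (1.0 - sum_sq)


# Hypothetical sample of ten bulls (illustration only, not study data).
sample = ["H1"] * 6 + ["H2"] * 3 + ["H3"] * 1
print(allele_frequencies(sample))
print(round(haplotype_diversity(sample), 3))
```

For this made-up sample the haplotype frequencies are 0.6, 0.3 and 0.1, so the estimator gives (10/9) × (1 − 0.46) = 0.6.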

The software package Arlequin version 2.000 (Schneider et al., 2000) was used to calculate several population genetic parameters, including pairwise genetic distances, analysis of molecular variance (AMOVA) and mismatch distribution. Significance levels of the variance components and Φ statistics were obtained by comparison of the actual values with the distribution of 10 000 values obtained by randomisation. Pairwise RST genetic distances between populations were computed as a linearization of ΦST values, that is, ΦST/(1−ΦST) (Slatkin, 1995). ΦST, the FST analogue, counts the number of mutational steps separating each pair of haplotypes (Michalakis and Excoffier, 1996). The pairwise RST values were represented in two-dimensional space with multidimensional scaling (MDS) in the SPSS version 7.0 software package.
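As a minimal sketch of the linearization step, assuming Slatkin's (1995) transformation takes the form ΦST/(1−ΦST), the conversion of a matrix of pairwise ΦST values can be written as follows. The function names and the example matrix are hypothetical; negative ΦST estimates are passed through untransformed in sign rather than being set to zero, which is an assumption of this sketch and not necessarily what Arlequin does.

```python
def linearize_phi_st(phi_st):
    """Slatkin-type linearized distance: phi_st / (1 - phi_st)."""
    if phi_st >= 1.0:
        raise ValueError("phi_st must be below 1 for the linearization")
    return phi_st / (1.0 - phi_st)


def linearize_matrix(matrix):
    """Apply the linearization to every entry of a pairwise matrix."""
    return [[linearize_phi_st(value) for value in row] for row in matrix]


# Hypothetical symmetric matrix of pairwise values (illustration only).
phi = [
    [0.0, 0.2, 0.5],
    [0.2, 0.0, 0.1],
    [0.5, 0.1, 0.0],
]
for row in linearize_matrix(phi):
    print([round(value, 3) for value in row])
```

The transformation stretches large values more than small ones (0.2 becomes 0.25, whereas 0.5 becomes 1.0), so strongly differentiated pairs are pushed further apart in the distance matrix passed to MDS.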

In AMOVA, two hierarchical levels (i.e., individuals clustered into populations and populations clustered into groups) were considered, as the seven north Ethiopian cattle populations were subdivided into statistical groups according to criteria of either physical characteristics and historical information (Rege, 1999; Zerabruk and Vangen, 2005) or RST genetic distances. Φ statistics were then calculated, representing the haplotype correlations at the various levels of the hierarchical groupings: ΦCT, cluster of the subpopulations relative to total populations; ΦSC, subpopulation relative to cluster of the subpopulations; and ΦST, subpopulation relative to total populations. A median-joining network connecting different haplotypes was constructed using NETWORK 4.1 software (Bandelt et al., 1999). All 16 haplotypes found in this study were included in the analysis. Following Hurles et al. (2002) and Zegura et al. (2004), a reduced median analysis was performed first, and its output was then used as input for the median-joining network. Locus-specific weights were assigned in inverse relation to the within-population variance components of the loci, so that loci with the highest variances were given the lowest weights. Hence, INRA124, INRA126, INRA189, BM861 and BYM-1 were given weights of 10, 9, 7, 10 and 9, respectively.
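The relationship among the three Φ statistics in the AMOVA description above can be sketched from the variance components, assuming the standard definitions of Excoffier et al. (1992): with σa² among groups, σb² among populations within groups and σc² within populations, ΦCT = σa²/σT², ΦSC = σb²/(σb²+σc²) and ΦST = (σa²+σb²)/σT², where σT² is the sum of the three components. The function name and the input values below are hypothetical and are not the estimates reported in Table 3.

```python
def phi_statistics(var_among_groups, var_among_pops, var_within_pops):
    """Compute PhiCT, PhiSC and PhiST from AMOVA variance components."""
    total = var_among_groups + var_among_pops + var_within_pops
    within_groups = var_among_pops + var_within_pops
    if total <= 0 or within_groups <= 0:
        raise ValueError("variance totals must be positive")
    return {
        "phi_ct": var_among_groups / total,
        "phi_sc": var_among_pops / within_groups,
        "phi_st": (var_among_groups + var_among_pops) / total,
    }


# Hypothetical variance components (illustration only).
phi = phi_statistics(0.05, 0.03, 0.92)
print({name: round(value, 4) for name, value in phi.items()})

# The three statistics are linked: (1 - PhiST) = (1 - PhiSC) * (1 - PhiCT).
lhs = 1 - phi["phi_st"]
rhs = (1 - phi["phi_sc"]) * (1 - phi["phi_ct"])
print(abs(lhs - rhs) < 1e-12)
```

Because the among-group component is estimated as a covariance, it can be slightly negative, in which case ΦCT is negative as well; such values are usually read as an absence of among-group structure.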

Results

Frequencies of the Y-chromosome microsatellite alleles in each population are given in Table 1. The indicine alleles predominate, comprising 99.2% (INRA124/22), 89.2% (INRA189/17) and 99.2% (BM861/15) of the observed alleles at their respective loci. Among the seven north Ethiopian cattle populations, the three taurine alleles (INRA124/23, INRA189/22 and BM861/16) were observed in only one Arado bull. BYM-1 has a unimodal distribution, with one frequent allele (BYM-1/20) and the less frequent alleles differing from it by a single repeat. INRA126, by contrast, has a bimodal allele-frequency distribution. The genetic differences among north Ethiopian cattle are reflected in the 21 possible pairwise allele-frequency comparisons between the populations. Of the five Y-chromosome loci, INRA189 shows highly significant differences (P<0.01) in two comparisons and INRA126 shows a significant difference (0.01<P<0.05) in one comparison, whereas for INRA124, BM861 and BYM-1 no comparisons are statistically significant. In total, there are seven statistically significant comparisons at a level of P<0.05 at all five loci, six of which arise from the differences between Abergelle and other breeds.

A total of 11 indicine and one taurine haplotypes, which are determined by the five Y-chromosome microsatellites, are observed in north Ethiopian bulls. Haplotype frequencies in each population are given in Table 2. Haplotype diversity varies from 0.527 (±0.058) in Irob to 0.795 (±0.036) in Abergelle. With the exception of Abergelle (0.795), the haplotype diversity values of north Ethiopian cattle populations (0.527–0.636) are lower than that of the European Holstein-Friesian (0.645) (Table 2). The most common haplotypes (H1 and H2) are widespread among north Ethiopian cattle populations. On the other hand, they represent a relatively low proportion in Abergelle cattle (50%), whereas their joint proportions are higher (85.0–94.7%) in all the other populations in the region. Haplotypes H3 and H4 are scattered such that haplotype H4 is shared by Arado, Abergelle and Fogera, whereas haplotype H3 is shared by Arado, Abergelle, Afar and Raya (Supplementary Figure S1).

Table 2 Haplotype (INRA124-INRA189-BM861-INRA126-BYM-1) frequency and diversity (±standard errors) in the seven North Ethiopian and one European Holstein-Friesian cattle populations

Frequencies of Y-microsatellite haplotypes and the molecular differences were used to compute population genetic distances as ΦST (Supplementary Table S1). The highest interpopulation variance is observed between European Holstein-Friesian and African populations, as would be expected from their history. Among the north Ethiopian cattle, the Abergelle can be clearly differentiated from all the other populations. Pairwise values of ΦST show that, in some cases, neighbouring populations (e.g. Irob and Afar) are significantly different, whereas, in other cases, geographically distant populations (e.g. Irob and Fogera) have nonsignificant pairwise ΦST values. On the basis of the linearized ΦST distances, a picture of the relationships between populations was obtained by multidimensional scaling (MDS) analysis (Figure 1). The low stress value (0.136) indicates a good fit between the two-dimensional plot and the source data (pairwise values of ΦST) (Zerjal et al., 2002). As expected, the Holstein-Friesian population appears fairly separated from all the African cattle, evidencing the sharp dichotomy between the Y-chromosome pools of taurine and zebu. The shared haplotypes H3 and H4 in Arado and Abergelle contribute to their closeness, although the breeds are somewhat separated. The close grouping of the other five African cattle populations can be readily understood from the distribution and frequencies of haplotypes H1 and H2 (Table 1, Supplementary Figure S1).

Figure 1

Genetic relationships between populations based on multidimensional scaling (MDS) and a matrix of the pairwise RST genetic distances. Symbols used to identify the four broad categories are as follows: ♦, taurine; ▪, zebu; , sanga; , zenga.

The level of population structure was further assessed by estimating various Φ statistics by means of AMOVA (Excoffier et al., 1992). The overall ΦST value calculated for the entire African sample, comprising the seven populations without grouping, is 0.040, indicating that only a small proportion of the overall Y-chromosome variation results from interpopulation differences (Table 3). When the hierarchical approach is taken, populations are divided either into the three categorical groups described by Rege (1999) and Zerabruk and Vangen (2005) or into three clusters based on the above MDS analysis. In the first case, no variation among groups is detected (ΦCT=−0.019, P=0.011). However, the ΦST value for zenga indicates the existence of an internal population structure within the group (zenga, ΦST=0.054, P<0.001). In the second grouping, by MDS clustering, the ΦCT is much higher than that obtained for traditional groups and is statistically significant (ΦCT=0.056, P<0.001). Thus, it would appear that, in north Ethiopian cattle populations, traditional classification mainly stemming from physical characteristics is not a reliable predictor of paternal lineages.

Table 3 Analysis of molecular variance (AMOVA)

Mismatch distributions for Y-chromosome microsatellites are shown in Figure 2. The pooled north Ethiopian cattle sample shows a multimodal mismatch distribution. The distribution has two substantial peaks, one at 0 and the other at 2 differences, a pattern that is plausible in the context of a range expansion. The shape reflects the fact that in the north Ethiopian cattle populations most Y-chromosomes belong to two haplotypes, whereas other haplotypes occur at lower frequencies. The peak at 0 differences corresponds to comparisons between individuals that share the same alleles, that is, the same numbers of repeat units, and the second peak, located at 2 differences, represents the two mutational steps separating the most frequent haplotypes. The additional peaks are at 8 and 10 differences, resulting from the presence of the divergent taurine haplotype H12, which differs from the most frequent indicine haplotypes H1 and H2 by eight mutational steps.
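How two dominant haplotypes produce peaks at 0 and 2 differences can be seen in a minimal sketch. It assumes that the difference between two haplotypes is counted as the sum of absolute repeat-number differences across loci (one step per repeat unit); the text does not state the exact counting scheme, so this is an assumption. The haplotypes in the example are hypothetical tuples of repeat numbers and are not the haplotype definitions of Table 2.

```python
from collections import Counter
from itertools import combinations


def stepwise_distance(hap_a, hap_b):
    """Sum of absolute repeat-number differences across loci (assumed metric)."""
    return sum(abs(a - b) for a, b in zip(hap_a, hap_b))


def mismatch_distribution(haplotypes):
    """Relative frequency of each pairwise difference over all pairs."""
    counts = Counter(
        stepwise_distance(a, b) for a, b in combinations(haplotypes, 2)
    )
    total = sum(counts.values())
    return {diff: counts[diff] / total for diff in sorted(counts)}


# Hypothetical sample: two haplotypes differing by two repeats at one locus.
major_a = (22, 17, 15, 9, 20)
major_b = (22, 17, 15, 11, 20)
sample = [major_a] * 3 + [major_b] * 2
print(mismatch_distribution(sample))
```

In this made-up sample, four of the ten pairs compare identical chromosomes (0 differences) and six compare the two haplotypes (2 differences); adding a single highly divergent chromosome would contribute small extra peaks at larger differences.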

Figure 2

Mismatch distributions for five Y-chromosome microsatellites in the total seven north Ethiopian cattle populations.

Figure 3 displays a median-joining network for the 16 haplotypes detected in the seven north Ethiopian cattle populations and the European Holstein-Friesian. This network exhibits two major distinct clusters (I, II) of closely related haplotypes, showing the clear grouping of taurine and indicine haplotypes. Cluster I consists exclusively of taurine haplotypes. The taurine haplotypes differ from the indicine haplotypes by more than six mutational steps, and the connection between the two clusters is represented not by an observed haplotype but by a median vector, reflecting a marked divergence between them. The most striking feature of cluster II is the clear differentiation between those haplotypes with a motif INRA126/9 (cluster IIa) and the haplotypes with motifs INRA126/11 and INRA126/12 (cluster IIb). Cluster IIa and cluster IIb are connected by a haplotype with INRA126/10. Cluster II is, therefore, characterized by subclusters based on the repeats within locus INRA126. On the other hand, the similarity of the African indicine haplotypes is emphasized. Ten out of 12 indicine haplotypes are connected to the large central nodes (haplotypes H1 and H2) by a single mutation. These nodes may represent the ancestral indicine Y-chromosomes in the African cattle populations.

Figure 3

A reduced median-joining network of Y-chromosome microsatellite haplotypes in the seven north Ethiopian and one European Holstein-Friesian cattle populations.

Discussion

This study aimed at understanding the genetic landscape of the Y-chromosome in north Ethiopian cattle populations and identifying the underlying domestication events that have led to the current distribution. The results also provide information about the genetic diversity within populations and the genetic affinities among populations from the paternal perspective.

The indicine Y-chromosome predominates in the studied north Ethiopian breeds. Hanotte et al. (2000) and Rege and Bester (1998) concluded that East Africa is the cradle of the largest number of African zebu populations. The taurine alleles are very rare in north Ethiopian cattle and were only found in one Arado bull, which is most likely a result of recent crossbreeding or incomplete introgression of zebu patrilines. The scarcity of the taurine Y-chromosome in north Ethiopia confirms the idea of its nearly total elimination due to male-mediated zebu influxes and expansions in the region (Bradley et al., 1994, 1996; Hanotte et al., 2000).

Y-chromosomal diversity was generally low in north Ethiopian cattle: with one exception (Abergelle), the haplotype diversity in north Ethiopian cattle is lower than that in European Holstein-Friesian. The reduced paternal haplotype diversity observed could be explained, on the one hand, by the introduction of Asian zebu with a limited number of bulls; on the other hand, the low haplotype diversities might also be due to the traditional breeding practice of sharing very few superior bulls. In addition, East African cattle, especially those distributed in semi-desert (Raya and Afar) and highland (Begait, Arado, and Irob) areas, have been regularly subjected to recurrent drought and epidemics, and may have gone through bottlenecks, which can cause losses also in paternal genetic variability. A reduced paternal genetic structure is supported by the facts that: (1) there is low Y-chromosome diversity, (2) only two closely related haplotypes encompass a large proportion (50.0% in Abergelle and 85.0–94.7% in others) of north Ethiopian cattle Y-chromosomes and (3) all other haplotypes are related to the two major haplotypes. Higher haplotype diversity is evident in Abergelle, where as many as three population-specific haplotypes are present. The Abergelle cattle are distinct in showing high frequencies of both haplotype H3, which is also found in Arado, Raya and Afar, and haplotype H4, which they share with Arado and Fogera. According to the recent field survey of Zerabruk and Vangen (2005), the Abergelle cattle are recognised by farmers to have strong adaptive advantages to the hotter and drier lowlands and the ability to cope with feed shortages and ticks during the long dry periods. All these features are favourably rated by the farmers. We speculate that milder bottleneck events affecting this population in lowlands might have contributed to the current situation.

The inferred pattern of interpopulation genetic affinities is illustrated by Φ statistics (Supplementary Table S1) and MDS analysis (Figure 1). As expected, the European Holstein-Friesian represents an obvious outlier with respect to north Ethiopian cattle, which is indicative of a sharp taurine-indicine dichotomy. As reflected in the haplotype frequency distribution (Supplementary Figure S1), a striking finding of MDS analysis is that Y-chromosomes of Abergelle and Arado are distinct from those of the other north Ethiopian cattle populations. An explanation for the clustering of Begait, Fogera, Raya, Afar and Irob would be their common paternal roots. The bulls of Abergelle cattle exhibit a number of characteristics, such as a fine skin, variable coat colour (black, black with white spots, chestnut, grey and light red), a small and not well-developed hump, and an especially small dewlap and navel flap, as opposed to the typical features of the zebu breeds (Zerabruk and Vangen, 2005). These make Abergelle markedly distinct from other East African cattle populations. Genetic analyses on autosomal markers also showed a differentiation of Abergelle from others (Zerabruk et al., personal communication). The closeness between Abergelle and Arado might be explained by their geographical proximity.

Genetic differences among cattle populations can be correlated with traditional classification based on physical characteristics (Li et al., 2005). However, the ΦCT estimate for traditional groups is negative (−0.019±0.011; Table 3). We would argue that traditional classification criteria alone cannot explain the substructure of Y-chromosome variation in north Ethiopian cattle populations. It has also been suggested that the topography, which contributes much to the wide variation in climate, soil, natural vegetation and settlement pattern of domestic species, has strongly influenced the distribution of cattle regionally and continentally (Kantanen et al., 2000; Hanotte et al., 2002; Li et al., 2002). The topography of the region is best described as a complex blend of highland (Arado), rugged terrain (Irob), lowland (Abergelle and Begait), steppe and semi-desert (Afar and Raya), and plain (Fogera). There seems to be no evidence for a correspondence between the inferred pattern of interpopulation genetic relationships and the topographically differentiated groups. However, as illustrated by Φ statistics (Supplementary Table S1), the population structure identified by the MDS analysis seems to reveal the homogeneous groups better. It is interesting that the MDS reveals a special phylogeographic structure: the peripherally located populations fall into one group and the populations in between (Abergelle and Arado) have another genetic affinity, although they are also different from each other.

The network of haplotypes is shown in Figure 3. The effectiveness of the weighting is apparent in that the putative fast-evolving locus INRA126 is represented by more than eight single-repeat links, whereas all the other loci are represented by one to two links. Three clusters have been identified: one far-separated taurine cluster and two indicine clusters that are linked to each other through a single haplotype. The Y-chromosome haplotype pattern supports the hypothesis that massive founding migrations of Asian zebu bulls carried haplotypes H1 and H2 into East Africa. This is reflected in the structure of haplotypes (Figure 3) in that there are only two mutation steps at locus INRA126 between the two founding indicine haplotypes, and the haplotypes and their derivatives are clustered by INRA126 mutations. However, the scenario outlined here should be considered with caution. One potential caveat is that the fast-evolving locus INRA126 might be located in the pseudoautosomal region of the Y-chromosome, which could explain its diversity. Moreover, it would be desirable for more markers and additional breeds, including cattle from Arabian, Pakistani and Indian regions, to be studied to shed more light on the original source and migratory patterns of Asian zebu into East Africa.