Introduction

The concept of cattle breeds, rather than local types, is said to have originated in Britain under the influence of Robert Bakewell in the 18th century (Porter, 1991). It was during that period that intensive culling and inbreeding became widespread in order to achieve specific breeding goals, and this coincided with the shift from cattle as draught animals to beef-producing animals. As a result of Bakewell's influence, British cattle breeds are strikingly distinct from each other at the phenotypic level and reflect a long history of intensive breeding. There are approximately 30 distinct cattle breeds native to the British Isles. These include breeds selected for intensive dairy production, breeds selected for beef production and breeds that remain relatively unselected or selected primarily for breed phenotype (eg coat colour or pattern). Probably more than that of any other region, British cattle breeding has had a global influence, and many British breeds have worldwide distributions.

Characterisation of closely related populations relies on multiple, highly informative markers. Biochemical markers such as blood group polymorphisms have previously been used to discriminate between European cattle breeds (Kidd and Pirchner 1971; Kidd et al, 1980; Grosclaude et al, 1990; Blott et al, 1998). Microsatellite loci are more variable than the biochemical markers and thus are potentially more powerful for distinguishing between closely related groups, such as human populations and livestock breeds (Bowcock et al, 1994; MacHugh et al, 1994). Recently, a number of studies of livestock breeds have used microsatellite markers to describe the relationships between breeds and, where possible, the history of breeds (Ciampolini et al, 1995; Moazami-Goudarzi et al, 1997; MacHugh et al, 1994, 1998; Kantanen et al, 2000; Arranz et al, 2001; Bjørnstad and Røed 2001; Beja-Pereira et al, 2003).

The goal of this study was to use microsatellite markers to characterise diversity levels within, and relationships between, a number of British cattle breeds, most of which have not been characterised previously. These breeds included some primarily used for dairy production, some used for beef production, one reared extensively and one that is primarily a show breed.

Methods

Samples

Semen and blood samples were collected from eight cattle breeds (Table 1) in line with MoDAD recommendations (FAO, 1998). Semen samples had been previously stored for use in artificial insemination. Blood sampling was coordinated with owners and veterinary practitioners carrying out routine disease testing. Samples were chosen at random from the UK breeding pool of each breed, while avoiding parent–offspring or sib pairs where possible. Owing to the significant use of artificial insemination in some breeds, many of the samples were not UK-bred animals. Comments about specific breed samples are given in Table 1.

Table 1 Samples in study

Laboratory techniques

DNA was extracted from samples using standard methods. The 30 microsatellite markers, selected by the EC-funded European cattle diversity project (ResGen; http://www.androclus.vet.uu.nl/resgen/default.htm), were genotyped using an ABI 373 sequencer. Details of primers and original references can be found at http://www.projects.roslin.ac.uk/cdiv/markers.html. These markers are distributed across the genome, covering 21 of the autosomes.

Analysis of breed diversity and relationships

Two genetic distances were used for the comparison of breeds: Nei's standard genetic distance (Nei, 1987) and the negative logarithm of the proportion of shared alleles (psa) (Bowcock et al, 1994) as implemented by Rosenberg et al (2001), both based on allele frequencies. Phylogenetic trees were constructed using the neighbour-joining (NJ) algorithm (Saitou and Nei 1987) implemented in Phylip (http://evolution.genetics.washington.edu/phylip). Resampling across loci was used to gauge support for relationships using 1000 bootstrapped data sets. Expected heterozygosity was calculated as 1 minus the sum of the squared allele frequencies at a locus. Alleles found only in a single breed were considered ‘private’ alleles (Calafell et al, 1998).
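The quantities described above can be illustrated with a minimal Python sketch. It assumes allele frequencies are held as one dictionary per locus; the function names, the data layout and the example frequencies are hypothetical and are not taken from the study. The frequency-based form of psa used here (the sum over alleles of the smaller of the two frequencies, averaged over loci) is one common formulation and is an assumption, not necessarily the exact implementation of Rosenberg et al (2001).

```python
import math

def expected_heterozygosity(freqs):
    """1 minus the sum of squared allele frequencies at one locus."""
    return 1.0 - sum(p * p for p in freqs.values())

def nei_standard_distance(pop_x, pop_y):
    """Nei's standard distance, D = -ln(Jxy / sqrt(Jx * Jy)).

    pop_x, pop_y: lists of per-locus {allele: frequency} dicts (same loci, same order).
    Sums over loci are used; the per-locus averaging cancels in the ratio.
    """
    jx = jy = jxy = 0.0
    for fx, fy in zip(pop_x, pop_y):
        alleles = set(fx) | set(fy)
        jx += sum(fx.get(a, 0.0) ** 2 for a in alleles)
        jy += sum(fy.get(a, 0.0) ** 2 for a in alleles)
        jxy += sum(fx.get(a, 0.0) * fy.get(a, 0.0) for a in alleles)
    return -math.log(jxy / math.sqrt(jx * jy))

def psa_distance(pop_x, pop_y):
    """Negative log of the proportion of shared alleles (assumed frequency-based form).

    Undefined (math domain error) if the two populations share no alleles at all.
    """
    shared = 0.0
    for fx, fy in zip(pop_x, pop_y):
        shared += sum(min(fx.get(a, 0.0), fy.get(a, 0.0)) for a in set(fx) | set(fy))
    return -math.log(shared / len(pop_x))

def private_alleles(breeds):
    """Count alleles present in exactly one breed.

    breeds: {breed name: list of per-locus frequency dicts}.
    """
    counts = {}
    for name, loci in breeds.items():
        n = 0
        for i, freqs in enumerate(loci):
            for allele, p in freqs.items():
                absent_elsewhere = all(
                    other[i].get(allele, 0.0) == 0
                    for other_name, other in breeds.items()
                    if other_name != name
                )
                if p > 0 and absent_elsewhere:
                    n += 1
        counts[name] = n
    return counts

# Hypothetical single-locus example with two breeds
example = {
    "X": [{"A": 0.5, "B": 0.5}],
    "Y": [{"A": 0.5, "C": 0.5}],
}
```

Calling `nei_standard_distance(example["X"], example["Y"])` or `psa_distance(example["X"], example["Y"])` on this toy input gives a distance between the two hypothetical breeds, and `private_alleles(example)` counts the alleles unique to each.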

Analysis of population structure

We used structure, a model-based clustering program (Pritchard et al, 2000; http://pritch.bsd.uchicago.edu) that implements a Markov chain Monte Carlo methodology, to assess the presence of breed structure and assign individuals to breeds, using the ‘correlated allele frequency’ version of the program (Falush et al, 2003). We first performed exploratory runs varying the length of the Markov chain to ascertain the length needed for accurate estimation of the parameters, assuming that the data set could be represented by K separate genetic clusters (K=2,…,11). We then compared the likelihoods for the different values of K. Clustering solutions with the highest likelihood were obtained for values of K=8, 9 or 10. To choose the best value of K, we then ran 20 independent cases, each for 50 000 steps following a burn-in period of 10 000 steps, for each of K=8, 9 and 10. To quantify which K gave the highest likelihood solutions, we used Wilcoxon's two-sample test to compare the likelihoods for K=8 vs K=9, K=8 vs K=10 and K=9 vs K=10 (as per Rosenberg et al, 2001). As K=8 and 9 (see Results) gave the highest likelihoods, we also ran 20 additional cases, each for 500 000 steps following a burn-in period of 10 000 steps, for each of K=8 and 9 and again compared the likelihoods using Wilcoxon's two-sample test to confirm results from the 50 000-step analysis. FK values (comparable to FST) for each cluster were also estimated using structure (Falush et al, 2003).
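The comparison of run likelihoods between two values of K can be sketched as a two-sample rank-sum test. The following is a minimal pure-Python illustration that uses the normal approximation and average ranks for ties (without a tie correction to the variance); it is not the software used in the study, and the log-likelihood values are hypothetical.

```python
import math

def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum test using the normal approximation.

    Returns (W, z, p), where W is the sum of the ranks of sample x in the pooled data.
    """
    pooled = sorted((v, g) for g, sample in enumerate((x, y)) for v in sample)
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        # Find the run of tied values starting at i and give each the average rank
        j = i
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1
    w = sum(r for r, (_, g) in zip(ranks, pooled) if g == 0)
    n1, n2 = len(x), len(y)
    mean_w = n1 * (n1 + n2 + 1) / 2.0
    sd_w = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (w - mean_w) / sd_w
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided tail probability
    return w, z, p

# Hypothetical log-likelihoods from independent runs at two values of K
loglik_k_first = [-10.0, -9.0, -8.0]
loglik_k_second = [-20.0, -19.0, -18.0]
w, z, p = rank_sum_test(loglik_k_first, loglik_k_second)
```

With 20 runs per value of K, as in the design described above, the normal approximation is more reasonable than in this three-run toy example.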

Analysis of molecular variance (AMOVA) was implemented within Arlequin (Excoffier et al, 1992; http://lgb.unige.ch/arlequin/software/2.000/manual/Arlequin.pdf) using methods developed by Cockerham and Weir (see Weir, 1996, and references therein). This multilocus approach breaks down the overall genotypic variation into a within-breed component and a between-breed component. An overall FST is calculated as the proportion of the total variation accounted for by breed.
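As a rough illustration of what "the proportion of the total variation accounted for by breed" means, the sketch below computes Nei's GST for a single locus from allele frequencies. This is a simpler, swapped-in estimator and not the Cockerham and Weir variance-component approach implemented in Arlequin; breeds are weighted equally, and the function names and example inputs are hypothetical.

```python
def expected_het(freqs):
    """Expected heterozygosity: 1 minus the sum of squared allele frequencies."""
    return 1.0 - sum(p * p for p in freqs.values())

def gst_single_locus(breed_freqs):
    """Nei's GST = (HT - HS) / HT for one locus.

    breed_freqs: list of {allele: frequency} dicts, one per breed.
    Undefined (division by zero) if the locus is monomorphic across all breeds.
    """
    n = len(breed_freqs)
    # HS: mean within-breed expected heterozygosity
    hs = sum(expected_het(f) for f in breed_freqs) / n
    # HT: expected heterozygosity of the pooled (mean) allele frequencies
    alleles = set().union(*breed_freqs)
    mean_freqs = {a: sum(f.get(a, 0.0) for f in breed_freqs) / n for a in alleles}
    ht = expected_het(mean_freqs)
    return (ht - hs) / ht

# Hypothetical extremes: completely differentiated vs identical breeds
fully_differentiated = gst_single_locus([{"A": 1.0}, {"B": 1.0}])
identical = gst_single_locus([{"A": 0.5, "B": 0.5}, {"A": 0.5, "B": 0.5}])
```

The two toy inputs bracket the possible range: breeds fixed for different alleles give the maximum value, and breeds with identical frequencies give zero.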

Breed integrity

Related to breed assignment, one can ask how well defined breeds (or other populations) are genetically. We calculated ‘breed integrity’ by comparing the proportion of shared alleles (psa) between randomly chosen individuals across breeds and psa between individuals of the same breed. Breed integrity should be proportional to the probability that randomly chosen individuals from the same breed are more similar than randomly chosen individuals across breeds. For each individual in the data set, psa-same-breed and psa-different-breed were computed 10 000 times and an overall probability of (psa-same-breed < psa-random-individual) was calculated. From these probabilities, breed integrity was calculated as

The randomly chosen individual was selected from a different breed than the base individual and the individual from the same breed was distinct from the base individual. This methodology reflects knowledge one has of breed identification of samples, as was the case for this study.
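The resampling step described above can be sketched as follows. This is a minimal illustration under stated assumptions: individuals are represented as lists of diploid genotypes (one pair of alleles per locus), psa between two individuals is taken as the number of shared alleles summed over loci divided by twice the number of loci, and the function names and toy genotypes are hypothetical. The sketch estimates only the probability that psa-same-breed is lower than psa for a random individual from another breed; the conversion of that probability into a breed integrity value is not reproduced here.

```python
import random
from collections import Counter

def psa(ind_a, ind_b):
    """Proportion of shared alleles between two diploid multilocus genotypes."""
    shared = 0
    for (a1, a2), (b1, b2) in zip(ind_a, ind_b):
        # Multiset intersection counts alleles shared at this locus (0, 1 or 2)
        shared += sum((Counter((a1, a2)) & Counter((b1, b2))).values())
    return shared / (2.0 * len(ind_a))

def prob_same_breed_lower(base, same_breed, other_breeds, draws=10_000, rng=None):
    """Estimate P(psa-same-breed < psa-random-individual) for one base individual.

    same_breed: individuals from the base individual's breed, excluding the base itself.
    other_breeds: individuals from all other breeds.
    """
    rng = rng or random.Random(0)
    lower = 0
    for _ in range(draws):
        mate = rng.choice(same_breed)
        stranger = rng.choice(other_breeds)
        if psa(base, mate) < psa(base, stranger):
            lower += 1
    return lower / draws

# Hypothetical two-locus genotypes
base = [(1, 1), (2, 2)]
same = [[(1, 1), (2, 2)], [(1, 3), (2, 2)]]
others = [[(3, 3), (4, 4)], [(3, 4), (4, 4)]]
p_lower = prob_same_breed_lower(base, same, others, draws=1000)
```

Averaging such probabilities over all individuals of a breed would give the breed-level quantity from which an integrity value could then be derived.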

Results

Comparison of breeds

Distance-based phylogenetic analysis was used to describe the relationships between breeds. Nei's genetic distances and negative log of psa values are shown in Table 2. Similar phylogenetic relationships were seen for the two distance measures (Figure 1a and b). There were three clear clusters: Ayrshire–Friesian, Hereford–Highland and Dexter–Jersey. Aberdeen Angus appeared to be most closely related to the Friesian–Ayrshire group. The relationship of Guernsey to the other breeds was not consistent between the two distance measures, although in both cases the breed did not cluster with the Hereford–Highland pair. The lengths of branches leading to the individual breeds were not substantially different from each other, although those of Highland, Jersey and Guernsey were slightly longer than the other breeds.

Table 2 Distance measures between British cattle breeds
Figure 1

Phylogenetic relationships between eight British cattle breeds. Genetic distances were calculated from allele frequency data of 30 microsatellite markers using (a) Nei's standard genetic distance (Nei, 1987) and (b) negative log of proportion of shared alleles (psa) values (Bowcock et al, 1994; Rosenberg et al, 2001). Phylogenetic relationships were reconstructed using NJ (Saitou and Nei, 1987) with 1000 bootstraps. Bootstrap values greater than 50% are shown in the figure.

Breed characteristics

The average expected heterozygosity and observed heterozygosity values are shown in Table 3. The values were highly consistent across breeds with ranges of 0.56–0.68 (Hexp) and 0.56–0.67 (Hobs). Highland had the lowest average heterozygosity and Ayrshire had the highest. Average numbers of alleles (Table 3) were similar across the breeds, ranging from 4.33 (Highland) to 6.37 (Friesian). Friesian had the most private alleles (10), whereas Highland and Dexter had the fewest (3) (Table 3).

Table 3 Summary of diversity measures for British cattle breeds

Population structure

AMOVA analysis revealed an overall FST of 0.13. That is, 13% of allelic variation was accounted for by breed and 87% was within breeds.

Wilcoxon's two-sample tests of structure results of the 20 cases per K-value (50 000 steps per case) revealed that K=8 had significantly greater likelihood than K=7, 9 or 10 (two-sided tests: α=0.002, α=0.01, and α=0.05, respectively) and K=9 did not have significantly greater likelihood than K=10. Wilcoxon's two-sample tests of structure results of the 20 cases per K-value (500 000 steps per case) revealed that K=8 had significantly greater likelihoods than K=9 (two-sided test: α=0.002). Based on these tests, we concluded that K=8 was the best estimate of the number of clusters in the sample.

The assignment of individuals to separate clusters revealed that these eight clusters were generally associated with the eight breeds in the study. Assuming K=8, for 16 out of 20 cases, more than 90% of the members of each breed were assigned to the same cluster, with the remaining individuals generally distributed between the seven other clusters. The exceptions to this pattern were that Ayrshire and Dexter were split between two clusters (>10% found in two clusters) in three out of the 20 cases (twice for Dexter, once for Ayrshire; this latter case had a much lower likelihood than the other 19), and in another case, only 89% (less than our 90% cutoff) of Ayrshires were assigned to the same cluster. In the three cases where Ayrshire or Dexter were split into different clusters, the Ayrshires (or a subset) clustered with the Friesians. For three breeds (Jersey, Highland and Guernsey), greater than 95% of members were assigned to the same clusters in all 20 cases. Under the assumption of K=9, in 16 of the 20 cases, one of the nine clusters was represented by less than 10% of the members of each breed. Of the remaining four cases, there were two where 10–11% of Aberdeen Angus were represented by a separate cluster and one where ∼11% of Guernsey was represented by a separate cluster. In the fourth case, Dexter was split into two substantial clusters (one including 37% of its members and the other 55%), while all other breeds were primarily represented by a single cluster.

Under the K=8 model, average FK values ranged from 0.10 (for the primarily Friesian cluster) to 0.24 (for the primarily Highland cluster) (see Table 4). The higher FK values indicate a greater role of drift for the Highland and Jersey breeds. For the three runs where Friesians and Ayrshires jointly formed a cluster, the FK values for that cluster averaged 0.08.

Table 4 FK values for the breed clusters identified using structure program

Breed integrity values are shown in Table 5. The breed with the highest breed integrity was Jersey (0.924) and that with the lowest was Ayrshire (0.726). That is, psa-within-breeds was more likely to exceed psa-between-breeds for Jersey samples than for Ayrshire.

Table 5 Measures of ‘breed integrity’ (proportional to the probability that randomly chosen individuals from the same breed are more similar than randomly chosen individuals across breeds; see text for further information) for British cattle breeds

Discussion

Despite the substantial within-breed variation seen in British cattle, the analyses described here indicate that (1) there are measurable differences between breeds, seen from the significant clustering into breeds according to structure analysis and the high breed integrity values, and (2) breed relationships are discernible, seen from the breed groupings generated by the phylogenetic analysis. An overall FST value of 0.13 seen in this study was consistent with those previously published for European cattle (0.11; MacHugh et al, 1998; Kantanen et al, 2000). Beyond this single value, we have explored the within- and between-breed variation in British cattle.

Breed relationships

The relationships between breeds are somewhat unexpected in that the breeds do not primarily cluster according to geographical origin. None of the breed pairs with strong support from the phylogeny originate from physically close regions. There is no support for a ‘Scottish group’ (Aberdeen Angus, Ayrshire and Highland) of breeds and only limited support for a ‘Channel Island group’ (Guernsey, Jersey) in that Guernsey clustered with the Dexter–Jersey grouping in the tree based on psa but not the tree based on Nei's distance. The cluster with the strongest support is the Highland–Hereford pair, for which there is no documented relationship. This link could reflect the long history of Highland herds in England (Felius, 1995) and possible cross-breeding; however, the distinctive phenotype of the Highland breed makes this explanation unlikely. Alternatively, it could reflect common origins.

The Friesian–Ayrshire relationship is somewhat surprising as the Ayrshire breed was developed in Scotland, whereas the British Friesian was developed from an imported continental breed. Although there is currently substantial introgression from the Red and White Holstein into Ayrshires, we were careful to exclude known crossbred animals from our study and therefore believe that the link observed between the British Friesian and Ayrshire breeds predates this recent phenomenon. As Ayrshire is known to have been ‘improved’ in the 18th century by the introduction of Dutch–Flemish cattle and Friesians were imported from this same area in the 19th century (Felius, 1995), this historical link most likely explains the apparent genetic connection. Preliminary analysis of a European-wide cattle data set including these breeds and 56 others also revealed that Ayrshire and Friesian clustered apart from the other British breeds and were more closely aligned with other Northern European cattle (JA Lenstra, personal communication).

The breed relationships seen here are not consistent with MacHugh et al (1994) as they found a bootstrap-supported Aberdeen Angus–Jersey cluster, which was distinct from Hereford and (Dutch) Friesian, whereas we do not. They used only 11 microsatellite markers for their analysis compared to 30 used here, which could account for this discrepancy. Our results also show differences in breed relationships compared with those of Blott et al (1998), who used nine blood group and serum loci, although they analysed many more breeds. In their analysis, Aberdeen Angus and Highland were found in the same cluster, separate from Hereford, Ayrshire did not cluster with Holstein–Friesians (of North American origin), and Jersey did not cluster with either Dexter or Guernsey. In addition to being conducted using microsatellite rather than protein loci, the current study also includes substantially more markers.

It is not surprising that there are inconsistencies, such as those outlined above, between studies using different types and numbers of markers. Differences between phylogenetic relationships in livestock measured by microsatellite markers and biochemical markers have been noted previously (Barker et al, 1997, 2001; Kantanen et al, 2000; Tapio et al, 2003). These could be due to greater sensitivity to bottleneck effects in lower-variability markers (Barker et al, 1997) or selection-related loss in variation at coding loci (Tapio et al, 2003). Several studies have demonstrated that the greater the number of loci, the better the definition of clusters (Bowcock et al, 1994) and accuracy of assignment of individuals to breeds (Blott et al, 1999; Bjørnstad and Røed 2001). Furthermore, genotypic disequilibrium between markers can reduce the power of the study. The large number of markers used and low genotypic disequilibrium in our data (results not shown) presumably allowed discrimination of the clusters with the higher bootstrap values. However, the low bootstrap support for many branches in the phylogram may indicate that crossbreeding between breeds has been substantial and therefore, relationships between all breeds may not be resolvable.

Breed definition

The behaviour of breed clusters based on structure results and the breed integrity values (Table 5) both reflect the similarity between individuals within a breed, relative to that between breeds. We can use these results to describe the breeds in terms of their cohesiveness. According to both analyses, Guernsey, Jersey and Highland are ‘well defined’ as breeds in that for all structure runs, more than 95% of their members clustered together. These breeds also had the highest values of breed integrity.

The finding that Guernsey and Jersey are well defined is consistent with these originally being island breeds. The Guernsey Island population (the source of approximately half of our samples) was closed to immigration between ∼1880 and 1975 (B Luff, personal communication). There have also been strict import restrictions of cattle into Jersey for many years (Felius, 1995). The Highland breed is relatively unselected compared to the other breeds but does have a distinct phenotype, possibly accounting for its discrimination from other breeds. Highland had a very low number of private alleles, indicating that although it is genetically distinct from other breeds in terms of allele frequencies, it is not distinct in terms of the presence of unique alleles. Low effective population size may have a stronger influence on numbers of private alleles than does breed cohesiveness. The impact of small Ne was also seen in the low heterozygosity level in Highland. Dexter, another breed with a small population size, also had few private alleles, although the heterozygosity level was average for this data set.

Ayrshire and Friesian were the least well defined as breeds in that they clustered together in a number of structure runs and they had the lowest breed integrity values. This probably reflects the shared ancestry and introgression between these breeds, as discussed above.

Comparison with phenotypic traits

Given the substantial phenotypic breed differences in British cattle, a greater genetic differentiation might have been expected than was evident. A previous study used a large-scale experimental design with many breeds to quantify the proportion of variation explained by breed for a number of cattle production traits (Thiessen et al, 1984, 1985). They found that this proportion was very high for some traits (0.70 for body weight at 1 year, 0.60 for cumulative feed intake at 1 year and 0.52 for growth rate between 6 and 9 months), although much lower for others (eg 0.15 for feed conversion efficiency). Although traits like colour and body shape were not considered in that study, it is likely that they would also show high levels of between-breed variation. The discrepancy between genotypic and phenotypic differentiation is probably explained by a small number of genes having major effects on visible phenotypes, which were used in the initial establishment of the breeds. This interpretation is also supported by the findings of quantitative trait loci that explain significant proportions of phenotypic variation in cattle and other livestock (Andersson, 2001). The lack of correspondence between genotypic and phenotypic measures indicates that genotypic characterisation on its own is probably not sufficient for making conservation and genetic resource management decisions about livestock (Ponzoni 1997; Ruane 1999; Klungland et al, 2000).