Introduction

The Iberian breed constitutes the largest of the surviving populations of pigs of Mediterranean type. The population of Iberian pigs was maintained for centuries in large herds grazing the sparse oak and cork oak woodlands of the SW of the Iberian Peninsula. The morphology of the Iberian pig (dark skin and hair colour, pointed snout and strong legs) makes it resistant to sunstroke and high summer temperatures and adapts it for pasturing. From November to February, it can be quickly fattened by the consumption of available acorns, grass, small roots and bulbs (López-Bote, 1998). Although the Iberian pigs share the above features, several local varieties arose in the traditional population, showing phenotypic differences (black hairless, red and blond pigs) and with only rare genetic interchange. Some of these varieties were exported during the colonisation of the Americas, being the direct ancestors of several Creole pig breeds (Lemus-Flores et al, 2001). Red Iberian pigs imported from Portugal and Spain in the XIX century also contributed to the origin of the Duroc–Jersey breed in the United States (Vaughan, 1950).

The numbers of the Iberian breed have been drastically reduced since 1960 owing to the outbreak of African swine fever, the lowered value of animal fats and the massive introduction of more efficient foreign breeds. In the last few years, this population bottleneck has been reversed and the production of pigs of the Iberian type has increased to satisfy the new demand for top-quality meat and cured products. However, the old breed structure, with differentiated varieties locally distributed, has been replaced by a pyramidal structure based on crossbreeding with Duroc, and a strong dependence on a small number of breeding nuclei supplying purebred Iberian animals to the whole production tier (Silió, 2000). In these circumstances, some ancestral varieties have disappeared and others are endangered or blended, necessitating a new design for programmes of conservation of these genetic resources.

In the establishment of conservation programmes for a domestic species (or breed) in which a number of different breeds (or varieties) exist, a question to be asked is which populations should contribute to the population to be conserved and in what proportions. In other words, by what criteria should the conservation priorities be set? This is a controversial issue, although the final decision should, together with genetic information, take into account several factors such as the adaptation to specific environments or diseases, and the possession of specific traits of cultural, scientific or future economic value (Ponzoni, 1997; Oldenbroek, 1999; Ruane, 1999). A second question, especially relevant in implementing germplasm banks, is how many individuals from each population should contribute to the bank of gametes.

Phylogenetic techniques based on genetic distances estimated from polymorphic microsatellite markers are the method of choice to assess the genetic diversity of livestock breeds (Hall and Bradley, 1995). Thaon d’Arnoldi et al (1998) emphasised the analysis of genetic distances by the Weitzman (1992) approach, to measure the global diversity and the marginal loss of diversity attached to each breed. Conservation priorities can be based on these results. Laval et al (2000) applied this method to analyse the genetic diversity of 11 pig breeds from six European countries, Cañón et al (2001) to 18 European beef cattle breeds and Aranguren-Méndez et al (2002) to five endangered Spanish donkey breeds, supporting the value of the preservation of local endangered breeds in the maintenance of species diversity. Caballero and Toro (2002), however, consider these phylogenetic methods inappropriate for within-species breed conservation, because the genetic variation within groups is ignored, although it may be of great importance for the management of livestock breeds. They have proposed other basic tools, based on molecular coancestry measured from markers, to analyse genetic diversity in subdivided populations such as breeds or varieties of livestock species.

Recently, alternative clustering methods have been proposed, which allow the inference of population structure and the assignment of individuals to populations (Pritchard et al, 2000; Dawson and Belkhir, 2001). These methods are also useful for the definition of conservation units, and have been applied to 20 chicken breeds by Rosenberg et al (2001).

The objective of this study was to assess the genetic diversity in the main varieties and preserved strains of Iberian pigs and their relationship to the Duroc breed, comparing the approaches mentioned above in their utility in establishing conservation strategies for this breed.

Material and methods

Populations

Two of the groups of Iberian pigs considered here, Guadyerbas and Torbiscal, belong to an early conservation programme established in 1945 at ‘El Dehesón del Encinar’ (Oropesa, Toledo, Spain). The herd was formed from four founder populations, representative of the most important varieties of Iberian pigs, that were kept genetically isolated until 1963. Then, the four groups were slowly blended, resulting in the composite strain named Torbiscal (Rodrigáñez et al, 2000). One of the founder populations (Guadyerbas) was also maintained as a separate closed strain, representing the endangered black hairless variety from the Guadiana Valley (Toro et al, 2000). The complete genealogy of all the animals is available since 1945, with 18.9 (Guadyerbas) and 21.0 (Torbiscal) generations elapsed between the population founders and the animals genotyped in this study. The remaining Iberian pig groups represent the main extant varieties: black hairless (Negro Lampiño) and red (Retinto) pigs and a black hairy variety (Entrepelado), whose piglets show a chestnut colour at birth. Blood samples were collected from 173 individuals (77 boars and 96 sows) inscribed in the breed herdbook, distributed as follows: 31 Torbiscal, 32 Guadyerbas, 50 Retinto (seven herds), 30 Lampiño (three herds) and 30 Entrepelado (five herds). Owing to their historical and current relations with the Iberian pigs, a further 40 Duroc pigs from seven herds were also sampled and analysed.

Microsatellites

All the animals were genotyped for 36 microsatellite markers, two on each autosome (Table 1). They were chosen for their reproducibility, position on the chromosome, polymorphism and absence of null alleles. Most markers belong to the panel recommended by the ISAG-FAO Advisory Committee for genetic distance studies (FAO, 1998). However, some of those included in this panel, for example, CGA and S0068, were discarded because of the low reproducibility of their alleles. PCR-amplified microsatellite markers were analysed with Genescan software on capillary electrophoresis equipment with fluorescent detection (ABI PRISM 310 genetic analyzer). To increase the accuracy of allele size determination, four control animals were genotyped in all the gels. Genotypes were stored in the Gemma database (Iannuccelli et al, 1996).

Table 1 Number of alleles and observed and expected heterozygosities of microsatellite markers in Iberian and Duroc breeds

Statistical analysis

Analysis of genetic diversity in a subdivided population

There are several measures of genetic variability, but the most popular is the genetic diversity, defined by Nei (1973) as the heterozygosity expected under Hardy–Weinberg (H–W) equilibrium conditions. It also corresponds to the complement of the average global molecular coancestry, defined as the probability that two alleles taken at random from the population are the same. In a subdivided population, the study of genetic diversity focuses on its partition into components between and within subpopulations (strains and varieties in this case). Here, we follow closely the development of Caballero and Toro (2002), who expressed the average global coancestry as

$$\bar{f} = \frac{1}{n}\sum_{i=1}^{n}\left[(s_i - D_{ii}) - \frac{1}{n}\sum_{j=1}^{n} D_{ij}\right] = \tilde{f} - \bar{D} \qquad (1)$$

where n is the number of populations, f̄ is the average global coancestry, si the average of within-individual coancestry of the ith population, Dii the average distance between individuals of the ith population and Dij is Nei's minimum distance between subpopulations i and j; f̃ and D̄ are the corresponding averages over subpopulations of the within-subpopulation coancestry, fii = si − Dii, and of the distances Dij. The definitions and the way of calculating the above parameters are given in Appendix A.
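As an illustration only, the relation between the global coancestry, the mean within-subpopulation coancestry and the mean of Nei's minimum distances can be checked numerically. The Python sketch below assumes equal weights for all subpopulations and takes Nei's minimum distance as Dij = (fii + fjj)/2 − fij; the matrix values and the function names are hypothetical and are not taken from Table 3.

```python
def nei_minimum_distance(F):
    """Nei's minimum distance, D_ij = (f_ii + f_jj) / 2 - f_ij, for every pair."""
    n = len(F)
    return [[(F[i][i] + F[j][j]) / 2.0 - F[i][j] for j in range(n)]
            for i in range(n)]


def coancestry_summary(F):
    """Return (f_bar, f_tilde, d_bar) for an n x n between-population coancestry matrix."""
    n = len(F)
    D = nei_minimum_distance(F)
    f_bar = sum(sum(row) for row in F) / n ** 2     # average global coancestry
    f_tilde = sum(F[i][i] for i in range(n)) / n    # mean within-subpopulation coancestry
    # Mean between-subpopulation distance; the diagonal of D is zero here
    # (it is NOT the within-population distance D_ii = s_i - f_ii of the text).
    d_bar = sum(sum(row) for row in D) / n ** 2
    return f_bar, f_tilde, d_bar


# Hypothetical coancestry values for three subpopulations (assumed for illustration).
F = [[0.50, 0.30, 0.25],
     [0.30, 0.40, 0.28],
     [0.25, 0.28, 0.35]]
f_bar, f_tilde, d_bar = coancestry_summary(F)
print(f_bar, f_tilde - d_bar)  # the two values agree up to rounding error
```

Under these assumptions the identity holds for any symmetric coancestry matrix, which is what allows the total diversity to be split into a within- and a between-subpopulation part.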

The right-hand side of Eq. (1) shows how the average global coancestry depends on the within-subpopulation coancestry (first term in the brackets) and the average distance among subpopulations (second term in the brackets). Another way of expressing Eq. (1) is in terms of genetic diversity

$$1 - \bar{f} = (1 - \tilde{f}) + (\tilde{f} - \bar{f}) = (1 - \tilde{s}) + (\tilde{s} - \tilde{f}) + (\tilde{f} - \bar{f}) \qquad (2)$$

The last expression represents the partition of the total genetic diversity (heterozygosity), GDT=1−f̄, into two components: the genetic diversity within subpopulations, GDWS=1−f̃, and the genetic diversity between subpopulations, GDBS=f̃−f̄. The first also has two components: the genetic diversity within individuals, GDWI=1−s̃, and the genetic diversity between individuals, GDBI=s̃−f̃. If G is the proportion of within-subpopulation genetic diversity corresponding to between-individual diversity, G=GDBI/GDWS, then Wright's (1969) F-statistics can be written as

$$F_{IS} = 2G - 1, \qquad F_{ST} = \frac{\tilde{f} - \bar{f}}{1 - \bar{f}} = \frac{GD_{BS}}{GD_{T}}, \qquad F_{IT} = 1 - \frac{2\,GD_{WI}}{GD_{T}} \qquad (3)$$

Finally, let us consider a different question that may be useful for the conservation of genetic diversity in either a live conservation or a cryoconservation programme. If we had to pool the different subpopulations to produce a single one (a synthetic population or a germplasm bank), what should the contribution of each subpopulation to the pool be in order to maximise its genetic diversity? If the different subpopulations were required to make different contributions (ci) to the next generation (see below), the genetic diversity could be obtained as

$$GD = 1 - \sum_{i=1}^{n} c_i\left[f_{ii} - \sum_{j=1}^{n} c_j D_{ij}\right] \qquad (4)$$

This question can be answered by obtaining the values of ci in Eq. (4) that maximise genetic diversity with the restrictions ci ≥ 0 and Σci = 1. The appropriate values of ci can be obtained by integer quadratic programming. Similar equations were constructed to give different weighting to genetic diversity and genetic distance between populations (see Results).
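The optimisation of the contributions can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the solver used in the study: it treats the contributions as continuous (not integer) and uses projected gradient descent instead of a quadratic programming routine. It also assumes that the weighted objective is λΣci fii − ΣΣ ci cj Dij which, with Dij = (fii + fjj)/2 − fij and Σci = 1, equals c'Fc + (λ − 1)Σci fii. The function names and the two-population matrix are hypothetical.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto {c : c_i >= 0, sum(c) = 1}."""
    u = sorted(v, reverse=True)
    cumulative, theta = 0.0, 0.0
    for k, u_k in enumerate(u, start=1):
        cumulative += u_k
        t = (cumulative - 1.0) / k
        if u_k - t > 0:
            theta = t  # keep the threshold of the last index that satisfies the condition
    return [max(x - theta, 0.0) for x in v]


def optimal_contributions(F, lam=1.0, steps=5000):
    """Minimise c'Fc + (lam - 1) * sum_i c_i f_ii over the simplex.

    lam = 1 minimises the global coancestry (maximum total diversity);
    lam = 0 maximises the average distance between the pooled subpopulations.
    """
    n = len(F)
    c = [1.0 / n] * n
    # Safe step size from a bound on the largest eigenvalue of 2F (max absolute row sum).
    step = 1.0 / (2.0 * max(sum(abs(x) for x in row) for row in F))
    for _ in range(steps):
        grad = [2.0 * sum(F[i][j] * c[j] for j in range(n)) + (lam - 1.0) * F[i][i]
                for i in range(n)]
        c = project_to_simplex([c[i] - step * grad[i] for i in range(n)])
    return c


# Hypothetical two-population coancestry matrix (assumed, not from Table 3).
F = [[0.50, 0.10],
     [0.10, 0.25]]
c_diversity = optimal_contributions(F, lam=1.0)  # favours the less inbred population
c_distance = optimal_contributions(F, lam=0.0)   # equal shares maximise c1*c2*D12
```

A minimum contribution m for every subpopulation could be imposed in the same sketch by writing c = m + (1 − nm)w and optimising w over the simplex.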

Phylogenetic trees and the Weitzman method to measure genetic diversity

Although there are many measures of genetic distance, Reynolds's distance is considered to be the best for studying populations that have recently diverged, and neighbour joining (NJ) is the preferred technique for obtaining a graphical representation of the distance matrix when the populations differ in effective population size, as is usual in domestic breeds (Eding and Laval, 1999). The NJ tree was constructed from the Reynolds distances using the PHYLIP package (Felsenstein, 2001).
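For readers who wish to reproduce a Reynolds-type distance from allele frequencies, the following Python sketch may help. It assumes the commonly used multilocus estimator, in which the summed squared differences in allele frequencies are divided by twice the summed values of one minus the between-population identity; whether the study used this value directly or its transformation −ln(1 − θ) is not stated in the text, so both are provided. The function names and frequencies are hypothetical.

```python
import math


def reynolds_theta(freqs_a, freqs_b):
    """Coancestry-type estimator from allele frequencies.

    freqs_a, freqs_b: one dict per locus mapping allele -> frequency.
    """
    numerator = 0.0
    denominator = 0.0
    for p, q in zip(freqs_a, freqs_b):
        alleles = set(p) | set(q)
        numerator += sum((p.get(a, 0.0) - q.get(a, 0.0)) ** 2 for a in alleles)
        denominator += 1.0 - sum(p.get(a, 0.0) * q.get(a, 0.0) for a in alleles)
    return numerator / (2.0 * denominator)


def reynolds_log_distance(freqs_a, freqs_b):
    """Transformation -ln(1 - theta); undefined when theta reaches 1."""
    return -math.log(1.0 - reynolds_theta(freqs_a, freqs_b))


# Hypothetical single-locus frequencies for two populations (assumed values).
pop_a = [{"A": 0.8, "B": 0.2}]
pop_b = [{"A": 0.4, "B": 0.6}]
theta = reynolds_theta(pop_a, pop_b)
distance = reynolds_log_distance(pop_a, pop_b)
```

With a single locus the numerator is twice Nei's minimum distance and the denominator is one minus the between-population coancestry, so this estimator is the ratio Dij/(1 − fij) of quantities used elsewhere in the analysis.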

Weitzman (1992) proposed a method to construct hierarchical trees based on a form of maximum-likelihood phylogeny conditional on the model. Thus, the contribution of an element to group diversity is proportional to the reduction in tree length caused by the removal of the element from the group. It is computationally intensive, limiting its use to sets of 25 or fewer populations.
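The recursive definition usually associated with the Weitzman diversity function can be sketched as follows. This Python example is an illustration under assumptions: it takes the diversity of a set as the maximum, over its elements, of the diversity of the set without that element plus the distance from the element to its nearest remaining neighbour, and it defines the marginal loss of an element as the drop in diversity when it is removed. The distances and labels are hypothetical.

```python
from functools import lru_cache


def weitzman_diversity(dist, members):
    """Diversity of a set from symmetric pairwise distances dist[x][y]."""

    @lru_cache(maxsize=None)
    def v(subset):
        if len(subset) <= 1:
            return 0.0
        best = 0.0
        for x in subset:
            rest = subset - {x}
            nearest = min(dist[x][y] for y in rest)  # distance from x to the remaining set
            best = max(best, v(rest) + nearest)
        return best

    return v(frozenset(members))


def marginal_loss(dist, members, removed):
    """Loss of diversity when one element is removed from the set."""
    remaining = [m for m in members if m != removed]
    return weitzman_diversity(dist, members) - weitzman_diversity(dist, remaining)


# Hypothetical distances among three populations (assumed values).
dist = {"A": {"B": 1.0, "C": 4.0},
        "B": {"A": 1.0, "C": 3.0},
        "C": {"A": 4.0, "B": 3.0}}
total = weitzman_diversity(dist, ["A", "B", "C"])
losses = {m: marginal_loss(dist, ["A", "B", "C"], m) for m in ["A", "B", "C"]}
```

Because every subset may have to be evaluated, the number of cached values grows exponentially with the number of populations, which is consistent with the practical limit on the size of the set mentioned above. In this recursion the marginal loss can never be negative, so removing an element never increases the measured diversity.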

Clustering analysis

Recently, a clustering method has been proposed (Pritchard et al, 2000; Dawson and Belkhir, 2001) that constructs genetic clusters from a set of individual multilocus genotypes, estimating for each individual the fraction of its genome that belongs to each cluster without any prior information on the structure of the population. Thus, the individuals are assigned (probabilistically) to populations or, jointly, to two or more populations if their genotypes indicate that they are admixed. The model is fitted with a Bayesian approach computed using Markov chain Monte Carlo methods. It constitutes a most flexible alternative to cluster methods based on genetic distances. It can be used to separate a set of individuals into several populations if their genetic origin is unknown beforehand or, as in the present situation, to study the correspondence between inferred genetic clusters and known predefined population categorisations. Here we applied, for this purpose, the Structure algorithm of Pritchard et al (2000).
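The probabilistic assignment at the core of this kind of method can be illustrated with a much simplified Python sketch. It is not the Structure algorithm: it assumes that the allele frequencies of each cluster are already known, that loci are independent and in H–W equilibrium within clusters, that there is no admixture and that all clusters are equally likely a priori, whereas the real method estimates frequencies and memberships jointly by Markov chain Monte Carlo. All names and values are hypothetical.

```python
def membership_probabilities(genotype, cluster_freqs):
    """Posterior probability that an individual belongs to each cluster.

    genotype: list of (allele1, allele2) tuples, one per locus.
    cluster_freqs[k][l]: dict allele -> frequency in cluster k at locus l.
    """
    likelihoods = []
    for freqs in cluster_freqs:
        like = 1.0
        for (a1, a2), p in zip(genotype, freqs):
            # The factor 2 for heterozygotes is the same in every cluster and cancels.
            like *= p.get(a1, 0.0) * p.get(a2, 0.0)
        likelihoods.append(like)
    total = sum(likelihoods)  # zero only if the genotype is impossible in all clusters
    return [like / total for like in likelihoods]


# Hypothetical frequencies at one locus for two clusters (assumed values).
cluster_freqs = [[{"A": 0.9, "B": 0.1}],
                 [{"A": 0.1, "B": 0.9}]]
probs = membership_probabilities([("A", "A")], cluster_freqs)
```

Averaging such probabilities over the individuals of a predefined population gives proportions of membership comparable in spirit to those reported for each strain or variety.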

Results

Number of alleles, heterozygosity and deviations from H–W

Table 1 shows the number of alleles for each genotyped microsatellite that ranged from 4 to 14 (average 7.2) summed across the Iberian breed as a whole and from 2 to 10 in the Duroc population (average 5.4). In the Iberian breed, the average number of alleles per population ranged from 2.8 (microsatellite S0219) to 7.8 (microsatellite S0005). Table 1 also gives, for each locus, the observed heterozygosity by direct count and the expected heterozygosity under H–W equilibrium. The observed values were generally lower than the expected values, indicating heterogeneity between populations within the breed.

In Table 2 we show, for each population i, the self-coancestry fii, the average of individual self-coancestries si and the average inbreeding Fi, together with the parameters αi and Gi (see Appendix A). For both coancestry and inbreeding, the Guadyerbas population shows the highest values. In a randomly mating population, the values of fii and Fi should coincide, and αi measures the magnitude of the possible discrepancy. The negative values of αi for the Torbiscal and Guadyerbas strains indicate an excess of heterozygous genotypes with respect to the value expected under H–W equilibrium. The parameter Gi is related to αi and measures the proportion of genetic variability between individuals. With random mating G=½, and lower values indicate that variability is preferentially distributed within individuals, as happens in the aforementioned strains.
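The relation between these parameters can be made explicit with a short Python sketch. The formulas are assumptions, since Appendix A is not reproduced here: αi is taken as (Fi − fii)/(1 − fii), the self-coancestry as si = (1 + Fi)/2 and Gi as (si − fii)/(1 − fii), which together imply Gi = (1 + αi)/2. The numerical values are hypothetical.

```python
def hw_deviation(f_ii, inbreeding):
    """alpha_i: relative difference between inbreeding and within-population coancestry."""
    return (inbreeding - f_ii) / (1.0 - f_ii)


def between_individual_proportion(f_ii, inbreeding):
    """G_i: share of within-population diversity found between individuals."""
    s_i = (1.0 + inbreeding) / 2.0  # average self-coancestry
    return (s_i - f_ii) / (1.0 - f_ii)


# Random mating: inbreeding equals coancestry (hypothetical value 0.30).
alpha_random = hw_deviation(0.30, 0.30)
g_random = between_individual_proportion(0.30, 0.30)

# Excess of heterozygotes: inbreeding lower than coancestry (hypothetical values).
alpha_excess = hw_deviation(0.30, 0.20)
g_excess = between_individual_proportion(0.30, 0.20)
```

Under these assumed definitions a negative αi always gives Gi below ½, which is the pattern described for the two strains managed with controlled matings.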

Table 2 Self-coancestries (fii, si), inbreeding (Fi), H–W deviations (αi) and proportion of diversity between individuals (Gi) for each population i

Genetic distances and clusters

The matrix of coancestry values between populations and the values of Reynolds distance are included in Table 3. An unrooted NJ tree showing the relationships among all the populations, evaluated by 1000 bootstrap replicates, was constructed using the Reynolds genetic distances (Figure 1).

Table 3 Values of coancestry (fij, above the diagonal) and Reynolds’ genetic distance (DRij, below the diagonal) among the populations analysed
Figure 1

Unrooted NJ dendrogram showing the relationships among five Iberian pig strains and varieties (Torbiscal, Guadyerbas, Negro Lampiño, Entrepelado and Retinto) and the Duroc breed inferred from 36 microsatellite data. This tree is based on Reynolds’ genetic distances. Numbers on the nodes are the percentage values for 1000 bootstrap resamplings.

For all the Iberian subpopulations, the genetic distance to the Duroc breed is greater than that to any of the other subpopulations of the breed. Another way of visualising this differentiation is to apply the Structure algorithm to classify the genomes of the 213 genotyped individuals into two clusters. The results appear in Table 4 and indicate that most of the genomes of all the Iberian strains and varieties fall into the same cluster, with the genome of the Duroc breed constituting the other. The Torbiscal and Guadyerbas strains are the subpopulations whose genomes are differentiated most unambiguously from Duroc. However, about 4.7% of the genotypes of the Entrepelado variety were classified in the Duroc cluster.

Table 4 Proportion of membership of each predefined population in each of the two possible clusters

Finally, when the Structure algorithm is applied to the Iberian breed assuming the same number of clusters as subpopulations (five), we obtain the results presented in Table 5. They indicate that on average 98.6% of the Torbiscal genomes and 99.5% of the Guadyerbas genomes are classified as two separate clusters. However, the results are less clear for the other subpopulations, whose genomes are attributed to diverse clusters. This again emphasises that the first two strains constitute more defined populations than the others.

Table 5 Proportion of memberships of each Iberian pig strain or variety assigned to each one of the five possible clusters

Partition of genetic diversity in the Iberian breed

The overall values for the Iberian breed of the different statistics that summarise the genetic diversity indicate that the total genetic diversity, GDT=1−f̄=0.696, can be partitioned into three components: the genetic diversity within individuals, GDWI=1−s̃=0.290; the genetic diversity between individuals, GDBI=s̃−f̃=0.317 (the sum of these two gives the genetic diversity within subpopulations, GDWS=1−f̃=0.607); and the genetic diversity between subpopulations, GDBS=f̃−f̄=0.089. The corresponding Wright statistics are FIS=0.045, FST=0.129, FIT=0.167.
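As a consistency check, the Wright statistics can be recomputed from the three diversity components reported above. The Python sketch below assumes the standard relation between self-coancestry and inbreeding, s = (1 + F)/2, from which FIS = 2G − 1 and FIT = 1 − 2GDWI/GDT follow; because the inputs are rounded to three decimals, the results can differ from the published values in the third decimal place. Variable names are hypothetical.

```python
# Diversity components reported in the text for the Iberian breed.
gd_wi = 0.290  # within individuals
gd_bi = 0.317  # between individuals, within subpopulations
gd_bs = 0.089  # between subpopulations

gd_ws = gd_wi + gd_bi         # within subpopulations
gd_t = gd_ws + gd_bs          # total genetic diversity

g = gd_bi / gd_ws             # proportion of within-subpopulation diversity between individuals
f_is = 2.0 * g - 1.0          # assumes s = (1 + F) / 2
f_st = gd_bs / gd_t
f_it = 1.0 - 2.0 * gd_wi / gd_t

print(round(gd_ws, 3), round(gd_t, 3))
print(round(f_is, 3), round(f_st, 3), round(f_it, 3))
```

The three statistics obtained in this way satisfy (1 − FIT) = (1 − FIS)(1 − FST) exactly, as expected from their definitions.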

The proportional contribution of each subpopulation to the global coancestry of the Iberian breed is given in Table 6. The Guadyerbas strain contributes most through its own coancestry but, because it shows the highest genetic distance from the other strains, its total contribution is lower than that of the Retinto variety.

Table 6 Proportional contribution of each strain or variety to the global coancestry of the Iberian breed

Another question is to ascertain the loss or gain of diversity if one or several subpopulations are removed from the pool. This can be calculated by disregarding one (or more) populations and recalculating the global average coancestry from the remaining pool, as shown in Table 7. The removal of the Torbiscal and Guadyerbas strains increases the within-subpopulation genetic diversity, but decreases the average genetic distance. The global net effect would be negative for Torbiscal and positive for Guadyerbas. This means that the removal of the Guadyerbas strain would actually increase the total genetic diversity of the breed, measured as its expected heterozygosity (Nei, 1973). The removal of each one of the three varieties would decrease the within-subpopulation genetic diversity, but would increase the genetic distance, the global balance being negative.
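The calculation of the loss or gain of diversity on removal can be sketched as follows. The Python example assumes that all retained subpopulations receive equal weight and that total diversity is one minus the mean of the between-population coancestry matrix; the matrix is hypothetical and was chosen so that removing a highly inbred subpopulation increases the expected heterozygosity of the pool.

```python
def total_diversity(F, keep):
    """One minus the mean coancestry over the retained subpopulations (equal weights)."""
    m = len(keep)
    return 1.0 - sum(F[i][j] for i in keep for j in keep) / m ** 2


def change_on_removal(F, removed):
    """Gain (+) or loss (-) of total diversity when the listed subpopulations are removed."""
    everyone = list(range(len(F)))
    keep = [i for i in everyone if i not in removed]
    return total_diversity(F, keep) - total_diversity(F, everyone)


# Hypothetical coancestry matrix: population 0 is highly inbred (assumed values).
F = [[0.90, 0.30, 0.30],
     [0.30, 0.30, 0.25],
     [0.30, 0.25, 0.30]]
gain_without_inbred = change_on_removal(F, {0})   # positive: diversity increases
loss_without_diverse = change_on_removal(F, {1})  # negative: diversity decreases
```

In this toy case the inbred population adds more to the average coancestry through its own high value than it removes through its distance to the others, which mirrors the net positive effect described for one of the strains.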

Table 7 Total genetic diversity and loss (−) or gain (+) of diversity when one or two Iberian pig subpopulations are removed

According to the results of the Weitzman approach, also shown in Table 7, the removal of any subpopulation will always decrease the total genetic diversity. The removal of the Guadyerbas strain would have the greatest impact on the decrease of variability, followed by the Torbiscal strain. On the other hand, when considering the removal of two subpopulations together, Torbiscal and Guadyerbas seem to be the most important and the Entrepelado and Lampiño varieties the least.

The optimal contributions of each strain to a possible synthetic line or germplasm bank can be calculated by solving Eq. (4) through quadratic programming to minimise λf̃ − D̄. Different λ values, weighting the global genetic diversity and the genetic distances, have been considered and the results appear in Table 8. The results agree with those of the analysis of genetic diversity. For maximising the global genetic diversity (λ=1), the strains that contribute most are Entrepelado and Lampiño, whereas if the objective was to maximise the genetic distance (λ=0), the Guadyerbas and Torbiscal strains should be prioritised. For λ=2, two of the subpopulations would have a null contribution. It is also possible to specify a minimum for the contribution of any strain or variety using the quadratic programming solver. By way of illustration, Table 8 includes results with a minimum of c=0.02 for λ=2.

Table 8 Optimal contributions to a synthetic line or to a germplasm bank for different weights of the within- and between-population variability (λf̃ − D̄)

Discussion

The maintenance of genetic variability is the main objective of conservation programmes. In the present paper, we have treated the expected heterozygosity as an adequate measure of genetic diversity, as proposed by Nei (1973). Although this measure is usually applied to molecular markers and refers to identity in state, it must be emphasised that it is equivalent to the classical Malécot (1948) coefficient of coancestry in a model where all the alleles in the base or reference population are assumed to be different. Therefore, it can be applied either to molecular coancestry measured from markers or to genealogical coancestry coming from pedigree information. The basic tools for analysing genetic diversity in subdivided populations have been recently reviewed by Caballero and Toro (2002).

In Conservation Biology there is much discussion about the place of molecular markers in establishing evolutionarily significant units (ESUs), defined as the population units that merit separate management and have a high priority for conservation. Some authors believe that the increasing availability of molecular markers permits us to define ESUs as a function of these markers alone and therefore emphasise the role of genetic distances (Moritz, 1994). Other authors consider that ecological and genetic variation of adaptive significance are more relevant for conservation, the key concept being exchangeability: two populations are different ESUs if they cannot be exchanged because they do not share ecological and functional adaptations (Crandall et al, 2000). A parallel controversy exists for the conservation of livestock breeds. As a result of the FAO's recommendation and their availability through genomic mapping, microsatellites have been widely used to characterise genetic diversity. Genetic distances calculated from microsatellite data have been advocated as an objective criterion for making conservation decisions (Barker, 1999; Wimmers et al, 2000). However, other authors have severely criticised this approach and emphasised that aspects such as adaptive features and traits of economic or scientific importance are rarely reflected in genetic distance measures. At most, genetic distances could show that populations that perform similarly are actually different, but not the other way round (Ponzoni, 1997; Ruane, 1999). The Iberian breed is an interesting case in which to analyse these questions. Its suitability for the sustainable use of the Mediterranean ecosystem, the presence of several varieties and strains and the high quality of its meat and cured products are unusual characteristics, which largely prevent the exchangeability of this pig breed with others.

The above results show the high level of genetic diversity in the Iberian breed, higher than the values reported for other European pig breeds (Laval et al, 2000). The expected heterozygosity exceeds the observed heterozygosity, confirming the subdivision of the breed into strains and varieties. When the subpopulations are analysed separately, Guadyerbas shows less variability in both the expected and observed heterozygosity, as expected from its long maintenance as a closed herd with a low effective population size (Toro et al, 2000). For both the Guadyerbas and the Torbiscal strains, there is an excess of observed vs expected heterozygosity (αi<0) or in other words, more variability is stored within than between individuals (Gi<0.50), which reflects the minimum coancestry mating tactics used in the management of these strains (Toro et al, 2000; Fernández et al, 2002). The positive values of the other subpopulations and especially the high values of the Retinto variety are more difficult to interpret, apart from their indicating a lack of control of matings.

The Iberian and the Duroc breeds show a clear differentiation according to the values of molecular coancestries and genetic distances (Table 3). The study of distinctiveness of populations has been monopolised by the methodology of genetic distances, but there are other approaches that are probably more efficient. When all the genotyped individuals are classified into two clusters, there is an almost perfect coincidence with the two breeds, the Guadyerbas and Torbiscal strains being those whose genomes are differentiated most unambiguously from that of Duroc. In fact, any introgression from Duroc into these two strains can be absolutely ruled out, because they have complete purebred pedigree records since 1944 and it was only in 1962 that the first imported Duroc–Jersey pigs arrived in Spain. The results of the cluster analysis of Iberian subpopulations indicate that almost 99% of individuals from the Torbiscal and Guadyerbas strains are classified as two separate clusters, whereas a variable proportion of the other varieties could be attributed to diverse clusters. Guadyerbas and Retinto contribute least to the overall genetic diversity of the breed, owing to their own high coancestry even though, especially in the case of Guadyerbas, there is a high genetic distance from the other subpopulations. On the other hand, Entrepelado and Torbiscal are the subpopulations that contribute most, for the opposite reasons: they show the lowest genetic distances to other subpopulations, but also the lowest within-subpopulation coancestry.

Another way of studying the relevance of the different Iberian strains and varieties to the breed diversity is to calculate the loss or gain of diversity when one or several groups are removed and the global average coancestry is recalculated (Table 7). The effect on the total genetic diversity of the removal of some groups could seem paradoxical, although it arises from a standard population genetics analysis (Caballero and Toro, 2002). We must realise that we are considering a theoretical model in which subpopulations contribute to an infinite pool of genes. If, as a consequence of the removal of one subpopulation, the gene frequencies become more equalised, this will increase the expected heterozygosity. A similar argument explains why the variability of a population will increase if a group of the most related individuals (a group of clones, for example) is eliminated and replaced by randomly chosen individuals. When two subpopulations are simultaneously removed, the results agree with the previous ones. The removal of Torbiscal and Guadyerbas will hardly affect the total diversity, whereas that of Retinto and Entrepelado will produce the maximum depletion of diversity.

These results are in sharp contrast with those obtained using the Weitzman approach to calculate the contribution of each subpopulation and the marginal diversity of each breed. Several authors have criticised the Weitzman approach (Caballero and Toro, 2002; Eding et al, 2002). This method does not have a clear genetic interpretation and therefore has properties (eg that the removal of an element always decreases the variability) that are at variance with classical population genetics ideas. Besides, it does not have a way of including the population size if desired and, most important of all, it ignores within-population variability, which is a crucial component of total variability.

Ignoring the within-group variability is a drawback not only of the Weitzman method but also of all methods based only on genetic distances. In fact, one of the properties of the method (monotonicity in distance) is that the diversity in a set of populations should increase if the distance between populations increases. Thus, it will favour inbred populations with extreme allele frequencies, whereas the coancestry approach would favour noninbred populations with an even distribution of gene frequencies. On the other hand, an overemphasis on within-breed variation will favour the largest breeds, of current commercial value, and therefore the less endangered ones.

The important point is that the results obtained using either the between-population diversity or the total diversity will produce different and sometimes opposite conservation priorities. Therefore, some compromise should be attempted. Thaon d’Arnoldi et al (1998) suggest the inclusion, together with the Weitzman method, of the probability of extinction of each population. However, as Eding et al (2002) pointed out, this will make things worse because inbred populations will get an even higher weight. A more practical proposal could be to consider an aggregate diversity, expressed as a linear combination of within- and between-population diversity weighted in an appropriate manner, as suggested by Ollivier and Foulley (2002).

The weights will depend on the scenario imagined for the medium term use of the genetic diversity. If the plan is to use them as part of crossbreeding plans, the diversity between populations should be prioritised because both heterosis and complementarity are functions of this type of genetic variation. The same will apply if the plan is introgression programmes of some specific trait. On the other hand, if we are thinking of the future creation of a new population able to cope with a challenging environment or with diversified production conditions, the within-population diversity will be important (Notter, 1999). For example, Chaiwong and Kinghorn (1999) suggest giving five times more weight to the variation between breeds than to that within breeds. However arbitrary this figure is, it indicates that some kind of weighting is perceived by several authors as a necessary exercise.

The method for calculating the optimal contribution of populations to a possible synthetic line or to a germplasm bank is also a valuable methodology. Eding et al (2002) have used such a criterion to establish a core set for 45 poultry breeds. Here, we show the flexibility of this technique (Table 8), which can be utilised by giving different weights to the total diversity and to the genetic distances or by imposing desirable restrictions, such as a minimum contribution of each population to the genebank. Similar restrictions could preserve nonautosomal genetic variation that is useful to trace relevant events of the population history (mitochondrial DNA or Y chromosome lineages).

Throughout the paper we have only considered genetic diversity measured as expected heterozygosity calculated from neutral molecular markers. However, there are other types of genetic variability that can be described and measured in a population. For some loci, diversity is of known importance, such as the histocompatibility system in animals and the self-incompatibility system in plants (Petit et al, 1998). The importance of diversity for neutral markers is more questionable.

An additional restriction on the number of alleles could be integrated in the optimisation process. The objective in this case would be to maximise genetic diversity subject to the constraint of maintaining all the distinct alleles already present. Complications could arise if we want to preserve linkage disequilibrium, that is, to maintain specific combinations of coadapted alleles that only occur in one breed. More work needs to be carried out on the existence of such situations and on the strategies to deal with them.

Finally, other aspects such as including data on particular traits of economic importance, specific adaptive features and local or regional importance of the breed (Oldenbroek, 1999) should be considered in the final conservation decision. In our situation, although the analysis based on the partition of the global coancestry indicates that Torbiscal and Guadyerbas are the subpopulations that contribute least to the total genetic variability, they are probably the most interesting ones according to other criteria, such as the traceability of their genetic origin, the superior productive performance of Torbiscal (Benito et al, 2000) and the large scientific knowledge available from these strains (Fernández et al, 2002; Ovilo et al, 2002).