Introduction

Cacao (Theobroma cacao L.) is a diploid tropical tree belonging to the Malvaceae (Alverson et al., 1999) that displays a preferentially allogamous mating system. Traditionally, cacao has been subdivided into four genetic groups: upper Amazon Forastero (UA), lower Amazon Forastero (LA), Criollo (Cr, anciently cultivated by the Indians in Central America) and Trinitario (Tr, hybrids between LA and Cr types). Cacao breeding largely relied on the creation of hybrids between progenitors belonging to different genetic groups of the species. Hybrid vigor (or heterosis) has been suggested to be the basis of good performance for precocity and yield of many of the selected biparental crosses between UA, LA and Tr or Cr types (Bartley, 1967; Soria, 1978; Reyes, 1979). However, in some countries, intragroup UA hybrids have also been found to be good yielders (Lockwood, 1979) and are recommended for commercial usage.

The attempt to produce hybrid seeds for large-scale distribution to farmers has involved studies on incompatibility reactions of the progenitors of the most productive hybrid progenies. Self-incompatibility (SI) in cacao was reported by Pound (1932) and its genetics is considered unique among the identified systems in flowering plants. Knight and Rogers (1955) reported that cacao has a sporophytic system controlled by a series of S alleles that show dominant or equal action. Further cytological studies by Cope (1958) revealed, however, that in incompatible combinations, nonfusion of gametes may occur in 25–100% of the ovules, suggesting also a gametophytic control system. SI in cacao is not always complete, because self-pollination of ‘self-incompatible’ genotypes may sometimes result in a low percentage of fruit setting. Haddon (1961) considered clones to be ‘self-incompatible’ if fruit setting of self-pollinated flowers is less than 10% and as ‘self-compatible’ if fruit setting is above 30%. Furthermore, the SI mechanism can be partly overcome by mixing self-compatible and self-incompatible pollen (Glendinning, 1960). This phenomenon has been used in cacao breeding as a tool to produce selfed progenies with self-incompatible genotypes.

In South America, the center of origin of cacao, wild populations known as Forastero contain predominantly self-incompatible individuals, whereas traditional varieties such as Amelonado (an LA type), Cr and Nacional (from Ecuador) are self-compatible (Cheesman, 1944; Haddon, 1961). However, according to Bartley and Cope (1973), Forastero from the Amazon basin and Cr from Central America both segregate for self-compatibility (SC) and SI. This might be explained by the fact that several UA genotypes are partially self-compatible and that the ‘Modern’ Cr (Tr close to the Cr), which carry the Tr incompatibility genes, are often self-incompatible.

Management of biclonal seed gardens (BSGs) established in many cacao-producing countries between 1960 and 1990 has generally involved the association of a self-incompatible seed parent with a self-compatible or self-incompatible pollen parent in a single plot, with the expectation that true hybrid seeds can be produced by natural pollination. During the 1960s in Cameroon, cacao breeding involved largely the development of new hybrid varieties among Tr and UA parental clones. BSGs were created to produce hybrid seeds for subsequent release to the farmers. In total, 40 BSGs were established, 36 aimed at producing UAxTr (or reciprocal) hybrid progenies and 4 aimed at producing TrxTr hybrid progenies. In most BSGs, the progenitors were planted at a ratio of one pollen to three seed parents. The harvest of pods was to be performed only on the seed trees, based on the assumption that SI is complete. However, studies with isozymes on progenies obtained by open pollination in such BSGs from different countries (including some BSGs from Cameroon) revealed that variable proportions of seeds might result from selfing (Lanaud et al., 1987). The occurrence of selfings through uncontrolled pollinations in BSG progenies may have influenced the efficiency of the breeding programs.

Besides the expected variable degrees of selfing in the BSGs, variable rates of outcrossing between cultivated varieties in farmers’ fields may occur because BSG progenies (hybrids) and traditional varieties are often planted in the same fields. The presence of genotypes from different genetic origins has favored the admixture of genes through natural outcrossing, leading to the appearance of cultivated cacao populations with a complex genetic structure. Recent analyses using a Bayesian model-based method have shown the existence of a high level of admixture for the farm accessions group. The LA genes were most represented (54%), followed by UA (33%) and Cr (7%) genes (Efombagn et al., 2008). The objective of the present study was to investigate the genetic origin of farmers’ accessions and to relate the results to the known composition of the BSGs in Cameroon and to the known traditional varieties grown. The specific questions addressed were the following:

  • What are the proportions of selfed and outcrossed cacao progenies in farmers’ fields derived from BSG progenitors?

  • What is the pattern of outcrossing in existing cacao farms?

  • What are the practical consequences for seed garden management and breeding?

Materials and methods

Sample collection

Leaf samples were obtained from a total of 400 farm accessions and from the 24 progenitors used in the BSGs in Cameroon (Table 1). Farm accessions were chosen at random during field surveys in different cacao-producing areas of Cameroon, ranging from a latitude of 02°14.199′N to 05°42.924′N, and a longitude of 009°01.430′E to 011°20.885′E. The farm accessions were the same as those analyzed previously by Efombagn et al. (2008). The farms considered in the study were identified as planted with traditional varieties, hybrid varieties or a mixture of traditional and hybrid varieties (accessions). Sampling within each farm was carried out according to its size and varietal composition, and the sample sizes varied from 3 to 17 cacao trees per farm.

Table 1 List of BSG progenitors with their compatibility status (SC and SI), as well as their use in seed gardens (P and S) and number of seed gardens in which each progenitor was planted

Microsatellite analysis

The 12 microsatellite markers used have been mapped by Lanaud et al. (1999). DNA was extracted from the leaves of all the plant materials of the study using the method developed by Bhattacharjee et al. (2004). PCR reactions were carried out in a 5 μl reaction mix containing 2.5 ng (1 μl) of template DNA, 1 μl of 5 × PCR buffer (10 mM Tris-HCl (pH 8.3), 50 mM KCl), 1 μl of 25 mM MgCl2, 0.5 μl each of forward and reverse primers, 0.2 μl of 10 mM dNTPs (dATP, dCTP, dGTP, dTTP) and 0.1 μl of 5 U Taq polymerase (Bioline, London, UK). Thermal cycling profiles carried out in a Gradient Cycler PTC 200 (MJ Research, Ramsey, MN, USA) consisted of 5 min initial denaturation at 94 °C, followed by 35 cycles of amplification at 94 °C for 30 s, 51 °C annealing for 1 min and 72 °C for 1 min. This was followed by further primer extension at 72 °C for 7 min. PCR reactions were performed on an MJ Research Dyad 96-well thermal cycler. A volume of 0.5 μl of diluted PCR products (1:19) was mixed with 9.5 μl of formamide (PE Applied Biosystems, Foster City, CA, USA) and 0.5 μl ROX-labeled GeneScan-500 size standard (PE Applied Biosystems). The PCR products were denatured at 95 °C for 5 min and fragment analysis was performed on an ABI 3100 Prism Genetic Analyzer (PE Applied Biosystems). The software programs GeneScan 3.7 and GeneMapper 3.5 (PE Applied Biosystems) were used to process and score the data.

Hybridization and parentage analysis

To measure the level of outcrossing and to identify the putative BSG progenitors of each BSG-derived farm accession, we used a Bayesian approach consisting of two steps:

  1. Identification of pure and admixed farm accessions: The purpose of this step was to identify hybrid and nonhybrid samples. The software Structure 2.2 (Pritchard et al., 2000) was run to identify the genetic structure of the whole farm accession population. The ancestry model with admixture was applied to infer how many clusters or subpopulations (K) were most appropriate to describe the potential varietal groups among the farm accessions. Ten independent runs of K=1–12 were performed at 100 000 MCMC (Markov chain Monte Carlo) and a post-burn-in simulation length of 500 000. To identify and discard the traditional Amelonado accessions (nonhybrid farm genotypes), we used the option POPFLAG=1 to assign all the farm accessions to five original Amelonado accessions (controls).

  2. Identification of genealogical classes of farm accessions of hybrid origin: After discarding the traditional Amelonado genotypes from the farm accession population, the microsatellite data of all the remaining genotypes (first-generation (F1) hybrids, admixed or selfed genotypes) were subjected, along with those of the BSG progenitors, to an analysis of hybrid ancestry using the method implemented in the program NewHybrids 1.0 (Anderson and Thomson, 2002). Structure can identify admixtures among any number K of parental populations, whereas NewHybrids assumes that hybrid classes originated after admixture of two parental species or of two parental populations within a species (genetic groups of putative progenitors in our study). NewHybrids uses an inheritance model defined in terms of genotype frequencies to compute, for each individual, the posterior probability of inclusion in each of the following six classes: P1 (Tr BSG progenitors in our study), P2 (upper Amazon BSG progenitors), selfed P1 or selfed P2 genotypes (S1), F1, F2 and backcross (BC) genotypes between F1 and each of the two genetic groups of progenitors (P1 and P2). NewHybrids was run using the default parameters for the six genotype class frequencies, Jeffreys prior, a burn-in phase of 100 000 steps and 500 000 MCMC sweeps. The five Amelonado controls were subsequently used as a separate parental class to simulate the reality in farmers’ fields, where genotypes may have originated from outcrossing between traditional and BSG-derived progenies.
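The genotype-frequency model described in step 2 can be illustrated with a minimal Python sketch. It assumes the two parental allele frequencies are known and uses a uniform prior over classes, whereas NewHybrids estimates allele frequencies and class proportions jointly by MCMC. The class table below is the conventional two-generation one (the S1 class used in this study is not modeled), and all function names, allele labels and frequencies are hypothetical.

```python
from math import prod

# Expected proportions of loci carrying (2, 1, 0) gene copies from
# parental group P1, for each genealogical class (illustrative table).
CLASSES = {
    "P1": (1.0, 0.0, 0.0),
    "P2": (0.0, 0.0, 1.0),
    "F1": (0.0, 1.0, 0.0),
    "F2": (0.25, 0.5, 0.25),
    "BC_P1": (0.5, 0.5, 0.0),
    "BC_P2": (0.0, 0.5, 0.5),
}


def locus_likelihoods(genotype, p1, p2):
    """Probability of an unordered diploid genotype when both genes come
    from P1, one from each group, or both from P2."""
    a, b = genotype
    het = 2.0 if a != b else 1.0
    both_p1 = het * p1.get(a, 0.0) * p1.get(b, 0.0)
    both_p2 = het * p2.get(a, 0.0) * p2.get(b, 0.0)
    if a != b:
        mixed = p1.get(a, 0.0) * p2.get(b, 0.0) + p1.get(b, 0.0) * p2.get(a, 0.0)
    else:
        mixed = p1.get(a, 0.0) * p2.get(a, 0.0)
    return both_p1, mixed, both_p2


def class_posteriors(genotypes, freqs_p1, freqs_p2):
    """Posterior probability of each class under a uniform class prior,
    treating loci as independent."""
    likes = {}
    for name, (g11, g12, g22) in CLASSES.items():
        likes[name] = prod(
            g11 * l11 + g12 * l12 + g22 * l22
            for l11, l12, l22 in (
                locus_likelihoods(g, f1, f2)
                for g, f1, f2 in zip(genotypes, freqs_p1, freqs_p2)
            )
        )
    total = sum(likes.values())
    if total == 0:
        raise ValueError("genotype is impossible under every class")
    return {name: value / total for name, value in likes.items()}


# Hypothetical example: two loci where P1 is fixed for allele "A" and
# P2 is fixed for allele "B"; the individual is heterozygous at both.
posteriors = class_posteriors(
    [("A", "B"), ("A", "B")],
    [{"A": 1.0}, {"A": 1.0}],
    [{"B": 1.0}, {"B": 1.0}],
)
print(max(posteriors, key=posteriors.get))
```

With only two loci, the F1, F2 and BC classes remain hard to separate, which illustrates why F2 and BC assignments can be difficult to tell apart with a limited marker set.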

Parentage analysis was performed on the F1 hybrid accessions identified by NewHybrids. We used the software Cervus 3.0 (Marshall et al., 1998) to detect the putative BSG progenitors of the F1 accessions, as well as of the selfed S1 accessions per candidate progenitor. Cervus uses a simulation program to generate likelihood scores and provides a level of statistical confidence for assigning paternity, maternity or both pollen and seed parents. Parent pair analysis, as used in this study, is the common form of parentage analysis when the candidate pollen and seed parents are known (Marshall et al., 1998). The ln-likelihood ratios were expressed as LOD scores. The most likely progenitor was the one with the highest positive LOD score. Cervus furthermore uses the number of mismatching loci between a candidate parent and a putative offspring for the parental assignment. Polymorphic information content (Hearne et al., 1992) and parent pair nonexclusion probability (Jamieson and Taylor, 1997; Marshall et al., 1998) for each candidate progenitor were calculated for each microsatellite used. The nonexclusion probability is the probability of not excluding a single unrelated candidate parent or parent pair from parentage of a given offspring at one locus.
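The two quantities used for assignment, the number of mismatching loci and the LOD score, can be sketched as follows. This is a simplified illustration and not the Cervus implementation: the locus names, allele sizes and likelihood values are hypothetical, and Cervus additionally accounts for genotyping error and derives confidence levels by simulation.

```python
from math import log


def mismatching_loci(offspring, candidate):
    """Count loci at which the offspring shares no allele with the
    candidate parent; loci not typed in the candidate are skipped."""
    count = 0
    for locus, offspring_genotype in offspring.items():
        candidate_genotype = candidate.get(locus)
        if candidate_genotype is None:
            continue
        if not set(offspring_genotype) & set(candidate_genotype):
            count += 1
    return count


def lod_score(likelihood_parent, likelihood_unrelated):
    """Natural log of the likelihood ratio: candidate is the true parent
    versus candidate is an unrelated individual."""
    return log(likelihood_parent / likelihood_unrelated)


# Hypothetical genotypes (allele sizes in bp) at three loci.
offspring = {"locus1": (150, 154), "locus2": (200, 200), "locus3": (98, 102)}
candidate_a = {"locus1": (150, 150), "locus2": (200, 204), "locus3": (102, 106)}
candidate_b = {"locus1": (152, 156), "locus2": (200, 204), "locus3": (90, 94)}

print(mismatching_loci(offspring, candidate_a))  # compatible at all loci
print(mismatching_loci(offspring, candidate_b))  # mismatches at two loci
print(lod_score(0.02, 0.005) > 0)  # positive LOD favors parentage
```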

Diversity analysis of farm accessions subgroups

Diversity parameters were computed on the different farm accession subgroups (as indicated in Table 2) generated from the hybridization and parentage analysis. Gene diversity (GD) and F-statistics (FST and FIS), as defined by Weir and Cockerham (1984), were calculated for each subgroup using the FSTAT software (Goudet, 1995). Owing to significant variation in sample size between farm and reference accession groups (AGs), the average GD (=expected heterozygosity) and allelic richness (AR; Leberg, 2002) were estimated using the rarefaction approach implemented in the software HP-RARE 1.0 (Kalinowski, 2005). The rarefaction approach was used to standardize the mean number of alleles per locus or the number of private alleles for a given AG, based on the number of genes (alleles) present in the smallest sample (AG) of accessions present in the study.
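The rarefaction idea can be made concrete with a short Python sketch. It computes, for a single locus, the expected number of distinct alleles in a random subsample of g genes, together with a simple (uncorrected) gene diversity. The allele counts are hypothetical and the function names are illustrative; HP-RARE and FSTAT apply further corrections and average over loci.

```python
from math import comb


def rarefied_allelic_richness(allele_counts, g):
    """Expected number of distinct alleles in a random subsample of g
    genes drawn without replacement from the sampled genes at one locus."""
    n = sum(allele_counts.values())
    if g > n:
        raise ValueError("subsample size exceeds the number of genes sampled")
    total = comb(n, g)
    # An allele with count c is absent from the subsample with
    # probability C(n - c, g) / C(n, g).
    return sum(1 - comb(n - c, g) / total for c in allele_counts.values())


def gene_diversity(allele_counts):
    """Expected heterozygosity as 1 minus the sum of squared allele
    frequencies (no small-sample correction)."""
    n = sum(allele_counts.values())
    return 1 - sum((c / n) ** 2 for c in allele_counts.values())


# Hypothetical locus: 10 genes sampled, three alleles.
counts = {"A": 6, "B": 3, "C": 1}
print(round(rarefied_allelic_richness(counts, 4), 3))
print(round(gene_diversity(counts), 2))
```

Setting g to the number of genes in the smallest accession group puts all groups on the same footing, which is the standardization described above.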

Table 2 Distribution of the farm accessions according to their genealogical classes into Amelonado, BSG progenies (S1, F1) and complex genotypes (F2 and BC), and analysis of genetic diversity for each subgroup

Mating system analysis at the level of cacao farms

To investigate the level of outcrossing in farmers’ fields, we analyzed farm accessions from the 15 cacao farms with at least 10 sampled accessions. The mating system within each farm was analyzed based on a mixed mating model (Ritland and Jain, 1981). A multilocus mating system program implemented in the software MLTR (Ritland, 2002), based on the algorithm of Ritland and Jain (1981), provided estimates of single-locus (ts) and multilocus (tm) population outcrossing rates. The probability model underlying multilocus estimation of the mating system assumes n unlinked loci. In the mixed mating model used, standard errors for the outcrossing estimates were based on 1000 bootstrap replications. Biparental inbreeding (tm−ts) within each selected farm was then estimated (Shaw et al., 1980). Biparental inbreeding may appear during selfings or when mating occurs between relatives. In most cacao farms in Cameroon, a large proportion of trees derive from seeds collected within farms. Some cacao trees produce selfed seeds, and neighboring cacao trees may be genetically related, suggesting the existence of potential biparental inbreeding within the corresponding farms.
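As a small illustration of how the biparental inbreeding estimate and its bootstrap standard error are derived from outcrossing rates, consider the Python sketch below. It does not reproduce the MLTR estimation itself; the tm and ts values and the bootstrap replicates are hypothetical.

```python
from statistics import stdev


def biparental_inbreeding(tm, ts):
    """Difference between multilocus and single-locus outcrossing rates;
    positive values point to mating between relatives."""
    return tm - ts


def bootstrap_se(replicate_estimates):
    """Standard error taken as the standard deviation of the estimates
    obtained from bootstrap replicates."""
    return stdev(replicate_estimates)


# Hypothetical farm-level estimates.
difference = biparental_inbreeding(tm=0.80, ts=0.60)
print(round(difference, 2))

# Hypothetical tm - ts values from three bootstrap replicates
# (the study used 1000 replications).
print(round(bootstrap_se([0.1, 0.2, 0.3]), 3))
```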

Results

Identification of the genealogical structure of farm accessions

The value K=3 maximized the posterior probability of the farm accessions data according to the Structure program. We found that one of the three clusters was composed of 102 genotypes that were designated as traditional accessions (according to the Amelonado reference genotypes), with an individual posterior probability 0.70 (Table 2). In the farm accessions varietal structure, this group was therefore identified as Amelonado (Table 2). The genotypes of all the remaining farm accessions were more or less admixed, and their genealogical classes in relation to the BSG progenitors used in Cameroon were determined by NewHybrids. Results suggest that 83 out of 400 farm accessions (20.75% of the total population) were composed of F1 hybrids between BSG progenitors, 102 (25.5%) were selfed progenies from BSG progenitors, whereas 113 (28.25%) accessions belonged to either F2 or BC generations. Because the probabilities of assignment to F2 or BC were rather similar, these were considered as one group (identified as F2+BC in Table 2). F2+BC material was found as the likely product of natural pollination in farmers’ field between F1 hybrids or between F1 hybrids and traditional genotypes. Within the F1 group, three subgroups based on the likely BSG progenitors were detected as shown in Table 2. Out of the 83 F1 genotypes identified, 37 (45% of the total F1 population) arose from crosses between UA and Tr progenitors.

Diversity in different farm accessions subgroups

The mean number of alleles (N) and GD recorded in traditional accessions (N=58; GD=0.20) were lower than those found in outcrossed (N=66–74; GD=0.57–0.62) and selfed (N=59–82; GD=0.44–0.63) groups derived from the BSG progenitors (Table 2). The inbreeding coefficient (FIS) for the traditional accessions group (Amelonado) was also higher (FIS=0.22) than that of the other groups (FIS=0.10–0.12 for outcrossed and FIS=0.12–0.14 for selfed BSG accessions). AR varied from 2.58 for traditional accessions to 5.04–5.35 for outcrossed and 3.80–5.34 for selfed BSG accessions. The traditional group was more genetically differentiated from the other subgroups (FST=0.20), whereas genetic differentiation was relatively low among the BSG subgroups (FST=0.05–0.09).

Parentage analysis

All the F1 genotypes were subjected to a parentage analysis to detect the contribution of different BSG progenitors in the released material. Nonexclusion probabilities varied from 0.14 to 0.82 across the 12 loci of the study, and most of the null allele frequencies of the same loci were lower than 0.05 (Table 3). The genotypes were assigned to candidate BSG progenitors based on the highest positive LOD score and on the number of mismatched alleles registered (Table 4). The Tr progenitor SNK450 was the most represented with 33 putative offspring, followed by T79/501 and SNK450. The BSG progenitors SNK48 and ‘SCA6’ were not found as parents of any accession (Table 4). For 45% of the F1 hybrid genotypes, the two parents identified did not correspond to the known combinations of the BSG progenitors (data not shown). These are therefore derived from unwanted hybridizations in the BSG.

Table 3 Diversity, parental pair nonexclusion probability and null allele frequencies estimated for the 12 SSR loci in the parentage analysis of the F1 hybrids accessions
Table 4 Parentage and self-compatibility analyses

Outcrossing estimates in BSG progenies and farmers’ fields

The total number of selfed progenies was estimated for each of the 24 BSG progenitors (Table 4). Selfed offspring were detected for 20 out of 24 progenitors, including both SC and SI progenitors. The proportion of selfed individuals for all SC progenitors together was similar to that of the group of SI progenitors.

The difference between tm and ts (tm−ts) provides an estimate of biparental inbreeding within each farm. Positive tm−ts values implied the existence of biparental inbreeding within the farm. The tm−ts values in all ‘traditional’ farms were positive (Table 5), suggesting that biparental inbreeding has occurred within these farms. In the ‘F1 hybrids’ farms, tm−ts values ranged from −0.26 to 0.36, with high values being associated with a large number of selfed BSG individuals and low values with a large number of F2+BC individuals. The ‘mixed’ farms recorded low to average biparental inbreeding, with tm−ts values ranging from −0.51 to 0.19.

Table 5 Outcrossing rates in 15 cacao farms studied and number of accessions in different genealogical classes

Discussion

In earlier work, the genetic diversity of farm populations was compared to that of gene bank and reference accessions (Efombagn et al., 2008). In the present study, the genetic composition of the same farm accessions has been compared to the BSG progenitors and to the traditional Amelonado variety (named ‘German Cocoa’ by farmers). The results shed light on the outcrossing patterns observed in BSGs and in farmers’ fields. They also help to explain the performance of different varieties in farmers’ fields and stress the need to improve seed garden management.

Composition of farm accessions

Bayesian analysis suggests that 25.5% of the 400 farm accessions studied are closely related to the traditional Amelonado variety called ‘German Cocoa’ by the farmers. Another 46.3% of the farm population (20.8% outcrossed and 25.5% selfed) were found to be descendants of the 24 parental clones used in BSGs established in the 1970s in southern and western Cameroon. Furthermore, 28.3% of the farm accessions appeared to descend from uncontrolled pollination events in cacao farms, which is related to the common practice of cacao growers of using seeds collected on their own farms for new plantings. Diversity parameters revealed that the rather ‘pure’ traditional accessions subgroup was less diverse than the BSG-derived accessions. Therefore, the existing diversity in cacao farms has not been reduced by the selection process carried out in Cameroon, but rather increased by the release of BSG material in farmers’ fields. However, the AR and GD of outcrossed and selfed BSG accessions of UA origin did not differ significantly, indicating high levels of heterozygosity of the UA progenitors used in seed gardens.

Expected and observed crossing patterns in BSGs

Hybrid cacao seed production in BSGs started in many countries in the 1960s and 1970s. In some countries hybrid seed has always been produced by hand pollinations, whereas in other countries (such as Cameroon) the strategy has been to harvest open-pollinated pods on the self-incompatible parental clone in the BSGs. The latter strategy was based on the assumption that self-incompatible genotypes would only yield cross-pollinated progenies. However, different rates of selfing have subsequently been observed when allo-pollen is mixed with auto-pollen (Lanaud et al., 1987), and such may happen with natural pollination in BSGs. Consequently hand pollinations should have been adopted in all hybrid seed production programs, but due to lack of financial resources related to low cocoa prices in the 1990s and early 2000s such measures have often been adopted with delay. In Cameroon, hand pollinations were only adopted to produce hybrid seed recently (2005). The data analyzed in the present study mainly refer to trees that derived from seed gardens before hand pollinations became adopted in the BSGs.

Despite the fact that 19 of the 24 BSG progenitors are considered self-incompatible, 55% of the BSG-derived individuals appear to be the result of self-pollinations. This can be explained by the fact that SI of cacao can be overcome by mixing self-compatible with self-incompatible pollen (Glendinning, 1960), as may be expected from insect pollinations in BSGs. Lanaud et al. (1987) have shown that selfing rates for eight open-pollinated progenitors of BSGs in Cameroon varied between 0 and 89%, and obtained 5–82% selfed seeds after hand pollination of self-incompatible progenitors with mixtures of self- and allo-pollen. These large variations suggest that SI is more easily overcome with some genotypes than with others. Furthermore, the level of SC may vary according to the genotype and can also be affected by seasonal variations (Voelcker, 1938). Physiologically, SI may be affected by changes in the biochemical content of cacao flowers (Hasenstein and Zavada, 2001) as well as by morphological changes in floral parts. Several factors may therefore interact to favor selfing. Consequently, uncontrolled pollination (as occurred in BSGs in Cameroon) is not an efficient method for hybrid seed production.

In our study, we further found that about 45% of F1 progenies derived from progenitors planted in different BSGs. Two hypotheses might explain this situation: the transfer of pollen from one BSG to another, or errors that occurred during field planting of progenitors, as recently observed in several BSGs. Furthermore, cacao pods produced by SC clones used as progenitors were not supposed to be harvested for seedling production and distribution. Nevertheless, selfed farm accessions deriving from SC progenitors were found, as shown in Table 4, which shows that the procedures and rules for seed production were not followed rigorously.

Outcrossing patterns in cacao farms

Our study shows that some accessions of the farm population are derived from natural hybridization between genotypes of various origins (F2+BC genotypes). This can be explained by the general practice of farmers to collect cacao seeds on-farm to produce seedlings for new plantings or replanting.

The three types of farms (Table 5) studied were defined according to farmers’ knowledge. The results on outcrossing and on genealogical groups present in these farms show that the farmers had identified the type of material in their farms quite correctly. High biparental inbreeding (tm−ts values) was correlated with high proportions of selfed (S1) or traditional genotypes (Amelonado) in the farms studied. Traditional farms appear to be still rather pure. However, the presence of some F2+BC genotypes in three farms suggests that the farmers have possibly used outcrossed seeds from neighboring farms for infilling. With regard to the hybrid farms, it is striking that very few F1 trees were identified in the younger farms (below 20 years). This suggests that the BSG descendants distributed over the last 20 years may have contained even more selfed seeds than the BSG descendants distributed before. The oldest ‘F1 hybrids’ farms also show more F2+BC accessions, which indicates a relatively high degree of outcrossing in these farms. This result also demonstrates clearly that some ‘hybrid farms’ contain very few pure F1 hybrids. In the mixed farms, F2+BC trees predominated, showing a very high level of outcrossing.

Implications for breeding and seed gardens management

Approximately 26% of the 400 farm accessions were classified as Amelonado. This traditional variety therefore still seems to be well represented in farmers’ fields. The GD (expected heterozygosity) of the traditional Amelonado variety (0.20) is similar to that of the reference genotypes used in the study, which are the same as those used by Efombagn et al. (2008). This suggests that the group of traditional farm accessions as identified by Structure is still rather pure. However, due to a certain level of introgression from other genomes into the traditional Amelonado background (Efombagn et al., 2008), it cannot be excluded that selections for improved agronomic traits will be possible within this group. The highest levels of GD were found in the hybrid (F1) and in the F2+BC progenies. The latter group has most likely been generated through natural pollination in farmers’ fields. This widely diverse group should offer good opportunities for further selection and breeding.

The parentage analysis indicates that 46% of the farm accessions analyzed descend from BSG progenitors, but only 45% of these appear to be the result of outcrossing. Furthermore, the identified parentage of 45% of the F1 hybrid genotypes did not correspond to the BSG combinations established in Cameroon. This may be due to mislabeling in the seed gardens or to uncontrolled pollinations between progenitors of neighboring BSGs. Consequently, only 25% of the seeds that have been distributed from the BSGs represent the hybrid cross combinations that were originally intended to be distributed to the farmers. The predominance of unselected progenies distributed from BSGs may well explain why the performance of the so-called ‘hybrid’ varieties has not been satisfactory in farmers’ fields. First, the high percentage (55%) of selfed Tr and UA genotypes among the BSG progenies may cause significant inbreeding effects on yield, as has been demonstrated in other African countries when comparing the yield of outcrossed and selfed UA (N’Goran et al., 2003) and Tr progenitors (Toxopeus, 1972). Second, the high percentage (49%) of Tr genotypes in the BSG progenies, which as a genetic group are generally considered to be highly susceptible to diseases and pests, may explain the complaints made by farmers that the ‘hybrid’ varieties are more susceptible to Phytophthora pod rot and dieback than the traditional varieties.
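The percentages quoted in this paragraph follow from the class counts reported in the Results (102 Amelonado, 102 selfed S1, 83 F1 and 113 F2+BC accessions). The short Python check below retraces the arithmetic; the variable names are illustrative, and the 45% share of F1 hybrids with unexpected parentage is taken directly from the text.

```python
# Class counts reported in the Results section.
amelonado, selfed, f1, f2_bc = 102, 102, 83, 113
total = amelonado + selfed + f1 + f2_bc  # 400 farm accessions

bsg_derived = selfed + f1  # accessions descending directly from BSGs
share_bsg = bsg_derived / total  # about 46% of all accessions
share_outcrossed = f1 / bsg_derived  # about 45% of BSG-derived material
share_selfed = selfed / bsg_derived  # about 55% of BSG-derived material

# 45% of F1 hybrids did not match a planned BSG combination, so the
# intended crosses are the remaining 55% of the outcrossed share.
unexpected_parentage = 0.45
share_intended = (1 - unexpected_parentage) * share_outcrossed

print(f"BSG-derived: {share_bsg:.1%}")
print(f"Outcrossed among BSG-derived: {share_outcrossed:.1%}")
print(f"Selfed among BSG-derived: {share_selfed:.1%}")
print(f"Intended hybrid combinations: {share_intended:.1%}")
```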

The results stress the need for strict control of pollinations in cocoa seed gardens in Cameroon. Only when this is correctly accomplished may farmers change their views on the value of hybrid varieties as compared to the traditional Amelonado variety.