Introduction

The majority of the UK Asian population can trace their origins to the Punjab region of India and Pakistan within the Indian subcontinent (Ballard, 1994). As a consequence of the cultural diversity of this region, in particular the influence of religion on marriage practices, there are important differences in marriage patterns between the different communities. For example, there is a widely recognised culture of consanguinity within many Pakistani Muslim communities (Bittles et al, 1991), whereas there is some evidence of caste endogamy within the Indian communities (Berry and Kaur, 1991; Mukherjee et al, 1999). These two systems of marriage practices will be reflected in quite different patterns of homozygosity at genetic loci, which in turn have important implications for the incidence of recessive disorders within these populations.

In this paper we use the UK Asian population to illustrate how the different marriage practices can affect the frequencies of deleterious recessive alleles by influencing the frequency of recessive homozygotes. We show that there has apparently been sufficient time for persistent consanguinity to purge recessive alleles within populations such as the Pakistani Muslim community; a notion that has previously been discounted (Darr and Modell, 1988). However, the high incidence of recessive disorders within the UK Pakistani population provides little evidence for this argument. We therefore investigate whether reproductive compensation, at the level identified by Koeslag and Schach (1984), can effectively counteract the purging of deleterious alleles within consanguineous populations.

The magnitude of consanguinity and population subdivision

Although Britain now has a substantial South Asian population, relatively little is known about the precise genetic consequences of the marriage patterns of its constituents. Nevertheless, there are strong indications that these marriage practices are of considerable clinical significance; particularly the high incidence of recessive disorders in Pakistani neonates associated with a similarly high incidence of consanguinity (Bundey et al, 1991).

In a previous study (Overall and Nichols, 2001), two UK Asian populations were sampled and typed for microsatellite loci using the SGM and SGM Plus commercial primer kits (PE Biosystems). The individuals sampled in this survey were selected so that no two individuals had grandparents in common. For each subject we collected information about their parents’ origins and any known relationship between them. None of the parents were sampled. One of the populations comprised Britons of Mirpuri ancestry, a predominantly Muslim community originating from the Punjab region of Pakistan, where around 50% of the sample were reported to be offspring of first-cousin unions. The second population were of Jullunduri ancestry. They are largely a Sikh community originating from the Punjab region of India, within which the marriages were described as exclusively exogamous (non-kin unions). It should be noted that, although every effort was made to develop a mutual trust between the volunteers and the researcher, the nature of the information is sensitive and, consequently, it could be inaccurate. Nevertheless, useful comparisons could be made between the sociological and genetic data.

Despite the contrasting marriage practices suggested by the questionnaire data, similarly large estimates of inbreeding were obtained for each sample. Point estimates (posterior medians) of the magnitude of inbreeding, FIS, were calculated for an earlier collection of the data, using the method of Ayres and Balding (1998), as 0.065 (0.016, 0.088) for the Jullunduri and 0.05 (0.008, 0.08) for the Mirpuri (Ayres and Overall, 1999), where the 95% highest posterior density intervals are given in parentheses. If these point estimates are taken to be equilibrium values, then they correspond to around 46–53% of the population being born to first-cousin unions (Hedrick and Cockerham, 1986). The proportion could be greater if the practice of consanguinity has been a recent introduction and equilibrium not yet reached. These findings clearly have health implications, but the estimates cannot be taken at face value because they do not correspond entirely with the sociological data. The consequences of consanguinity for health have been addressed in considerable detail for the UK Pakistani population (Terry et al, 1985; Chitty and Winter, 1989; Bundey et al, 1991). By contrast, excess homozygosity has not been reported in other UK Asian communities, nor have its possible explanations and their implications for health been examined, although high perinatal mortality has been recorded for UK Indians (Terry et al, 1985; Balarajan and Botting, 1989) and Bangladeshis (Bundey et al, 1991).

The main points of relevance when considering genetic disorders within populations are the relatedness of the parents and the frequency of the alleles in question. The first point is well recognised within consanguineous populations such as the UK Pakistanis although, as we point out below, it is important to determine the historical timing of changes in marriage practice, along with the rate at which the resulting allele frequencies approach equilibrium. Moreover, many factors other than persistent consanguinity will influence these dynamics, and these have received less attention in the medical literature regarding the health of consanguines. In particular, we extend the standard treatment by incorporating reproductive compensation as one such factor.

Consanguinity and the purging of recessive alleles within the UK Asian population

We will start from the simple textbook treatment of consanguinity, noting in passing the implications for the rate at which allele frequencies approach equilibrium, and then examine the effects of reproductive compensation. The increased mortality and morbidity observed in consanguineous populations (Rao and Inbaraj, 1979; Darr and Modell, 1988; Bittles, 1993) is usually attributed to the increased homozygosity, which exposes the effects of deleterious recessive alleles that would otherwise be hidden from selection in heterozygous individuals. However, the effect of persistent consanguinity over generations can be that the deleterious recessive alleles become purged from inbred populations. Indeed, Rao and Inbaraj (1979) have found evidence of low frequencies of recessive deleterious alleles within the South Indian Hindu population, a population with a history of consanguinity stretching over hundreds of generations.

Other populations probably have a shorter history of consanguineous marriages, including several Muslim groups from the Indian subcontinent where the practice may reach back only as far as the mid-sixteenth century. The postulate that this is sufficient time to purge deleterious recessive alleles has been contested (Darr and Modell, 1988).

Reproductive compensation

Reproductive compensation (RC) refers to the replacement of offspring lost to genetic disorders. It has been suggested that the net productivity of those parents who have lost offspring can be equal to, or even greater than, the population average (Koeslag and Schach, 1985). This compensation of lost offspring need not be a physiological adaptation, but rather a straightforward decision by the parents to replace their deceased infants. Reproductive compensation may be particularly significant where economic or social factors mean that families are small compared to the maximum reproductive rate. Within small families, diseased infants may be more likely to be replaced. As a consequence, parents with otherwise reduced fertility have a greater influence on the frequency of recessive alleles in future generations. Hastings (2000) considers this issue in the context of the shift towards small family sizes in China and Western Europe and we extend the issue here by also considering populations experiencing inbreeding and subdivision.

Methods

Consanguinity

The assertion that there has been insufficient time for the purging of deleterious recessive alleles (Darr and Modell, 1988) can be evaluated using standard theory. For example, consider a two-allele system, where the relative genotypic fitnesses are: wAA = 1, wAa = 1 and waa = 0. The expected frequency of the recessive allele in the following generation, q′, is given by the ratio of the expected number of heterozygous carriers to twice the total expected population size, which simplifies to q′ = q/(1 + q), where q is the frequency of the recessive allele a (eg, Hedrick, 2000). The same logic can be used to extend the equation to include mutation (A → a at rate μ) and consanguinity quantified by F, the probability of alleles being identical by descent (IBD). The probability of a dominant homozygote is then given by Pr(AA) = p² + pqF and that of a heterozygote by Pr(Aa) = 2pq(1 − F), where p = 1 − q. The expected allele frequency in the following generation becomes

Equilibrium between selection and mutation is eventually reached where q = q′, which is approximately,

(Haldane, 1939). The rate at which this recessive allele is purged from this population, using equation (1), is shown in Figure 1 (curve labelled F = 0.03). For illustrative purposes, the initial frequency of the recessive is set at 0.02, the average frequency of the cystic fibrosis mutation in Caucasian populations. The mutation rate was set at μ = q² = 0.0004 and the magnitude of inbreeding was held constant at F = 0.03.
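The displayed forms of equation (1) and of the equilibrium expression are not reproduced in the text above. As a hedged reconstruction from the stated definitions, one form consistent with them is sketched below. It assumes that mutation acts before selection, so that zygotes are formed at frequency Q = q + (1 − q)μ, and that Pr(aa) = Q² + Q(1 − Q)F by symmetry with Pr(AA). The ordering of mutation and selection is an assumption of this sketch, not something the text states.

```latex
\begin{align*}
Q  &= q + (1-q)\mu \\
q' &= \frac{\tfrac{1}{2}\Pr(Aa)}{1-\Pr(aa)}
    = \frac{Q(1-Q)(1-F)}{1 - Q^{2} - Q(1-Q)F}
    = \frac{Q(1-F)}{1 + Q(1-F)} \\
q' = q \;&\Longrightarrow\;
   F\hat{q} + (1-F)\hat{q}^{2} = (1-F)(1-\hat{q})^{2}\mu \approx \mu
\end{align*}
```

With F = 0 and μ = 0 the recursion reduces to q′ = q/(1 + q), as stated above. Under this reconstruction the equilibrium condition gives q̂ ≈ √μ when F = 0 and q̂ ≈ μ/F when F is much larger than q̂. For F = 0.03 and μ = 0.0004 the quadratic gives q̂ of about 0.01, roughly half the starting value of 0.02.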

Figure 1
Initial recessive allele frequency set at 0.02. Dashed line: large, randomly mating population maintaining mutation–selection equilibrium, F = 0. Solid line: change in frequency of recessive lethal allele within consanguineous population, F = 0.03.

A similar result can be obtained using the approach of Sanghvi (1966), although a slight discrepancy between the two procedures results through Sanghvi’s method ignoring powers of q of 3 and above.
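The trajectory described for Figure 1 can be approximated numerically. The following Python sketch is illustrative only: it assumes that mutation acts before selection (an assumption made here, not stated in the text), uses the genotype probabilities Pr(Aa) = 2pq(1 − F) and Pr(aa) = q² + pqF, and the function names `next_q` and `trajectory` are hypothetical.

```python
def next_q(q, F=0.0, mu=0.0):
    """One generation for a lethal recessive (w_AA = w_Aa = 1, w_aa = 0)."""
    # Assumed order of events: mutation A -> a first, then selection.
    Q = q + (1.0 - q) * mu
    het = 2.0 * Q * (1.0 - Q) * (1.0 - F)    # Pr(Aa) with inbreeding F
    hom_rec = Q * Q + Q * (1.0 - Q) * F      # Pr(aa), removed by selection
    # Carriers contribute one copy of a each; survivors are everyone except aa.
    return 0.5 * het / (1.0 - hom_rec)


def trajectory(q0, F, mu, generations):
    """Allele frequencies for generations 0..generations under constant F."""
    qs = [q0]
    for _ in range(generations):
        qs.append(next_q(qs[-1], F, mu))
    return qs


if __name__ == "__main__":
    q0 = 0.02            # initial frequency used in the text
    mu = q0 ** 2         # mutation rate set to q squared, as in the text
    qs = trajectory(q0, F=0.03, mu=mu, generations=30)
    for t in (0, 18, 30):
        print(t, round(qs[t], 4), f"{1 - qs[t] / q0:.0%} reduction")
```

Under these assumptions the reductions printed for generations 18 and 30 should fall close to the 33–42% range quoted in the Results, which suggests the sketch is at least broadly consistent with the model behind Figure 1.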

Reproductive compensation

The effect of RC on the expected frequency of recessive alleles over generations can be illustrated by considering a set of parents heterozygous for the same lethal recessive. Should their deceased offspring be replaced by a sibling, then ¼ of the time it will be an individual homozygous for the dominant allele, ½ of the time a heterozygote and ¼ of the time another recessive homozygote. For simplicity, it can be assumed that further affected individuals are replaced until a surviving sib is produced. This approach, which is treated in full by Koeslag and Schach (1985), can be approximated by considering that each loss is replaced ⅓ of the time by a dominant homozygote and ⅔ of the time by a heterozygote. Under this scheme, the recessive allele is maintained at a higher frequency in the population because the surviving offspring is twice as likely to be a heterozygous carrier as a dominant homozygote. The frequency of the recessive in the following generation then becomes

and with mutation

The equilibrium frequency of this recessive lethal, when powers of q of 3 and above are ignored, is approximately

This result is similar to that arrived at by Hastings (2000) and results, in our example, in an equilibrium frequency around 20% greater than strict mutation-selection equilibrium (Figure 3, R = 1).
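The displayed equations (2) and (3) are not reproduced above. The following is a hedged sketch of how the stated replacement scheme leads to an equilibrium about 20% above √μ. It assumes random mating, exactly one surviving replacement per loss (⅓ dominant homozygote, ⅔ heterozygote), and mutation added as a simple increment μ; these simplifications are made here for illustration and may differ in detail from the authors' own expressions.

```latex
\begin{align*}
\Pr(AA) &= p^{2} + \tfrac{1}{3}q^{2}, \qquad
\Pr(Aa) = 2pq + \tfrac{2}{3}q^{2}, \qquad p = 1-q \\
q' &= \tfrac{1}{2}\Pr(Aa) = pq + \tfrac{1}{3}q^{2} = q - \tfrac{2}{3}q^{2} \\
q' &\approx q - \tfrac{2}{3}q^{2} + \mu
   \quad \text{(with mutation, dropping terms of order } \mu q \text{)} \\
\hat{q} &\approx \sqrt{\tfrac{3}{2}\mu} \approx 1.22\sqrt{\mu}
\end{align*}
```

For μ = 0.0004 this gives q̂ ≈ 0.024 against √μ = 0.02, in line with the roughly 20% excess quoted above.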

Figure 3
Initial recessive allele frequency set at 0.02, the equilibrium value assumed within large, randomly mating population in mutation–selection equilibrium. Change in frequency of recessive lethal within consanguineous populations represented by line labelled F = 0.03. Change in frequency of recessive within populations practising both consanguinity and reproductive compensation (between 1–3 offspring/genetic death) labelled R and F = 0.03. Change in frequency within populations practising reproductive compensation only labelled R. Dashed line: mutation–selection equilibrium within randomly mating population, F = 0.

RC clearly has a greater effect when lost offspring are compensated for in excess, rather than the straightforward single replacement outlined above. There is evidence that this does in fact occur when infant mortality rates are perceived to be high within a community (Koeslag and Schach, 1984, and references therein). In this case, where R = number of replacement offspring, equation (2) takes the general form of

A rearrangement of equation (3) in terms of the incidence of disease, q̂², highlights the effect RC has on augmenting the incidence of a recessive disorder above that expected simply through the effects of recurrent mutation:

The issue of particular interest is the effect of RC when practised within populations that also practise consanguinity. The inbreeding coefficient, F, can be introduced into the above method using the same logic employed in equation (1),

where Q = q+(1−q)μ. This results in an equilibrium allele frequency of

The results of applying equation (5) are given in Figure 3.
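Equations (4) and (5) are not reproduced above, so the combined effect of RC and consanguinity is sketched numerically here instead. The Python below is a minimal illustration under assumptions made for this sketch: mutation acts first, giving Q = q + (1 − q)μ; each recessive homozygote death is replaced by R surviving sibs, of which ⅓ are dominant homozygotes and ⅔ heterozygotes; and frequencies are renormalised over all survivors. The function names `next_q_rc` and `equilibrium` are hypothetical, and the recursion should not be read as the authors' exact equation.

```python
def next_q_rc(q, R=0.0, F=0.0, mu=0.0):
    """One generation with inbreeding F and R replacements per aa death."""
    Q = q + (1.0 - q) * mu                    # frequency after mutation
    deaths = Q * Q + Q * (1.0 - Q) * F        # Pr(aa): genetic deaths
    het = 2.0 * Q * (1.0 - Q) * (1.0 - F)     # Pr(Aa) before replacement
    # Assumed scheme: 2/3 of the R replacements per death are carriers.
    carriers = het + (2.0 / 3.0) * R * deaths
    survivors = 1.0 - deaths + R * deaths     # AA + Aa including replacements
    return 0.5 * carriers / survivors


def equilibrium(R, F, mu, q0=0.02, generations=20000):
    """Iterate to an approximate equilibrium (intended for R < 3)."""
    q = q0
    for _ in range(generations):
        q = next_q_rc(q, R, F, mu)
    return q


if __name__ == "__main__":
    mu = 0.0004
    for R, F in [(0, 0.0), (1, 0.0), (2, 0.0), (0, 0.03), (2, 0.03)]:
        print(f"R={R}, F={F}: q_hat = {equilibrium(R, F, mu):.4f}")
```

With R = 0 the recursion collapses to selection against the lethal recessive without compensation, and with F = 0 and R = 1 the equilibrium sits roughly 20% above the R = 0 value, as described in the text. At R = 3 the a alleles lost with each death are fully restored on average, so this sketch is only meaningful for R below 3.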

Results

Figure 1 illustrates how consanguinity (in the absence of RC) is expected to markedly reduce the frequency of this recessive allele over a period of around 25 generations, contrary to Darr and Modell’s (1988) assertion.

The most striking feature of the trajectory is the initial increase in incidence at the advent of consanguinity. Figure 2 (solid line) plots the incidence of the disorder in a population where F = 0.03 begins at generation 1. This outcome of the model illustrates the principle that the change in the proportion of consanguineous unions may be of more significance than the proportion per se in determining the incidence of deleterious recessive traits. Eventually, a new mutation–selection equilibrium is established where the incidence returns to its pre-consanguineous level, but where the recessive allele has been much reduced in frequency.

Figure 2
Initial incidence frequency of recessive disorder set at 0.0004, the equilibrium value assumed within large, randomly mating population in mutation–selection equilibrium. Solid line: change in incidence of lethal disorder within consanguineous population (consanguinity, F = 0.03, 0–30 generations; consanguinity, F = 0.04, from generation 30). Dashed line: change in incidence within population adopting consanguinity, F = 0.03, at generation 30.
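The incidence pattern described for the solid line in Figure 2 can be illustrated with a short Python sketch. It is a hedged approximation, not the authors' code: it assumes mutation before selection, evaluates incidence as Pr(aa) = q² + q(1 − q)F at the pre-mutation frequency q (a simplification), and uses a hypothetical schedule `f_schedule` in which F = 0 at generation 0, F = 0.03 for generations 1 to 29 and F = 0.04 from generation 30.

```python
def next_q(q, F, mu):
    """Lethal recessive recursion with inbreeding F; mutation assumed first."""
    Q = q + (1.0 - q) * mu
    return Q * (1.0 - F) / (1.0 + Q * (1.0 - F))


def incidence(q, F):
    """Probability of a recessive homozygote, Pr(aa) = q^2 + q(1 - q)F."""
    return q * q + q * (1.0 - q) * F


def f_schedule(t):
    """Hypothetical history of inbreeding used for this illustration."""
    if t < 1:
        return 0.0      # random mating before consanguinity begins
    if t < 30:
        return 0.03     # persistent consanguinity
    return 0.04         # slight recent increase


def incidence_path(q0, mu, generations, F_of_t):
    """Incidence in generations 0..generations under a changing F."""
    q, path = q0, []
    for t in range(generations + 1):
        F = F_of_t(t)
        path.append(incidence(q, F))
        q = next_q(q, F, mu)
    return path


if __name__ == "__main__":
    path = incidence_path(0.02, 0.0004, 60, f_schedule)
    for t in (0, 1, 29, 30, 60):
        print(t, f"{path[t]:.6f}")
```

Under these assumptions the incidence more than doubles when consanguinity begins, declines over the following generations as the allele is purged, and rises again when F is increased at generation 30, which is the qualitative pattern the caption describes.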

The simple sequence of events in Figure 2 represents the approximate pattern suggested for the history of the Pakistani population, specifically the initiation of a consanguineous tradition coinciding with the popular adoption of Islam in the middle ages (Darr and Modell, 1988). With approximately 450 years since then, this gives between 18 and 30 generations, assuming a generation time of between 25 and 15 years. In our example in Figure 1, this corresponds to a reduction of the initial frequency of around 33–42%. Figure 2 illustrates that this reflects a reduction in incidence close to pre-consanguineous levels. The effect of small recent changes in the marriage pattern may be particularly relevant to current patterns of incidence. Darr and Modell (1988) observed that migration into the UK coincided with an increase in consanguinity. Although it was only a slight increase, the genetic effects could be marked. An increase of the same magnitude was introduced at generation 30 in Figure 2 (the value of F was increased to 0.04). This small change produces a burst in the incidence.

Otherwise non-consanguineous populations, such as the Jullunduri, have very different histories. Indeed, if the magnitude of F, calculated for the Jullunduri sample, is due to some event associated with their immigration, such as restrictive immigration laws influencing traditional marriage practices, then the incidence frequency will have increased only recently. This is illustrated by the dashed line in Figure 2. Essentially, by having only a very recent history of inbreeding, such communities would suffer more severely through experiencing recessive alleles at higher frequencies.

Because of the absence of parental genotypes for the Jullunduri sample, examination of parental relationships was not possible. Despite this, it is doubtful that the high FIS estimate in the Jullunduri community is actually explained by such a spectacular increase in consanguineous unions. Analysis of the multilocus pattern of homozygosity (Overall and Nichols, 2001) and sociological information (Berry and Kaur, 1991) both indicate that consanguineous unions are infrequent but that zaat, or caste, endogamy is relatively strong. These castes are likely to range considerably in size; a situation probably enhanced initially by a migration event. It is always possible, therefore, that the strict practice of endogamy within the less-well represented castes will, as a consequence of the inheritance of caste down a family line, result in consanguineous unions. Such unions are made more likely by the tight familial networks developed as a result of the early immigration laws, in particular, laws which provided relatively easy passage of kin into the UK (Darr and Modell, 1988).

A general increase in consanguineous marriage, as represented in Figure 2, may not then be applicable to the Jullunduri sample in its entirety, but may well be relevant to small subgroups (castes) of the population. Consequently, a broad treatment of ethnicity and religion regarding a substructured population may provide inadequate information and be of little benefit in the context of genetic counselling and forensic science. More detailed information that takes account of the finer scale of substructuring, such as caste endogamy, and the potential marriage networks within these castes, may prove more revealing.

Evidence for purging of recessives

One important way in which the standard treatment, outlined above, can be made more realistic is to include the possibility that human reproductive behaviour is modified by the occurrence of the deleterious effects. This may go some way to explain why the evidence for purging of deleterious recessive alleles is equivocal. Communities such as the South Indian Hindu and Pakistani Muslim populations have practised close-kin marriage for several hundreds of years (Darr and Modell, 1988) and would therefore be expected to show reduced genetic load. The best evidence of such an effect is, perhaps, from the reduced sterility and increased fertility observed in the population of Southern India (Rao and Inbaraj, 1979). On the other hand, the incidence of genetic disorders within the UK Pakistani population is significantly higher than the incidence in European Caucasians (Terry et al, 1985; Darr and Modell, 1988; Chitty and Winter, 1989; Bundey et al, 1991). Hutchesson et al (1998) determined the frequencies of the ten most common autosomal recessive inborn errors of metabolism within UK Pakistani and UK Caucasian infants. They calculated that, although the incidence of these disorders in the Pakistani infants was around 10 times higher, the frequency of the recessive alleles was not significantly different between the two groups. Substantial purging of these deleterious alleles appeared not to have occurred within these populations.

One straightforward possibility, to explain why the effects of purging are not apparent, is that consanguinity has been more sporadic than is suggested by current practice. The approach illustrated in Figure 2 can be extended to evaluate the alternative suggestion that reproductive compensation may explain the apparent inefficiency of the purging (Rao and Inbaraj, 1979). Indeed it has been suggested that, in consanguineous societies such as the Pakistani Muslim community, RC is a response to increased early postnatal mortality (Schull et al, 1962; Bittles, 1993).

Consequences of reproductive compensation in consanguineous and substructured populations

The effect of RC, then, will be to sustain high frequencies of deleterious recessive alleles. In the absence of consanguinity, RC will compound any increase in allele frequencies that may have occurred through the random fluctuations in frequency caused by drift. Because genetic drift is important for populations that are subdivided into quite small subpopulations, it may be of relevance to populations such as the UK Jullunduri community. Overall and Nichols (2001) identified subdivision within the UK Jullunduri population that was most likely due to caste endogamy. An earlier study also observed that the sizes of these castes varied significantly (Overall, 1998). If drift has indeed elevated the frequency of deleterious alleles within any of these castes, the effect of RC will be to reduce the rate at which equilibrium is restored. This scenario has been suggested as a possible cause for the unusually high recessive allele frequencies giving rise to Tay-Sachs, and other lipid-storage disorders, found in the Ashkenazi Jewish population (Koeslag and Schach, 1984). Because of this, the combination of caste endogamy and RC is likely to complicate the interpretation of disease frequencies within structured populations such as the Jullunduri.

Conversely, drift is unlikely to have been of much significance for the UK Pakistani populations as the initial migration involved considerable numbers which rapidly expanded without any substantial impediment to the migration of spouses between the two countries (Ballard, 1990). The action of consanguinity and RC within populations such as the Mirpuri is more likely to be of significance. Of the two factors, consanguinity appears to have a more substantial effect on recessive allele frequency than RC (Figures 1 and 3). Indeed, if there is to be any substantial influence resulting from RC, lost individuals are required to be replaced at a rate in excess of 1 per genetic death. With our example in Figure 3, inbreeding (F = 0.03) reduces the initial allele frequency by around 50% at equilibrium. For RC to increase the initial frequency by 50%, replacement is required at a rate of R = 2. This is not an unrealistic scenario as, according to Koeslag and Schach (1984), rates of compensation of R = 2 are conceivable, and levels of inbreeding above F = 0.03 are not common amongst human populations. In a review of several populations, Koeslag and Schach (1984) calculated an average value of R = 2.3 (where the R used by these authors is defined as the number of surviving replacements per recessive homozygote death). If we model a consanguineous population with F = 0.03 and RC at a rate of 2 (Figure 3, R = 2, F = 0.03), we can see that the purging effect of continuous inbreeding is effectively nullified.

Discussion

We have emphasised the influence different mating traditions and reproductive strategies have on the frequencies of recessive alleles. In an extreme example, we illustrated the case that there can be a three-fold difference in the equilibrium frequency between a population inbreeding (F = 0.03) and a population practising RC with R = 2, corresponding to a nine-fold difference in incidence frequency, considering the equilibrium allele frequencies in Figure 3.

If we consider the incidence of disease, rather than focusing only on the surviving offspring, the probability of a homozygote for the recessive is approximately

indicating that for every recessive homozygote death, R/3 additional deaths occur through reproductive compensation. This equation gives the incidence of genetic death when this death has been compensated for by a live birth, ignoring the possibility that for each live replacement several deaths may have occurred. When allele frequencies are estimated from disease incidence, ignoring the effect of consanguinity and RC will lead to over-estimates.

Indirect estimates of R have generally been calculated by comparing the sizes of groups known to carry recessive alleles, such as Tay-Sachs in Ashkenazi Jews, relative to groups assumed not to carry these alleles (Koeslag and Schach, 1984). Although these authors generally find consistent values of R from several independent studies, the group sizes have been small and measures of error possibly large (JH Koeslag, personal communication). Indeed, a study based on compensation resulting through Tay-Sachs-related mortality produced a value of R = 4.9 (Koeslag and Schach, 1984).

Despite the theoretical significance of RC, there is a shortage of direct evidence to suggest that compensation is occurring in human populations. Long-term studies are required to identify both foetal loss and completed family sizes, the former being absent in the studies surveyed by Koeslag and Schach (1984). One example where this was largely achieved is the Ober et al (1999) study of the Hutterite population now resident in South Dakota, USA. This study not only demonstrated a negative relationship between inbreeding and fitness within adult females, but also suggested that RC was occurring among the more inbred females. The effect may be largely due to the exercise of choice over family size rather than an effect of foetal loss. In particular, families in which there was no evidence of genetic disorders were smaller than the reproductive maximum. The affected families could, therefore, achieve similar productivity. Similar surveys would provide valuable insights into the relative risks within the UK Asian communities. Indeed, this present study suggests that it is necessary to identify reproductive compensation if the relationship between disease incidence, allele frequency and consanguinity is to be better understood. For groups known to have higher frequencies of genetic mortality, such as the UK Pakistani population, the incorporation of reproductive compensation into models of disease incidence will aid in clarifying this often confusing issue and focus much-needed attention on other health aspects of the British Asian community, in particular away from the often exaggerated notions of the deleterious effects of consanguineous marriage (Modell, 1991).