General population surveys can yield unbiased estimations of the effects of inbreeding, but such surveys are rare, especially those concerning the effects of inbreeding on adults.1 In humans, results of extensive analyses on the effects of inbreeding on fertility are subject to a number of potential limitations, due to lack of control for important sociodemographic variables such as literacy, age at marriage, use of contraceptives and duration of marriage.2, 3, 4 A recent meta-analysis of data on 30 populations in six countries uncovered no evidence for an association between consanguinity and fertility when controlling for these confounding effects.4

Conversely, a few studies have reported evidence of an effect of inbreeding on fertility in humans.5 However, in modern societies, it appears that such an effect can hardly be assessed through its impact on productivity, owing to reproductive compensation.5, 6 Reproductive compensation refers to the replacement of offspring lost to genetic disorders. It has been suggested that the net productivity of those parents who have lost their offspring can be equal to, or even greater than, the population average.7 Other demographic variables have been used to examine the relationship between inbreeding and fertility. These include the protogenesic interval (the period of time between marriage and the first birth) and the intergenesic interval (the time between any two successive birth events of siblings in a genetic family). However, these variables do not use any information on the timing of reproduction (eg, variation in productivity or in resources allocated to reproduction along an individual's reproductive period). Yet fertility exhibits an age-related decline in many species,8, 9 including humans.10 Further, the sensitivity of fertility to inbreeding might also vary with parental age. Several empirical studies have, for instance, shown that mutations can have age-specific effects on fitness components,11, 12 and these studies are generally in good agreement with the general evolutionary theories of senescence.13, 14 The evolutionary basis of senescence is generally explained by two widely accepted (and not mutually exclusive) theories. The antagonistic pleiotropy theory15 attributes senescence to the fixation of alleles with pleiotropic effects that favor early life fitness but bear a cost in later life.
In contrast, the mutation accumulation hypothesis attributes senescence to the accumulation of deleterious mutations with late-acting effects on fitness.16 Under this hypothesis, a newly arising mutation that reduces fertility or survival but is only expressed late in life would experience little selection against it and may increase in frequency through drift. The same deleterious mutation expressed earlier would be subject to counter-selection. As a consequence, late-acting deleterious mutations are more likely to accumulate than early-acting ones.16, 17 The mutation accumulation model has led to the prediction that inbreeding depression should increase with age (because inbreeding depression is inversely related to sensitivity18). This prediction is supported empirically for both mortality and fertility.19

In this paper we study a Canadian population to investigate the effect of consanguineous marriages on reproduction. Our work focuses on a cohort of women born in the late 19th century in Saguenay-Lac-Saint-Jean, a region located in north-eastern Quebec where carriers of certain rare recessive inherited disorders can be found at very high frequencies.20, 21 The Saguenay-Lac-Saint-Jean population provides the opportunity to estimate the effect of inbreeding with reduced confounding effects of socioeconomic factors, which are expected to be relatively uniform within the population of this period.22 We pay special attention to temporal aspects of reproduction to disentangle social and biological factors that are linked to inbreeding levels and potentially have contrasting effects on fertility. In a possible context of reproductive compensation, and in the light of theoretical developments on the evolution of senescence, we predict that (i) the reproductive performance of individuals varies along their reproductive period and (ii) the magnitude of this variation is related to their inbreeding level.


The Saguenay-Lac-Saint-Jean population

The Saguenay-Lac-Saint-Jean region was opened for settlement during the second quarter of the 19th century.22 The first settlers came mainly from the nearby region of Charlevoix23 and the population experienced rapid growth.24 Although subsequent migration from and to other regions of Quebec contributed to shaping the Saguenay-Lac-Saint-Jean population to its present form and structure,25 population growth was mostly due to very high fecundity.26 The population is consequently relatively isolated from a genetic viewpoint. However, due to the relatively large population size (the present population numbers approximately 274 000; see Institut de la statistique du Québec27), close inbreeding (3–5 generations) is not very high compared with other regions of Quebec. In contrast, long-term inbreeding is considered particularly high.25

The BALSAC population database

The BALSAC database contains genealogical, historical and demographic information on the population of Quebec. This information was obtained mainly from parish records (birth, marriage and death certificates). For the Saguenay-Lac-Saint-Jean region, all records from the beginning of settlement (early 19th century) to 1971 have been computerized and linked. Thus, family reconstructions for the Saguenay-Lac-Saint-Jean population can be traced for a period of nearly 140 years. Moreover, genealogical data (from marriage records only) for the founders of the Saguenay-Lac-Saint-Jean population (and for subsequent immigrants from other regions of Quebec) are available from the early 17th century (beginning of settlement in Quebec) to 1940,28 which corresponds to >300 years (or >10 generations) of data.29

Description of data

Both demographic and genealogical data were used in our analysis. We used data on 172 women born in 1879 in the Saguenay-Lac-Saint-Jean region, who were married only once and had at least three children (see below for justification). This cohort was chosen to (i) maximize genealogical depth and (ii) have all necessary demographic data on these women (eg, marriage of their children during the 20th century). Women born in 1879 and their husbands are hereafter referred to as the parent cohort, and their children are referred to as the offspring cohort. Demographic data for the parent cohort consisted of the date of marriage, number of children produced, number of married children in the population, the ages at first and last childbearing and the average age at childbearing. Further, the intergenesic interval (average time between two successive birth events) and the protogenesic interval (interval between marriage and birth of the first child) were computed for each woman, as well as an index quantifying the asymmetry in productivity between the first and second halves of the reproductive period, defined as ASYM=average age at childbearing−0.5 × (age at first childbearing+age at last childbearing). For a given couple, a positive value of ASYM means that birth intervals were longer early in marriage than later on, whereas a negative value indicates that births became spaced further and further apart as the couple aged. Because ASYM is meaningless for couples with fewer than three children, the data set only included couples having had three or more children.
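
As an illustration of the ASYM index defined above, the following minimal Python sketch computes it from a list of maternal ages at each birth. The function name and the three example mothers are hypothetical and are not taken from the BALSAC data.

```python
from statistics import mean


def asym(birth_ages):
    """ASYM = mean age at childbearing - 0.5 * (age at first + age at last childbearing)."""
    if len(birth_ages) < 3:
        # With one or two births the mean age always equals the midpoint.
        raise ValueError("ASYM is only meaningful with three or more births")
    ages = sorted(birth_ages)
    return mean(ages) - 0.5 * (ages[0] + ages[-1])


# Hypothetical mothers (maternal age in years at each birth).
early_loaded = [20.0, 21.0, 22.0, 30.0]  # births concentrated early, then a long gap
late_loaded = [20.0, 28.0, 29.0, 30.0]   # a long gap first, then closely spaced births
evenly_spaced = [20.0, 25.0, 30.0]

print(asym(early_loaded), asym(late_loaded), asym(evenly_spaced))
```

Births concentrated in the first half of the reproductive period pull the mean age below the midpoint and give a negative value, whereas evenly spaced births give zero.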

Ascending genealogies for these 172 women were reconstructed as far back as the available data would permit. Most genealogical branches were traced back to the early 17th century. All genealogical paths linking these women through their common ancestors were established.

Genealogical analysis

In genealogical analysis, a founder refers to a person whose ancestors are unknown. A numerical gene-dropping method30 was used to compute different genetic variables. Two unique alleles at a single locus were attributed to each founder, and random Mendelian segregation (ie, transmission of one allele from each parent, chosen at random with equal probability through a Bernoulli trial) occurred according to the actual genealogical structure. The process was repeated 1 000 000 times to compute the following coefficients.

For all individuals of the genealogy, the inbreeding coefficient was defined as the probability that the two alleles at any given locus from this individual are identical by descent31 (we thus assumed that all founders of the genealogy had a zero inbreeding coefficient). Long-term inbreeding coefficients of the individuals of the parent cohort (fm and ff for males and females, respectively) and of their offspring (fo) were computed using the complete genealogy. Due to demographic changes in Quebec and in the Saguenay-Lac-Saint-Jean community (population growth in Quebec between the early 17th century and the late 19th century) and to the occurrence of past selection against the most inbred individuals, long-term inbreeding and inbreeding due to very recent mating between relatives may have different effects on fertility. Therefore, we also calculated close inbreeding coefficients (hereafter fm3G, ff3G and fo3G) by redefining founders to have a maximum depth of three generations (ie, starting from the parent cohort (depth zero of the genealogy), common ancestors were only considered at a depth of three generations).
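
The gene-dropping procedure can be sketched as follows. This is a minimal illustration under stated assumptions, not the program used for the analysis: the pedigree is a hypothetical dictionary mapping each individual to a (father, mother) pair, founders map to (None, None), every non-founder has both parents recorded, and parents are listed before their children.

```python
import random


def gene_drop_inbreeding(pedigree, focal, n_iter=20000, seed=0):
    """Estimate the inbreeding coefficient of `focal` by gene dropping.

    pedigree: dict individual -> (father, mother); founders map to (None, None).
    Assumption: the dict lists parents before their children.
    """
    rng = random.Random(seed)
    identical = 0
    for _ in range(n_iter):
        alleles = {}
        next_label = 0
        for ind, (father, mother) in pedigree.items():
            if father is None and mother is None:
                # Each founder carries two unique alleles.
                alleles[ind] = (next_label, next_label + 1)
                next_label += 2
            else:
                # Mendelian segregation: one allele from each parent, equiprobably.
                alleles[ind] = (rng.choice(alleles[father]),
                                rng.choice(alleles[mother]))
        a, b = alleles[focal]
        identical += (a == b)  # identical labels mean identity by descent
    return identical / n_iter


# Hypothetical pedigree: X is the child of first cousins C and D.
cousin_pedigree = {
    "G1": (None, None), "G2": (None, None),
    "A": ("G1", "G2"), "B": ("G1", "G2"),
    "S1": (None, None), "S2": (None, None),
    "C": ("A", "S1"), "D": ("B", "S2"),
    "X": ("C", "D"),
}

f_hat = gene_drop_inbreeding(cousin_pedigree, "X")
print(f_hat)  # the textbook value for offspring of first cousins is 1/16
```

A close inbreeding coefficient could be obtained from the same function by first truncating the pedigree so that ancestors beyond the chosen depth are treated as founders.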

Besides these measures of inbreeding, we used the genealogy to compute the coefficient of average kinship of each individual of the parent cohort with its contemporary population. For each male of the parent cohort, we computed the probability, rm, that an allele sampled at random in the focal individual was identical (by descent) to either of the alleles found at the same diploid locus of a randomly chosen individual of either gender in the rest of the parent cohort (excluding the focal individual). The same protocol was used to compute the coefficients of average kinship of females of the parent cohort rf. As socioeconomic heterogeneity is expected to create genetic structure, we used these coefficients of average kinship with the contemporary population as social indices to differentiate individuals belonging to large population subunits (the ‘core’ of the population, with large values of coefficients of average kinship) from those from small subunits (the ‘periphery’ of the population).

Differences in genealogical depth (ie, the number of generations used to compute inbreeding and kinship coefficients) may reflect demographic and social differences (eg, individuals from old versus recently established families) as well as differences in data availability (identification of founders). We therefore used the genealogies to compute the genealogical depth of the offspring cohort. Starting from the offspring generation, a parent (ie, mother or father) was randomly chosen at each generational step until a founder was encountered, and the number of generations traversed was averaged over a large number of simulations. Alternatively, the genealogical depth of each offspring was computed by following the patrilineal or matrilineal genealogy to uncover potential differences between sexes.
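
The random-walk estimate of genealogical depth described above can be sketched in a few lines of Python. The pedigree format (a dictionary mapping each individual to a (father, mother) pair, with founders mapped to (None, None)), the function name and the `lineage` argument are assumptions made for illustration; depth is counted here as the number of parent-choosing steps taken before a founder is reached.

```python
import random


def genealogical_depth(pedigree, focal, n_iter=10000, seed=0, lineage=None):
    """Mean number of generations walked from `focal` to a founder.

    lineage=None       : choose father or mother at random at each step
    lineage="paternal" : always follow the father
    lineage="maternal" : always follow the mother
    """
    rng = random.Random(seed)
    total = 0
    for _ in range(n_iter):
        current, steps = focal, 0
        while True:
            father, mother = pedigree[current]
            if father is None and mother is None:
                break  # a founder has been reached
            if lineage == "paternal":
                current = father
            elif lineage == "maternal":
                current = mother
            else:
                current = rng.choice((father, mother))
            steps += 1
        total += steps
    return total / n_iter


# Hypothetical unbalanced pedigree: the paternal branch stops one generation earlier.
small_pedigree = {
    "GF": (None, None), "GM": (None, None),
    "F": (None, None), "M": ("GF", "GM"),
    "X": ("F", "M"),
}

print(genealogical_depth(small_pedigree, "X"))                      # close to 1.5
print(genealogical_depth(small_pedigree, "X", lineage="paternal"))  # 1.0
print(genealogical_depth(small_pedigree, "X", lineage="maternal"))  # 2.0
```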

Computation of each genetic index (inbreeding and kinship measures) and of genealogical depth was performed by averaging over 1 000 000 iterations. Additionally, long-term inbreeding and kinship coefficients were computed through exact computation using a C-program implementing the recursive algorithm described by Thompson32 (equations (9) and (10), p. 25). Inbreeding coefficients obtained through exact calculation and numerical gene dropping were compared to ensure validity of the method (Kendall correlation coefficient τ>0.999).
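
For illustration, an exact computation can be sketched with the standard recursive definition of the kinship coefficient; this may differ in detail from the equations of Thompson32 and from the C-program used here. The pedigree format (dictionary of (father, mother) pairs, founders mapped to (None, None), parents listed before children) and all names are assumptions.

```python
from functools import lru_cache


def make_kinship(pedigree):
    """Return a function phi(i, j) giving the exact kinship coefficient.

    Assumption: the dict lists parents before their children, so insertion
    order can be used to decide which of two individuals was born later.
    """
    order = {ind: k for k, ind in enumerate(pedigree)}

    @lru_cache(maxsize=None)
    def phi(i, j):
        if order[i] < order[j]:
            i, j = j, i  # make i the later-listed individual
        father, mother = pedigree[i]
        if father is None and mother is None:
            # Founders are non-inbred and unrelated to anyone listed before them.
            return 0.5 if i == j else 0.0
        if i == j:
            return 0.5 * (1.0 + phi(father, mother))
        return 0.5 * (phi(father, j) + phi(mother, j))

    return phi


def inbreeding(pedigree, ind):
    """Inbreeding coefficient = kinship coefficient between the two parents."""
    father, mother = pedigree[ind]
    if father is None and mother is None:
        return 0.0
    return make_kinship(pedigree)(father, mother)


# Hypothetical pedigree: X is the child of first cousins C and D.
cousin_pedigree = {
    "G1": (None, None), "G2": (None, None),
    "A": ("G1", "G2"), "B": ("G1", "G2"),
    "S1": (None, None), "S2": (None, None),
    "C": ("A", "S1"), "D": ("B", "S2"),
    "X": ("C", "D"),
}

phi = make_kinship(cousin_pedigree)
print(phi("A", "B"))                     # full siblings: 0.25
print(inbreeding(cousin_pedigree, "X"))  # offspring of first cousins: 0.0625
```

For genealogies spanning many generations, the recursion depth grows with pedigree depth, so an iterative, generation-by-generation version would be a safer choice.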

Statistical analysis

Relationships between demographic variables (number of produced and married children, proportion of married children among produced children, first, last and average ages at childbearing, intergenesic and protogenesic intervals, ASYM) and genetic variables (rm, rf, fm, ff, fo, fm3G, ff3G, fo3G) were investigated using two statistical methods. First, hierarchical partitioning analyses33 (HP) were run using each demographic variable in turn as the dependent variable and all genetic variables simultaneously as explanatory variables. HP uses all models in a regression hierarchy to distinguish those variables that have high independent correlations with the dependent variable.34 Results were expressed in terms of percentage of total independent effect of each explanatory variable on the dependent variable (R²). A randomization test (1000 iterations) was performed for testing the significance of each explanatory variable in each analysis.35
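
The averaging idea behind HP can be sketched as follows, assuming ordinary least-squares fits with R² as the goodness-of-fit measure and non-collinear predictors. This pure-Python illustration is not the implementation used for the analysis; the function names and the toy data are hypothetical.

```python
from itertools import combinations
from math import factorial


def r_squared(y, columns):
    """R^2 of an ordinary least-squares fit of y on `columns` plus an intercept."""
    n = len(y)
    y_mean = sum(y) / n
    ss_tot = sum((v - y_mean) ** 2 for v in y)
    if not columns:
        return 0.0
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(X[0])
    # Augmented normal equations (X'X | X'y).
    A = [[sum(X[i][r] * X[i][c] for i in range(n)) for c in range(p)]
         + [sum(X[i][r] * y[i] for i in range(n))] for r in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(p):
        pivot = max(range(c, p), key=lambda r: abs(A[r][c]))
        A[c], A[pivot] = A[pivot], A[c]
        for r in range(p):
            if r != c:
                factor = A[r][c] / A[c][c]
                A[r] = [a - factor * b for a, b in zip(A[r], A[c])]
    beta = [A[r][p] / A[r][r] for r in range(p)]
    ss_res = sum((y[i] - sum(b * x for b, x in zip(beta, X[i]))) ** 2
                 for i in range(n))
    return 1.0 - ss_res / ss_tot


def hierarchical_partitioning(y, predictors):
    """Independent contribution of each predictor to R^2.

    For each predictor, the gain in R^2 from adding it is averaged over all
    subsets of the other predictors within each hierarchy level, then over levels.
    """
    names = list(predictors)
    k = len(names)
    fit = {frozenset(s): r_squared(y, [predictors[v] for v in s])
           for size in range(k + 1) for s in combinations(names, size)}
    independent = {}
    for name in names:
        others = [v for v in names if v != name]
        total = 0.0
        for size in range(k):
            weight = factorial(size) * factorial(k - size - 1) / factorial(k)
            for s in combinations(others, size):
                total += weight * (fit[frozenset(s) | {name}] - fit[frozenset(s)])
        independent[name] = total
    return independent


# Hypothetical toy data: y tracks x1 closely; x2 is correlated with x1.
toy_y = [1.1, 2.3, 2.9, 4.2, 4.8, 6.1]
toy_x = {"x1": [1, 2, 3, 4, 5, 6], "x2": [2, 1, 4, 3, 6, 5]}

contrib = hierarchical_partitioning(toy_y, toy_x)
full_r2 = r_squared(toy_y, list(toy_x.values()))
print(contrib, full_r2)
```

By construction the independent contributions sum to the R² of the full model, so each can be reported as a percentage of the total independent effect. The number of fitted models doubles with each added predictor, which remains manageable for the eight genetic variables considered here.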

As a second step, the relationships between demographic and genetic variables were examined using generalized linear models36 (GLM). All demographic-dependent variables were assumed to follow a Gaussian distribution (after transformation to normality), except the number of children (produced and married), for which a Poisson distribution was assumed. The best model was selected by starting from a full model with all explanatory variables (without interactions) and sequentially removing variables according to the Akaike information criterion. The two techniques (HP and GLM) were used because they are complementary: HP provides the percentage of independent effect of genetic variables on demographic variables (which facilitates causal interpretation when explanatory variables are correlated), whereas GLM provides a quantitative, directional estimate of the effect of each variable. For the sake of simplicity and comparison between HP and GLM, interactions among explanatory variables were neglected in the results presented in Table 2. However, further analyses were performed by including first-order interactions among explanatory variables in GLMs (see results).

All analyses were conducted using the R software package.37


Demographic and genetic measures, means and correlations

Women of the parent cohort had on average 9.39 (SE=0.29) children, and 5.17 (SE=0.23) children who got married in the Saguenay-Lac-Saint-Jean region. The mean age of mothers at first childbearing was 21.94 years (SE=0.34), the mean age at last childbearing was 37.80 years (SE=0.56), with a mean reproductive period duration of 15.87 years. The mean age of mothers at childbearing was 29.9 years (SE=0.28). The productivity of women was on average higher during the first part of their reproductive period, as indicated by ASYM, which had a mean value of −0.55 years (SE=0.06). This value was significantly different from zero (t-test, t=−9.25, d.f.=173, P<10⁻⁶).
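
The one-sample t-test applied to ASYM can be sketched as follows. The individual values below are hypothetical, since only summary statistics are reported here; the last lines simply recompute the ratio of the rounded mean to the rounded SE.

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t(values, mu0=0.0):
    """Return (t statistic, degrees of freedom, standard error of the mean)."""
    n = len(values)
    se = stdev(values) / sqrt(n)
    return (mean(values) - mu0) / se, n - 1, se


# Hypothetical ASYM values (years) for six couples.
asym_values = [-1.2, -0.4, 0.3, -0.9, -0.6, -0.1]
t_stat, df, se = one_sample_t(asym_values)
print(t_stat, df, se)

# Ratio of the rounded summary statistics reported in the text.
t_from_summary = -0.55 / 0.06
print(round(t_from_summary, 2))  # -9.17
```

The ratio of the rounded values (about −9.17) is close to the reported t=−9.25, and the gap is compatible with rounding of the mean and SE.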

Mean genealogical depth was significantly different when following patrilineal versus matrilineal genealogy (9.34 generations for patrilineages, 11.31 generations for matrilineages; Mann–Whitney U-test, P<10⁻⁶). This difference is mainly explained by a difference in generation length between males and females.38 Although the available data did not allow exact generation length (mean age of the parents at the birth of their children) to be computed, the average duration between the date of marriage of individuals and the date of marriage of their parents over the period covered by the genealogy was very different for the two sexes (34.1 years (SE=0.11) for males and 28.9 years (SE=0.10) for females). As genealogical depth is likely to be linked to demographic and genetic variables for social (eg, old families belonging to the ‘core’ of the population) and statistical (ie, computation of inbreeding measures affected by genealogical depth) reasons, it was included as a control variable in all subsequent analyses.

Mean values of inbreeding and kinship coefficients are given in Table 1. At the population scale, there was no significant difference between the coefficients of average kinship to the contemporary population of males and females (ie, there was no sex effect on the mean kinship to the population, Mann–Whitney test, U=17227, NS), and no difference among individual inbreeding coefficients of children, fathers and mothers (ie, there was no effect of sex or generation on inbreeding coefficients, Kruskal–Wallis test: H=166.13, NS for long-term inbreeding coefficient; χ²=120.47, NS for close inbreeding coefficient). However, at the individual scale, some of these coefficients were correlated. There were positive significant correlations between the kinship of parents (as deduced from the fo value of their children) and both (i) their long-term inbreeding coefficient (fm and ff) and (ii) their coefficient of average kinship to the contemporary population (rm and rf). This suggests the existence of non-random pairing with respect to inbreeding and kinship (Kendall's rank correlation coefficients were τ (fo, rm)=0.35, τ (fo, fm)=0.37 for males and τ (fo, rf)=0.26, τ (fo, ff)=0.25 for females; all correlations were significant at the P<10⁻⁶ level). Similarly, genealogical depth was, as expected, positively correlated with all long-term inbreeding and kinship measures.

Table 1 Average values and standard errors (in brackets) of genetic measures

Relationships between inbreeding levels and productivity measures

The effect of inbreeding and kinship on demographic variables related to overall productivity of parents (ie, number of children) and reproduction effort (intergenesic and protogenesic intervals) was first examined. Both HP and GLM models suggested that inbreeding and kinship variables had no independent effect on the number of children produced, nor on the number and proportion of children who married. However, GLM models including first-order interactions uncovered the existence of interacting effects among genetic variables on the productivity of parents.

First, a positive interaction between rf and rm on the number of children produced (P<10⁻⁶) was detected. This suggests that families in which both parents had a high coefficient of average kinship to the contemporary population tended to have more children. This interaction is illustrated by simple regressions in Figure 1, where the number of children produced increases with increasing values of the coefficient of average kinship of the mother only for couples with high coefficients of average kinship of the father. Second, weak but significant negative interactions between fo and fm on both the number (P=0.04) and proportion (P=0.03) of married children were uncovered, suggesting that a high degree of kinship between spouses tended to increase the probability for their children to get married, but these effects were lower when the father was highly inbred.

Figure 1

Relationship between the productivity (number of children produced) of mothers and their coefficient of average kinship to the contemporary population (rf), according to the coefficient of average kinship to the contemporary population of the fathers (rm) (low: gray open dots; high: full black dots). The definition of two groups according to rm is based on a quartile cut (high rm: fourth quartile; low rm: other quartiles). Lines present linear regressions (black line: high rm, P<10⁻³; gray line: low rm, NS).

The protogenesic interval is considered a reliable index of reproductive potential in human populations.39 HP and GLM both uncovered an independent positive effect of the long-term inbreeding coefficient of offspring (fo) on the protogenesic interval (HP, 2.1% of variance, P=0.029; GLM, P<0.05). GLM models including interactions indicated that this effect was significantly stronger for spouses with a high long-term inbreeding coefficient of the female (P<0.0005). Similar (although not significant) trends were observed when using the close inbreeding coefficient of the offspring (fo3G) instead of fo. Further analysis based on quartile cuts revealed that this effect was mainly due to very long protogenesic intervals among the most genetically related spouses (see Figure 2 for fo). In contrast, no effect of inbreeding on the intergenesic interval was found.

Figure 2

Distributions of protogenesic intervals for increasing long-term inbreeding coefficients of offspring (fo). The definition of four groups according to fo is based on a quartile cut. The variance of the fourth quartile is significantly different from the other quartiles (Ansari–Bradley test, AB=12 167, P<10⁻⁴).

Reproduction chronology and inbreeding

Finer effects of inbreeding on reproductive effort and failure were examined by focusing on temporal aspects of reproduction, using quantitative measures related to social or biological processes affecting fertility.

As the age of mothers at first childbearing was strongly correlated with their age at marriage (Kendall's rank correlation τ=0.86), this variable should be considered as mainly determined by social factors. Both HP and GLM models indicated that the age at first childbearing was related to the coefficient of average kinship of mothers to the contemporary population, rf (HP, 3.4% of independent variance explained, P<10⁻⁴; GLM, P=0.003): mothers with the highest values of rf had the lowest ages at first childbearing (the average value was 21.1 years for mothers with highest values of rf (last quartile), whereas it equaled 22.9 years for mothers of the first quartile).

Conversely, the age at last childbearing and ASYM (which quantifies the difference in productivity between the first and second halves of the reproductive period of each couple) should reflect some differences in reproductive effort among couples in a context of reproductive compensation. Indeed, our results indicated that these two variables were negatively correlated with each other (Kendall's rank correlation τ=−0.24, z=−4.34, P<10⁻⁴), suggesting that couples experiencing a marked reduction in fertility with age continued having children until a later age than the rest of the population.

Using genetic variables, the analysis showed that the age at last childbearing was significantly positively related to the long-term inbreeding coefficient of the offspring fo (HP, 2.5% of independent variance explained, P=0.014; GLM, P=0.01). Further analysis indicated that the effect of fo on age at last childbearing was no longer significant when the total number of children produced was included in the analysis as a control variable (the most related spouses tended to extend their reproductive period and to have more children, but there was no relationship between fo and age at last childbearing for a fixed number of children).

More importantly, all analyses showed that ASYM was strongly negatively related to the close inbreeding coefficient of the father fm3G (HP, 4.3% of independent variance explained, P<10⁻⁴; GLM, P=0.003, result illustrated in Figure 3). This effect was still significant when including the number of children produced in the model. Thus, for a given total number of children produced, couples in which the father was most closely inbred showed a strong asymmetry in the number of children produced between the first and second halves of the reproductive period (ie, more children were produced during the first half, as compared with the second half). In contrast, the close inbreeding coefficient of the mother was not related to ASYM in either the HP or the GLM analyses. A comparison between the outcomes of HP and GLM analyses is summarized in Table 2. More details on these results are provided in Supplementary Tables 1 and 2.

Figure 3

Relationship between the close inbreeding coefficient of fathers of the parent cohort (fm3G) and the asymmetry in the number of children produced between the first and second halves of the reproductive period (ASYM) (negative values of ASYM indicate that births became spaced further and further apart as the couple aged). The definition of four groups according to fm3G is based on a quartile cut. Error bars indicate standard error of the mean.

Table 2 Summary of hierarchical partitioning (HP) and generalized linear model (GLM) analyses (models without interaction)


Our results indicate that parents belonging to the core of the Saguenay-Lac-Saint-Jean population (ie, those with high values of rm and/or rf) tend to marry and reproduce earlier, and to have more children (Table 2, Figure 1), presumably because of social factors.2, 4 However, our analysis also uncovers biological effects of inbreeding on reproduction. It appears that these effects are mostly related to the timing of reproduction (see Table 2). First, results show a positive effect of the level of kinship among parents (ie, fo, the long-term inbreeding coefficient of offspring) on both the mean and variance of the protogenesic interval (Figure 2). This effect is nonlinear, suggesting a threshold effect of kinship on fertility.

Further, the analysis indicates that the close inbreeding coefficient of the father (fm3G) strongly affects reproduction rates along the reproductive period (more precisely, fm3G decreases productivity of parents during the second half of their reproductive period, as compared with the first half). The analysis shows that the mean age of mothers at reproduction can be influenced by different demographic processes, which are themselves correlated with different genetic metrics. First, the mean age of mothers at childbearing increases if women tend to reproduce later in life. Results indicate that the age at last childbearing is positively related to both the long-term coefficient of inbreeding of the female (ff) and the long-term kinship among parents (fo). Second, the mean age of mothers at childbearing decreases if the productivity is higher during the first half of the mother's reproductive period than during the second half. Results indicate that such loss of productivity along the reproductive life is related to close inbreeding of the father. Therefore, an index based on the skewness of the distribution of ages of mothers at childbearing (such as ASYM) appears highly sensitive to the effect of the inbreeding level of the father (Table 2, Figure 3).

Although small increases in pre-reproductive mortality among offspring of consanguineous parents have been shown in humans,40 little is known about the effects of inbreeding on reproduction.1, 3, 4, 5 Unlike in humans, the deleterious effects of inbreeding on both male and female fertility have been documented in a number of species (chiefly in fruit flies, but also in several mammalian species),41, 42, 43, 44 and recent experiments have reported greater inbreeding depression on male reproductive performance in more stressful environments.45

Assessing the relationships between environmental conditions and inbreeding depression is especially important in humans, in whom improved living conditions and healthcare may strongly affect the phenotypic manifestations of incompletely penetrant alleles.46 In one of the rare human studies to report a biological relationship between inbreeding and fertility, Ober et al5 uncovered significantly longer interbirth intervals in the most inbred women, presumably because of lower conception rates or higher peri-implantation loss rates. However, this study also reported strong evidence of reproductive compensation occurring among the more inbred and less fertile women. Reproductive compensation, which may occur if the culturally defined ‘optimal’ family size is less than the reproductive potential of the population, is one of the processes leading to reduced selection and purging.6 Its occurrence should therefore depend on both environmental conditions and on the strength of selection. This includes variations in physiological condition and in gene expression occurring along an individual's reproductive period. Although the occurrence of reproductive compensation cannot be inferred using only pedigree and demographic data (and would require records on, eg, infant death), the patterns observed suggest that the most inbred or related pairs (i) have a lower fertility (longer protogenesic interval, larger decrease of fertility with age), (ii) tend to reproduce until a later age than the rest of the population, but (iii) eventually have numbers of children similar to those of less inbred/related pairs. This pattern is compatible with the hypothesis of reproductive compensation.

In the presence of reproductive compensation, inbreeding depression may have important social and medical consequences although it does not act as an evolutionary force (because inbred and non-inbred individuals eventually have the same fitness). Inbreeding depression nevertheless has an evolutionary origin, and evolutionary theories may help to understand its action and improve its detection,47 mostly by providing theoretical expectations on the effect of inbreeding. Besides empirical observations according to which mutations can have age-specific effects on fitness components,12 the mathematical description of the mutation accumulation theory of senescence predicts that the deleterious effects of inbreeding will increase with age,17, 18, 19, 48 presumably because of genotype–environment interactions in which age-related changes in the physiology of the organism modulate the effects of some loci involved in inbreeding depression.49

Although the relevance of evolutionary theories of ageing has rarely been tested in humans (but see Drenos et al50), the prediction that inbreeding depression on fertility should increase with age18 has received recent empirical support in other species19 and is consistent with our findings indicating that (i) inter-birth intervals increase with parental age, as indicated by ASYM, and (ii) this increase is significantly stronger for the most inbred males.

The observation of an age-related decline in the fertility of inbred males (and not in females) may have several explanations. First, the age-related decline in the quality of gametogenesis (which occurs continuously during the lifetime of males) and its interaction with inbreeding could explain the observed pattern. On the one hand, evidence for the decline in men's fertility with increasing age has been provided,51 and this decline is partly due to an increase in spermatozoal structural aberrations with age.52 On the other hand, strong inbreeding depression on male (and not female) fertility has been reported in animal populations53, 54 and several studies have established a link between inbreeding, sperm quality and reduced fertility in mammal populations.43, 55

Second, reproduction failures are multi-factorial and may depend on the genotype of the zygote, as well as on the interaction between the male and female genotypes. Our results indicate that the individual inbreeding coefficients of the male (fm) and female (ff) of the same pair are correlated to each other, and that both inbreeding coefficients are correlated with the kinship of spouses (fo), which may greatly complicate the statistical and biological interpretation of the results. In particular, at the pair level the outcome of the correlation between fm and ff should greatly depend on how male and female phenotypes will interact,56 which remains to be explored in humans.

Whatever the reasons for the observed pattern, it is important to note that these observations require comparisons of reproductive performance at an intra-individual level (ie, reproductive potential at different times of the reproductive period for a given individual or a given couple). We therefore recommend using demographic measures based on intra-individual (rather than only inter-individual) comparisons to uncover biological effects (eg, productivities of parents at different ages). We suspect that the larger asymmetry of productivity between the first and second halves of the reproductive period of the most inbred parents is the consequence of an interaction between inbreeding depression and age (considered here as an environmental factor). As a result, measures based on the skewness of the temporal distribution of reproductive events may improve the detection of inbreeding depression.