Abstract
Many decades of theory have demonstrated that, in non-recombining systems, slightly deleterious mutations accumulate non-reversibly¹, potentially driving the extinction of many asexual species. Non-recombining chromosomes in sexual organisms are thought to have degenerated in a similar fashion²; however, it is not clear to what extent damaging mutations accumulate along chromosomes with highly variable rates of crossing over. Using high-coverage sequencing data from over 1,400 individuals in the 1000 Genomes and CARTaGENE projects, we show that recombination rate modulates the distribution of putatively deleterious variants across the entire human genome. Exons in regions of low recombination are significantly enriched for deleterious and disease-associated variants, a signature varying in strength across worldwide human populations with different demographic histories. Regions with low recombination rates are enriched for highly conserved genes with essential cellular functions and show an excess of mutations with demonstrated effects on health, a phenomenon likely affecting disease susceptibility in humans.
References
Felsenstein, J. The evolutionary advantage of recombination. Genetics 78, 737–756 (1974).
Charlesworth, B. & Charlesworth, D. The degeneration of Y chromosomes. Phil. Trans. R. Soc. Lond. B 355, 1563–1572 (2000).
Keinan, A. & Clark, A.G. Recent explosive human population growth has resulted in an excess of rare genetic variants. Science 336, 740–743 (2012).
Nelson, M.R. et al. An abundance of rare functional variants in 202 drug target genes sequenced in 14,002 people. Science 337, 100–104 (2012).
Muller, H.J. The relation of recombination to mutational advance. Mutat. Res. 106, 2–9 (1964).
Campos, J.L., Charlesworth, B. & Haddrill, P.R. Molecular evolution in nonrecombining regions of the Drosophila melanogaster genome. Genome Biol. Evol. 4, 278–288 (2012).
Campos, J.L., Halligan, D.L., Haddrill, P.R. & Charlesworth, B. The relation between recombination rate and patterns of molecular evolution and variation in Drosophila melanogaster. Mol. Biol. Evol. 31, 1010–1028 (2014).
Hellmann, I. et al. Why do human diversity levels vary at a megabase scale? Genome Res. 15, 1222–1231 (2005).
Lercher, M.J. & Hurst, L.D. Human SNP variability and mutation rate are higher in regions of high recombination. Trends Genet. 18, 337–340 (2002).
Hernandez, R.D. et al. Classic selective sweeps were rare in recent human evolution. Science 331, 920–924 (2011).
Lohmueller, K.E. et al. Natural selection affects multiple aspects of genetic variation at putatively neutral sites across the human genome. PLoS Genet. 7, e1002326 (2011).
Charlesworth, B. The effects of deleterious mutations on evolution at linked sites. Genetics 190, 5–22 (2012).
McGaugh, S.E. et al. Recombination modulates how selection affects linked sites in Drosophila. PLoS Biol. 10, e1001422 (2012).
Kaiser, V.B. & Charlesworth, B. The effects of deleterious mutations on evolution in non-recombining genomes. Trends Genet. 25, 9–12 (2009).
Hill, W.G. & Robertson, A. The effect of linkage on limits to artificial selection. Genet. Res. 8, 269–294 (1966).
Keightley, P.D. & Otto, S.P. Interference among deleterious mutations favours sex and recombination in finite populations. Nature 443, 89–92 (2006).
Awadalla, P. et al. Cohort profile of the CARTaGENE study: Quebec's population-based biobank for public health and personalized genomics. Int. J. Epidemiol. 42, 1285–1299 (2013).
Hodgkinson, A. et al. High-resolution genomic analysis of human mitochondrial RNA sequence variation. Science 344, 413–415 (2014).
Abecasis, G.R. et al. An integrated map of genetic variation from 1,092 human genomes. Nature 491, 56–65 (2012).
Landrum, M.J. et al. ClinVar: public archive of relationships among sequence variation and human phenotype. Nucleic Acids Res. 42, D980–D985 (2014).
Davydov, E.V. et al. Identifying a high fraction of the human genome to be under selective constraint using GERP++. PLoS Comput. Biol. 6, e1001025 (2010).
Comeron, J.M., Williford, A. & Kliman, R.M. The Hill-Robertson effect: evolutionary consequences of weak selection and linkage in finite populations. Heredity (Edinb.) 100, 19–31 (2008).
Gordo, I., Navarro, A. & Charlesworth, B. Muller's ratchet and the pattern of variation at a neutral locus. Genetics 161, 835–848 (2002).
Messer, P.W. SLiM: simulating evolution with selection and linkage. Genetics 194, 1037–1039 (2013).
Hernandez, R.D. A flexible forward simulator for populations subject to selection and demography. Bioinformatics 24, 2786–2787 (2008).
Hudson, R.R. & Kaplan, N.L. Deleterious background selection with recombination. Genetics 141, 1605–1617 (1995).
Charlesworth, B. & Charlesworth, D. Rapid fixation of deleterious alleles can be caused by Muller's ratchet. Genet. Res. 70, 63–73 (1997).
Bullaughey, K., Przeworski, M. & Coop, G. No effect of recombination on the efficacy of natural selection in primates. Genome Res. 18, 544–554 (2008).
Casals, F. et al. Whole-exome sequencing reveals a rapid change in the frequency of rare functional variants in a founding population of humans. PLoS Genet. 9, e1003815 (2013).
Moreau, C. et al. Deep human genealogies reveal a selective advantage to be on an expanding wave front. Science 334, 1148–1150 (2011).
Smith, A.V., Thomas, D.J., Munro, H.M. & Abecasis, G.R. Sequence features in regions of weak and strong linkage disequilibrium. Genome Res. 15, 1519–1534 (2005).
Khurana, E. et al. Integrative annotation of variants from 1092 humans: application to cancer genomics. Science 342, 1235587 (2013).
Simons, Y.B., Turchin, M.C., Pritchard, J.K. & Sella, G. The deleterious mutation load is insensitive to recent population history. Nat. Genet. 46, 220–224 (2014).
Boyko, A.R. et al. Assessing the evolutionary impact of amino acid mutations in the human genome. PLoS Genet. 4, e1000083 (2008).
HapMap Consortium. A second generation human haplotype map of over 3.1 million SNPs. Nature 449, 851–861 (2007).
Kong, A. et al. Fine-scale recombination rate differences between sexes, populations and individuals. Nature 467, 1099–1103 (2010).
Hinch, A.G. et al. The landscape of recombination in African Americans. Nature 476, 170–175 (2011).
Adzhubei, I.A. et al. A method and server for predicting damaging missense mutations. Nat. Methods 7, 248–249 (2010).
Kumar, P., Henikoff, S. & Ng, P.C. Predicting the effects of coding non-synonymous variants on protein function using the SIFT algorithm. Nat. Protoc. 4, 1073–1081 (2009).
Yang, Z. PAML: a program package for phylogenetic analysis by maximum likelihood. Comput. Appl. Biosci. 13, 555–556 (1997).
Morris, J.A. & Gardner, M.J. Calculating confidence intervals for relative risks (odds ratios) and standardised ratios and rates. Br. Med. J. (Clin. Res. Ed.) 296, 1313–1316 (1988).
Delaneau, O., Zagury, J.F. & Marchini, J. Improved whole-chromosome phasing for disease and population genetic studies. Nat. Methods 10, 5–6 (2013).
Hussin, J., Nadeau, P., Lefebvre, J.F. & Labuda, D. Haplotype allelic classes for detecting ongoing positive selection. BMC Bioinformatics 11, 65 (2010).
Eyre-Walker, A., Woolfit, M. & Phelps, T. The distribution of fitness effects of new deleterious amino acid mutations in humans. Genetics 173, 891–900 (2006).
Mi, H. et al. The PANTHER database of protein families, subfamilies, functions and pathways. Nucleic Acids Res. 33, D284–D288 (2005).
Wang, J., Duncan, D., Shi, Z. & Zhang, B. WEB-based GEne SeT AnaLysis Toolkit (WebGestalt): update 2013. Nucleic Acids Res. 41, W77–W83 (2013).
Acknowledgements
We thank G. Gibson, E. Hatton, G. McVean, J. Novembre, C. Spencer, E. Stone and the anonymous reviewers for insightful comments on the study, and we thank the CARTaGENE participants and team for data collection. We confirm that informed consent was obtained from all subjects. We acknowledge financial support from Fonds de la Recherche en Santé du Québec (FRSQ), Génome Québec, Fonds Québécois de la Recherche sur la Nature et les Technologies (FQRNT) and the Canadian Partnership Against Cancer. J.G.H. is a Human Frontiers Postdoctoral Fellow, A.H. is an FRSQ Research Fellow, and Y.I. is a Banting Postdoctoral Fellow.
Author information
Contributions
J.G.H. designed the study, performed quality control on genotyping and sequencing data, performed bioinformatics and statistical analyses and wrote the manuscript. A.H. performed quality control on genotyping and sequencing data and wrote the manuscript. E.G. and E.H.-K. processed samples for sequencing and genotyping. Y.I., J.-C.G. and J.-P.G. preprocessed the genomic data and performed quality control and bioinformatics analyses. P.A. provided samples, designed the study and wrote the manuscript.
Ethics declarations
Competing interests
The authors declare no competing financial interests.
Integrated supplementary information
Supplementary Figure 1 Comparison of the levels of diversity between coldspots (CS) and highly recombining regions (HRRs) for SNPs in the FCQ data set.
Odds ratios (ORs) are computed to compare SNP density between coldspots and HRRs for all SNPs (red) and SNPs divided in different allele frequency classes (black). OR < 1 means that diversity is greater in HRRs than in coldspots. We confirm the lack of diversity in coldspots relative to HRRs, in line with previous evidence that diversity is reduced in regions with low recombination rates owing to background selection. The effect is seen for all frequency classes and does not differ significantly between classes of SNPs with MAF > 0.05. The class of variants with MAF < 0.05 shows a smaller effect than the other frequency classes.
Supplementary Figure 2 Differential mutational burden between coldspots (CS) and highly recombining regions (HRRs) in a genomic subset of the data.
Differential burden is computed using odds ratios (ORs), representing the relative enrichment of a category of variants compared to all variants in coldspots versus HRRs for (a) RNA and exome sequencing of French Canadians (FC) and for exome sequencing of (b) Europeans (EUR), (c) Asians (ASN) and (d) Africans (AFR) from the 1000 Genomes Project. Variants are categorized as rare (MAF < 0.01 in a population), nonsynonymous (missense and nonsense) and damaging (as predicted by both SIFT and PolyPhen-2). Highly covered exons (HC exons) have coverage above 20× for each position within the exons in all data sets. The set of exons analyzed does not affect the results, and the exome data set in French Canadians (FC) replicates the results found in RNA sequencing.
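The odds ratios described in these captions can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes a 2×2 table that contrasts variants in a category with the remaining variants, in coldspots versus HRRs, and it uses the standard log-based (Woolf) approximation for the confidence interval, which the captions do not specify. The function name and the example counts are hypothetical.

```python
import math


def odds_ratio(cat_cs, other_cs, cat_hrr, other_hrr, z=1.96):
    """Odds ratio and approximate 95% CI from a 2x2 table of variant counts.

    cat_cs / other_cs:   variants in / not in the category, in coldspots
    cat_hrr / other_hrr: variants in / not in the category, in HRRs
    The CI uses the log-based (Woolf) approximation; this is an assumption,
    not a method stated in the captions.
    """
    if min(cat_cs, other_cs, cat_hrr, other_hrr) <= 0:
        raise ValueError("all four counts must be positive")
    # OR > 1 means the category is enriched in coldspots relative to HRRs.
    or_value = (cat_cs * other_hrr) / (other_cs * cat_hrr)
    # Standard error of log(OR) from the four cell counts.
    se_log = math.sqrt(1 / cat_cs + 1 / other_cs + 1 / cat_hrr + 1 / other_hrr)
    lower = math.exp(math.log(or_value) - z * se_log)
    upper = math.exp(math.log(or_value) + z * se_log)
    return or_value, lower, upper


# Hypothetical counts, for illustration only.
or_value, lower, upper = odds_ratio(120, 880, 100, 900)
print(f"OR = {or_value:.3f}, 95% CI = ({lower:.3f}, {upper:.3f})")
```

An interval that excludes 1 would indicate a significant difference between coldspots and HRRs at the chosen level.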
Supplementary Figure 3 Impact of minor allele frequency (MAF) on odds ratios between coldspots (CS) and highly recombining regions (HRRs).
Impact of MAF on the effects for functional mutations in the French-Canadian (FCQ) RNA sequencing data set (a,b) and for private and shared variants in (c) Europeans (EUR) and (d) Africans (AFR). (a) The enrichment of nonsynonymous and damaging mutations in coldspots remains significant for MAF < 0.05, indicating that the excess of rare variants in coldspots does not drive the effect for nonsynonymous and damaging variants. (b) Neutral variants with MAF < 0.05 are enriched in coldspots in comparison to more frequent variants, indicating that neutral diversity contributes to the excess of rare variants in coldspots. (c,d) The enrichment of private mutations in coldspots and of shared mutations in HRRs remains significant for MAF < 0.1 in both EUR and AFR, indicating that these effects are not driven only by differences in allele frequency between shared and private variants.
Supplementary Figure 4 Distribution of conservation across exons measured by GERP scores in coldspots (CS) and highly recombining regions (HRRs).
(a) Mean GERP score per exon. (b) Proportion of constrained positions (GERP > 3) per exon. (c) Scatter plot of mean GERP by the proportion of constrained positions for all exons. (d) For each measure of conservation per exon, exons were grouped into four categories of equal size. Only exons that were concordant between the two classifications were kept in analyses within conservation categories, to minimize the effect of outliers for one of the two measures. Characteristics of exons in these four conservation categories in terms of average GERP score per base pair and number of constrained sites per base pair (GERP > 3) are reported in Supplementary Table 7.
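The concordance filter in panel (d) can be sketched as below. This is an illustrative reading of the caption, assuming that "four categories of equal size" means rank-based quartiles computed separately for each measure; the function names and the toy values are hypothetical, and ties are broken by input order.

```python
def quartile_labels(values):
    """Assign each value to one of four equal-size rank groups (0 = lowest)."""
    n = len(values)
    # Indices sorted by value; Python's sort is stable, so ties keep input order.
    order = sorted(range(n), key=lambda i: values[i])
    labels = [0] * n
    for rank, idx in enumerate(order):
        labels[idx] = rank * 4 // n
    return labels


def concordant_categories(mean_gerp, prop_constrained):
    """Keep only exons placed in the same quartile by both conservation measures.

    Returns a dict mapping exon index -> shared conservation category (0..3).
    """
    by_mean = quartile_labels(mean_gerp)
    by_prop = quartile_labels(prop_constrained)
    return {i: by_mean[i] for i in range(len(by_mean)) if by_mean[i] == by_prop[i]}


# Toy per-exon values, for illustration only.
mean_gerp = [0.1, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5]
prop_constrained = [0.00, 0.05, 0.30, 0.10, 0.20, 0.40, 0.50, 0.60]
print(concordant_categories(mean_gerp, prop_constrained))
```

In this toy input, the exons at indices 2 and 4 are dropped because the two measures place them in different quartiles.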
Supplementary Figure 5 Differential mutational burden in conservation categories.
Differential mutational burden between coldspots (CS) and highly recombining regions (HRRs) for rare (MAF < 0.01), nonsynonymous (nonsyn), damaging and constrained variants in (a) French Canadians (FCQ) and (b) Europeans (EUR) for highly covered (HC) exons and in (c) Asians (ASN) and (d) Africans (AFR) for the whole exome. Results for EUR in the whole exome are presented in Figure 3a. Conservation categories are described in Supplementary Table 7. Results for ASN and AFR in HC exons (data not shown) are similar to EUR results. For all populations and exon data sets, the medium high and high conservation categories always show a significant enrichment for potentially deleterious mutations in coldspots.
Supplementary Figure 7 Additional simulations testing the effect of recombination rates and phasing.
(a) Distribution of effects for initial and modified coldspot (CS) and highly recombining region (HRR) rates in simulations, with CS/HRR recombination rates matching the rates in the CEU and YRI maps, respectively (Supplementary Note, section 4). The distributions are significantly different, but the shift in the mean is very weak and unlikely to cause the large differences observed between populations in Figure 5. (b,c) Effect of phasing on the distribution of the number of haplotypes with two and more rare mutations (MAF < 0.01) in real haplotypes and phased haplotypes on chunks of the same length (25 kb) in simulated coldspots and HRRs. (b) The number of haplotypes with two mutations is reduced by statistical phasing with SHAPEIT2, (c) but no significant difference between coldspots and HRRs was found in this phasing bias.
Supplementary Figure 8 Effects for private and shared variants between African subpopulations.
Comparison of closely related populations of African ancestry. Odds ratios comparing coldspots (CS) and highly recombining regions (HRRs) are computed on the basis of private and shared variants called in 88 Yoruba in Ibadan from Nigeria (YRI), 97 Luhya in Webuye from Kenya (LWK) and 61 Americans of African ancestry (ASW).
Supplementary Figure 9 Per-individual differential mutational burden across populations.
Comparison of proportions of (a) rare and (b) nonsynonymous mutations between coldspots (CS) and highly recombining regions (HRRs) in French Canadians (FCQ), Europeans (EUR), Asians (ASN) and Africans (AFR). For each individual (ordered by their OR values), the relative proportions of rare or nonsynonymous mutations in coldspots and HRRs are shown, computed by dividing coldspot and HRR proportions by genome-wide proportions of rare or nonsynonymous variants within each individual, to adjust for differences across individuals. The larger symbols represent individuals with the minimum and maximum OR values in each population. Ticks at the bottom of the plots show individual OR values significantly different from 1 (two-tailed P < 0.05). The French-Canadian data used are the RNA sequencing data set (Supplementary Note, section 2); replication with exome data of 96 French Canadians is presented in Supplementary Figure 11.
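The per-individual normalization described above can be sketched as follows. This is a minimal illustration under the assumption that each proportion is the count of variants in the category divided by the count of all variants an individual carries in that region; the function name and the example counts are hypothetical.

```python
def relative_proportions(cat_cs, all_cs, cat_hrr, all_hrr, cat_genome, all_genome):
    """Coldspot and HRR proportions of a variant category for one individual,
    each divided by that individual's genome-wide proportion of the category.

    Values above 1 indicate a regional excess relative to the individual's
    genome-wide level; values below 1 indicate a deficit.
    """
    genome_prop = cat_genome / all_genome
    rel_cs = (cat_cs / all_cs) / genome_prop
    rel_hrr = (cat_hrr / all_hrr) / genome_prop
    return rel_cs, rel_hrr


# Hypothetical counts for a single individual, for illustration only.
rel_cs, rel_hrr = relative_proportions(
    cat_cs=30, all_cs=200,         # rare variants / all variants in coldspots
    cat_hrr=20, all_hrr=200,       # rare variants / all variants in HRRs
    cat_genome=120, all_genome=1000,
)
print(f"relative proportion in CS: {rel_cs:.3f}, in HRRs: {rel_hrr:.3f}")
```

Dividing by the genome-wide proportion puts individuals with different overall numbers of rare or nonsynonymous variants on a common scale before they are compared.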
Supplementary Figure 10 Per-individual differential mutational burden across European populations for private variants.
Distribution of odds ratios (ORs) per individual comparing proportions of private variants between coldspots (CS) and highly recombining regions (HRRs) in closely related populations of western European ancestry. ORs are computed on the basis of private variants called in the exome sequencing data set of 96 French Canadians (FCX), 89 British individuals (GBR), 93 Finns (FIN), 98 Italians from Tuscany (TSI) and 85 European Americans (CEU). The left panel shows the frequencies of individual ORs in each population. The right panel shows, for each individual (ordered by their OR values), the relative proportions of private mutations in coldspots and HRRs, computed by dividing coldspot and HRR proportions by genome-wide proportions of private variants within each individual, to adjust for differences across individuals.
Supplementary Figure 11 Per-individual differential mutational burden across populations with FCQ exome sequencing data.
Distribution of odds ratios (ORs) per individual comparing proportions of rare (a,b) and nonsynonymous (c,d) mutations between coldspots (CS) and highly recombining regions (HRRs). For Europeans (EUR), Asians (ASN) and Africans (AFR), the results are the same as shown in Figure 4 and Supplementary Figure 9, whereas French-Canadian (FCQ) results are computed using the exome sequencing data set from 96 individuals. Further descriptions of the plots are found in Figure 4 and Supplementary Figure 9.
Supplementary Figure 12 Quality checks on per-individual differential mutational burden across populations.
Distribution of odds ratios (ORs) per individual in French Canadians (FCQ), Europeans (EUR), Asians (ASN) and Africans (AFR), comparing proportions of (a) nonsynonymous variants after modifying annotations in the 1000 Genomes Project populations (see the Supplementary Note, section 4.1) and (b) nonsynonymous and (c) rare variants, after excluding mutations that are fixed in one population but still segregating in others, between coldspots (CS) and highly recombining regions (HRRs). The differences between populations observed in Figure 5 remain the same after correcting for these potential technical differences.
Supplementary Figure 13 Population structure in regional populations of Quebec.
Sampling from the CARTaGENE Project includes individuals from the Montreal area (MTL), Quebec City (QCC) and the Saguenay region (SAG). The regional origin of individuals was confirmed by a principal-component analysis of genetic diversity in FCQ individuals compared with genetic diversity within the Reference Panel of Quebec (RPQ) and in the CEU population from HapMap 3. Other populations included in the RPQ are GAS (Gaspesia region), ACA (Acadians), LOY (Loyalists) and CNO (North Shore region).
Supplementary information
Supplementary Text and Figures
Supplementary Figures 1–13, Supplementary Tables 1–11 and Supplementary Note. (PDF 4097 kb)
Supplementary Data Set
Genetic maps, coldspot and HRR list, and SNV details. (TGZ 81619 kb)
About this article
Cite this article
Hussin, J., Hodgkinson, A., Idaghdour, Y. et al. Recombination affects accumulation of damaging and disease-associated mutations in human populations. Nat Genet 47, 400–404 (2015). https://doi.org/10.1038/ng.3216
DOI: https://doi.org/10.1038/ng.3216