A major goal in human genetics is to understand the role of common genetic variants in susceptibility to common diseases. This will require characterizing the nature of gene variation in human populations, assembling an extensive catalogue of single-nucleotide polymorphisms (SNPs) in candidate genes and performing association studies for particular diseases. At present, our knowledge of human gene variation remains rudimentary. Here we describe a systematic survey of SNPs in the coding regions of human genes. We identified SNPs in 106 genes relevant to cardiovascular disease, endocrinology and neuropsychiatry by screening an average of 114 independent alleles using 2 independent screening methods. To ensure high accuracy, all reported SNPs were confirmed by DNA sequencing. We identified 560 SNPs, including 392 coding-region SNPs (cSNPs) divided roughly equally between those causing synonymous and non-synonymous changes. We observed different rates of polymorphism among classes of sites within genes (non-coding, degenerate and non-degenerate) as well as between genes. The cSNPs most likely to influence disease, those that alter the amino acid sequence of the encoded protein, are found at a lower rate and with lower allele frequencies than silent substitutions. This likely reflects selection acting against deleterious alleles during human evolution. The lower allele frequency of missense cSNPs has implications for the compilation of a comprehensive catalogue, as well as for the subsequent application to disease association.