Association studies for finding cancer-susceptibility genetic variants

Pharoah, Paul D. P.; Dunning, Alison M.; Ponder, Bruce A. J.; Easton, Douglas F.

doi:10.1038/nrc1476

Review Article
Published: 01 November 2004

Nature Reviews Cancer volume 4, pages 850–860 (2004)

Key Points

The polygenic model for cancer susceptibility indicates that much of the inherited risk of cancer is due to multiple risk alleles, each with a low to moderate risk. The number of such alleles for any specific cancer is unknown, but might be in the hundreds or thousands.
Although linkage studies have been highly successful in mapping the genes that underlie monogenic disorders, these studies are of limited use for investigating predisposition to polygenic disease, such as cancer. Genetic-association studies — or case–control studies — provide an efficient design for identifying common genetic variants that confer modest disease risks.
Few convincing cancer-susceptibility alleles have been identified so far using the genetic-association study design. The limited success of these studies can be attributed mainly to the use of small study sizes — which provide insufficient statistical power and give a high rate of false positives — and limitations in the selection of candidate genes.
The rapid acquisition of data on the occurrence of common single-nucleotide polymorphisms (SNPs) has made it possible to test for the association of a candidate gene or region with disease using a tagging-SNP approach.
Several approaches can be used to increase the efficiency of candidate-gene association studies, such as improving the selection of candidate genes that are likely to be associated with cancer predisposition and enriching for genetic susceptibility by studying families with a history of cancer.
A combination of cheaper genotyping technologies with efficient study design will make empirical, whole-genome studies a feasible prospect in the near future.
Elucidating how multiple susceptibility alleles interact with each other and with lifestyle and environmental factors will be a key future challenge for the molecular and genetic epidemiology of cancer predisposition.
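
The point above about small studies lacking statistical power can be made concrete with a rough sample-size calculation. The sketch below is illustrative only and is not taken from the article: it assumes an allele-based comparison of cases and controls, equal numbers of each, Hardy–Weinberg proportions, and the usual normal approximation for comparing two proportions. The function names and default values are hypothetical, and the odds ratio must differ from 1.

```python
from math import ceil, sqrt
from statistics import NormalDist


def case_allele_freq(control_freq, odds_ratio):
    """Risk-allele frequency in cases implied by a per-allele odds ratio."""
    odds = odds_ratio * control_freq / (1.0 - control_freq)
    return odds / (1.0 + odds)


def cases_needed(control_freq, odds_ratio, alpha=0.05, power=0.8):
    """Approximate number of cases (with an equal number of controls).

    Assumes an allele-count test and a two-sided significance level alpha.
    """
    p0 = control_freq
    p1 = case_allele_freq(p0, odds_ratio)
    p_bar = (p0 + p1) / 2.0
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    z_beta = NormalDist().inv_cdf(power)
    numerator = (
        z_alpha * sqrt(2.0 * p_bar * (1.0 - p_bar))
        + z_beta * sqrt(p0 * (1.0 - p0) + p1 * (1.0 - p1))
    ) ** 2
    alleles_per_group = numerator / (p1 - p0) ** 2
    # Each person carries two alleles, so halve the allele count.
    return ceil(alleles_per_group / 2.0)


if __name__ == "__main__":
    for odds_ratio in (1.2, 1.5, 2.0):
        print(odds_ratio, cases_needed(0.1, odds_ratio, alpha=1e-4))
```

Under these assumptions the required number of cases rises steeply as the odds ratio approaches 1 and as the significance threshold is tightened, which is consistent with the argument that modest risks call for large studies.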

Abstract

Cancer is the result of complex interactions between inherited and environmental factors. Known genes account for a small proportion of the heritability of cancer, and it is likely that many genes with modest effects are yet to be found. Genetic-association studies have been widely used in the search for such genes, but success has been limited so far. Increased knowledge of the function of genes and the architecture of human genetic variation, combined with new genotyping technologies, heralds a new era of gene mapping by association.

**Figure 1: The number of alleles required to explain the excess familial risk of a typical common cancer according to alleles with different frequencies and conferring different risks.**
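
The kind of calculation summarized in the Figure 1 caption can be sketched as follows. This is not necessarily the authors' method, which is not given in this preview. The sketch assumes that risk alleles act multiplicatively both within and across loci, uses the standard sibling relative risk for a single locus under that model, and takes a familial relative risk of 2 as a placeholder value. All names are hypothetical.

```python
from math import log


def sibling_relative_risk(allele_freq, allele_risk):
    """Sibling relative risk from one locus under a multiplicative model.

    allele_risk is the relative risk per copy of the risk allele.
    """
    p, q, r = allele_freq, 1.0 - allele_freq, allele_risk
    w = p * q * (r - 1.0) ** 2 / (p * r + q) ** 2
    return (1.0 + w / 2.0) ** 2


def alleles_required(allele_freq, allele_risk, familial_risk=2.0):
    """Number of identical, independent loci whose combined sibling
    relative risk equals familial_risk (an assumed placeholder value)."""
    per_locus = sibling_relative_risk(allele_freq, allele_risk)
    if per_locus <= 1.0:
        return float("inf")  # an allele with no effect explains nothing
    # Relative risks multiply across loci, so their logarithms add.
    return log(familial_risk) / log(per_locus)


if __name__ == "__main__":
    for freq in (0.01, 0.1, 0.3):
        for risk in (1.2, 1.5, 2.0):
            print(freq, risk, round(alleles_required(freq, risk), 1))
```

With these assumptions, rare alleles or alleles with small risks each explain very little familial clustering, so many of them would be needed.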

**Figure 2: The stages in the design of an association study for cancer-susceptibility genes.**

Acknowledgements

We thank the referees and editors, whose comments on earlier drafts of this manuscript were very helpful.

Author information

Authors and Affiliations

Department of Oncology, Cancer Research UK Human Cancer Genetics Group, Strangeways Research Laboratory, Worts Causeway, Cambridge, CB1 8RN, UK
Paul D. P. Pharoah, Alison M. Dunning & Bruce A. J. Ponder
Department of Public Health and Primary Care, Genetic Epidemiology Group, Strangeways Research Laboratory, Worts Causeway, Cambridge, CB1 8RN, UK
Douglas F. Easton

Corresponding author

Correspondence to Bruce A. J. Ponder.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Glossary

PENETRANCE: The frequency with which individuals who carry a given mutation show the manifestations associated with that mutation. If the penetrance of a disease allele is 100%, then all individuals carrying that allele will express the associated phenotype.
LINKAGE STUDIES: A statistical method in which the genotypes and phenotypes of parents and offspring in families are studied to determine whether two or more loci are assorting independently or exhibiting linkage during meiosis.
PHENOCOPY: A non-hereditary alteration in phenotype, induced by environmental factors such as nutritional status, that mimics the phenotype produced by a specific gene.
POLYMORPHISM: A polymorphism is the existence of two or more variants (alleles, sequence variants, chromosomal structural variants) at significant frequencies in the population. It is conventional for a genetic variant with a frequency of >1% to be called a polymorphism.
HAPLOTYPE: The physical arrangement of multiple alleles along a chromosome or segment of a chromosome.
TANDEM-REPEAT POLYMORPHISM: A tandem repeat is two or more copies of the same DNA sequence arranged in a direct head to tail succession along a chromosome. The number of copies of the repeat might vary in the population.
RESTRICTION FRAGMENT LENGTH POLYMORPHISM: A polymorphic difference in DNA sequence between individuals that can be recognized by restriction endonucleases.
SINGLE-NUCLEOTIDE POLYMORPHISM: Any polymorphic variation at a single nucleotide (base) in the genome.
RELATIVE RISK: The relative risk of disease associated with a particular risk factor (also known as an exposure), such as a particular genotype, is the ratio of the incidence of disease in individuals with that risk factor to the incidence of disease in individuals without the risk factor.
HAPLOTYPE RESOLUTION: The estimation of haplotype frequencies in a population is complicated by the fact that haplotypes are not usually directly observable in diploid genotype data. Haplotypes can be resolved (inferred) from parental genotype data or estimated statistically.
LINKAGE PEAKS: In a whole-genome linkage analysis, the strength of linkage at any given marker is given by the log of odds (LOD) score. A high LOD score at one or several adjacent markers can be called a linkage peak.
CONGENIC STRAIN: A congenic strain is derived by mating mice carrying a locus of interest in each succeeding generation to mice of an inbred strain. A fully congenic strain and the inbred partner are expected to be identical at all loci except for the transferred locus and a linked segment of chromosome.
SYNTENY: The physical presence of two or more genetic loci on the same chromosome, whether or not they are close enough together to demonstrate linkage.
FOUNDER: When a population expands from a limited number of individuals, those individuals are known as founders. The founder effect is when a particular allele is frequent in a population derived from a small number of founders.
MULTIFACTOR-DIMENSIONALITY REDUCTION: A method that uses case–control data to pool multilocus genotypes into either a high-risk or a low-risk group, effectively reducing the number of genotype predictors to one. The new one-dimensional multilocus genotype variable can then be evaluated for its ability to classify and predict disease status.
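
The pooling step described in the multifactor-dimensionality reduction entry can be illustrated with a small sketch. It is a simplified illustration rather than a full implementation: it omits the cross-validation normally used to choose among locus combinations, it reports accuracy on the same data used for labelling, and the threshold of 1.0 assumes equal numbers of cases and controls. The function names and the toy data are hypothetical.

```python
from collections import Counter


def mdr_pool(genotypes, status, threshold=1.0):
    """Label each multilocus genotype as 'high' or 'low' risk.

    genotypes: one tuple per person, e.g. (0, 2) for two loci
    status: 1 for a case, 0 for a control
    """
    cases = Counter(g for g, s in zip(genotypes, status) if s == 1)
    controls = Counter(g for g, s in zip(genotypes, status) if s == 0)
    labels = {}
    for cell in set(cases) | set(controls):
        n_case, n_ctrl = cases[cell], controls[cell]
        # A cell with cases but no controls is treated as high risk.
        ratio = n_case / n_ctrl if n_ctrl else float("inf")
        labels[cell] = "high" if ratio >= threshold else "low"
    return labels


def mdr_accuracy(genotypes, status, labels):
    """Fraction of people whose pooled label matches their disease status."""
    correct = sum(
        (labels[g] == "high") == (s == 1) for g, s in zip(genotypes, status)
    )
    return correct / len(status)


if __name__ == "__main__":
    # Toy data in which risk depends on the combination of two loci.
    genotypes = [(0, 0), (0, 0), (0, 1), (0, 1), (1, 0), (1, 0), (1, 1), (1, 1)]
    status = [0, 0, 1, 1, 1, 1, 0, 0]
    labels = mdr_pool(genotypes, status)
    print(labels, mdr_accuracy(genotypes, status, labels))
```

In the toy data neither locus is informative on its own, but the pooled high-risk or low-risk label separates cases from controls, which is the kind of interaction the method is designed to detect.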

About this article

Cite this article

Pharoah, P., Dunning, A., Ponder, B. et al. Association studies for finding cancer-susceptibility genetic variants. Nat Rev Cancer 4, 850–860 (2004). https://doi.org/10.1038/nrc1476

Issue Date: 01 November 2004
DOI: https://doi.org/10.1038/nrc1476
