Two recent reviews, one in Nature Reviews Genetics from Bansal et al. (Statistical analysis strategies for association studies involving rare variants. Nature Rev. Genet. 11, 773–785 (2010))1 and one elsewhere2, examined the emerging area of rare variant association studies. These reviews nicely describe the progression from association studies for common SNPs towards those for rare variants. We would like to add to these discussions a strategy that has been used by several groups for rare variant case–control association studies. This strategy was developed independently of genome-wide association (GWA) studies and is largely confined to cancer genetics, and we refer to it here as case–control mutation screening (CCMS).
Ideas contributing to CCMS are as follows. First, linkage analysis shows that evidence from many, individually very rare sequence variants at the same locus can be combined3. Second, clinical testing of susceptibility genes such as breast cancer 1, early onset (BRCA1) and BRCA2 has shown that testing can be based on sequencing rather than genotyping. Third, the integrated evaluation of unclassified variants in BRCA1 and BRCA2 has shown that in silico assessment of rare variants — currently, rare missense substitutions (rMSs) — can be used to grade variants on the basis of predicted severity without attempting to dichotomize them as deleterious versus neutral4. Finally, lessons from GWA studies tell us that well-powered CCMS studies will be large, usually multi-centre and often multi-ethnic, and therefore must be analysed by statistical methods that allow for covariates.
The development of CCMS can be traced through the efforts of the genetics community to understand the contribution of heterozygous sequence variation in ataxia telangiectasia mutated (ATM) to risk of breast cancer (Table 1). Analysis of ATM CCMS data started with a case–control study that used a cohort allelic sums test limited to protein-truncating variants plus variants that clearly damage splice junctions (T+SJVs)5. Analyses then progressed to a two-pronged strategy: the pool of ATM T+SJVs is analysed in one logistic regression and the pool of rMSs in a second logistic regression6. The subtlety in this latter approach lies in combining all of the rMSs into a single categorical variable that incorporates prior information, such as sequence conservation, and grades the severity of rMSs from probably harmful to probably benign4,6. This variable is easily assessed in a logistic regression test for trend, thus minimizing the multiple testing problem while accommodating epidemiologic covariates. We believe that this form of CCMS, augmented by steadily improving statistical methods7,8, will be useful for identifying genes that harbour variants conferring intermediate risk, especially those in which most pathogenic variants are rare and either reduce or ablate function.
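The logistic regression test for trend described above can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration: the counts are invented toy numbers, the severity grade is coded 0 (no rMS) to 3 (probably harmful), and a single binary study-centre indicator stands in for the epidemiologic covariates. The small Newton-Raphson fitter is written out only to keep the example self-contained; it is not the software used in the studies cited here, and a standard statistics package would normally be used instead.

```python
import math

def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def fit_logistic(X, y, max_iter=50, tol=1e-10):
    """Fit a logistic regression by Newton-Raphson.

    Returns the coefficients and Wald standard errors taken from the
    inverse information matrix evaluated at the last Newton step.
    """
    p = len(X[0])
    beta = [0.0] * p
    for _ in range(max_iter):
        grad = [0.0] * p
        info = [[0.0] * p for _ in range(p)]
        for row, yi in zip(X, y):
            eta = sum(b * x for b, x in zip(beta, row))
            mu = 1.0 / (1.0 + math.exp(-eta))
            w = mu * (1.0 - mu)
            for i in range(p):
                grad[i] += (yi - mu) * row[i]
                for j in range(p):
                    info[i][j] += w * row[i] * row[j]
        cov = invert(info)
        step = [sum(cov[i][j] * grad[j] for j in range(p)) for i in range(p)]
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < tol:
            break
    se = [math.sqrt(cov[i][i]) for i in range(p)]
    return beta, se

# Hypothetical toy counts: (severity grade, centre, number of cases, number of controls).
# Grade 0 = no rMS carried; grades 1-3 = increasing predicted severity (assumed coding).
counts = [
    (0, 0, 400, 420), (1, 0, 10, 10), (2, 0, 8, 5), (3, 0, 9, 3),
    (0, 1, 300, 280), (1, 1, 8, 9), (2, 1, 7, 4), (3, 1, 8, 2),
]

X, y = [], []
for grade, centre, n_cases, n_controls in counts:
    # Design row: intercept, severity grade as a single ordinal term, centre covariate.
    X += [[1.0, float(grade), float(centre)]] * (n_cases + n_controls)
    y += [1] * n_cases + [0] * n_controls

beta, se = fit_logistic(X, y)
trend_z = beta[1] / se[1]                             # Wald statistic for the grade term
trend_p = math.erfc(abs(trend_z) / math.sqrt(2.0))    # two-sided normal p-value
print(f"per-grade odds ratio = {math.exp(beta[1]):.2f}, z = {trend_z:.2f}, p = {trend_p:.3g}")
```

Because all rMSs enter the model through one ordinal term, the pooled variants contribute a single one-degree-of-freedom test, which is how this design limits multiple testing while still adjusting for covariates.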
Going forward, improving the accuracy and scope of methods for predicting sequence variant severity is a key goal. To this end, the Critical Assessment of Genomic Interpretation community exercise will illuminate the capabilities of current approaches and inform their further development. An important additional issue is that methods for predicting gene dysfunction must be sufficiently transparent to allow other researchers to readily replicate predictions and judge the effects of hidden multiple testing (which may be introduced by the prediction of sequence variant severity) on CCMS data analysis.
1. Bansal, V., Libiger, O., Torkamani, A. & Schork, N. J. Statistical analysis strategies for association studies involving rare variants. Nature Rev. Genet. 11, 773–785 (2010).
2. Asimit, J. & Zeggini, E. Rare variant association analysis methods for complex traits. Annu. Rev. Genet. 44, 293–308 (2010).
3. Terwilliger, J. D. & Ott, J. Handbook of Human Genetic Linkage (Johns Hopkins Univ. Press, Baltimore and London, 1994).
4. Tavtigian, S. V., Byrnes, G. B., Goldgar, D. E. & Thomas, A. Classification of rare missense substitutions, using risk surfaces, with genetic- and molecular-epidemiology applications. Hum. Mutat. 29, 1342–1354 (2008).
5. FitzGerald, M. G. et al. Heterozygous ATM mutations do not contribute to early onset of breast cancer. Nature Genet. 15, 307–310 (1997).
6. Tavtigian, S. V. et al. Rare, evolutionarily unlikely missense substitutions in ATM confer increased risk of breast cancer. Am. J. Hum. Genet. 85, 427–446 (2009).
7. Le Calvez-Kelm, F. et al. Rare, evolutionarily unlikely missense substitutions in CHEK2 contribute to breast cancer susceptibility: results from a breast cancer family registry (CFR) case-control mutation screening study. Breast Cancer Res. 13, R6 (2011).
8. Price, A. L. et al. Pooled association tests for rare variants in exon-resequencing studies. Am. J. Hum. Genet. 86, 832–838 (2010).
9. Gatti, R. A., Tward, A. & Concannon, P. Cancer risk in ATM heterozygotes: a model of phenotypic and mechanistic differences between missense and truncating mutations. Mol. Genet. Metab. 68, 419–423 (1999).
10. Sommer, S. S. et al. ATM missense mutations are frequent in patients with breast cancer. Cancer Genet. Cytogenet. 145, 115–120 (2003).
11. Renwick, A. et al. ATM mutations that cause ataxia-telangiectasia are breast cancer susceptibility alleles. Nature Genet. 38, 873–875 (2006).
12. Bernstein, J. L. et al. Radiation exposure, the ATM gene, and contralateral breast cancer in the women's environmental cancer and radiation epidemiology study. J. Natl Cancer Inst. 102, 475–483 (2010).
This work was supported by US National Institutes of Health grant R01 CA121245.
The authors declare no competing financial interests.
Tavtigian, S., Hashibe, M. & Thomas, A. Tests of association for rare variants: case control mutation screening. Nat Rev Genet 12, 224 (2011). https://doi.org/10.1038/nrg2867-c1