
Identification of cis-suppression of human disease mutations by comparative genomics



Patterns of amino acid conservation have served as a tool for understanding protein evolution [1]. The same principles have also found broad application in human genomics, driven by the need to interpret the pathogenic potential of variants in patients [2]. Here we performed a systematic comparative genomics analysis of human disease-causing missense variants. We found that an appreciable fraction of disease-causing alleles are fixed in the genomes of other species, suggesting a role for genomic context. We developed a model of genetic interactions that predicts most of these to be simple pairwise compensations. Functional testing of this model on two known human disease genes [3,4] revealed discrete cis amino acid residues that, although benign on their own, could rescue the human mutations in vivo. This approach was also applied to ab initio gene discovery to support the identification of a de novo disease driver in BTG2 that is subject to protective cis-modification in more than 50 species. Finally, on the basis of our data and models, we developed a computational tool to predict candidate residues subject to compensation. Taken together, our data highlight the importance of cis-genomic context as a contributor to protein evolution; they provide an insight into the complexity of allele effect on phenotype; and they are likely to assist methods for predicting allele pathogenicity [5,6].
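The core observation in the abstract, that a residue annotated as disease-causing in humans can appear as the fixed residue in an orthologue, can be illustrated with a minimal sketch. This is not the authors' Predictor Code; the function names, the toy alignment and the species labels below are all hypothetical, and the sketch assumes a gapped multiple alignment stored as plain strings keyed by species.

```python
from typing import Dict, List


def human_column(human_aln: str, protein_pos: int) -> int:
    """Map a 1-based human protein position to a 0-based alignment column."""
    seen = 0
    for col, residue in enumerate(human_aln):
        if residue != "-":  # gaps do not count towards protein coordinates
            seen += 1
            if seen == protein_pos:
                return col
    raise ValueError(f"position {protein_pos} is beyond the human sequence")


def species_with_disease_residue(
    alignment: Dict[str, str],
    human_key: str,
    protein_pos: int,
    ref: str,
    alt: str,
) -> List[str]:
    """List species whose aligned residue equals the human disease allele."""
    human_aln = alignment[human_key]
    col = human_column(human_aln, protein_pos)
    if human_aln[col] != ref:
        raise ValueError(
            f"human residue at {protein_pos} is {human_aln[col]}, not {ref}"
        )
    return sorted(
        name
        for name, seq in alignment.items()
        if name != human_key and seq[col] == alt
    )


# Hypothetical toy alignment (not real sequence data).
toy_alignment = {
    "human":     "MKV-LTA",
    "mouse":     "MKV-LTA",
    "zebrafish": "MKMQLSA",
    "frog":      "MKM-LTA",
}

# Hypothetical variant p.V3M: which species carry M at the aligned column?
hits = species_with_disease_residue(toy_alignment, "human", 3, "V", "M")
print(hits)
```

A species returned by this check is only a candidate. Under the pairwise compensation model described above, the next step would be to look for a second site in the same protein that differs between human and that species and could plausibly offset the change; the sketch does not attempt that step.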


Figure 1: Distribution of variants found in sequence alignments.
Figure 2: Relationship between variants and evolutionary distance.
Figure 3: Compensatory mutations rescue pathogenic alleles in BBS4 and RPGRIP1L.
Figure 4: A de novo BTG2 p.V141M-encoding allele causes microcephaly.


  1. Alföldi, J. & Lindblad-Toh, K. Comparative genomics as a tool to understand evolution and disease. Genome Res. 23, 1063–1068 (2013)

  2. Cooper, G. M. & Shendure, J. Needles in stacks of needles: finding disease-causal variants in a wealth of genomic data. Nature Rev. Genet. 12, 628–640 (2011)

  3. Katsanis, N. et al. BBS4 is a minor contributor to Bardet-Biedl syndrome and may also participate in triallelic inheritance. Am. J. Hum. Genet. 71, 22–29 (2002)

  4. Khanna, H. et al. A common allele in RPGRIP1L is a modifier of retinal degeneration in ciliopathies. Nature Genet. 41, 739–745 (2009)

  5. Adzhubei, I. A. et al. A method and server for predicting damaging missense mutations. Nature Methods 7, 248–249 (2010)

  6. Sim, N. L. et al. SIFT web server: predicting effects of amino acid substitutions on proteins. Nucleic Acids Res. 40, W452–W457 (2012)

  7. Breen, M. S., Kemena, C., Vlasov, P. K., Notredame, C. & Kondrashov, F. A. Epistasis as the primary factor in molecular evolution. Nature 490, 535–538 (2012)

  8. Weinreich, D. M., Delaney, N. F., DePristo, M. A. & Hartl, D. L. Darwinian evolution can follow only very few mutational paths to fitter proteins. Science 312, 111–114 (2006)

  9. McCandlish, D. M., Rajon, E., Shah, P., Ding, Y. & Plotkin, J. B. The role of epistasis in protein evolution. Nature 497, E1–2 (2013)

  10. Corbett-Detig, R. B., Zhou, J., Clark, A. G., Hartl, D. L. & Ayroles, J. F. Genetic incompatibilities are widespread within species. Nature 504, 135–137 (2013)

  11. Gao, L. & Zhang, J. Why are some human disease-associated mutations fixed in mice? Trends Genet. 19, 678–681 (2003)

  12. Weinreich, D. M., Lan, Y., Wylie, C. S. & Heckendorn, R. B. Should evolutionary geneticists worry about higher-order epistasis? Curr. Opin. Genet. Dev. 23, 700–707 (2013)

  13. Chou, H. H., Chiu, H. C., Delaney, N. F., Segre, D. & Marx, C. J. Diminishing returns epistasis among beneficial mutations decelerates adaptation. Science 332, 1190–1192 (2011)

  14. Kondrashov, A. S., Sunyaev, S. & Kondrashov, F. A. Dobzhansky-Muller incompatibilities in protein evolution. Proc. Natl Acad. Sci. USA 99, 14878–14883 (2002)

  15. Kulathinal, R. J., Bettencourt, B. R. & Hartl, D. L. Compensated deleterious mutations in insect genomes. Science 306, 1553–1554 (2004)

  16. Soylemez, O. & Kondrashov, F. A. Estimating the rate of irreversibility in protein evolution. Genome Biol. Evol. 4, 1213–1222 (2012)

  17. Ferrer-Costa, C., Orozco, M. & de la Cruz, X. Characterization of compensated mutations in terms of structural and physico-chemical properties. J. Mol. Biol. 365, 249–256 (2007)

  18. Waterston, R. H. et al. Initial sequencing and comparative analysis of the mouse genome. Nature 420, 520–562 (2002)

  19. Mottaz, A., David, F. P., Veuthey, A. L. & Yip, Y. L. Easy retrieval of single amino-acid polymorphisms and phenotype information using SwissVar. Bioinformatics 26, 851–852 (2010)

  20. Landrum, M. J. et al. ClinVar: public archive of relationships among sequence variation and human phenotype. Nucleic Acids Res. 42, D980–D985 (2014)

  21. Ahola, V., Aittokallio, T., Vihinen, M. & Uusipaikka, E. Model-based prediction of sequence alignment quality. Bioinformatics 24, 2165–2171 (2008)

  22. Giudicessi, J. R. & Ackerman, M. J. Determinants of incomplete penetrance and variable expressivity in heritable cardiac arrhythmia syndromes. Transl. Res. 161, 1–14 (2013)

  23. Kimura, M. The Neutral Theory of Molecular Evolution (Cambridge Univ. Press, 1984)

  24. Povolotskaya, I. S. & Kondrashov, F. A. Sequence space and the ongoing expansion of the protein universe. Nature 465, 922–926 (2010)

  25. Zaghloul, N. A. et al. Functional analyses of variants reveal a significant role for dominant negative and common alleles in oligogenic Bardet-Biedl syndrome. Proc. Natl Acad. Sci. USA 107, 10602–10607 (2010)

  26. Katsanis, N., Cotten, M. & Angrist, M. Exome and genome sequencing of neonates with neurodevelopmental disorders. Future Neurology 7, 655–658 (2012)

  27. Herman, D. S. et al. Truncations of titin causing dilated cardiomyopathy. N. Engl. J. Med. 366, 619–628 (2012)

  28. Montagnoli, A., Guardavaccaro, D., Starace, G. & Tirone, F. Overexpression of the nerve growth factor-inducible PC3 immediate early gene is associated with growth inhibition. Cell Growth Differ. 7, 1327–1336 (1996)

  29. Beunders, G. et al. Exonic deletions in AUTS2 cause a syndromic form of intellectual disability and suggest a critical role for the C terminus. Am. J. Hum. Genet. 92, 210–220 (2013)

  30. Fraïsse, C., Elderfield, J. A. & Welch, J. J. The genetics of speciation: are complex incompatibilities easier to evolve? J. Evol. Biol. 27, 688–699 (2014)

  31. Tennessen, J. A. et al. Evolution and functional impact of rare coding variation from deep sequencing of human exomes. Science 337, 64–69 (2012)

  32. Karolchik, D. et al. The UCSC Genome Browser database: 2014 update. Nucleic Acids Res. 42, D764–D770 (2014)

  33. Flicek, P. et al. Ensembl 2014. Nucleic Acids Res. 42, D749–D755 (2014)

  34. Bainbridge, M. N. et al. Targeted enrichment beyond the consensus coding DNA sequence exome reveals exons with higher variant densities. Genome Biol. 12, R68 (2011)

  35. Challis, D. et al. An integrative variant analysis suite for whole exome next-generation sequencing data. BMC Bioinformatics 13, 8 (2012)

  36. Niederriter, A. R. et al. In vivo modeling of the morbid human genome using Danio rerio. J. Vis. Exp. 78, e50338 (2012)



We thank Y. Liu and D. Balick for helpful discussions, M. Kousi for assistance with the NCL mutational list, and M. Talkowski, A. Kondrashov and G. Lyon for critical review of the manuscript. This work was supported by grants R01HD04260, R01DK072301 and R01DK075972 (N.K.); R01 GM078598, R01 MH101244, R01 DK095721 and U01 HG006500 (S.R.S.); R01EY021872 (E.E.D.); and a NARSAD Young Investigator Award (C.G.). N.K. is a Distinguished Brumley Professor.

Author information




D.M.J., S.G.F., S.R.S. and N.K. designed the overall study. D.M.J., C.A.C. and S.R.S. conceptualized the principle of CPDs and performed all computational analyses. S.G.F., E.E.D. and N.K. conceptualized the biological properties of CPDs and implemented in vivo testing with the assistance of C.G. J.K. referred the index patient and evaluated clinical data in the context of molecular discoveries. The Task Force for Neonatal Genomics constructed the platforms and methods for recruitment, ascertainment and evaluation of clinical and molecular data and return of results.

Corresponding authors

Correspondence to Shamil R. Sunyaev or Nicholas Katsanis.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Additional information

Lists of participants and their affiliations appear in the Supplementary Information.

Extended data figures and tables

Extended Data Figure 1 Different alignment methodologies with HumVar and ClinVar produce qualitatively similar alignments.

a, b, Distributions of missense variants annotated as neutral (a) or pathogenic (b) in the HumVar and ClinVar data sets, with each of the five alignment strategies described in the text (MultiZ unfiltered, MultiZ mammals-only, EPO, MultiZ with alignment quality filter, MultiZ with >1 sequence filter). All distributions are quantitatively similar. Compare with Fig. 2c, d.

Extended Data Figure 2 Protein domain structure of functionally tested human disease genes.

a, Schematic of BBS4 (519 amino acids) is depicted with eight tetratricopeptide (TPR) domains (yellow); b, RPGRIP1L (1,315 amino acids) has multiple coiled-coil domains (green rectangles) and two protein kinase C conserved region 2 (C2) domains (green hexagons); and c, BTG2 (158 amino acids) has one BTG1 domain (purple pentagon). Disease-causing alleles are shown with red stars; complementing alleles are represented with blue stars; amino acid number scale in increments of 100 is shown below each schematic.

Extended Data Figure 3 Evaluation of btg2 and nos2a/b MOs.

a–c, Schematic of the D. rerio btg2, nos2a and nos2b loci. Blue boxes, exons; dashed lines, introns; white boxes, untranslated regions; red boxes, MOs; ATG indicates the translational start site; arrows, polymerase chain reaction with reverse transcription (RT–PCR) primers; number indicates the targeted exon. d, e, Agarose gel images of nos2a/b RT–PCR products.

Extended Data Figure 4 HuC/HuD staining and quantification of 2 dpf zebrafish embryos confirms pathogenicity of BTG2 V141M.

a, Suppression of btg2 leads to a decrease of HuC/HuD levels at 2 dpf. Representative ventral images of control, btg2 morphants (images show unilateral or absent HuC/HuD expression), and a rescued embryo injected with a btg2 MO plus human BTG2 wild-type (WT) mRNA. Scale bar, 250 μm. b, Percentage of embryos with normal, bilateral HuC/HuD protein levels in the anterior forebrain or decreased/unilateral HuC/HuD protein levels in embryos injected with btg2 MOs alone or MOs plus human BTG2 wild-type or variant mRNAs (p.V141M, index case; p.A126S and p.R145Q, control alleles). *P < 0.05 (two-tailed t-test comparisons between MO-injected and rescued embryos; n = 38–78 per injection batch).

Supplementary information

Supplementary Information

This file contains Supplementary Text and Supplementary Tables 1-11. (PDF 992 kb)

Supplementary Data

This file contains the Predictor Code, the source code for the publicly accessible online prediction algorithm. (TXT 6 kb)


Cite this article

Jordan, D., Frangakis, S., Golzio, C. et al. Identification of cis-suppression of human disease mutations by comparative genomics. Nature 524, 225–229 (2015).
