Analysis framework and experimental design for evaluating synergy-driving gene expression


The mechanisms by which genetic risk variants interact with each other, as well as environmental factors, to contribute to complex genetic disorders remain unclear. We describe in detail our recently published approach to resolve distinct additive and synergistic transcriptomic effects after combinatorial manipulation of genetic variants and/or chemical perturbagens. Although first developed for CRISPR-based perturbation studies of isogenic human induced pluripotent stem cell-derived neurons, our methodology can be broadly applied to any RNA sequencing dataset, provided that raw read counts are available. Whereas other differential expression analyses reveal the effect of individual perturbations, here we specifically query interactions between two or more perturbagens, resolving the extent of non-additive (synergistic) interactions between perturbations. We discuss the careful experimental design required to resolve synergistic effects, considerations of statistical power and how to quantify observed synergy between experiments. Additionally, we speculate on potential future applications and explore the obvious limitations of this approach. Overall, by interrogating the effect of independent factors, alone and in combination, our analytic framework and experimental design facilitate the discovery of convergence and synergy downstream of gene and/or treatment perturbations hypothesized to contribute to complex diseases. We think that this protocol can be successfully applied by any scientist with bioinformatic skills and basic proficiency in the R programming language. Our computational pipeline is straightforward, does not require supercomputing support and can be conducted in a single day upon completion of RNA sequencing experiments.
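The distinction between additive and synergistic effects can be illustrated with a minimal numerical sketch. It assumes a two-factor design (control, perturbation A alone, perturbation B alone, A and B combined) and per-condition mean log2 expression values; the gene names, the numbers and the function `synergy_contrasts` are hypothetical and chosen only for illustration. The protocol itself is implemented in R, so this Python snippet is not the authors' pipeline. It only shows the arithmetic: the additive expectation is the sum of the two individual effects, and synergy is whatever the combined perturbation adds beyond that expectation.

```python
# Hypothetical mean log2 expression per condition for three toy genes.
# Conditions: control, A alone, B alone, A and B combined.
log2_expr = {
    "gene1": {"ctrl": 5.0, "A": 5.5, "B": 5.25, "AB": 5.75},  # purely additive
    "gene2": {"ctrl": 5.0, "A": 5.5, "B": 5.25, "AB": 7.0},   # more than additive
    "gene3": {"ctrl": 5.0, "A": 4.5, "B": 4.75, "AB": 4.5},   # less than additive
}


def synergy_contrasts(means):
    """Return additive, combinatorial and synergistic log2 fold changes."""
    effect_a = means["A"] - means["ctrl"]        # effect of A alone
    effect_b = means["B"] - means["ctrl"]        # effect of B alone
    additive = effect_a + effect_b               # expected change if effects simply add
    combinatorial = means["AB"] - means["ctrl"]  # observed change under A and B together
    synergy = combinatorial - additive           # equals AB - A - B + ctrl
    return {
        "additive": additive,
        "combinatorial": combinatorial,
        "synergy": synergy,
    }


results = {gene: synergy_contrasts(m) for gene, m in log2_expr.items()}

for gene, r in results.items():
    print(gene, r)
```

A synergy value of zero means the combined perturbation behaves as the additive model predicts; a non-zero value marks a candidate non-additive interaction, which in a real analysis would still need a per-gene statistical test across replicates.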


Fig. 1: General overview of the analysis pipeline, experimental design and differential expression contrast design.
Fig. 2: Synergistic effects.
Fig. 3: Differential expression analysis output.
Fig. 4: Synergistic effect analysis output.
Fig. 5: Gene set enrichment analysis output.
Fig. 6: Over-representation analysis output.

Data availability

RNA sequencing data from our study of schizophrenia risk genes (ref. 20), including their individual and combined perturbation, are available at Synapse (syn20502314) and as Supplementary Data 1. Downloading these data requires that you are a registered Synapse user and have agreed to the Synapse terms of use. Figures 3, 4, 5 and 6 were created based on these data. RNA sequencing data from the NRAS-mutant melanoma study (ref. 44) are available as Supplementary Data 2 and were reanalyzed here to generate Extended Data Figs. 1–4.

Code availability

Code is available at, under the MIT License.


1. Pardinas, A. F. et al. Common schizophrenia alleles are enriched in mutation-intolerant genes and in regions under strong background selection. Nat. Genet. 50, 381–389 (2018).
2. Nalls, M. A. et al. Expanding Parkinson’s disease genetics: novel risk loci, genomic context, causal insights and heritable risk. Preprint at (2019).
3. Nelson, C. P. et al. Association analyses based on false discovery rate implicate new loci for coronary artery disease. Nat. Genet. 49, 1385–1391 (2017).
4. Xue, A. et al. Genome-wide association analyses identify 143 risk variants and putative regulatory mechanisms for type 2 diabetes. Nat. Commun. 9, 2941 (2018).
5. Satterstrom, F. K. et al. Autism spectrum disorder and attention deficit hyperactivity disorder have a similar burden of rare protein-truncating variants. Nat. Neurosci. 22, 1961–1965 (2019).
6. Cavalli, G. & Heard, E. Advances in epigenetics link genetics to the environment and disease. Nature 571, 489–499 (2019).
7. Chaste, P. & Leboyer, M. Autism risk factors: genes, environment, and gene-environment interactions. Dialogues Clin. Neurosci. 14, 281–292 (2012).
8. Visscher, P. M. et al. 10 years of GWAS discovery: biology, function, and translation. Am. J. Hum. Genet. 101, 5–22 (2017).
9. Ye, C. J. et al. Intersection of population variation and autoimmunity genetics in human T cell activation. Science 345, 1254665 (2014).
10. Moyerbrailean, G. A. et al. High-throughput allele-specific expression across 250 environmental conditions. Genome Res. 26, 1627–1638 (2016).
11. Phillips, P. C. Epistasis—the essential role of gene interactions in the structure and evolution of genetic systems. Nat. Rev. Genet. 9, 855–867 (2008).
12. Wray, N. R., Wijmenga, C., Sullivan, P. F., Yang, J. & Visscher, P. M. Common disease is more complex than implied by the core gene omnigenic model. Cell 173, 1573–1580 (2018).
13. Boyle, E. A., Li, Y. I. & Pritchard, J. K. An expanded view of complex traits: from polygenic to omnigenic. Cell 169, 1177–1186 (2017).
14. Baeza-Centurion, P., Minana, B., Schmiedel, J. M., Valcarcel, J. & Lehner, B. Combinatorial genetics reveals a scaling law for the effects of mutations on splicing. Cell 176, 549–563 (2019).
15. Kuzmin, E. et al. Systematic analysis of complex genetic interactions. Science 360, eaao1729 (2018).
16. VanderSluis, B. et al. Integrating genetic and protein–protein interaction networks maps a functional wiring diagram of a cell. Curr. Opin. Microbiol. 45, 170–179 (2018).
17. Shalem, O., Sanjana, N. E. & Zhang, F. High-throughput functional genomics using CRISPR–Cas9. Nat. Rev. Genet. 16, 299–311 (2015).
18. Rehbach, K., Fernando, M. B. & Brennand, K. J. Integrating CRISPR engineering and hiPSC-derived 2D disease modeling systems. J. Neurosci. 40, 1176–1185 (2020).
19. Hoffman, G. E., Schrode, N., Flaherty, E. & Brennand, K. J. New considerations for hiPSC-based models of neuropsychiatric disorders. Mol. Psychiatry 24, 49–66 (2019).
20. Schrode, N. et al. Synergistic effects of common schizophrenia risk variants. Nat. Genet. 51, 1475–1485 (2019).
21. Wang, M. et al. Transformative network modeling of multi-omics data reveals detailed circuits, key regulators, and potential therapeutics for Alzheimer’s disease. Neuron (2020).
22. Elam, K. K., Clifford, S., Shaw, D. S., Wilson, M. N. & Lemery-Chalfant, K. Gene set enrichment analysis to create polygenic scores: a developmental examination of aggression. Transl. Psychiatry 9, 212 (2019).
23. Choi, S. W. & O'Reilly, P. F. PRSice-2: polygenic risk score software for biobank-scale data. Gigascience 8, giz082 (2019).
24. Mimitou, E. P. et al. Multiplexed detection of proteins, transcriptomes, clonotypes and CRISPR perturbations in single cells. Nat. Methods 16, 409–412 (2019).
25. Dixit, A. et al. Perturb-Seq: dissecting molecular circuits with scalable single-cell RNA profiling of pooled genetic screens. Cell 167, 1853–1866 (2016).
26. Datlinger, P. et al. Pooled CRISPR screening with single-cell transcriptome readout. Nat. Methods 14, 297–301 (2017).
27. Readhead, B. et al. Expression-based drug screening of neural progenitor cells from individuals with schizophrenia. Nat. Commun. 9, 4412 (2018).
28. Duan, Q. et al. LINCS Canvas Browser: interactive web app to query, browse and interrogate LINCS L1000 gene expression signatures. Nucleic Acids Res. 42, W449–W460 (2014).
29. Charbogne, P., Kieffer, B. L. & Befort, K. 15 years of genetic approaches in vivo for addiction research: opioid receptor and peptide gene knockout in mouse models of drug abuse. Neuropharmacology 76, 204–217 (2014).
30. Vasilatos, S. N. et al. Crosstalk between lysine-specific demethylase 1 (LSD1) and histone deacetylases mediates antineoplastic efficacy of HDAC inhibitors in human breast cancer cells. Carcinogenesis 34, 1196–1207 (2013).
31. Shahbazi, J. et al. The bromodomain inhibitor JQ1 and the histone deacetylase inhibitor panobinostat synergistically reduce N-Myc expression and induce anticancer effects. Clin. Cancer Res. 22, 2534–2544 (2016).
32. Walasek, M. A. et al. The combination of valproic acid and lithium delays hematopoietic stem/progenitor cell differentiation. Blood 119, 3050–3059 (2012).
33. Slowikowski, K. et al. CUX1 and IκBζ (NFKBIZ) mediate the synergistic inflammatory response to TNF and IL-17A in stromal fibroblasts. Proc. Natl Acad. Sci. USA 117, 5532–5541 (2020).
34. Kuchenov, D. et al. A combinatorial extracellular code tunes the intracellular signaling network activity to distinct cellular responses. Preprint at (2018).
35. Fursova, N. A. et al. Synergy between variant PRC1 complexes defines polycomb-mediated gene repression. Mol. Cell 74, 1020–1036 (2019).
36. Glover, K. P., Chen, Z., Markell, L. K. & Han, X. Synergistic gene expression signature observed in TK6 cells upon co-exposure to UVC-irradiation and protein kinase C-activating tumor promoters. PLoS ONE 10, e0139850 (2015).
37. Licciardello, M. P. et al. A combinatorial screen of the CLOUD uncovers a synergy targeting the androgen receptor. Nat. Chem. Biol. 13, 771–778 (2017).
38. Sriraman, A. et al. Cooperation of Nutlin-3a and a Wip1 inhibitor to induce p53 activity. Oncotarget 7, 31623–31638 (2016).
39. Gupta, S. et al. IL-6 augments IL-4-induced polarization of primary human macrophages through synergy of STAT3, STAT6 and BATF transcription factors. Oncoimmunology 7, e1494110 (2018).
40. Goldstein, I., Paakinaho, V., Baek, S., Sung, M. H. & Hager, G. L. Synergistic gene expression during the acute phase response is characterized by transcription factor assisted loading. Nat. Commun. 8, 1849 (2017).
41. Oner, M. G. et al. Combined inactivation of TP53 and MIR34A promotes colorectal cancer development and progression in mice via increasing levels of IL6R and PAI1. Gastroenterology 155, 1868–1882 (2018).
42. Smitheman, K. N. et al. Lysine specific demethylase 1 inactivation enhances differentiation and promotes cytotoxic response when combined with all-trans retinoic acid in acute myeloid leukemia across subtypes. Haematologica 104, 1156–1167 (2019).
43. Rajaraman, S. et al. Measles virus-based treatments trigger a pro-inflammatory cascade and a distinctive immunopeptidome in glioblastoma. Mol. Ther. Oncolytics 12, 147–161 (2019).
44. Echevarria-Vargas, I. M. et al. Co-targeting BET and MEK as salvage therapy for MAPK and checkpoint inhibitor-resistant melanoma. EMBO Mol. Med. 10, e8446 (2018).
45. Storey, J. D. & Tibshirani, R. Statistical significance for genomewide studies. Proc. Natl Acad. Sci. USA 100, 9440–9445 (2003).
46. Corney, D. C. RNA-seq using next generation sequencing. Mater. Methods 3, 203 (2013).
47. Hoffman, G. E. & Schadt, E. E. variancePartition: interpreting drivers of variation in complex gene expression studies. BMC Bioinformatics 17, 483 (2016).
48. Hoffman, G. E. et al. Transcriptional signatures of schizophrenia in hiPSC-derived NPCs and neurons are concordant with post-mortem adult brains. Nat. Commun. 8, 2225 (2017).
49. Ritchie, M. E. et al. limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Res. 43, e47 (2015).
50. Robinson, M. D., McCarthy, D. J. & Smyth, G. K. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26, 139–140 (2010).
51. Kolde, R. pheatmap: Pretty Heatmaps. R Package Version 1.0.12 (2019).
52. Neuwirth, E. RColorBrewer: ColorBrewer Palettes. R Package Version 1.1-2 (2014).
53. Wickham, H. ggplot2: Elegant Graphics for Data Analysis (Springer, 2016).
54. Kassambara, A. ggpubr: ‘ggplot2' Based Publication Ready Plots. R Package Version 0.2.5 (2020).
55. Storey, J. D., Bass, A. J., Dabney, A. & Robinson, D. qvalue: Q-Value Estimation for False Discovery Rate Control. R Package Version 2.18.0 (2019).
56. Wickham, H. The split-apply-combine strategy for data analysis. J. Stat. Soft. 40, 1–29 (2011).
57. Ram, K. & Wickham, H. wesanderson: A Wes Anderson Palette Generator. R Package Version 0.3.6 (2018).
58. Morgan, M., Falcon, S. & Gentleman, R. GSEABase: Gene Set Enrichment Data Structures And Methods. R Package Version 1.48.0 (2019).
59. R Core Team. R: A Language and Environment for Statistical Computing (2019).
60. Wickham, H. & Seidel, D. scales: Scale Functions for Visualization. R Package Version 1.1.1 (2020).
61. Wang, J. & Liao, Y. WebGestaltR: Gene Set Analysis Toolkit WebGestaltR. R Package Version 0.4.3 (2020).
62. Wickham, H. stringr: Simple, Consistent Wrappers for Common String Operations. R Package Version 1.4.0 (2019).
63. Ho, S. M. et al. Rapid Ngn2-induction of excitatory neurons from hiPSC-derived neural progenitor cells. Methods 101, 113–124 (2016).
64. Ho, S. M. et al. Evaluating synthetic activation and repression of neuropsychiatric-related genes in hiPSC-derived NPCs, neurons, and astrocytes. Stem Cell Reports 9, 615–628 (2017).


This work was partially supported by National Institutes of Health grants R56 MH101454 (K.J.B.) and R01 MH106056 (K.J.B.). This work was supported, in part, through the computational resources and staff expertise provided by Scientific Computing at the Icahn School of Medicine at Mount Sinai.

Author information




N.S., together with K.J.B. and G.H., developed the synergistic analysis. N.S., C.S. and P.J.M.D. independently ran the synergistic analysis on the same dataset and tested the analysis on separate datasets. N.S. and K.J.B. wrote the manuscript.

Corresponding authors

Correspondence to Gabriel Hoffman or Kristen J. Brennand.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Peer review information Nature Protocols thanks Yang Zhou and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Related links

Key reference using this protocol

Schrode, N. et al. Nat. Genet. 51, 1475–1485 (2019)

Key data used in this protocol

Echevarria-Vargas, I. M. et al. EMBO Mol. Med. 10, e8446 (2018)

Extended data

Extended Data Fig. 1 Differential expression analysis output, related to Fig. 3.

a) Plot of read counts against counts per million (cpm). Horizontal red line marks 10 counts. Arrow indicates the intersection with the plotted data, which here equals 1.4 cpm (vertical red line). b) MDS plots, each highlighting one of two metadata variables. Sample data are separated by treatment (left), but not by replicate (right). c) Voom mean-variance plot. d) Volcano and mean difference (MA) plots of differential expression in the additive (left) and the combinatorial (right) comparisons. Significantly differentially expressed genes are highlighted in blue and red (volcano plot) and the top 10 significant genes are denoted in blue (MA plot).
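The relationship between a raw count cutoff and a cpm cutoff in panel a can be sketched as follows. The definition cpm = count / library size × 10^6 is standard; everything else here is an assumption for illustration: a single representative library size, the toy counts, and the rule of keeping a gene when it passes the cutoff in at least `min_samples` libraries. The function names are hypothetical and are not taken from the authors' R code.

```python
def cpm(count, library_size):
    """Counts per million for one gene in one library."""
    return count / library_size * 1_000_000


def implied_library_size(min_count, cutoff_cpm):
    """Library size at which `min_count` reads correspond to `cutoff_cpm`."""
    return min_count / cutoff_cpm * 1_000_000


def keep_gene(counts, library_sizes, cutoff_cpm, min_samples):
    """Keep a gene if its cpm reaches the cutoff in at least `min_samples` libraries."""
    passing = sum(
        cpm(c, n) >= cutoff_cpm for c, n in zip(counts, library_sizes)
    )
    return passing >= min_samples


# 10 counts matching 1.4 cpm implies a library of roughly 7.1 million reads.
lib = implied_library_size(10, 1.4)
print(f"implied library size: {lib:,.0f} reads")

# Hypothetical counts for two genes across three libraries.
library_sizes = [7_000_000, 8_000_000, 7_500_000]
print(keep_gene([12, 15, 3], library_sizes, cutoff_cpm=1.4, min_samples=2))
print(keep_gene([5, 4, 20], library_sizes, cutoff_cpm=1.4, min_samples=2))
```

Expressing the filter in cpm rather than raw counts keeps the cutoff comparable across libraries of different sequencing depth.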

Extended Data Fig. 2 Synergistic effect analysis output, related to Fig. 4.

a) Plot visualizing synergistic effect power calculations. X-axis shows synergistic log2FCs. In the current example, 10 samples per condition are required to resolve a synergistic log2FC of 1.6 at 75% power. b) Histogram of synergistic P-values. c) Pie chart showing the proportions of genes that fall into different synergistic differential expression categories. d) Hierarchical clustering of the differential expression log2(fold changes) of all synergy categories, in the additive model versus the combinatorial perturbation comparisons.
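How sample size relates to the smallest resolvable synergistic log2FC (panel a) can be approximated with a simple calculation. The sketch below is not the authors' power procedure and is not expected to reproduce the figure. It assumes a balanced design with four conditions, a common per-gene residual standard deviation `sigma` on the log2 scale (a hypothetical input), and a two-sided normal approximation. Under those assumptions the synergy estimate, mean(AB) - mean(A) - mean(B) + mean(control), has variance 4 × sigma² / n for n samples per condition.

```python
from math import sqrt
from statistics import NormalDist


def synergy_power(delta, sigma, n_per_group, alpha=0.05):
    """Approximate power to detect a synergistic log2FC of size `delta`."""
    std_normal = NormalDist()
    se = sigma * sqrt(4.0 / n_per_group)          # SE of AB - A - B + ctrl
    z_crit = std_normal.inv_cdf(1 - alpha / 2)    # two-sided critical value
    shift = abs(delta) / se                       # standardized effect size
    # Probability that the test statistic falls in either rejection region.
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)


def samples_needed(delta, sigma, target_power, alpha=0.05, max_n=1000):
    """Smallest n per condition reaching `target_power`, or None if not found."""
    for n in range(2, max_n + 1):
        if synergy_power(delta, sigma, n, alpha) >= target_power:
            return n
    return None


# Hypothetical inputs: residual SD of 1 log2 unit, unadjusted alpha of 0.05.
for n in (3, 5, 10, 20):
    print(n, round(synergy_power(delta=1.6, sigma=1.0, n_per_group=n), 3))
```

Because the synergy contrast combines four group means, its standard error is twice that of a single group mean, so resolving an interaction of a given size takes more replicates than resolving an individual perturbation effect of the same size. A genome-wide analysis would also use a stricter, multiple-testing-adjusted significance threshold than the default shown here.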

Extended Data Fig. 3 Gene set enrichment analysis (GSEA) output, related to Fig. 5.

a) Competitive GSEA of differential expression in the additive (top) and the combinatorial (bottom) comparisons using limma camera, based on two cancer hallmark gene sets. b) Bar chart showing detailed results of the 10 most significant gene sets as in (a). Red lines denote enrichment FDR of 5%.

Extended Data Fig. 4 Over-representation analysis (ORA) output, related to Fig. 6.

a, b) Over-representation analysis, using a hypergeometric test, of two publicly available gene sets and those ‘more downregulated’ (a) and ‘more upregulated’ (b) genes with significant synergistic differential expression (FDR < 1%), ranked by adjusted significance. Red lines denote enrichment FDR of 5%.
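The hypergeometric test behind an over-representation analysis can be written in a few lines. This is a minimal sketch of the one-sided test only, with hypothetical gene counts; it does not reproduce the package used in the protocol, and it omits the multiple-testing adjustment that yields the FDR values shown in the figure. The function name and arguments are illustrative.

```python
from math import comb


def hypergeom_enrichment_p(overlap, set_size, hits, universe):
    """One-sided P(X >= overlap) for over-representation.

    universe: number of genes tested (background)
    set_size: number of background genes that belong to the gene set
    hits:     number of genes in the query list (e.g. synergistic genes)
    overlap:  number of query genes that belong to the gene set
    """
    upper = min(set_size, hits)
    total = comb(universe, hits)  # all ways to draw the query list
    tail = sum(
        comb(set_size, k) * comb(universe - set_size, hits - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical example: 15,000 expressed genes, a 200-gene set,
# 500 synergistic genes, 20 of which fall in the set.
p = hypergeom_enrichment_p(overlap=20, set_size=200, hits=500, universe=15_000)
print(f"enrichment P = {p:.3g}")
```

The choice of background matters: restricting `universe` to genes that passed expression filtering, rather than all annotated genes, avoids inflating enrichment for sets dominated by expressed genes.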

Supplementary information

Reporting Summary

Supplementary Data 1

R code, count matrix, metadata file and gene annotation file for analysis used in ref. 20.

Supplementary Data 2

R code, count matrix, metadata file, gene annotation file and gene sets for supplementary analysis using data from ref. 44.


About this article


Cite this article

Schrode, N., Seah, C., Deans, P.J.M. et al. Analysis framework and experimental design for evaluating synergy-driving gene expression. Nat Protoc 16, 812–840 (2021).
