The mechanisms by which genetic risk variants interact with each other, as well as with environmental factors, to contribute to complex genetic disorders remain unclear. We describe in detail our recently published approach to resolve distinct additive and synergistic transcriptomic effects after combinatorial manipulation of genetic variants and/or chemical perturbagens. Although first developed for CRISPR-based perturbation studies of isogenic human induced pluripotent stem cell-derived neurons, our methodology can be broadly applied to any RNA sequencing dataset, provided that raw read counts are available. Whereas other differential expression analyses reveal the effect of individual perturbations, here we specifically query interactions between two or more perturbagens, resolving the extent of non-additive (synergistic) interactions between perturbations. We discuss the careful experimental design required to resolve synergistic effects, considerations of statistical power, and how to quantify observed synergy between experiments. Additionally, we speculate on potential future applications and explore the limitations of this approach. Overall, by interrogating the effect of independent factors, alone and in combination, our analytic framework and experimental design facilitate the discovery of convergence and synergy downstream of gene and/or treatment perturbations hypothesized to contribute to complex diseases. We think that this protocol can be successfully applied by any scientist with bioinformatic skills and basic proficiency in the R programming language. Our computational pipeline (https://github.com/nadschro/synergy-analysis) is straightforward, does not require supercomputing support and can be conducted in a single day upon completion of RNA sequencing experiments.
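The distinction between additive and synergistic effects can be illustrated with a minimal sketch for a single gene. It is written in Python for illustration only and is not the authors' R pipeline; the function names and expression values are hypothetical. It assumes the additive expectation is the sum of the two individual log2 fold changes versus control, and treats synergy as the difference between the observed combinatorial effect and that expectation.

```python
from statistics import mean


def log2fc(condition, control):
    """Difference in mean log2 expression between a condition and control."""
    return mean(condition) - mean(control)


def synergy_effect(ctrl, pert_a, pert_b, pert_ab):
    """Return (additive expectation, combinatorial effect, synergy) for one gene.

    Assumption: the additive model is the sum of the two single-perturbation
    log2 fold changes; synergy is whatever the combination adds beyond it.
    """
    additive = log2fc(pert_a, ctrl) + log2fc(pert_b, ctrl)
    combinatorial = log2fc(pert_ab, ctrl)
    return additive, combinatorial, combinatorial - additive


# Hypothetical log2 expression values for one gene, three replicates each.
ctrl = [5.0, 5.2, 4.8]
pert_a = [5.5, 5.6, 5.4]
pert_b = [5.3, 5.2, 5.4]
pert_ab = [6.9, 7.1, 7.0]

additive, combinatorial, synergy = synergy_effect(ctrl, pert_a, pert_b, pert_ab)
print(f"additive={additive:.2f} combinatorial={combinatorial:.2f} synergy={synergy:.2f}")
```

A synergy value near zero would indicate that the combined perturbation behaves additively; a genome-wide analysis would additionally need a per-gene statistical test and multiple-testing correction, which this sketch omits.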
Code is available at https://github.com/nadschro/synergy-analysis, under the MIT License.
This work was partially supported by National Institutes of Health grants R56 MH101454 (K.J.B.) and R01 MH106056 (K.J.B.). This work was supported, in part, through the computational resources and staff expertise provided by Scientific Computing at the Icahn School of Medicine at Mount Sinai.
The authors declare no competing interests.
Peer review information: Nature Protocols thanks Yang Zhou and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.
Key reference using this protocol
Schrode, N. et al. Nat. Genet. 51, 1475–1485 (2019): https://doi.org/10.1038/s41588-019-0497-5
Key data used in this protocol
Echevarria-Vargas, I. M. et al. EMBO Mol. Med. 10, e8446 (2018): https://doi.org/10.15252/emmm.201708446
a) Plot of read counts against counts per million (cpm). The horizontal red line marks 10 counts. The arrow indicates its intersection with the plotted data, which here corresponds to 1.4 cpm (vertical red line). b) MDS plots, each highlighting one of two metadata variables. Samples separate by treatment (left), but not by replicate (right). c) Voom mean-variance plot. d) Volcano and mean difference (MA) plots of differential expression in the additive (left) and the combinatorial (right) comparisons. Significantly differentially expressed genes are highlighted in blue and red (volcano plot) and the top 10 significant genes are denoted in blue (MA plot).
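The relationship between a raw-count threshold and a cpm cutoff described in panel a can be sketched as follows. This is a Python illustration, not the authors' R code. The library sizes are hypothetical, chosen so that 10 counts corresponds to roughly 1.4 cpm; deriving the cutoff from the median library size and requiring a minimum number of passing samples (`min_samples`) are assumptions made for the example.

```python
from statistics import median


def cpm(count, library_size):
    """Counts per million for one gene in one sample."""
    return count / library_size * 1_000_000


def cpm_cutoff(min_count, library_sizes):
    """cpm value equivalent to min_count reads at the median library size."""
    return cpm(min_count, median(library_sizes))


def keep_gene(counts, library_sizes, min_count=10, min_samples=2):
    """Keep a gene if enough samples reach the cpm cutoff (assumed rule)."""
    cutoff = cpm_cutoff(min_count, library_sizes)
    passing = sum(cpm(c, n) >= cutoff for c, n in zip(counts, library_sizes))
    return passing >= min_samples


# Hypothetical library sizes (total mapped reads per sample).
library_sizes = [6_500_000, 7_142_857, 8_000_000]
cutoff = cpm_cutoff(10, library_sizes)  # about 1.4 cpm
print(f"cpm cutoff: {cutoff:.2f}")
print(keep_gene([12, 3, 20], library_sizes))
```

Filtering on cpm rather than raw counts keeps the threshold comparable across samples sequenced to different depths.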
a) Plot visualizing synergistic effect power calculations. X-axis shows synergistic log2FCs. In the current example, 10 samples per condition are required to resolve a synergistic log2FC of 1.6 at 75% power. b) Histogram of synergistic P-values. c) Pie chart showing the proportions of genes that fall into different synergistic differential expression categories. d) Hierarchical clustering of the differential expression log2(fold changes) of all synergy categories, in the additive model versus the combinatorial perturbation comparisons.
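To make the power consideration in panel a concrete, the following Python sketch estimates power for a single gene under simplifying assumptions that are not taken from the protocol: a two-sided z-test, a known per-sample standard deviation `sigma` of log2 expression, equal group sizes, and no multiple-testing correction. The synergy contrast combines four group means (combination, each single perturbation, and control), so its variance is 4σ²/n, twice that of a simple two-group comparison. The numbers are hypothetical and are not intended to reproduce the figure.

```python
from math import sqrt
from statistics import NormalDist


def synergy_power(delta, sigma, n, alpha=0.05):
    """Approximate power to detect a synergistic log2FC of size delta.

    Assumes a two-sided z-test on the contrast AB - A - B + control,
    with n samples per condition and common standard deviation sigma.
    """
    std_normal = NormalDist()
    se = sigma * sqrt(4 / n)  # four independent group means enter the contrast
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    shift = abs(delta) / se
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)


def samples_needed(delta, sigma, target_power, alpha=0.05, max_n=1000):
    """Smallest n per condition reaching target_power, or None if not found."""
    for n in range(2, max_n + 1):
        if synergy_power(delta, sigma, n, alpha) >= target_power:
            return n
    return None


# Hypothetical inputs: synergistic log2FC of 1.6, sigma of 1.0, 75% power.
n_required = samples_needed(delta=1.6, sigma=1.0, target_power=0.75)
print(n_required)
```

In practice `sigma` would be estimated from pilot data, and the significance threshold would be tightened to account for testing thousands of genes, both of which increase the required sample size.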
a) Competitive GSEA of differential expression in the additive (top) and the combinatorial (bottom) comparisons using limma camera, based on two cancer hallmark gene sets. b) Bar chart showing detailed results for the 10 most significant gene sets from (a). Red lines denote an enrichment FDR of 5%.
a, b) Over-representation analysis, using a hypergeometric test, of the overlap between two publicly available gene sets and the ‘more downregulated’ (a) and ‘more upregulated’ (b) genes with significant synergistic differential expression (FDR < 1%), ranked by adjusted significance. Red lines denote an enrichment FDR of 5%.
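The hypergeometric over-representation test mentioned in this caption can be sketched in Python as an upper-tail probability. This is an illustration only, not the authors' implementation; the parameter names and gene counts are hypothetical. It asks how likely it is to see at least the observed overlap between a gene set and a list of synergistic genes if that list were drawn at random from the tested gene universe.

```python
from math import comb


def hypergeom_enrichment_p(universe, set_size, hits, overlap):
    """Upper-tail hypergeometric P value, P(X >= overlap).

    universe: number of genes tested
    set_size: genes in the universe that belong to the gene set
    hits:     genes in the list of interest (e.g. synergistic genes)
    overlap:  genes in both the list and the gene set
    """
    total = comb(universe, hits)
    upper = min(set_size, hits)
    tail = sum(
        comb(set_size, k) * comb(universe - set_size, hits - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical example: 15,000 tested genes, a 200-gene set,
# 300 synergistic genes, 12 of which fall in the gene set.
p_value = hypergeom_enrichment_p(universe=15_000, set_size=200, hits=300, overlap=12)
print(f"{p_value:.3g}")
```

When many gene sets are tested, the resulting P values would still need adjustment for multiple testing before applying an FDR threshold such as the 5% line shown in the figure.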
About this article
Cite this article
Schrode, N., Seah, C., Deans, P.J.M. et al. Analysis framework and experimental design for evaluating synergy-driving gene expression. Nat Protoc 16, 812–840 (2021). https://doi.org/10.1038/s41596-020-00436-7