Tissue-specific enhancer–gene maps from multimodal single-cell data identify causal disease alleles

Sakaue, Saori; Weinand, Kathryn; Isaac, Shakson; Dey, Kushal K.; Jagadeesh, Karthik; Kanai, Masahiro; Watts, Gerald F. M.; Zhu, Zhu; Brenner, Michael B.; McDavid, Andrew; Donlin, Laura T.; Wei, Kevin; Price, Alkes L.; Raychaudhuri, Soumya

doi:10.1038/s41588-024-01682-1

Article
Published: 09 April 2024

Nature Genetics volume 56, pages 615–626 (2024)

Abstract

Translating genome-wide association study (GWAS) loci into causal variants and genes requires accurate cell-type-specific enhancer–gene maps from disease-relevant tissues. Building enhancer–gene maps is essential but challenging with current experimental methods in primary human tissues. Here we developed a nonparametric statistical method, SCENT (single-cell enhancer target gene mapping), that models association between enhancer chromatin accessibility and gene expression in single-cell or nucleus multimodal RNA sequencing and ATAC sequencing data. We applied SCENT to 9 multimodal datasets including >120,000 single cells or nuclei and created 23 cell-type-specific enhancer–gene maps. These maps were highly enriched for causal variants in expression quantitative trait loci (eQTL) and GWAS for 1,143 diseases and traits. We identified likely causal genes for both common and rare diseases and linked somatic mutation hotspots to target genes. We demonstrate that application of SCENT to multimodal data from disease-relevant human tissue enables the scalable construction of accurate cell-type-specific enhancer–gene maps, essential for defining noncoding variant function.
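The abstract does not spell out the statistical model, so the following is only an illustrative sketch of how a nonparametric test of association between peak accessibility and gene expression might look for a single enhancer–gene pair. It assumes a Poisson model of expression counts with binarized accessibility as the covariate and sequencing depth as the exposure, and it assesses significance by resampling cells with replacement. The function names (`poisson_log_rate_ratio`, `bootstrap_association`, `simulate_cells`), the pseudocounts and the simulated data are hypothetical and are not taken from the SCENT code base.

```python
import numpy as np


def poisson_log_rate_ratio(counts, accessible, depth):
    """Log ratio of per-depth expression rates, open versus closed cells.

    For counts ~ Poisson(depth * exp(a + beta * accessible)) with a binary
    covariate, this ratio is the closed-form estimate of beta. Small
    pseudocounts (an assumption of this sketch) keep the value finite when
    a resample contains no cells or no reads in one group.
    """
    open_rate = (counts[accessible].sum() + 0.5) / (depth[accessible].sum() + 1.0)
    closed_rate = (counts[~accessible].sum() + 0.5) / (depth[~accessible].sum() + 1.0)
    return float(np.log(open_rate / closed_rate))


def bootstrap_association(counts, accessible, depth, n_boot=1000, seed=0):
    """Return (beta_hat, two-sided bootstrap p-value) for one peak-gene pair."""
    rng = np.random.default_rng(seed)
    n_cells = counts.shape[0]
    beta_hat = poisson_log_rate_ratio(counts, accessible, depth)

    boots = np.empty(n_boot)
    for b in range(n_boot):
        # Resample cells with replacement, keeping each cell's triplet intact.
        idx = rng.integers(0, n_cells, size=n_cells)
        boots[b] = poisson_log_rate_ratio(counts[idx], accessible[idx], depth[idx])

    # Two-sided p-value: how often the resampled estimate falls on either
    # side of zero, with +1 smoothing so the p-value is never exactly 0.
    lower = (np.sum(boots <= 0) + 1) / (n_boot + 1)
    upper = (np.sum(boots >= 0) + 1) / (n_boot + 1)
    p_value = min(1.0, 2.0 * min(lower, upper))
    return beta_hat, float(p_value)


def simulate_cells(n_cells=2000, beta=0.7, base_rate=1e-3, seed=0):
    """Simulate toy multimodal data for one peak-gene pair (illustration only)."""
    rng = np.random.default_rng(seed)
    accessible = rng.random(n_cells) < 0.3            # binarized peak accessibility
    depth = rng.integers(1000, 5000, size=n_cells)    # total UMIs per cell
    counts = rng.poisson(depth * base_rate * np.exp(beta * accessible))
    return counts, accessible, depth
```

In this sketch a peak–gene pair would be called a candidate link when the bootstrap p-value passes a multiple-testing threshold across all tested pairs. A real analysis would also adjust for covariates such as batch or mitochondrial read fraction, which the closed-form estimate above cannot accommodate.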

**Fig. 1: Schematic overview of SCENT and SCENT enhancer–gene pairs across nine single-cell multimodal datasets.**

**Fig. 2: SCENT identified functionally active and evolutionarily conserved *cis*-regulatory regions from single-cell multimodal data.**

**Fig. 3: SCENT enhancers are enriched in putative causal variants of eQTL and GWAS.**

**Fig. 4: SCENT defined causal variants and genes in complex trait GWAS.**

Data availability

The publicly available datasets were downloaded from the Gene Expression Omnibus (accession codes GSE140203, GSE156478, GSE178707, GSE194122, GSE193240 and GSE178453) or from the 10x Genomics web repository (https://www.10xgenomics.com/resources/datasets?query=&page=1&configure%5Bfacets%5D%5B0%5D=chemistryVersionAndThroughput&configure%5Bfacets%5D%5B1%5D=pipeline.version&configure%5BhitsPerPage%5D=500&menu%5Bproducts.name%5D=Single%20Cell%20Multiome%20ATAC%20%2B%20Gene%20Expression). The raw data for the arthritis-tissue dataset (single-cell multimodal RNA/ATAC–seq and single-cell ATAC–seq) are deposited at the NIH Database of Genotypes and Phenotypes (dbGaP accession number phs003417.v1.p1) and the Gene Expression Omnibus (GEO accession number GSE243917).

Code availability

The computational scripts related to this manuscript are available at https://github.com/immunogenomics/SCENT (https://doi.org/10.5281/zenodo.10452116)¹²⁴.

References

Welter, D. et al. The NHGRI GWAS Catalog, a curated resource of SNP–trait associations. Nucleic Acids Res. 42, D1001–D1006 (2014).
Visscher, P. M. et al. 10 years of GWAS discovery: biology, function, and translation. Am. J. Hum. Genet. 101, 5–22 (2017).
Buniello, A. et al. The NHGRI-EBI GWAS Catalog of published genome-wide association studies, targeted arrays and summary statistics 2019. Nucleic Acids Res. 47, D1005–D1012 (2019).
Claussnitzer, M. et al. A brief history of human disease genetics. Nature 577, 179–189 (2020).
Plenge, R. M., Scolnick, E. M. & Altshuler, D. Validating therapeutic targets through human genetics. Nat. Rev. Drug Discov. 12, 581–594 (2013).
Shendure, J., Findlay, G. M. & Snyder, M. W. Genomic medicine—progress, pitfalls, and promise. Cell 177, 45–57 (2019).
Schaid, D. J., Chen, W. & Larson, N. B. From genome-wide associations to candidate causal variants by statistical fine-mapping. Nat. Rev. Genet. 19, 491–504 (2018).
Maurano, M. T. et al. Systematic localization of common disease-associated variation in regulatory DNA. Science 337, 1190–1195 (2012).
Edwards, S. L., Beesley, J., French, J. D. & Dunning, M. Beyond GWASs: illuminating the dark road from association to function. Am. J. Hum. Genet. 93, 779–797 (2013).
Trynka, G. et al. Chromatin marks identify critical cell types for fine mapping complex trait variants. Nat. Genet. 45, 124–130 (2013).
Sanyal, A., Lajoie, B. R., Jain, G. & Dekker, J. The long-range interaction landscape of gene promoters. Nature 489, 109–113 (2012).
Smemo, S. et al. Obesity-associated variants within FTO form long-range functional connections with IRX3. Nature 507, 371–375 (2014).
Won, H. et al. Chromosome conformation elucidates regulatory relationships in developing human brain. Nature 538, 523–527 (2016).
Strober, B. J. et al. Dynamic genetic regulation of gene expression during cellular differentiation. Science 364, 1287–1290 (2019).
Cuomo, A. S. E. et al. Single-cell RNA-sequencing of differentiating iPS cells reveals dynamic genetic effects on gene expression. Nat. Commun. 11, 810 (2020).
Zhernakova, D. V. et al. Identification of context-dependent expression quantitative trait loci in whole blood. Nat. Genet. 49, 139–145 (2017).
Nathan, A. et al. Single-cell eQTL models reveal dynamic T cell state dependence of disease loci. Nature 606, 120–128 (2022).
Wakefield, J. A Bayesian measure of the probability of false discovery in genetic epidemiology studies. Am. J. Hum. Genet. 81, 208–227 (2007).
Maller, J. B. et al. Bayesian refinement of association signals for 14 loci in 3 common diseases. Nat. Genet. 44, 1294–1301 (2012).
Hormozdiari, F., Kostem, E., Kang, E. Y., Pasaniuc, B. & Eskin, E. Identifying causal variants at loci with multiple signals of association. Genetics 198, 497–508 (2014).
Benner, C. et al. FINEMAP: efficient variable selection using summary data from genome-wide association studies. Bioinformatics 32, 1493–1501 (2016).
Wang, G., Sarkar, A., Carbonetto, P. & Stephens, M. A simple new approach to variable selection in regression, with application to genetic fine mapping. J. R. Stat. Soc. Ser. B 82, 1273–1300 (2020).
Weissbrod, O. et al. Functionally informed fine-mapping and polygenic localization of complex trait heritability. Nat. Genet. 52, 1355–1363 (2020).
Wojcik, G. L. et al. Genetic analyses of diverse populations improves discovery for complex traits. Nature 570, 514–518 (2019).
Chen, M. H. et al. Trans-ethnic and ancestry-specific blood-cell genetics in 746,667 individuals from 5 global populations. Cell 182, 1198–1213.e14 (2020).
Ishigaki, K. et al. Multi-ancestry genome-wide association analyses identify novel genetic mechanisms in rheumatoid arthritis. Nat. Genet. 54, 1640–1651 (2022).
Kichaev, G. & Pasaniuc, B. Leveraging functional-annotation data in trans-ethnic fine-mapping studies. Am. J. Hum. Genet. 97, 260–271 (2015).
Kanai, M. et al. Insights from complex trait fine-mapping across diverse populations. Preprint at medRxiv https://doi.org/10.1101/2021.09.03.21262975 (2021).
Huang, H. et al. Fine-mapping inflammatory bowel disease loci to single-variant resolution. Nature 547, 173–178 (2017).
Farh, K. K. H. et al. Genetic and epigenetic fine mapping of causal autoimmune disease variants. Nature 518, 337–343 (2014).
Mahajan, A. et al. Fine-mapping type 2 diabetes loci to single-variant resolution using high-density imputation and islet-specific epigenome maps. Nat. Genet. 50, 1505–1513 (2018).
Kichaev, G. et al. Integrating functional data to prioritize causal variants in statistical fine-mapping studies. PLoS Genet. 10, e1004722 (2014).
Roadmap Epigenomics Consortium et al. Integrative analysis of 111 reference human epigenomes. Nature 518, 317–330 (2015).
Chen, L. et al. Genetic drivers of epigenetic and transcriptional variation in human immune cells. Cell 167, 1398–1414.e24 (2016).
Dunham, I. et al. An integrated encyclopedia of DNA elements in the human genome. Nature 489, 57–74 (2012).
Ernst, J. et al. Mapping and analysis of chromatin state dynamics in nine human cell types. Nature 473, 43–49 (2011).
Boix, C. A., James, B. T., Park, Y. P., Meuleman, W. & Kellis, M. Regulatory genomic circuitry of human disease loci by integrative epigenomics. Nature 590, 300–307 (2021).
Fulco, C. P. et al. Activity-by-contact model of enhancer-promoter regulation from thousands of CRISPR perturbations. Nat. Genet. 51, 1664 (2019).
Nasser, J. et al. Genome-wide enhancer maps link risk variants to disease genes. Nature 593, 238–243 (2021).
Gazal, S. et al. Combining SNP-to-gene linking strategies to identify disease genes and assess disease omnigenicity. Nat. Genet. 54, 827–836 (2022).
Pickar-Oliver, A. & Gersbach, C. A. The next generation of CRISPR–Cas technologies and applications. Nat. Rev. Mol. Cell Biol. 20, 490–507 (2019).
Anzalone, A. V., Koblan, L. W. & Liu, D. R. Genome editing with CRISPR–Cas nucleases, base editors, transposases and prime editors. Nat. Biotechnol. 38, 824–844 (2020).
Baglaenko, Y., Macfarlane, D., Marson, A., Nigrovic, P. A. & Raychaudhuri, S. Genome editing to define the function of risk loci and variants in rheumatic disease. Nat. Rev. Rheumatol. 17, 462–474 (2021).
Cao, J. et al. Joint profiling of chromatin accessibility and gene expression in thousands of single cells. Science 361, 1380–1385 (2018).
Chen, S., Lake, B. B. & Zhang, K. High-throughput sequencing of the transcriptome and chromatin accessibility in the same cell. Nat. Biotechnol. 37, 1452–1457 (2019).
Ma, S. et al. Chromatin potential identified by shared single-cell profiling of RNA and chromatin. Cell 183, 1103–1116.e20 (2020).
Allaway, K. C. et al. Genetic and epigenetic coordination of cortical interneuron development. Nature 597, 693–697 (2021).
Trevino, A. E. et al. Chromatin and gene-regulatory dynamics of the developing human cerebral cortex at single-cell resolution. Cell 184, 5053–5069.e23 (2021).
Granja, J. M. et al. ArchR is a scalable software package for integrative single-cell chromatin accessibility analysis. Nat. Genet. 53, 403–411 (2021).
Stuart, T., Srivastava, A., Madad, S., Lareau, C. A. & Satija, R. Single-cell chromatin state analysis with Signac. Nat. Methods 18, 1333–1341 (2021).
Pliner, H. A. et al. Cicero predicts cis-regulatory DNA interactions from single-cell chromatin accessibility data. Mol. Cell 71, 858–871.e8 (2018).
Lähnemann, D. et al. Eleven grand challenges in single-cell data science. Genome Biol. 21, 31 (2020).
Sarkar, A. & Stephens, M. Separating measurement and expression models clarifies confusion in single-cell RNA sequencing analysis. Nat. Genet. 53, 770–777 (2021).
Chen, H. et al. Assessment of computational methods for the analysis of single-cell ATAC–seq data. Genome Biol. 20, 1–25 (2019).
Granja, J. M. et al. Single-cell multiomic analysis identifies regulatory programs in mixed-phenotype acute leukemia. Nat. Biotechnol. 37, 1458–1465 (2019).
Townes, F. W., Hicks, S. C., Aryee, M. J. & Irizarry, R. A. Feature selection and dimension reduction for single-cell RNA-seq based on a multinomial model. Genome Biol. 20, 1–16 (2019).
Efron, B. & Tibshirani, R. J. An Introduction to the Bootstrap (Chapman and Hall, 1994).
Weinand, K. et al. The chromatin landscape of pathogenic transcriptional cell states in rheumatoid arthritis. Preprint at bioRxiv https://doi.org/10.1101/2023.04.07.536026 (2023).
Luecken, M. D. et al. A sandbox for prediction and integration of DNA, RNA, and proteins in single cells. In 35th Conference on Neural Information Processing Systems Datasets and Benchmarks Track (Round 2) (NeurIPS, 2021).
Mimitou, E. P. et al. Scalable, multimodal profiling of chromatin accessibility, gene expression and protein levels in single cells. Nat. Biotechnol. 39, 1246–1258 (2021).
Chen, A. F. et al. NEAT-seq: simultaneous profiling of intra-nuclear proteins, chromatin accessibility and gene expression in single cells. Nat. Methods 19, 547–553 (2022).
Meijer, M. et al. Epigenomic priming of immune genes implicates oligodendroglia in multiple sclerosis susceptibility. Neuron 110, 1193–12 (2022).
Zhang, Z. et al. Single nucleus transcriptome and chromatin accessibility of postmortem human pituitaries reveal diverse stem cell regulatory mechanisms. Cell Rep. https://doi.org/10.1016/J.CELREP.2022.110467 (2022).
Abascal, F. et al. Expanded encyclopaedias of DNA elements in the human and mouse genomes. Nature 583, 699–710 (2020).
Westra, H. J. & Franke, L. From genome to function by studying eQTLs. Biochim. Biophys. Acta 1842, 1896–1902 (2014).
Siepel, A. et al. Evolutionarily conserved elements in vertebrate, insect, worm, and yeast genomes. Genome Res. 15, 1034–1050 (2005).
Hujoel, M. L. A., Gazal, S., Hormozdiari, F., van de Geijn, B. & Price, A. L. Disease heritability enrichment of regulatory elements is concentrated in elements with ancient sequence age and conserved function across species. Am. J. Hum. Genet. 104, 611–624 (2019).
Mumbach, M. R. et al. Enhancer connectome in primary human cells identifies target genes of disease-associated DNA elements. Nat. Genet. 49, 1602–1612 (2017).
Lek, M. et al. Analysis of protein-coding genetic variation in 60,706 humans. Nature 536, 285–291 (2016).
Karczewski, K. J. et al. The mutational constraint spectrum quantified from variation in 141,456 humans. Nature 581, 434–443 (2020).
Wang, X. & Goldstein, D. B. Enhancer domains predict gene pathogenicity and inform gene discovery in complex disease. Am. J. Hum. Genet. 106, 215–233 (2020).
Aguet, F. et al. The GTEx Consortium atlas of genetic regulatory effects across human tissues. Science 369, 1318–1330 (2020).
Wang, Q. S. et al. Leveraging supervised learning for functionally informed fine-mapping of cis-eQTLs identifies an additional 20,913 putative causal eQTLs. Nat. Commun. 12, 3394 (2021).
Zou, J. et al. Leveraging allelic imbalance to refine fine-mapping for eQTL studies. PLoS Genet. 15, e1008481 (2019).
Chen, W., McDonnell, S. K., Thibodeau, S. N., Tillmans, L. S. & Schaid, D. J. Incorporating functional annotations for fine-mapping causal variants in a Bayesian framework using summary statistics. Genetics 204, 933–958 (2016).
Gaffney, D. J. et al. Dissecting the regulatory architecture of gene expression QTLs. Genome Biol. 13, R7 (2012).
Göring, H. H. H. et al. Discovery of expression QTLs using large-scale transcriptional profiling in human lymphocytes. Nat. Genet. 39, 1208–1216 (2007).
Wen, X., Luca, F. & Pique-Regi, R. Cross-population joint analysis of eQTLs: fine mapping and functional annotation. PLoS Genet. 11, e1005176 (2015).
Kurki, M. I. et al. FinnGen provides genetic insights from a well-phenotyped isolated population. Nature 613, 508–518 (2023).
Bycroft, C. et al. The UK Biobank resource with deep phenotyping and genomic data. Nature 562, 203–209 (2018).
Dey, K. K. et al. SNP-to-gene linking strategies reveal contributions of enhancer-related and candidate master-regulator genes to autoimmune disease. Cell Genomics 2, 100145 (2022).
Freund, M. K. et al. Phenotype-specific enrichment of mendelian disorder genes near GWAS regions across 62 complex traits. Am. J. Hum. Genet. 103, 535–552 (2018).
Gate, R. E. et al. Genetic determinants of co-accessible chromatin regions in activated T cells across humans. Nat. Genet. 50, 1140–1150 (2018).
Khetan, S. et al. Type 2 diabetes-associated genetic variants regulate chromatin accessibility in human islets. Diabetes 67, 2466–2477 (2018).
Alasoo, K. et al. Shared genetic effects on chromatin and gene expression indicate a role for enhancer priming in immune response. Nat. Genet. 50, 424–431 (2018).
Currin, K. W. et al. Genetic effects on liver chromatin accessibility identify disease regulatory variants. Am. J. Hum. Genet. 108, 1169–1189 (2021).
Kumasaka, N., Knights, A. J. & Gaffney, D. J. Fine-mapping cellular QTLs with RASQUAL and ATAC–seq. Nat. Genet. 48, 206–213 (2015).
Kerimov, N. et al. A compendium of uniformly processed human gene expression and splicing quantitative trait loci. Nat. Genet. 53, 1290–1299 (2021).
Sagara, H. et al. Activation of TGF-β/Smad2 signaling is associated with airway remodeling in asthma. J. Allergy Clin. Immunol. 110, 249–254 (2002).
Chiou, J. et al. Interpreting type 1 diabetes risk with genetics and single-cell epigenomics. Nature 594, 398–402 (2021).
Mouri, K. et al. Prioritization of autoimmune disease-associated genetic variants that perturb regulatory element activity in T cells. Nat. Genet. 54, 603–612 (2022).
Javierre, B. M. et al. Lineage-specific genome architecture links enhancers and non-coding disease variants to target gene promoters. Cell 167, 1369–1384.e19 (2016).
Radtke, F., Fasnacht, N. & MacDonald, H. R. Notch signaling in the immune system. Immunity 32, 14–27 (2010).
Wei, K. et al. Notch signalling drives synovial fibroblast identity and arthritis pathology. Nature 582, 259–264 (2020).
Delacher, M. et al. Rbpj expression in regulatory T cells is critical for restraining TH2 responses. Nat. Commun. 10, 1621 (2019).
Blake, J. A. et al. Mouse Genome Database (MGD): knowledgebase for mouse–human comparative biology. Nucleic Acids Res. 49, D981–D987 (2021).
Uhlén, M. et al. Tissue-based map of the human proteome. Science 347, 1260419 (2015).
Hillier, S. G. Gonadotropic control of ovarian follicular growth and development. Mol. Cell. Endocrinol. 179, 39–46 (2001).
Rubinstein, W. S. et al. The NIH genetic testing registry: a new, centralized database of genetic tests to enable access to comprehensive information and improve transparency. Nucleic Acids Res. 41, D925–D935 (2013).
Retterer, K. et al. Clinical application of whole-exome sequencing across clinical indications. Genet. Med. 18, 696–704 (2016).
Adams, D. R. & Eng, C. M. Next-generation sequencing to diagnose suspected genetic disorders. N. Engl. J. Med. 379, 1353–1362 (2018).
Srivastava, S. et al. Meta-analysis and multidisciplinary consensus statement: exome sequencing is a first-tier clinical diagnostic test for individuals with neurodevelopmental disorders. Genet. Med. 21, 2413–2421 (2019).
Landrum, M. J. et al. ClinVar: improving access to variant interpretations and supporting evidence. Nucleic Acids Res. 46, D1062–D1067 (2018).
Glocker, E.-O. et al. Inflammatory bowel disease and mutations affecting the interleukin-10 receptor. N. Engl. J. Med. 361, 2033–2045 (2009).
Dietlein, F. et al. Genome-wide analysis of somatic noncoding mutation patterns in cancer. Science 376, eabg5601 (2022).
Connally, N. et al. The missing link between genetic association and regulatory function. eLife 11, e74970 (2022).
Dixit, A. et al. Perturb-seq: dissecting molecular circuits with scalable single cell RNA profiling of pooled genetic screens. Cell 167, 1853–1866 (2016).
Rees, H. A. & Liu, D. R. Base editing: precision chemistry on the genome and transcriptome of living cells. Nat. Rev. Genet. 19, 770–788 (2018).
Morris, J. A. et al. Discovery of target genes and pathways at GWAS loci by pooled single-cell CRISPR screens. Science 380, eadh7699 (2023).
Donlin, L. T. et al. Methods for high-dimensional analysis of cells dissociated from cryopreserved synovial tissue. Arthritis Res. Ther. 20, 139 (2018).
Stuart, T. et al. Comprehensive integration of single-cell data. Cell 177, 1888–1902.e21 (2019).
Korsunsky, I. et al. Fast, sensitive and accurate integration of single-cell data with Harmony. Nat. Methods 16, 1289–1296 (2019).
PhastCons scores for multiple alignments of 99 vertebrate genomes to the human genome. UCSC Genome Browser https://hgdownload.cse.ucsc.edu/goldenpath/hg19/phastCons100way/ (2014).
gnomAD database. Broad Institute https://gnomad.broadinstitute.org/downloads (2023).
GWAS fine-mapping results. Finucane Lab https://www.finucanelab.org/data (2019).
EpiMap Gene-Enhancer links. Broad Institute https://personal.broadinstitute.org/cboix/epimap/links/pergroup/ (2021).
ABC predictions across 131 biosamples. Broad Institute ftp://ftp.broadinstitute.org/outgoing/lincRNA/ABC/AllPredictions.AvgHiC.ABC0.015.minus150.ForABCPaperV3.txt.gz (2021).
Delaneau, O., Marchini, J. & Zagury, J. F. A linear complexity phasing method for thousands of genomes. Nat. Methods 9, 179–181 (2012).
Das, S. et al. Next-generation genotype imputation service and methods. Nat. Genet. 48, 1284–1287 (2016).
Gibbs, R. A. et al. A global reference for human genetic variation. Nature 526, 68–74 (2015).
van de Geijn, B., McVicker, G., Gilad, Y. & Pritchard, J. K. WASP: allele-specific software for robust molecular quantitative trait locus discovery. Nat. Methods 12, 1061–1063 (2015).
van der Auwera, G. & O’Connor, B. Genomics in the Cloud (O’Reilly Media, Inc., 2020).
ClinVar variants. ClinVar https://ftp.ncbi.nlm.nih.gov/pub/clinvar/vcf_GRCh38/clinvar.vcf.gz (2023).
Sakaue, S. immunogenomics/SCENT: v1.0.0. Zenodo https://doi.org/10.5281/zenodo.10452116 (2024).

Acknowledgements

We sincerely thank participants of this study who provided tissue samples. We thank A. Gupta, J. Kang and K. Lagattuta for their comments and helpful discussion on the manuscript. This work is supported in part by funding from the National Institutes of Health (R01AR063759, U01HG012009 and UC2AR081023 to S.R.). S.S. was in part supported by the Uehara Memorial Foundation and The Osamu Hayaishi Memorial Scholarship. K. Weinand was supported by NIH NIAMS T32AR007530. K. Wei was supported by a Burroughs Wellcome Fund Career Awards for Medical Scientists, a Doris Duke Charitable Foundation Clinical Scientist Development Award, a Rheumatology Research Foundation Innovative Research Award, and NIH NIAMS K08AR077037. We thank the Brigham and Women’s Hospital Center for Cellular Profiling Single Cell Multiomics Core for experimental design and protocol optimization. The funders had no role in study design, data collection and analysis, decision to publish or preparation of the manuscript.

Author information

Authors and Affiliations

Center for Data Sciences, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, USA
Saori Sakaue, Kathryn Weinand, Shakson Isaac, Michelle Curtis, Maria Gutierrez-Arcelus, Siddarth Gurajala, Kazuyoshi Ishigaki, Joyce B. Kang, Ilya Korsunsky, Joseph Mears, Nghia Millard, Aparna Nathan, Yakir Reshef, Laurie Rumker, Qian Xiao, Fan Zhang & Soumya Raychaudhuri
Divisions of Genetics and Rheumatology, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, USA
Saori Sakaue, Kathryn Weinand, Shakson Isaac, Michelle Curtis, Maria Gutierrez-Arcelus, Siddarth Gurajala, Kazuyoshi Ishigaki, Joyce B. Kang, Ilya Korsunsky, Joseph Mears, Nghia Millard, Aparna Nathan, Yakir Reshef, Laurie Rumker, Qian Xiao, Fan Zhang & Soumya Raychaudhuri
Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
Saori Sakaue, Kathryn Weinand, Shakson Isaac, Kushal K. Dey, Karthik Jagadeesh, Masahiro Kanai, Michelle Curtis, Maria Gutierrez-Arcelus, Siddarth Gurajala, Kazuyoshi Ishigaki, Joyce B. Kang, Ilya Korsunsky, Joseph Mears, Nghia Millard, Aparna Nathan, Yakir Reshef, Laurie Rumker, Qian Xiao, Fan Zhang, Alkes L. Price & Soumya Raychaudhuri
Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
Kathryn Weinand, Shakson Isaac, Joyce B. Kang, Katherine P. Liao, Nghia Millard, Aparna Nathan, Laurie Rumker, Kamil Slowikowski & Soumya Raychaudhuri
Department of Epidemiology, Harvard T. H. Chan School of Public Health, Boston, MA, USA
Kushal K. Dey, Karthik Jagadeesh & Alkes L. Price
Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
Masahiro Kanai
Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
Masahiro Kanai
Center for Computational and Integrative Biology, Massachusetts General Hospital, Boston, MA, USA
Masahiro Kanai
Division of Rheumatology, Inflammation, and Immunity, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
Gerald F. M. Watts, Zhu Zhu, Adam Chicoine, Ellen M. Gravallese, Anna Helena Jonsson, Gregory Keras, Zhihan J. Li, Yuhong Li, Katherine P. Liao, Deepak A. Rao, Dana Weisenfeld, Michael B. Brenner & Kevin Wei
Department of Biostatistics and Computational Biology, University of Rochester Medical Center, Rochester, NY, USA
Andrew McDavid
Hospital for Special Surgery, New York, NY, USA
S. Louis Bridges Jr, Vivian P. Bykerk, Susan M. Goodman, Lionel B. Ivashkiv, Amit Lakhanpal, Ian Mantel, Dana E. Orange, Melanie H. Smith & Laura T. Donlin
Weill Cornell Medicine, New York, NY, USA
S. Louis Bridges Jr, Vivian P. Bykerk, Susan M. Goodman, Lionel B. Ivashkiv, Amit Lakhanpal, Ian Mantel & Laura T. Donlin
Department of Biostatistics, Harvard T. H. Chan School of Public Health, Boston, MA, USA
Alkes L. Price
Division of Allergy, Immunology and Rheumatology, Department of Medicine, University of Rochester Medical Center, Rochester, NY, USA
Jennifer Albrecht, Jennifer H. Anolik, Jennifer L. Barnas, Debbie Campbell, Nida Meednu, Javier Rangel-Moreno, Christopher Ritchlin & Darren Tabechian
Accelerating Medicines Partnership® Program: Rheumatoid Arthritis and Systemic Lupus Erythematosus (AMP® RA/SLE) Network, Boston, MA, USA
William Apruzzese
Division of Rheumatology, University of Colorado School of Medicine, Aurora, CO, USA
Nirmal Banda
Division of Rheumatology, Columbia University College of Physicians and Surgeons, New York, NY, USA
Joan M. Bathon & Laura Geraldino-Pardilla
Division of Rheumatology, Cedars-Sinai Medical Center, Los Angeles, CA, USA
Ami Ben-Artzi & Michael H. Weisman
Department of Pathology and Laboratory Medicine, University of Rochester Medical Center, Rochester, NY, USA
Brendan F. Boyce
Division of Rheumatology, Allergy and Immunology, University of California, San Diego, La Jolla, CA, USA
David L. Boyle, Arnold Ceponis & Gary S. Firestein
Rheumatology Research Group, Institute for Inflammation and Ageing, University of Birmingham, NIHR Birmingham Biomedical Research Center and Clinical Research Facility, University of Birmingham, Queen Elizabeth Hospital, Birmingham, UK
Hayley L. Carr, Andrew Filer, Lindsy Forbess, Mark Maybury, Karim Raza, Ilfita Sahbudin & Dagmar Scheel-Toellner
Department of Radiology, University of Pittsburgh Medical Center, Pittsburgh, PA, USA
Andrew Cordle
Division of Rheumatology, University of Colorado School of Medicine, Aurora, CO, USA
Kevin D. Deane, V. Michael Holers, Larry W. Moreland & Jennifer A. Seifert
Department of Pathology and Laboratory Medicine, Hospital for Special Surgery, New York, NY, USA
Edward DiCarlo
Division of Allergy, Immunology, and Transplantation, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, MD, USA
Patrick Dunn
Northrop Grumman Health Solutions, Rockville, MD, USA
Patrick Dunn
Feinstein Institute for Medical Research, Northwell Health, Manhasset, New York, NY, USA
Peter K. Gregersen & Diane Horowitz
Department of Arthritis & Clinical Immunology, Oklahoma Medical Research Foundation, Oklahoma City, OK, USA
Joel M. Guthridge & Judith A. James
Division of Immunology, Department of Pediatrics, Boston Children’s Hospital and Harvard Medical School, Boston, MA, USA
Maria Gutierrez-Arcelus
Division of Clinical Immunology and Rheumatology, Department of Medicine, University of Alabama at Birmingham, Birmingham, AL, USA
Laura B. Hughes
Laboratory for Human Immunogenetics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
Kazuyoshi Ishigaki
Department of Surgery, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
James A. Lederer
Division of Rheumatology, Department of Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL, USA
Arthur M. Mandelin II & Harris Perlman
Division of Rheumatology and Clinical Immunology, University of Pittsburgh School of Medicine, Pittsburgh, PA, USA
Larry W. Moreland
Centre for Experimental Medicine and Rheumatology, William Harvey Research Institute, Queen Mary University of London, London, UK
Alessandra Nerviani, Costantino Pitzalis & Felice Rivellese
Laboratory of Molecular Neuro-Oncology, The Rockefeller University, New York, NY, USA
Dana E. Orange
Division of Immunology and Rheumatology, Institute for Immunity, Transplantation and Infection, Stanford University School of Medicine, Stanford, CA, USA
William H. Robinson, Paul J. Utz & Michael H. Weisman
Center for Immunology and Inflammatory Diseases, Department of Medicine, Massachusetts General Hospital, Boston, MA, USA
Kamil Slowikowski
MGH Cancer Center, Boston, MA, USA
Kamil Slowikowski
Division of Rheumatology and the Center for Health Artificial Intelligence, University of Colorado School of Medicine, Aurora, CO, USA
Fan Zhang

Authors

Saori Sakaue
Kathryn Weinand
Shakson Isaac
Kushal K. Dey
Karthik Jagadeesh
Masahiro Kanai
Gerald F. M. Watts
Zhu Zhu
Michael B. Brenner
Andrew McDavid
Laura T. Donlin
Kevin Wei
Alkes L. Price
Soumya Raychaudhuri

Consortia

Accelerating Medicines Partnership® RA/SLE Program and Network

Jennifer Albrecht, Jennifer H. Anolik, William Apruzzese, Nirmal Banda, Jennifer L. Barnas, Joan M. Bathon, Ami Ben-Artzi, Brendan F. Boyce, David L. Boyle, S. Louis Bridges Jr, Vivian P. Bykerk, Debbie Campbell, Hayley L. Carr, Arnold Ceponis, Adam Chicoine, Andrew Cordle, Michelle Curtis, Kevin D. Deane, Edward DiCarlo, Patrick Dunn, Andrew Filer, Gary S. Firestein, Lindsy Forbess, Laura Geraldino-Pardilla, Susan M. Goodman, Ellen M. Gravallese, Peter K. Gregersen, Joel M. Guthridge, Maria Gutierrez-Arcelus, Siddarth Gurajala, V. Michael Holers, Diane Horowitz, Laura B. Hughes, Kazuyoshi Ishigaki, Lionel B. Ivashkiv, Judith A. James, Anna Helena Jonsson, Joyce B. Kang, Gregory Keras, Ilya Korsunsky, Amit Lakhanpal, James A. Lederer, Zhihan J. Li, Yuhong Li, Katherine P. Liao, Arthur M. Mandelin II, Ian Mantel, Mark Maybury, Joseph Mears, Nida Meednu, Nghia Millard, Larry W. Moreland, Aparna Nathan, Alessandra Nerviani, Dana E. Orange, Harris Perlman, Costantino Pitzalis, Javier Rangel-Moreno, Deepak A. Rao, Karim Raza, Yakir Reshef, Christopher Ritchlin, Felice Rivellese, William H. Robinson, Laurie Rumker, Ilfita Sahbudin, Jennifer A. Seifert, Kamil Slowikowski, Melanie H. Smith, Darren Tabechian, Dagmar Scheel-Toellner, Paul J. Utz, Dana Weisenfeld, Michael H. Weisman, Qian Xiao & Fan Zhang

Contributions

S.S. and S.R. conceived the work and wrote the manuscript with critical input from co-authors. S.S. and K. Weinand analyzed the arthritis-tissue dataset and S.S. analyzed publicly available datasets with help and guidance from K.K.D., K.J., M.K., A.M., A.L.P. and S.R. G.F.M.W., Z.Z., M.B.B., L.T.D. and K. Wei provided samples and generated the arthritis-tissue dataset. S.I. refactored the SCENT software implementation as an R package.

Corresponding author

Correspondence to Soumya Raychaudhuri.

Ethics declarations

Competing interests

S.R. is a founder for Mestag, Inc., a scientific advisor for Rheos, Janssen and Pfizer, and serves as a consultant for Sanofi and Abbvie. The other authors declare no competing interests.

Peer review

Peer review information

Nature Genetics thanks Tim Stuart and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Distribution of gene expression counts in single-cell RNA-seq and statistics from association between gene expression and chromatin accessibility under null simulation.

a. In the arthritis-tissue dataset, used as an example, the mean gene count was strongly correlated with the standard deviation of the gene count. b. The correlation between the maximum expression count per gene (x-axis) and the mean naïve association chi-square values (χ²) from Poisson regression between gene expression and chromatin accessibility under null simulation (y-axis). c. The quantile-quantile (QQ) plot of two-sided P values from the Poisson regression between gene expression count and chromatin accessibility under null simulation. d. The QQ plot of two-sided P values from the negative binomial regression between gene expression count and chromatin accessibility under null simulation. e. The QQ plot of two-sided P values from the linear regression between log-normalized and inverse-normal-transformed gene expression and chromatin accessibility under null simulation. f. The QQ plot of two-sided P values estimated from bootstrapping based on the statistics distributions from the Poisson regression between gene expression count and chromatin accessibility under null simulation. g. The QQ plot of two-sided P values estimated from bootstrapping based on the statistics distributions from the negative binomial regression between gene expression count and chromatin accessibility under null simulation. h. Computational runtime benchmarking for Poisson regression with binarized ATAC-seq peak (red), negative binomial regression with binarized ATAC-seq peak (teal), and Poisson regression with non-binarized ATAC-seq peak (blue). The values are relative to the computational time for Poisson regression, and bars are the mean across n=100 randomly selected peak-gene pairs. Horizontal lines (error bars) indicate one standard deviation from the mean.

Extended Data Fig. 2 Schematic overview of SCENT model using Poisson regression and non-parametric bootstrapping.

We first ran Poisson regression associating the raw gene expression count (RNA-seq) with the peak accessibility (ATAC-seq), accounting for technical covariates, across all cells in the multimodal data to estimate β_peak. Then, we resampled cells with replacement from the full data in each bootstrapping round and re-estimated \(\beta'_{peak}\), repeating this N times. We compared the empirical distribution of \(\beta'_{peak}\) against the null hypothesis (\(\beta'_{peak} = 0\)) to derive the significance of β_peak (that is, the two-sided bootstrapping-based P value, P_bootstrap).
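The procedure in this schematic can be illustrated with a minimal Python sketch. It is not the SCENT implementation (which is distributed as an R package): the function names `fit_poisson` and `bootstrap_peak_pvalue`, the Newton-Raphson fitter, the fixed number of bootstrap rounds, and the tail-count rule used to turn the bootstrap distribution into a two-sided P value are all assumptions made for illustration.

```python
import numpy as np


def fit_poisson(X, y, n_iter=50, tol=1e-8):
    """Fit a log-link Poisson regression by Newton-Raphson; return coefficients.

    Assumes column 0 of X is the intercept (illustrative helper, not SCENT code).
    """
    beta = np.zeros(X.shape[1])
    beta[0] = np.log(max(y.mean(), 1e-8))  # start the intercept at log(mean count)
    ridge = 1e-8 * np.eye(X.shape[1])      # tiny ridge keeps the Hessian invertible
    for _ in range(n_iter):
        mu = np.exp(np.clip(X @ beta, -30, 30))
        grad = X.T @ (y - mu)              # score vector
        hess = X.T @ (X * mu[:, None])     # Fisher information
        step = np.linalg.solve(hess + ridge, grad)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta


def bootstrap_peak_pvalue(expr, peak, covariates, n_boot=200, seed=0):
    """Estimate beta_peak on all cells, then bootstrap cells to get a two-sided P value."""
    expr = np.asarray(expr, dtype=float)
    n = expr.shape[0]
    X = np.column_stack([np.ones(n), peak, covariates])
    beta_peak = fit_poisson(X, expr)[1]    # coefficient of the peak column

    rng = np.random.default_rng(seed)
    boot = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)   # resample cells with replacement
        boot[b] = fit_poisson(X[idx], expr[idx])[1]

    # Assumed rule: twice the smaller fraction of bootstrap estimates lying on
    # either side of zero, with a +1 correction so the P value is never exactly 0.
    lower = (np.sum(boot <= 0) + 1) / (n_boot + 1)
    upper = (np.sum(boot >= 0) + 1) / (n_boot + 1)
    p_bootstrap = min(1.0, 2 * min(lower, upper))
    return beta_peak, p_bootstrap
```

Here `expr` holds raw counts for one gene, `peak` the accessibility of one peak per cell, and `covariates` a vector or matrix of technical covariates. With a fixed `n_boot`, the smallest attainable P value is 2/(n_boot + 1), so resolving very small P values would require more bootstrap rounds.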

Extended Data Fig. 3 The QQ plot of SCENT P values by bootstrapping.

We applied SCENT to each of 23 broad cell types from 9 single-cell multimodal datasets. Each QQ plot represents two-sided P_bootstrap values in each cell type in each dataset (a. arthritis-tissue, b. public PBMC, c. NeurIPS, d. SHARE-seq, e. Dogma-seq (control), f. Dogma-seq (stimulated), g. NEAT-seq, h. Brain, i. Pituitary).

Extended Data Fig. 4 Properties of SCENT peaks.

a. The number of significant SCENT peaks per gene across genes we investigated in at least one dataset-cell type pair. b. The number of significant gene-peak pairs discovered by SCENT with FDR < 10% in each dataset (y-axis) as a function of the total number of ATAC-seq fragments in each dataset (x-axis), colored by the dataset. c. The number of significant gene-peak pairs discovered by SCENT with FDR < 10% in each dataset (y-axis) as a function of the total number of unique RNA molecules in each dataset (x-axis), colored by the dataset. d. The effect-size correlation (Pearson’s r) between the arthritis-tissue dataset and each other dataset for the same cell type (left) and the directional (sign) concordance between the arthritis-tissue dataset and each other dataset for the same cell type (right). e. Fraction of overlap with ENCODE cCREs for SCENT peaks (teal) or non-SCENT peaks (orange) in each dataset and for a random set of cis-non-coding regions (pink). f. The mean Δ phastCons score for SCENT peaks excluding promoter peaks (teal) and all cis-ATAC peaks excluding promoter peaks (yellow) in each of the three example multimodal datasets. The bars indicate the 95% CI by bootstrapping genes (n_bootstrap=1000). g. The mean Δ phastCons score between SCENT peaks and TSS-distance-matched non-SCENT peaks across all the genes. The bars indicate the 95% CI by bootstrapping genes (n_bootstrap=1000).

Extended Data Fig. 5 Mutational constraint on genes with a high number of SCENT peaks.

For each gene, the number of SCENT peaks was counted and binned as shown on the x-axis, and the mutational constraint metric (a, pLI, the probability of being loss-of-function intolerant; b, LOEUF, the loss-of-function observed/expected upper bound fraction) for genes within each bin is shown as a violin plot on the y-axis. The dots indicate the mean score in each bin, and the error bars indicate one standard deviation from the mean. Each bin consists of 555-4071 genes in a and 568-4265 genes in b.

Extended Data Fig. 6 Causal variant enrichment for eQTLs.

a. The mean causal variant enrichment for eQTL within SCENT peaks excluding all promoters (teal) or cis-regulatory ATAC-seq peaks excluding all promoters (yellow) in each dataset. b. The mean causal variant enrichment for eQTL within SCENT peaks (teal) or non-SCENT peaks with matching distance to TSS (pink). c. Comparison of the mean causal variant enrichment for eQTL (y-axis) among SCENT (teal), ArchR (pink), and Signac (purple) as a function of the number of significant peak-gene pairs at each threshold of significance by FDR in SCENT and correlation r in ArchR and Signac. d. Comparison of the mean causal variant enrichment for eQTL among SCENT, ArchR, and Signac as a function of the number of significant peak-gene pairs at each threshold of FDR in SCENT, ArchR and Signac. The ArchR results with > 180,000 peak-gene linkages are omitted. e. Comparison of the mean causal variant enrichment for eQTL among SCENT, ArchR, and ArchR filtered on RNA expression as a function of the number of significant peak-gene pairs. f. Comparison of the mean causal variant enrichment for eQTL among SCENT, Signac, and Signac filtered on RNA expression as a function of the number of significant peak-gene pairs. g. Comparison of the mean causal variant enrichment for eQTL among SCENT, the default Pearson’s correlation version of Signac, and the optional Spearman’s correlation version of Signac as a function of the number of significant peak-gene pairs. h. Comparison of the mean causal variant enrichment for eQTL among original SCENT (Poisson regression + non-parametric bootstrapping), a Poisson-only strategy without bootstrapping, and Cicero (a correlation method using sc-ATAC-seq alone) as a function of the number of significant peak-gene pairs up to 100,000 peak-gene linkages. i. Comparison of the mean causal variant enrichment for eQTL between SCENT and Cicero peaks after adding all accessible promoter regions (1 kb regions from TSS) to account for potential promoter bias. j. Tissue-specific causal variant enrichment within SCENT peaks. The dots and lines are colored by the eQTL source tissue in GTEx that we assessed. In all panels, the bars indicate 95% confidence intervals by bootstrapping genes (n_bootstrap=1000).

Extended Data Fig. 7 Causal variant enrichment for GWAS.

a and b. The mean causal variant enrichment for GWAS within cell-type-specific and aggregated SCENT enhancers (teal), ENCODE cCREs (pink), group-specific and aggregated EpiMap enhancers (red) and sample-specific and aggregated ABC enhancers (blue). GWAS results were based on FinnGen (a) and UK Biobank (b). The bars indicate 95% confidence intervals by bootstrapping traits (n_bootstrap=1000). c. The mean causal variant enrichment for FinnGen GWAS (see Methods) within SCENT peaks excluding all promoters (teal) or cis-regulatory ATAC-seq peaks excluding all promoters (yellow) in each of the 9 single-cell datasets. The bars indicate 95% confidence intervals by bootstrapping traits (n_bootstrap=1000). d. The mean causal variant enrichment for FinnGen GWAS (see Methods) within SCENT peaks (teal) or non-SCENT peaks with matching distance to TSS (pink) in each of the 9 single-cell datasets. The bars indicate 95% confidence intervals by bootstrapping traits (n_bootstrap=1000). e. The fraction of known genes from Mendelian autoimmune diseases among all the genes identified by SCENT, EpiMap, and the ABC model. The color of the bars indicates the cell types in each linking method.

Extended Data Fig. 8 Causal variant enrichment for GWAS and comparison with published bulk methods and single-cell methods.

a. Comparison of the mean causal variant enrichment for FinnGen GWAS (y-axis) among SCENT (teal), EpiMap (red), and ABC model (blue) as a function of the number of significant peak-gene pairs (x-axis) at each threshold of significance. The bars indicate 95% confidence intervals by bootstrapping traits (n_bootstrap=1000). b. We calculated the causal variant enrichment for FinnGen GWAS among SCENT (teal), EpiMap (reds), and ABC model (blues) by changing the PIP thresholds in defining putative causal variants from fine-mapping. The bars indicate 95% confidence intervals by bootstrapping traits (n_bootstrap=1000). c and d. The mean causal variant enrichment for GWAS within SCENT enhancers (teal), ArchR (pink) and Signac enhancers (purple). GWAS results were based on FinnGen (c) and UK Biobank (d) using the FDR < 10% threshold in each software and eight benchmarking datasets (see Methods). The bars indicate 95% confidence intervals by bootstrapping traits (n_bootstrap=1000).

Extended Data Fig. 9 SMAD3 locus in asthma GWAS.

The variant rs17293632 in asthma GWAS (a) was prioritized and connected to the SMAD3 gene by SCENT in myeloid cells (b). Panel a is a GWAS regional plot, with the x-axis representing the position of each genetic variant and the y-axis representing -log₁₀(P) from GWAS (a two-sided P value). The variant rs17293632 has a significant caQTL effect, as shown in c and d. Panel c presents the read coverage from single-cell ATAC-seq in each donor with a heterozygous genotype at this accessible region; at rs17293632, we observed allele-specific increased accessibility with the C allele compared with the T allele across donors. Panel d presents normalized chromatin accessibility, based on the read coverage for an individual after regressing out covariates, by the genotype of rs17293632 (CC, CT and TT). The horizontal bars within boxes indicate the median, and the lower and upper hinges represent the 25% and 75% quantiles. The upper whisker extends from the hinge to the largest value no further than 1.5 * inter-quartile range (IQR) from the hinge. The lower whisker extends from the hinge to the smallest value at most 1.5 * IQR from the hinge. All individual points are plotted as dots.
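The box plot conventions described for panel d can be made concrete with a short Python sketch. The function name `boxplot_stats` is hypothetical, and the quantiles use NumPy's default linear interpolation, which is an assumption and may differ slightly from the plotting software used for the figure.

```python
import numpy as np


def boxplot_stats(values):
    """Return median, hinges and whisker ends following the 1.5 * IQR rule."""
    x = np.asarray(values, dtype=float)
    q1, median, q3 = np.percentile(x, [25, 50, 75])  # lower hinge, median, upper hinge
    iqr = q3 - q1
    # Upper whisker: largest observation no further than 1.5 * IQR above the upper hinge.
    upper_whisker = x[x <= q3 + 1.5 * iqr].max()
    # Lower whisker: smallest observation no further than 1.5 * IQR below the lower hinge.
    lower_whisker = x[x >= q1 - 1.5 * iqr].min()
    return {
        "median": median,
        "lower_hinge": q1,
        "upper_hinge": q3,
        "lower_whisker": lower_whisker,
        "upper_whisker": upper_whisker,
    }
```

Observations beyond the whisker ends are not dropped in the figure, since all individual points are plotted as dots.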

Extended Data Fig. 10 Cells to be included in the regression framework.

a. An example situation of correlated gene expression without biological regulatory function. b. Benchmarking models for statistical power to define biologically plausible peak-gene linkage over false-associations due to correlated genes. c. Benchmarking results regarding cells and covariates included in the SCENT regression model. The x-axis represents the number of statistically significant peak-gene linkages among 5,000 randomly selected peak-gene linkages in cis, and the y-axis represents the number of statistically significant peak-gene linkages in cis divided by the number of statistically significant peak-gene linkages in trans among 5,000 randomly selected peak-gene linkages on different chromosomes, as a proxy metric for capability of identifying regulatory elements over ‘correlated’ elements. Red dots indicate the analyses conducted in all cells including different cell types (n = 8,881), whereas blue dots indicate the analyses conducted in only T cells (n = 8,881). d and e. False positive rate and precision for peak-gene linkages from analyses conducted in all cells (teal) or in only T cells (orange) by using experimentally validated enhancer-gene linkages (that is, CRISPR-Flow FISH data in d and H3K27ac data in e). False negative rate and precision were defined as follows: \(\text{false negative rate} = \#\,\text{false negative}/(\#\,\text{true positive} + \#\,\text{false negative}) = 1 - \text{recall}\).
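As a worked illustration of the metrics in panels d and e, the Python sketch below computes the false negative rate from the formula in the caption. The caption does not spell out the precision formula, so the standard definition (true positives divided by all validated predictions) is assumed here. The function name, the representation of linkages as (peak, gene) pairs, and the choice to ignore predictions without experimental validation are also assumptions.

```python
def fnr_and_precision(predicted, validated_positive, validated_negative):
    """Compare predicted peak-gene links against experimentally validated links.

    predicted: iterable of (peak, gene) pairs called significant.
    validated_positive / validated_negative: pairs with experimental evidence
    for or against regulation. Predictions in neither set are ignored.
    """
    predicted = set(predicted)
    positives = set(validated_positive)
    negatives = set(validated_negative)

    tp = len(predicted & positives)   # validated links that were predicted
    fn = len(positives - predicted)   # validated links that were missed
    fp = len(predicted & negatives)   # predicted links refuted by validation

    # false negative rate = FN / (TP + FN) = 1 - recall
    fnr = fn / (tp + fn) if (tp + fn) else float("nan")
    # precision = TP / (TP + FP), the standard definition (assumed here)
    precision = tp / (tp + fp) if (tp + fp) else float("nan")
    return fnr, precision
```

For example, with three predicted links of which one is a validated positive and one a validated negative, and with one validated positive missed, both the false negative rate and the precision equal 0.5.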

Supplementary information

Supplementary Information

Supplementary Notes 1–3 and Figs. 1–8.

Reporting Summary

Peer Review File

Supplementary Table 1

Supplementary Tables 1–9.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Sakaue, S., Weinand, K., Isaac, S. et al. Tissue-specific enhancer–gene maps from multimodal single-cell data identify causal disease alleles. Nat Genet 56, 615–626 (2024). https://doi.org/10.1038/s41588-024-01682-1

Received: 07 March 2023
Accepted: 07 February 2024
Published: 09 April 2024
Issue Date: April 2024
DOI: https://doi.org/10.1038/s41588-024-01682-1