Convergence of coronary artery disease genes onto endothelial cell programs

Abstract

Linking variants from genome-wide association studies (GWAS) to underlying mechanisms of disease remains a challenge1,2,3. For some diseases, a successful strategy has been to look for cases in which multiple GWAS loci contain genes that act in the same biological pathway1,2,3,4,5,6. However, our knowledge of which genes act in which pathways is incomplete, particularly for cell-type-specific pathways or understudied genes. Here we introduce a method to connect GWAS variants to functions. This method links variants to genes using epigenomics data, links genes to pathways de novo using Perturb-seq and integrates these data to identify convergence of GWAS loci onto pathways. We apply this approach to study the role of endothelial cells in genetic risk for coronary artery disease (CAD), and discover 43 CAD GWAS signals that converge on the cerebral cavernous malformation (CCM) signalling pathway. Two regulators of this pathway, CCM2 and TLNRD1, are each linked to a CAD risk variant, regulate other CAD risk genes and affect atheroprotective processes in endothelial cells. These results suggest a model whereby CAD risk is driven in part by the convergence of causal genes onto a particular transcriptional pathway in endothelial cells. They highlight shared genes between common and rare vascular diseases (CAD and CCM), and identify TLNRD1 as a new, previously uncharacterized member of the CCM signalling pathway. This approach will be widely useful for linking variants to functions for other common polygenic diseases.
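
As a rough illustration of the convergence idea described above (and not the authors' V2G2P implementation), the sketch below asks whether genes linked to GWAS signals are over-represented in a single gene program, using a one-sided hypergeometric test. The function names, the placeholder gene names (GENE0, GENE1, ...) and the set sizes are hypothetical; only CCM2 and TLNRD1 are taken from the text.

```python
from math import comb


def hypergeom_upper_tail(k, n_total, n_program, n_gwas):
    """P(X >= k) when n_gwas genes are drawn without replacement from
    n_total genes, of which n_program belong to the program."""
    max_k = min(n_program, n_gwas)
    denom = comb(n_total, n_gwas)
    numer = sum(
        comb(n_program, i) * comb(n_total - n_program, n_gwas - i)
        for i in range(k, max_k + 1)
    )
    return numer / denom


def program_enrichment(background, program_genes, gwas_linked_genes):
    """Return the overlapping genes and the enrichment P value.

    All three inputs are iterables of gene symbols; genes outside the
    background set are ignored.
    """
    background = set(background)
    program = set(program_genes) & background
    gwas = set(gwas_linked_genes) & background
    overlap = program & gwas
    p = hypergeom_upper_tail(len(overlap), len(background), len(program), len(gwas))
    return sorted(overlap), p


# Toy data (hypothetical): 20 background genes, a 4-gene program,
# and 5 genes linked to GWAS signals.
background = [f"GENE{i}" for i in range(18)] + ["CCM2", "TLNRD1"]
program = ["CCM2", "TLNRD1", "GENE0", "GENE1"]
gwas_linked = ["CCM2", "TLNRD1", "GENE0", "GENE5", "GENE6"]

overlap, p_value = program_enrichment(background, program, gwas_linked)
print(overlap, p_value)
```

A real analysis would also need to account for gene length, locus structure and multiple testing across programs, none of which this toy test addresses.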

Fig. 1: Building a map of gene programs in ECs using Perturb-seq.
Fig. 2: CAD genes converge on five programs in ECs.
Fig. 3: Regulatory connections among CAD genes in the CCM pathway.
Fig. 4: Linking CAD risk variants at 15q25.1 to TLNRD1.
Fig. 5: TLNRD1 interacts with CCM2 and phenocopies CCM2 in ECs and zebrafish.
Fig. 6: CCM2 and TLNRD1 knockdown mimic the atheroprotective effects of laminar flow in ECs.

Data availability

Raw and processed data for Perturb-seq, ATAC–seq, H3K27ac ChIP–seq and RNA-seq in teloHAECs were deposited into the NCBI’s GEO database under accession number GSE210523. This superseries is composed of the following subseries: GSE210489 (ATAC–seq), GSE210491 (ChIP–seq), GSE210522 (bulk RNA-seq of cytokine-treated parental lines and single guide CRISPRi knockdowns), GSE232400 (bulk RNA-seq of cells under flow, and MAP3K3 double knockdowns), GSE212396 (pilot scRNA-seq studies) and GSE210681 (comprehensive Perturb-seq). FASTQ files for Perturb-seq and TLNRD1 enhancer CRISPRi FlowFISH in teloHAECs are also available at the Impact of Genomic Variation on Function (IGVF) portal: Perturb-seq: https://data.igvf.org/analysis-sets/IGVFDS7943IIWZ/, and CRISPRi FlowFISH: https://data.igvf.org/analysis-sets/IGVFDS9053MHCS/. The following datasets were also used: CAD lead GWAS variants10,12 and phenome-wide association scan data12. GWAS summary statistics for other traits and finemapping analysis were from H. Finucane and J. Ulirsch’s analysis of UK Biobank data (https://www.finucanelab.org/data), and coding and splice site annotations were from the RefGene database (from the UCSC Genome Browser dated 24 June 2017)68. LD data from the 1000 Genomes European ancestry database were accessed using “plink --ld-window-kb 1000 --ld-window 99999 --ld-window-r2 0.9”. TF binding site information was from HOCOMOCO v.11 human full scan motifs (https://hocomoco11.autosome.ru/downloads_v11). Gene sets were from the MSigDB67. We also used scRNA-seq data from explanted human right coronary artery ECs69, endothelial cis-regulatory elements derived from snATAC-seq72, ENCODE datasets ENCSR000EVW (GATA2 ChIP–seq on HUVECs) and ENCSR000EOB (DNase-seq and DGF on HMVEC-dLy-Neo). We used UniProt for protein structure models for KRIT1 (O00522), CCM2 (Q9BSQ5), PDCD10 (Q9BUL8) and TLNRD1 (Q9H1K6). Previous prioritization calls for genes in CAD GWAS loci were from refs. 10,12,29,30,96,97,98 (details in the Methods).
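
The plink flags quoted above limit pairwise LD reporting to variant pairs within 1,000 kb and with r2 of at least 0.9. A minimal sketch of how such a report might be consumed downstream follows. It assumes the flags were combined with plink's `--r2` report and that the output is the usual whitespace-delimited `.ld` table with `SNP_A`, `SNP_B` and `R2` columns; the variant IDs, example values and function name are hypothetical.

```python
from collections import defaultdict


def ld_partners(ld_text, min_r2=0.9):
    """Map each variant to the set of variants in LD with it (r2 >= min_r2).

    ld_text is the content of a whitespace-delimited table whose header
    includes SNP_A, SNP_B and R2 columns (assumed plink .ld layout).
    """
    rows = [ln.split() for ln in ld_text.strip().splitlines() if ln.strip()]
    header, records = rows[0], rows[1:]
    a = header.index("SNP_A")
    b = header.index("SNP_B")
    r2 = header.index("R2")
    partners = defaultdict(set)
    for rec in records:
        if float(rec[r2]) >= min_r2:
            # LD is symmetric, so record the pair in both directions.
            partners[rec[a]].add(rec[b])
            partners[rec[b]].add(rec[a])
    return dict(partners)


# Hypothetical example table with made-up variant IDs and r2 values.
EXAMPLE = """
 CHR_A  BP_A  SNP_A  CHR_B  BP_B  SNP_B  R2
 15  1000  rsA  15  2000  rsB  0.95
 15  1000  rsA  15  9000  rsC  0.91
 15  2000  rsB  15  9000  rsC  0.42
"""

print(ld_partners(EXAMPLE))
```

Applying the threshold again in the parser is redundant when plink has already filtered on r2, but it keeps the function safe to reuse on unfiltered reports.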

Code availability

The V2G2P approach Snakemake pipeline is available at GitHub: https://github.com/EngreitzLab/V2G (or https://doi.org/10.5281/zenodo.10357646; ref. 99) for V2G, and https://github.com/EngreitzLab/cNMF_pipeline/ (or https://doi.org/10.5281/zenodo.10357454; ref. 100) for G2P and V2G2P enrichment. For data analysis, data were processed using all packages required for running cNMF (v.1.2)25. We additionally used R-3.6.1 and R-3.6.3 (ref. 101), edgeR 3.28.0 (refs. 58,62,64,65), seqLogo 1.52.0, MAST 1.12.0 (ref. 50), clusterProfiler 3.14.0 (refs. 66,102), org.Hs.eg.db 3.10.0, Seurat 3.0.2 (ref. 103), SeuratObject 4.0.0 (refs. 103,104,105), stats4 v3.6.1 and SingleCellExperiment 1.8.0 (ref. 106), for downstream analysis in R. The supporting packages in R used for data processing and figure generation were ggplot2 3.3.5 (ref. 107), ggpubr 0.4.0, ggrepel 0.9.1, gplots 3.1.1, gridExtra 2.3, scales 1.1.1, cowplot 1.1.1, dplyr 1.0.7, tidyr 1.1.3, textshape 1.7.1, reshape2 1.4.4, stringi 1.7.5 (ref. 108), conflicted 1.0.4, data.table 1.14.0, purrr 0.3.4, readxl 1.3.1, writexl 1.5.0, ramify 0.3.3, optparse 1.6.6, and all dependencies. Data were further processed using pysuspenders 0.2.6 (ref. 109), pysam 0.19.1 (ref. 110), Python 2.7.15, and LDSC (v.1.0.1)111. Other software packages used were kallisto 0.48.0 (ref. 56), limma 3.42.2 (ref. 57), Bowtie (v.1.3.0) and Bowtie2 (v.2.4.2)112, MACS2 2.2.7 (ref. 113), CellRanger 7.0.0 (ref. 114), FIMO (https://meme-suite.org/meme/meme_5.3.2/tools/fimo), Morpheus (https://software.broadinstitute.org/morpheus), IGV115, Open Targets98,116, plink (v.1.90b6.21)117, MAGMA4, S-LDSC28,70, PoPS5, FlowJo (v.10.8.1), FCSalyzer 0.9.22alpha (https://sourceforge.net/projects/fcsalyzer/), LocusZoom (http://locuszoom.org/)118, AlphaFold2.3 Multimer (v.3)81, UCSF ChimeraX (v.1.61), AlphaPickle82, Matplotlib (v.3.7.0), Seaborn (https://seaborn.pydata.org/) and Matlab R2018a (MathWorks). Phalloidin analysis was done using Fiji/ImageJ 2.9.0/1.53t with the LPX plugin. CRISPRi primer design used CRISPRDesigner (https://github.com/EngreitzLab/CRISPRDesigner). For data collection, the Opera Phenix imager (used for phalloidin stain analysis of teloHAECs) was run using Harmony 4.9.2137.273, Acapella 5.0.1.124082 and Oda 4.9.2137.273. The Zeiss confocal microscope LSM 880 (used for zebrafish imaging) was run using ZEN 2.3 SP1 software. The Sony MA900FP cell sorter (used for FlowFISH) was run using Cell Sorter Software v.3.2. Other instruments were used with the manufacturer-supplied intrinsic software, including EVOS microscopes, Applied Biosystems QuantStudio 5 (for RT–qPCR) and the ECIS Z-Theta instrument (Applied BioPhysics).

References

  1. Uffelmann, E. et al. Genome-wide association studies. Nat. Rev. Methods Primers 1, 59 (2021).

  2. Visscher, P. M. et al. 10 years of GWAS discovery: biology, function, and translation. Am. J. Hum. Genet. 101, 5–22 (2017).

  3. Claussnitzer, M. et al. A brief history of human disease genetics. Nature 577, 179–189 (2020).

  4. de Leeuw, C. A., Mooij, J. M., Heskes, T. & Posthuma, D. MAGMA: generalized gene-set analysis of GWAS data. PLoS Comput. Biol. 11, e1004219 (2015).

  5. Weeks, E. M. et al. Leveraging polygenic enrichments of gene features to predict genes underlying complex traits and diseases. Nat. Genet. 55, 1267–1276 (2023).

  6. Pers, T. H. et al. Biological interpretation of genome-wide association studies using predicted gene functions. Nat. Commun. 6, 5890 (2015).

  7. Westra, H.-J. & Franke, L. From genome to function by studying eQTLs. Biochim. Biophys. Acta 1842, 1896–1902 (2014).

  8. Maurano, M. T. et al. Systematic localization of common disease-associated variation in regulatory DNA. Science 337, 1190–1195 (2012).

  9. Nasser, J. et al. Genome-wide enhancer maps link risk variants to disease genes. Nature 593, 238–243 (2021).

  10. van der Harst, P. & Verweij, N. Identification of 64 novel genetic loci provides an expanded view on the genetic architecture of coronary artery disease. Circ. Res. 122, 433–443 (2018).

  11. Tcheandjieu, C. et al. Large-scale genome-wide association study of coronary artery disease in genetically diverse populations. Nat. Med. 28, 1679–1692 (2022).

  12. Aragam, K. G. et al. Discovery and systematic characterization of risk variants and genes for coronary artery disease in over a million participants. Nat. Genet. 54, 1803–1815 (2022).

  13. Gimbrone, M. A. Jr & García-Cardeña, G. Endothelial cell dysfunction and the pathobiology of atherosclerosis. Circ. Res. 118, 620–636 (2016).

  14. Gupta, R. M. et al. A genetic variant associated with five vascular diseases is a distal regulator of endothelin-1 gene expression. Cell 170, 522–533.e15 (2017).

  15. Turner, A. W. et al. Single-nucleus chromatin accessibility profiling highlights regulatory mechanisms of coronary artery disease risk. Nat. Genet. 54, 804–816 (2022).

  16. Pepin, M. E. & Gupta, R. The role of endothelial cells in atherosclerosis: insights from genetic association studies. Am. J. Pathol. https://doi.org/10.1016/j.ajpath.2023.09.012 (2023).

  17. Dixit, A. et al. Perturb-seq: dissecting molecular circuits with scalable single-cell RNA profiling of pooled genetic screens. Cell 167, 1853–1866.e17 (2016).

  18. Adamson, B. et al. A multiplexed single-cell CRISPR screening platform enables systematic dissection of the unfolded protein response. Cell 167, 1867–1882.e21 (2016).

  19. Replogle, J. M. et al. Mapping information-rich genotype–phenotype landscapes with genome-scale Perturb-seq. Cell 185, 2559–2575.e38 (2022).

  20. Datlinger, P. et al. Pooled CRISPR screening with single-cell transcriptome readout. Nat. Methods 14, 297–301 (2017).

  21. Bouïs, D., Hospers, G. A., Meijer, C., Molema, G. & Mulder, N. H. Endothelium in vitro: a review of human vascular endothelial cell lines for blood vessel-related research. Angiogenesis 4, 91–102 (2001).

  22. Fulco, C. P., Nasser, J., Jones, T. R. & Munson, G. Activity-by-contact model of enhancer–promoter regulation from thousands of CRISPR perturbations. Nat. Genet. 51, 1664–1669 (2019).

  23. Norman, T. M. et al. Exploring genetic interaction manifolds constructed from rich single-cell phenotypes. Science 365, 786–793 (2019).

  24. Morris, J. A. et al. Discovery of target genes and pathways at GWAS loci by pooled single-cell CRISPR screens. Science 380, eadh7699 (2023).

  25. Kotliar, D. et al. Identifying gene expression programs of cell-type identity and cellular activity with single-cell RNA-seq. eLife 8, e43803 (2019).

  26. Nichol, D. & Stuhlmann, H. EGFL7: a unique angiogenic signaling factor in vascular development and disease. Blood 119, 1345–1352 (2012).

  27. Brütsch, R. et al. Integrin cytoplasmic domain-associated protein-1 attenuates sprouting angiogenesis. Circ. Res. 107, 592–601 (2010).

  28. Finucane, H. K. et al. Partitioning heritability by functional annotation using genome-wide association summary statistics. Nat. Genet. 47, 1228–1235 (2015).

  29. Stolze, L. K. et al. Systems genetics in human endothelial cells identifies non-coding variants modifying enhancers, expression, and complex disease traits. Am. J. Hum. Genet. 106, 748–763 (2020).

  30. Wünnemann, F. et al. Multimodal CRISPR perturbations of GWAS loci associated with coronary artery disease in vascular endothelial cells. PLoS Genet. 19, e1010680 (2023).

  31. Stacey, D. et al. ProGeM: a framework for the prioritization of candidate causal genes at molecular quantitative trait loci. Nucleic Acids Res. 47, e3 (2019).

  32. Jagadeesh, K. A. et al. Identifying disease-critical cell types and cellular processes by integrating single-cell RNA-sequencing and human genetics. Nat. Genet. 54, 1479–1492 (2022).

  33. GTEx Consortium. The GTEx Consortium atlas of genetic regulatory effects across human tissues. Science 369, 1318–1330 (2020).

  34. Snellings, D. A. et al. Cerebral cavernous malformation: from mechanism to therapy. Circ. Res. 129, 195–215 (2021).

  35. Zhou, Z. et al. The cerebral cavernous malformation pathway controls cardiac development via regulation of endocardial MEKK3 signaling and KLF expression. Dev. Cell 32, 168–180 (2015).

  36. Riolo, G., Ricci, C. & Battistini, S. Molecular genetic features of cerebral cavernous malformations (CCM) patients: an overall view from genes to endothelial cells. Cells 10, 704 (2021).

  37. Gingras, A. R. et al. Central region of talin has a unique fold that binds vinculin and actin. J. Biol. Chem. 285, 29577–29587 (2010).

  38. Cowell, A. R. et al. Talin rod domain–containing protein 1 (TLNRD1) is a novel actin-bundling protein which promotes filopodia formation. J. Cell Biol. 220, e202005214 (2021).

  39. Luck, K. et al. A reference map of the human binary protein interactome. Nature 580, 402–408 (2020).

  40. Fisher, O. S. et al. Structural basis for the disruption of the cerebral cavernous malformations 2 (CCM2) interaction with Krev interaction trapped 1 (KRIT1) by disease-associated mutations. J. Biol. Chem. 290, 2842–2853 (2015).

  41. Draheim, K. M. et al. CCM2–CCM3 interaction stabilizes their protein expression and permits endothelial network formation. J. Cell Biol. 208, 987–1001 (2015).

  42. Zhou, Z. et al. Cerebral cavernous malformations arise from endothelial gain of MEKK3-KLF2/4 signalling. Nature 532, 122–126 (2016).

  43. Renz, M. et al. Regulation of β1 integrin–Klf2-mediated angiogenesis by CCM proteins. Dev. Cell 32, 181–190 (2015).

  44. Donat, S. et al. Heg1 and Ccm1/2 proteins control endocardial mechanosensitivity during zebrafish valvulogenesis. eLife 7, e28939 (2018).

  45. Khera, A. V. et al. Gene sequencing identifies perturbation in nitric oxide signaling as a nonlipid molecular subtype of coronary artery disease. Circ. Genom. Precis. Med. 15, e003598 (2022).

  46. Macek Jilkova, Z. et al. CCM proteins control endothelial β1 integrin dependent response to shear stress. Biol. Open 3, 1228–1235 (2014).

  47. Knowles, J. W. et al. Enhanced atherosclerosis and kidney dysfunction in eNOS–/–Apoe–/– mice are ameliorated by enalapril treatment. J. Clin. Invest. 105, 451–458 (2000).

  48. Mueller, P. A. et al. Coronary artery disease risk-associated Plpp3 gene and its product lipid phosphate phosphatase 3 regulate experimental atherosclerosis. Arterioscler. Thromb. Vasc. Biol. 39, 2261–2272 (2019).

  49. Denier, C. et al. Genotype–phenotype correlations in cerebral cavernous malformations patients. Ann. Neurol. 60, 550–556 (2006).

  50. Finak, G. et al. MAST: a flexible statistical framework for assessing transcriptional changes and characterizing heterogeneity in single-cell RNA sequencing data. Genome Biol. 16, 278 (2015).

  51. Whitehead, K. J. et al. The cerebral cavernous malformation signaling pathway promotes vascular integrity via Rho GTPases. Nat. Med. 15, 177–184 (2009).

  52. Zheng, X. et al. CCM3 signaling through sterile 20-like kinases plays an essential role during zebrafish cardiovascular development and cerebral cavernous malformations. J. Clin. Invest. 120, 2795–2804 (2010).

  53. Fulco, C. P., Munschauer, M., Anyoha, R. & Munson, G. Systematic mapping of functional enhancer–promoter connections with CRISPR interference. Science 354, 769–773 (2016).

  54. Thakore, P. I. et al. Highly specific epigenome editing by CRISPR–Cas9 repressors for silencing of distal regulatory elements. Nat. Methods 12, 1143–1149 (2015).

  55. Gilbert, L. A. et al. Genome-scale CRISPR-mediated control of gene repression and activation. Cell 159, 647–661 (2014).

  56. Bray, N. L., Pimentel, H., Melsted, P. & Pachter, L. Near-optimal probabilistic RNA-seq quantification. Nat. Biotechnol. 34, 525–527 (2016).

  57. Law, C. W., Chen, Y., Shi, W. & Smyth, G. K. voom: precision weights unlock linear model analysis tools for RNA-seq read counts. Genome Biol. 15, R29 (2014).

  58. Chen, Y., Lun, A. T. L. & Smyth, G. K. From reads to genes to pathways: differential expression analysis of RNA-seq experiments using Rsubread and the edgeR quasi-likelihood pipeline. F1000Research 5, 1438 (2016).

  59. Huang, H. et al. Fine-mapping inflammatory bowel disease loci to single-variant resolution. Nature 547, 173–178 (2017).

  60. Marshall, J. L. et al. HyPR-seq: single-cell quantification of chosen RNAs via hybridization and sequencing of DNA probes. Proc. Natl Acad. Sci. USA 117, 33404–33413 (2020).

  61. Hart, T. & Moffat, J. BAGEL: a computational framework for identifying essential genes from pooled library screens. BMC Bioinformatics 17, 164 (2016).

  62. Soneson, C. & Robinson, M. D. Bias, robustness and scalability in single-cell differential expression analysis. Nat. Methods 15, 255–261 (2018).

  63. Nygaard, V., Rødland, E. A. & Hovig, E. Methods that remove batch effects while retaining group differences may lead to exaggerated confidence in downstream analyses. Biostatistics 17, 29–39 (2016).

  64. Robinson, M. D., McCarthy, D. J. & Smyth, G. K. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26, 139–140 (2010).

  65. McCarthy, D. J., Chen, Y. & Smyth, G. K. Differential expression analysis of multifactor RNA-seq experiments with respect to biological variation. Nucleic Acids Res. 40, 4288–4297 (2012).

  66. Wu, T. et al. clusterProfiler 4.0: a universal enrichment tool for interpreting omics data. Innovation 2, 100141 (2021).

  67. Subramanian, A. et al. Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc. Natl Acad. Sci. USA 102, 15545–15550 (2005).

  68. Karolchik, D., Hinrichs, A. S. & James Kent, W. The UCSC genome browser. Curr. Protoc. Hum. Genet. 71, 18.6.1–18.6.33 (2011).

  69. Wirka, R. C. et al. Atheroprotective roles of smooth muscle cell phenotypic modulation and the TCF21 disease gene as revealed by single-cell analysis. Nat. Med. 25, 1280–1289 (2019).

  70. Finucane, H. K. et al. Heritability enrichment of specifically expressed genes identifies disease-relevant tissues and cell types. Nat. Genet. 50, 621–629 (2018).

  71. Dey, K. K. et al. Contribution of enhancer-driven and master-regulator genes to autoimmune disease revealed using functionally informed SNP-to-gene linking strategies. Cell Genomics 2, 100145 (2022).

  72. Zhang, K. et al. A single-cell atlas of chromatin accessibility in the human genome. Cell 184, 5985–6001.e19 (2021).

  73. Hujoel, M. L. A., Gazal, S., Hormozdiari, F., van de Geijn, B. & Price, A. L. Disease heritability enrichment of regulatory elements is concentrated in elements with ancient sequence age and conserved function across species. Am. J. Hum. Genet. 104, 611–624 (2019).

  74. Padarti, A. & Zhang, J. Recent advances in cerebral cavernous malformation research. Vessel Plus 2, 21 (2018).

  75. Wei, S. et al. Cerebral cavernous malformation proteins in barrier maintenance and regulation. Int. J. Mol. Sci. 21, 675 (2020).

  76. Fischer, A., Zalvide, J., Faurobert, E., Albiges-Rizo, C. & Tournier-Lasserve, E. Cerebral cavernous malformations: from CCM genes to endothelial cell homeostasis. Trends Mol. Med. 19, 302–308 (2013).

  77. Cullere, X., Plovie, E., Bennett, P. M., MacRae, C. A. & Mayadas, T. N. The cerebral cavernous malformation proteins CCM2L and CCM2 prevent the activation of the MAP kinase MEKK3. Proc. Natl Acad. Sci. USA 112, 14284–14289 (2015).

  78. Kleaveland, B. et al. Regulation of cardiovascular development and integrity by the heart of glass-cerebral cavernous malformation protein pathway. Nat. Med. 15, 169–176 (2009).

  79. Engreitz, J. M. et al. Local regulation of gene expression by lncRNA promoters, transcription and splicing. Nature 539, 452–455 (2016).

  80. Atri, D. S. et al. CRISPR–Cas9 genome editing of primary human vascular cells in vitro. Curr. Protoc. 1, e291 (2021).

  81. Jumper, J. et al. Highly accurate protein structure prediction with AlphaFold. Nature 596, 583–589 (2021).

  82. mattarnoldbio. mattarnoldbio/alphapickle: Release v.1.4.0. Zenodo https://doi.org/10.5281/zenodo.5708709 (2021).

  83. Yang, X. et al. A public genome-scale lentiviral expression library of human ORFs. Nat. Methods 8, 659–661 (2011).

  84. Bray, M.-A. et al. Cell Painting, a high-content image-based assay for morphological profiling using multiplexed fluorescent dyes. Nat. Protoc. 11, 1757–1774 (2016).

  85. Higaki, T. Quantitative evaluation of cytoskeletal organizations by microscopic image analysis. Plant Morphol. 29, 15–21 (2017).

  86. Kroll, F. et al. A simple and effective F0 knockout method for rapid screening of behaviour and other complex phenotypes. eLife 10, e59683 (2021).

  87. Lu, F., Leach, L. L. & Gross, J. M. A CRISPR–Cas9-mediated F0 screen to identify pro-regenerative genes in the zebrafish retinal pigment epithelium. Sci. Rep. 13, 3142 (2023).

  88. Moulton, J. D. & Yan, Y.-L. Using morpholinos to control gene expression. Curr. Protoc. Mol. Biol. Chapter 26, Unit 26.8 (2008).

  89. Hoeppner, L. H. et al. Revealing the role of phospholipase Cβ3 in the regulation of VEGF-induced vascular permeability. Blood 120, 2167–2173 (2012).

  90. Wang, Y. et al. Dissecting VEGF-induced acute versus chronic vascular hyperpermeability: essential roles of dimethylarginine dimethylaminohydrolase-1. iScience 24, 103189 (2021).

  91. Zebrafish embryo medium. Cold Spring Harb. Protoc. 2011, db.rec12478 (2011).

  92. Machikhin, A. S., Volkov, M. V., Burlakov, A. B., Khokhlov, D. D. & Potemkin, A. V. Blood vessel imaging at pre-larval stages of zebrafish embryonic development. Diagnostics 10, 886 (2020).

  93. Thisse, C. & Thisse, B. High-resolution in situ hybridization to whole-mount zebrafish embryos. Nat. Protoc. 3, 59–69 (2008).

  94. Sudlow, C. et al. UK Biobank: an open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLoS Med. 12, e1001779 (2015).

  95. Roadmap Epigenomics Consortium. Integrative analysis of 111 reference human epigenomes. Nature 518, 317–330 (2015).

  96. Hodonsky, C. J. et al. Multi-ancestry genetic analysis of gene regulation in coronary arteries prioritizes disease risk loci. Cell Genomics 4, 100465 (2024).

  97. Li, L. et al. Transcriptome-wide association study of coronary artery disease identifies novel susceptibility genes. Basic Res. Cardiol. 117, 6 (2022).

  98. Mountjoy, E. et al. An open approach to systematically prioritize causal variants and genes at all published human GWAS trait-associated loci. Nat. Genet. 53, 1527–1533 (2021).

  99. Ma, X. R. & Engreitz, J. M. EngreitzLab/V2G: V2Gv1.0.0. Zenodo https://doi.org/10.5281/zenodo.10357646 (2023).

  100. Kang, H., Guo, K. & Engreitz, J. M. EngreitzLab/cNMF_pipeline: v1.0. Zenodo https://doi.org/10.5281/zenodo.10357454 (2023).

  101. The R Development Core Team. R: A Language and Environment for Statistical Computing (R Foundation for Statistical Computing, 2022).

  102. Yu, G., Wang, L.-G., Han, Y. & He, Q.-Y. clusterProfiler: an R package for comparing biological themes among gene clusters. OMICS 16, 284–287 (2012).

  103. Stuart, T. et al. Comprehensive integration of single-cell data. Cell 177, 1888–1902.e21 (2019).

  104. Satija, R., Farrell, J. A., Gennert, D., Schier, A. F. & Regev, A. Spatial reconstruction of single-cell gene expression data. Nat. Biotechnol. 33, 495–502 (2015).

  105. Macosko, E. Z. et al. Highly parallel genome-wide expression profiling of individual cells using nanoliter droplets. Cell 161, 1202–1214 (2015).

  106. Amezquita, R. A. et al. Orchestrating single-cell analysis with Bioconductor. Nat. Methods 17, 137–145 (2020).

  107. Wickham, H. ggplot2 (Springer-Verlag, 2016).

  108. Gagolewski, M. stringi: fast and portable character string processing in R. J. Stat. Softw. 103, 1–59 (2022).

  109. Holt, J., Huang, S., McMillan, L. & Wang, W. Read annotation pipeline for high-throughput sequencing data. In Proc. International Conference on Bioinformatics, Computational Biology and Biomedical Informatics 605–612 (Association for Computing Machinery, 2013).

  110. Li, H. et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25, 2078–2079 (2009).

  111. Bulik-Sullivan, B. K. et al. LD score regression distinguishes confounding from polygenicity in genome-wide association studies. Nat. Genet. 47, 291–295 (2015).

  112. Langmead, B., Wilks, C., Antonescu, V. & Charles, R. Scaling read aligners to hundreds of threads on general-purpose processors. Bioinformatics 35, 421–432 (2019).

  113. Gaspar, J. M. Improved peak-calling with MACS2. Preprint at bioRxiv https://doi.org/10.1101/496521 (2018).

  114. Zheng, G. X. Y. et al. Massively parallel digital transcriptional profiling of single cells. Nat. Commun. 8, 14049 (2017).

  115. Robinson, J. T. et al. Integrative genomics viewer. Nat. Biotechnol. 29, 24–26 (2011).

  116. Ghoussaini, M. et al. Open Targets Genetics: systematic identification of trait-associated genes using large-scale genetics and functional genomics. Nucleic Acids Res. 49, D1311–D1320 (2021).

  117. Purcell, S. et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am. J. Hum. Genet. 81, 559–575 (2007).

  118. Boughton, A. P. et al. LocusZoom.js: interactive and embeddable visualization of genetic association study results. Bioinformatics 37, 3017–3018 (2021).

  119. Tsherniak, A. et al. Defining a cancer dependency map. Cell 170, 564–576.e16 (2017).

  120. Alcid, E. A. & Tsukiyama, T. ATP-dependent chromatin remodeling shapes the long noncoding RNA landscape. Genes Dev. 28, 2348–2360 (2014).

  121. SenBanerjee, S. et al. KLF2 is a novel transcriptional regulator of endothelial proinflammatory activation. J. Exp. Med. 199, 1305–1315 (2004).

  122. Coma, S. et al. GATA2 and LMO2 control angiogenesis and lymphangiogenesis via direct transcriptional regulation of neuropilin-2. Angiogenesis 16, 939–952 (2013).

  123. Yu, M. et al. Computational estimates of annular diameter reveal genetic determinants of mitral valve function and disease. JCI Insight 7, e146580 (2022).

  124. Hogan, B. M., Bussmann, J., Wolburg, H. & Schulte-Merker, S. ccm1 cell autonomously regulates endothelial cellular morphogenesis and vascular tubulogenesis in zebrafish. Hum. Mol. Genet. 17, 2424–2432 (2008).

  125. Neuman, N. A. et al. The four-and-a-half LIM domain protein 2 regulates vascular smooth muscle phenotype and vascular tone. J. Biol. Chem. 284, 13202–13212 (2009).

  126. Wang, W. et al. Essential role of Smad3 in angiotensin II-induced vascular fibrosis. Circ. Res. 98, 1032–1039 (2006).

  127. Tsai, S. et al. TGF-β through Smad3 signaling stimulates vascular smooth muscle cell proliferation and neointimal formation. Am. J. Physiol. Heart Circ. Physiol. 297, H540–H549 (2009).

  128. Crispino, J. D. & Weiss, M. J. Erythro-megakaryocytic transcription factors associated with hereditary anemia. Blood 123, 3080–3088 (2014).

  129. Gruber, T. A. & Downing, J. R. The biology of pediatric acute megakaryoblastic leukemia. Blood 126, 943–949 (2015).

  130. Hauser, W. et al. Megakaryocyte hyperplasia and enhanced agonist-induced platelet activation in vasodilator-stimulated phosphoprotein knockout mice. Proc. Natl Acad. Sci. USA 96, 8120–8125 (1999).

  131. Pleines, I. et al. Mutations in tropomyosin 4 underlie a rare form of human macrothrombocytopenia. J. Clin. Invest. 127, 814–829 (2017).

  132. Meinders, M. et al. Sp1/Sp3 transcription factors regulate hallmarks of megakaryocyte maturation and platelet formation and function. Blood 125, 1957–1967 (2015).

Acknowledgements

This work was supported by the Variant-to-Function Initiative at the Broad Institute (to R.M.G. and J.M.E.); NHLBI R01HL159176 and R01HL164811 (to J.M.E. and R.M.G.); the NHGRI Impact of Genomic Variation on Function Consortium (UM1HG011972 to J.M.E.); a NHGRI Genomic Innovator Award (R35HG011324 to J.M.E.); the Gordon and Betty Moore Foundation and the Basic Sciences and Engineering Initiative at the Lucile Packard Children’s Hospital at Stanford University (J.M.E.); a NIH Pathway to Independence Award (K99HG009917 and R00HG009917 to J.M.E.); the Novo Nordisk Foundation (NNF21SA0072102); the Harvard Society of Fellows (J.M.E.); a NIH New Innovator Award (DP2HL152423 to R.M.G.); NHLBI U01HL166060 (R.M.G.); a Khoury Innovation Award and Braunwald Scholar Award (R.M.G.); the Broad Institute (E.S.L.); NIH HL70567 (D.M.); a Florida Department of Health Cancer Research Chair’s Fund Grant (3J-02 to D.M.); and K08DK129824 (M.S.T.). We thank members of the Engreitz and Gupta research groups for discussions and technical assistance; J. Bischoff and S. Nasim (Harvard Medical School) for assistance with the TEER assay; J. Keaney and H.-J. Yoo for assistance with the Ibidi shear stress assay; K. Lee, M. Velimirovic and T. van Eeuwen for assistance with computation and visualization of modelling data; and staff at the Harvard O2 cluster for computational resources.

Author information

Contributions

G.R.S. developed and implemented the systematic Perturb-seq method. G.R.S., B.C., E.S.L., R.M.G. and J.M.E. designed Perturb-seq experiments. H.K., G.R.S., X.R.M., T.Z., S.K.V., A.B., H.K.F. and J.M.E. developed and implemented analysis methods for Perturb-seq data. G.R.S., O.S-G., V.S.L-K. and S.K.V. conducted and analysed cell imaging experiments. G.R.S., S.F., V.S.L-K., A.E.B. and R.Z. conducted additional assays in ECs. M.S.T. performed AlphaFold2 modelling. S.H. contributed to interpreting AlphaFold2 modelling data. G.R.S. and S.F. performed the co-IP experiments. G.R.S., S.F., D.T.B. and T.H.N. conducted bulk RNA-seq, ATAC–seq and ChIP–seq experiments. G.R.S. and K.G. analysed bulk RNA-seq, ATAC–seq and ChIP–seq data. P.G., S.F. and G.M. created plasmids. N.C. and H.K.F. contributed to the PoPS analysis. K.A. provided GWAS data. R.S.A. and D.M. performed the zebrafish experiments. R.M.G. and J.M.E. supervised the work. All authors contributed to writing the manuscript.

Corresponding authors

Correspondence to Rajat M. Gupta or Jesse M. Engreitz.

Ethics declarations

Competing interests

J.M.E. is a shareholder of Illumina and 10X Genomics, has received materials from 10X Genomics unrelated to this work, is an equity holder in and consultant for Martingale Labs, and has received guest speaker honoraria from GSK. M.S.T. holds equity and has received consulting fees from ROME Therapeutics, which is not related to this work. G.R.S., R.M.G., J.M.E., H.K. and X.R.M. are inventors on a provisional patent related to this work. All other authors declare no competing interests.

Peer review

Peer review information

Nature thanks Lea Starita and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data figures and tables

Extended Data Fig. 1 Establishing the TeloHAEC CRISPRi model and Perturb-seq details.

a, Enrichment of CAD heritability in TeloHAEC enhancers, from Stratified Linkage Disequilibrium Score Regression analysis (S-LDSC, see Methods), where enrichment is the percentage of heritability explained by variants in enhancers (%heritability), divided by the percentage of variants in enhancers (%SNPs). Enhancers in TeloHAEC (treated under the indicated conditions) were identified from ATAC-seq and H3K27ac ChIP-seq data (n = 6 for control ATAC, 3 for IL-1β, TNFα or VEGF ATAC, 4 for control ChIP, and 2 for IL-1β, TNFα or VEGF ChIP) by the Activity-by-Contact model. Error bars: standard error around the enrichment estimate, calculated by S-LDSC using jackknife (which resamples the data used for calculating heritability enrichment). P-values were calculated using the S-LDSC method (ref. 28), and FDR by the Benjamini-Hochberg method. *: FDR < 0.05, with specific FDR values of: Ctrl, 0.037; IL-1β, 0.015; TNFα, 0.020; and VEGF, 0.041. Full S-LDSC results can be found in Supplementary Table 27. b, Scatter density plot of human right coronary artery endothelial cell (RCAEC) single-cell RNA-seq pseudobulk gene expression (from ref. 69) versus TeloHAEC pseudobulk gene expression, for genes perturbed in this study. 2,107 of the 2,285 perturbed genes are expressed at TPM > 1 in healthy or diseased RCAECs. R and p-values from two-sided Pearson correlation test. c, As in b, but for the 41 V2G2P genes. d, Heatmap of gene expression (log10 TPM) of the 41 V2G2P genes in diseased RCAECs and in TeloHAEC. All genes but one, FBN2, are expressed at > 1 TPM in RCAECs. e, FACS showing dox inducibility of KRAB-dCas9-IRES-BFP in TeloHAEC, after sorting but before the screen. Left panels: gating for viable individual cells. Right panels: counts of gated cells by fluorescence intensity in the BFP/PB450 channel. f, BFP channel counts of cells grown in parallel and concurrently with cells for the Perturb-seq screen. 
After expansion to 120 M cells, transduction, selection and 5-day doxycycline treatment, 92% of cells remain BFP positive. g, Cumulative distribution fraction for duplication levels of unique CBC-UMI-Guide combinations in deeply-sequenced dialout libraries (“unique UMIs”, red) or all guide reads (blue) versus duplication level. Requiring four duplicates (dotted line) eliminates 90% of CBC-UMI-guide combinations (likely PCR chimaeras), while retaining >85% of total guide reads. h, UMIs for top guide per CBC. Arrow: the chosen 4 UMI threshold. i, Counts of singlets (1 gRNA, black bar), doublets (2) and higher multimers, as well as cells with no guide called (0), at the chosen thresholds of 4 UMIs for the top guide and 4 or more fold fewer for the next most frequent guide. j, Histogram of counts of singlet cells per target. Dotted line: average. k, As in j, but for singlet cells per guide. l, Read UMI counts for all transcripts per cell by singlet/multiplet status. The median UMI count for doublets was 37% more than singlets. Assuming that droplets with two cells will have double the number of reads as singlets, this suggests 37% of doublets are due to two cells (9.3% of cells with guides) while the remainder (15.7% of cells with guides) are due to two guides in one cell, very close to the expectation from the infection MOI of 15%. n = 352686, 214449, 79744, 19195 and 5345 cells with 0, 1, 2, 3, or 4 guides, respectively. Boxplot centre line, median; box limits, upper and lower quartiles; whiskers, 1.5x interquartile range; points, outliers. m, Distribution of knockdown efficiency across target genes (log2 expression in cells containing guideRNAs targeting the gene versus in cells containing negative control guideRNAs). Grey line: all targeted genes. Yellow and red lines: genes expressed at >30 and >300 TPM, respectively. Red dotted vertical line: 40% knockdown (average for 300 + TPM target genes). 
n, Distribution of fitness effects across all guideRNAs (log2 ratio of guide frequency in singlet cells from the Perturb-seq experiment after 5 days of CRISPRi induction compared to guide frequency in the original guideRNA library). Guides targeting common essential genes (red) were depleted more frequently than guideRNAs targeting other genes. o, Relative count frequencies for the number of nominally significant differentially expressed (DE) genes per perturbed target (log2 of genes with raw p < 0.01, and fold change >1.15 from EdgeR DE analysis), for the indicated subclasses of targets. p, Volcano plot showing log2 (# DE genes for target)/(avg. # DE genes for non-expressed controls) versus -log10 FDR (capped at 100). Significance was assessed by two-sided binomial test versus DE gene counts for the 48 perturbed non-expressed control genes. Right: symbols for target genes with the strongest effects. 10.7% of all targets had a significant effect on transcription (FDR < 0.05, increased DE gene count), including 31.9% of common essential genes, and 9.0% of other genes. q, Percent of perturbations that have a significant transcriptional effect in Perturb-seq, as defined by either (i) “DE Genes”, as per p, or (ii) “DE Programs”: perturbations that lead to significant changes in program expression by the MAST package (ref. 50) with 10X lane correction (FDR < 0.05), by each indicated class: Permuted Controls (statistical tests performed on randomly drawn cells with negative control or safe-targeting guides), Expressed (≥1 TPM in TeloHAEC bulk RNA-seq), Low or No Expression (<1 TPM), Common essential (as identified in DepMap, ref. 119), TeloHAEC Proliferation (showing fitness effects, as per n, of +/−15%, FDR < 0.05), Gene near CAD GWAS signals (expressed genes near any CAD GWAS signal, see Methods: Defining variants in CAD GWAS signals), Gene near IBD signals (perturbed expressed genes near 10 selected IBD GWAS signals, with no genes overlapping those for CAD signals).
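
The doublet decomposition described in panel l can be checked with a few lines of arithmetic. The sketch below uses only the cell counts and the 37% figure quoted in the legend. The function and variable names are illustrative, and taking "cells with guides" to mean all cells with one to four guides called is an assumption rather than something the legend states.

```python
# Cell counts by number of guides called, as quoted in panel l.
cells_by_guides = {0: 352686, 1: 214449, 2: 79744, 3: 19195, 4: 5345}


def doublet_breakdown(counts, excess_umi_fraction):
    """Split guide doublets into 'two cells' and 'two guides in one cell'.

    excess_umi_fraction is how much higher the median UMI count is in
    doublets than in singlets (0.37 in the legend). If a two-cell droplet
    carries twice the UMIs of a singlet, this excess approximates the
    share of doublets that contain two cells.
    """
    # Assumption: the denominator is every cell with at least one guide.
    cells_with_guides = sum(n for guides, n in counts.items() if guides >= 1)
    doublet_fraction = counts[2] / cells_with_guides
    two_cells = excess_umi_fraction * doublet_fraction
    two_guides = doublet_fraction - two_cells
    return doublet_fraction, two_cells, two_guides


doublets, two_cells, two_guides = doublet_breakdown(cells_by_guides, 0.37)
print(f"doublets: {doublets:.3f}, two cells: {two_cells:.3f}, two guides: {two_guides:.3f}")
```

Under these assumptions the two components come out close to the 9.3% and 15.7% quoted in the legend; small differences reflect rounding of the intermediate values.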

Extended Data Fig. 2 QC metrics for single cells, and selection of number of components for cNMF.

a, UMAPs showing number of UMIs per cell (left), percent ribosomal genes detected per cell (middle), percent mitochondrial genes detected per cell (right). b, UMAPs showing cells from each of the twenty 10X lanes. The differences in clustering along the UMAP_2 axis indicate a technical batch effect between 10X lanes. c, Cumulative distribution function (CDF) plot of the maximum absolute value of Pearson correlation between cNMF component expression in cells and batch. Dotted line: the R ≥ 0.15 threshold used to call programs associated with batch. d, Gene set enrichment analysis for GO terms among co-regulated genes, as a function of the number of components in the cNMF model (K). y-axis: the number of unique GO terms enriched across all programs for a given K. e, Number of unique motifs enriched among the promoters (top) or enhancers (bottom) of co-regulated genes across all components, as a function of K. f, Number of unique perturbations that have a significant effect (FDR < 0.05) on one or more programs, as a function of K. g, Model-based evaluation of the choice of K. Stability of the components over 100 NMF runs (top) and element-wise square of error (bottom, see Methods). h, Quantile-quantile plot for effects of perturbations on program expression. X-axis: expected uniform distribution. Y-axis: –log10 p-value computed with the MAST package (ref. 50). Red: p-value < 0.05.
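
The batch filter in panel c can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes that "batch" is encoded as one indicator vector per 10X lane and that a component is flagged when the largest absolute Pearson correlation across lanes reaches the 0.15 threshold. All function names and the toy data are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def max_abs_batch_correlation(expression, lanes):
    """Largest |R| between per-cell component expression and any lane indicator."""
    return max(
        abs(pearson(expression, [1.0 if lane == target else 0.0 for lane in lanes]))
        for target in sorted(set(lanes))
    )


def is_batch_associated(expression, lanes, threshold=0.15):
    """Flag a component whose expression tracks lane membership."""
    return max_abs_batch_correlation(expression, lanes) >= threshold


# Toy example: expression differs sharply between two hypothetical lanes.
toy_expression = [5.0, 5.2, 4.9, 1.0, 1.1, 0.9]
toy_lanes = ["A", "A", "A", "B", "B", "B"]
print(is_batch_associated(toy_expression, toy_lanes))
```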

Extended Data Fig. 3 Catalogue of gene programs.

a, Correlation heatmap of cNMF components. Colour: Pearson’s correlation of log2 fold-change in component expression across all perturbed genes. b, 50 programs ordered by variance explained (see Methods). c, 50 programs ordered by endothelial-cell specificity score — that is, the degree to which the co-regulated genes in the program are specifically expressed in endothelial cells versus in other cell types from FANTOM5 CAGE data (see Methods). Red line: z-score corresponding to top 10% of genes most specifically expressed in endothelial cells. d, Effects of selected regulators on the 13 endothelial-cell-specific programs. Heatmap: log2 fold-change in component expression in perturbation versus control. Top: 16 regulators shared between multiple endothelial cell-specific programs. Bottom: the 4 significant regulators (experiment-wide FDR < 0.05) per program with the most specific effects on that program relative to other endothelial-cell-specific programs. e, Programs ordered by number of enriched transcription factor motifs (see Methods). Grey: promoters. Blue: enhancers. Some programs only have enrichment for motifs in promoters. Some programs showed enrichment of distinct motifs in enhancers versus promoters, such as Program 47 (Angiogenesis, GATA2), with promoter enrichment in WT1 and EGR2 motifs, and enhancer enrichment in GATA2 and PRDM6 motifs. 
Among the programs with few or no enriched transcription factor motifs, we identified other likely proximal regulatory mechanisms: Program 17 expressed genes whose promoters were marked by H3K27me3 in endothelial cells (see also panel k), and the most significant regulator of this program was SUZ12, a component of the complex (PRC2) that writes this histone modification; and Program 16 pointed to a potential RNA surveillance program, since 40% of its program genes were noncoding RNAs (panel l), and its regulators included a component of the RNA exosome (EXOSC5) and the chromatin remodeller INO80E, which has previously been shown to regulate a subset of noncoding transcripts in yeast (ref. 120; see also Supplementary Table 12). f, Annotations for an example program: 15. Left: top 10 program co-regulated genes. Middle, top: motifs enriched in promoters of the 300 program co-regulated genes. Middle, bottom: Gene Ontology terms enriched in the 300 program co-regulated genes. Right: volcano plot of the effects of regulators on cNMF component 15 genes. Program 15 (Flow response, KLF2) appeared to correspond to a canonical endothelial cell response to laminar shear stress defined by the known flow-responsive transcription factor KLF2: the program was highly enriched for KLF motifs in promoters; included known flow-responsive genes such as KRT18/19, NOS3, and KLF2 itself; and was significantly reduced by perturbations to MAP2K5 (MEK5), a kinase known to activate the signalling pathway upstream of KLF2 (refs. 35,121). g, Log2 fold change in expression of programs 28 versus 47 for each perturbed gene relative to controls. Program 28 (Tip cell, migration) includes co-regulated genes that mark tip cell specification during sprouting angiogenesis (ESM1, RHOC, PLAUR), and Program 47 (Angiogenesis, GATA) includes co-regulated genes that are enriched in GATA2 & TAL1 motifs and that include NRP2, a co-receptor for VEGF-A, previously shown to act downstream of GATA2 (ref. 122). 
Blue, red, and purple mark genes that are regulators of Program 28, Program 47, or both programs, respectively. Note that regulators that affect both programs do so in opposite directions. h, Perturbations ordered by the number of regulated programs. Red: top 10 perturbed genes. i, Programs ordered by the number of regulators. Blue: endothelial-cell-specific programs. The top 3 programs, by number of regulators, are labelled. j, 131 perturbed genes that are regulators of at least one endothelial-cell-specific program, ordered by the number of such programs that they regulate. Top 10 regulators are labelled, and include genes known to have important functions in ECs such as EGFL7 and ITGB1BP1/ICAP1 (refs. 26,27). k, Average H3K27me3 ChIP-seq signal in co-regulated gene promoters. The top program is Program 17 (Polycomb targets). See legend to e for more details. N = 50 programs. Boxplot centre line, median; box limits, upper and lower quartiles; whiskers, 1.5x interquartile range; points, all data points. l, Percent of noncoding RNA genes in program co-regulated genes. The top program is Program 16 (ncRNA & antisense RNAs). See legend to e for more details. N = 50 programs. Boxplot centre line, median; box limits, upper and lower quartiles; whiskers, 1.5x interquartile range; points, all data points. m, Same as Fig. 1d, but with transcription factor motif sequence logo. The enriched motifs have distinct sequence logos.

Extended Data Fig. 4 Annotations for CAD-associated programs: 8, 35, 39, 47, 48.

Left panels, top 10 program co-regulated genes. Program Specificity z-scores are the cNMF marker gene coefficients, indicating how specific this gene is to this program, relative to other programs (see Methods). Middle left panels, top: top 5 motifs enriched in the promoters or enhancers of the program co-regulated genes; bottom: top 5 GO terms enriched in program co-regulated genes. Middle right panels, regulators of the program. Volcano plot shows effects of all perturbed genes on program expression. Red: FDR < 0.05. Labelled: top two significant regulators in each direction, plus CCM2 and TLNRD1. Right panels, UMAP of program expression in a subset of cells (24,000, randomly selected).

Extended Data Fig. 5 Prioritization of CAD-associated programs and candidate CAD genes.

a, Using MAGMA to prioritize gene programs enriched for CAD heritability (linking variants to program genes and 50 kb of flanking sequence, see Methods). Barplots show beta regression coefficient (left) and –log10 FDR (Benjamini-Hochberg adjusted enrichment p-value, right). Programs are ordered separately by beta or FDR value. Dotted line: FDR = 0.05. b, Using S-LDSC to prioritize gene programs enriched for CAD heritability (linking variants in endothelial cell chromatin accessible regions to genes within 50 kb, see Methods). Barplots show enrichment (left) and –log10 FDR (Benjamini-Hochberg adjusted enrichment p-value, right). N = 300 (co-regulated program genes ranked by z-score coefficient, for each program). Error bars: standard error around the enrichment estimate, calculated by S-LDSC using jackknife (which resamples the data used for calculating heritability enrichment). P-values were calculated using the S-LDSC method (ref. 28), and FDR by the Benjamini-Hochberg method. *: FDR < 0.05. Dotted lines: 1-fold enrichment (left), or FDR 0.05 (right). c, CAD-associated V2G2P genes are ranked highly by an independent gene prioritization method, the Polygenic priority score (PoPS). For each of the 43 CAD GWAS signals including a CAD-associated V2G2P gene, we ranked nearby genes based on their PoPS scores. Red: 39 CAD-associated V2G2P genes (two genes, EXOC3L2 and PECAM1, were not assigned scores by PoPS). Grey: all other nearby genes. p-value: two-sided Mann-Whitney U-test. d, Contingency table of PoPS and distance-to-TSS ranks for the 39 CAD-associated V2G2P genes (two CAD-associated V2G2P genes were not assigned scores by PoPS). 
e, Odds ratios of variants in lipid-associated (N = 1,181) or non-lipid-associated (N = 3,313) CAD GWAS signals in (i) ATAC peaks in endothelial cells (N = 373,630 unique non-overlapping non-promoter features from 11 epigenomic datasets in ECs, see Methods), (ii) ABC enhancers in endothelial cells (N = 47,112 unique non-overlapping non-promoter features from 11 epigenomic datasets in ECs), (iii) coding sequences (N = 189,232 unique non-overlapping non-promoter features), or (iv) all three categories combined (N = 519,046 unique non-overlapping non-promoter features), compared to background variants (all SNPs from 1000 Genomes, excluding lipid-associated or non-lipid-associated CAD GWAS variants, N = 9,955,208 or N = 9,953,076, respectively; see Methods). Odds ratios were calculated as ((CAD variants within the indicated genomic features)/(all background variants within these features))/((CAD variants outside of these features)/(all background variants outside of these features)), and significance was assessed by applying a two-sided Fisher’s exact test to the contingency table of these data, with columns = CAD variants versus background variants and rows = inside features versus outside features. Error bars: 95% confidence interval. *: FDR < 0.05. Specific FDR values, from top to bottom, were 1.1e-4, 3.3e-33, 1.5e-8, 3.2e-6, 0.39, 6.0e-32, 0.011, 7.5e-31. Dotted line: odds ratio of 1. f, sc-linker prioritization for 60 EC Perturb-seq gene programs, ranked by z-score. The ranking of programs was similar to V2G2P analysis, but none of the programs reached significance. g, Precision/Recall (PR) plot for V2G2P and seven prior approaches to prioritize CAD locus genes. Recall: the fraction of the eight “gold standard” genes (with strong prior evidence for endothelial cell-specific roles in CAD) detected by each method. Precision: [number of “gold standard” genes called]/[number of genes called within these gold standard loci]. Red: V2G2P. 
Blue: other studies that prioritized CAD GWAS genes in endothelial cells.
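
The odds ratio defined in panel e can be written out directly from the four cells of the contingency table. The sketch below is a minimal illustration: the function name is hypothetical, and the counts in the example are invented for demonstration and are not taken from the study.

```python
def odds_ratio(cad_inside, background_inside, cad_outside, background_outside):
    """Odds ratio as defined in panel e.

    (CAD variants inside features / background variants inside features)
    divided by
    (CAD variants outside features / background variants outside features).
    """
    inside_ratio = cad_inside / background_inside
    outside_ratio = cad_outside / background_outside
    return inside_ratio / outside_ratio


# Hypothetical counts, for illustration only.
example = odds_ratio(
    cad_inside=30,
    background_inside=1000,
    cad_outside=70,
    background_outside=9000,
)
print(round(example, 3))
```

A value above 1 means that CAD variants fall inside the features more often than background variants do. A two-sided Fisher's exact test on the same 2×2 table could be run with, for example, `scipy.stats.fisher_exact`; that step is left out here to keep the sketch free of dependencies.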

Extended Data Fig. 6 Details for V2G2P analysis.

a, Number of genes with V2G links, per non-lipid CAD GWAS signal. b, Number of genes with G2P links, per non-lipid CAD GWAS signal. c, The cell-type specificity of V2G links appeared to be important for identifying endothelial-cell-specific programs. Here, we repeated the V2G2P analysis (as outlined in Fig. 2), but linked variants to genes using cell-type-agnostic criteria (including ABC scores from any cell type and not just endothelial cells). The 50 programs are ordered (y-axis) by the number of program genes linked to CAD variants (x-axis). Grey dashed line: the number of genes linked to CAD variants that would be expected by chance. Orange labels: endothelial-cell-specific programs. Two significant non-endothelial-cell-specific programs were identified. Fisher exact test with FDR correction, as outlined in panel g (*FDR < 0.05, **FDR < 0.005, ***FDR < 5e-4). d, Number of overlapping co-regulated genes between control programs (ctrl, x-axis) and full library (2 kG, y-axis) programs. e, Using the full Perturb-seq dataset appeared to be important for identifying the five CAD-associated programs. Programs shown here were discovered through cNMF analysis of only the “unperturbed” cells carrying negative control guideRNAs. In this version of the analysis, none of the programs is enriched for genes with V2G links. Fisher exact test with FDR correction, see procedure in panel g (all programs have FDR > 0.05). f, Enrichment of genes with V2G links, from the full library but only using co-regulated genes (not regulators; ***: FDR < 0.0005, **: FDR < 0.005). g, Steps of V2G2P enrichment test. Numbers shown as examples are from program 8. h, V2G2P analysis prioritizes a small subset of genes and GWAS signals compared to either V2G or G2P information alone. 
Barplots: counts for signals (left) or nearby genes (right), total (“All”) or those that have: a V2G link, a G2P link, both a V2G link and G2P link to any program, or both a V2G link and a G2P link to a significantly enriched V2G2P program.

Extended Data Fig. 7 Variant-to-gene to program links refine causal gene predictions.

a, V2G2P evidence at the 20p13.1 CAD GWAS locus. Top, heatmap lists genes within 1 Mb of the CAD GWAS signal in genomic order, and shows variant-to-gene (V2G) and gene-to-pathway (G2P) evidence, with the prioritized CAD-associated V2G2P gene(s) labelled in red bold font. Legend details: “ABC, top 2”: a noncoding variant overlaps a chromatin accessible peak in endothelial cells, and the ABC score is at least the second highest of all genes near the GWAS signal. “Distance to variant, top 2”: a noncoding variant overlaps a chromatin accessible peak in endothelial cells, and the gene is one of the two closest genes to the variant. Bottom, zoom-in on genes near the CAD GWAS signal, where rs2004772 is predicted by ABC to regulate PREX1 in the Eahy926 endothelial cell line treated with VEGF. Red dot: prioritized variant in predicted enhancer. Grey dots: other variants within R2 < 0.9 of the lead variant in the locus. Signal tracks below show ATAC-seq or DNase-seq for endothelial and coronary artery smooth muscle cells (CaSMCs, another cell type relevant to CAD). b, As per a, showing V2G2P evidence at the 10p24.33 CAD GWAS signal, where three genes had V2G links (to an enhancer containing rs4918069) and two had links to CAD-associated programs. HUVEC: human umbilical vein endothelial cells. c, V2G2P evidence at the 17q21.3 CAD GWAS locus, where we have previously linked rs17608766 to GOSR2 (ref. 123). Heatmap, as in panel a. Middle: box plot of GOSR2 reads per cell, normalized to control cell average. n: number of cells. Dotted line: control average (100%). Boxplot centre line, median; box limits, upper and lower quartiles; whiskers, 1.5x interquartile range; points, outliers. Counts for outliers, from top to bottom: GOSR2, 1 & 17; Control, 4 & 15. Right: volcano plot shows effect of GOSR2 knockdown in Perturb-seq on the expression of the 50 non-batch gene programs. 
Red: FDR < 0.05, from two-sided statistical tests on program expression between perturbed versus control cells by the MAST package (ref. 50).

Extended Data Fig. 8 Regulatory connections amongst perturbed genes and the CCM pathway, and variant-to-gene links to TLNRD1.

a, Locus zoom plot (http://locuszoom.org/) for CAD GWAS in a 1-Mb region around CCM2. P-values are from the joint association analysis in ref. 12. b, Volcano plot showing effect of CCM2 knockdown in Perturb-seq on the expression of the 50 programs. Red: FDR < 0.05. Significance was assessed by two-sided statistical test on program expression between perturbed versus control cells by the MAST package (ref. 50). c, Effects of selected perturbed genes on CAD-associated programs (same as Fig. 3b, with significant effects marked with a * (FDR < 0.05)). Colour scale: log2 fold-change on program expression in Perturb-seq. Bold text: CAD-associated V2G2P genes. d, Knockdown efficiency for genes in the CCM pathway, in bulk RNA-seq (Fig. 3c). x-axis: gene expression for each target gene in cells receiving target guides, versus cells with control guides. y-axis: guide IDs for CCM pathway genes. e, 15q25.1 CAD risk locus, where rs1879454 is predicted to regulate TLNRD1 (red arc). GWAS variants: –log10 GWAS P-value (ref. 12) for variants with LD R2 > 0.9 with the lead SNP. Green signal tracks: epigenomic data from ECs. Grey signal tracks: data from other cell types. HUVEC: human umbilical vein ECs. CaSMCs: coronary artery smooth muscle cells. f, Volcano plot showing the effect of TLNRD1 knockdown in Perturb-seq on the expression of the 50 programs. Red: FDR < 0.05. Statistical test as per b. g, Phenome-wide association study (PheWAS) for rs1879454, from Open Targets (refs. 98,116). P-values are the best GWAS p-values for association of this variant with each trait across all GWAS sources used by Open Targets (February 25, 2022 release), including UKBB. Orange line: p-value required for significance across all traits. No lipid measure met the p-value threshold of 0.005 for inclusion in the plot. Note: the p-value for CAD is higher than that observed in ref. 12 because Open Targets does not currently contain summary statistics for the latest CAD GWAS. 
There were no measures of circulating lipids or blood pressure associated with this GWAS signal in a PheWAS analysis in ref. 12. h, Variant-to-gene-to-program evidence at the 15q25.1 CAD GWAS locus. Heatmap shows variant-to-gene (V2G) and gene-to-pathway (G2P) evidence for all genes within 1 Mb of the CAD GWAS signal, in genomic order, with the CAD-associated V2G2P gene labelled in red bold font. Legend details: “ABC, top 2”: a noncoding variant overlaps a predicted enhancer linked to this gene in endothelial cells, and ABC score is at least the second highest of all genes in the locus. “Distance to variant, top 2”: a noncoding variant overlaps a chromatin accessible peak near this gene in endothelial cells, and the gene is one of the two closest genes to the peak. “CAD-associated gene programs” are the five V2G2P programs: 8, 35, 39, 47, 48. i, Histograms of FlowFISH signal (arbitrary units of fluorescence) for RPL13A (left) and TLNRD1 (right) in unstained versus stained teloHAEC expressing the gRNA pool against promoter and potential enhancers. The complete FlowFISH data can be found in Supplementary Table 28. j, As in d, but showing the RPL13A (left) and TLNRD1 (right) signal after sorting of cells into four bins based on expression of TLNRD1. Results are typical of cells across the four independent samples. k, Scatter plots showing the strong correlation between effects on TLNRD1 gene expression for all E-G pairs measured in each of four independent CRISPRi-FlowFISH screens. The enhancer containing rs1879454 is coloured in red, all others in grey. R: Pearson correlation coefficient.

Extended Data Fig. 9 TLNRD1 interaction with the CCM complex depends on the CCM2 C-terminal helix.

a, The 15 top-ranking AlphaFold2 models of the CCM complex are shown, as predicted with and without the presence of CCM2 residues 417–444, and aligned to the CCM2 PTB domain (residues 55–237). The complex is consistently predicted in a high-confidence arrangement (left panel), with most variability in positions of the HHD domain of CCM2 and flexible regions of Krit1. Interactions are predicted between all members of the complex, including published interactions between CCM2-PDCD10. In contrast, multiple conformations are predicted in the absence of the CCM2 C-terminal helix (right panel). b, Predicted Alignment Error (PAE, Å) of all pairwise residue combinations in the WT CCM complex (extracted and plotted using AlphaPickle, ref. 82); lower error indicates higher confidence. c, Differences in PAE (Å) between the full CCM complex and the CCM complex lacking the CCM2 C-terminal helix; larger numbers represent higher confidence in the presence of the helix. The thick white lines correspond to deleted helix residues, which are omitted from comparison. Predictions within individual domains and proteins are largely unaffected, but within CCM2, the HHD and subsequent loops are predicted with reduced confidence upon helix deletion (yellow). Between domains and proteins, the largest differences are reduced confidence interactions between CCM2 and PDCD10 and between Krit1 and TLNRD1. Summing over the entire matrices, the increase in predicted error with deletion of the helix is 0.7 Å (22.9 versus 23.6 Å, p = 10^−56, two-tailed t-test). d, FLAG-tagged TLNRD1 and/or V5-tagged CCM2 full length (“WT”) or C-terminal truncation (“Δ”) were expressed in HEK293T cells, as indicated. Extracts were co-immunoprecipitated with rabbit anti-V5 and blotted with mouse anti-Flag (top) or mouse anti-V5 (bottom). For gel source data, see Supplementary Fig. 1b. Similar results were seen in two separate experiments. 
e, HEK293 cells were transfected with V5-tagged CCM2 and either Flag-tagged TLNRD1 or Flag-tagged Akt (negative control). Cell lysates were either immunoprecipitated with mouse anti-Flag antibody-bound beads (IP Flag), or loaded directly on the gel (Input). The membranes were first probed with rabbit anti-V5 to detect CCM2 in the Flag precipitant and confirm the transfection of CCM2-V5. The membranes were then re-blotted with rabbit anti-TLNRD1 to evaluate the efficiency of Flag immunoprecipitation and validate the transfection of Akt-Flag and TLNRD1-Flag. Each pair of lanes came from independent biological replicates. For gel source data, see Supplementary Figure 1c. Similar results were seen in two separate experiments. f, HEK293 cells were transfected with V5-tagged CCM2 and Flag-tagged TLNRD1, or, as negative controls, either with CCM2-V5 and Akt-Flag or only TLNRD1-Flag. Cell lysates were either immunoprecipitated with anti-V5 beads (IP V5), or loaded directly on the gel (Input). The membranes were first blotted for Flag to detect TLNRD1 in the V5 precipitant and validate the transfection of Akt-Flag and TLNRD1-Flag. The membranes were re-blotted for V5 to evaluate the efficiency of V5 immunoprecipitation and confirm the transfection of CCM2-V5. For gel source data, see Supplementary Fig. 1d. Similar results were seen in two separate experiments.
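
The PAE comparison described in panel c can be illustrated with a minimal sketch. It assumes the two PAE matrices are available as nested lists and that the deleted helix residues are simply dropped from the full-complex matrix before subtraction. The function names, the toy matrices and the use of a mean are illustrative assumptions, not the authors' code or data.

```python
def drop_residues(matrix, deleted):
    """Remove the rows and columns of the deleted residues (hypothetical helper)."""
    keep = [i for i in range(len(matrix)) if i not in deleted]
    return [[matrix[i][j] for j in keep] for i in keep]


def pae_difference(full, truncated, deleted):
    """Element-wise PAE(truncated) - PAE(full), omitting deleted residues.

    Positive values mean lower error (higher confidence) when the helix is present.
    """
    trimmed = drop_residues(full, deleted)
    if len(trimmed) != len(truncated):
        raise ValueError("matrices do not align after removing deleted residues")
    return [
        [t - f for f, t in zip(f_row, t_row)]
        for f_row, t_row in zip(trimmed, truncated)
    ]


def mean_value(matrix):
    """Average over all entries of a square matrix."""
    values = [v for row in matrix for v in row]
    return sum(values) / len(values)


# Toy 3-residue "full complex" in which residue index 2 stands in for the helix.
full = [
    [1.0, 2.0, 9.0],
    [2.0, 1.0, 9.0],
    [9.0, 9.0, 1.0],
]
# Toy 2-residue prediction made without the helix.
truncated = [
    [1.0, 4.0],
    [4.0, 1.0],
]

diff = pae_difference(full, truncated, deleted={2})
print(diff)
print(mean_value(diff))
```

In this toy case only the off-diagonal (inter-residue) entries change, loosely mirroring the legend's observation that the largest differences lie between domains and proteins.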

Extended Data Fig. 10 Effects of CCM2 and TLNRD1 knockdown relative to MAP3K3 knockdown and laminar flow.

a–d, CRISPRi TeloHAEC with control non-targeting guides or with guides to CCM2 or TLNRD1 (two guides apiece) were treated with 2 μg/ml doxycycline for 3 days and nucleofected with Cas9 particles containing control non-targeting guides or with 3 guides targeting exon 3 of MAP3K3 (Synthego). Cells were grown for 48 hours in media with 2 μg/ml doxycycline, and RNA harvested for bulk RNA-seq. The CCM2 guides reduced target gene expression, on average, by 3.4–3.6-fold (p < 2e-9), while TLNRD1 guides reduced target gene expression by 9.4–9.9-fold (p < 2e-43), consistent with the effects of these guides in our other bulk RNA-seq data (Fig. 3c). MAP3K3 transcript levels were not significantly reduced, but genome-mappable reads for the targeted exon were greatly reduced, and most of the remaining reads showed multiple mismatches, indicating efficient introduction of Cas9-targeted deletions. N = 2 per condition (from one experiment, 1 RNA-seq library for each of 2 CRISPRi guides per target). Correlation coefficients (R) and p-values given in each panel are from a two-sided Pearson correlation test. a, The difference between the effect of CCM2 knockdown in cells with MAP3K3 knockdown and the effect of CCM2 knockdown in control cells (log2 [CCM2kd_with_MAP3K3kd/Control_with_MAP3K3kd]/[CCM2kd/Control], Y-axis) was plotted against the effect of CCM2 knockdown in control cells (log2 [CCM2kd/Control], X-axis, plotting all genes regulated at p < 5e-4 in either contrast). Labelled genes are the top five up- or down-regulated genes by log2 fold change, on each axis. Diagonal line: slope = −1 reference. b, As in a, but for the difference between the effect of TLNRD1 knockdown in cells with MAP3K3 knockdown and the effect of TLNRD1 knockdown in control cells (Y-axis), versus the effect of TLNRD1 knockdown in control cells (X-axis). 
The negative correlations in a,b indicate that MAP3K3 perturbation partially reverses the transcriptomic effects of CCM2 or TLNRD1 knockdown, consistent with a role of MEKK3/MAP3K3 signalling in regulating transcription downstream of both CCM2 and TLNRD1. c, As in a, but the difference between the effect of MAP3K3 knockdown in cells with CCM2 knockdown and the effect of MAP3K3 knockdown in control cells (Y-axis) versus the effect of MAP3K3 knockdown alone (X-axis). d, As in a, but for the difference between the effect of MAP3K3 knockdown in TLNRD1 knockdown cells and the effect of MAP3K3 knockdown in control cells (Y-axis) versus the effect of MAP3K3 knockdown in control cells (X-axis). The negative correlations in both c and d indicate that perturbation of CCM2 or TLNRD1 partially reverses the transcriptional effects of MAP3K3 knockdown, consistent with the expectation that decreased expression of upstream inhibitors can compensate for decreased expression of MEKK3. e–h, CRISPRi TeloHAEC with control non-targeting guides or with guides to CCM2 or TLNRD1 (2 guides apiece) were grown in static culture or subjected to flow in an Ibidi flow chamber for 48 h. In each case, cells were treated with 2 μg/ml doxycycline to induce the CRISPRi machinery for 5 days (3 days prior to and 2 days after introduction of laminar flow). After phase contrast imaging, RNA-seq libraries were prepared and sequenced to a depth of 10–12 million reads. N = 2 per condition (one experiment, with 1 RNA-seq library for each of 2 CRISPRi guides per target). R and p-values, and labelled genes, as per a–d. e, The effects of CCM2 knockdown in static culture (Y-axis) compared to the effect of flow in control cells (X-axis, showing all genes regulated at p < 5e-4 in either contrast). Diagonal line: slope = +1 reference. f, As in e, but with Y-axis = effects of TLNRD1 knockdown in static culture. The positive correlations indicate that TLNRD1 or CCM2 knockdown in static culture is similar to the effect of flow. 
g, The difference between the effect of flow in cells with CCM2 knockdown and the effect of flow in control cells (log2 [Flow_CCM2kd/Static_CCM2kd]/[Flow_Ctrl/Static_Ctrl], Y-axis) versus the effect of flow in control cells (log2 [Flow_Ctrl/Static_Ctrl], X-axis). Diagonal line: slope = −1 reference. h, As in g, but showing the difference between the effect of flow in cells with TLNRD1 knockdown and the effect of flow in control cells (Y-axis) versus the effect of flow in control cells (X-axis). The negative correlations in g,h indicate that CCM2 or TLNRD1 knockdown cells have a weaker transcriptional response to flow than control cells, consistent with these cells already having a partial flow-like transcriptional phenotype in static culture (e,f). i, Representative images of CRISPRi teloHAEC with the indicated guides, under laminar flow. N = 2 per condition from one experiment (two distinct CRISPRi guides to CCM2, TLNRD1 or controls), and with four images per guide. j, The normal alignment to flow in control teloHAEC (measured as the angle, relative to flow, of the long axis of each cell) is significantly abrogated in both CCM2 and TLNRD1 KD cells (increased average angle relative to flow). Average values for all cells in each of four images for each of two guides per target were calculated (35 to 103 cells per image). Significance was assessed by a two-sided t-test on these average values. N = 8. Boxplot features, as per Extended Data Fig. 1l. Note that alignment to flow is not completely blocked in CCM2 or TLNRD1 KD cells, since the average angle relative to flow does not reach the 45° value expected if orientation were entirely random. k, As per j, but measuring the ratio of the long versus short axis lengths for each cell (“length/width”) from the fit ellipse function in Fiji.
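
The interaction contrasts plotted in panels a–d and g,h (a log2 ratio of ratios on the Y-axis against a single-perturbation log2 fold change on the X-axis) can be sketched as follows. This is a minimal illustration using only the Python standard library. The gene names, expression values and function names are hypothetical, and the authors' actual differential-expression pipeline is not described in this legend.

```python
import math


def log2_fold_change(perturbed, control):
    """Effect of a perturbation: log2(perturbed / control)."""
    return math.log2(perturbed / control)


def interaction_log2(perturbed_bg, control_bg, perturbed, control):
    """log2([perturbed_bg / control_bg] / [perturbed / control]).

    Difference between the effect of a perturbation in a second background
    (for example, with MAP3K3 knockdown) and its effect in control cells.
    """
    return math.log2((perturbed_bg / control_bg) / (perturbed / control))


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


# Hypothetical expression values per gene:
# (control, knockdown, control in background, knockdown in background).
# In this toy data the background halves every log2 effect.
toy_genes = {
    "geneA": (1.0, 4.0, 1.0, 2.0),
    "geneB": (1.0, 0.25, 1.0, 0.5),
    "geneC": (1.0, 16.0, 1.0, 4.0),
}

x = [log2_fold_change(kd, ctrl) for ctrl, kd, _, _ in toy_genes.values()]
y = [
    interaction_log2(kd_bg, ctrl_bg, kd, ctrl)
    for ctrl, kd, ctrl_bg, kd_bg in toy_genes.values()
]
r = pearson_r(x, y)
print(x, y, r)
```

A gene whose response is fully reversed in the second background falls on the slope = −1 reference line, whereas partial reversal, as in this toy data, gives a negative correlation with a shallower slope.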

Extended Data Fig. 11 Additional analysis of ccm2 and tlnrd1 CRISPR and morpholino knockdown zebrafish embryos.

Zebrafish is a model system that has been extensively used to study CCM gene functions, and in which ccm2 has been shown to have characteristic effects in heart and vascular development34,35,43,44,78,124. a, In situ analysis of tlnrd1 mRNA expression. tlnrd1 mRNA, detected by the anti-sense in situ probe, is expressed in the head (black arrowhead) and heart (red arrowhead, top left), and in the notochord (white bracket), posterior cardinal vein (PCV, green arrow) and intersegmental vessels (ISVs, blue arrows, bottom panel). This staining pattern is not seen with the negative control sense probe (top right). N = 10 from one experiment. This staining pattern is consistent with TLNRD1 being most highly expressed in human endothelial cells (Tabula Sapiens atlas of gene expression, https://tabula-sapiens-portal.ds.czbiohub.org/). b, Quantitation of ccm2 and tlnrd1 CRISPR/morpholino heart phenotypes, knockdown efficiency and effects on klf2b expression. The table summarizes the number of embryos injected with each guide, or with tracrRNA control (for CRISPR experiments), or with each experimental or control morpholino at the indicated concentration(s), the number that survived, and the percent that showed a heart phenotype characterized by enlarged atrium, pericardial edema and slow blood flow in tail veins. “ccm2/tlnrd1 (v. ctrl)” summarizes qRT-PCR analysis of knockdown efficacy (ccm2 or tlnrd1 levels in embryos with CRISPR guides to each of these target genes, versus control embryos). “klf2b (v. ctrl)” summarizes qRT-PCR quantification of klf2b in CRISPR or morpholino knockdown animals versus controls. “NA”: effects on ccm2 or tlnrd1 were not measured for the morpholino studies, because morpholinos generally function by inhibiting translation; and klf2b levels were not tested for the indicated morpholino treatments. See also Supplementary Table 20. c, Light microscope images of CRISPR embryos. 
Representative images of Zebrafish 3dpf embryos injected with control, ccm2 or tlnrd1 gRNA and Cas9 protein. A: atrium. V: ventricle. pc: pericardial space. For N and experimental replicates see b and Supplementary Table 20. d, Similar heart phenotypes in ccm2 and tlnrd1 morpholino embryos. Knockdown of either tlnrd1 or ccm2 with 100 μM anti-tlnrd1 or anti-ccm2 morpholino caused similar heart defects as seen by CRISPR knockdown, with no heart defects seen using 100 μM control morpholino. For N and experimental replicates see b and Supplementary Table 20. e, Synergistic effect of ccm2 and tlnrd1 morpholinos. As in d, but showing the synergistic phenotype of 50 μM tlnrd1 & 50 μM ccm2 morpholinos, which, individually, showed no phenotype, but together showed the heart phenotype in 72% of embryos. For N and experimental replicates see b and Supplementary Table 20. f, Vascular phenotype in ccm2 and tlnrd1 morpholino embryos. Representative microangiogram images showing FITC dextran green (2000 kDa) injected in the vasculature in control, ccm2 and tlnrd1 gRNA injected 3 dpf larvae. Brackets mark the thickness of the posterior cardinal vein (PCV). Arrows indicate the intersegmental vessels (ISVs). Experiments were repeated 3 times. N = 6 for control, and 5 for ccm2 or tlnrd1 gRNAs, from one experiment. g, Ventricle wall thinning in ccm2 and tlnrd1 morpholino embryos. Hematoxylin & eosin (H&E) stained sections of 3dpf embryos. A: atrium. V: ventricle. The space between dotted lines in the ventricle indicates ventricular wall thinning in ccm2 or tlnrd1 CRISPR embryos. Cells within each chamber are blood cells. N = 3 for each treatment, from one experiment. h, More permeable vasculature in ccm2 and tlnrd1 morpholino embryos. Representative images from vascular permeability analysis in control, ccm2 and tlnrd1 gRNA injected zebrafish at 3 dpf. Red colour indicates texas red dextran 70 KD, which was injected into the vasculature before imaging. 
Green: green fluorescent protein expression in the vasculature (Tg:Fli GFP). Both ccm2 and tlnrd1 gRNA injected embryos displayed higher levels of red dye in the interspace between the vessels (arrows). Bottom right: quantitation of permeability (ratio of red dextran in interspace vs. controls; n = 10 for control, 13 for ccm2 and 13 for tlnrd1). Significance was assessed by two-sided t-test. Boxplot centre line, median; box limits, upper and lower quartiles; whiskers, 1.5x interquartile range; points, outliers. For the complete data, see Supplementary Table 31. i, tlnrd1 knockdown upregulates klf2b in zebrafish. qRT-PCR for knockdown of tlnrd1 and induction of klf2b in zebrafish embryos treated with CRISPR guides to tlnrd1, or with control tracrRNA. Signal was normalized to Actin, and then to the average for controls. n = 9 for klf2b (6 for guide AF, 3 for guide AN.2). n = 5 for tlnrd1 (2 for guide AF, 3 for guide AN.2). Quantitation and boxplot features as in h. For the complete data, see Supplementary Table 30. j, TLNRD1 knockdown upregulates KLF2 in TeloHAEC. qRT-PCR for knockdown of TLNRD1 and induction of KLF2 in TeloHAEC with Cas9-guide nucleofection knockdown of TLNRD1 (or non-targeting guides, “Control”). Signal was normalized to GAPDH, and then to the average for controls. n = 4 separate samples. Quantitation and boxplot features as in h. For the complete data, see Supplementary Table 30.
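
The two-step qRT-PCR normalization described in panels i and j (first to a reference gene, then to the average for controls) can be sketched as below. The sketch assumes linear-scale signal values rather than raw Ct values, which the legend does not specify. The function name and all numbers are hypothetical and are not taken from the Supplementary Tables.

```python
def relative_expression(target, reference, control_pairs):
    """Normalize a target signal to a reference gene, then to the control mean.

    target, reference: linear-scale signals for one treated sample (assumed).
    control_pairs: (target, reference) signal pairs for the control samples.
    """
    control_ratios = [t / r for t, r in control_pairs]
    control_mean = sum(control_ratios) / len(control_ratios)
    return (target / reference) / control_mean


# Hypothetical control samples: target/reference ratios of 1.0 and 1.5.
controls = [(10.0, 10.0), (30.0, 20.0)]

# Hypothetical treated sample with a target/reference ratio of 2.5.
fold = relative_expression(50.0, 20.0, controls)
print(fold)
```

With this scheme the control samples average to 1 by construction, so a value above 1 indicates induction and a value below 1 indicates knockdown relative to controls.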

Extended Data Fig. 12 Application of V2G2P to other traits and other cell models.

a, Venn diagram of V2G2P genes for coronary artery disease (CAD), pulse pressure (PP), and mean arterial pressure (MAP) GWAS traits in teloHAEC (using the same ABC-maps and Perturb-seq data, but disease variants for each trait). For MAP, we prioritized program 8 (ECM organization, AQP1, FDR = 0.0135) and program 15 (KLF2, flow response, FDR = 0.0289). For PP we prioritized program 50 (TGFβ response, FDR = 0.0046) and program 29 (EDN1, wound healing, FDR = 0.0316). Several genes in the PP programs are known to regulate vascular tone and stiffness, including FHL2, SMAD3, and TGFB1125,126,127. b, K562 V2G2P programs for mean corpuscular haemoglobin (MCH), mean corpuscular volume (MCV), platelet count (Plt), red blood cell count (RBC), pulse pressure (PP), mean corpuscular haemoglobin concentration (MCHC), average blood glucose level (HbA1c), haemoglobin count (Hb), mean arterial pressure (MAP), diastolic blood pressure (DBP), systolic blood pressure (SBP). Overall, 32 programs were prioritized for 6 GWAS traits, ranging from 27 programs associated with MCH to 2 programs for MCHC. In general, traits that were not relevant to K562 erythroleukaemia cells had no K562 programs significantly associated with them (e.g. MAP, DBP & SBP). Programs associated with each trait contained genes related to that trait. For instance, the most significantly-enriched mean corpuscular haemoglobin program was K562 Program 13, which included many haemoglobin genes as well as the known regulators GFI1B128 and CBFA2T3129, while variants associated with platelet count showed most significant enrichment in K562 Program 4, which included genes known to be involved in megakaryocyte differentiation and platelet count such as VASP130 and TPM4131, and which showed high enrichment of motifs for the known megakaryocyte regulators SP1/3132.

Supplementary information

Supplementary Information

Supplementary Notes 1–5, Supplementary References and Supplementary Figs. 1–2.

Reporting Summary

Supplementary Tables 1–15


Supplementary Tables 16–31


Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Schnitzler, G.R., Kang, H., Fang, S. et al. Convergence of coronary artery disease genes onto endothelial cell programs. Nature 626, 799–807 (2024). https://doi.org/10.1038/s41586-024-07022-x


  • DOI: https://doi.org/10.1038/s41586-024-07022-x
