  • Perspective

Deciphering the impact of genomic variation on function

Abstract

Our genomes influence nearly every aspect of human biology—from molecular and cellular functions to phenotypes in health and disease. Studying the differences in DNA sequence between individuals (genomic variation) could reveal previously unknown mechanisms of human biology, uncover the basis of genetic predispositions to diseases, and guide the development of new diagnostic tools and therapeutic agents. Yet, understanding how genomic variation alters genome function to influence phenotype has proved challenging. To unlock these insights, we need a systematic and comprehensive catalogue of genome function and the molecular and cellular effects of genomic variants. Towards this goal, the Impact of Genomic Variation on Function (IGVF) Consortium will combine approaches in single-cell mapping, genomic perturbations and predictive modelling to investigate the relationships among genomic variation, genome function and phenotypes. IGVF will create maps across hundreds of cell types and states describing how coding variants alter protein activity, how noncoding variants change the regulation of gene expression, and how such effects connect through gene-regulatory and protein-interaction networks. These experimental data, computational predictions and accompanying standards and pipelines will be integrated into an open resource that will catalyse community efforts to explore how our genomes influence biology and disease across populations.

Fig. 1: Genomic variation influences genome function and phenotype.
Fig. 2: A map–perturb–predict framework to connect genome variation to genome function and phenotype.
Fig. 3: The IGVF Catalogue of genome function and the effects of genomic variation.

References

  1. Claussnitzer, M. et al. A brief history of human disease genetics. Nature 577, 179–189 (2020). This review describes progress in identifying genomic variants associated with common and rare diseases, and the approaches needed to combine these data with maps of genome function to advance diagnostic and therapeutic strategies.

    Article  CAS  PubMed  PubMed Central  ADS  Google Scholar 

  2. Loos, R. J. F. 15 years of genome-wide association studies and no signs of slowing down. Nat. Commun. 11, 5900 (2020).

    Article  CAS  PubMed  PubMed Central  ADS  Google Scholar 

  3. Green, E. D. et al. Strategic vision for improving human health at the forefront of genomics. Nature 586, 683–692 (2020).

    Article  CAS  PubMed  PubMed Central  ADS  Google Scholar 

  4. Karczewski, K. J. et al. The mutational constraint spectrum quantified from variation in 141,456 humans. Nature 581, 434–443 (2020).

    Article  CAS  PubMed  PubMed Central  ADS  Google Scholar 

  5. Collins, R. L. et al. A structural variation reference for medical and population genetics. Nature 581, 444–451 (2020).

    Article  CAS  PubMed  PubMed Central  ADS  Google Scholar 

  6. Sollis, E. et al. The NHGRI–EBI GWAS Catalog: knowledgebase and deposition resource. Nucleic Acids Res. 51, D977–D985 (2023).

    Article  CAS  PubMed  Google Scholar 

  7. Landrum, M. J. et al. ClinVar: public archive of relationships among sequence variation and human phenotype. Nucleic Acids Res. 42, D980–D985 (2014).

    Article  CAS  PubMed  Google Scholar 

  8. Rehm, H. L. et al. ClinGen—the clinical genome resource. N. Engl. J. Med. 372, 2235–2242 (2015).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  9. Zhou, W. et al. Global Biobank Meta-analysis Initiative: powering genetic discovery across human disease. Cell Genom 2, 100192 (2022).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  10. Taliun, D. et al. Sequencing of 53,831 diverse genomes from the NHLBI TOPMed Program. Nature 590, 290–299 (2021).

    Article  CAS  PubMed  PubMed Central  ADS  Google Scholar 

  11. Backman, J. D. et al. Exome sequencing and analysis of 454,787 UK Biobank participants. Nature 599, 628–634 (2021).

    Article  CAS  PubMed  PubMed Central  ADS  Google Scholar 

  12. Karczewski, K. J. et al. Systematic single-variant and gene-based association testing of thousands of phenotypes in 394,841 UK Biobank exomes. Cell Genomics 2, 100168 (2022).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  13. Doolittle, W. F., Brunet, T. D. P., Linquist, S. & Gregory, T. R. Distinguishing between ‘function’ and ‘effect’ in genome biology. Genome Biol. Evol. 6, 1234–1237 (2014).

    Article  PubMed  PubMed Central  Google Scholar 

  14. Kellis, M. et al. Defining functional DNA elements in the human genome. Proc. Natl. Acad. Sci. USA 111, 6131–6138 (2014).

    Article  CAS  PubMed  PubMed Central  ADS  Google Scholar 

  15. ENCODE Project Consortium. Expanded encyclopaedias of DNA elements in the human and mouse genomes. Nature 583, 699–710 (2020). An exemplary team science effort which has led to development of methods, data resources and standards enabling fundamental advances in understanding gene regulation and genome function.

    Article  ADS  Google Scholar 

  16. Roadmap Epigenomics Consortium. Integrative analysis of 111 reference human epigenomes. Nature 518, 317–330 (2015).

    Article  PubMed Central  Google Scholar 

  17. GTEx Consortium. The GTEx Consortium atlas of genetic regulatory effects across human tissues. Science 369, 1318–1330 (2020). This latest flagship manuscript from the GTEx Consortium maps how genomic variation regulates gene expression across human tissues, providing a resource for interpreting the molecular effects of variants associated with common diseases.

    Article  Google Scholar 

  18. Võsa, U. et al. Large-scale cis- and trans-eQTL analyses identify thousands of genetic loci and polygenic scores that regulate blood gene expression. Nat. Genet. 53, 1300–1310 (2021).

    Article  PubMed  PubMed Central  Google Scholar 

  19. Li, Y. I. et al. RNA splicing is a primary link between genetic variation and disease. Science 352, 600–604 (2016).

    Article  CAS  PubMed  PubMed Central  ADS  Google Scholar 

  20. HuBMAP Consortium. The human body at cellular resolution: the NIH Human Biomolecular Atlas Program. Nature 574, 187–192 (2019).

    Article  CAS  ADS  Google Scholar 

  21. Regev, A. et al. The Human Cell Atlas. eLife 6, e27041 (2017).

    Article  PubMed  PubMed Central  Google Scholar 

  22. Matreyek, K. A. et al. Multiplex assessment of protein variant abundance by massively parallel sequencing. Nat. Genet. 50, 874–882 (2018).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  23. Findlay, G. M. et al. Accurate classification of BRCA1 variants with saturation genome editing. Nature 562, 217–222 (2018).

    Article  CAS  PubMed  PubMed Central  ADS  Google Scholar 

  24. Esposito, D. et al. MaveDB: an open-source platform to distribute and interpret data from multiplexed assays of variant effect. Genome Biol. 20, 223 (2019).

    Article  PubMed  PubMed Central  Google Scholar 

  25. Luck, K. et al. A reference map of the human binary protein interactome. Nature 580, 402–408 (2020).

    Article  CAS  PubMed  PubMed Central  ADS  Google Scholar 

  26. Szklarczyk, D. et al. The STRING database in 2021: customizable protein–protein networks, and functional characterization of user-uploaded gene/measurement sets. Nucleic Acids Res. 49, D605–D612 (2021).

    Article  CAS  PubMed  Google Scholar 

  27. Pacini, C. et al. Integrated cross-study datasets of genetic dependencies in cancer. Nat. Commun. 12, 1661 (2021).

    Article  CAS  PubMed  PubMed Central  ADS  Google Scholar 

  28. International Common Disease Alliance. ICDA Recommendations and White Paper. ICDA https://icda.bio (2020).

  29. Abdellaoui, A., Yengo, L., Verweij, K. J. H. & Visscher, P. M. 15 years of GWAS discovery: realizing the promise. Am. J. Hum. Genet. 110, 179–194 (2023).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  30. Rehm, H. L. & Fowler, D. M. Keeping up with the genomes: scaling genomic variant interpretation. Genome Med. 12, 5 (2019).

    Article  PubMed  PubMed Central  Google Scholar 

  31. Bentley, A. R., Callier, S. & Rotimi, C. N. Diversity and inclusion in genomic research: why the uneven progress? J. Community Genet. 8, 255 (2017).

    Article  PubMed  PubMed Central  Google Scholar 

  32. Sirugo, G., Williams, S. M. & Tishkoff, S. A. The missing diversity in human genetic studies. Cell 177, 1080 (2019).

    Article  CAS  PubMed  Google Scholar 

  33. Lappalainen, T. & MacArthur, D. G. From variant to function in human disease genetics. Science 373, 1464–1468 (2021).

    Article  CAS  PubMed  ADS  Google Scholar 

  34. Findlay, G. M. Linking genome variants to disease: scalable approaches to test the functional impact of human mutations. Hum. Mol. Genet. 30, R187–R197 (2021).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  35. Hu, Y. et al. Single-cell multi-scale footprinting reveals the modular organization of DNA regulatory elements. Preprint at bioRxiv https://doi.org/10.1101/2023.03.28.533945 (2023).

  36. Granja, J. M. et al. ArchR is a scalable software package for integrative single-cell chromatin accessibility analysis. Nat. Genet. 53, 403–411 (2021).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  37. Kartha, V. K. et al. Functional inference of gene regulation using single-cell multi-omics. Cell Genomics 2, 100166 (2022).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  38. Kotliar, D. et al. Identifying gene expression programs of cell-type identity and cellular activity with single-cell RNA-seq. eLife 8, e43803 (2019).

    Article  PubMed  PubMed Central  Google Scholar 

  39. Benner, C. et al. FINEMAP: efficient variable selection using summary data from genome-wide association studies. Bioinformatics 32, 1493–1501 (2016).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  40. Wang, G., Sarkar, A., Carbonetto, P. & Stephens, M. A simple new approach to variable selection in regression, with application to genetic fine mapping. J. R. Stat. Soc. B 82, 1273–1300 (2020).

    Article  MathSciNet  Google Scholar 

  41. Weissbrod, O. et al. Functionally informed fine-mapping and polygenic localization of complex trait heritability. Nat. Genet. 52, 1355–1363 (2020).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  42. Wang, Q. S. et al. Leveraging supervised learning for functionally informed fine-mapping of cis-eQTLs identifies an additional 20,913 putative causal eQTLs. Nat. Commun. 12, 3394 (2021).

    Article  CAS  PubMed  PubMed Central  ADS  Google Scholar 

  43. Cuella-Martin, R. et al. Functional interrogation of DNA damage response variants with base editing screens. Cell 184, 1081–1097.e19 (2021).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  44. Morris, J. A. et al. Discovery of target genes and pathways at GWAS loci by pooled single-cell CRISPR screens. Science 380, eadh7699 (2023).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  45. Martin-Rufino, J. D. et al. Massively parallel base editing to map variant effects in human hematopoiesis. Cell 186, 2456–2474.e24 (2023).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  46. Anzalone, A. V., Koblan, L. W. & Liu, D. R. Genome editing with CRISPR–Cas nucleases, base editors, transposases and prime editors. Nat. Biotechnol. 38, 824–844 (2020).

    Article  CAS  PubMed  Google Scholar 

  47. Hanna, R. E. et al. Massively parallel assessment of human variants with base editor screens. Cell 184, 1064–1080.e20 (2021).

    Article  CAS  PubMed  Google Scholar 

  48. Klann, T. S. et al. CRISPR–Cas9 epigenome editing enables high-throughput screening for functional regulatory elements in the human genome. Nat. Biotechnol. 35, 561–568 (2017).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  49. Fulco, C. P. et al. Systematic mapping of functional enhancer–promoter connections with CRISPR interference. Science 354, 769–773 (2016).

    Article  CAS  PubMed  PubMed Central  ADS  Google Scholar 

  50. Canver, M. C. et al. BCL11A enhancer dissection by Cas9-mediated in situ saturating mutagenesis. Nature 527, 192–197 (2015). This study applied CRISPR–Cas9 screens to dissect a GWAS-nominated enhancer of BCL11A, a negative regulator of fetal haemoglobin expression during erythropoiesis, and motivated the development of enhancer-targeting CRISPR therapeutics for sickle-cell disease.

    Article  CAS  PubMed  PubMed Central  ADS  Google Scholar 

  51. Arnold, C. D. et al. Genome-wide quantitative enhancer activity maps identified by STARR-seq. Science 339, 1074–1077 (2013).

    Article  CAS  PubMed  ADS  Google Scholar 

  52. Melnikov, A. et al. Systematic dissection and optimization of inducible enhancers in human cells using a massively parallel reporter assay. Nat. Biotechnol. 30, 271–277 (2012).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  53. Bergman, D. T. et al. Compatibility rules of human enhancer and promoter sequences. Nature 607, 176–184 (2022).

    Article  CAS  PubMed  PubMed Central  ADS  Google Scholar 

  54. Vockley, C. M. et al. Massively parallel quantification of the regulatory effects of noncoding genetic variation in a human cohort. Genome Res. 25, 1206–1214 (2015).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  55. Klein, J. C. et al. A systematic evaluation of the design and context dependencies of massively parallel reporter assays. Nat. Methods 17, 1083–1091 (2020).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  56. Patwardhan, R. P. et al. High-resolution analysis of DNA regulatory elements by synthetic saturation mutagenesis. Nat. Biotechnol. 27, 1173–1175 (2009).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  57. Agarwal, V. et al. Massively parallel characterization of transcriptional regulatory elements in three diverse human cell types. Preprint at bioRxiv https://doi.org/10.1101/2023.03.05.531189 (2023).

  58. Jumper, J. et al. Highly accurate protein structure prediction with AlphaFold. Nature 596, 583–589 (2021).

    Article  CAS  PubMed  PubMed Central  ADS  Google Scholar 

  59. Alipanahi, B., Delong, A., Weirauch, M. T. & Frey, B. J. Predicting the sequence specificities of DNA- and RNA-binding proteins by deep learning. Nat. Biotechnol. 33, 831–838 (2015).

    Article  CAS  PubMed  Google Scholar 

  60. Zhou, J. & Troyanskaya, O. G. Predicting effects of noncoding variants with deep learning-based sequence model. Nat. Methods 12, 931–934 (2015). The study develops a deep learning framework (DeepSEA) trained on chromatin profiling data to predict effects of single-nucleotide genomic variants on transcription factor binding and chromatin state.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  61. Avsec, Ž. et al. Base-resolution models of transcription-factor binding reveal soft motif syntax. Nat. Genet. 53, 354–366 (2021). This study introduces the BPNet model, a convolutional neural network to predict basepair-resolution epigenomic data from DNA sequence, and applies this framework to learn rules of the regulatory syntax underlying transcription factor binding.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  62. Beer, M. A. Predicting enhancer activity and variant impact using gkm-SVM. Hum. Mutat. 38, 1251–1258 (2017).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  63. Chen, K. M., Wong, A. K., Troyanskaya, O. G. & Zhou, J. A sequence-based global map of regulatory activity for deciphering human genetics. Nat. Genet. 54, 940–949 (2022).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  64. Avsec, Ž. et al. Effective gene expression prediction from sequence by integrating long-range interactions. Nat. Methods 18, 1196–1203 (2021).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  65. Boyle, E. A., Li, Y. I. & Pritchard, J. K. An expanded view of complex traits: from polygenic to omnigenic. Cell 169, 1177 (2017).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  66. Rentzsch, P., Witten, D., Cooper, G. M., Shendure, J. & Kircher, M. CADD: predicting the deleteriousness of variants throughout the human genome. Nucleic Acids Res. 47, D886–D894 (2018).

    Article  PubMed Central  Google Scholar 

  67. Han, J.-D. J. Understanding biological functions through molecular networks. Cell Res. 18, 224–237 (2008).

    Article  CAS  PubMed  Google Scholar 

  68. Kryshtafovych, A., Schwede, T., Topf, M., Fidelis, K. & Moult, J. Critical assessment of methods of protein structure prediction (CASP)-Round XIV. Proteins 89, 1607–1617 (2021). Work by CASP over almost 20 years illustrates how community efforts to develop gold-standard data, benchmarks and critical assessments can facilitate development of predictive models of protein structure and function, with CASP XIV marking a major advance through the introduction of AlphaFold2.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  69. The Critical Assessment of Genome Interpretation Consortium. CAGI, the Critical Assessment of Genome Interpretation, establishes progress and prospects for computational genetic variant interpretation methods. Genome Biol. 25, 53 (2024). This paper reports a collaborative effort to independently assess computational models for interpreting the effects of variants on molecular phenotypes and disease risk, and demonstrates their utility in clinical and research applications.

    Article  Google Scholar 

  70. Ma, S. et al. Chromatin potential identified by shared single-cell profiling of RNA and chromatin. Cell 183, 1103–1116.e20 (2020). This study introduces SHARE-seq and demonstrates how single-cell multiomic data enables mapping dynamics of regulatory element activity across differentiation states by correlating distal enhancers with target genes.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  71. Tran, V. et al. High sensitivity single cell RNA sequencing with split pool barcoding. Preprint at bioRxiv https://doi.org/10.1101/2022.08.27.505512 (2022).

  72. Xu, Y. et al. An atlas of genetic scores to predict multi-omic traits. Nature 616, 123–131 (2023).

    Article  CAS  PubMed  PubMed Central  ADS  Google Scholar 

  73. Nathan, A. et al. Single-cell eQTL models reveal dynamic T cell state dependence of disease loci. Nature 606, 120–128 (2022).

    Article  CAS  PubMed  PubMed Central  ADS  Google Scholar 

  74. Perez, R. K. et al. Single-cell RNA-seq reveals cell type–specific molecular and genetic associations to lupus. Science 376, eabf1970 (2022).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  75. Yazar, S. et al. Single-cell eQTL mapping identifies cell type–specific genetic control of autoimmune disease. Science 376, eabf3041 (2022).

    Article  CAS  PubMed  Google Scholar 

  76. Gate, R. E. et al. Genetic determinants of co-accessible chromatin regions in activated T cells across humans. Nat. Genet. 50, 1140–1150 (2018).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  77. Adamson, B. et al. A multiplexed single-cell CRISPR screening platform enables systematic dissection of the unfolded protein response. Cell 167, 1867–1882.e21 (2016).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  78. Dixit, A. et al. Perturb-Seq: dissecting molecular circuits with scalable single-cell RNA profiling of pooled genetic screens. Cell 167, 1853–1866.e17 (2016).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  79. Replogle, J. M. et al. Mapping information-rich genotype-phenotype landscapes with genome-scale Perturb-seq. Cell 185, 2559–2575.e28 (2022).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  80. Sahni, N. et al. Widespread macromolecular interaction perturbations in human genetic disorders. Cell 161, 647–660 (2015). Systematic open reading frame screens showed that a majority of coding variants in Mendelian disorders affect protein interaction networks, providing a resource to benchmark predictors of variant effects.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  81. Fayer, S. et al. Closing the gap: Systematic integration of multiplexed functional data resolves variants of uncertain significance in BRCA1, TP53 and PTEN. Am. J. Hum. Genet. 108, 2248–2258 (2021). This study illustrates how experimentally derived variant effect maps can have high clinical utility in interpreting variants for Mendelian diseases.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  82. Starita, L. M. et al. Massively parallel functional analysis of BRCA1 RING domain variants. Genetics 200, 413–422 (2015).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  83. Sun, S. et al. An extended set of yeast-based functional assays accurately identifies human disease mutations. Genome Res. 26, 670–680 (2016).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  84. Bray, M.-A. et al. Cell painting, a high-content image-based assay for morphological profiling using multiplexed fluorescent dyes. Nat. Protoc. 11, 1757–1774 (2016).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  85. Fulco, C. P. et al. Activity-by-contact model of enhancer-promoter regulation from thousands of CRISPR perturbations. Nat. Genet. 51, 1664–1669 (2019).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  86. Sakaue, S. et al. Tissue-specific enhancer–gene maps from multimodal single-cell data identify causal disease alleles. Nat. Genet. 56, 615–626 (2024).

    Article  CAS  PubMed  Google Scholar 

  87. Weeks, E. M. et al. Leveraging polygenic enrichments of gene features to predict genes underlying complex traits and diseases. Nat. Genet. 55, 1267–1276 (2023).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  88. Nasser, J. et al. Genome-wide enhancer maps link risk variants to disease genes. Nature 593, 238–243 (2021).

    Article  CAS  PubMed  PubMed Central  ADS  Google Scholar 

  89. Schnitzler, G. R. et al. Convergence of coronary artery disease genes onto endothelial cell programs. Nature 626, 799–807 (2024).

    Article  CAS  PubMed  PubMed Central  ADS  Google Scholar 

  90. Finucane, H. K. et al. Heritability enrichment of specifically expressed genes identifies disease-relevant tissues and cell types. Nat. Genet. 50, 621–629 (2018).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  91. Forgetta, V. et al. An effector index to predict target genes at GWAS loci. Hum. Genet. 141, 1431–1447 (2022).

    Article  CAS  PubMed  Google Scholar 

  92. Ghoussaini, M. et al. Open Targets Genetics: systematic identification of trait-associated genes using large-scale genetics and functional genomics. Nucleic Acids Res. 49, D1311–D1320 (2021).

    Article  CAS  PubMed  Google Scholar 

  93. Gschwind, A. R. et al. an encyclopedia of enhancer-gene regulatory interactions in the human genome. Preprint at bioRxiv https://doi.org/10.1101/2023.11.09.563812 (2023).

  94. Karollus, A., Mauermeier, T. & Gagneur, J. Current sequence-based models capture gene expression determinants in promoters but mostly ignore distal enhancers. Genome Biol. 24, 56 (2023).

    Article  PubMed  PubMed Central  Google Scholar 

  95. The Complex Trait Consortium. The Collaborative Cross, a community resource for the genetic analysis of complex traits. Nat. Genet. 36, 1133–1137 (2024).

    Article  Google Scholar 

  96. Hogan, A. et al. Knowledge Graphs. Preprint at arXiv https://doi.org/10.48550/arXiv.2003.02320 (2020).

  97. Feng, F. et al. GenomicKB: a knowledge graph for the human genome. Nucleic Acids Res. 51, D950–D956 (2023).

    Article  CAS  PubMed  Google Scholar 

  98. Chandak, P., Huang, K. & Zitnik, M. Building a knowledge graph to enable precision medicine. Sci. Data 10, 67 (2023).

    Article  PubMed  PubMed Central  Google Scholar 

  99. Lobentanzer, S. et al. Democratizing knowledge representation with BioCypher. Nat. Biotechnol. 41, 1056–1059 (2023).

    Article  CAS  PubMed  Google Scholar 

  100. Ambrosini, G. et al. Insights gained from a comprehensive all-against-all transcription factor binding motif benchmarking study. Genome Biol. 21, 114 (2020).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  101. de Boer, C. G. & Taipale, J. Hold out the genome: a roadmap to solving the cis-regulatory code. Nature 625, 41–50 (2024).

    Article  PubMed  ADS  Google Scholar 

  102. Yuan, H. & Kelley, D. R. scBasset: sequence-based modeling of single-cell ATAC-seq using convolutional neural networks. Nat. Methods 19, 1088–1096 (2022).

    Article  CAS  PubMed  Google Scholar 

  103. Inoue, F. et al. A systematic comparison reveals substantial differences in chromosomal versus episomal encoding of enhancer activity. Genome Res. 27, 38–52 (2017).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  104. Xie, S., Duan, J., Li, B., Zhou, P. & Hon, G. C. Multiplexed engineering and analysis of combinatorial enhancer activity in single cells. Mol. Cell 66, 285–299.e5 (2017).

    Article  CAS  PubMed  Google Scholar 

  105. Gasperini, M. et al. A genome-wide framework for mapping gene regulation via cellular genetic screens. Cell 176, 1516 (2019).

    Article  CAS  PubMed  Google Scholar 

  106. Reilly, S. K. et al. Direct characterization of cis-regulatory elements and functional dissection of complex genetic associations using HCR–FlowFISH. Nat. Genet. 53, 1166–1176 (2021).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  107. Schraivogel, D. et al. Targeted Perturb-seq enables genome-scale genetic screens in single cells. Nat. Methods 17, 629–635 (2020).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  108. McGinnis, C. S. et al. MULTI-seq: sample multiplexing for single-cell RNA sequencing using lipid-tagged indices. Nat. Methods 16, 619–626 (2019).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  109. Daniel, B. et al. Divergent clonal differentiation trajectories of T cell exhaustion. Nat. Immunol. 23, 1614–1627 (2022).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  110. Rebboah, E. et al. Mapping and modeling the genomic basis of differential RNA isoform expression at single-cell resolution with LR-Split-seq. Genome Biol. 22, 286 (2021).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  111. Pratapa, A., Jalihal, A. P., Law, J. N., Bharadwaj, A. & Murali, T. M. Benchmarking algorithms for gene regulatory network inference from single-cell transcriptomic data. Nat. Methods 17, 147–154 (2020).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  112. Zhang, Y., Qi, G., Park, J.-H. & Chatterjee, N. Estimation of complex effect-size distributions using summary-level statistics from genome-wide association studies across 32 complex traits. Nat. Genet. 50, 1318–1326 (2018).

    Article  CAS  PubMed  Google Scholar 

  113. O’Connor, L. J. The distribution of common-variant effect sizes. Nat. Genet. 53, 1243–1249 (2021).

    Article  PubMed  Google Scholar 

  114. Lewis, C. M. & Vassos, E. Polygenic risk scores: from research tools to clinical instruments. Genome Med. 12, 44 (2020).

    Article  PubMed  PubMed Central  Google Scholar 

  115. Hekselman, I. & Yeger-Lotem, E. Mechanisms of tissue and cell-type specificity in heritable traits and diseases. Nat. Rev. Genet. 21, 137–150 (2020).

    Article  CAS  PubMed  Google Scholar 

  116. Uffelmann, E. et al. Genome-wide association studies. Nat. Rev. Methods Prime. 1, 59 (2021).

    Article  CAS  Google Scholar 

  117. Weissbrod, O. et al. Leveraging fine-mapping and multipopulation training data to improve cross-population polygenic risk scores. Nat. Genet. 54, 450–458 (2022).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  118. Heid, I. M. et al. Meta-analysis identifies 13 new loci associated with waist-hip ratio and reveals sexual dimorphism in the genetic basis of fat distribution. Nat. Genet. 42, 949–960 (2010).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  119. Goossens, G. H., Jocken, J. W. E. & Blaak, E. E. Sexual dimorphism in cardiometabolic health: the role of adipose tissue, muscle and liver. Nat. Rev. Endocrinol. 17, 47–66 (2021).

    Article  PubMed  Google Scholar 

  120. Rajabli, F. et al. Ancestral origin of ApoE ε4 Alzheimer disease risk in Puerto Rican and African American populations. PLoS Genet. 14, e1007791 (2018).

    Article  PubMed  PubMed Central  Google Scholar 

  121. Blue, E. E., Horimoto, A. R. V. R., Mukherjee, S., Wijsman, E. M. & Thornton, T. A. Local ancestry at APOE modifies Alzheimer’s disease risk in Caribbean Hispanics. Alzheimers Dement. 15, 1524–1532 (2019).

    Article  PubMed  Google Scholar 

  122. Baxter, S. M. et al. Centers for Mendelian genomics: a decade of facilitating gene discovery. Genet. Med. 24, 784–797 (2022).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  123. Costanzo, M. C. et al. The Type 2 Diabetes Knowledge Portal: an open access genetic resource dedicated to type 2 diabetes and related traits. Cell Metab. 35, 695–710.e6 (2023).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  124. Richards, S. et al. Standards and guidelines for the interpretation of sequence variants: a joint consensus recommendation of the American College of Medical Genetics and Genomics and the Association for Molecular Pathology. Genet. Med. 17, 405–424 (2015).

    Article  PubMed  PubMed Central  Google Scholar 

  125. Scott, A. et al. Saturation-scale functional evidence supports clinical variant interpretation in Lynch syndrome. Genome Biol. 23, 266 (2022).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  126. Radford, E. J. et al. Saturation genome editing of DDX3X clarifies pathogenicity of germline and somatic variation. Nat. Commun. 14, 7702 2023

    Article  CAS  PubMed  PubMed Central  ADS  Google Scholar 

  127. Wojcik, M. H. et al. Beyond the exome: what’s next in diagnostic testing for Mendelian conditions. Am. J. Hum. Genet. 110, 1229–1248 (2023).

  128. Miller, D. T. et al. ACMG SF v3.1 list for reporting of secondary findings in clinical exome and genome sequencing: a policy statement of the American College of Medical Genetics and Genomics (ACMG). Genet. Med. 24, 1407–1414 (2022).

    Article  CAS  PubMed  Google Scholar 

  129. Pejaver, V. et al. Calibration of computational tools for missense variant pathogenicity classification and ClinGen recommendations for PP3/BP4 criteria. Am. J. Hum. Genet. 109, 2163–2177 (2022).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  130. Brnich, S. E. et al. Recommendations for application of the functional evidence PS3/BS3 criterion using the ACMG/AMP sequence variant interpretation framework. Genome Med. 12, 3 (2019).

    Article  PubMed  PubMed Central  Google Scholar 

  131. Graham, S. E. et al. The power of genetic diversity in genome-wide association studies of lipids. Nature 600, 675–679 (2021). This study demonstrates how expanding genomic studies to include people of non-European ancestries will improve identification of functional variants and the portability of polygenic risk scores to diverse groups.

  132. Musunuru, K. & Kathiresan, S. Genetics of common, complex coronary artery disease. Cell 177, 132–145 (2019).

  133. Hamilton, M. C. et al. Systematic elucidation of genetic mechanisms underlying cholesterol uptake. Cell Genomics 3, 100304 (2023).

  134. Martin, A. R. et al. Clinical use of current polygenic risk scores may exacerbate health disparities. Nat. Genet. 51, 584–591 (2019).

  135. Shi, H. et al. Population-specific causal disease effect sizes in functionally important regions impacted by selection. Nat. Commun. 12, 1098 (2021).

  136. Hou, K. et al. Causal effects on complex traits are similar for common variants across segments of different continental ancestries within admixed individuals. Nat. Genet. 55, 549–558 (2023).

  137. 1000 Genomes Project Consortium. A global reference for human genetic variation. Nature 526, 68–74 (2015).

  138. Gaziano, J. M. et al. Million Veteran Program: a mega-biobank to study genetic influences on health and disease. J. Clin. Epidemiol. 70, 214–223 (2016).

  139. Kanoni, S. et al. Implicating genes, pleiotropy, and sexual dimorphism at blood lipid loci through multi-ancestry meta-analysis. Genome Biol. 23, 268 (2022).

  140. Sakaue, S. et al. A cross-population atlas of genetic associations for 220 human phenotypes. Nat. Genet. 53, 1415–1424 (2021).

  141. Aragam, K. G. et al. Discovery and systematic characterization of risk variants and genes for coronary artery disease in over a million participants. Nat. Genet. 54, 1803–1815 (2022).

  142. Tcheandjieu, C. et al. Large-scale genome-wide association study of coronary artery disease in genetically diverse populations. Nat. Med. 28, 1679–1692 (2022).

  143. Threadgill, D. W., Miller, D. R., Churchill, G. A. & de Villena, F. P.-M. The collaborative cross: a recombinant inbred mouse population for the systems genetic era. ILAR J. 52, 24–31 (2011).

  144. Fowler, D. M. et al. An Atlas of Variant Effects to understand the genome at nucleotide resolution. Genome Biol. 24, 147 (2023).

  145. Wilkinson, M. D. et al. The FAIR Guiding Principles for scientific data management and stewardship. Sci. Data 3, 160018 (2016).

  146. Schatz, M. C. et al. Inverting the model of genomics data sharing with the NHGRI Genomic Data Science Analysis, Visualization, and Informatics Lab-space. Cell Genomics 2, 100085 (2022).

  147. Wang, T. et al. The Human Pangenome Project: a global resource to map genomic diversity. Nature 604, 437–446 (2022).

  148. All of Us Research Program Investigators. The ‘All of Us’ Research Program. N. Engl. J. Med. 381, 668–676 (2019).

  149. Gilbert, L. A. et al. Genome-scale CRISPR-mediated control of gene repression and activation. Cell 159, 647–661 (2014).

  150. Sherry, S. T. et al. dbSNP: the NCBI database of genetic variation. Nucleic Acids Res. 29, 308–311 (2001).

  151. Frankish, A. et al. GENCODE reference annotation for the human and mouse genomes. Nucleic Acids Res. 47, D766–D773 (2019).

  152. Ashburner, M. et al. Gene Ontology: tool for the unification of biology. Nat. Genet. 25, 25–29 (2000).

  153. Köhler, S. et al. The Human Phenotype Ontology in 2021. Nucleic Acids Res. 49, D1207–D1217 (2021).

  154. Del Toro, N. et al. The IntAct database: efficient access to fine-grained molecular interaction data. Nucleic Acids Res. 50, D648–D653 (2022).

  155. Vasilevsky, N. A. et al. Mondo: Unifying diseases for the world, by the world. Preprint at medRxiv https://doi.org/10.1101/2022.04.13.22273750 (2022).

  156. Amberger, J. S., Bocchini, C. A., Scott, A. F. & Hamosh, A. OMIM.org: leveraging knowledge across phenotype–gene relationships. Nucleic Acids Res. 47, D1038–D1043 (2019).

  157. The UniProt Consortium. UniProt: the universal protein knowledgebase in 2023. Nucleic Acids Res. 51, D523–D531 (2023).

  158. Musunuru, K. et al. From noncoding variant to phenotype via SORT1 at the 1p13 cholesterol locus. Nature 466, 714–719 (2010).

  159. Kjolby, M. et al. Sort1, encoded by the cardiovascular risk locus 1p13.3, is a regulator of hepatic lipoprotein export. Cell Metab. 12, 213–223 (2010).

  160. Claussnitzer, M. et al. FTO obesity variant circuitry and adipocyte browning in humans. N. Engl. J. Med. 373, 895–907 (2015).

  161. Smemo, S. et al. Obesity-associated variants within FTO form long-range functional connections with IRX3. Nature 507, 371 (2014).

  162. Graham, D. B. & Xavier, R. J. Pathway paradigms revealed from the genetics of inflammatory bowel disease. Nature 578, 527–539 (2020).

  163. Kim, S., Eun, H. S. & Jo, E.-K. Roles of autophagy-related genes in the pathogenesis of inflammatory bowel disease. Cells 8, 77 (2019).

  164. Singh, N. K., Singh, N. N., Androphy, E. J. & Singh, R. N. Splicing of a critical exon of human survival motor neuron is regulated by a unique silencer element located in the last intron. Mol. Cell. Biol. 26, 1333 (2006).

  165. Hua, Y. et al. Antisense correction of SMN2 splicing in the CNS rescues necrosis in a type III SMA mouse model. Genes Dev. 24, 1634 (2010).

  166. Frangoul, H. et al. CRISPR–Cas9 gene editing for sickle cell disease and β-thalassemia. N. Engl. J. Med. 384, 252–260 (2021).

  167. Sankaran, V. G. et al. Human fetal hemoglobin expression is regulated by the developmental stage-specific repressor BCL11A. Science 322, 1839–1842 (2008).

Acknowledgements

This work was supported by the NIH NHGRI IGVF Program (UM1HG011966, UM1HG011969, UM1HG011972, UM1HG011989, UM1HG011996, UM1HG012003, UM1HG012010, UM1HG012053, UM1HG011986, UM1HG012076, UM1HG012077, U01HG011952, U01HG011967, U01HG012009, U01HG012022, U01HG012039, U01HG012064, U01HG012069, U01HG012041, U01HG012047, U01HG012051, U01HG012059, U01HG012079, U01HG012103, U24HG012012, U24HG012070), NIH NCI (R01CA197774), and the Novo Nordisk Foundation (NNF21SA0072102). Artwork in Figs. 1–3 was created by SciStories and V. Yeaung. We thank members of the IGVF External Consultants Panel (G. Bourque, P. Mali, J. Cho, B. Engelhardt and O. Troyanskaya) for critical feedback on the manuscript.

Author information

Contributions

J.M.E., H.A.L. and H. Singh co-led the Writing Group. J.M.E., H.A.L., H. Singh, L. M. Starita, G.C.H., H. Carter, N. Sahni, T.E.R., X. Lin, Y. Li, N.V.M., M.H.C., B.C.H. and A.M. wrote initial text based on input from principal investigators, the Writing Group, and Working Group and Focus Group co-chairs. J.M.E., A.P.B. and J. Ryu developed figures. All authors contributed to developing the vision and goals of the IGVF Consortium, outlining the project, and editing the manuscript. The role of the NHGRI Program Management in the preparation of this paper was limited to coordination and scientific management of the IGVF Consortium.

Ethics declarations

Competing interests

R.D.S. has been a consultant for Leadiant Biosciences, Mirum Pharmaceuticals, PTC Therapeutics and Travere. He has received honoraria from Medscape and is an employee and shareholder of PreventionGenetics, part of Exact Sciences. B.P.K. is a co-inventor on patents and patent applications that describe genome engineering technologies, and is on the scientific advisory board of Acrigen Biosciences, Life Edit Therapeutics and Prime Medicine. The other authors declare no competing interests.

Peer review

Peer review information

Nature thanks Tiffany Amariuta and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

IGVF Consortium. Deciphering the impact of genomic variation on function. Nature 633, 47–57 (2024). https://doi.org/10.1038/s41586-024-07510-0

  • DOI: https://doi.org/10.1038/s41586-024-07510-0
