Abstract
Our genomes influence nearly every aspect of human biology—from molecular and cellular functions to phenotypes in health and disease. Studying the differences in DNA sequence between individuals (genomic variation) could reveal previously unknown mechanisms of human biology, uncover the basis of genetic predispositions to diseases, and guide the development of new diagnostic tools and therapeutic agents. Yet, understanding how genomic variation alters genome function to influence phenotype has proved challenging. To unlock these insights, we need a systematic and comprehensive catalogue of genome function and the molecular and cellular effects of genomic variants. Towards this goal, the Impact of Genomic Variation on Function (IGVF) Consortium will combine approaches in single-cell mapping, genomic perturbations and predictive modelling to investigate the relationships among genomic variation, genome function and phenotypes. IGVF will create maps across hundreds of cell types and states describing how coding variants alter protein activity, how noncoding variants change the regulation of gene expression, and how such effects connect through gene-regulatory and protein-interaction networks. These experimental data, computational predictions and accompanying standards and pipelines will be integrated into an open resource that will catalyse community efforts to explore how our genomes influence biology and disease across populations.
Acknowledgements
This work was supported by the NIH NHGRI IGVF Program (UM1HG011966, UM1HG011969, UM1HG011972, UM1HG011989, UM1HG011996, UM1HG012003, UM1HG012010, UM1HG012053, UM1HG011986, UM1HG012076, UM1HG012077, U01HG011952, U01HG011967, U01HG012009, U01HG012022, U01HG012039, U01HG012064, U01HG012069, U01HG012041, U01HG012047, U01HG012051, U01HG012059, U01HG012079, U01HG012103, U24HG012012, U24HG012070), NIH NCI (R01CA197774), and the Novo Nordisk Foundation (NNF21SA0072102). Artwork in Figs. 1–3 was created by SciStories and V. Yeaung. We thank members of the IGVF External Consultants Panel (G. Bourque, P. Mali, J. Cho, B. Engelhardt and O. Troyanskaya) for critical feedback on the manuscript.
Author information
Contributions
J.M.E., H.A.L. and H. Singh co-led the Writing Group. J.M.E., H.A.L., H. Singh, L. M. Starita, G.C.H., H. Carter, N. Sahni, T.E.R., X. Lin, Y. Li, N.V.M., M.H.C., B.C.H. and A.M. wrote initial text based on input from principal investigators, the Writing Group, and Working Group and Focus Group co-chairs. J.M.E., A.P.B. and J. Ryu developed figures. All authors contributed to developing the vision and goals of the IGVF Consortium, outlining the project, and editing the manuscript. The role of the NHGRI Program Management in the preparation of this paper was limited to coordination and scientific management of the IGVF Consortium.
Ethics declarations
Competing interests
R.D.S. has been a consultant for Leadiant Biosciences, Mirum Pharmaceuticals, PTC Therapeutics and Travere. He has received honoraria from Medscape and is an employee and shareholder of PreventionGenetics, part of Exact Sciences. B.P.K. is a co-inventor on patents and patent applications that describe genome engineering technologies, and is on the scientific advisory board of Acrigen Biosciences, Life Edit Therapeutics and Prime Medicine. The other authors declare no competing interests.
Peer review
Peer review information
Nature thanks Tiffany Amariuta and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
IGVF Consortium. Deciphering the impact of genomic variation on function. Nature 633, 47–57 (2024). https://doi.org/10.1038/s41586-024-07510-0