Molecular quantitative trait loci

Abstract

Understanding the functional effects of genetic variants is one of the key challenges in human genetics, as much of the disease-associated variation is located in non-coding regions with typically unknown putative gene regulatory effects. One of the most important approaches in this field has been molecular quantitative trait locus (molQTL) mapping, where genetic variation is associated with molecular traits that can be measured at scale, such as gene expression, splicing and chromatin accessibility. The maturity of the field and large-scale studies have produced a rich set of established methods for molQTL analysis, with novel technologies opening up new areas of discovery. In this Primer, we discuss the study design, input data and statistical methods for molQTL mapping and outline the properties of the resulting data as well as popular downstream applications. We review the limitations and caveats of molQTL mapping as well as potential future approaches to tackle them. With technological development now providing many complementary methods for functional characterization of genetic variants, we anticipate that molQTLs will remain an important part of this toolkit as the only existing approach that can measure human variation in its native genomic, cellular and tissue context.

Fig. 1: Illustration of molQTLs.
Fig. 2: Overview of input data and processing steps for molQTL mapping and quality control.
Fig. 3: Transcriptome phenotypes that can be quantified from short-read RNA-seq data for molQTL mapping.
Fig. 4: QTL discovery power as a function of sample size.
Fig. 5: Visualization of molQTLs.
Fig. 6: Illustration of LD contamination in analysis of molQTL sharing.
Fig. 7: Questions in functional characterization of genetic variants, with examples of upcoming approaches for addressing these challenges.


    Article  Google Scholar 

  170. Richter, F. et al. ORE identifies extreme expression effects enriched for rare variants. Bioinformatics 35, 3906–3912 (2019).

    Article  Google Scholar 

  171. Jaganathan, K. et al. Predicting splicing from primary sequence with deep learning. Cell 176, 535–548.e24 (2019).

    Article  Google Scholar 

  172. Avsec, Ž. et al. Effective gene expression prediction from sequence by integrating long-range interactions. Nat. Methods 18, 1196–1203 (2021).

    Article  Google Scholar 

  173. Yao, D. W., O’Connor, L. J., Price, A. L. & Gusev, A. Quantifying genetic effects on disease mediated by assayed gene expression levels. Nat. Genet. 52, 626–633 (2020).

    Article  Google Scholar 

  174. Chun, S. et al. Limited statistical evidence for shared genetic effects of eQTLs and autoimmune-disease-associated loci in three major immune-cell types. Nat. Genet. 49, 600–605 (2017).

    Article  Google Scholar 

  175. Mountjoy, E. et al. An open approach to systematically prioritize causal variants and genes at all published human GWAS trait-associated loci. Nat. Genet. 53, 1527–1533 (2021).

    Article  Google Scholar 

  176. Franzén, O. et al. Cardiometabolic risk loci share downstream cis- and trans-gene regulation across tissues and diseases. Science 353, 827–830 (2016).

    Article  ADS  Google Scholar 

  177. Umans, B. D., Battle, A. & Gilad, Y. Where are the disease-associated eQTLs? Trends Genet. 37, 109–124 (2021).

    Article  Google Scholar 

  178. Wang, X. & Goldstein, D. B. Enhancer domains predict gene pathogenicity and inform gene discovery in complex disease. Am. J. Hum. Genet. 106, 215–233 (2020).

    Article  Google Scholar 

  179. Connally, N. et al. The missing link between genetic association and regulatory function. eLife 11, e74970 (2022).

    Article  Google Scholar 

  180. Dobbyn, A. et al. Landscape of conditional eQTL in dorsolateral prefrontal cortex and co-localization with schizophrenia GWAS. Am. J. Hum. Genet. 102, 1169–1184 (2018).

    Article  Google Scholar 

  181. Wu, Y. et al. Colocalization of GWAS and eQTL signals at loci with multiple signals identifies additional candidate genes for body fat distribution. Hum. Mol. Genet. 28, 4161–4172 (2019).

    Article  Google Scholar 

  182. Tewhey, R. et al. Direct identification of hundreds of expression-modulating variants using a multiplexed reporter assay. Cell 172, 1132–1134 (2018).

    Article  Google Scholar 

  183. Gasperini, M. et al. A genome-wide framework for mapping gene regulation via cellular genetic screens. Cell 176, 1516 (2019).

    Article  Google Scholar 

  184. Brandt, M., Gokden, A., Ziosi, M. & Lappalainen, T. A polyclonal allelic expression assay for detecting regulatory effects of transcript variants. Genome Med. 12, 79 (2020).

    Article  Google Scholar 

  185. ENCODE Project Consortium et al. Expanded encyclopaedias of DNA elements in the human and mouse genomes. Nature 583, 699–710 (2020).

    Article  ADS  Google Scholar 

  186. Chandra, V. et al. Promoter-interacting expression quantitative trait loci are enriched for functional genetic variants. Nat. Genet. 53, 110–119 (2021).

    Article  Google Scholar 

  187. Javierre, B. M. et al. Lineage-specific genome architecture links enhancers and non-coding disease variants to target gene promoters. Cell 167, 1369–1384.e19 (2016).

    Article  Google Scholar 

  188. Nasser, J. et al. Genome-wide enhancer maps link risk variants to disease genes. Nature 593, 238–243 (2021).

    Article  ADS  Google Scholar 

  189. Brandt, M. & Lappalainen, T. Snapshot: discovering genetic regulatory variants by QTL analysis. Cell 171, 980–980.e1 (2017).

    Article  Google Scholar 

  190. Brem, R. B., Yvert, G., Clinton, R. & Kruglyak, L. Genetic dissection of transcriptional regulation in budding yeast. Science 296, 752–755 (2002).

    Article  ADS  Google Scholar 

  191. Schadt, E. E. et al. Genetics of gene expression surveyed in maize, mouse and man. Nature 422, 297–302 (2003).

    Article  ADS  Google Scholar 

Download references

Acknowledgements

T.L. is supported by the National Institutes of Health (NIH) (grants R01GM122924, R01AG057422, R01MH106842 and U24HG012090) and by the European Research Council (grant 101043238). S.B.M. is supported by the National Institutes of Health (NIH) (grants R01AG066490, R01MH125244, U01HG012069 and U24HG010090). K.A. is supported by funding from the European Union’s Horizon 2020 research and innovation programme (grant no. 825775), Estonian Research Council (grant no. PSG415), Open Targets (grant nos. OTAR2067, OTAR2069 and OTAR2077) and Estonian Centre of Excellence in ICT Research (EXCITE), funded by the European Regional Development Fund.

Author information

Contributions

Introduction (T.L., F.A., K.A., Y.I.L., A.B., H.K.I. and S.B.M.); Experimentation (T.L., F.A., K.A., Y.I.L., A.B., H.K.I. and S.B.M.); Results (T.L., F.A., K.A., Y.I.L., A.B., H.K.I. and S.B.M.); Applications (T.L., F.A., K.A., Y.I.L., A.B., H.K.I. and S.B.M.); Reproducibility and data deposition (T.L., F.A., K.A., Y.I.L., A.B., H.K.I. and S.B.M.); Limitations and optimizations (T.L., F.A., K.A., Y.I.L., A.B., H.K.I. and S.B.M.); Outlook (T.L., F.A., K.A., Y.I.L., A.B., H.K.I. and S.B.M.); Overview of the Primer (T.L. and F.A.).

Corresponding authors

Correspondence to François Aguet or Tuuli Lappalainen.

Ethics declarations

Competing interests

F.A. is an employee of Illumina, Inc. and an inventor on a patent application related to TensorQTL. A.B. consults for Third Rock Ventures, Inc. and is a shareholder in Alphabet, Inc. T.L. advises GSK, Variant Bio and Goldfinch Bio, and has equity in Variant Bio. S.B.M. advises BioMarin, MyOme and Tenaya Therapeutics. The other authors declare no competing interests.

Peer review

Peer review information

Nature Reviews Methods Primers thanks Jan Korbel and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Related links

nf-core: https://nf-co.re/

Glossary

Collider bias

A statistical distortion that arises when an exposure and an outcome each influence a common third variable (the collider) and that variable is controlled for in the analysis.

Co-localization

Sharing of causal variants between two association signals in the same locus.

eGenes

Genes whose expression is affected by at least one significant expression quantitative trait locus (eQTL).

Federated analysis

Analysis of different data sets separately in a coordinated and uniform manner with subsequent meta-analysis to integrate the results.

Gaussian residuals

An assumption that the linear regression residuals are normally distributed.

Homoscedasticity

An assumption that the variance of the regression residuals is equal across all observations.

LD contamination

A given variant showing an association signal for a trait not because the variant itself is causally affecting this trait but because it is in linkage disequilibrium (LD) with a true causal variant.

Minor allele frequency (MAF)

The population frequency of the less common allele of a genetic variant.

Molecular traits

Phenotypes that are defined and measured at the molecular level.

Quantile normalization

A procedure applied to a data set such that the distribution of the values of each sample is the same.
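Two of the glossary entries above, minor allele frequency and quantile normalization, describe simple computations that can be sketched in code. The following is a minimal illustration in Python with NumPy, not the implementation used by any particular molQTL tool. The function names, the 0/1/2 alternate-allele coding, the features-by-samples layout and the choice of the mean of the sorted columns as the shared reference distribution are all assumptions made for this example.

```python
import numpy as np


def minor_allele_frequency(dosages):
    """MAF of one biallelic variant from alternate-allele counts (0, 1 or 2).

    Assumes diploid genotypes; NaN marks a missing call and is ignored.
    """
    dosages = np.asarray(dosages, dtype=float)
    called = dosages[~np.isnan(dosages)]
    # Each called individual carries two alleles.
    alt_freq = called.sum() / (2 * called.size)
    # The minor allele is whichever of the two alleles is less common.
    return min(alt_freq, 1 - alt_freq)


def quantile_normalize(values):
    """Force every sample (column) to share one distribution.

    values: features x samples array (hypothetical layout).
    The shared distribution is the mean of the sorted columns.
    Tied values within a column receive distinct, arbitrarily ordered ranks.
    """
    values = np.asarray(values, dtype=float)
    # Rank of each value within its own column (0 = smallest).
    ranks = np.argsort(np.argsort(values, axis=0), axis=0)
    # Reference distribution: average of the k-th smallest value across samples.
    reference = np.sort(values, axis=0).mean(axis=1)
    # Replace each value by the reference value at its rank.
    return reference[ranks]


if __name__ == "__main__":
    print(minor_allele_frequency([0, 1, 2, 2, np.nan]))
    expression = [[5, 4, 3], [2, 1, 4], [3, 4, 6], [4, 2, 8]]
    print(quantile_normalize(expression))
```

After normalization, sorting any column yields the same set of values, which is what the glossary definition means by each sample having the same distribution.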

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Aguet, F., Alasoo, K., Li, Y.I. et al. Molecular quantitative trait loci. Nat Rev Methods Primers 3, 4 (2023). https://doi.org/10.1038/s43586-022-00188-6

  • DOI: https://doi.org/10.1038/s43586-022-00188-6
