Abstract
Trait-associated genetic variants affect complex phenotypes primarily via regulatory mechanisms on the transcriptome. To investigate the genetics of gene expression, we performed cis- and trans-expression quantitative trait locus (eQTL) analyses using blood-derived expression from 31,684 individuals through the eQTLGen Consortium. We detected cis-eQTL for 88% of genes, and these were replicable in numerous tissues. Distal trans-eQTL (detected for 37% of 10,317 trait-associated variants tested) showed lower replication rates, partially due to low replication power and confounding by cell type composition. However, replication analyses in single-cell RNA-seq data prioritized intracellular trans-eQTL. Trans-eQTL exerted their effects via several mechanisms, primarily through regulation by transcription factors. Expression of 13% of the genes correlated with polygenic scores for 1,263 phenotypes, pinpointing potential drivers for those traits. In summary, this work represents a large eQTL resource, and its results serve as a starting point for in-depth interpretation of complex phenotypes.
Data availability
Primary genotype and gene expression data were analyzed by individual cohorts participating in the study, and our study analyzed summary statistics. Full summary statistics of eQTLGen cis-eQTL, trans-eQTL and eQTS meta-analyses are available on the eQTLGen website, http://www.eqtlgen.org, which was built using the MOLGENIS framework (Swertz et al., 2010). We also provide cis-eQTL files formatted for use in SMR, MAFs and replication statistics for cis-eQTL, trans-eQTL and eQTSs. Per-cohort summary statistics for discovery cohorts can be made available after approval of an analysis proposal in eQTLGen and with agreement of the cohort PIs; contact corresponding authors for further information. Trait-associated variants were collected from the EBI GWAS Catalog (https://www.ebi.ac.uk/gwas/, accessed on 21 November 2016), the NIH GWAS Catalog (now hosted by the EBI GWAS Catalog, https://www.ebi.ac.uk/gwas/) and Immunobase (http://www.immunobase.org, accessed on 26 April 2016; now hosted by Open Targets at https://genetics.opentargets.org/immunobase). Sources of numerous GWAS summary statistics used for eQTS analyses are outlined in the Supplementary Note and Supplementary Table 13. ExAC pLI scores used for Fig. 2 originate from ftp://ftp.broadinstitute.org/pub/ExAC_release/release0.3.1/functional_gene_constraint/fordist_cleaned_exac_r03_march16_z_pli_rec_null_data.txt. Genotype reference files used for harmonizing discovery datasets for meta-analysis originate from ftp://share.sph.umich.edu/1000genomes/fullProject/2012.03.14/GIANT.phase1_release_v3.20101123.snps_indels_svs.genotypes.refpanel.ALL.vcf.gz.tgz. The gene model used for gene annotations originates from Ensembl version 71 (ftp://ftp.ensembl.org/pub/release-71/gtf/homo_sapiens/Homo_sapiens.GRCh37.71.gtf.gz). FANTOM TF annotations used for eQTS enrichment analyses originate from http://fantom.gsc.riken.jp/5/sstar/Browse_Transcription_Factors_hg19. ChIP-seq data used for cis-eQTL overlap originate from https://www.chicp.org/. 
PPI data used for trans-eQTL mechanism enrichment analyses originate from https://www.intomics.com/inbio/map/api/get_data?file=InBio_Map_core_2016_09_12.tar.gz. Hi-C data used for trans-eQTL mechanism enrichment are deposited in the GEO (GM12878, GEO accession GSE63525). Curated gene sets used for enrichment analyses (gene ontology sets, ENCODE ChIP-X and CheA ChIP-X TF targets, TRANSFAC and JASPAR PWMs, ARCHS4 tissue expression, TargetScan miRNA target predictions, TarBase miRNA validated targets) were downloaded from the Enrichr website (https://maayanlab.cloud/Enrichr/#stats). Gene expression summaries and metadata from GTEx version 7 originate from https://gtexportal.org/home/. Gene expression summaries from BIOS are available in the BIOS Omics Atlas (http://bbmri.researchlumc.nl/atlas/#data). Per-cohort individual-level genotype and gene expression data are governed by respective biobanks and access can be requested according to procedures established by each biobank, with relevant restrictions applying as imposed by the IRB or local legislation. Data-access procedures established for the BIOS Consortium are available at https://www.bbmri.nl/acquisition-use-analyze/bios. Source data are provided with this paper.
Code availability
Individual cohorts participating in the study followed analysis plans as specified in our analysis cookbooks (https://github.com/molgenis/systemsgenetics/wiki/eQTL-mapping-analysis-cookbook-(eQTLGen), https://github.com/molgenis/systemsgenetics/wiki/eQTL-mapping-analysis-cookbook-for-RNA-seq-data, https://github.com/molgenis/systemsgenetics/wiki/QTL-mapping-analysis-cookbook-for-Affymetrix-expression-arrays) or with slight alterations as described in the Methods and the Supplementary Note. Tools and source codes used for genotype harmonization, identification of sample mix-ups, eQTL mapping, meta-analyses and calculation of PGSs are available at https://github.com/molgenis/systemsgenetics/. Tools used for primary analyses were written in Java (versions 6–8, https://www.java.com/). PLINK version 1.0.7 (https://zzz.bwh.harvard.edu/plink/) and version 1.90 (https://www.cog-genomics.org/plink/1.9/) were used for clumping and pruning. Downstream analyses and plots were performed and constructed with R (versions 3.4.4, 3.6.1 and 4.0.0, https://cran.r-project.org/) using packages data.table version 1.12 (https://cran.r-project.org/web/packages/data.table/), tidyverse version 1.2.1 (https://cran.r-project.org/web/packages/tidyverse/), broom version 0.5.1 (https://cran.r-project.org/web/packages/broom/), pheatmap version 1.0.12 (https://cran.r-project.org/web/packages/pheatmap/) and GeneOverlap version 1.18.0 (https://bioconductor.org/packages/release/bioc/html/GeneOverlap.html). Power analyses were conducted with the R package pwr version 1.3-0 (https://cran.r-project.org/web/packages/pwr/). scRNA-seq analyses were performed using the Cell Ranger Single Cell Software Suite version 3.0.2 (https://support.10xgenomics.com/single-cell-gene-expression/software/pipelines/latest/what-is-cell-ranger) and its implementation of STAR aligner. 
The ToppGene web tool (https://toppgene.cchmc.org/) was used for some interpretative enrichment analyses, as well as the GeneNetwork web tool (https://genenetwork.nl/). The Decon2 framework (https://github.com/molgenis/systemsgenetics/tree/master/Decon2) was used for predicting cell counts in BIOS data. We formatted our cis-eQTL into the BESD format using SMR (https://cnsgenomics.com/software/smr/#Overview).
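The power analyses mentioned above were run with the R package pwr. As a rough illustration of why smaller replication cohorts detect fewer effects, the sketch below approximates the power of a two-sided correlation test using the Fisher z-transformation. It is written in Python rather than R, the function name and example values are hypothetical, and the approximation is not claimed to match the exact computation performed by pwr.

```python
import math
from statistics import NormalDist


def correlation_power(r, n, alpha=0.05):
    """Approximate power to detect a true correlation r with n samples.

    Uses the Fisher z approximation: atanh(r_hat) is roughly normal
    with mean atanh(r) and standard error 1 / sqrt(n - 3).
    """
    if n <= 3:
        raise ValueError("n must exceed 3 for the Fisher z approximation")
    norm = NormalDist()
    z_crit = norm.inv_cdf(1 - alpha / 2)      # two-sided critical value
    shift = math.atanh(r) * math.sqrt(n - 3)  # expected test statistic
    # Probability that the statistic lands in either rejection tail.
    return norm.cdf(shift - z_crit) + norm.cdf(-shift - z_crit)


if __name__ == "__main__":
    # Hypothetical effect size; sample sizes chosen only for illustration.
    for n in (100, 1000, 30000):
        print(n, round(correlation_power(0.05, n), 3))
```

With a fixed small effect, the approximate power rises steeply with sample size, which is consistent with the lower replication rates reported for smaller datasets.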
References
Yao, D. W., O’Connor, L. J., Price, A. L. & Gusev, A. Quantifying genetic effects on disease mediated by assayed gene expression levels. Nat. Genet. 52, 626–633 (2020).
O’Connor, L. J. et al. Extreme polygenicity of complex traits is explained by negative selection. Am. J. Hum. Genet. 105, 456–476 (2019).
Zeng, J. et al. Signatures of negative selection in the genetic architecture of human complex traits. Nat. Genet. 50, 746–753 (2018).
Westra, H. J. et al. Systematic identification of trans eQTLs as putative drivers of known disease associations. Nat. Genet. 45, 1238–1243 (2013).
Kirsten, H. et al. Dissecting the genetics of the human transcriptome identifies novel trait-related trans-eQTLs and corroborates the regulatory relevance of non-protein coding loci. Hum. Mol. Genet. 24, 4746–4763 (2015).
Lloyd-Jones, L. R. et al. The genetic architecture of gene expression in peripheral blood. Am. J. Hum. Genet. 100, 228–237 (2017).
Jansen, R. et al. Conditional eQTL analysis reveals allelic heterogeneity of gene expression. Hum. Mol. Genet. 26, 1444–1451 (2017).
Joehanes, R. et al. Integrated genome-wide analysis of expression quantitative trait loci aids interpretation of genomic association studies. Genome Biol. 18, 16 (2017).
Yao, C. et al. Dynamic role of trans regulation of gene expression in relation to complex traits. Am. J. Hum. Genet. 100, 571–580 (2017).
Brynedal, B. et al. Large-scale trans-eQTLs affect hundreds of transcripts and mediate patterns of transcriptional co-regulation. Am. J. Hum. Genet. 100, 581–591 (2017).
Lewis, C. M. & Vassos, E. Prospects for using risk scores in polygenic medicine. Genome Med. 9, 96 (2017).
Natarajan, P. et al. Polygenic risk score identifies subgroup with higher burden of atherosclerosis and greater relative benefit from statin therapy in the primary prevention setting. Circulation 135, 2091–2101 (2017).
Boyle, E. A., Li, Y. I. & Pritchard, J. K. An expanded view of complex traits: from polygenic to omnigenic. Cell 169, 1177–1186 (2017).
Liu, X., Li, Y. I. & Pritchard, J. K. Trans effects on gene expression can drive omnigenic inheritance. Cell 177, 1022–1034 (2019).
Zhernakova, D. V. et al. Identification of context-dependent expression quantitative trait loci in whole blood. Nat. Genet. 49, 139–145 (2017).
Bonder, M. J. et al. Disease variants alter transcription factor levels and methylation of their binding sites. Nat. Genet. 49, 131–138 (2017).
Benjamini, Y. & Hochberg, Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. R. Stat. Soc. Ser. B Stat. Methodol. 57, 289–300 (1995).
Aguet, F. et al. Genetic effects on gene expression across human tissues. Nature 550, 204–213 (2017).
Lek, M. et al. Analysis of protein-coding genetic variation in 60,706 humans. Nature 536, 285–291 (2016).
Glassberg, E. C., Gao, Z., Harpak, A., Lan, X. & Pritchard, J. K. Evidence for weak selective constraint on human gene expression. Genetics 211, 757–772 (2019).
Wu, Y., Zheng, Z., Visscher, P. M. & Yang, J. Quantifying the mapping precision of genome-wide association studies using whole-genome sequencing data. Genome Biol. 18, 86 (2017).
Astle, W. J. et al. The allelic landscape of human blood cell trait variation and links to common complex disease. Cell 167, 1415–1429 (2016).
Melé, M. et al. The human transcriptome across tissues and individuals. Science 348, 660–665 (2015).
Qi, T. et al. Identifying gene targets for brain-related traits using transcriptomic and methylomic data from blood. Nat. Commun. 9, 2282 (2018).
Marbach, D. et al. Tissue-specific regulatory circuits reveal variable modular perturbations across complex diseases. Nat. Methods 13, 366–370 (2016).
Li, T. et al. A scored human protein–protein interaction network to catalyze genomic interpretation. Nat. Methods 14, 61–64 (2016).
Lamparter, D., Marbach, D., Rueedi, R., Kutalik, Z. & Bergmann, S. Fast and rigorous computation of gene and pathway scores from SNP-based summary statistics. PLoS Comput. Biol. 12, e1004714 (2016).
Rao, S. S. P. et al. A 3D map of the human genome at kilobase resolution reveals principles of chromatin looping. Cell 159, 1665–1680 (2014).
Nikpay, M. et al. A comprehensive 1000 Genomes-based genome-wide association meta-analysis of coronary artery disease. Nat. Genet. 47, 1121–1130 (2015).
Bentham, J. et al. Genetic association analyses implicate aberrant regulation of innate and adaptive immunity genes in the pathogenesis of systemic lupus erythematosus. Nat. Genet. 47, 1457–1464 (2015).
Davenport, E. E. et al. Discovering in vivo cytokine–eQTL interactions from a lupus clinical trial. Genome Biol. 19, 168 (2018).
McBride, J. M. et al. Safety and pharmacodynamics of rontalizumab in patients with systemic lupus erythematosus: results of a phase I, placebo-controlled, double-blind, dose-escalation study. Arthritis Rheum. 64, 3666–3676 (2012).
Yao, Y. et al. Development of potential pharmacodynamic and diagnostic markers for anti-IFN-α monoclonal antibody trials in systemic lupus erythematosus. Hum. Genomics Proteomics 2009, 374312 (2009).
Perry, J. R. B. et al. Parent-of-origin-specific allelic associations among 106 genomic loci for age at menarche. Nature 514, 92–97 (2014).
Lemaitre, R. N. et al. Genetic loci associated with plasma phospholipid n-3 fatty acids: a meta-analysis of genome-wide association studies from the CHARGE Consortium. PLoS Genet. 7, e1002193 (2011).
Liu, J. Z. et al. Association analyses identify 38 susceptibility loci for inflammatory bowel disease and highlight shared genetic risk across populations. Nat. Genet. 47, 979–986 (2015).
Gateva, V. et al. A large-scale replication study identifies TNIP1, PRDM1, JAZF1, UHRF1BP1 and IL10 as risk loci for systemic lupus erythematosus. Nat. Genet. 41, 1228–1233 (2009).
Moffatt, M. F. et al. A large-scale, consortium-based genomewide association study of asthma. N. Engl. J. Med. 363, 1211–1221 (2010).
Wood, A. R. et al. Defining the role of common variation in the genomic and biological architecture of adult human height. Nat. Genet. 46, 1173–1186 (2014).
Van Der Harst, P. et al. Seventy-five genetic loci influencing the human red blood cell. Nature 492, 369–375 (2012).
Teslovich, T. M. et al. Biological, clinical and population relevance of 95 loci for blood lipids. Nature 466, 707–713 (2010).
Willer, C. J. et al. Discovery and refinement of loci associated with lipid levels. Nat. Genet. 45, 1274–1285 (2013).
Wang, X. et al. Macrophage ABCA1 and ABCG1, but not SR-BI, promote macrophage reverse cholesterol transport in vivo. J. Clin. Invest. 117, 2216–2224 (2007).
Goldstein, J. L. & Brown, M. S. Binding and degradation of low density lipoproteins by cultured human fibroblasts. Comparison of cells from a normal subject and from a patient with homozygous familial hypercholesterolemia. J. Biol. Chem. 249, 5153–5162 (1974).
Singh, A. B., Kan, C. F. K., Shende, V., Dong, B. & Liu, J. A novel posttranscriptional mechanism for dietary cholesterol-mediated suppression of liver LDL receptor expression. J. Lipid Res. 55, 1397–1407 (2014).
Kettunen, J. et al. Genome-wide study for circulating metabolites identifies 62 loci and reveals novel systemic effects of LPA. Nat. Commun. 7, 11122 (2016).
Shin, S. Y. et al. An atlas of genetic influences on human blood metabolites. Nat. Genet. 46, 543–550 (2014).
El-Hattab, A. W. Serine biosynthesis and transport defects. Mol. Genet. Metab. 118, 153–159 (2016).
Leuzzi, V., Alessandrì, M. G., Casarano, M., Battini, R. & Cioni, G. Arginine and glycine stimulate creatine synthesis in creatine transporter 1-deficient lymphoblasts. Anal. Biochem. 375, 153–155 (2008).
Hart, C. E. et al. Phosphoserine aminotransferase deficiency: a novel disorder of the serine biosynthesis pathway. Am. J. Hum. Genet. 80, 931–937 (2007).
Klomp, L. W. J. et al. Molecular characterization of 3-phosphoglycerate dehydrogenase deficiency—a neurometabolic disorder associated with reduced l-serine biosynthesis. Am. J. Hum. Genet. 67, 1389–1399 (2000).
Shaheen, R. et al. Neu-Laxova syndrome, an inborn error of serine metabolism, is caused by mutations in PHGDH. Am. J. Hum. Genet. 94, 898–904 (2014).
Price, A. L. et al. Single-tissue and cross-tissue heritability of gene expression via identity-by-descent in related or unrelated individuals. PLoS Genet. 7, e1001317 (2011).
Mostafavi, H. et al. Variable prediction accuracy of polygenic scores within an ancestry group. eLife 9, e48376 (2020).
van der Wijst, M. et al. The single-cell eQTLGen consortium. eLife 9, e52155 (2020).
Wang, D. et al. Comprehensive functional genomic resource and integrative model for the human brain. Science 362, eaat8464 (2018).
Feingold, E. A. et al. The ENCODE (ENCyclopedia of DNA Elements) Project. Science 306, 636–640 (2004).
Myers, R. M. et al. A user’s guide to the Encyclopedia of DNA Elements (ENCODE). PLoS Biol. 9, e1001046 (2011).
Lachmann, A. et al. ChEA: transcription factor regulation inferred from integrating genome-wide ChIP-X experiments. Bioinformatics 26, 2438–2444 (2010).
Deelen, P. et al. Genotype Harmonizer: automatic strand alignment and format conversion for genotype data integration. BMC Res. Notes 7, 901 (2014).
Rumble, S. M. et al. SHRiMP: accurate mapping of short color-space reads. PLoS Comput. Biol. 5, e1000386 (2009).
Purcell, S. et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am. J. Hum. Genet. 81, 559–575 (2007).
Westra, H. J. et al. MixupMapper: correcting sample mix-ups in genome-wide datasets increases power to detect small genetic effects. Bioinformatics 27, 2104–2111 (2011).
Robinson, M. D. & Oshlack, A. A scaling normalization method for differential expression analysis of RNA-seq data. Genome Biol. 11, R25 (2010).
Zerbino, D. R. et al. Ensembl 2018. Nucleic Acids Res. 46, D754–D761 (2018).
Zaykin, D. V. Optimally weighted Z-test is a powerful method for combining probabilities in meta-analysis. J. Evol. Biol. 24, 1836–1841 (2011).
MacArthur, J. et al. The new NHGRI-EBI Catalog of published genome-wide association studies (GWAS Catalog). Nucleic Acids Res. 45, D896–D901 (2017).
Dobin, A. et al. STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29, 15–21 (2013).
Zhu, Z. et al. Integration of summary data from GWAS and eQTL studies predicts complex trait gene targets. Nat. Genet. 48, 481–487 (2016).
Chen, E. Y. et al. Enrichr: interactive and collaborative HTML5 gene list enrichment analysis tool. BMC Bioinformatics 14, 128 (2013).
Kuleshov, M. V. et al. Enrichr: a comprehensive gene set enrichment analysis web server 2016 update. Nucleic Acids Res. 44, W90–W97 (2016).
Lachmann, A. et al. Massive mining of publicly available RNA-seq data from human and mouse. Nat. Commun. 9, 1366 (2018).
Yu, G., Wang, L. G., Han, Y. & He, Q. Y. clusterProfiler: an R package for comparing biological themes among gene clusters. OMICS 16, 284–287 (2012).
Javierre, B. M. et al. Lineage-specific genome architecture links enhancers and non-coding disease variants to target gene promoters. Cell 167, 1369–1384 (2016).
Schofield, E. C. et al. CHiCP: a web-based tool for the integrative and interactive visualization of promoter capture Hi-C datasets. Bioinformatics 32, 2511–2513 (2016).
Swertz, M. A. et al. The MOLGENIS toolkit: rapid prototyping of biosoftware at the push of a button. BMC Bioinformatics 11, S12 (2010).
Acknowledgements
The cohorts participating in this study list their acknowledgements in the cohort-specific sections of the Supplementary Note. This work is supported by a grant from the European Research Council (ERC, ERC Starting Grant agreement number 637640 ImmRisk), a VIDI grant (917.14.374) and a VICI grant from the Netherlands Organisation for Scientific Research (NWO) to L.F. This work has been supported by the European Regional Development Fund and the program Mobilitas Pluss (MOBTP108) to U.Võsa. The project was supported by the ‘De Drie Lichten’ foundation in the Netherlands with a grant to A.C. M.G.N. is supported by ZonMw grants 849200011 and 531003014 from the Netherlands Organisation for Health Research and Development, a VENI grant from the NWO (VI.Veni.191G.030) and a Jacobs Foundation research fellowship. H.Y. is funded by a Diabetes UK RD Lawrence fellowship (17/0005594). This project received funding from the ERC under the European Union’s Horizon 2020 research and innovation program (grant agreement no. 772376 (EScORIAL)) to J.H.V. T.E. and A.K. were supported by the Estonian Research Council grant PRG (PRG1291). A.Battle was supported by NIH grant R01MH109905, NIH grant R01HG008150 (NHGRI; Non-Coding Variants Program) and NIH grant R01MH101814 (NIH Common Fund; GTEx Program). M.G.P.v.d.W. was funded by the Nederlandse Organisatie voor Wetenschappelijk onderzoek, NWO-Veni 192.029. This work was supported by NIH grants R21ES024834 (B.Pierce), R01ES020506 (B.Pierce), R01ES023834 (B.Pierce), R35ES028379 (B.Pierce) and R01CA107431 (H.A.). This work was supported by the Sigrid Juselius Foundation (J.Kettunen) and funds from the Academy of Finland (grant numbers 297338 and 307247) (J.Kettunen) and the Novo Nordisk Foundation (grant number NNF17OC0026062) (J.Kettunen). S.Ripatti was supported by the Academy of Finland Centre of Excellence in Complex Disease Genetics (grant no. 312062). M.G. was supported by EU Horizon 2020 (grant 733100 for SYSCID) and a grant from the Excellence of Science (FNRS and FWO) (grant no. 30770923). We acknowledge support from the BBMRI-NL (Biobanking and Biomolecular Resources Research Infrastructure 184.021.007 and 184.033.111), Spinozapremie (NWO 56-464-14192), the ERC (ERC Advanced 230374) and the KNAW Academy Professor Award (PAH/6635) to D.I.B. G.H. works in a unit that receives funding from the UK MRC (MC_UU_12013/1&2&5) and the University of Bristol. S.B. was supported by the Swiss National Science Foundation (310030-152724). B.M.P. was supported by CHARGE infrastructure grant number HJ105756 for the HVH cohort. This work was supported by the German Federal Ministry of Education and Research (BMBF) within the framework of the e:Med research and funding concept (grant 01ZX1906B) and by LIFE (Leipzig Research Center for Civilization Diseases), Universität Leipzig (which is funded by the European Union, by the European Regional Development Fund and by the Free State of Saxony within the framework of the excellence initiative to H.K. and M.Scholz). We thank the UMCG Genomics Coordination Center, the MOLGENIS team, the UG Center for Information Technology and the UMCG research IT program and their sponsors, in particular the BBMRI-NL for data storage, high-performance computing and web hosting infrastructure. The BBMRI-NL is a research infrastructure financed by the NWO (grant number 184.033.111). We thank K. McIntyre for editing the manuscript text.
Author information
Contributions
U.Võsa and A.C. coordinated consortium analyses, ran meta-analyses, interpreted data, performed downstream analyses and drafted and revised the manuscript. H.-J.W., M.J.B. and P.D. developed software used in the analyses, performed downstream analyses and participated in manuscript writing and revisions. L.F. and T.E. conceived the study. L.F. supervised the project, ran downstream analyses and participated in manuscript writing and revisions. B.Z., H.K., A.S., S.K., N.P., I.A., M.-J.F., M.A., M.W.C., R.J., I.S., L.T., A.Teumer, K.S., J.V., H.Y., V.K., A.K., J.Kettunen, J.P. and B.L. ran consortium analyses in their respective cohorts. A.S., R.K., S.K., G.H., R.S. and A.Brown ran replication analyses in their respective cohorts. A.A., G.W.M., S.Ripatti, M.P., E.D., S.B., T.F., J.v.M., H.P., H.A., B.Pierce, T.L., D.I.B., B.M.P., S.A.G., P.A., L.M., W.H.O., K.D., O.S., A.Battle, M.Scholz, G.G., T.E., W.A., F.B., J.D., M.E., B.P.F., M.G., B.T.H., M.K., Y.K., J.C.K., P.K., K.K., M.L., U.M.M., H.M., Y.M., M.M.-N., M.Nauck, M.G.N., B.W.J.H.P., O.T.R., O.Rotzschke, E.P.S., C.D.A.S., M.Stumvoll, P.S., P.A.C.’t.H., J.T., A.Tönjes, J.v.D., M.v.I., J.H.V., U.Völker and C.W. provided data used in the study. B.Z., H.K., Z.K., J.Kronberg, S.Rüeger, E.P., S.L., J.Y., F.Z., P.M.V., J.P., T.Q., R.W., H.K., M.Scholz and G.G. participated in downstream analyses. S.Y., H.B., R.O., D.H.d.V. and M.G.P.v.d.W. ran replication analyses in scRNA-seq cohorts. A.W.H., J.A.H. and J.P. generated scRNA-seq replication data. H.K., A.Teumer, M.G., M.G.N., J.P., Z.K., J.Y., P.M.V., M.Scholz, G.G., J.P., S.A.G. and P.A.C.’t.H. contributed to writing and revising the manuscript. J.K.P. provided Supplementary Equations for interpretation of results. H.B. and M.Swertz created the website to host results. H.-J.W., M.J.B. and P.D. contributed equally to this work. The BIOS Consortium contributed the subset of whole-blood data used in discovery analyses. 
The i2QTL Consortium contributed trans-eQTL and eQTS replication analyses of iPSCs.
Ethics declarations
Competing interests
B.M.P. serves on the Steering Committee for the Yale Open Data Access Project funded by Johnson & Johnson. This activity is unrelated to this work. The rest of the authors declare no competing interests.
Additional information
Peer review information: Nature Genetics thanks Eric Gamazon, Douglas Yao, and Vijay Sankaran for their contribution to the peer review of this work.
Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data
Extended Data Fig. 1 Cis-eQTL replication in GTEx v7 tissues.
For this analysis, the most significant cis-eQTL SNP for each gene was tested in the available post-mortem tissues in GTEx v7. Since GTEx was part of our discovery meta-analysis, the cis-eQTL discovery analysis was repeated while excluding GTEx whole blood, identifying 16,963 lead cis-eQTL effects that were subsequently tested for replication in each GTEx tissue. Left: while the majority of the 16,963 cis-eQTL were tested in the GTEx replication study, a relatively small fraction had an FDR < 0.05. Middle: of those cis-eQTL showing a replication FDR < 0.05, allelic directions were highly consistent with the discovery meta-analysis. Right: sample sizes of GTEx tissues. Limited replication rates at FDR < 0.05 were probably due to the relatively small sample size per GTEx tissue.
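The two quantities summarized in this figure, the fraction of effects replicating at FDR < 0.05 and the allelic concordance among them, can be sketched as follows. This is a minimal illustration only: it assumes the Benjamini-Hochberg procedure (cited in the reference list) for the FDR step, which may differ from the procedure actually used for the replication analysis, and the function names and inputs are hypothetical.

```python
def bh_adjust(p_values):
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def replication_summary(p_values, discovery_z, replication_z, threshold=0.05):
    """Return (fraction replicated, allelic concordance among replicated).

    discovery_z and replication_z are assumed to be Z-scores aligned to the
    same assessed allele, so a positive product means the same direction.
    """
    adj = bh_adjust(p_values)
    sig = [i for i, q in enumerate(adj) if q < threshold]
    same = sum(1 for i in sig if discovery_z[i] * replication_z[i] > 0)
    frac_sig = len(sig) / len(p_values) if p_values else 0.0
    concordance = same / len(sig) if sig else float("nan")
    return frac_sig, concordance
```

Concordance is computed only among effects passing the FDR threshold, mirroring the middle panel, which considers only cis-eQTL with a replication FDR < 0.05.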
Extended Data Fig. 2 Dot-plot showing the locations of the trans-eQTL effects identified in discovery meta-analysis and their association P-values (-log10 scale).
Dot-plot showing the locations of the trans-eQTL effects identified in discovery meta-analysis (weighted Z-score meta-analysis on Spearman correlation) and their respective two-sided association P-values in -log10 scale. SNP positions are shown on the x-axis and gene locations on the y-axis; each dot shows one significant trans-eQTL effect (FDR < 0.05). Vertical bands appear where a single genomic locus affects many genes in trans, while horizontal bands illustrate genes affected by many SNPs.
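For readers unfamiliar with the weighted Z-score approach named in this caption, the sketch below combines per-cohort Z-scores into a single meta-analysis Z-score and converts it to a two-sided P value on the -log10 scale. Weighting each cohort by the square root of its sample size is a common choice and is an assumption here, as are the function names; the conversion of per-cohort Spearman correlations to Z-scores is not shown.

```python
import math


def weighted_z_meta(z_scores, sample_sizes):
    """Combine per-cohort Z-scores, weighting each by sqrt(sample size)."""
    if not z_scores or len(z_scores) != len(sample_sizes):
        raise ValueError("need one sample size per Z-score")
    weights = [math.sqrt(n) for n in sample_sizes]
    numerator = sum(w * z for w, z in zip(weights, z_scores))
    denominator = math.sqrt(sum(w * w for w in weights))
    return numerator / denominator


def two_sided_p(z):
    """Two-sided P value for a standard normal Z-score."""
    # erfc stays accurate in the far tail, where 1 - cdf would round to 0.
    return math.erfc(abs(z) / math.sqrt(2))


def neg_log10_p(z):
    """-log10 of the two-sided P value, as plotted in the figure."""
    p = two_sided_p(z)
    return -math.log10(p) if p > 0 else float("inf")
```

Because the weights are normalized by the square root of their summed squares, the combined statistic is again standard normal under the null hypothesis, so the same P value conversion applies to per-cohort and meta-analysis Z-scores.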
Extended Data Fig. 3 Overview of GWAS trait classes in eQTS analysis.
Overview of tested and significant (FDR < 0.05) GWAS trait classes in eQTS analysis.
Supplementary information
Supplementary Information
Supplementary Figs. 1–20, Note and Equations.
Supplementary Tables
Supplementary Tables 1–33.
Supplementary Data 1
Cis-eQTL lead SNP replication in the GTEx project.
Supplementary Data 2
Significant trans-eQTL effects, replication results in purified cell types and cell lines.
Supplementary Data 3
Trans-eQTL replication results in GTEx tissues.
Supplementary Data 4
Putative mechanisms of trans-eQTL.
Supplementary Data 5
Results of eQTS analysis, replications in cell lines.
Supplementary Data 6
eQTS replication analyses in the GTEx European subset of samples.
Supplementary Data 7
eQTS replication analyses in all GTEx samples.
Supplementary Data 8
Effect of cell type composition on trans-eQTL.
Supplementary Data 9
Results of cell type interaction analyses for trans-eQTL.
Source data
Source Data Fig. 2
Statistical source data for Fig. 2a–c.
Source Data Fig. 3
Statistical source data for Fig. 3a, right.
About this article
Cite this article
Võsa, U., Claringbould, A., Westra, HJ. et al. Large-scale cis- and trans-eQTL analyses identify thousands of genetic loci and polygenic scores that regulate blood gene expression. Nat Genet 53, 1300–1310 (2021). https://doi.org/10.1038/s41588-021-00913-z
DOI: https://doi.org/10.1038/s41588-021-00913-z