Large-scale cis- and trans-eQTL analyses identify thousands of genetic loci and polygenic scores that regulate blood gene expression

Abstract

Trait-associated genetic variants affect complex phenotypes primarily via regulatory mechanisms on the transcriptome. To investigate the genetics of gene expression, we performed cis- and trans-expression quantitative trait locus (eQTL) analyses using blood-derived expression from 31,684 individuals through the eQTLGen Consortium. We detected cis-eQTL for 88% of genes, and these were replicable in numerous tissues. Distal trans-eQTL (detected for 37% of 10,317 trait-associated variants tested) showed lower replication rates, partially due to low replication power and confounding by cell type composition. However, replication analyses in single-cell RNA-seq data prioritized intracellular trans-eQTL. Trans-eQTL exerted their effects via several mechanisms, primarily through regulation by transcription factors. Expression of 13% of the genes correlated with polygenic scores for 1,263 phenotypes, pinpointing potential drivers for those traits. In summary, this work represents a large eQTL resource, and its results serve as a starting point for in-depth interpretation of complex phenotypes.

Fig. 1: Overview of the study.
Fig. 2: Results of cis- and trans-eQTL analyses.
Fig. 3: Trans-eQTL replication in scRNA-seq data and mechanisms leading to trans-eQTL.
Fig. 4: The REST locus regulates the expression of 88 trans-eQTL genes.
Fig. 5: SNPs associated with SLE converge on a shared cluster of interferon-response genes.
Fig. 6: eQTS analyses.

Data availability

Primary genotype and gene expression data were analyzed by individual cohorts participating in the study, and our study analyzed summary statistics. Full summary statistics of eQTLGen cis-eQTL, trans-eQTL and eQTS meta-analyses are available on the eQTLGen website, http://www.eqtlgen.org, which was built using the MOLGENIS framework (ref. 76). We also provide cis-eQTL files formatted for use in SMR, MAFs and replication statistics for cis-eQTL, trans-eQTL and eQTSs. Per-cohort summary statistics for discovery cohorts can be made available after approval of an analysis proposal in eQTLGen and with agreement of the cohort PIs; contact corresponding authors for further information. Trait-associated variants were collected from the EBI GWAS Catalog (https://www.ebi.ac.uk/gwas/, accessed on 21 November 2016), the NIH GWAS Catalog (now hosted by the EBI GWAS Catalog, https://www.ebi.ac.uk/gwas/) and Immunobase (http://www.immunobase.org, accessed 26 April 2016; now hosted by Open Targets at https://genetics.opentargets.org/immunobase). Sources of numerous GWAS summary statistics used for eQTS analyses are outlined in the Supplementary Note and Supplementary Table 13. ExAC pLI scores used for Fig. 2 originate from ftp://ftp.broadinstitute.org/pub/ExAC_release/release0.3.1/functional_gene_constraint/fordist_cleaned_exac_r03_march16_z_pli_rec_null_data.txt. Genotype reference files used for harmonizing discovery datasets for meta-analysis originate from ftp://share.sph.umich.edu/1000genomes/fullProject/2012.03.14/GIANT.phase1_release_v3.20101123.snps_indels_svs.genotypes.refpanel.ALL.vcf.gz.tgz. The gene model used for gene annotations originates from Ensembl version 71 (ftp://ftp.ensembl.org/pub/release-71/gtf/homo_sapiens/Homo_sapiens.GRCh37.71.gtf.gz). FANTOM TF annotations used for eQTS enrichment analyses originate from http://fantom.gsc.riken.jp/5/sstar/Browse_Transcription_Factors_hg19. ChIP-seq data used for cis-eQTL overlap originate from https://www.chicp.org/. PPI data used for trans-eQTL mechanism enrichment analyses originate from https://www.intomics.com/inbio/map/api/get_data?file=InBio_Map_core_2016_09_12.tar.gz. Hi-C data used for trans-eQTL mechanism enrichment are deposited in the GEO (GM12878, GEO accession GSE63525). Curated gene sets used for enrichment analyses (gene ontology sets, ENCODE ChIP-X and ChEA ChIP-X TF targets, TRANSFAC and JASPAR PWMs, ARCHS4 tissue expression, TargetScan miRNA target predictions, TarBase miRNA validated targets) were downloaded from the Enrichr website (https://maayanlab.cloud/Enrichr/#stats). Gene expression summaries and metadata from GTEx version 7 originate from https://gtexportal.org/home/. Gene expression summaries from BIOS are available in the BIOS Omics Atlas (http://bbmri.researchlumc.nl/atlas/#data). Per-cohort individual-level genotype and gene expression data are governed by respective biobanks and access can be requested according to procedures established by each biobank, with relevant restrictions applying as imposed by the IRB or local legislation. Data-access procedures established for the BIOS Consortium are available at https://www.bbmri.nl/acquisition-use-analyze/bios. Source data are provided with this paper.

Code availability

Individual cohorts participating in the study followed analysis plans as specified in our analysis cookbooks (https://github.com/molgenis/systemsgenetics/wiki/eQTL-mapping-analysis-cookbook-(eQTLGen), https://github.com/molgenis/systemsgenetics/wiki/eQTL-mapping-analysis-cookbook-for-RNA-seq-data, https://github.com/molgenis/systemsgenetics/wiki/QTL-mapping-analysis-cookbook-for-Affymetrix-expression-arrays) or with slight alterations as described in the Methods and the Supplementary Note. Tools and source code used for genotype harmonization, identification of sample mix-ups, eQTL mapping, meta-analyses and calculation of PGSs are available at https://github.com/molgenis/systemsgenetics/. Tools used for primary analyses were written in Java (versions 6–8, https://www.java.com/). PLINK version 1.0.7 (https://zzz.bwh.harvard.edu/plink/) and version 1.90 (https://www.cog-genomics.org/plink/1.9/) were used for clumping and pruning. Downstream analyses were performed and plots were constructed in R (versions 3.4.4, 3.6.1 and 4.0.0, https://cran.r-project.org/) using packages data.table version 1.12 (https://cran.r-project.org/web/packages/data.table/), tidyverse version 1.2.1 (https://cran.r-project.org/web/packages/tidyverse/), broom version 0.5.1 (https://cran.r-project.org/web/packages/broom/), pheatmap version 1.0.12 (https://cran.r-project.org/web/packages/pheatmap/) and GeneOverlap version 1.18.0 (https://bioconductor.org/packages/release/bioc/html/GeneOverlap.html). Power analyses were conducted with the R package pwr version 1.3-0 (https://cran.r-project.org/web/packages/pwr/). scRNA-seq analyses were performed using the Cell Ranger Single Cell Software Suite version 3.0.2 (https://support.10xgenomics.com/single-cell-gene-expression/software/pipelines/latest/what-is-cell-ranger) and its implementation of STAR aligner. The ToppGene web tool (https://toppgene.cchmc.org/) and the GeneNetwork web tool (https://genenetwork.nl/) were used for some interpretative enrichment analyses. The Decon2 framework (https://github.com/molgenis/systemsgenetics/tree/master/Decon2) was used for predicting cell counts in BIOS data. We formatted our cis-eQTL into the BESD format using SMR (https://cnsgenomics.com/software/smr/#Overview).

References

  1. Yao, D. W., O’Connor, L. J., Price, A. L. & Gusev, A. Quantifying genetic effects on disease mediated by assayed gene expression levels. Nat. Genet. 52, 626–633 (2020).

  2. O’Connor, L. J. et al. Extreme polygenicity of complex traits is explained by negative selection. Am. J. Hum. Genet. 105, 456–476 (2019).

  3. Zeng, J. et al. Signatures of negative selection in the genetic architecture of human complex traits. Nat. Genet. 50, 746–753 (2018).

  4. Westra, H. J. et al. Systematic identification of trans eQTLs as putative drivers of known disease associations. Nat. Genet. 45, 1238–1243 (2013).

  5. Kirsten, H. et al. Dissecting the genetics of the human transcriptome identifies novel trait-related trans-eQTLs and corroborates the regulatory relevance of non-protein coding loci. Hum. Mol. Genet. 24, 4746–4763 (2015).

  6. Lloyd-Jones, L. R. et al. The genetic architecture of gene expression in peripheral blood. Am. J. Hum. Genet. 100, 228–237 (2017).

  7. Jansen, R. et al. Conditional eQTL analysis reveals allelic heterogeneity of gene expression. Hum. Mol. Genet. 26, 1444–1451 (2017).

  8. Joehanes, R. et al. Integrated genome-wide analysis of expression quantitative trait loci aids interpretation of genomic association studies. Genome Biol. 18, 16 (2017).

  9. Yao, C. et al. Dynamic role of trans regulation of gene expression in relation to complex traits. Am. J. Hum. Genet. 100, 571–580 (2017).

  10. Brynedal, B. et al. Large-scale trans-eQTLs affect hundreds of transcripts and mediate patterns of transcriptional co-regulation. Am. J. Hum. Genet. 100, 581–591 (2017).

  11. Lewis, C. M. & Vassos, E. Prospects for using risk scores in polygenic medicine. Genome Med. 9, 96 (2017).

  12. Natarajan, P. et al. Polygenic risk score identifies subgroup with higher burden of atherosclerosis and greater relative benefit from statin therapy in the primary prevention setting. Circulation 135, 2091–2101 (2017).

  13. Boyle, E. A., Li, Y. I. & Pritchard, J. K. An expanded view of complex traits: from polygenic to omnigenic. Cell 169, 1177–1186 (2017).

  14. Liu, X., Li, Y. I. & Pritchard, J. K. Trans effects on gene expression can drive omnigenic inheritance. Cell 177, 1022–1034 (2019).

  15. Zhernakova, D. V. et al. Identification of context-dependent expression quantitative trait loci in whole blood. Nat. Genet. 49, 139–145 (2017).

  16. Bonder, M. J. et al. Disease variants alter transcription factor levels and methylation of their binding sites. Nat. Genet. 49, 131–138 (2017).

  17. Benjamini, Y. & Hochberg, Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. R. Stat. Soc. Ser. B Stat. Methodol. 57, 289–300 (1995).

  18. Aguet, F. et al. Genetic effects on gene expression across human tissues. Nature 550, 204–213 (2017).

  19. Lek, M. et al. Analysis of protein-coding genetic variation in 60,706 humans. Nature 536, 285–291 (2016).

  20. Glassberg, E. C., Gao, Z., Harpak, A., Lan, X. & Pritchard, J. K. Evidence for weak selective constraint on human gene expression. Genetics 211, 757–772 (2019).

  21. Wu, Y., Zheng, Z., Visscher, P. M. & Yang, J. Quantifying the mapping precision of genome-wide association studies using whole-genome sequencing data. Genome Biol. 18, 86 (2017).

  22. Astle, W. J. et al. The allelic landscape of human blood cell trait variation and links to common complex disease. Cell 167, 1415–1429 (2016).

  23. Melé, M. et al. The human transcriptome across tissues and individuals. Science 348, 660–665 (2015).

  24. Qi, T. et al. Identifying gene targets for brain-related traits using transcriptomic and methylomic data from blood. Nat. Commun. 9, 2282 (2018).

  25. Marbach, D. et al. Tissue-specific regulatory circuits reveal variable modular perturbations across complex diseases. Nat. Methods 13, 366–370 (2016).

  26. Li, T. et al. A scored human protein–protein interaction network to catalyze genomic interpretation. Nat. Methods 14, 61–64 (2016).

  27. Lamparter, D., Marbach, D., Rueedi, R., Kutalik, Z. & Bergmann, S. Fast and rigorous computation of gene and pathway scores from SNP-based summary statistics. PLoS Comput. Biol. 12, e1004714 (2016).

  28. Rao, S. S. P. et al. A 3D map of the human genome at kilobase resolution reveals principles of chromatin looping. Cell 159, 1665–1680 (2014).

  29. Nikpay, M. et al. A comprehensive 1000 Genomes-based genome-wide association meta-analysis of coronary artery disease. Nat. Genet. 47, 1121–1130 (2015).

  30. Bentham, J. et al. Genetic association analyses implicate aberrant regulation of innate and adaptive immunity genes in the pathogenesis of systemic lupus erythematosus. Nat. Genet. 47, 1457–1464 (2015).

  31. Davenport, E. E. et al. Discovering in vivo cytokine–eQTL interactions from a lupus clinical trial. Genome Biol. 19, 168 (2018).

  32. McBride, J. M. et al. Safety and pharmacodynamics of rontalizumab in patients with systemic lupus erythematosus: results of a phase I, placebo-controlled, double-blind, dose-escalation study. Arthritis Rheum. 64, 3666–3676 (2012).

  33. Yao, Y. et al. Development of potential pharmacodynamic and diagnostic markers for anti-IFN-α monoclonal antibody trials in systemic lupus erythematosus. Hum. Genomics Proteomics 2009, 374312 (2009).

  34. Perry, J. R. B. et al. Parent-of-origin-specific allelic associations among 106 genomic loci for age at menarche. Nature 514, 92–97 (2014).

  35. Lemaitre, R. N. et al. Genetic loci associated with plasma phospholipid n-3 fatty acids: a meta-analysis of genome-wide association studies from the CHARGE Consortium. PLoS Genet. 7, e1002193 (2011).

  36. Liu, J. Z. et al. Association analyses identify 38 susceptibility loci for inflammatory bowel disease and highlight shared genetic risk across populations. Nat. Genet. 47, 979–986 (2015).

  37. Gateva, V. et al. A large-scale replication study identifies TNIP1, PRDM1, JAZF1, UHRF1BP1 and IL10 as risk loci for systemic lupus erythematosus. Nat. Genet. 41, 1228–1233 (2009).

  38. Moffatt, M. F. et al. A large-scale, consortium-based genomewide association study of asthma. N. Engl. J. Med. 363, 1211–1221 (2010).

  39. Wood, A. R. et al. Defining the role of common variation in the genomic and biological architecture of adult human height. Nat. Genet. 46, 1173–1186 (2014).

  40. Van Der Harst, P. et al. Seventy-five genetic loci influencing the human red blood cell. Nature 492, 369–375 (2012).

  41. Teslovich, T. M. et al. Biological, clinical and population relevance of 95 loci for blood lipids. Nature 466, 707–713 (2010).

  42. Willer, C. J. et al. Discovery and refinement of loci associated with lipid levels. Nat. Genet. 45, 1274–1285 (2013).

  43. Wang, X. et al. Macrophage ABCA1 and ABCG1, but not SR-BI, promote macrophage reverse cholesterol transport in vivo. J. Clin. Invest. 117, 2216–2224 (2007).

  44. Goldstein, J. L. & Brown, M. S. Binding and degradation of low density lipoproteins by cultured human fibroblasts. Comparison of cells from a normal subject and from a patient with homozygous familial hypercholesterolemia. J. Biol. Chem. 249, 5153–5162 (1974).

  45. Singh, A. B., Kan, C. F. K., Shende, V., Dong, B. & Liu, J. A novel posttranscriptional mechanism for dietary cholesterol-mediated suppression of liver LDL receptor expression. J. Lipid Res. 55, 1397–1407 (2014).

  46. Kettunen, J. et al. Genome-wide study for circulating metabolites identifies 62 loci and reveals novel systemic effects of LPA. Nat. Commun. 7, 11122 (2016).

  47. Shin, S. Y. et al. An atlas of genetic influences on human blood metabolites. Nat. Genet. 46, 543–550 (2014).

  48. El-Hattab, A. W. Serine biosynthesis and transport defects. Mol. Genet. Metab. 118, 153–159 (2016).

  49. Leuzzi, V., Alessandrì, M. G., Casarano, M., Battini, R. & Cioni, G. Arginine and glycine stimulate creatine synthesis in creatine transporter 1-deficient lymphoblasts. Anal. Biochem. 375, 153–155 (2008).

  50. Hart, C. E. et al. Phosphoserine aminotransferase deficiency: a novel disorder of the serine biosynthesis pathway. Am. J. Hum. Genet. 80, 931–937 (2007).

  51. Klomp, L. W. J. et al. Molecular characterization of 3-phosphoglycerate dehydrogenase deficiency—a neurometabolic disorder associated with reduced l-serine biosynthesis. Am. J. Hum. Genet. 67, 1389–1399 (2000).

  52. Shaheen, R. et al. Neu-Laxova syndrome, an inborn error of serine metabolism, is caused by mutations in PHGDH. Am. J. Hum. Genet. 94, 898–904 (2014).

  53. Price, A. L. et al. Single-tissue and cross-tissue heritability of gene expression via identity-by-descent in related or unrelated individuals. PLoS Genet. 7, e1001317 (2011).

  54. Mostafavi, H. et al. Variable prediction accuracy of polygenic scores within an ancestry group. eLife 9, e48376 (2020).

  55. van der Wijst, M. et al. The single-cell eQTLGen consortium. eLife 9, e52155 (2020).

  56. Wang, D. et al. Comprehensive functional genomic resource and integrative model for the human brain. Science 362, eaat8464 (2018).

  57. Feingold, E. A. et al. The ENCODE (ENCyclopedia of DNA Elements) Project. Science 306, 636–640 (2004).

  58. Myers, R. M. et al. A user’s guide to the Encyclopedia of DNA Elements (ENCODE). PLoS Biol. 9, e1001046 (2011).

  59. Lachmann, A. et al. ChEA: transcription factor regulation inferred from integrating genome-wide ChIP-X experiments. Bioinformatics 26, 2438–2444 (2010).

  60. Deelen, P. et al. Genotype Harmonizer: automatic strand alignment and format conversion for genotype data integration. BMC Res. Notes 7, 901 (2014).

  61. Rumble, S. M. et al. SHRiMP: accurate mapping of short color-space reads. PLoS Comput. Biol. 5, e1000386 (2009).

  62. Purcell, S. et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am. J. Hum. Genet. 81, 559–575 (2007).

  63. Westra, H. J. et al. MixupMapper: correcting sample mix-ups in genome-wide datasets increases power to detect small genetic effects. Bioinformatics 27, 2104–2111 (2011).

  64. Robinson, M. D. & Oshlack, A. A scaling normalization method for differential expression analysis of RNA-seq data. Genome Biol. 11, R25 (2010).

  65. Zerbino, D. R. et al. Ensembl 2018. Nucleic Acids Res. 46, D754–D761 (2018).

  66. Zaykin, D. V. Optimally weighted Z-test is a powerful method for combining probabilities in meta-analysis. J. Evol. Biol. 24, 1836–1841 (2011).

  67. MacArthur, J. et al. The new NHGRI-EBI Catalog of published genome-wide association studies (GWAS Catalog). Nucleic Acids Res. 45, D896–D901 (2017).

  68. Dobin, A. et al. STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29, 15–21 (2013).

  69. Zhu, Z. et al. Integration of summary data from GWAS and eQTL studies predicts complex trait gene targets. Nat. Genet. 48, 481–487 (2016).

  70. Chen, E. Y. et al. Enrichr: interactive and collaborative HTML5 gene list enrichment analysis tool. BMC Bioinformatics 14, 128 (2013).

  71. Kuleshov, M. V. et al. Enrichr: a comprehensive gene set enrichment analysis web server 2016 update. Nucleic Acids Res. 44, W90–W97 (2016).

  72. Lachmann, A. et al. Massive mining of publicly available RNA-seq data from human and mouse. Nat. Commun. 9, 1366 (2018).

  73. Yu, G., Wang, L. G., Han, Y. & He, Q. Y. clusterProfiler: An R package for comparing biological themes among gene clusters. OMICS 16, 284–287 (2012).

  74. Javierre, B. M. et al. Lineage-specific genome architecture links enhancers and non-coding disease variants to target gene promoters. Cell 167, 1369–1384 (2016).

  75. Schofield, E. C. et al. CHiCP: a web-based tool for the integrative and interactive visualization of promoter capture Hi-C datasets. Bioinformatics 32, 2511–2513 (2016).

  76. Swertz, M. A. et al. The MOLGENIS toolkit: rapid prototyping of biosoftware at the push of a button. BMC Bioinformatics 11, S12 (2010).

Acknowledgements

The cohorts participating in this study list their acknowledgements in the cohort-specific sections of the Supplementary Note. This work is supported by a grant from the European Research Council (ERC, ERC Starting Grant agreement number 637640 ImmRisk), a VIDI grant (917.14.374) and a VICI grant from the Netherlands Organisation for Scientific Research (NWO) to L.F. This work has been supported by the European Regional Development Fund and the program Mobilitas Pluss (MOBTP108) to U.Võsa. The project was supported by the ‘De Drie Lichten’ foundation in the Netherlands with a grant to A.C. M.G.N. is supported by ZonMw grants 849200011 and 531003014 from the Netherlands Organisation for Health Research and Development, a VENI grant from the NWO (VI.Veni.191G.030) and a Jacobs Foundation research fellowship. H.Y. is funded by a Diabetes UK RD Lawrence fellowship (17/0005594). This project received funding from the ERC under the European Union’s Horizon 2020 research and innovation program (grant agreement no. 772376 (EScORIAL)) to J.H.V. T.E. and A.K. were supported by the Estonian Research Council grant PRG (PRG1291). A.Battle was supported by NIH grant R01MH109905, NIH grant R01HG008150 (NHGRI; Non-Coding Variants Program) and NIH grant R01MH101814 (NIH Common Fund; GTEx Program). M.G.P.v.d.W. was funded by the Nederlandse Organisatie voor Wetenschappelijk onderzoek, NWO-Veni 192.029. This work was supported by NIH grants R21ES024834 (B.Pierce), R01ES020506 (B.Pierce), R01ES023834 (B.Pierce), R35ES028379 (B.Pierce) and R01CA107431 (H.A.). This work was supported by the Sigrid Juselius Foundation (J.Kettunen) and funds from the Academy of Finland (grant numbers 297338 and 307247) (J.Kettunen) and the Novo Nordisk Foundation (grant number NNF17OC0026062) (J.Kettunen). S.Ripatti was supported by the Academy of Finland Centre of Excellence in Complex Disease Genetics (grant no. 312062). M.G. was supported by EU Horizon 2020 (grant 733100 for SYSCID) and a grant from the Excellence of Science (FNRS and FWO) (grant no. 30770923). We acknowledge support from the BBMRI-NL (Biobanking and Biomolecular Resources Research Infrastructure 184.021.007 and 184.033.111), Spinozapremie (NWO 56-464-14192), the ERC (ERC Advanced 230374) and the KNAW Academy Professor Award (PAH/6635) to D.I.B. G.H. works in a unit that receives funding from the UK MRC (MC_UU_12013/1&2&5) and the University of Bristol. S.B. was supported by the Swiss National Science Foundation (310030-152724). B.M.P. was supported by CHARGE infrastructure grant number HJ105756 for the HVH cohort. This work was supported by the German Federal Ministry of Education and Research (BMBF) within the framework of the e:Med research and funding concept (grant 01ZX1906B) and by LIFE (Leipzig Research Center for Civilization Diseases), Universität Leipzig (which is funded by the European Union, by the European Regional Development Fund and by the Free State of Saxony within the framework of the excellence initiative to H.K. and M.Scholz). We thank the UMCG Genomics Coordination Center, the MOLGENIS team, the UG Center for Information Technology and the UMCG research IT program and their sponsors, in particular the BBMRI-NL for data storage, high-performance computing and web hosting infrastructure. The BBMRI-NL is a research infrastructure financed by the NWO (grant number 184.033.111). We thank K. McIntyre for editing the manuscript text.

Author information

Contributions

U.Võsa and A.C. coordinated consortium analyses, ran meta-analyses, interpreted data, performed downstream analyses and drafted and revised the manuscript. H.-J.W., M.J.B. and P.D. developed software used in the analyses, performed downstream analyses and participated in manuscript writing and revisions. L.F. and T.E. conceived the study. L.F. supervised the project, ran downstream analyses and participated in manuscript writing and revisions. B.Z., H.K., A.S., S.K., N.P., I.A., M.-J.F., M.A., M.W.C., R.J., I.S., L.T., A.Teumer, K.S., J.V., H.Y., V.K., A.K., J.Kettunen, J.P. and B.L. ran consortium analyses in their respective cohorts. A.S., R.K., S.K., G.H., R.S. and A.Brown ran replication analyses in their respective cohorts. A.A., G.W.M., S.Ripatti, M.P., E.D., S.B., T.F., J.v.M., H.P., H.A., B.Pierce, T.L., D.I.B., B.M.P., S.A.G., P.A., L.M., W.H.O., K.D., O.S., A.Battle, M.Scholz, G.G., T.E., W.A., F.B., J.D., M.E., B.P.F., M.G., B.T.H., M.K., Y.K., J.C.K., P.K., K.K., M.L., U.M.M., H.M., Y.M., M.M.-N., M.Nauck, M.G.N., B.W.J.H.P., O.T.R., O.Rotzschke, E.P.S., C.D.A.S., M.Stumvoll, P.S., P.A.C.’t.H., J.T., A.Tönjes, J.v.D., M.v.I., J.H.V., U.Völker and C.W. provided data used in the study. B.Z., H.K., Z.K., J.Kronberg, S.Rüeger, E.P., S.L., J.Y., F.Z., P.M.V., J.P., T.Q., R.W., H.K., M.Scholz and G.G. participated in downstream analyses. S.Y., H.B., R.O., D.H.d.V. and M.G.P.v.d.W. ran replication analyses in scRNA-seq cohorts. A.W.H., J.A.H. and J.P. generated scRNA-seq replication data. H.K., A.Teumer, M.G., M.G.N., J.P., Z.K., J.Y., P.M.V., M.Scholz, G.G., J.P., S.A.G. and P.A.C.’t.H. contributed to writing and revising the manuscript. J.K.P. provided Supplementary Equations for interpretation of results. H.B. and M.Swertz created the website to host results. H.-J.W., M.J.B. and P.D. contributed equally to this work. The BIOS Consortium contributed the subset of whole-blood data used in discovery analyses. The i2QTL Consortium contributed trans-eQTL and eQTS replication analyses of iPSCs.

Corresponding authors

Correspondence to Urmo Võsa, Annique Claringbould or Lude Franke.

Ethics declarations

Competing interests

B.M.P. serves on the Steering Committee for the Yale Open Data Access Project funded by Johnson & Johnson. This activity is unrelated to this work. The rest of the authors declare no competing interests.

Additional information

Peer review information Nature Genetics thanks Eric Gamazon, Douglas Yao, and Vijay Sankaran for their contribution to the peer review of this work.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Cis-eQTL replication in GTEx v7 tissues.

For this analysis, the most significant cis-eQTL SNP for each gene was tested in the available post-mortem tissues in GTEx v7. Since GTEx was part of our discovery meta-analysis, the cis-eQTL discovery analysis was repeated while excluding GTEx whole blood, identifying 16,963 lead cis-eQTL effects that were subsequently tested for replication in each GTEx tissue. Left: while the majority of the 16,963 cis-eQTL were tested in the GTEx replication study, a relatively small fraction had an FDR < 0.05. Middle: of those cis-eQTL showing a replication FDR < 0.05, allelic directions were highly consistent with the discovery meta-analysis. Right: sample sizes of GTEx tissues. Limited replication rates at FDR < 0.05 were probably due to the relatively small sample size per GTEx tissue.
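
The two replication summaries in this caption (the fraction of effects with FDR < 0.05 and the consistency of allelic directions) can be illustrated with a short Python sketch. It is not the consortium's pipeline: the caption does not state how the replication FDR was computed, so the Benjamini-Hochberg procedure (ref. 17) is assumed here, and the function names and the use of Z-score signs to represent allelic direction are hypothetical choices made for illustration.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values (assumed FDR method)."""
    m = len(p_values)
    # Indices of the tests, sorted from smallest to largest P value.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def direction_concordance(discovery_z, replication_z, q_values, threshold=0.05):
    """Fraction of effects passing the FDR threshold whose replication Z-score
    has the same sign as the discovery Z-score (same assessed allele assumed).
    Returns None when no effect passes the threshold."""
    kept = [
        (d, r)
        for d, r, q in zip(discovery_z, replication_z, q_values)
        if q < threshold
    ]
    if not kept:
        return None
    same_direction = sum(1 for d, r in kept if d * r > 0)
    return same_direction / len(kept)


# Toy input (made-up numbers, not study data).
example_q = benjamini_hochberg([0.01, 0.04, 0.03, 0.5])
```

A low replication rate together with a high concordance among the replicated effects is what one would expect when the replication dataset is underpowered rather than when the discovery effects are false positives, which is the interpretation given in the caption.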

Extended Data Fig. 2 Dot-plot showing the locations of the trans-eQTL effects identified in discovery meta-analysis and their association P-values (-log10 scale).

Dot-plot showing the locations of the trans-eQTL effects identified in discovery meta-analysis (weighted Z-score meta-analysis on Spearman correlation) and their respective two-sided association P-values in -log10 scale. SNP positions are shown on the x-axis and gene locations on the y-axis, each dot shows one significant trans-eQTL effect (FDR < 0.05). Vertical bands appear where a single genomic locus affects many genes in trans, while horizontal bands illustrate genes affected by many SNPs.
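
As an illustration of the weighted Z-score meta-analysis named in this caption, the following Python sketch combines per-cohort Z-scores and converts the result to a two-sided P value on the −log10 scale. It assumes that each cohort's Spearman correlation has already been converted to a Z-score and that weights are the square root of the cohort sample size, a common choice that the caption itself does not specify; the function names are hypothetical and this is not the consortium's Java implementation.

```python
import math


def weighted_z_meta(z_scores, sample_sizes):
    """Combine per-cohort Z-scores with sqrt(N) weights (assumed weighting)."""
    if not z_scores or len(z_scores) != len(sample_sizes):
        raise ValueError("need one sample size per Z-score")
    weights = [math.sqrt(n) for n in sample_sizes]
    numerator = sum(w * z for w, z in zip(weights, z_scores))
    denominator = math.sqrt(sum(w * w for w in weights))
    return numerator / denominator


def two_sided_p(z):
    """Two-sided P value for a standard normal Z-score.
    erfc keeps precision in the far tail, unlike 2 * (1 - cdf)."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def neg_log10_p(z):
    """-log10 of the two-sided P value; infinite if P underflows to zero."""
    p = two_sided_p(z)
    return math.inf if p == 0.0 else -math.log10(p)


# Toy input (made-up numbers, not study data): three cohorts, one SNP-gene pair.
meta_z = weighted_z_meta([2.1, 1.4, 3.0], [500, 1200, 3000])
meta_score = neg_log10_p(meta_z)
```

With square-root-of-N weights, larger cohorts dominate the combined statistic, so a trans-eQTL seen consistently in the largest cohorts can reach significance even when its per-cohort Z-scores are modest.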

Extended Data Fig. 3 Overview of GWAS trait classes in eQTS analysis.

Overview of tested and significant (FDR < 0.05) GWAS trait classes in eQTS analysis.

Supplementary information

Supplementary Information

Supplementary Figs. 1–20, Note and Equations.

Reporting Summary

Peer Review Information

Supplementary Tables

Supplementary Tables 1–33.

Supplementary Data 1

Cis-eQTL lead SNP replication in the GTEx project.

Supplementary Data 2

Significant trans-eQTL effects, replication results in purified cell types and cell lines.

Supplementary Data 3

Trans-eQTL replication results in GTEx tissues.

Supplementary Data 4

Putative mechanisms of trans-eQTL.

Supplementary Data 5

Results of eQTS analysis, replications in cell lines.

Supplementary Data 6

eQTS replication analyses in the GTEx European subset of samples.

Supplementary Data 7

eQTS replication analyses in all GTEx samples.

Supplementary Data 8

Effect of cell type composition on trans-eQTL.

Supplementary Data 9

Results of cell type interaction analyses for trans-eQTL.

Source data

Source Data Fig. 2

Statistical source data for Fig. 2a–c.

Source Data Fig. 3

Statistical source data for Fig. 3a, right.

About this article

Cite this article

Võsa, U., Claringbould, A., Westra, HJ. et al. Large-scale cis- and trans-eQTL analyses identify thousands of genetic loci and polygenic scores that regulate blood gene expression. Nat Genet 53, 1300–1310 (2021). https://doi.org/10.1038/s41588-021-00913-z

  • DOI: https://doi.org/10.1038/s41588-021-00913-z
