Analysis of blood methylation quantitative trait loci in East Asians reveals ancestry-specific impacts on complex traits

Abstract

Methylation quantitative trait loci (mQTLs) are essential for understanding the role of DNA methylation changes in genetic predisposition, yet they have not been fully characterized in East Asians (EAs). Here we identified mQTLs in whole blood from 3,523 Chinese individuals and replicated them in an additional 1,858 Chinese individuals from two cohorts. Over 9% of mQTLs displayed specificity to EAs, facilitating the fine-mapping of EA-specific genetic associations, as shown for variants associated with height. Trans-mQTL hotspots revealed biological pathways contributing to EA-specific genetic associations, including an ERG-mediated 233 trans-mCpG network, implicated in hematopoietic cell differentiation, which likely reflects binding efficiency modulation of the ERG protein complex. More than 90% of mQTLs were shared between different blood cell lineages, with a smaller fraction of lineage-specific mQTLs displaying preferential hypomethylation in the respective lineages. Our study provides new insights into the mQTL landscape across genetic ancestries and their downstream effects on cellular processes and diseases/traits.

Fig. 1: Identification and validation of mQTLs in East Asians.
Fig. 2: The characteristics and potential applications of EA-specific mQTLs.
Fig. 3: Cell-type- and cell-lineage-specific mQTLs and validation.
Fig. 4: Mechanistic interpretation of cis- and lcis-mQTLs.
Fig. 5: mQTL hotspots mediated by TFs and super enhancers.
Fig. 6: A FOSL2-mediated mQTL hotspot influences eosinophil count.
Fig. 7: A NFKB1 trans-mQTL hotspot associated with BMI.

Data availability

Our mQTL database is available for download at https://www.biosino.org/panmqtl/, which incorporates mQTLs not only in EA (NSPT) but also in published European and South Asian data. The database also supports searching and visualization of genomic, functional and downstream disease/trait hits of mQTLs and mCpGs. The statistics of mQTLs in the NSPT and CGZ cohorts are available for download at NODE (https://www.biosino.org/node) under accession number OEP002902, or can be accessed directly at https://www.biosino.org/node/project/detail/OEP002902. The statistics of mQTLs replicated in CAS are available for download at OMIX (https://ngdc.cncb.ac.cn/omix) under accession number OMIX004116, or can be accessed directly at https://ngdc.cncb.ac.cn/omix/release/OMIX004116. The individual-level genotype data are not available because of IRB restrictions due to privacy concerns. The individual-level DNAm data can be requested at https://ngdc.cncb.ac.cn/omix/release/OMIX004363 (NSPT), https://ngdc.cncb.ac.cn/omix/release/OMIX004333 (CAS) and https://www.biosino.org/node/project/detail/OEP002902 (CGZ). Requests are normally processed within 1–3 months. Data usage shall be in full compliance with the Regulations on Management of Human Genetic Resources in China. The DNAm dataset in buccal cells is available by submitting data requests to mrclha.enquiries@ucl.ac.uk; see the full policy at http://www.nshd.mrc.ac.uk/data.aspx. Managed access is in place for this 69-year-old NSHD study to ensure that the use of the data is within the bounds of consent given previously by participants, and to safeguard against any potential threat to anonymity because the participants were all born in the same week. The mQTL results of the EUR cohort (GoDMC) were downloaded from http://mqtldb.godmc.org.uk/downloads. The mQTL results of the EUR cohort (FHS) were downloaded from https://ftp.ncbi.nlm.nih.gov/eqtl/original_submissions/FHS_meQTLs/ (date: September 14, 2020).
The annotation of CpG probes was downloaded from https://zwdzwd.github.io/InfiniumAnnotation (date: November 25, 2019). Significant GWAS results were downloaded from the GWAS Catalog (https://www.ebi.ac.uk/gwas/docs/file-downloads; date: December 25, 2020) and significant EWAS results were downloaded from the EWAS Atlas (https://ngdc.cncb.ac.cn/ewas/downloads; date: December 25, 2020). The cis-eQTL results in whole blood were downloaded from the GTEx V8 database (https://www.gtexportal.org/home/datasets; date: June 17, 2020) and HGVD (http://www.genome.med.kyoto-u.ac.jp/SnpDB/). The human gene information (Ensembl release v104) was downloaded from GENCODE (https://www.gencodegenes.org/human/release_37lift37.html; date: April 26, 2021), the list of human TFs was from http://humantfs.ccbr.utoronto.ca/download.php (date: April 3, 2020), and the list of housekeeping genes was downloaded from https://www.tau.ac.il/~elieis/HKG/. Motif information for TFs was obtained from the JASPAR 2020 database (http://jaspar.genereg.net/; date: July 2, 2021) and JASPAR 2022 (date: August 22, 2022). ChIP–seq signals of TFs were downloaded from the ChIP-Atlas database (http://chip-atlas.org/; date: June 2, 2021). Other data sources used in this study include BLUEPRINT mQTL summary statistics (https://ega-archive.org/datasets/EGAD00001005200); Phenoscanner GWAS summary statistics (http://www.phenoscanner.medschl.cam.ac.uk/); functional genomic regions from the Functional Annotation of Animal Genomes (FAANG) Project (https://www.faang.org); PCHi-C data (https://osf.io/u8tzp); H3K27ac HiChIP data (https://www.ncbi.nlm.nih.gov/geo/, GSE101498); DNase-seq data for B cells and T cells and H3K27ac ChIP–seq data for neutrophils (https://www.encodeproject.org); GO terms, KEGG pathways and Reactome pathways from the Molecular Signatures Database (https://www.gsea-msigdb.org/gsea/msigdb/index.jsp); FANTOM5 (https://fantom.gsc.riken.jp/data/); the Experimental Factor Ontology (EFO) (https://www.ebi.ac.uk/ols/ontologies/efo); GWASs in BBJ (https://pheweb.jp/); GWASs in UKBB (https://pan.ukbb.broadinstitute.org/); super enhancer databases (http://www.licpathway.net/sedb/; http://www.asntech.org/dbsuper/; http://www.licpathway.net/SEanalysis/); segmented functional regions from the GM12878 cell line (http://genome.ucsc.edu/cgi-bin/hgTrackUi?db=hg19&g=wgEncodeAwgSegmentation); and 15 chromatin states (https://egg2.wustl.edu/roadmap/data/byFileType/chromhmmSegmentations/ChmmModels/coreMarks/jointModel/final/).

Code availability

Code for the analysis is available at GitHub (https://github.com/Fun-Gene/fastQTLmapping) and Zenodo (https://doi.org/10.5281/zenodo.8084877) (ref. 75). Most operations are carried out in R (https://cran.r-project.org/), and the plots are mainly made with the ggplot2 v3.4.2 R package (https://cran.r-project.org/web/packages/ggplot2/index.html). mQTL mapping is performed by fastQTLmapping (https://github.com/Fun-Gene/fastQTLmapping) and the R package MatrixEQTL v2.3 (https://cran.r-project.org/web/packages/MatrixEQTL/index.html). Heritability is estimated by GCTA (https://yanglab.westlake.edu.cn/software/gcta/). MKL is available at https://software.intel.com/tools/onemkl. GSL is available at http://www.gnu.org/software/gsl/. Annotation of SNPs is based on ANNOVAR (https://annovar.openbioinformatics.org/en/latest/, date: 2020.11.2) and annotation of CpGs is based on the manufacturer’s manifest files (date: 2020.10.21). Genotype calling is based on GenomeStudio (https://support.illumina.com/array/array_software/genomestudio/downloads.html). Imputation of SNP chip data is based on SHAPEIT2 (https://mathgen.stats.ox.ac.uk/genetics_software/shapeit/shapeit.html) and IMPUTE2 (https://mathgen.stats.ox.ac.uk/impute/impute_v2.html). Enrichment analysis of mQTLs is performed by the R package clusterProfiler v4.8.1 (https://bioconductor.org/packages/release/bioc/html/clusterProfiler.html). DNAm processing is based on the Bioconductor packages minfi v1.46.0 (https://bioconductor.org/packages/release/bioc/html/minfi.html) and ChAMP v2.30.0 (https://bioconductor.org/packages/release/bioc/html/ChAMP.html). Cell-type mQTLs are estimated by CellDMC, which is available as part of the EpiDISH v2.8 Bioconductor R package (http://bioconductor.org/packages/devel/EpiDISH). eFORGE is run with the web server at eFORGE2.0 (https://eforge.altiusinstitute.org/). The sharing of effects among cell-type mQTLs is estimated by the R package mashr (https://cran.r-project.org/web/packages/mashr/index.html).
The GO and KEGG pathway enrichment analyses of mCpGs are conducted using the R package missMethyl v1.34.0 (https://bioconductor.org/packages/3.13/bioc/html/missMethyl.html). Gene enrichment analysis for diseases/traits is performed by the R package disgenet2r v0.99.3 (https://www.disgenet.org/disgenet2r) based on the DisGeNET knowledgebase (date: 2021.6.9). The two-sample MR analysis is conducted using the R package TwoSampleMR v0.4.26 (https://mrcieu.github.io/TwoSampleMR/). The HiChIP loops are processed by HiCCUPS, as implemented in Juicer Tools (v0.7.5), with default parameter settings. The influence of SNPs on REs is calculated using the tool OpenCausal (https://github.com/liwenran/OpenCausal). Colocalization is performed by SMR v1.3.1 (https://yanglab.westlake.edu.cn/software/smr/#Download). Enrichment of mQTL CpGs for TF motifs is performed by TFmotifView (http://bardet.u-strasbg.fr/tfmotifview/) and the R package PWMEnrich v4.30.0 (https://bioconductor.org/packages/release/bioc/html/PWMEnrich.html). Phenome-wide association analysis is carried out by PheWAS (https://gwas.mrcieu.ac.uk/phewas).

References

  1. Bonder, M. J. et al. Disease variants alter transcription factor levels and methylation of their binding sites. Nat. Genet. 49, 131–138 (2017).

  2. Huan, T. et al. Genome-wide identification of DNA methylation QTLs in whole blood highlights pathways for cardiovascular disease. Nat. Commun. 10, 4267 (2019).

  3. McRae, A. F. et al. Identification of 55,000 replicated DNA methylation QTL. Sci. Rep. 8, 17605 (2018).

  4. van Dongen, J. et al. Genetic and environmental influences interact with age and sex in shaping the human methylome. Nat. Commun. 7, 11115 (2016).

  5. Gaunt, T. R. et al. Systematic identification of genetic influences on methylation across the human life course. Genome Biol. 17, 61 (2016).

  6. Hannon, E. et al. Leveraging DNA-methylation quantitative-trait loci to characterize the relationship between methylomic variation, gene expression, and complex traits. Am. J. Hum. Genet. 103, 654–665 (2018).

  7. Chen, L. et al. Genetic drivers of epigenetic and transcriptional variation in human immune cells. Cell 167, 1398–1414 (2016).

  8. Min, J. L. et al. Genomic and phenotypic insights from an atlas of genetic effects on DNA methylation. Nat. Genet. 53, 1311–1321 (2021).

  9. McClay, J. L. et al. High density methylation QTL analysis in human blood via next-generation sequencing of the methylated genomic DNA fraction. Genome Biol. 16, 291 (2015).

  10. Lemire, M. et al. Long-range epigenetic regulation is conferred by genetic variation located at thousands of independent loci. Nat. Commun. 6, 6326 (2015).

  11. Banovich, N. E. et al. Methylation QTLs are associated with coordinated changes in transcription factor binding, histone modifications, and gene expression levels. PLoS Genet. 10, e1004663 (2014).

  12. Bell, C. G. et al. Obligatory and facilitative allelic variation in the DNA methylome within common disease-associated loci. Nat. Commun. 9, 8 (2018).

  13. Liu, Y. et al. GeMes, clusters of DNA methylation under genetic control, can inform genetic and epigenetic analysis of disease. Am. J. Hum. Genet. 94, 485–495 (2014).

  14. Kassam, I. et al. Genome-wide identification of cis DNA methylation quantitative trait loci in three Southeast Asian Populations. Hum. Mol. Genet. 30, 603–618 (2021).

  15. Hawe, J. S. et al. Genetic variation influencing DNA methylation provides insights into molecular mechanisms regulating genomic function. Nat. Genet. 54, 18–29 (2022).

  16. Buniello, A. et al. The NHGRI-EBI GWAS Catalog of published genome-wide association studies, targeted arrays and summary statistics 2019. Nucleic Acids Res. 47, D1005–D1012 (2019).

  17. Li, M. et al. EWAS Atlas: a curated knowledgebase of epigenome-wide association studies. Nucleic Acids Res. 47, D983–D988 (2019).

  18. Higasa, K. et al. Human genetic variation database, a reference database of genetic variations in the Japanese population. J. Hum. Genet. 61, 547–553 (2016).

  19. Narahara, M. et al. Large-scale East-Asian eQTL mapping reveals novel candidate genes for LD mapping and the genomic landscape of transcriptional effects of sequence variants. PLoS ONE 9, e100924 (2014).

  20. Akiyama, M. et al. Characterizing rare and low-frequency height-associated variants in the Japanese population. Nat. Commun. 10, 4393 (2019).

  21. Yengo, L. et al. A saturated map of common genetic variants associated with human height. Nature 610, 704–712 (2022).

  22. Wilson, N. K. et al. Combinatorial transcriptional control in blood stem/progenitor cells: genome-wide analysis of ten major transcriptional regulators. Cell Stem Cell 7, 532–544 (2010).

  23. Hoang, T., Lambert, J. A. & Martin, R. SCL/TAL1 in hematopoiesis and cellular reprogramming. Curr. Top. Dev. Biol. 118, 163–204 (2016).

  24. Zheng, S. C., Breeze, C. E., Beck, S. & Teschendorff, A. E. Identification of differentially methylated cell types in epigenome-wide association studies. Nat. Methods 15, 1059–1066 (2018).

  25. You, C. et al. A cell-type deconvolution meta-analysis of whole blood EWAS reveals lineage-specific smoking-associated DNA methylation changes. Nat. Commun. 11, 4779 (2020).

  26. Teschendorff, A. E., Jing, H., Paul, D. S., Virta, J. & Nordhausen, K. Tensorial blind source separation for improved analysis of multi-omic data. Genome Biol. 19, 76 (2018).

  27. Li, W., Duren, Z., Jiang, R. & Wong, W. H. A method for scoring the cell type-specific impacts of noncoding variants in personal genomes. Proc. Natl Acad. Sci. USA 117, 21364–21372 (2020).

  28. Hnisz, D. et al. Super-enhancers in the control of cell identity and disease. Cell 155, 934–947 (2013).

  29. Whyte, W. A. et al. Master transcription factors and mediator establish super-enhancers at key cell identity genes. Cell 153, 307–319 (2013).

  30. Chen, M. H. et al. Trans-ethnic and ancestry-specific blood-cell genetics in 746,667 individuals from 5 global populations. Cell 182, 1198–1213 (2020).

  31. Goyama, S., Huang, G., Kurokawa, M. & Mulloy, J. C. Posttranslational modifications of RUNX1 as potential anticancer targets. Oncogene 34, 3483–3492 (2015).

  32. Wahl, S. et al. Epigenome-wide association study of body mass index, and the adverse outcomes of adiposity. Nature 541, 81–86 (2017).

  33. Dick, K. J. et al. DNA methylation and body-mass index: a genome-wide analysis. Lancet 383, 1990–1998 (2014).

  34. Mendelson, M. M. et al. Association of body mass index with DNA methylation and gene expression in blood cells and relations to cardiometabolic disease: a Mendelian randomization approach. PLoS Med. 14, e1002215 (2017).

  35. Demerath, E. W. et al. Epigenome-wide association study (EWAS) of BMI, BMI change and waist circumference in African American adults identifies multiple replicated loci. Hum. Mol. Genet. 24, 4464–4479 (2015).

  36. Teschendorff, A. E. et al. Correlation of smoking-associated DNA methylation changes in buccal cells with DNA methylation changes in epithelial cancer. JAMA Oncol. 1, 476–485 (2015).

  37. Gurzov, E. N., Stanley, W. J., Brodnicki, T. C. & Thomas, H. E. Protein tyrosine phosphatases: molecular switches in metabolism and diabetes. Trends Endocrinol. Metab. 26, 30–39 (2015).

  38. Rodriguez-Nunez, I. et al. Nod2 and Nod2-regulated microbiota protect BALB/c mice from diet-induced obesity and metabolic dysfunction. Sci. Rep. 7, 548 (2017).

  39. Gurses, S. A. et al. Nod2 protects mice from inflammation and obesity-dependent liver cancer. Sci. Rep. 10, 20519 (2020).

  40. Kreuter, R., Wankell, M., Ahlenstiel, G. & Hebbard, L. The role of obesity in inflammatory bowel disease. Biochim. Biophys. Acta Mol. Basis Dis. 1865, 63–72 (2019).

  41. Hugot, J. P. et al. Association of NOD2 leucine-rich repeat variants with susceptibility to Crohn’s disease. Nature 411, 599–603 (2001).

  42. Liu, P. et al. Foxp1 controls brown/beige adipocyte differentiation and thermogenesis through regulating β3-AR desensitization. Nat. Commun. 10, 5070 (2019).

  43. Palmer, C. J. et al. Cdkal1, a type 2 diabetes susceptibility gene, regulates mitochondrial function in adipose tissue. Mol. Metab. 6, 1212–1225 (2017).

  44. Consortium, U. I. G. et al. Genome-wide association study of ulcerative colitis identifies three new susceptibility loci, including the HNF4A region. Nat. Genet. 41, 1330–1334 (2009).

  45. Anderson, C. A. et al. Investigation of Crohn’s disease risk loci in ulcerative colitis further defines their molecular relationship. Gastroenterology 136, 523–529 (2009).

  46. Hachim, M. Y. et al. An integrative phenotype–genotype approach using phenotypic characteristics from the UAE National Diabetes Study identifies HSD17B12 as a candidate gene for obesity and type 2 diabetes. Genes (Basel) 11, 461 (2020).

  47. Moreno-Navarrete, J. M. et al. Heme biosynthetic pathway is functionally linked to adipogenesis via mitochondrial respiratory activity. Obesity (Silver Spring) 25, 1723–1733 (2017).

  48. Cox, B. et al. A co-expression analysis of the placental transcriptome in association with maternal pre-pregnancy BMI and newborn birth weight. Front. Genet. 10, 354 (2019).

  49. Huang, L. O. et al. Genome-wide discovery of genetic loci that uncouple excess adiposity from its comorbidities. Nat. Metab. 3, 228–243 (2021).

  50. Oliva, M. et al. DNA methylation QTL mapping across diverse human tissues provides molecular links between genetic variation and complex traits. Nat. Genet. 55, 112–122 (2023).

  51. DeFronzo, R. A. Chiglitazar: a novel pan-PPAR agonist. Sci. Bull. 66, 1497–1498 (2021).

  52. Ji, L. et al. Efficacy and safety of chiglitazar, a novel peroxisome proliferator-activated receptor pan-agonist, in patients with type 2 diabetes: a randomized, double-blind, placebo-controlled, phase 3 trial (CMAP). Sci. Bull. 66, 1571–1580 (2021).

  53. Jia, W. et al. Chiglitazar monotherapy with sitagliptin as an active comparator in patients with type 2 diabetes: a randomized, double-blind, phase 3 trial (CMAS). Sci. Bull. 66, 1581–1590 (2021).

  54. Aryee, M. J. et al. Minfi: a flexible and comprehensive Bioconductor package for the analysis of Infinium DNA methylation microarrays. Bioinformatics 30, 1363–1369 (2014).

  55. Teschendorff, A. E. et al. A β-mixture quantile normalization method for correcting probe design bias in Illumina Infinium 450 k DNA methylation data. Bioinformatics 29, 189–196 (2013).

  56. Johnson, W. E., Li, C. & Rabinovic, A. Adjusting batch effects in microarray expression data using empirical Bayes methods. Biostatistics 8, 118–127 (2007).

  57. Tian, Y. et al. ChAMP: updated methylation analysis pipeline for Illumina BeadChips. Bioinformatics 33, 3982–3984 (2017).

  58. Wu, M. C. & Kuan, P. F. A guide to Illumina BeadChip data analysis. Methods Mol. Biol. 1708, 303–330 (2018).

  59. Zhou, W., Laird, P. W. & Shen, H. Comprehensive characterization, annotation and innovative use of Infinium DNA methylation BeadChip probes. Nucleic Acids Res. 45, 22–22 (2016).

  60. Gao, X. et al. FastQTLmapping: an ultra-fast package for mQTL-like analysis. Preprint at bioRxiv https://doi.org/10.1101/2021.11.16.468610 (2021).

  61. Shabalin, A. A. Matrix eQTL: ultra fast eQTL analysis via large matrix operations. Bioinformatics 28, 1353–1358 (2012).

  62. Zhu, Z. et al. Integration of summary data from GWAS and eQTL studies predicts complex trait gene targets. Nat. Genet. 48, 481–487 (2016).

  63. Teschendorff, A. E., Breeze, C. E., Zheng, S. C. & Beck, S. A comparison of reference-based algorithms for correcting cell-type heterogeneity in epigenome-wide association studies. BMC Bioinformatics 18, 105 (2017).

  64. Zheng, S. C. et al. EpiDISH web server: epigenetic dissection of intra-sample-heterogeneity with online GUI. Bioinformatics 36, 1950–1951 (2019).

  65. Liu, Y. et al. Blood monocyte transcriptome and epigenome analyses reveal loci associated with human atherosclerosis. Nat. Commun. 8, 393 (2017).

  66. Leporcq, C. et al. TFmotifView: a webserver for the visualization of transcription factor motifs in genomic regions. Nucleic Acids Res. 48, W208–W217 (2020).

  67. Stojnic, R. & Diez, D. PWMEnrich: PWM enrichment analysis. R package version 4.30.0. https://bioconductor.org/packages/release/bioc/html/PWMEnrich.html (2021).

  68. Fornes, O. et al. JASPAR 2020: update of the open-access database of transcription factor binding profiles. Nucleic Acids Res. 48, D87–D92 (2020).

  69. Lambert, S. A. et al. The human transcription factors. Cell 172, 650–665 (2018).

  70. Oki, S. et al. ChIP-Atlas: a data-mining suite powered by full integration of public ChIP–seq data. EMBO Rep. 19, e46255 (2018).

  71. Stower, H. Gene expression: super enhancers. Nat. Rev. Genet. 14, 367 (2013).

  72. Jiang, Y. et al. SEdb: a comprehensive human super-enhancer database. Nucleic Acids Res. 47, 235–243 (2019).

  73. Khan, A. & Zhang, X. dbSUPER: a database of super-enhancers in mouse and human genome. Nucleic Acids Res. 44, D164–D171 (2016).

  74. Qian, F. C. et al. SEanalysis: a web tool for super-enhancer associated regulatory analysis. Nucleic Acids Res. 47, W248–W255 (2019).

  75. Peng, Q. et al. Code for the mQTL analyses in 2023 Nature Genetics (v1.0). Zenodo https://doi.org/10.5281/zenodo.8084877 (2023).

Acknowledgements

This work is supported by the Strategic Priority Research Program of Chinese Academy of Sciences (grant XDB38000000 to S.W., F.L. and P.J.), the National Natural Science Foundation of China (NSFC; 92249302 to S.W. and T.N., 32325013 to S.W., 32370699 and 32170652 to A.E.T., 81930056 to F.L., 32170657 to Y.Z. and L.S., 32200472 to W.L.), CAS Young Team Program for Stable Support of Basic Research (YSBR-077 to S.W.), CAS Interdisciplinary Innovation Team to S.W., CAS Youth Innovation Promotion Association (2020276 to Q.P.), Shanghai Science and Technology Commission Excellent Academic Leaders Program (22XD1424700 to S.W.), the Strategic Priority Research Program of Chinese Academy of Sciences (grant XDC01000000 to F.L.), the National Key Research and Development Project (2018YFC0910403 to S.W. and 2018YFE0201603 to Y.Z. and L.S.), Ministry of Science and Technology of the People’s Republic of China (2015FY111700 to L.J.), Science and Technology Commission of Shanghai Municipality Major Project (2017SHZDZX01 to L.J., S.W., F.L., Y.Z. and L.S.), 111 Project (B13016 to L.J.), CAMS Innovation Fund for Medical Science (2019-I2M-5-066 to L.J. and J.W.), Science and Technology Service Network Initiative of Chinese Academy of Sciences (KFJ-STS-QYZD-2021-08-001 and KFJ-STS-ZDTP-079 to F.L.), Naif Arab University for Security Sciences (NAUSS-23-R18 and NAUSS-23-R19 to F.L.) and China Postdoctoral Science Foundation (2021M693274 and BX2021336 to W.L.). We are grateful to S. Beck from University College London, W. H. Wong from Stanford University and C. Wang from Huazhong University of Science and Technology for helpful discussion, C. Relton and J. Min from the University of Bristol for sharing information about SNPs, CpGs and mQTLs in GoDMC, X. Chen from Taizhou Institute of Health Sciences of Fudan University, Y. Fan from Human Phenome Institute of Fudan University and Y. Hu from CAS for providing materials and samples in this study, and X. Cai and Q. Qian from the University of Chinese Academy of Sciences for helping in data preparation.

Author information

Contributions

S.W., F.L. and A.E.T. designed and drafted the work. Q.P., X.L., W.L., H.J. and A.E.T. performed the statistical analyses and contributed to data interpretation and writing of the paper. J.L., Q.L., C.E.B., G.L. and S.P. contributed to data analysis. X.G. contributed to the program of QTL mapping (fastQTLmapping). J.L., N.Y., J.Q., L.Y. and G.Z. generated the mQTL database. C.Y. and S.D. contributed to preprocessing of methylation and SNP chip data in the discovery cohort. L.J., J.W., J.T. and Z.Y. contributed to the design and acquisition of data in the discovery cohort NSPT. Q.Z., P.J. and C.Z. contributed to the design and acquisition of data in the validation panel CAS. Y.Z., X.L. and L.S. contributed to the design and acquisition of data in the validation panel CGZ. S.G., Y.L., T.N. and B.W. contributed to the design of this work. All the authors revised this work, approved the submitted version, agreed with personal contributions and are responsible for the integrity of the data and the accuracy of the data analysis.

Corresponding authors

Correspondence to Andrew E. Teschendorff, Fan Liu or Sijia Wang.

Ethics declarations

Competing interests

X. Lu is an employee of Shenzhen Chipscreen Biosciences. The other authors declare no competing interests.

Peer review

Peer review information

Nature Genetics thanks Carmen Marsit, Matthew Sudermann and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 mQTLs enrichment for different functional elements.

a,b, Enrichment of mQTLs (a) and mCpGs (b) in six functional elements: CTCF-enriched elements (CTCF), enhancers (E), promoters (P), promoter flanking regions (PF), regulatory elements (RE), and TF binding sites (TFBS). The y-axis indicates the fold changes (see Methods) and the significance from the one-tailed hypergeometric test is denoted by different symbols on each bar, that is, *, P < 0.05; **, P < 0.01; ***, P < 0.001. c, Enrichment of cis-mQTL pairs (left), lcis-mQTL pairs (middle), and trans-mQTL pairs (right) in all combinations of the six functional categories (that is, CTCF-E, P-E, RE-E, and so on). A one-tailed hypergeometric test is applied. The fold changes are labeled within each box. d, Proportion of SNP-CpG pairs of mQTLs within the same TAD. e, Comparison of the distance distribution of mQTLs with that of 3D loops.
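The fold change and one-tailed hypergeometric test named in this legend can be illustrated with a minimal sketch. This is not the authors' code: the function names, the input counts and the definition of fold change as observed over expected overlap are assumptions made for illustration (the exact definition is given in the paper's Methods).

```python
from math import comb


def hypergeom_enrichment(total, annotated, selected, overlap):
    """Fold change and upper-tail hypergeometric P value (illustrative only).

    total:     number of background sites (hypothetical input)
    annotated: background sites lying inside the functional element
    selected:  sites of interest, e.g. mQTL SNPs or mCpGs
    overlap:   selected sites that lie inside the functional element
    """
    # Assumed fold-change definition: observed overlap / expected overlap.
    expected = selected * annotated / total
    fold_change = overlap / expected
    # P(X >= overlap) when `selected` sites are drawn without replacement.
    upper = min(annotated, selected)
    favourable = sum(
        comb(annotated, k) * comb(total - annotated, selected - k)
        for k in range(overlap, upper + 1)
    )
    p_value = favourable / comb(total, selected)
    return fold_change, p_value


def significance_symbol(p_value):
    """Map a P value to the symbols used in the legend."""
    if p_value < 0.001:
        return "***"
    if p_value < 0.01:
        return "**"
    if p_value < 0.05:
        return "*"
    return ""


# Made-up counts: 12 of 50 selected sites fall in an element covering 100 of 1,000 sites.
fold, p = hypergeom_enrichment(total=1000, annotated=100, selected=50, overlap=12)
print(f"fold change = {fold:.2f}, P = {p:.3g} {significance_symbol(p)}")
```

Exact integer arithmetic via `math.comb` avoids a dependency on a statistics library; for genome-scale counts a log-space or library implementation would be preferable.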

Extended Data Fig. 2 The cis-colocalization at chr13q14.11 provides epigenetic evidence for the East Asian-specific height-association (rs7335629-height).

a, The East Asian-specific height signal (rs7335629) is in high linkage disequilibrium with three SNPs in the colocalization locus at chr13q14.11. b, rs7335629 has a potential chromatin interaction with one of the CpGs (cg21067652) that colocalized at chr13q14.11. c, Both rs7335629 and cg21067652 are located in regions of high DNase signal in several blood cell lines. d, Two-sample MR result indicating that cg21067652 is a causal factor for ELF1 RNA expression in CAGE (N = 2,765). A two-tailed MR-Egger test is applied. The dots and error bars indicate the beta values and s.e. of the SNP effects on the CpG (x-axis) and on ELF1 expression (y-axis). The blue dotted line indicates the regression line from the MR-Egger test with beta = -9.19, P = 1.34 × 10⁻⁴. e, Two-sample MR result indicating that cg21067652 is a causal factor for height in BBJ (N = 165,056). A two-tailed MR-Egger test is applied. The dots and error bars indicate the beta values and s.e. of the SNP effects on the CpG (x-axis) and on body height (y-axis). The blue dotted line indicates the regression line from the MR-Egger test with beta = -0.31, P = 3.73 × 10⁻⁹.
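The regression line in panels d and e comes from MR-Egger regression, which can be sketched as a weighted regression of SNP-outcome effects on SNP-exposure effects with a free intercept. The sketch below is a simplified illustration, not the TwoSampleMR implementation used in the paper: the function name and the toy effect sizes are hypothetical, weights are assumed to be inverse outcome variances, and standard errors and P values are omitted.

```python
def mr_egger(beta_exposure, beta_outcome, se_outcome):
    """Return (intercept, slope) of an MR-Egger style weighted regression.

    beta_exposure: SNP effects on the exposure (here, CpG methylation)
    beta_outcome:  SNP effects on the outcome (expression or trait)
    se_outcome:    standard errors of the SNP-outcome effects
    """
    # Orient every SNP so that its exposure effect is positive.
    bx, by = [], []
    for x, y in zip(beta_exposure, beta_outcome):
        sign = -1.0 if x < 0 else 1.0
        bx.append(sign * x)
        by.append(sign * y)

    # Inverse-variance weights (assumption: first-order weights).
    w = [1.0 / s ** 2 for s in se_outcome]
    sw = sum(w)
    mean_x = sum(wi * xi for wi, xi in zip(w, bx)) / sw
    mean_y = sum(wi * yi for wi, yi in zip(w, by)) / sw

    # Closed-form weighted least squares with an intercept.
    sxx = sum(wi * (xi - mean_x) ** 2 for wi, xi in zip(w, bx))
    sxy = sum(wi * (xi - mean_x) * (yi - mean_y) for wi, xi, yi in zip(w, bx, by))
    slope = sxy / sxx              # causal effect estimate
    intercept = mean_y - slope * mean_x  # average directional pleiotropy
    return intercept, slope


# Toy data constructed so that outcome = 0.02 - 0.3 * exposure after orientation.
intercept, slope = mr_egger(
    beta_exposure=[0.1, -0.2, 0.3, 0.4],
    beta_outcome=[-0.01, 0.04, -0.07, -0.10],
    se_outcome=[0.01, 0.02, 0.015, 0.01],
)
print(f"intercept = {intercept:.3f}, slope = {slope:.3f}")
```

A slope far from zero with an intercept near zero is the pattern usually read as a causal effect without strong directional pleiotropy.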

Extended Data Fig. 3 The enrichment of cis- and trans-colocalizations in EA-specific colocalizations and functional states.

a, Enrichment of trans- vs cis-mQTLs among EA-specific colocalizations vs others in NSPT (left), and among EA-specific vs EAS-EUR shared colocalizations (right). Trans- and cis-colocalization in East Asians is carried out based on mQTLs in NSPT (N = 3,523) and 107 GWASs in BBJ (N = ~170,000). Trans- and cis-colocalization in Europeans is carried out based on mQTLs in GoDMC (N = 27,750) and 107 GWAS traits in UKBB (N = ~500,000) that overlap with the traits in BBJ. b, Enrichment of cis- vs trans-colocalization loci in functional elements. Left, enrichment of cis- and trans-colocalization loci in functional elements; middle, enrichment of East Asian-specific and EAS-EUR shared cis-colocalization loci in functional elements; right, enrichment of East Asian-specific and EAS-EUR shared trans-colocalization loci in functional elements. c, Enrichment of cis- and trans-colocalization loci in chromatin states. Left, enrichment of cis- and trans-colocalization loci in chromatin states; middle, enrichment of East Asian-specific and EAS-EUR shared cis-colocalization signals in chromatin states; right, enrichment of East Asian-specific and EAS-EUR shared trans-colocalization signals in chromatin states. A two-tailed Fisher’s exact test is applied. Each point with an error bar indicates the log10-scaled odds ratio and its 95% confidence interval.
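The statistics plotted in this figure, a two-tailed Fisher's exact test together with a log10-scaled odds ratio and 95% confidence interval, can be illustrated on a single 2 × 2 table. This is a hedged sketch: the legend does not state how the interval was computed, so the Wald (Woolf) interval on the log odds ratio used here is an assumption, as are the function names and the toy counts.

```python
from math import comb, log, sqrt


def fisher_two_tailed(a, b, c, d):
    """Two-tailed Fisher's exact P value for the table [[a, b], [c, d]]."""
    r1, r2, c1 = a + b, c + d, a + c
    denom = comb(r1 + r2, c1)

    def prob(x):
        # Probability of a table with x in the top-left cell, margins fixed.
        return comb(r1, x) * comb(r2, c1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, c1 - r2), min(r1, c1)
    # Sum all tables that are at most as likely as the observed one.
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-7))
    return min(1.0, p)


def log10_odds_ratio_ci(a, b, c, d, z=1.959964):
    """log10 odds ratio with an ASSUMED Wald (Woolf) 95% interval."""
    if min(a, b, c, d) == 0:
        raise ValueError("Wald interval is undefined when a cell is zero")
    log_or = log((a * d) / (b * c))
    se = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    scale = 1 / log(10)  # convert natural log to log10
    return log_or * scale, (log_or - z * se) * scale, (log_or + z * se) * scale


if __name__ == "__main__":
    # Toy counts, not taken from the article:
    # rows = EA-specific / shared, columns = trans / cis.
    table = (8, 2, 1, 5)
    est, lower, upper = log10_odds_ratio_ci(*table)
    print(f"P = {fisher_two_tailed(*table):.4f}")
    print(f"log10(OR) = {est:.3f} [{lower:.3f}, {upper:.3f}]")
```

An interval that excludes zero on the log10 scale corresponds to an odds ratio that differs from 1 at the 5% level under this approximation.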

Extended Data Fig. 4 The relation between the trans-colocalization at chr21q22.2 and blood cell traits and immune diseases.

a, The geographic distribution of rs80109907 allele frequencies in different populations (1000 Genomes Phase 3) from the Geography of Genetic Variants (GGV) browser (https://popgen.uchicago.edu/ggv). b, The PheWAS result of rs80109907 (https://gwas.mrcieu.ac.uk/phewas). c, The colocalization result of chr21q22.2 with other blood cell counts and immune-related diseases. An SMR test is applied; the x axis indicates the beta estimates from the original GWAS and the y axis shows the −log10(P) of the SMR test. d, Two-sample MR results showing that 39 CpGs are causal for 7 traits (several blood cell counts and immune-related diseases) at FDR < 0.05. An MR IVW test is applied. Red and blue squares indicate a positive or negative causal effect of the CpG on the trait, and the size of the square indicates the −log10(P) of the MR IVW test.
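The MR IVW test in panel d combines per-SNP effects into one causal estimate by inverse-variance weighting. The sketch below shows the standard fixed-effect form on toy numbers; it is not the authors' code, and the function name, argument names and the use of a normal approximation for the two-sided P value are assumptions.

```python
from math import erfc, sqrt


def mr_ivw(bx, by, se_by):
    """Fixed-effect inverse-variance weighted (IVW) MR estimate.

    bx:    SNP effects on the exposure (here, CpG methylation)
    by:    SNP effects on the outcome (here, a blood cell trait or disease)
    se_by: standard errors of the outcome effects
    Returns (beta, se, p), with a two-sided P from the normal approximation.
    """
    if not (len(bx) == len(by) == len(se_by)) or not bx:
        raise ValueError("inputs must be non-empty and of equal length")
    num = sum(x * y / s ** 2 for x, y, s in zip(bx, by, se_by))
    den = sum(x ** 2 / s ** 2 for x, s in zip(bx, se_by))
    if den == 0:
        raise ValueError("all exposure effects are zero")
    beta = num / den
    se = sqrt(1.0 / den)
    z = beta / se
    p = erfc(abs(z) / sqrt(2.0))  # two-sided normal tail probability
    return beta, se, p


if __name__ == "__main__":
    # Toy summary statistics, not taken from the article.
    beta, se, p = mr_ivw(
        bx=[0.1, 0.2, 0.3],
        by=[0.05, 0.10, 0.15],
        se_by=[0.01, 0.01, 0.01],
    )
    print(f"beta = {beta:.3f}, s.e. = {se:.4f}, P = {p:.2e}")
```

The sign of `beta` corresponds to the red or blue coloring of a square in panel d, and −log10 of `p` to its size.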

Supplementary information

Supplementary Information

Supplementary Protocols, References and Figs. 1–16.

Reporting Summary

Peer Review File

Supplementary Tables 1–23

Supplementary Table 1: The extent to which the 3.46 million GoDMC mQTLs are also mQTLs in NSPT for different MAF bins. Supplementary Table 2: The extent to which the 2.65 million NSPT mQTLs are also mQTLs in GoDMC for different MAF bins. Supplementary Table 3: NSPT-only mQTLs (NSPT P < 10−14) defined with different significance thresholds in GoDMC. Supplementary Table 4: Overlap of population-specific and nonspecific mQTLs with significant SNPs and associations in GWAS Catalog, UKBB and BBJ. Supplementary Table 5: EA-specific mQTLs overlapping associations and signals in three groups (BBJ-specific, UKBB-specific and shared). Supplementary Table 6: Colocalization results of EA-specific mQTLs and GWAS signals of 230 traits in BBJ. Supplementary Table 7: A total of 144 locus–trait associations (96 loci and 38 traits) identified by cis-colocalization, especially in EAs based on mQTLs in NSPT and traits in BBJ. Supplementary Table 8: A total of 541 locus–trait associations (36 loci and 15 traits) identified by trans-colocalization, especially in EAs based on mQTLs in NSPT and traits in BBJ. Supplementary Table 9: PheWAS result of rs80109907 from a public database (https://gwas.mrcieu.ac.uk/phewas/). Supplementary Table 10: Weaker (compared to basophil count) but significant East Asian-specific trans-colocalizations at the locus on chr21q22.2 involving several blood cell counts and immune-related diseases. Supplementary Table 11: A total of 233 CpGs significantly enriched for motifs of 62 TFs (P < 5.3 × 10−5). Supplementary Table 12: Enrichment of 233 CpGs in the binding sites of 13 TFs was validated by blood cell ChIP–seq data. Supplementary Table 13: Cell-lineage-specific mQTLs calculated for random unrelated SNP–CpG pairs. Supplementary Table 14: mQTL hotspots in NSPT. Supplementary Table 15: The 12 (of 16) hotspots index mQTLs or their high LD (r2 > 0.6) trans-mQTLs in GWAS Catalog (P < 5 × 10−8). Supplementary Table 16: mQTL hotspots on each chromosome in NSPT. Supplementary Table 17: Annotation of 16 trans-mQTL hotspots. Supplementary Table 18: Overlap of nearby TF/DBP-related cis-eQTLs and trans-mQTLs in each trans-mQTL hotspot. Supplementary Table 19: Overlap of nearby TF/DBP-related cis-eQTLs in GTEx V7 (P < 0.05) and trans-mQTLs in each trans-mQTL hotspot. Supplementary Table 20: Enrichment of SNPs with significant OpenCausal scores in trans-mQTL hotspots. Supplementary Table 21: A total of 21 putative causal mCpGs (rs4666078 trans-associated) of blood eosinophil count identified by two-sample MR analysis. Supplementary Table 22: Two-sample MR analysis statistics summary of NFKB1 CpGs. Supplementary Table 23: Two-sample MR analysis statistics summary of BMI EWAS CpGs.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Peng, Q., Liu, X., Li, W. et al. Analysis of blood methylation quantitative trait loci in East Asians reveals ancestry-specific impacts on complex traits. Nat Genet (2024). https://doi.org/10.1038/s41588-023-01494-9

  • DOI: https://doi.org/10.1038/s41588-023-01494-9
