Epigenomic and transcriptomic analyses define core cell types, genes and targetable mechanisms for kidney disease

Abstract

More than 800 million people suffer from kidney disease, yet the mechanism of kidney dysfunction is poorly understood. In the present study, we define the genetic association with kidney function in 1.5 million individuals and identify 878 (126 new) loci. We map the genotype effect on the methylome in 443 kidneys, transcriptome in 686 samples and single-cell open chromatin in 57,229 kidney cells. Heritability analysis reveals that methylation variation explains a larger fraction of heritability than gene expression. We present a multi-stage prioritization strategy and prioritize target genes for 87% of kidney function loci. We highlight key roles of proximal tubules and metabolism in kidney function regulation. Furthermore, the causal role of SLC47A1 in kidney disease is defined in mice with genetic loss of Slc47a1 and in human individuals carrying loss-of-function variants. Our findings emphasize the key role of bulk and single-cell epigenomic information in translating genome-wide association studies into identifying causal genes, cellular origins and mechanisms of complex traits.

Fig. 1: Graphic summary of new datasets created and analyses performed in the present study.
Fig. 2: The eGFRcrea GWAS of 1.5 million individuals and kidney eQTLs for 686 samples.
Fig. 3: Robust identification of human kidney meQTLs.
Fig. 4: Methylation variation explains a larger fraction of GWAS heritability than gene expression variation.
Fig. 5: Single-cell chromatin accessibility map enables target cell type and gene prioritization for GWAS variants.
Fig. 6: Integrative analysis of epigenetic and gene expression data improves kidney disease target gene prioritization.
Fig. 7: Identification of SLC47A1 as a kidney disease risk gene.
Fig. 8: Slc47a1 loss confers kidney disease risk in mice.

Data availability

The eGFRcrea GWAS, kidney meQTL and kidney eQTL data produced in the present study are publicly available online at the Susztaklab Kidney Biobank (https://susztaklab.com/GWAS; https://susztaklab.com/Kidney_meQTL; https://susztaklab.com/Kidney_eQTL) and figshare (https://doi.org/10.6084/m9.figshare.15183495; ref. 91). The GWAS summary statistics are also available at the GWAS Catalog (accession no. GCST90100220). The RNA-seq and human kidney snATAC-seq data have been deposited with the Gene Expression Omnibus (GEO) under accession nos. GSE115098, GSE173343, GSE172008 and GSE200547 and the Common Metabolic Diseases Genome Atlas (https://cmdga.org/search/?type=Experiment&searchTerm=FNIH0000000). The Integrative Genomics Viewer visualization of human kidney snATAC-seq is publicly available at https://susztaklab.com/Human_snATAC. The summary statistics of five eGFRcrea GWAS datasets used for GWAS meta-analysis were obtained from consortium websites (download links provided in Supplementary Table 1). No consent was obtained to share individual-level genotype data for kidney samples. There is no mechanism to obtain consent because kidney tissue was collected as medical discard and the samples were permanently deidentified. Summary statistics for GWAS heritability analysis were obtained from the Alkes Price lab (https://alkesgroup.broadinstitute.org/LDSCORE/independent_sumstats; ref. 37). Mouse kidney snATAC-seq data were obtained from the GEO (accession no. GSE157079; ref. 60) and mouse kidney single-cell RNA-seq data from the GEO (accession no. GSE107585; ref. 56). Drug–gene interactions were identified using the Drug Gene Interaction Database (DGIdb v.4.2.0, https://www.dgidb.org; ref. 45). Source data are provided with this paper.

Code availability

Custom code used in the present study is available at GitHub (https://github.com/hbliu/Kidney_Epi_Pri) and Zenodo (https://doi.org/10.5281/zenodo.6392494; ref. 92).

References

  1. GBD Chronic Kidney Disease Collaboration. Global, regional, and national burden of chronic kidney disease, 1990–2017: a systematic analysis for the Global Burden of Disease Study 2017. Lancet 395, 709–733 (2020).

  2. Kottgen, A. et al. Multiple loci associated with indices of renal function and chronic kidney disease. Nat. Genet. 41, 712–717 (2009).

  3. Pattaro, C. et al. Genetic associations at 53 loci highlight cell types and biological pathways relevant for kidney function. Nat. Commun. 7, 10023 (2016).

  4. Wuttke, M. et al. A catalog of genetic loci associated with kidney function from analyses of a million individuals. Nat. Genet. 51, 957–972 (2019).

  5. Hellwege, J. N. et al. Mapping eGFR loci to the renal transcriptome and phenome in the VA Million Veteran Program. Nat. Commun. 10, 3842 (2019).

  6. Sullivan, K. M. & Susztak, K. Unravelling the complex genetics of common kidney diseases: from variants to mechanisms. Nat. Rev. Nephrol. 16, 628–640 (2020).

  7. Qiu, C. et al. Renal compartment-specific genetic variation analyses identify new pathways in chronic kidney disease. Nat. Med. 24, 1721–1731 (2018).

  8. GTEx Consortium. The GTEx Consortium atlas of genetic regulatory effects across human tissues. Science 369, 1318–1330 (2020).

  9. Ko, Y. A. et al. Genetic-variation-driven gene-expression changes highlight genes with important functions for kidney disease. Am. J. Hum. Genet. 100, 940–953 (2017).

  10. Gillies, C. E. et al. An eQTL landscape of kidney tissue in human nephrotic syndrome. Am. J. Hum. Genet. 103, 232–244 (2018).

  11. Sheng, X. et al. Mapping the genetic architecture of human traits to cell types in the kidney identifies mechanisms of disease and potential treatments. Nat. Genet. 53, 1322–1333 (2021).

  12. Yao, D. W., O’Connor, L. J., Price, A. L. & Gusev, A. Quantifying genetic effects on disease mediated by assayed gene expression levels. Nat. Genet. 52, 626–633 (2020).

  13. Reik, W. Stability and flexibility of epigenetic gene regulation in mammalian development. Nature 447, 425–432 (2007).

  14. Boix, C. A., James, B. T., Park, Y. P., Meuleman, W. & Kellis, M. Regulatory genomic circuitry of human disease loci by integrative epigenomics. Nature 590, 300–307 (2021).

  15. Jones, P. A. Functions of DNA methylation: islands, start sites, gene bodies and beyond. Nat. Rev. Genet. 13, 484–492 (2012).

  16. Ziller, M. J. et al. Charting a dynamic DNA methylation landscape of the human genome. Nature 500, 477–481 (2013).

  17. Hannon, E. et al. Methylation QTLs in the developing brain and their enrichment in schizophrenia risk loci. Nat. Neurosci. 19, 48–54 (2016).

  18. Chen, L. et al. Genetic drivers of epigenetic and transcriptional variation in human immune cells. Cell 167, 1398–1414 e24 (2016).

  19. Taylor, D. L. et al. Integrative analysis of gene expression, DNA methylation, physiological traits, and genetic variation in human skeletal muscle. Proc. Natl Acad. Sci. USA 116, 10883–10888 (2019).

  20. Wojcik, G. L. et al. Genetic analyses of diverse populations improves discovery for complex traits. Nature 570, 514–518 (2019).

  21. van Zuydam, N. R. et al. A genome-wide association study of diabetic kidney disease in subjects with type 2 diabetes. Diabetes 67, 1414–1427 (2018).

  22. Willer, C. J., Li, Y. & Abecasis, G. R. METAL: fast and efficient meta-analysis of genomewide association scans. Bioinformatics 26, 2190–2191 (2010).

  23. Sinnott-Armstrong, N. et al. Genetics of 35 blood and urine biomarkers in the UK Biobank. Nat. Genet. 53, 185–194 (2021).

  24. Stanzick, K. J. et al. Discovery and prioritization of variants and genes for kidney function in >1.2 million individuals. Nat. Commun. 12, 4350 (2021).

  25. Backman, J. D. et al. Exome sequencing and analysis of 454,787 UK Biobank participants. Nature 599, 628–634 (2021).

  26. Barton, A. R., Sherman, M. A., Mukamel, R. E. & Loh, P. R. Whole-exome imputation within UK Biobank powers rare coding variant association and fine-mapping analyses. Nat. Genet. 53, 1260–1269 (2021).

  27. Kaushal, G. P., Haun, R. S., Herzog, C. & Shah, S. V. Meprin A metalloproteinase and its role in acute kidney injury. Am. J. Physiol. Renal Physiol. 304, F1150–F1158 (2013).

  28. Wen, X. et al. Transgenic expression of the human MRP2 transporter reduces cisplatin accumulation and nephrotoxicity in Mrp2-null mice. Am. J. Pathol. 184, 1299–1308 (2014).

  29. Lu, W. et al. NFIA haploinsufficiency is associated with a CNS malformation syndrome and urinary tract defects. PLoS Genet. 3, e80 (2007).

  30. Eales, J. M. et al. Uncovering genetic mechanisms of hypertension through multi-omic analysis of the kidney. Nat. Genet. 53, 630–637 (2021).

  31. Chambers, B. E. et al. Tfap2a is a novel gatekeeper of nephron differentiation during kidney development. Development 146, dev172387 (2019).

  32. Jonker, J. W., Wagenaar, E., Van Eijl, S. & Schinkel, A. H. Deficiency in the organic cation transporters 1 and 2 (Oct1/Oct2 [Slc22a1/Slc22a2]) in mice abolishes renal secretion of organic cations. Mol. Cell Biol. 23, 7902–7908 (2003).

  33. Sheng, X. et al. Systematic integrated analysis of genetic and epigenetic variation in diabetic kidney disease. Proc. Natl Acad. Sci. USA 117, 29013–29024 (2020).

  34. Delahaye, F. et al. Genetic variants influence on the placenta regulatory landscape. PLoS Genet. 14, e1007785 (2018).

  35. Husquin, L. T. et al. Exploring the genetic basis of human population differences in DNA methylation and their causal impact on immune gene regulation. Genome Biol. 19, 222 (2018).

  36. Bonder, M. J. et al. Disease variants alter transcription factor levels and methylation of their binding sites. Nat. Genet. 49, 131–138 (2017).

  37. Loh, P. R., Kichaev, G., Gazal, S., Schoech, A. P. & Price, A. L. Mixed-model association for biobank-scale datasets. Nat. Genet. 50, 906–908 (2018).

  38. Hekselman, I. & Yeger-Lotem, E. Mechanisms of tissue and cell-type specificity in heritable traits and diseases. Nat. Rev. Genet. 21, 137–150 (2020).

  39. Roadmap Epigenomics Consortium et al. Integrative analysis of 111 reference human epigenomes. Nature 518, 317–330 (2015).

  40. Ulirsch, J. C. et al. Interrogation of human hematopoiesis at single-cell and single-variant resolution. Nat. Genet. 51, 683–693 (2019).

  41. Pliner, H. A. et al. Cicero predicts cis-regulatory DNA interactions from single-cell chromatin accessibility data. Mol. Cell 71, 858–871 e8 (2018).

  42. Groopman, E. E. et al. Diagnostic utility of exome sequencing for kidney disease. N. Engl. J. Med. 380, 142–151 (2019).

  43. Wu, Y. et al. Integrative analysis of omics summary data reveals putative mechanisms underlying complex traits. Nat. Commun. 9, 918 (2018).

  44. Nasser, J. et al. Genome-wide enhancer maps link risk variants to disease genes. Nature 593, 238–243 (2021).

  45. Freshour, S. L. et al. Integration of the drug-gene interaction database (DGIdb 4.0) with open crowdsource efforts. Nucleic Acids Res. 49, D1144–D1151 (2021).

  46. Guo, D. et al. Selective inhibition on organic cation transporters by carvedilol protects mice from cisplatin-induced nephrotoxicity. Pharm. Res. 35, 204 (2018).

  47. Sarhan, M., von Mässenhausen, A., Hugo, C., Oberbauer, R. & Linkermann, A. Immunological consequences of kidney cell death. Cell Death Dis. 9, 114 (2018).

  48. Miao, N. et al. The cleavage of gasdermin D by caspase-11 promotes tubular epithelial cell pyroptosis and urinary IL-18 excretion in acute kidney injury. Kidney Int. 96, 1105–1120 (2019).

  49. Tsuda, M. et al. Targeted disruption of the multidrug and toxin extrusion 1 (mate1) gene in mice reduces renal secretion of metformin. Mol. Pharmacol. 75, 1280–1286 (2009).

  50. Vilaysane, A. et al. The NLRP3 inflammasome promotes renal inflammation and contributes to CKD. J. Am. Soc. Nephrol. 21, 1732–1744 (2010).

  51. Xu, Y. et al. A role for tubular necroptosis in cisplatin-induced AKI. J. Am. Soc. Nephrol. 26, 2647–2658 (2015).

  52. Mulay, S. R., Linkermann, A. & Anders, H. J. Necroinflammation in kidney disease. J. Am. Soc. Nephrol. 27, 27–39 (2016).

  53. Gamazon, E. R. et al. Using an atlas of gene regulation across 44 human tissues to inform complex disease- and trait-associated variation. Nat. Genet. 50, 956–967 (2018).

  54. Li, Y. I. et al. RNA splicing is a primary link between genetic variation and disease. Science 352, 600–604 (2016).

  55. Zhang, Z. et al. Genetic analyses support the contribution of mRNA N6-methyladenosine (m6A) modification to human disease heritability. Nat. Genet. 52, 939–949 (2020).

  56. Park, J. et al. Single-cell transcriptomics of the mouse kidney reveals potential cellular targets of kidney disease. Science 360, 758–763 (2018).

  57. Li, Y. et al. Integration of GWAS summary statistics and gene expression reveals target cell types underlying kidney function traits. J. Am. Soc. Nephrol. 31, 2326–2340 (2020).

  58. Fairfax, B. P. et al. Innate immune activity conditions the effect of regulatory variants upon monocyte gene expression. Science 343, 1246949 (2014).

  59. Guan, Y. et al. Dnmt3a- and Dnmt3b-decommissioned fetal enhancers are linked to kidney disease. J. Am. Soc. Nephrol. 31, 765–782 (2020).

  60. Miao, Z. et al. Single cell regulatory landscape of the mouse kidney highlights cellular differentiation programs and disease targets. Nat. Commun. 12, 2277 (2021).

  61. Sveinbjornsson, G. et al. Rare mutations associating with serum creatinine and chronic kidney disease. Hum. Mol. Genet. 23, 6935–6943 (2014).

  62. Levey, A. S. et al. A new equation to estimate glomerular filtration rate. Ann. Intern. Med. 150, 604–612 (2009).

  63. Purcell, S. et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am. J. Hum. Genet. 81, 559–575 (2007).

  64. Delaneau, O., Zagury, J.-F. & Marchini, J. Improved whole-chromosome phasing for disease and population genetic studies. Nat. Methods 10, 5 (2013).

  65. Howie, B. N., Donnelly, P. & Marchini, J. A flexible and accurate genotype imputation method for the next generation of genome-wide association studies. PLoS Genet. 5, e1000529 (2009).

  66. Howie, B., Marchini, J. & Stephens, M. Genotype imputation with thousands of genomes. G3 1, 457–470 (2011).

  67. Zhou, W., Triche, T. J. Jr., Laird, P. W. & Shen, H. SeSAMe: reducing artifactual detection of DNA methylation by Infinium BeadChips in genomic deletions. Nucleic Acids Res. 46, e123 (2018).

  68. Dobin, A. et al. STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29, 15–21 (2013).

  69. Li, B. & Dewey, C. N. RSEM: accurate transcript quantification from RNA-Seq data with or without a reference genome. BMC Bioinf. 12, 323 (2011).

  70. Fang, R. et al. Comprehensive analysis of single cell ATAC-seq data with SnapATAC. Nat. Commun. 12, 1337 (2021).

  71. Stouffer, S. A., Suchman, E. A., Devinney, L. C., Star, S. A. & Williams Jr, R. M. The American Soldier: Adjustment during army life (Studies in Social Psychology in World War II) Vol. 1 (Princeton Univ. Press, 1949).

  72. Chu, A. Y. et al. Multiethnic genome-wide meta-analysis of ectopic fat depots identifies loci associated with adipocyte development and differentiation. Nat. Genet. 49, 125–130 (2017).

  73. Zhu, Z. et al. Integration of summary data from GWAS and eQTL studies predicts complex trait gene targets. Nat. Genet. 48, 481–487 (2016).

  74. McLean, C. Y. et al. GREAT improves functional interpretation of cis-regulatory regions. Nat. Biotechnol. 28, 495–501 (2010).

  75. Yang, J. et al. FTO genotype is associated with phenotypic variability of body mass index. Nature 490, 267–272 (2012).

  76. Shabalin, A. A. Matrix eQTL: ultra fast eQTL analysis via large matrix operations. Bioinformatics 28, 1353–1358 (2012).

  77. Stegle, O., Parts, L., Durbin, R. & Winn, J. A Bayesian framework to account for complex non-genetic factors in gene expression levels greatly increases power in eQTL studies. PLoS Comput. Biol. 6, e1000770 (2010).

  78. Ongen, H., Buil, A., Brown, A. A., Dermitzakis, E. T. & Delaneau, O. Fast and efficient QTL mapper for thousands of molecular phenotypes. Bioinformatics 32, 1479–1485 (2016).

  79. Storey, J. D. & Tibshirani, R. Statistical significance for genomewide studies. Proc. Natl Acad. Sci. USA 100, 9440–9445 (2003).

  80. Han, B. & Eskin, E. Interpreting meta-analyses of genome-wide association studies. PLoS Genet. 8, e1002555 (2012).

  81. Frankish, A. et al. GENCODE reference annotation for the human and mouse genomes. Nucleic Acids Res. 47, D766–D773 (2019).

  82. Giambartolomei, C. et al. Bayesian test for colocalisation between pairs of genetic association studies using summary statistics. PLoS Genet. 10, e1004383 (2014).

  83. Giambartolomei, C. et al. A Bayesian framework for multiple trait colocalization from summary association statistics. Bioinformatics 34, 2538–2545 (2018).

  84. Watanabe, K., Taskesen, E., van Bochoven, A. & Posthuma, D. Functional mapping and annotation of genetic associations with FUMA. Nat. Commun. 8, 1826 (2017).

  85. Huang da, W., Sherman, B. T. & Lempicki, R. A. Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nat. Protoc. 4, 44–57 (2009).

  86. Bycroft, C. et al. The UK Biobank resource with deep phenotyping and genomic data. Nature 562, 203–209 (2018).

  87. Park, J. et al. Exome-wide evaluation of rare coding variants using electronic health records identifies new gene-phenotype associations. Nat. Med. 27, 66–72 (2021).

  88. Bramer, G. R. International statistical classification of diseases and related health problems. Tenth revision. World Health Stat. Q. 41, 32–36 (1988).

  89. Carroll, R. J., Bastarache, L. & Denny, J. C. R PheWAS: data analysis and plotting tools for phenome-wide association studies in the R environment. Bioinformatics 30, 2375–2376 (2014).

  90. Li, Q., Peng, X., Yang, H., Wang, H. & Shu, Y. Deficiency of multidrug and toxin extrusion 1 enhances renal accumulation of paraquat and deteriorates kidney injury in mice. Mol. Pharm. 8, 2476–2483 (2011).

  91. Liu, H. et al. Epigenomic and transcriptomic analyses define core cell types, genes and targetable mechanisms for kidney disease (Data Set). figshare https://doi.org/10.6084/m9.figshare.15183495 (2022).

  92. Liu, H. et al. Epigenomic and transcriptomic analyses define core cell types, genes and targetable mechanisms for kidney disease (Code). Zenodo https://doi.org/10.5281/zenodo.6392494 (2022).

Acknowledgements

We thank the Molecular Pathology and Imaging Core (grant no. P30-DK050306 to K.S.) and Diabetes Research Center (grant no. P30-DK19525 to K.S.) at the University of Pennsylvania for their services. The work in K.S.’s laboratory has been supported by the NIH (grant nos. R01DK087635, R01DK076077 and R01DK105821 to K.S.).

Author information

Contributions

K.S. and H.L. conceived, planned and oversaw the present study and wrote the manuscript. H.L. analyzed the data. T.D. performed the wet lab experiments. Z.Y.M., X.S., A.A., Z.M., B.F.V., H.Z.L. and C.B. assisted with data generation and analysis. J.P., M.D.R., H.M.T.V. and G.N.N. performed PheWAS analysis. M.P. performed histopathological descriptor measurement. G.D. and S.Y. provided Slc47a1 KO mice and helped with the animal experiments.

Corresponding author

Correspondence to Katalin Susztak.

Ethics declarations

Competing interests

The laboratory of K.S. receives funding from GSK, Regeneron, Gilead, Merck, Boehringer Ingelheim, Bayer, Novartis, Maze, Jnana, Ventus and Novo Nordisk. The funders had no influence on the data analysis. K.S. serves on the scientific advisory board (SAB) of Jnana pharmaceuticals and receives equity. M.D.R. serves on the SAB for Goldfinch Bio and Cipherome. The other authors declare no competing interests.

Peer review

Peer review information

Nature Genetics thanks Cristian Pattaro, Pascal Schlosser and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Meta-analysis of eGFRcrea GWAS and validation using eGFRcys and BUN GWAS.

a. Manhattan plots of the meta-analysis eGFRcrea GWAS (N = 1,508,659 individuals) and the source eGFRcrea GWAS datasets, including CKDGen, UKBB, MVP, PAGE and SUMMIT. For each panel, the x-axis is the chromosomal location of the SNP and the y-axis is the strength of association, -log10(p). The two-sided p value was obtained from the GWAS studies. b. Scatter plot of effect size correlation between the meta-analysis eGFRcrea GWAS (x-axis) and the five source eGFRcrea GWAS datasets from CKDGen, UKBB, MVP, PAGE and SUMMIT (y-axis). The density of dots from low to high is shown from yellow to red. Correlation coefficient was calculated using Spearman’s rho (R) statistic and two-sided p value was calculated using asymptotic t approximation. c. Scatter plot of effect sizes between eGFRcrea GWAS (N = 1,508,659 individuals) and eGFRcys GWAS (N = 421,714 individuals). Significant eGFRcrea GWAS variants passing two-sided \(p < 5 \times 10^{-8}\) in this study were used for the plot. Red dots represent validated variants showing nominally significant (two-sided GWAS p < 0.05) association with eGFRcys in the same effect direction. Correlation coefficient was calculated using Spearman’s rho (R) statistic and two-sided p value was calculated using asymptotic t approximation. d. Scatter plot of effect sizes between eGFRcrea GWAS (N = 1,508,659 individuals) and BUN GWAS (N = 852,678 individuals). Significant eGFRcrea GWAS variants passing two-sided \(p < 5 \times 10^{-8}\) in this study were used for the plot. Blue dots represent validated variants showing nominally significant (two-sided GWAS p < 0.05) association with BUN in the opposite effect direction. Correlation coefficient was calculated using Spearman’s rho (R) statistic and two-sided p value was calculated using asymptotic t approximation. e. Venn plot of eGFRcrea GWAS significant variants validated by eGFRcys GWAS or BUN GWAS. Y-axis is the strength of GWAS association, -log10(p value based on the z statistic).
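The validation rule described in panels c and d can be sketched in a few lines of Python. This is a minimal illustration of the stated criteria only (nominal significance at p < 0.05, same effect direction for eGFRcys, opposite direction for BUN); the `Variant` class, its field names and the example values are hypothetical and are not taken from the study's code.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    # All field names are hypothetical, chosen for this illustration.
    rsid: str
    beta_crea: float   # effect size on eGFRcrea
    beta_other: float  # effect size on the validation trait (eGFRcys or BUN)
    p_other: float     # two-sided p value for the validation trait


def is_validated(v: Variant, same_direction: bool, alpha: float = 0.05) -> bool:
    """Return True if the variant is nominally significant for the validation
    trait and its effect direction matches the expectation.

    same_direction=True  models the eGFRcys rule (concordant effects).
    same_direction=False models the BUN rule (opposite effects).
    """
    if v.p_other >= alpha:
        return False
    product = v.beta_crea * v.beta_other
    return product > 0 if same_direction else product < 0


# Made-up example values, for illustration only.
example = Variant("rs_example", beta_crea=0.02, beta_other=-0.01, p_other=0.003)
validated_by_bun = is_validated(example, same_direction=False)
validated_by_cys = is_validated(example, same_direction=True)
```

With the made-up values above, the variant would count as validated under the BUN rule but not under the eGFRcys rule, because the two effect sizes have opposite signs.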

Extended Data Fig. 2 Identification and function annotation of independent eGFRcrea GWAS loci.

a. The strategy to identify independent loci and novel eGFRcrea GWAS loci. b. Pie chart of the number of independent loci categorized into different groups by comparing previously reported sentinel variants tagging independent loci. c. Pie chart of the number of novel independent loci validated by eGFRcys GWAS and/or BUN GWAS. d. Functional enrichment analysis of 126 novel independent loci annotated by GREAT. The positions of lead SNPs were inputted into GREAT (http://great.stanford.edu/public/html/), and the two nearest genes within 1 Mb were used for functional enrichment in the mouse phenotype catalogue. The two-sided uncorrected p value was calculated by binomial test over inputted loci, and the false discovery rate q value was calculated for multiple test correction. e. Literature-based gene function of the closest genes to the 126 novel kidney disease loci. f. Expression of the mouse orthologues of 42 kidney disease genes (of the 126 newly identified GWAS genes) in adult mouse kidney samples (GSE107585). The mean expression was calculated for each cell type and z-scores were plotted. g. LocusZoom view of three novel independent loci, MEP1A, ABCC2 and NFIA. Y-axis is the strength of association, -log10(two-sided p value from GWAS meta-analysis z-statistic).

Extended Data Fig. 3 Meta-analysis of the kidney cis-eQTL data.

a. Manhattan plot of the eQTL meta-analysis integrating four eQTL datasets consisting of a total of 686 kidney samples. X-axis is the chromosomal location of the SNP, and y-axis is the strength of association, -log10(two-sided p value based on the z-statistic from eQTL meta-analysis). b. Manhattan plot of eQTLs by Sheng et al. (n = 356 human kidney tubule samples). X-axis is the chromosomal location of the SNP, and y-axis is the strength of association, -log10(two-sided p value from linear regression eQTL model). c. Manhattan plot of eQTLs by Ko et al. (n = 91 human kidney cortex samples). X-axis is the chromosomal location of the SNP, and y-axis is the strength of association, -log10(two-sided p value from linear regression eQTL model). d. Manhattan plot of eQTLs by GTEx (v8) (n = 73 human kidney cortex samples). X-axis is the chromosomal location of the SNP, and y-axis is the strength of association, -log10(two-sided p value from linear regression eQTL model). e. Manhattan plot of eQTLs by NephQTL (n = 166 human kidney tubule samples). X-axis is the chromosomal location of the SNP, and y-axis is the strength of association, -log10(two-sided p value from linear regression eQTL model). f. Scatter plots of effect size correlation between the eQTL meta-analysis and each individual eQTL dataset. The common variant-gene pairs passing eQTL p < 0.00001 in either of the two datasets were used for each plot. The density of dots from low to high is represented by yellow to red. Correlation coefficient was calculated using Spearman’s rho (R) statistic and two-sided p value was calculated using asymptotic t approximation.
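For readers who want to reproduce the y-axis quantity in these Manhattan plots, the conversion from a z-statistic to -log10 of a two-sided p value can be sketched as below. This is a generic illustration under a standard normal null, using only the Python standard library; the function names are hypothetical and the snippet is not taken from the study's code.

```python
import math


def two_sided_p_from_z(z: float) -> float:
    """Two-sided p value for a z-statistic under a standard normal null.

    Uses the identity P(|Z| >= |z|) = erfc(|z| / sqrt(2)).
    """
    return math.erfc(abs(z) / math.sqrt(2.0))


def neg_log10_p(z: float) -> float:
    """Strength of association, -log10(p), as plotted on a Manhattan y-axis."""
    p = two_sided_p_from_z(z)
    if p == 0.0:
        # The p value underflowed double precision for a very large |z|.
        return math.inf
    return -math.log10(p)


# A z-statistic near 1.96 corresponds to a two-sided p value near 0.05.
p_example = two_sided_p_from_z(1.959964)
strength_example = neg_log10_p(5.5)
```

For very large |z| the p value underflows to zero in double precision, which is why the sketch returns infinity in that case rather than raising an error; a production pipeline would more likely work on the log scale throughout.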

Extended Data Fig. 4 Functional annotation of kidney-specific meQTLs and mCpGs.

a. Tissue-specific and shared meQTLs across kidney, blood and skeletal muscle tissue. M value > 0.9 was used to define meQTLs for each set. b. Fraction of meQTL CpGs annotated by ChromHMM chromatin states in kidney, blood (CD3+) cells and skeletal muscle. c. Transcription factor motif enrichment (HOMER) of tissue-specific mCpGs. The p value was calculated by binomial test. d. Enrichment of kidney-specific meQTL CpGs in cell type-specific open chromatin regions determined by snATAC-seq in human kidney. X-axis is the odds ratio and y-axis is the strength of enrichment, -log10(two-sided chi-square test p). Size of the dot represents the number of kidney-specific meQTL CpG sites. e. Enrichment of kidney-specific meQTL SNPs in GWAS traits. X-axis is the odds ratio and y-axis is the strength of enrichment, -log10(two-sided chi-square test p). Size of the dot represents the number of SNPs and colors represent the type of GWAS trait.
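The two axes in panels d and e (odds ratio and chi-square p value) can both be derived from a 2 × 2 contingency table. The sketch below assumes a Pearson chi-square test with one degree of freedom and no continuity correction, which may differ from the exact settings used in the study; the function name, argument layout and counts are hypothetical.

```python
import math


def enrichment_2x2(a: int, b: int, c: int, d: int):
    """Odds ratio and Pearson chi-square test for a 2 x 2 table.

    Hypothetical layout:
        a = foreground sites inside the annotation
        b = foreground sites outside the annotation
        c = background sites inside the annotation
        d = background sites outside the annotation
    Returns (odds_ratio, chi2, p_value).
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("this sketch requires all four cells to be positive")
    n = a + b + c + d
    odds_ratio = (a * d) / (b * c)
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Survival function of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return odds_ratio, chi2, p_value


# Made-up counts, for illustration only.
odds_ratio, chi2, p_value = enrichment_2x2(30, 70, 10, 90)
```

The closed-form p value relies on the fact that a chi-square variable with one degree of freedom is the square of a standard normal variable, so no external statistics library is needed for this special case.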

Extended Data Fig. 5 Human kidney expression quantitative trait methylation (eQTM).

a. Schematic representation of the eQTM analysis. b. eQTM discovery rate estimated by the number of identified CpG~Gene pairs using different numbers of PEER factors as covariates. c. Volcano plot of eQTMs. The x-axis is the beta value and the y-axis is the strength of association (-log10(p)). Negative and positive eQTMs are colored in blue and red, respectively. d. The fraction of meQTL CpGs identified by eQTM analysis. The red line is the global FDR, the dark blue line is the CpG-level FDR and the light blue line is the nominal significance threshold. The x-axis is the eQTM significance and the y-axis is the cumulative fraction of meQTL CpGs. The vertical line represents the significance cutoff of 0.05. e. Validation of the eQTMs in publicly available eQTM studies. Correlation coefficient was calculated using Spearman’s rho (R) statistic and two-sided p value was calculated using asymptotic t approximation. f. Scatter plot of CpG methylation (x-axis) and gene expression of PM20D1 and CYP4F11 (y-axis) in 414 kidney samples. Each dot represents one kidney sample. Correlation coefficient was calculated using Spearman’s rho (R) statistic and two-sided p value was calculated using asymptotic t approximation. g. IGV visualization of eQTM association at the PM20D1, CYP4F11 and TBX5 loci. h. Number and fraction of negative and positive eQTM CpGs associated with the expression of nearest or distal genes. The nearest gene was defined based on the TSS (transcription start site) to eQTM CpG distance. A gene was defined as distal if its TSS was not the closest TSS to the eQTM CpG. Two-sided p value was calculated by chi-square test. i. Relative fraction of negative and positive eQTM CpGs localized to regulatory regions in the kidney. j. Profile plot of H3K4me3, H3K4me1, H3K27ac and H3K27me3 histone modifications across negative and positive eQTM CpGs and 5 kb flanking regions.

Extended Data Fig. 6 Estimated proportion of heritability mediated by kidney methylation and expression.

a. Estimation of heritability (\(h_{med}^2/h_g^2\)) mediated by kidney meQTL, kidney eQTL and the eQTL of best non-kidney GTEx tissue for three kidney function traits based on three different biomarkers (eGFRcrea, eGFRcys and BUN). Here, best non-kidney GTEx tissue refers to the non-kidney tissue whose eQTL resulted in the highest estimates of \(h_{med}^2/h_g^2\) compared to all other non-kidney tissues. The x-axis represents different QTL groups and the y-axis represents \(h_{med}^2/h_g^2\) estimated for three kidney function traits. Data are presented as mean ± SD. P values were calculated by one-tailed paired t test. b, c. Estimation of eGFRcrea GWAS heritability (\(h_{med}^2/h_g^2\)) mediated by methylation and expression for different numbers of human kidneys using multi-ancestry datasets (b) and European-ancestry datasets (c). The x-axis represents sample sizes used for the meQTL and eQTL, and the y-axis represents \(h_{med}^2/h_g^2\) estimated for eGFRcrea GWAS. d. Estimation of eGFRcrea and eGFRcys GWAS heritability mediated by meQTL and eQTL from different tissues. The x-axis represents \(h_{med}^2/h_g^2\), while the y-axis represents eQTL or meQTL data obtained from different tissues. meQTL data is shown in red and eQTL in blue. e. Estimation of heritability mediated by kidney eQTL and non-kidney eQTL for six kidney function traits and 28 independent non-kidney GWAS traits. The x-axis represents \(h_{med}^2/h_g^2\), while the y-axis represents different GWAS traits. For each trait, kidney eQTL data is shown in blue and best non-kidney GTEx tissue in gray. Here, best non-kidney GTEx tissue refers to the non-kidney tissue whose eQTL resulted in the highest estimates of \(h_{med}^2/h_g^2\) compared to all other non-kidney tissues. (b-e) For each bar plot, the centre of the error bar represents the value of \(h_{med}^2/h_g^2\), and the error bars represent the jackknife standard error estimated for \(h_{med}^2/h_g^2\).
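
To illustrate what a jackknife standard error for a ratio such as \(h_{med}^2/h_g^2\) involves, the following is a minimal delete-one sketch. It is not the estimation software used for the figure: the legend does not state the blocking scheme, so the per-block inputs, the function names `jackknife_se` and `mediated_fraction`, and the toy numbers are all assumptions made purely for illustration.

```python
import math


def jackknife_se(blocks, estimator):
    """Delete-one jackknife standard error of estimator(blocks)."""
    n = len(blocks)
    # Re-estimate the statistic with each block left out in turn.
    leave_one_out = [estimator(blocks[:i] + blocks[i + 1:]) for i in range(n)]
    center = sum(leave_one_out) / n
    # Standard jackknife variance: (n - 1) / n * sum of squared deviations.
    variance = (n - 1) / n * sum((t - center) ** 2 for t in leave_one_out)
    return math.sqrt(variance)


def mediated_fraction(blocks):
    """Ratio of summed mediated heritability to summed total heritability."""
    return sum(b[0] for b in blocks) / sum(b[1] for b in blocks)


# Hypothetical per-block (h2_med, h2_g) contributions; NOT data from the study.
toy_blocks = [(0.02, 0.05), (0.03, 0.06), (0.01, 0.04), (0.04, 0.07)]
estimate = mediated_fraction(toy_blocks)
standard_error = jackknife_se(toy_blocks, mediated_fraction)
```

For the sample mean, this formula reduces to the familiar standard error s/√n, which is a convenient sanity check on the implementation.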

Extended Data Fig. 7 Enrichment of GWAS trait heritability mediated by enhancer methylation in 128 tissues/cell types.

a. GWAS heritability mediated by kidney methylation categorized as enhancers in 128 tissues/cell types. The x-axis shows the GWAS traits, while the y-axis shows tissue enhancers in kidney and 127 other tissue samples from the Roadmap project ChromHMM data. Gray, non-significant, while white to red indicates significant enrichment (nominal two-sided p < 0.05 calculated by MESC). Asterisk indicates h2med enrichment passing FDR q < 0.05 (accounting for 4,352 tests for 128 enhancer CpG sets and 34 GWAS traits). b. GWAS heritability mediated by blood methylation categorized as enhancers in 128 tissues/cell types. The x-axis shows the GWAS traits, while the y-axis shows tissue enhancers in kidney and 127 other tissue samples from the Roadmap project ChromHMM data. Gray, non-significant, while white to red indicates significant enrichment (nominal two-sided p < 0.05 calculated by MESC). Asterisk indicates h2med enrichment passing FDR q < 0.05 (accounting for 4,352 tests for 128 enhancer CpG sets and 34 GWAS traits).
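
The multiple-testing burden quoted above (128 enhancer CpG sets × 34 GWAS traits = 4,352 tests) can be sketched as follows. The legend reports an FDR q value but does not name the procedure, so the Benjamini-Hochberg adjustment used here is an assumption, and the function name `benjamini_hochberg` and the example p values are hypothetical.

```python
N_ENHANCER_SETS = 128  # kidney plus 127 Roadmap tissues/cell types
N_GWAS_TRAITS = 34
N_TESTS = N_ENHANCER_SETS * N_GWAS_TRAITS  # 4,352 tests in total


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted q values in the input order."""
    m = len(p_values)
    # Indices of the p values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotone q values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q_values[i] = running_min
    return q_values


# Hypothetical nominal p values, for illustration only.
example_q = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
significant = [q < 0.05 for q in example_q]
```

In the figure, an asterisk would correspond to a cell whose q value falls below 0.05 after adjusting across all 4,352 tests rather than the four shown here.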

Extended Data Fig. 8 Gene prioritization for eGFRcrea GWAS variants and functional annotation.

a. Schematic representation of gene prioritization strategy based on eight prioritization datasets and methods. b. Number of eGFRcrea GWAS variants prioritized using different priority score thresholds. c. eGFRcrea GWAS independent loci prioritized by this study (priority score ≥ 1) and previous studies. The number represents the number of independent loci overlapping with independent signals prioritized (GPS score ≥ 1) by Stanzick et al. and/or creatinine-associated exome rare variants by Backman et al. or Barton et al. d. Features of the top variants prioritized for the 328 loci with priority score ≥ 3. Each row shows the top variant for each locus. Loci are ordered from top to bottom based on priority scores from 8 to 3. Loci with the same priority score are ordered by GWAS significance from strongest (dark blue) to weakest (light blue). Each column represents a feature overlapped with the variant. For each feature, the fraction of overlapping variants is shown in the upper panel. The 22 top prioritized genes supported by all eight datasets and methods are listed. e. Tissue specificity of 566 prioritized genes (priority score ≥ 3) in 54 tissue types (GTEx v8) using GENE2FUNC of FUMA. The x-axis is the 54 tissue types ordered according to significance of enrichment in up-regulated differentially expressed gene sets. The y-axis represents enrichment significance, -log10(p value calculated by hypergeometric test). Tissues with Bonferroni p value < 0.05 are shown in red. f. Heatmap of the expression of 417 mouse orthologues of prioritized genes in adult mouse kidney single cell dataset. The mean expression was calculated for each cell type and z-scores were plotted. Right panel shows 87 genes with the highest level of expression in proximal tubule cells.

Extended Data Fig. 9 PheWAS analysis of rs111653425 SLC47A1 variants in UKBB and BioMe Biobanks.

a. Single variant (rs111653425) PheWAS analysis of SLC47A1 in UKBB dataset. The x-axis is the strength of association -log10(p value calculated by linear regression PheWAS model). Blue line is p = 0.05 and red line is Bonferroni adjusted p = 0.05. The y-axis is the analyzed phenotype. b. SLC47A1 pLOF burden PheWAS analysis in BioMe dataset. The x-axis is the strength of association -log10(p value calculated by linear regression PheWAS model). Blue line is p = 0.05 and red line is Bonferroni adjusted p = 0.05. The y-axis is the analyzed phenotype. c. Single variant (rs111653425) PheWAS analysis of SLC47A1 in BioMe dataset. The x-axis is the strength of association -log10(p value calculated by linear regression PheWAS model). Blue line is p = 0.05 and red line is Bonferroni adjusted p = 0.05. The y-axis is the analyzed phenotype.

Extended Data Fig. 10 Slc47a1 loss confers kidney disease risk in mice.

a. Relative expression of fibrosis markers; Collagen3 (Col3a1), Collagen4 (Col4a1), Fibronectin (Fn1), and Connective tissue growth factor (Ctgf) in kidneys of control or cisplatin treated Slc47a1+/+ and Slc47a1−/− mice. Data are presented as mean ± SD. P values were calculated by one-way ANOVA with post hoc Tukey test. n.s., not significant. n = 4 biologically independent Slc47a1+/+ cisplatin mice examined over n = 3 independent Slc47a1+/+ control; n = 5 biologically independent Slc47a1−/− cisplatin mice examined over n = 4 independent Slc47a1+/+ cisplatin mice. b. Relative expression of markers of inflammation; Adhesion G protein-coupled receptor E1 (Adgre1), Tumor necrosis factor ligand (Tnfsf12), Interleukin 1beta (Il1b) in kidneys of control or cisplatin treated Slc47a1+/+ and Slc47a1−/− mice. Data are presented as mean ± SD. P values were calculated by one-way ANOVA with post hoc Tukey test. n.s., not significant. n = 4 biologically independent Slc47a1+/+ cisplatin mice examined over n = 3 independent Slc47a1+/+ control; n = 5 biologically independent Slc47a1−/− cisplatin mice examined over n = 4 independent Slc47a1+/+ cisplatin mice.

Supplementary information

Supplementary Information

Supplementary Note and Figs. 1–15.

Reporting Summary

Peer Review File

Supplementary Tables

Supplementary Tables 1–28.

Source data

Source Data Fig. 8

Unprocessed scan of gel image for Fig. 8i.

About this article

Cite this article

Liu, H., Doke, T., Guo, D. et al. Epigenomic and transcriptomic analyses define core cell types, genes and targetable mechanisms for kidney disease. Nat Genet 54, 950–962 (2022). https://doi.org/10.1038/s41588-022-01097-w

  • DOI: https://doi.org/10.1038/s41588-022-01097-w
