Article | Published:

Human pancreatic islet three-dimensional chromatin architecture provides insights into the genetics of type 2 diabetes

Abstract

Genetic studies promise to provide insight into the molecular mechanisms underlying type 2 diabetes (T2D). Variants associated with T2D are often located in tissue-specific enhancer clusters or super-enhancers. So far, such domains have been defined through clustering of enhancers in linear genome maps rather than in three-dimensional (3D) space. Furthermore, their target genes are often unknown. We have created promoter capture Hi-C maps in human pancreatic islets. This linked diabetes-associated enhancers to their target genes, often located hundreds of kilobases away. It also revealed >1,300 groups of islet enhancers, super-enhancers and active promoters that form 3D hubs, some of which show coordinated glucose-dependent activity. We demonstrate that genetic variation in hubs impacts insulin secretion heritability, and show that hub annotations can be used for polygenic scores that predict T2D risk driven by islet regulatory variants. Human islet 3D chromatin architecture, therefore, provides a framework for interpretation of T2D genome-wide association study (GWAS) signals.


Data availability

Raw sequence reads from pcHi-C, RNA-seq, ChIP-seq, ATAC-seq and 4C-seq are available from EGA (https://www.ebi.ac.uk/ega), under accession number EGAS00001002917. Processed data files for islet pcHi-C interactions, islet regulome annotations, enhancer–promoter assignments, hub coordinates and components and 3D model videos are provided as supplementary data. The robust set of ATAC-seq peaks, consistent set of Mediator, cohesin, H3K27ac and H3K4me3 peaks, list of islet super-enhancers defined using ROSE algorithm, islet regulome, ChromHMM segmentation model, list of islet TAD-like domains, PATs and the list of high-confidence pcHi-C interactions are provided as Supplementary Datasets and are also deposited at https://www.crg.eu/en/programmes-groups/ferrer-lab#datasets.

Code availability

Custom code in this manuscript is available upon request.

Ethics declarations

Competing interests

P.R. is a shareholder and consultant for Endocells/Unicercell Biosolutions.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


Acknowledgements

This research was supported by the National Institute for Health Research Imperial Biomedical Research Centre. Work was funded by grants from the Wellcome Trust (nos. WT101033 to J.F. and WT205915 to I.P.), Horizon 2020 (Research and Innovation Programme nos. 667191, to J.F., 633595, to I.P., and 676556, to M.A.M.-R.; Marie Sklodowska-Curie 658145, to I.M.-E., and 43062 ZENCODE, to G.A.), European Research Council (nos. 789055, to J.F., and 609989, to M.A.M.-R.), Marató TV3 (no. 201611, to J.F. and M.A.M.-R.), Ministerio de Ciencia Innovación y Universidades (nos. BFU2014-54284-R, RTI2018-095666, to J.F., BFU2017-85926-P, to M.A.M.-R., IJCI-2015-23352, to I.F.), AGAUR (to M.A.M.-R.), UK Medical Research Council (nos. MR/L007150/1, to P.F., and MR/L02036X/1, to J.F.), World Cancer Research Fund (WCRF UK, to I.P.) and World Cancer Research Fund International (no. 2017/1641 to I.P.), Biobanking and Biomolecular Resources Research Infrastructure (nos. BBMRI-NL, NWO 184.021.007, to I.O.F.). Work in IDIBAPS, CRG and CNAG was supported by the CERCA Programme, Generalitat de Catalunya and Centros de Excelencia Severo Ochoa (no. SEV-2012-0208). Human islets were provided through the European islet distribution program for basic research supported by JDRF (no. 3-RSC-2016-160-I-X). We thank N. Ruiz-Gomez for technical assistance; R. L. Fernandes, T. Thorne (University of Reading) and A. Perdones-Montero (Imperial College London) for helpful discussions regarding machine learning approaches; B. Lenhard and M. Merkenschlager (London Institute of Medical Sciences, Imperial College London), F. Müller (University of Birmingham) and J. L. Gómez-Skarmeta (Centro Andaluz de Biología del Desarrollo) for critical comments on the draft; the CRG Genomics Unit; and the Imperial College High Performance Computing Service.

Author information

I.M.-E., I.C. and B.M.J. performed and analyzed experiments. I.M.-E. and J.G.-H. processed human islet samples. I.M.-E., S.B.-G., I.C., J.P.-C., D.M.Y.R., G.A., C.C.M. and I.M. performed computational analysis. J.M.-E. and I.F. modeled and analyzed 3D data. L. Piemonti, T.B., E.J.P.d.K., J.K.-C., F.P. and P.R. provided material and reagents. E.V.R.A., A.L., A.P.G., D.R.W., O.P., N.G., J.M.M., D.T., I.O.F., I.P., T.H., and L.G. provided genetics data. M.R.-R. and L. Pasquali created software resources. I.C. and A.B. developed genome-editing methods. M.A.M.-R., P.F. and J.F. supervised analysis. I.M.-E., I.C., S.B.-G., J.P.-C., D.M.Y.R. and J.F. conceived the project. I.M.-E., S.B.-G., I.C. and J.F. wrote and edited the manuscript, which all authors have approved.

Correspondence to Jorge Ferrer.

Integrated supplementary information

Supplementary Figure 1 pcHi-C in human pancreatic islets.

a, Schematic representation of the pcHi-C analysis workflow. b, Relative frequency of high-confidence interactions between baits and interacting regions. c, Distances from bait to interacting regions for high-confidence interactions. The dashed line represents the median distance. d, CHiCAGO score distribution of high-confidence interactions in merged pcHi-C data (n=175,784) and individual islet samples, and in distance-matched interactions. Boxplots show IQR, and whiskers show 5th and 95th percentiles. e, Pairwise Pearson correlation values of CHiCAGO scores between individual islet samples and the merged dataset. f,g, Epigenomic maps and virtual 4C profiles in merged and individual human islet samples at the TCF7L2 and ISL1 loci. h,i, pcHi-C recapitulates interactions identified by 4C-seq in human islets and the human β cell line EndoC-βH1 at the ISL1 and MAFB loci. The top track depicts a virtual 4C representation of human islet pcHi-C data for both promoters. High-confidence interactions from 4 pooled human islet samples and naïve CD4+ T cells are shown below. Inverted triangles depict viewpoints.

Supplementary Figure 2 pcHi-C and chromatin landscape of human islets.

a, Binding patterns for the indicated epitopes in ±25 kb regions centered on interacting pcHi-C baits (top) and promoter-interacting regions (bottom). Expected occupancy profiles, obtained by randomizing the positions of the indicated signals 10 times, are represented with a red line, and the IQR is shown as shading. b, Relative frequency of CTCF binding sites in baits and non-bait interacting regions. Nearly 50% of interactions are associated with CTCF binding in at least one of the interacting regions. c, CTCF-binding motif orientation at CTCF-bound interacting regions. 56.62% of 9,657 interactions are convergent, consistent with expectations. d, Tissue selectivity of islet pcHi-C interactions relative to identically processed pcHi-C from erythroblasts, macrophages, naïve CD4+ T cells and total B lymphocytes. e, Genes located in baits with islet-selective interactions show higher islet-specificity scores of gene expression than genes with tissue-invariant interactions. The islet-specificity Z score was calculated with a gene expression distribution from 18 human tissues. P value was calculated with Wilcoxon’s two-sided signed rank test. Boxplot represents IQRs. f, Ratio of tissue-invariant to islet-selective interactions overlapping major open chromatin classes, normalized by the total number of tissue-invariant and islet-selective interactions. All categories showed significant differences with interactions in the remaining genome (Fisher’s P < 0.01).
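The islet-specificity Z score in panel e can be illustrated with a minimal sketch. It assumes the score is the islet expression value standardized against the mean and standard deviation across all tissues; the legend does not state the exact normalization, and the function name and toy values (four tissues rather than 18) are hypothetical.

```python
import statistics

def islet_specificity_z(expression_by_tissue, tissue="islet"):
    """Z score of one tissue's expression relative to all tissues.

    Assumed definition: (value - mean across tissues) / population SD.
    """
    values = list(expression_by_tissue.values())
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)
    if sd == 0:
        return 0.0  # no variation across tissues, so no specificity
    return (expression_by_tissue[tissue] - mean) / sd

# Toy expression values (hypothetical, arbitrary units)
toy_gene = {"islet": 50.0, "liver": 2.0, "muscle": 1.0, "brain": 3.0}
z_islet = islet_specificity_z(toy_gene)
z_flat = islet_specificity_z({"islet": 1.0, "liver": 1.0})
```

A gene expressed mainly in islets receives a large positive score, whereas a uniformly expressed gene scores zero.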

Supplementary Figure 3 Definition of TAD-like domains, PATs, and enhancer-gene assignments.

a, Features of islet TAD-like domains. b, Representative example of human islet TAD-like domains (chr11:1132582-4719948, hg19). Negative and positive directionality index (DI) scores are represented in blue and red, respectively. ESC and IMR90 TADs generated with Hi-C are shown for reference. c, Size of TAD-like domains in human islets and Hi-C TADs from ESC and IMR90 cells. d, TAD-like domains display known features of TADs, such as enrichment of CTCF binding and convergent CTCF motif orientation in borders. e, Tissue selectivity of islet TAD-like boundary regions was estimated by comparison with TADs defined by Hi-C in 21 tissues. f, Enhancers frequently interact with more than one gene. Fraction of enhancers showing high-confidence (CHiCAGO > 5) interactions to 1-5+ promoter “baits” in the same TAD. g, Schematic of promoter-associated three-dimensional spaces (PATs), defined as the genomic space that spans high-confidence interactions originating from one bait. h, Fraction of islet TAD-like spaces occupied by each PAT. i, ChromHMM state enrichments in PATs were consistent with the expression level of their associated genes. The heatmap shows ChromHMM state median log2 fold-enrichments in PATs over their genomic distributions, in 5 bins based on bait gene expression levels in human islets. j, Active islet enhancer or H3K9me3-enriched ChromHMM states in PATs were enriched over the remaining TAD-like space in accordance with islet expression of PAT genes. Only PATs at least 25% smaller than their TAD were used (n=7,085). Median enrichments (circles) and IQR (shade) are shown. k, Emission probabilities of the 15 ChromHMM states for all islet chromatin features used to create the model. l, Sequential steps used to impute the assignment of islet enhancers to target genes. m, CHiCAGO scores for imputed enhancer-promoter pairs vs. distance-matched controls (n=50 sets). P value is from Wilcoxon’s two-sided signed rank test. Boxplot represents IQRs. n, Genes assigned to enhancers were enriched in islet-specific genes, as compared with unassigned control genes from the same islet TAD-like structure (Chi-square P = 6 × 10−8). o, Islet exposure to 4 mM vs. 11 mM glucose causes widespread induction of H3K27 acetylation in islet enhancers. Dots represent H3K27ac-enriched regions, and are red if Benjamini-Hochberg adjusted P ≤ 0.05.
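The PAT definition in panel g (the genomic space spanned by high-confidence interactions originating from one bait) can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the tuple layout, the function name `pat_spans`, the inclusion of the bait fragment itself in the span, and the restriction to a single chromosome are all hypothetical simplifications. Only the CHiCAGO > 5 cutoff comes from the legend.

```python
CHICAGO_THRESHOLD = 5.0  # high-confidence cutoff named in the legend (CHiCAGO > 5)

def pat_spans(interactions, threshold=CHICAGO_THRESHOLD):
    """Return {bait_id: (start, end)} covering each bait and its high-confidence contacts.

    interactions: iterable of
        (bait_id, bait_start, bait_end, other_start, other_end, chicago_score)
    All coordinates are assumed to lie on the same chromosome.
    Baits without any high-confidence interaction get no PAT.
    """
    spans = {}
    for bait, b_start, b_end, o_start, o_end, score in interactions:
        if score <= threshold:
            continue  # keep only high-confidence interactions
        lo, hi = spans.get(bait, (b_start, b_end))
        spans[bait] = (min(lo, b_start, o_start), max(hi, b_end, o_end))
    return spans

# Toy interactions (hypothetical coordinates and scores)
toy_interactions = [
    ("ISL1", 1000, 1100, 5000, 5200, 7.2),
    ("ISL1", 1000, 1100, 200, 300, 5.5),
    ("ISL1", 1000, 1100, 9000, 9100, 3.0),  # below threshold, ignored
    ("MAFB", 100, 200, 400, 500, 4.9),      # below threshold, no PAT
]
toy_pats = pat_spans(toy_interactions)
```

In the toy data, the ISL1 PAT extends from the leftmost to the rightmost high-confidence contact, and the low-scoring contact at 9000 does not extend it.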

Supplementary Figure 4 eQTLs support the identification of unexpected T2D target genes.

a, T2D and FG-associated variants used to examine gene targets (see Supplementary Table 3). b, Proportion of DIAGRAM credible set SNPs with high posterior probability (PP > 0.1) mapping to islet regulome elements within intervals containing credible sets. Note the enrichment in active enhancers and promoters vs. 100 sets of elements shuffled within the genomic spaces that contain credible sets, shown as grey IQR boxplot distributions and outliers as black dots. Z-scores represent deviations from the mean of the shuffled distribution. c,d, Selected examples of loci with T2D-risk variants with gene targets supported by both significant eQTLs and pcHi-C, showing enhancer-gene assignments through pcHi-C high-confidence interactions (from pooled data, in magenta) and imputations (grey). Enhancer eQTL-eGene pairs are represented as horizontal black lines. A vertical yellow stripe highlights the eGene promoter. Concordant gene targets include c, STARD10 and d, ABCB9. pcHi-C interactions are represented as arcs connecting HindIII fragments. Boxplots show first and third quartiles as boxes and 1.5 × IQR as whiskers of gene expression for different genotypes, shown as PEER residuals, along with P and adjusted P (q) values from eQTL meta-analysis. Red dots represent individual PEER residual values of gene expression for 183 samples across different genotypes. For additional eQTL findings see Supplementary Table 2.
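The shuffling-based Z score in panel b can be sketched as below. The sketch assumes that the Z score is the observed overlap count minus the mean of the shuffled counts, divided by their standard deviation, and that shuffling places each element uniformly at random within one genomic interval; the function names, coordinates and interval are hypothetical, and only the use of 100 shuffled sets comes from the legend.

```python
import random
import statistics

def overlap_count(positions, elements):
    """Number of variant positions falling inside any element (half-open intervals)."""
    return sum(any(start <= p < end for start, end in elements) for p in positions)

def shuffle_elements(elements, space_start, space_end, rng):
    """Relocate each element to a random position within the space, preserving its length."""
    shuffled = []
    for start, end in elements:
        length = end - start
        new_start = rng.randrange(space_start, space_end - length + 1)
        shuffled.append((new_start, new_start + length))
    return shuffled

def enrichment_z(positions, elements, space, n_shuffles=100, seed=0):
    """Z score of the observed overlap relative to shuffled element sets (assumed definition)."""
    rng = random.Random(seed)
    observed = overlap_count(positions, elements)
    null = [
        overlap_count(positions, shuffle_elements(elements, space[0], space[1], rng))
        for _ in range(n_shuffles)
    ]
    sd = statistics.pstdev(null)
    if sd == 0:
        return float("nan")  # undefined when the null distribution has no spread
    return (observed - statistics.mean(null)) / sd

# Toy example (hypothetical): five variants clustered in one enhancer
toy_variants = [105, 110, 115, 120, 125]
toy_enhancers = [(100, 130)]
toy_observed = overlap_count(toy_variants, toy_enhancers)
toy_z = enrichment_z(toy_variants, toy_enhancers, space=(0, 200))
```

A positive Z score indicates that the variants overlap the real elements more often than they overlap randomly placed elements of the same size.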

Supplementary Figure 5 Functional perturbations of CAMK1D and OPTN.

a, Long-range interactions of the enhancer carrying rs11257655 are replicated in individual human islet pcHi-C samples. Note how interactions between this enhancer and OPTN are detected with high confidence (CHiCAGO > 5) in each pcHi-C replicate. b, Luciferase assay in the human β cell line EndoC-βH3 shows allele-dependent activity for the rs11257655-enhancer. Data are means ± s.d. (n=3 independent experiments, with 3-6 independent transfections). Statistical significance: two-tailed Student's t-test. c,d, Analysis of OPTN and CAMK1D mRNA after c, CRISPRi of the rs11257655-enhancer in HepG2 and d, CRISPRi or CRISPRa in EndoC-βH3 cells. Bars show average values of 3-4 gRNAs targeting either the rs11257655-enhancer or the transcriptional start sites. Data are presented as means ± s.e.m. (enhancer activation: 4 gRNAs n=6; inhibition: 4 gRNAs n=3). Statistical significance: two-tailed Student's t-test.

Supplementary Figure 6 Functional perturbations of TCF7L2.

a, Virtual 4C representations from pooled human islet samples centered on all genes in this locus show that the region containing rs7903146 connects with TCF7L2 through moderate-confidence interactions and an imputed assignment, without evidence for interactions with other genes. The HindIII fragment that contains the enhancer with rs7903146 is highlighted in yellow. The bottom panel reveals that this enhancer shows unusually high occupancy by Mediator and islet-enriched transcription factors in islet chromatin. b, RNA analysis in EndoC-βH3 cells after deletion of either the rs7903146-enhancer or a control region in the same locus. Deletions were tested with 2 different gRNA pairs, n=3 experiments. Statistical significance was determined using two-tailed Student's t-test. Only active genes in the locus were tested. c, RNA analysis in EndoC-βH3 cells after CRISPRa or CRISPRi of the rs7903146-enhancer. Statistical significance was determined using two-tailed Student's t-test (activation: 1 gRNA, n=3 experiments; inhibition: 3 gRNAs n=3 experiments).

Supplementary Figure 7 Functional perturbations of VEGFA and ZFAND3.

a,c, T2D variant-target gene assignments in VEGFA and ZFAND3 loci. pcHi-C and virtual 4C representations are from pooled samples. b,d, VEGFA or MDGA1 and ZFAND3 mRNAs in EndoC-βH3 cells after CRISPRa or CRISPRi of T2D-associated enhancers. C6orf223 was not detectable by qPCR. Note that we did not examine all potential targets near VEGFA (see other imputed genes in Supplementary Table 3). Data are presented as means ± s.e.m. (VEGFA enhancer CRISPRa: 3 guides n=3 experiments; VEGFA enhancer CRISPRi: 4 guides n=2 experiments; ZFAND3-MDGA1 enhancer: 4 guides n=3 experiments). Statistical significance was determined using two-tailed Student's t-test.

Supplementary Figure 8 Tissue-specific enhancer hubs.

a, Multiple logistic regression analysis was used to identify PAT features that predict islet-expressed genes with islet-selective vs. non-islet-selective expression. Islet-selective expression was examined as a surrogate endpoint because it is a property of many (though not all) genes important for islet cell identity. The PAT feature with the highest logistic regression coefficient was the number of non-islet tissues with promoter H3K27me3 enrichment. This feature was considered almost synonymous with islet-specific expression. The next highest coefficient was the number of assigned class I enhancers in the PAT. Further analysis showed that ≥3 assigned class I enhancers in a PAT optimized the prediction of islet-selective expression (Supplementary Fig. 9). b, Classification of PATs based on assigned enhancers revealed 2,623 enhancer-rich PATs (≥3 assigned class I enhancers). Enhancers are shown as red boxes. Turquoise and dashed green lines are high-confidence interactions and imputed assignments, respectively. c, Enhancer hubs were defined as enhancer-rich PATs, which were merged with other PATs connected through at least one common enhancer-associated high-confidence interaction. d, Descriptive characteristics of enhancer hubs in human islets. Multi-target enhancers show high-confidence interactions with two or more promoter-containing baits. e, Enhancer hubs are enriched in islet-selective interactions relative to non-hub PATs that had at least 1 high-confidence interaction. Boxes are IQR, notches are 95% CI of the median and P values are from Wilcoxon’s two-sided signed rank test. f, Linear genomic space occupied by class I enhancers in three-dimensional enhancer hubs compared with the space occupied by super-enhancers (SEs) calculated with the ROSE algorithm, all enhancers from linear enhancer clusters (ECs), and stretch enhancers. g-i, Venn diagrams depicting how often hub enhancers overlap with other human islet enhancer domains: g, SEs, h, highly bound (top two TF occupancy quartiles) ECs, and i, stretch enhancers. j-l, Islet enhancer hubs often contain enhancers that do not form part of SEs or ECs. Charts show the fraction of hub class I enhancers that overlapped SEs, ECs or stretch enhancers. Note that the genomic space occupied by stretch enhancers is an order of magnitude greater than that occupied by hubs (panel f). m-o, Islet enhancer hubs very frequently contain multiple SEs, ECs or stretch enhancers.
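The hub definition in panels b and c can be sketched as a connected-components problem. This is one possible reading rather than the authors' implementation: PATs that share an enhancer contacted through a high-confidence interaction are merged (here transitively, which is an assumption), and a merged group counts as a hub if it contains at least one enhancer-rich PAT (≥3 assigned class I enhancers, as stated in the legend). The function name, input dictionaries and identifiers are hypothetical.

```python
def build_hubs(assigned, high_conf, min_enhancers=3):
    """Group PATs into enhancer hubs.

    assigned:  {pat_id: set of assigned class I enhancer ids}
    high_conf: {pat_id: set of enhancer ids contacted via high-confidence interactions}
    Returns a sorted list of hubs, each a sorted list of PAT ids.
    """
    pats = sorted(set(assigned) | set(high_conf))
    parent = {p: p for p in pats}

    def find(p):
        # Union-find lookup with path halving
        while parent[p] != p:
            parent[p] = parent[parent[p]]
            p = parent[p]
        return p

    # Merge PATs that share at least one high-confidence enhancer contact
    first_pat_for_enhancer = {}
    for pat in pats:
        for enhancer in high_conf.get(pat, ()):
            if enhancer in first_pat_for_enhancer:
                parent[find(pat)] = find(first_pat_for_enhancer[enhancer])
            else:
                first_pat_for_enhancer[enhancer] = pat

    groups = {}
    for pat in pats:
        groups.setdefault(find(pat), set()).add(pat)

    # Keep only groups seeded by at least one enhancer-rich PAT
    rich = {p for p in pats if len(assigned.get(p, ())) >= min_enhancers}
    return sorted(sorted(group) for group in groups.values() if group & rich)

# Toy input (hypothetical identifiers)
toy_assigned = {"A": {"e1", "e2", "e3"}, "B": {"e3"}, "C": {"e9"}, "D": {"e4", "e5"}}
toy_high_conf = {"A": {"e1", "e3"}, "B": {"e3"}, "C": {"e9"}, "D": {"e4"}}
toy_hubs = build_hubs(toy_assigned, toy_high_conf)
```

In the toy input, PAT A is enhancer-rich and shares enhancer e3 with PAT B, so the two form one hub; C and D are neither enhancer-rich nor connected to an enhancer-rich PAT.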

Supplementary Figure 9 Alternative definitions of enhancer hubs.

We considered alternative definitions of hubs as follows: a, enhancer-rich PATs with ≥3 class I enhancers, but without merging interconnected PATs, b-e, enhancer-rich PATs with ≥2-5 assigned class I enhancers, merged with PATs interconnected through high-confidence enhancer interactions, f,g, enhancer-rich PATs with ≥2 or ≥3 class I enhancers exclusively assigned through high-confidence interactions, and then merged with PATs interconnected through high-confidence enhancer interactions, h, enhancer-rich PATs with ≥3 assigned class I enhancers, merged with PATs interconnected through promoter-promoter (instead of enhancer-promoter) interactions. We found that canonical islet-cell functional annotations ranked highest only in definitions with ≥3 assigned class I enhancers. Hubs with ≥4-5 assigned class I enhancers (d,e), as well as those defined exclusively with high-confidence interactions (f,g), showed high-ranking islet cell functional annotation enrichments, at the expense of reducing the number of hubs. Panels on the right show post-hoc VSE analysis of T2D/FG-associated SNPs (n=2,771; Supplementary Table 9). Consistent with the notion that the hub definitions in d-g were restrictive, they failed to show selective enrichment of T2D/FG-associated SNPs. Boxplots show null distributions based on 500 permutations of matched random haplotype blocks. Red dots indicate significant enrichment relative to the null distribution (Bonferroni-adjusted P < 0.01).

Supplementary Figure 10 3D models of enhancer hubs.

a, The FOXA2 locus forms a tissue-specific enhancer hub. Human islet epigenome maps and high-confidence pcHi-C interactions in islets and total B lymphocytes show that islet active enhancers, super-enhancers and enhancer clusters interact to form a single tissue-specific three-dimensional structure. b,c, 360° views of the top-scoring 3D model of the ISL1 enhancer hub in human islets and total B lymphocytes. Class I, II and III enhancers within 200 nm of the ISL1 promoter are colored dark to light red, while promoters within 200 nm of ISL1 (including ISL1) are colored blue. Islet enhancers and promoters are otherwise represented as white spheres. These models show that active islet regulatory elements interact in a common restricted space in islet nuclei. See also Supplementary Videos 1 and 2. d-h, Left panels show the most populated community of the promoter-enhancer interaction network in chosen hubs, as obtained via MCODE clustering, in human islets and total B lymphocytes. Network nodes are promoters (blue) and enhancers (dark to light red for enhancer classes I to III). Edges are mean distance values in the most populated 3D structure cluster. The central panel compares the neighborhood connectivity distribution of networks in both tissues. The right panel shows the 3D distances between hub promoters and enhancers in both tissues. All boxplots show IQRs and outliers as grey diamonds. The number of nodes analysed for each locus is shown in Supplementary Table 16. Statistical significance was computed using two-sided Kolmogorov-Smirnov test.
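The 200 nm proximity criterion used to color elements in panels b and c can be illustrated with a short sketch. It assumes each regulatory element is represented by a single 3D coordinate in nanometres taken from one model and that the cutoff is inclusive; the element names, coordinates and function name are hypothetical, and the sketch does not reproduce the ensemble modelling itself.

```python
import math

def elements_near(coords, anchor, radius_nm=200.0):
    """Names of elements whose Euclidean distance to the anchor is at most radius_nm.

    coords: {element_name: (x, y, z)} in nanometres (assumed units).
    """
    anchor_xyz = coords[anchor]
    return sorted(
        name
        for name, xyz in coords.items()
        if name != anchor and math.dist(anchor_xyz, xyz) <= radius_nm
    )

# Toy model coordinates (hypothetical)
toy_coords = {
    "ISL1_promoter": (0.0, 0.0, 0.0),
    "enhancer_1": (100.0, 0.0, 0.0),    # 100 nm away
    "enhancer_2": (0.0, 150.0, 150.0),  # about 212 nm away
    "enhancer_3": (300.0, 0.0, 0.0),    # 300 nm away
}
toy_neighbors = elements_near(toy_coords, "ISL1_promoter")
```

Applying the same function to models from two tissues would give the sets of elements to highlight in each, which is the comparison the panels visualize.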

Supplementary Figure 11 Epigenome editing of hubs carrying T2D risk noncoding variants.

a, pcHi-C and virtual 4C representations from pooled human islet samples in the ZBED3 locus for all promoters with active transcripts in the region. b, Islet pcHi-C assigns CRY2 and PHF21A as gene targets of an enhancer containing an FG-associated variant (vertical yellow stripe). c, Analysis of CRY2 and PHF21A mRNA after CRISPRa or CRISPRi of their transcriptional start sites or of the islet enhancer bearing the FG-associated variant rs1401419 in EndoC-βH3 cells. Data are presented as means ± s.e.m. (enhancer CRISPRa: 4 gRNAs n=3; CRISPRi: 2 gRNAs n=2). Statistical significance was determined using two-tailed Student's t-test.

Supplementary Figure 12 Epigenome editing of the C2CD4A/B hub.

a, Islet pcHi-C assigns C2CD4A and C2CD4B as gene targets of three enhancers containing T2D-associated variants (vertical yellow stripes) in the C2CD4A/B locus. pcHi-C and virtual 4C representations are from pooled human islet samples. b, Analysis of VPS13C, C2CD4A and C2CD4B mRNA after CRISPRa or CRISPRi targeting of their transcriptional start sites or of three islet enhancers bearing T2D-FG variants in EndoC-βH3 cells. Data are presented as means ± s.e.m. (CRISPRa: 4 gRNAs n=3 experiments; CRISPRi: 4 gRNAs n=2 experiments). Statistical significance was determined using two-tailed Student's t-test.

Supplementary Figure 13 Epigenome editing of the GLIS3 hub.

a, Islet pcHi-C virtual 4C representations from pooled samples, showing the T1D/T2D-associated locus GLIS3. The inset shows the enhancer bearing rs4237150. b, Luciferase assays in EndoC-βH3 cells show haplotype-dependent activity of the rs4237150-enhancer. Data are means ± s.d. (n=3 independent experiments with 4-6 independent transfections). Statistical significance: two-tailed Student's t-test. c, Analysis of GLIS3, RFX3 and RFX3-AS1 mRNA upon deletion of rs4237150-enhancer or control regions. Data are presented as means ± s.e.m. (2 pairs of gRNAs per target region, n=3 experiments each). Statistical significance: two-tailed Student's t-test. d, Analysis of predicted target gene transcripts after CRISPRa or CRISPRi targeting of the GLIS3 transcriptional start site or the rs4237150-enhancer in EndoC-βH3 cells. Data are means ± s.e.m. (enhancer CRISPRa: 3 gRNAs n=3 experiments; CRISPRi: 3 gRNAs n=2 experiments). Statistical significance: two-tailed Student's t-test. e, Top-scoring GLIS3 hub model from the most populated cluster of the ensemble in human islets and total B lymphocytes. Enhancers and promoters within 200 nm of the GLIS3 or RFX3 promoters are colored in red and blue, respectively, or shown as white spheres if located further away. f, Most populated community of the promoter-enhancer interaction network obtained via MCODE clustering of this locus in human islets and total B lymphocytes. Nodes represent promoters (blue) and enhancers (dark to light red for enhancer classes I to III). Edges are mean distances in the most populated 3D cluster. Although GLIS3 and RFX3 are connected in a common hub, the networks suggest that they form part of separable sub-communities. g, Neighborhood connectivity distribution in the islet and total B lymphocyte networks. h, 3D distance distribution between enhancers and promoters in the GLIS3 hub. Boxplots show IQRs. Statistical significance was computed using a two-sided, two-sample Kolmogorov-Smirnov test as described in Supplementary Fig. 10. See also Supplementary Table 16.

Supplementary Figure 14 T2D-associated variants are enriched in interacting regions and hub class I enhancers.

a,b, VSE enrichment analysis of T2D and FG (n=2,771) and breast cancer (n=3,048) variants in islet active regulatory elements (see Supplementary Dataset 1). Box plots show null distributions based on 500 permutations of matched random haplotype blocks. Each dot denotes VSE enrichment of disease-associated variants in each genomic feature. Red dots indicate significant enrichment relative to the null distribution (Bonferroni-adjusted P < 0.01). c, Breast cancer-associated variants show no enrichment in islet enhancer sub-classes. d-e, VSE enrichment analysis of T2D and FG and breast cancer SNPs in chromatin regions with high-confidence pcHi-C interactions in islets. f, VSE enrichment analysis of T2D and FG-associated variants in the indicated enhancer categories. All boxplots show IQRs.

Supplementary Figure 15 Class I enhancers in hubs contribute to heritability of beta cell-related traits.

a-e, Per-SNP heritability estimates of variants in eight islet enhancer domain subtypes calculated using summary statistics data from: a, T2D (12,931 cases and 57,196 controls); b, acute insulin release (AIR)-in vivo glucose tolerance test (IVGTT, up to 5,567 individuals); c, insulinogenic index (OGTT, 7,807 individuals); d, HOMA-B; and e, HOMA-IR (up to ~80,000 individuals). Bars show category-specific per-SNP heritability coefficients (τ_c) divided by the LD score heritability (h^2) observed for each trait. All normalized τ_c coefficients were multiplied by 10^7 and are shown with s.e.m. τ_c coefficients were estimated using stratified LD score regression, controlling for 53 functional annotation categories included in the baseline model. f, Per-SNP T2D and Attention-Deficit/Hyperactivity Disorder (ADHD, up to 55,374 individuals) heritability estimates in islet regulatory elements and Central Nervous System (CNS) annotations. τ_c coefficients, normalizations by h^2 and representations are as explained in panels a-e. g, Impact of polygenic risk scores (PRS) on T2D frequency. T2D frequency (y-axis) was calculated in 40 bins, each one representing 2.5% of individuals in the UK Biobank test set. PRS values were calculated with common genetic variants in islet hub enhancers and baits (pink dots), other islet open chromatin regions (light blue dots) and in the rest of the genome (black dots). h, T2D risk ratios stratified by BMI (left) and age of onset of T2D (right). Odds ratios (OR) for T2D were calculated for the 2.5% of individuals with the highest PRS versus all other individuals via adjusted logistic regression. Controls were censored at the age of recruitment. Boxplots show the IQR of the risk ratio from 100 sets of pseudo-hub PRS, with whiskers at 1.5 × IQR. Dot colors as in g.
For all panels, Z-scores define standard deviations relative to average values from pseudo-hub PRS. See also Supplementary Fig. 15 and Supplementary Table 17.
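The binning in panel g and the top-2.5% comparison in panel h can be sketched as follows. This is a simplified illustration under stated assumptions: the function names and toy data are hypothetical, and the odds ratio here is an unadjusted 2×2 estimate, whereas the legend describes an adjusted logistic regression.

```python
def frequency_by_prs_bin(scores, is_case, n_bins=40):
    """Rank individuals by PRS, split them into equal-sized bins and
    return the case frequency per bin (assumes len(scores) >= n_bins)."""
    order = sorted(range(len(scores)), key=lambda k: scores[k])
    n = len(order)
    freqs = []
    for b in range(n_bins):
        members = order[b * n // n_bins:(b + 1) * n // n_bins]
        freqs.append(sum(is_case[k] for k in members) / len(members))
    return freqs

def top_fraction_odds_ratio(scores, is_case, fraction=0.025):
    """Unadjusted odds ratio for the top `fraction` of PRS versus the rest."""
    order = sorted(range(len(scores)), key=lambda k: scores[k], reverse=True)
    n_top = max(1, round(len(order) * fraction))
    top, rest = order[:n_top], order[n_top:]
    a = sum(is_case[k] for k in top)   # cases in the top group
    b = len(top) - a                   # controls in the top group
    c = sum(is_case[k] for k in rest)  # cases in the rest
    d = len(rest) - c                  # controls in the rest
    # Haldane-Anscombe correction avoids division by zero in sparse tables.
    return ((a + 0.5) * (d + 0.5)) / ((b + 0.5) * (c + 0.5))

# Toy data: 80 individuals, cases concentrated at the highest scores.
scores = list(range(80))
is_case = [1 if s >= 78 else 0 for s in scores]
freqs = frequency_by_prs_bin(scores, is_case)
odds_ratio = top_fraction_odds_ratio(scores, is_case)
```

With 40 bins, each bin holds 2.5% of individuals, so the last bin in `freqs` corresponds to the group compared against all others in the odds ratio.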

Supplementary information

Supplementary Information

Supplementary Figs. 1–15 and Supplementary Notes 1–18

Reporting Summary

Supplementary Tables

Supplementary Tables 1–17

Supplementary Datasets

Supplementary Datasets 1–11

Supplementary Video 1

Top-scoring 3D model of ISL1 enhancer hub in human islets.

Supplementary Video 2

Top-scoring 3D model of ISL1 enhancer hub in total B lymphocytes.

Fig. 1: The promoter interactome of human pancreatic islets.
Fig. 2: Identification of target genes of islet enhancers.
Fig. 3: Identification of gene targets of T2D-relevant enhancers.
Fig. 4: Tissue-specific enhancer hubs regulate key islet genes.
Fig. 5: Tissue-specific topology of the ISL1 enhancer hub.
Fig. 6: The ZBED3 enhancer hub links an enhancer bearing a T2D SNP with multiple target genes.
Fig. 7: Islet hub variants impact insulin secretion and provide tissue-specific risk scores.
Supplementary Figure 1: pcHi-C in human pancreatic islets.
Supplementary Figure 2: pcHi-C and chromatin landscape of human islets.
Supplementary Figure 3: Definition of TAD-like domains, PATs, and enhancer-gene assignments.
Supplementary Figure 4: eQTLs support the identification of unexpected T2D target genes.
Supplementary Figure 5: Functional perturbations of CAMK1D and OPTN.
Supplementary Figure 6: Functional perturbations of TCF7L2.
Supplementary Figure 7: Functional perturbations of VEGFA and ZFAND3.
Supplementary Figure 8: Tissue-specific enhancer hubs.
Supplementary Figure 9: Alternative definitions of enhancer hubs.
Supplementary Figure 10: 3D models of enhancer hubs.
Supplementary Figure 11: Epigenome editing of hubs carrying T2D risk noncoding variants.
Supplementary Figure 12: Epigenome editing of the C2CD4A/B hub.
Supplementary Figure 13: Epigenome editing of the GLIS3 hub.
Supplementary Figure 14: T2D-associated variants are enriched in interacting regions and hub class I enhancers.
Supplementary Figure 15: Class I enhancers in hubs contribute to heritability of beta cell-related traits.