Promoter capture Hi-C-based identification of recurrent noncoding mutations in colorectal cancer


Efforts are being directed to systematically analyze the non-coding regions of the genome for cancer-driving mutations [1–6]. cis-regulatory elements (CREs) represent a highly enriched subset of the non-coding regions of the genome in which to search for such mutations. Here we use high-throughput chromosome conformation capture techniques (Hi-C) for 19,023 promoter fragments to catalog the regulatory landscape of colorectal cancer in cell lines, mapping CREs and integrating these with whole-genome sequence and expression data from The Cancer Genome Atlas [7,8]. We identify a recurrently mutated CRE interacting with the ETV1 promoter affecting gene expression. ETV1 expression influences cell viability and is associated with patient survival. We further refine our understanding of the regulatory effects of copy-number variations, showing that RASL11A is targeted by a previously identified enhancer amplification [1]. This study reveals new insights into the complex genetic alterations driving tumor development, providing a paradigm for employing chromosome conformation capture to decipher non-coding CREs relevant to cancer biology.


Fig. 1
Fig. 2: Non-coding mutations in CREs.
Fig. 3: Mutations in CREs affect ETV1 expression.
Fig. 4: Amplification of the CRE upregulates RASL11A expression.
Fig. 5: ETV1 and RASL11A levels are associated with differential cell growth.

Data availability

Hi-C, CHi-C and histone ChIP–seq sequencing data have been deposited in the European Genome-phenome Archive (EGA) under accession number EGAS00001001946. WGS, RNA-seq, CNV and survival data for TCGA COAD and READ samples and RNA-seq data for HT29 and LoVo lines (CCLE program) were obtained from the NCI Genomic Data Commons Data Portal (see URLs). Transcription-factor ChIP–seq data were obtained from the Gene Expression Omnibus (GEO) (GSE49402). Survival data were obtained from GEO (GSE33113, GSE39582). Replication timing data were downloaded from the UCSC Genome Browser (see URLs). GTEx data (release v.6) were obtained from the GTEx Portal (see URLs).


References

  1. Zhang, X. et al. Identification of focally amplified lineage-specific super-enhancers in human epithelial cancers. Nat. Genet. 48, 176–182 (2016).
  2. Sur, I. & Taipale, J. The role of enhancers in cancer. Nat. Rev. Cancer 16, 483–493 (2016).
  3. Kim, K. et al. Chromatin structure-based prediction of recurrent noncoding mutations in cancer. Nat. Genet. 48, 1321–1326 (2016).
  4. Weischenfeldt, J. et al. Pan-cancer analysis of somatic copy-number alterations implicates IRS4 and IGF2 in enhancer hijacking. Nat. Genet. 49, 65–74 (2017).
  5. Fujimoto, A. et al. Whole-genome mutational landscape and characterization of noncoding and structural mutations in liver cancer. Nat. Genet. 48, 500–509 (2016).
  6. Melton, C., Reuter, J. A., Spacek, D. V. & Snyder, M. Recurrent somatic mutations in regulatory regions of human cancer genomes. Nat. Genet. 47, 710–716 (2015).
  7. The Cancer Genome Atlas Network. Comprehensive molecular characterization of human colon and rectal cancer. Nature 487, 330–337 (2012).
  8. Mifsud, B. et al. Mapping long-range promoter contacts in human cells with high-resolution capture Hi-C. Nat. Genet. 47, 598–606 (2015).
  9. Weinhold, N., Jacobsen, A., Schultz, N., Sander, C. & Lee, W. Genome-wide analysis of noncoding regulatory mutations in cancer. Nat. Genet. 46, 1160–1165 (2014).
  10. Fredriksson, N. J., Ny, L., Nilsson, J. A. & Larsson, E. Systematic analysis of noncoding somatic mutations and gene expression alterations across 14 tumor types. Nat. Genet. 46, 1258–1263 (2014).
  11. Mansour, M. R. et al. An oncogenic super-enhancer formed through somatic mutation of a noncoding intergenic element. Science 346, 1373–1377 (2014).
  12. Puente, X. S. et al. Non-coding recurrent mutations in chronic lymphocytic leukaemia. Nature 526, 519–524 (2015).
  13. Javierre, B. M. et al. Lineage-specific genome architecture links enhancers and non-coding disease variants to target gene promoters. Cell 167, 1369–1384 (2016).
  14. Rao, S. S. et al. A 3D map of the human genome at kilobase resolution reveals principles of chromatin looping. Cell 159, 1665–1680 (2014).
  15. Jager, R. et al. Capture Hi-C identifies the chromatin interactome of colorectal cancer risk loci. Nat. Commun. 6, 6178 (2015).
  16. Orlando, G., Kinnersley, B. & Houlston, R. S. Capture Hi-C library generation and analysis to detect chromatin interactions. Curr. Protoc. Hum. Genet. 98, e63 (2018).
  17. Roadmap Epigenomics Consortium et al. Integrative analysis of 111 reference human epigenomes. Nature 518, 317–330 (2015).
  18. Alexandrov, L. B. et al. Signatures of mutational processes in human cancer. Nature 500, 415–421 (2013).
  19. Katainen, R. et al. CTCF/cohesin-binding sites are frequently mutated in cancer. Nat. Genet. 47, 818–821 (2015).
  20. Martincorena, I. et al. Universal patterns of selection in cancer and somatic tissues. Cell 171, 1029–1041 (2017).
  21. Rheinbay, E. et al. Recurrent and functional regulatory mutations in breast cancer. Nature 547, 55–60 (2017).
  22. Imielinski, M., Guo, G. & Meyerson, M. Insertions and deletions target lineage-defining genes in human cancers. Cell 168, 460–472 (2017).
  23. Jeon, I. S. et al. A variant Ewing’s sarcoma translocation (7;22) fuses the EWS gene to the ETS gene ETV1. Oncogene 10, 1229–1234 (1995).
  24. Attard, G. et al. Heterogeneity and clinical significance of ETV1 translocations in human prostate cancer. Br. J. Cancer 99, 314–320 (2008).
  25. Clark, J. P. & Cooper, C. S. ETS gene fusions in prostate cancer. Nat. Rev. Urol. 6, 429–439 (2009).
  26. Jane-Valbuena, J. et al. An oncogenic role for ETV1 in melanoma. Cancer Res. 70, 2075–2084 (2010).
  27. Chi, P. et al. ETV1 is a lineage survival factor that cooperates with KIT in gastrointestinal stromal tumours. Nature 467, 849–853 (2010).
  28. Ran, L. et al. Combined inhibition of MAP kinase and KIT signaling synergistically destabilizes ETV1 and suppresses GIST tumor growth. Cancer Discov. 5, 304–315 (2015).
  29. Grant, C. E., Bailey, T. L. & Noble, W. S. FIMO: scanning for occurrences of a given motif. Bioinformatics 27, 1017–1018 (2011).
  30. Kulakovskiy, I. V. et al. HOCOMOCO: expansion and enhancement of the collection of transcription factor binding sites models. Nucleic Acids Res. 44, D116–D125 (2016).
  31. Zhou, J. & Troyanskaya, O. G. Predicting effects of noncoding variants with deep learning-based sequence model. Nat. Methods 12, 931–934 (2015).
  32. The GTEx Consortium. The genotype-tissue expression (GTEx) pilot analysis: multitissue gene regulation in humans. Science 348, 648–660 (2015).
  33. Pistoni, M., Verrecchia, A., Doni, M., Guccione, E. & Amati, B. Chromatin association and regulation of rDNA transcription by the Ras-family protein RasL11a. EMBO J. 29, 1215–1224 (2010).
  34. de Sousa, E. M. F. et al. Methylation of cancer-stem-cell-associated Wnt target genes predicts poor prognosis in colorectal cancer patients. Cell Stem Cell 9, 476–485 (2011).
  35. Marisa, L. et al. Gene expression classification of colon cancer into molecular subtypes: characterization, validation, and prognostic value. PLoS Med. 10, e1001453 (2013).
  36. The ENCODE Project Consortium. An integrated encyclopedia of DNA elements in the human genome. Nature 489, 57–74 (2012).
  37. Rands, C. M., Meader, S., Ponting, C. P. & Lunter, G. 8.2% of the human genome is constrained: variation in rates of turnover across functional element classes in the human lineage. PLoS Genet. 10, e1004525 (2014).
  38. Sizemore, G. M., Pitarresi, J. R., Balakrishnan, S. & Ostrowski, M. C. The ETS family of oncogenic transcription factors in solid tumours. Nat. Rev. Cancer 17, 337–351 (2017).
  39. Duensing, A. Targeting ETV1 in gastrointestinal stromal tumors: tripping the circuit breaker in GIST? Cancer Discov. 5, 231–233 (2015).
  40. Langmead, B. & Salzberg, S. L. Fast gapped-read alignment with Bowtie 2. Nat. Methods 9, 357–359 (2012).
  41. Wingett, S. et al. HiCUP: pipeline for mapping and processing Hi-C data. F1000Res. 4, 1310 (2015).
  42. Cairns, J. et al. CHiCAGO: robust detection of DNA looping interactions in capture Hi-C data. Genome Biol. 17, 127 (2016).
  43. Trapnell, C. et al. Transcript assembly and quantification by RNA-seq reveals unannotated transcripts and isoform switching during cell differentiation. Nat. Biotechnol. 28, 511–515 (2010).
  44. Schoenfelder, S. et al. The pluripotent regulatory circuitry connecting promoters to their long-range interacting elements. Genome Res. 25, 582–597 (2015).
  45. Pollard, K. S., Hubisz, M. J., Rosenbloom, K. R. & Siepel, A. Detection of nonneutral substitution rates on mammalian phylogenies. Genome Res. 20, 110–121 (2010).
  46. Yan, J. et al. Transcription factor binding in human cells occurs in dense clusters formed around cohesin anchor sites. Cell 154, 801–813 (2013).
  47. Van den Eynden, J. & Larsson, E. Mutational signatures are critical for proper estimation of purifying selection pressures in cancer somatic mutation data when using the dN/dS metric. Front. Genet. 8, 74 (2017).
  48. Johnson, W. E., Li, C. & Rabinovic, A. Adjusting batch effects in microarray expression data using empirical Bayes methods. Biostatistics 8, 118–127 (2007).
  49. Grubert, F. et al. Genetic control of chromatin states in humans involves local and distal chromosomal interactions. Cell 162, 1051–1065 (2015).
  50. Robinson, M. D., McCarthy, D. J. & Smyth, G. K. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26, 139–140 (2010).
  51. Hansen, R. S. et al. Sequencing newly replicated DNA reveals widespread plasticity in human replication timing. Proc. Natl Acad. Sci. USA 107, 139–144 (2010).
  52. Aulchenko, Y. S., Ripke, S., Isaacs, A. & van Duijn, C. M. GenABEL: an R library for genome-wide association analysis. Bioinformatics 23, 1294–1296 (2007).
  53. Carter, H. et al. Interaction landscape of inherited polymorphisms with somatic events in cancer. Cancer Discov. 7, 410–423 (2017).
  54. Mermel, C. H. et al. GISTIC2.0 facilitates sensitive and confident localization of the targets of focal somatic copy-number alteration in human cancers. Genome Biol. 12, R41 (2011).
  55. Litchfield, K. et al. Whole-exome sequencing reveals the mutational spectrum of testicular germ cell tumours. Nat. Commun. 6, 5973 (2015).
  56. Drier, Y. et al. Somatic rearrangements across cancer reveal classes of samples with distinct patterns of DNA breakage and rearrangement-induced hypermutability. Genome Res. 23, 228–235 (2013).
  57. Heigwer, F., Kerr, G. & Boutros, M. E-CRISP: fast CRISPR target site identification. Nat. Methods 11, 122–123 (2014).



Acknowledgements

This work was supported by Cancer Research UK (grant C1298/A8362), the European Union Seventh Framework Programme (FP7/2007–2013) under grant 258236 and the FP7 collaborative project SYSCOL, all awarded to R.S.H. This publication is supported by COST Action BM1206. Additional support came from the CIHR-funded Epigenome Mapping Centre at McGill University (EP1-120608), awarded to T.P. We acknowledge the work of The Institute of Cancer Research Tumour Profiling Unit. The results published here are in part based on data generated by TCGA, established by the NCI and NHGRI. Information about TCGA and the investigators and institutions that constitute the TCGA research network can be found at

Author information

G.O., P.J.L. and R.S.H. conceived and designed the study; G.O. performed Hi-C and CHi-C experiments, luciferase assays, CRISPR experiments, and cell viability and proliferation assays; G.O. and P.B. performed 3C validation; G.O., P.J.L., A.J.C., S.E.D., D.C. and K.L. performed bioinformatics; F.H. performed ChIP–seq experiments; T.P. and J.T. contributed reagents and materials for the ChIP–seq experiments; C.S.O. designed the capture baits; and G.O., P.J.L., A.J.C., S.E.D., D.C., P.B. and R.S.H. wrote the manuscript with contributions from T.P., C.S.O. and J.T. All authors reviewed the final manuscript.

Correspondence to Richard S. Houlston.

Ethics declarations

Competing interests

The authors declare no competing interests.


Supplementary information

Supplementary Text and Figures

Supplementary Figures 1–18 and Supplementary Methods

Reporting Summary

Supplementary Tables 1–19
