Abstract
The transcription factor Runx1 is essential for definitive haematopoiesis, and the RUNX1 gene is frequently translocated or mutated in leukaemia. Runx1 is transcribed from two promoters, P1 and P2, to give rise to different protein isoforms. Although the expression of Runx1 must be tightly regulated for normal blood development, the mechanisms that regulate Runx1 isoform expression during haematopoiesis remain poorly understood. Gene regulatory elements located in non-coding DNA are likely to be important for Runx1 transcription. Here we use circular chromosome conformation capture sequencing to identify DNA interactions with the P1 and P2 promoters of Runx1, and the previously identified +24 enhancer, in the mouse multipotent haematopoietic progenitor cell line HPC-7. The active promoter, P1, interacts with nine non-coding regions that are occupied by transcription factors within a 1 Mb topologically associated domain. Eight of nine regions function as blood-specific enhancers in zebrafish, of which two were previously shown to harbour blood-specific enhancer activity in mice. Interestingly, the +24 enhancer interacted with multiple distant regions on chromosome 16, suggesting it may regulate the expression of additional genes. The Runx1 DNA contact map identifies connections with multiple novel and known haematopoietic enhancers that are likely to be involved in regulating Runx1 expression in haematopoietic progenitor cells.
Introduction
Runx1 is a key regulator of haematopoietic development. Deletion of Runx1 in mouse embryos is lethal at embryonic day (E) 12.5 due to the complete absence of definitive blood cell progenitors accompanied by extensive haemorrhaging1,2. Runx1 is crucial for haematopoietic stem cell (HSC) emergence and maintenance during development3, since conditional ablation of Runx1 in adult mice results in HSC exhaustion4. In acute myeloid leukaemia (AML) and myelodysplastic syndrome, RUNX1 function is frequently altered through mutations or translocations5, resulting in dysregulation of its target genes. While mutations directly affecting the RUNX1 protein are common in leukaemia, mutations in regulatory elements that affect RUNX1 expression remain enigmatic. As yet unidentified mutations in regulatory elements, such as enhancers, could alter Runx1 expression, resulting in abnormal haematopoiesis.
Runx1 is transcribed from two promoters, P1 and P2, to give rise to different protein isoforms6. Expression of these isoforms is tightly controlled during haematopoiesis. At the onset of mouse haematopoiesis (E7.5), preceding the generation of HSCs, expression of the P2 isoform(s) is predominant7,8. P1 is expressed soon after P2, and its expression is synchronised with the generation of HSCs7,9. P1 expression is predominant in the mouse fetal liver, the main site of definitive haematopoietic stem/progenitor cell (HSPC) development from E12.5 onward9.
Regulatory elements, such as enhancers, can control the expression of genes via long-range chromatin interactions10. One previously identified Runx1 enhancer is located 24 kb downstream of the P1 transcriptional start site11,12. The +24 enhancer (also known as the +23 13 or the +23.5 12 enhancer) is active in HSCs that express Runx1 during mouse embryogenesis11,12,13. The human equivalent of the +24 enhancer (+32 kb downstream of P1) directly contacts the promoters of RUNX1 in leukaemia cell lines14. In addition to the +24 enhancer, putative regulatory elements for RUNX1 have been identified upstream of RUNX1-P1 and between P1 and P2; however, whether they directly contact the RUNX1 promoters has not been investigated15,16.
Here we used circular chromosome conformation capture sequencing (4C-seq) to identify regulatory elements that interact with an active Runx1 P1 promoter, versus an inactive P2 promoter. While Hi-C provides genome-wide contact profiles, and Capture Hi-C enriches for interactions with preselected genomic features (usually promoters), 4C-seq can generate very high resolution contact profiles from ‘baits’ of particular interest17. Therefore, 4C-seq can yield richer information about a selected genomic region than Hi-C or Capture Hi-C. HPC-7 is a well characterised mouse HSPC line with genomic annotations, including transcription factor (TF) binding, histone modifications and chromatin accessibility18,19,20. 4C-seq in HPC-7 cells identified nine haematopoietic enhancers that interact with the P1 promoter and +24 enhancer, and that are occupied by haematopoietic TFs. Four of these were previously identified, three of which were functionally tested and two out of the three showed activity during mouse haematopoiesis15. Here we assessed the activity of all nine enhancers in zebrafish, and show that eight are active in zebrafish haematopoiesis. Further, the +24 enhancer was highly interactive both within a topologically associated domain (TAD) harbouring Runx1, as well as with loci outside the TAD. Collectively, our results point to the formation of a local ‘active chromatin hub’ controlling Runx1 expression in haematopoietic cells.
Results
We first confirmed that P1 is actively expressed in HPC-7 cells, while P2 is silent (Supplementary Fig. S1). 4C-seq in HPC-7 cells using the P1 and P2 promoters and the +24 enhancer as ‘baits’ identified genomic interactions at Runx1 (Fig. 1a). Bait locations were designed taking into account cohesin and CTCF binding sites near both promoters and the +24 enhancer. 4C baits were designed to regions of interest (P1, +24, P2), allowing for comparison of interactions between the active P1 promoter and inactive P2 promoter, with secondary baits located at nearby cohesin/CTCF (cc) binding sites (P1cc, +24cc, P2cc) (Fig. 1a).
4C-seq in HPC-7 cells confirms the presence of a 1.1 Mb domain harbouring Runx1
For each bait, two replicate 4C-seq libraries and one control library were sequenced. Reads were predominantly located within a 1.1 Mb region surrounding the Runx1 gene (Fig. 1b,c), representing a TAD harbouring Runx1. From visual inspection of the contact profile (Fig. 1b,c), domain boundaries appear to be present at the Cbr1/Setd4 genes upstream of Runx1, and Clic6 downstream (Fig. 1b). A comparison with existing Hi-C data from mouse embryonic stem cells, mouse CH12 (erythroleukaemia) cells, human GM12878 (lymphoblastoid), human K562 (myeloid leukaemia) and human IMR90 (foetal lung) cells revealed that TAD boundaries are conserved, and that our 4C data is consistent with existing Hi-C data (Supplementary Fig. S2). Most P1 and +24 interactions take place upstream of the Runx1 gene (Fig. 1b,c). In contrast, there are fewer upstream interactions from P2, while downstream interactions are retained (Fig. 1b,c). There are no other coding genes within the Runx1 domain; however, three inactive non-coding genes19, Mir802 and the long non-coding RNAs 1810053B23Rik and 1700029J03Rik, are located near the upstream border of the TAD.
Interactions from ‘cc’ baits (cohesin/CTCF binding sites) were similar to their nearby corresponding baits at P1, +24 and P2 (Fig. 1b). We investigated whether significant interactions for the ‘cc’ baits have more overlap with other cohesin or CTCF binding sites than for the non ‘cc’ baits, but did not find any difference (data not shown). The ‘cc’ baits may be too close to the corresponding P1, P2 and +24 baits to resolve unique interactions, although they are separated by at least one restriction fragment.
Chromatin interactions anchored by the Runx1 P1 and P2 promoters and +24 enhancer
Most of the significant interactions for P1, P2 and the +24 enhancer occur within the 1.1 Mb domain (Fig. 1b). There are ~40–50 significant interactions for P1cc/P1 baits, ~60–75 for +24/+24cc baits, and ~30 for P2/P2cc baits (Fig. 1d). Strikingly, the +24 enhancer in particular forms many significant interactions outside the Runx1 domain (Fig. 1b,d). Many of these long-range interactions are with other genes or gene promoters on mouse chromosome 16 (Fig. 2). Expression levels of these genes in human tissues according to Genotype-Tissue Expression (GTEx, https://www.gtexportal.org/home/)21 did not reveal obvious tissue-specific expression patterns. However, among the genes contacted by +24 are Erg, a haematopoietic TF, and Tiam1 (T-cell lymphoma invasion and metastasis 1), involved in cell adhesion and cell migration. These distant connections suggest that +24 may also regulate other haematopoietic genes in adjacent domains.
Identification of haematopoietic enhancers
We hypothesised that DNA connections formed from the active P1 promoter and the +24 enhancer may correspond to haematopoietic enhancers that regulate Runx1 expression. To identify putative enhancers, we aligned significantly interacting sites with the occupancy of thirteen TFs involved in haematopoietic progenitor cell production; enhancer histone modifications and DNase I hypersensitivity sites18,19,20; and conserved non-coding elements (CNE), which were identified based on comparative genomics alignment and retroviral integration site mapping to indicate potential DNaseI hypersensitive sites11. We note that the +24 enhancer binds all thirteen haematopoietic progenitor TFs in HPC-7 cells (Fig. 3).
We selected putative enhancers based on the binding of at least six TFs involved in haematopoietic progenitor cell development (Fig. 3), and the presence of a significant interaction directly at, or within 2 kb of, the TF binding cluster. Based on these criteria, we found eight other potential enhancers within the Runx1 domain that form connections to the baits. These enhancers were named according to their distance from the P1 transcriptional start site: −371, −354, −327, −321, −303, −58, −48, and +110 (Fig. 3b). An additional interacting region at −368, a CNE, was located adjacent to a cluster of haematopoietic TFs (Fig. 3b). The +24 and +110 enhancers are also CNEs11. The putative enhancers either form distinct blocks upstream of Runx1-P1 (from −371 to −303 and from −58 to −48), or fall between the P1 and P2 promoters (+24 and +110). Four out of nine of the identified putative enhancers (−327, −321, −58 and +110) were identified previously based on the binding of at least three blood TFs and the presence of an H3K27ac peak (denoting active chromatin) from a subset of the same HPC-7 datasets that we have used here15.
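The selection rule described above can be written as a simple filter. The following Python sketch is illustrative only: the function names, data structures and toy coordinates are assumptions and do not reproduce the analysis performed in this study. Only the two thresholds (at least six bound TFs, and a significant interaction at or within 2 kb of the TF cluster) are taken from the text.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # (start, end) in bp on the same chromosome


def gap_between(a: Interval, b: Interval) -> int:
    """Distance in bp between two intervals; 0 if they overlap or touch."""
    return max(0, max(a[0], b[0]) - min(a[1], b[1]))


def select_putative_enhancers(
    tf_clusters: List[Tuple[Interval, int]],
    interactions: List[Interval],
    min_tfs: int = 6,
    max_gap: int = 2000,
) -> List[Interval]:
    """Keep TF clusters bound by >= min_tfs factors that also have a
    significant interaction at, or within max_gap bp of, the cluster."""
    selected = []
    for region, n_tfs in tf_clusters:
        if n_tfs < min_tfs:
            continue  # too few haematopoietic TFs bound
        if any(gap_between(region, hit) <= max_gap for hit in interactions):
            selected.append(region)
    return selected


# Toy, made-up coordinates: (cluster interval, number of TFs bound).
clusters = [((1000, 1500), 8), ((10000, 10400), 4), ((20000, 20300), 7)]
hits = [(3000, 3400), (30000, 30500)]
print(select_putative_enhancers(clusters, hits))
```

In this toy input only the first cluster passes both criteria: the second binds too few TFs, and the third has no significant interaction within 2 kb.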
Eight out of nine putative enhancers (−371, −368, −354, −327, −321, −58, −48, and +110) form long-range interactions with the P1 promoter, the +24 enhancer, or both (Fig. 3b and Supplementary Table S1). The exception was −303, which does not interact with P1, but instead with the +24 and the P2 promoter. The +24 enhancer interacts promiscuously within the whole domain. Filtering at maximum stringency (see Methods) showed that the +24 enhancer connects to all putative enhancers and both Runx1 promoters (Fig. 3b). In contrast, the inactive P2 promoter connects only to −303 and +24 (Fig. 3b and Supplementary Table S1). A specific interaction between +110 and P2 could not be resolved owing to a contiguous block of significant interactions throughout the short region between +110 and P2. A model of Runx1 interactions based on the 4C-seq results is shown in Fig. 4.
Long range chromatin interactions at haematopoietic enhancers
Long-range chromatin interactions can be mediated by cohesin and CTCF22, and cohesin is involved in transcription regulation at active genes23. We found that four of the nine enhancer loci (in addition to the +24 enhancer) coincide with Rad21 (cohesin) binding in the absence of CTCF (Fig. 3b and Supplementary Table S1). This is consistent with the idea that cohesin (but not necessarily CTCF) mediates local DNA-DNA interactions within TADs22. All Rad21 binding sites interacted with at least one ‘cc’ bait; cohesin could therefore mediate at least a subset of enhancer-promoter communication events in HPC-7 cells.
We compared the 4C interactions identified with recently published Capture Hi-C data in HPC-7 cells18 (Supplementary Fig. S3). Capture Hi-C data was only available for interactions anchored at P1, and has a lower coverage and resolution than our 4C-seq study (an average of ~18,000 reads per promoter for Capture Hi-C with a 6-cutter, and over 1 million reads per bait for 4C-seq with a 4-cutter). The Capture Hi-C study in HPC-7 cells identified 15 P1-interacting regions that were reproduced in our study (Supplementary Fig. S3). All of these are upstream of Runx1-P1, and most are within the −368 to −303 enhancer cluster (Supplementary Fig. S3). Therefore, our study provides additional Runx1-anchored interactions that were not previously described as connected to Runx1 promoters (enhancers −371, −48, −58 and +110).
In vivo characterization of haematopoietic enhancers
Enhancer regions interacting with Runx1 recruit haematopoietic TFs in HPC-7 cells; we therefore tested whether these regions act as enhancers in vivo. Each of the putative enhancers was tested for the ability to drive tissue-specific GFP expression in zebrafish embryos. Eight out of nine drove GFP expression specifically in the intermediate cell mass and posterior blood island, which are sites of haematopoietic progenitor cell production at 20–24 hours post-fertilisation (hpf)24 (Fig. 5a,b and Supplementary Fig. S4). Two of these (−58 and +110) were previously shown to be active during mouse haematopoiesis15. The −303 enhancer also drove GFP expression in keratinocytes, particularly after 24 hpf (Fig. 5c and Supplementary Fig. S4). Interestingly, −303 was the only enhancer identified that interacted with the P2 promoter of Runx1, rather than P1. Despite the occupancy of multiple haematopoietic TFs in HPC-7 cells, and the presence of similar TF binding motifs compared to the other enhancers (Fig. 3 and Supplementary Tables S1 and S2), we did not observe enhancer activity for −327, consistent with a previous study in mice15.
In summary, we have assigned in vivo function to multiple putative enhancers upstream of Runx1 that were previously identified in silico 18. These regions not only drive haematopoietic expression, but also physically connect with the Runx1-P1 promoter, lending confidence to the concept that they are bona fide regulators of Runx1 transcription.
Discussion
4C analysis in HPC-7 cells generated a high-resolution connectivity map of the genomic region harbouring Runx1, and confirmed previously identified upstream connections from P1 to putative regulatory elements18. Runx1 appears to be contained within a ~1 Mb chromatin domain, consistent with Hi-C analyses in other cell types25,26. We observed multiple connections both up- and downstream from all 4C-seq baits (Runx1-P1, Runx1-P2 and +24 enhancer). Significantly, there were many more upstream connections anchored by the active elements (Runx1-P1 and +24).
Surprisingly, the +24 enhancer forms many up- and downstream connections outside of the Runx1 TAD. These comprise up to one-third of all connections formed and include other haematopoietic genes, such as Tiam1 downstream, and Erg upstream. Erg and Tiam1 dysregulation is associated with several tumour types, including AML and B- and T-cell lymphomas27,28. Strikingly, these genes are located over 2 Mb away from Runx1. These findings raise the interesting possibility that the +24 enhancer acts as a scaffold to recruit multiple promoters, enhancers and TFs over long distances in cis. Connections to +24 identified by our 4C-seq may point to the identity of some of these genes.
Our 4C-seq data identified multiple chromatin connections that intersect with some of the strongest indicators of upstream enhancers characterised by the binding of clusters of TFs and epigenetic modifications18,20. Eight out of nine of these DNA regions are able to drive haematopoietic expression in zebrafish, strongly suggesting an in vivo function. Although previous studies had identified transcriptional activity for a subset of these enhancers15, or an interaction with the P1 promoter for an overlapping subset18, no single prior study has shown that enhancers both interact with Runx1 promoters (and +24), and have transcriptional activity. Table 1 provides an overview of findings from these previous studies together with additional evidence from our study that connects spatial proximity with function. Two enhancers, termed −59 and +110, were previously shown to drive LacZ expression in mice15; these are equivalent to the −58 and +110 enhancers identified here. Our data show in addition that these regions are in contact with P1 and the +24 enhancer (Table 1). While the −327 enhancer has similar characteristics to those of the other enhancers identified here, neither a previous study in mice15 nor our study identified any enhancer activity for it (Table 1). The −368, −354, −327, −321 and −303 enhancers were found to interact with P1 in a Capture Hi-C study in HPC-7 cells (Table 1). Our study confirms the presence of these interactions, and in addition, shows that these enhancers also interact with either the +24 enhancer or P2 (Table 1).
The −303 enhancer interacts with P2 rather than P1 and, in addition to being active in haematopoietic sites, it drives expression in keratinocytes, indicating that it could act in a complex that keeps P2 silent; and/or that it regulates expression of Runx1 in other tissues. Interestingly, Runx1 is actively expressed in mouse keratinocytes where it is important for hair follicle development29. Neuronal cells also express Runx1 30, and there may be additional regulatory elements that control Runx1 expression in a manner distinct from haematopoietic expression. In support of this idea, we previously determined that cohesin and CTCF influence runx1 expression in haematopoietic, but not neuronal cells, in zebrafish31,32.
Cohesin and CTCF organise chromatin structure and, when present in combination, appear to negatively correlate with the HSPC TFs in HPC-7 cells18. However, we observed a coincidence of cohesin subunit Rad21 binding in the absence of CTCF with four out of nine identified enhancers, as well as the +24 enhancer. This suggests that CTCF-independent cohesin mediates a subset of enhancer-promoter looping in combination with TFs. This interpretation is consistent with previously identified CTCF-independent functions for cohesin in genome organization and transcription33. Importantly, cohesin mutations are prevalent in AML and other myeloid malignancies34, and are categorised together with RUNX1 and spliceosome mutations in a genetic category that confers poor prognosis in AML5. Cohesin mutations led to increased chromatin accessibility of Runx1 as measured by ATAC-seq35, raising the possibility that spatiotemporal regulation of Runx1 is cohesin-dependent in mouse and human, as was previously observed in zebrafish31.
4C-seq in HPC-7 cells has provided new high resolution connectivity data that sheds light on the genomic organization of Runx1, an important haematopoietic transcription factor. These data confirm and extend previous analyses (Table 1), and furthermore, provide insight into the function of enhancers that have potential to regulate Runx1 expression. The data presented here set the scene for functional analyses to precisely determine how Runx1 is regulated, including CRISPR/Cas9-mediated interference with enhancer activity. They also provide a rationale for screening patients with myeloproliferative disorders for mutations in enhancer regions.
Methods
Cell culture
HPC-7 cells were maintained at a density of 1–10 × 10⁵ cells/mL in Iscove’s modified Dulbecco’s media (Gibco®) supplemented with 3.024 g/L sodium bicarbonate, 10% fetal bovine serum (FBS; Moregate, New Zealand), 10% stem cell factor (SCF)-conditioned media and 0.15 mM monothioglycerol (Sigma-Aldrich) as previously described36. SCF-conditioned media was obtained from culturing BHK-MKL cells maintained in Dulbecco’s modified Eagle media (Sigma-Aldrich) supplemented with 3.5 g/L glucose, 3.7 g/L sodium bicarbonate and 10% FBS.
4C-seq library preparation
4C library preparation was performed as previously described37 with modifications. Three libraries were generated, two replicates (passage 8 and passage 10 cells) and one control (a 1:1 mix of both replicates for which the first ligation step was omitted). Cells were cross-linked in 2% formaldehyde, 5% FBS and 1x PBS for 10 minutes at room temperature while rotating. Formaldehyde was quenched with a final concentration of 125 mM glycine for 5 minutes on ice while inverting several times. Cell pellets were washed twice with ice-cold 1x PBS.
Nuclei were harvested by lysing the cell pellets in ice-cold lysis buffer (10 mM Tris pH 8.0, 10 mM NaCl, 0.2% NP-40, protease inhibitors) for 10 minutes on ice. Nuclei were then resuspended in 1.2x DpnII restriction buffer (New England Biolabs) and 0.3% SDS and incubated for 1 hour at 37 °C while shaking. Triton X-100 was then added to a final concentration of 1.8% and the reaction was left at 37 °C while shaking for another hour. Chromatin was digested with 800 U of DpnII overnight at 37 °C while shaking. DpnII was inactivated by adding SDS to a final concentration of 1.3% and incubating at 65 °C for 20 minutes. Nuclei were diluted into a volume of 7 mL containing 1.01x T4 DNA ligase buffer (Life Technologies) and Triton-X100 at a final concentration of 1% and incubated at 37 °C for 1 hour. Ligations were carried out with 100 U of T4 ligase (Life Technologies) for 4.5 hours at 16 °C and 30 minutes at room temperature while shaking. For control libraries, ligase was omitted. Samples were proteinase K treated and reverse-crosslinked overnight at 65 °C. Samples were then treated with RNase A at 37 °C for 30 minutes. DNA was purified by phenol/chloroform extraction and ethanol precipitated.
A second digestion was performed with 25 U of BfaI (New England Biolabs) for P1 and P2 baits or MseI (New England Biolabs) for +24 baits in restriction buffer overnight at 37 °C while shaking. Restriction was inactivated by adding SDS to a final concentration of 1.3% and incubating at 65 °C for 20 minutes. A second ligation was performed in the same way as the first ligation, except that ligations were incubated overnight. DNA was purified with two phenol/chloroform extractions and one chloroform extraction followed by ethanol precipitation. DNA concentrations were measured using a Qubit® 3.0 Fluorometer (Life Technologies) and Qubit® double-stranded DNA (dsDNA) High Sensitivity Assay kit (Life Technologies).
For each bait, a total of 1 μg of DNA was amplified by PCR using Q5® High-Fidelity DNA Polymerase (New England Biolabs). Bait primer sequences are listed in Supplementary Table S3. PCR products were purified using the QIAquick PCR Purification kit (QIAGEN). DNA concentrations were measured using a Qubit® 3.0 Fluorometer and average fragment size by the 2100 Bioanalyser (Agilent Technologies) using a High Sensitivity DNA Kit (Agilent Technologies). Amplicons from the six different baits were mixed equally based on the concentration, average fragment size and ratio of demultiplexed 4C baits obtained from an initial MiSeq run. Libraries were prepared with Prep2Seq™ DNA Library Prep Kit from Illumina™ (Affymetrix) and TruSeq® adaptors (Illumina). Libraries were mixed equimolarly and sequenced as 125 bp paired-end reads on two Illumina™ HiSeq 2500 lanes by New Zealand Genomics Limited.
4C-seq data analysis
4C-seq data was analysed on the command line and in the R statistical environment38, and visualised using the University of California, Santa Cruz (UCSC) genome browser (http://genome.ucsc.edu/) with mouse assemblies mm9 or mm10 39,40, or with the R package ggplot2 41. Baits were demultiplexed based on bait primer sequences up to and including the digestion site using a custom awk script, allowing 0 mismatches. Only read pairs that had the forward and reverse bait sequences in the correct orientation were selected. Adapter sequences, bait sequences up to but excluding the digestion site, and bases with a Phred quality score under 20 were then trimmed from the reads, using the fastq-multx, fastq-mcf and cleanadaptors v1.24 tools42,43. Quality of reads was assessed using FastQC (http://www.bioinformatics.babraham.ac.uk/projects/fastqc/)44.
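The demultiplexing step can be illustrated with a short sketch. The study used a custom awk script that is not reproduced here; the Python function below is a hypothetical stand-in that shows the same idea, namely exact (0-mismatch) matching of both the forward and reverse bait sequences at the start of each read pair. The bait names and primer sequences are invented for illustration and are not those listed in Supplementary Table S3.

```python
def demultiplex(read_pairs, baits):
    """Assign read pairs to baits by exact (0-mismatch) prefix matching.

    read_pairs: iterable of (forward_seq, reverse_seq) tuples.
    baits: dict mapping bait name -> (forward_primer, reverse_primer),
           each primer running up to and including its restriction site.
    A pair is kept only if BOTH reads start with the primers of one bait.
    """
    assigned = {name: [] for name in baits}
    for fwd, rev in read_pairs:
        for name, (fwd_primer, rev_primer) in baits.items():
            if fwd.startswith(fwd_primer) and rev.startswith(rev_primer):
                assigned[name].append((fwd, rev))
                break  # a pair belongs to at most one bait
    return assigned


# Hypothetical primers: forward ends in GATC (DpnII site),
# reverse ends in CTAG (BfaI site). Sequences are made up.
baits = {
    "P1": ("ACGTGATC", "TTGGCTAG"),
    "P2": ("CCATGATC", "GGAACTAG"),
}
pairs = [
    ("ACGTGATCAAAA", "TTGGCTAGCCCC"),  # both reads match P1
    ("ACGTGATCAAAA", "GGAACTAGCCCC"),  # forward P1, reverse P2: discarded
    ("CCATGATCTTTT", "GGAACTAGAAAA"),  # both reads match P2
    ("ACGAGATCAAAA", "TTGGCTAGCCCC"),  # one mismatch in forward: discarded
]
result = demultiplex(pairs, baits)
print({name: len(reads) for name, reads in result.items()})
```

Requiring both mates to match discards chimeric or mis-primed pairs, at the cost of losing reads that carry a sequencing error in the primer region.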
Reads with a minimum length of 30 bp were mapped to the mm10 reference genome using Bowtie1 45, allowing 0 mismatches. Mapped reads were assigned to DpnII digestion fragments using fourSig46. The following reads were removed from the files: 1) self-ligated reads, 2) uncut reads (fragments adjacent to baits), and 3) reads at fragments that have at least 1 read in the control (non-ligated) library. The running mean was calculated from the sum of read counts from nine successive fragments, which was obtained using fourSig46, and was normalised to reads per million.
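As a rough illustration of the control filter and the smoothing step, the sketch below works on a plain list of per-fragment read counts. It is not the fourSig implementation: the function names are hypothetical, removal of control-positive fragments is modelled by setting their counts to zero, and the running mean is taken as the nine-fragment window sum divided by nine, scaled to reads per million of the filtered library. Each of these choices is an assumption.

```python
def filter_against_control(counts, control_counts):
    """Zero out fragments with at least one read in the non-ligated control."""
    return [0 if ctrl >= 1 else c for c, ctrl in zip(counts, control_counts)]


def running_mean_rpm(counts, window=9):
    """Running mean over `window` successive fragments, in reads per million.

    Returns one value per complete window (len(counts) - window + 1 values).
    """
    total = sum(counts)
    if total == 0 or len(counts) < window:
        return []
    scale = 1_000_000 / total          # reads-per-million factor
    window_sum = sum(counts[:window])  # sum over the first window
    profile = [window_sum / window * scale]
    for i in range(window, len(counts)):
        # Slide the window by one fragment: add the new, drop the oldest.
        window_sum += counts[i] - counts[i - window]
        profile.append(window_sum / window * scale)
    return profile


raw = [5, 3, 7]
control = [0, 2, 0]
print(filter_against_control(raw, control))

flat = [2] * 10  # toy library: 10 fragments, 2 reads each
print(running_mean_rpm(flat))
```

Updating the window sum incrementally keeps the cost linear in the number of fragments, which matters little for a single locus but helps when profiling a whole chromosome.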
Significant interaction calling was performed using the R package fourSig with the following settings: window size of 3, 1000 iterations, fdr of 0.01, fdr.prob of 0.05 (which selects the top fifth percentile of interactions with an FDR of <0.01), and inclusion of mappable fragments only46. Significant interactions were called for two regions: 1) the whole of chr16, and 2) from chr16:92,250,000-93,635,000 (within the domain). Significant interactions in both replicates were overlapped using the bedIntersect tool from UCSC47.
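The replicate overlap can be pictured as an interval intersection. The sketch below is a generic two-pointer intersection written for illustration; it is not the UCSC tool used in the study, and the exact output of that tool may differ (for example, it may report whole overlapping records rather than shared segments). Intervals are assumed to be half-open (start, end) pairs on one chromosome, sorted and non-overlapping within each replicate; the coordinates are invented.

```python
def intersect_intervals(rep1, rep2):
    """Return segments shared by two sorted lists of half-open intervals."""
    shared = []
    i = j = 0
    while i < len(rep1) and j < len(rep2):
        start = max(rep1[i][0], rep2[j][0])
        end = min(rep1[i][1], rep2[j][1])
        if start < end:  # non-empty overlap
            shared.append((start, end))
        # Advance whichever interval ends first.
        if rep1[i][1] < rep2[j][1]:
            i += 1
        else:
            j += 1
    return shared


# Toy significant-interaction calls for two replicates.
replicate1 = [(100, 200), (300, 400), (500, 600)]
replicate2 = [(150, 250), (390, 520)]
print(intersect_intervals(replicate1, replicate2))
```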
In the fourSig package, significant interactions can be categorised into three categories: 1) interactions that are significant after the reads from the fragment with the highest read count are removed, 2) interactions that are significant when the fragment with the highest read count is averaged to the read counts of the neighbouring fragments, and 3) interactions that are significant only when all fragment read counts are included46. For this study, only category 1 and 2 interactions that overlap between both replicates were included, as they are more likely to represent true interactions (because they span multiple fragments), and were previously shown to be more reproducible between replicates than single-fragment interactions46. Furthermore, we distinguished category 1 interactions that overlap between both replicates from other category 1 and 2 interactions by colouring them red and orange, respectively, to visualise the most significant interactions. For conversion from assembly mm10 to mm9, the liftOver tool from UCSC was used (http://genome.ucsc.edu/)47. Gene annotations used in Figures are UCSC reference genes.
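One way to read the inclusion and colouring rule is sketched below. The function name and the encoding of categories (1, 2, 3, or None when an interaction is not significant in a replicate) are assumptions made for illustration; the logic reflects one interpretation of the text, in which an interaction is red when it is category 1 in both replicates, orange when it is category 1 or 2 in both replicates but not category 1 in both, and excluded otherwise.

```python
def classify_interaction(cat_rep1, cat_rep2):
    """Return 'red', 'orange' or None for an overlapping interaction.

    cat_rep1, cat_rep2: category (1, 2 or 3) assigned in each replicate,
    or None if the interaction was not significant in that replicate.
    """
    included = {1, 2}
    if cat_rep1 not in included or cat_rep2 not in included:
        return None      # category 3, or missing in one replicate: excluded
    if cat_rep1 == 1 and cat_rep2 == 1:
        return "red"     # most stringent: category 1 in both replicates
    return "orange"      # remaining category 1/2 combinations


for pair in [(1, 1), (1, 2), (2, 2), (3, 1), (None, 1)]:
    print(pair, classify_interaction(*pair))
```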
Zebrafish enhancer assay
Runx1 regulatory regions were amplified from HPC-7 gDNA or from I-SceI-zhsp70 plasmid containing the −368, +24 and +110 sequences11 (primer sequences are in Supplementary Table S3), and cloned into the zebrafish enhancer detection vector48. Plasmid purifications were performed with the NucleoSpin® Plasmid or NucleoBond® Xtra Midi prep kits (Macherey-Nagel). Primers amplified the TF binding peak +/− 200 bp, except for −321 which (due to a repetitive region) did not have a 200-bp extension on the 3′ end. The −368, +24 and +110 fragments are 471, 529 and 579 bp, respectively11. A mixture of 30 pg vector DNA and 120 pg Tol2 transposase mRNA49 was injected into 1-cell zebrafish embryos. Embryos were imaged at 20–24 or ~48 hpf using a Leica M205FA stereomicroscope with a DFC490 camera and LAS software (Leica Microsystems), and images were processed using Adobe Photoshop. Zebrafish were maintained as described previously50 and zebrafish handling and procedures were carried out in accordance with the Otago Zebrafish Facility Standard Operating Procedures. The University of Otago Animal Ethics Committee approved all zebrafish research under approval AEC 48/11.
Identification of conserved non-coding elements
Mouse conserved non-coding elements (mCNEs) were identified as described previously11.
ChIP-seq, Capture Hi-C and DNase I hypersensitivity data
Occupancy of the transcription factors Erg, Fli1, Scl, Runx1, Gata2, E2A, Ldb1, Lyl1, Lmo2, Gfi1b, Meis1, Myb, phospho-Stat1, Pu.1, Stat3, Eto2, Cebp-α, Cebp-β, Elf1, Nfe2, p53, cMyc, Egr1, E2f4, cFos, Mac and Jun; Rad21 and CTCF; H3K27ac and H3K4me3; DNase I hypersensitivity sites; and Capture Hi-C data in HPC-7 cells was obtained from previously published data18,19,20. Rad21, Smc3 and CTCF chromatin immunoprecipitation sequencing (ChIP-seq) data in MEL and CH12 cells were obtained from ENCODE51.
Availability of data
The 4C-seq dataset is accessible through GEO Series accession number GSE86994 (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE86994).
References
Okuda, T., van Deursen, J., Hiebert, S. W., Grosveld, G. & Downing, J. R. AML1, the target of multiple chromosomal translocations in human leukemia, is essential for normal fetal liver hematopoiesis. Cell 84, 321–30 (1996).
Wang, Q. et al. Disruption of the Cbfa2 gene causes necrosis and hemorrhaging in the central nervous system and blocks definitive hematopoiesis. Proc Natl Acad Sci USA 93, 3444–9 (1996).
Swiers, G., de Bruijn, M. & Speck, N. A. Hematopoietic stem cell emergence in the conceptus and the role of Runx1. The International journal of developmental biology 54, 1151–63 (2010).
Growney, J. D. et al. Loss of Runx1 perturbs adult hematopoiesis and is associated with a myeloproliferative phenotype. Blood 106, 494–504 (2005).
Papaemmanuil, E. et al. Genomic Classification and Prognosis in Acute Myeloid Leukemia. N Engl J Med 374, 2209–21 (2016).
Levanon, D. & Groner, Y. Structure and regulated expression of mammalian RUNX genes. Oncogene 23, 4211–9 (2004).
Bee, T. et al. Nonredundant roles for Runx1 alternative promoters reflect their activity at discrete stages of developmental hematopoiesis. Blood 115, 3042–50 (2010).
Pozner, A. et al. Developmentally regulated promoter-switch transcriptionally controls Runx1 function during embryonic hematopoiesis. BMC developmental biology 7, 84 (2007).
Bee, T. et al. Alternative Runx1 promoter usage in mouse developmental hematopoiesis. Blood cells, molecules & diseases 43, 35–42 (2009).
Marsman, J. & Horsfield, J. A. Long distance relationships: enhancer-promoter communication and dynamic gene transcription. Biochimica et biophysica acta 1819, 1217–27 (2012).
Ng, C. E. et al. A Runx1 intronic enhancer marks hemogenic endothelial cells and hematopoietic stem cells. Stem Cells 28, 1869–81 (2010).
Nottingham, W. T. et al. Runx1-mediated hematopoietic stem-cell emergence is controlled by a Gata/Ets/SCL-regulated enhancer. Blood 110, 4188–97 (2007).
Bee, T. et al. The mouse Runx1 +23 hematopoietic stem cell enhancer confers hematopoietic specificity to both Runx1 promoters. Blood 113, 5121–4 (2009).
Markova, E. N., Kantidze, O. L. & Razin, S. V. Transcriptional regulation and spatial organisation of the human AML1/RUNX1 gene. J Cell Biochem 112, 1997–2005 (2011).
Schütte, J. et al. An experimentally validated network of nine haematopoietic transcription factors reveals mechanisms of cell state stability. eLife 5, e11469 (2016).
Gunnell, A. et al. RUNX super-enhancer control through the Notch pathway by Epstein-Barr virus transcription factors regulates B cell growth. Nucleic Acids Res 44, 4636–50 (2016).
Sati, S. & Cavalli, G. Chromosome conformation capture technologies and their impact in understanding genome function. Chromosoma (2016).
Wilson, N. K. et al. Integrated genome-scale analysis of the transcriptional regulatory landscape in a blood stem/progenitor cell model. Blood 127, e12–e23 (2016).
Calero-Nieto, F. J. et al. Key regulators control distinct transcriptional programmes in blood progenitor and mast cells. EMBO J 33, 1212–26 (2014).
Wilson, N. K. et al. Combinatorial transcriptional control in blood stem/progenitor cells: genome-wide analysis of ten major transcriptional regulators. Cell Stem Cell 7, 532–44 (2010).
Consortium, G. T. The Genotype-Tissue Expression (GTEx) project. Nat Genet 45, 580–5 (2013).
Merkenschlager, M. & Nora, E. P. CTCF and Cohesin in Genome Folding and Transcriptional Gene Regulation. Annu Rev Genomics Hum Genet 17, 17–43 (2016).
Dorsett, D. & Merkenschlager, M. Cohesin at active genes: a unifying theme for cohesin and gene expression from model organisms to humans. Curr Opin Cell Biol 25, 327–33 (2013).
de Jong, J. L. & Zon, L. I. Use of the zebrafish system to study primitive and definitive hematopoiesis. Annu Rev Genet 39, 481–501 (2005).
Rao, S. S. et al. A 3D map of the human genome at kilobase resolution reveals principles of chromatin looping. Cell 159, 1665–80 (2014).
Dixon, J. R. et al. Topological domains in mammalian genomes identified by analysis of chromatin interactions. Nature 485, 376–80 (2012).
Ives, J. H. et al. Increased levels of a chromosome 21-encoded tumour invasion and metastasis factor (TIAM1) mRNA in bone marrow of Down syndrome children during the acute phase of AML(M7). Genes Chromosomes Cancer 23, 61–6 (1998).
Shimizu, K. et al. An ets-related gene, ERG, is rearranged in human myeloid leukemia with t(16;21) chromosomal translocation. Proc Natl Acad Sci USA 90, 10280–4 (1993).
Ortt, K., Raveh, E., Gat, U. & Sinha, S. A chromatin immunoprecipitation screen in mouse keratinocytes reveals Runx1 as a direct transcriptional target of DeltaNp63. J Cell Biochem 104, 1204–19 (2008).
Inoue, K., Shiga, T. & Ito, Y. Runx transcription factors in neuronal development. Neural Dev 3, 20 (2008).
Horsfield, J. A. et al. Cohesin-dependent regulation of Runx genes. Development 134, 2639–49 (2007).
Marsman, J. et al. Cohesin and CTCF differentially regulate spatiotemporal runx1 expression during zebrafish development. Biochim Biophys Acta 1839, 50–61 (2014).
Zuin, J. et al. Cohesin and CTCF differentially affect chromatin architecture and gene expression in human cells. Proc Natl Acad Sci USA 111, 996–1001 (2014).
Leeke, B., Marsman, J., O’Sullivan, J. M. & Horsfield, J. A. Cohesin mutations in myeloid malignancies: underlying mechanisms. Exp Hematol Oncol 3, 13 (2014).
Mazumdar, C. et al. Leukemia-Associated Cohesin Mutants Dominantly Enforce Stem Cell Programs and Impair Human Hematopoietic Progenitor Differentiation. Cell Stem Cell 17, 675–688 (2015).
Pinto do Ó, P., Kolterud, A. & Carlsson, L. Expression of the LIM-homeobox gene LH2 generates immortalized steel factor-dependent multipotent hematopoietic precursors. EMBO J 17, 5744–56 (1998).
van de Werken, H. J. et al. 4C technology: protocols and data analysis. Methods Enzymol 513, 89–112 (2012).
R Core Team. R: A Language and Environment for Statistical Computing (R Foundation for Statistical Computing, Vienna, Austria, 2016).
Kent, W. J. et al. The human genome browser at UCSC. Genome Res 12, 996–1006 (2002).
Mouse Genome Sequencing Consortium. Initial sequencing and comparative analysis of the mouse genome. Nature 420, 520–62 (2002).
Wickham, H. ggplot2: Elegant Graphics for Data Analysis (Springer-Verlag New York, 2009).
Aronesty, E. Ea-utils: Command-line tools for processing biological sequencing data. (ExpressionAnalysis, Durham, NC, 2011).
Chatterjee, A., Stockwell, P. A., Rodger, E. J. & Morison, I. M. Comparison of alignment software for genome-wide bisulphite sequence data. Nucleic Acids Res 40, e79 (2012).
Andrews, S. FastQC A Quality Control tool for High Throughput Sequence Data.
Langmead, B., Trapnell, C., Pop, M. & Salzberg, S. L. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol 10, R25 (2009).
Williams, R. L. Jr et al. fourSig: a method for determining chromosomal interactions in 4C-Seq data. Nucleic Acids Res 42, e68 (2014).
Kent, W. J. et al. The human genome browser at UCSC. Genome Res 12, 996–1006 (2002).
Bessa, J. et al. Zebrafish enhancer detection (ZED) vector: a new tool to facilitate transgenesis and the functional analysis of cis-regulatory regions in zebrafish. Dev Dyn 238, 2409–17 (2009).
Kawakami, K. Transposon tools and methods in zebrafish. Dev Dyn 234, 244–54 (2005).
Westerfield, M. The Zebrafish Book. A guide for the laboratory use of zebrafish (Brachydanio rerio). (University of Oregon Press, Eugene, Oregon, 1995).
Rosenbloom, K. R. et al. ENCODE data in the UCSC Genome Browser: year 5 update. Nucleic Acids Res 41, D56–63 (2013).
Acknowledgements
HPC-7 and BHK/MKL cells were kindly provided by Kathy Knezevic and John Pimanda. TruSeq® adaptors were kindly provided by Ian Morison. The authors would like to thank Anita Dunbier and Sofie Van Huffel for help with development of 4C protocols, and Noel Jhinku for expert management of the Otago Zebrafish Facility. This research was funded by the Royal Society of NZ Marsden Fund [grant number 11-UOO-027 to JAH] and the Health Research Council of NZ [grant number 15/229 to JAH].
Author information
Contributions
J.M., A.T., M.O., J.M.O. and J.A.H. designed experiments; J.M. and A.T. performed experiments; J.M., A.T., M.O., J.M.O. and J.A.H. analysed data; A.T. produced the graphic for Fig. S4; J.M. and J.A.H. wrote the paper with input from the other authors.
Ethics declarations
Competing Interests
The authors declare that they have no competing interests.
Additional information
Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Marsman, J., Thomas, A., Osato, M. et al. A DNA Contact Map for the Mouse Runx1 Gene Identifies Novel Haematopoietic Enhancers. Sci Rep 7, 13347 (2017). https://doi.org/10.1038/s41598-017-13748-8
DOI: https://doi.org/10.1038/s41598-017-13748-8