The HoxA and HoxD gene clusters of jawed vertebrates are organized into bipartite three-dimensional chromatin structures that separate long-range regulatory inputs coming from the anterior and posterior Hox-neighboring regions (ref. 1). This architecture is instrumental in allowing vertebrate Hox genes to pattern disparate parts of the body, including limbs (ref. 2). Almost nothing is known about how these three-dimensional topologies originated. Here we perform extensive 4C-seq profiling of the Hox cluster in embryos of amphioxus, an invertebrate chordate. We find that, in contrast to the architecture in vertebrates, the amphioxus Hox cluster is organized into a single chromatin interaction domain that includes long-range contacts mostly from the anterior side, bringing distant cis-regulatory elements into contact with Hox genes. We infer that the vertebrate Hox bipartite regulatory system is an evolutionary novelty generated by combining ancient long-range regulatory contacts from DNA in the anterior Hox neighborhood with new regulatory inputs from the posterior side.
Lonfat, N. & Duboule, D. Structure, function and evolution of topologically associating domains (TADs) at HOX loci. FEBS Lett. 589, 2869–2876 (2015).
Andrey, G. et al. A switch between topological domains underlies HoxD genes collinearity in mouse limbs. Science 340, 1234167 (2013).
de Laat, W. & Duboule, D. Topology of mammalian developmental enhancers and their regulatory landscapes. Nature 502, 499–506 (2013).
Gómez-Díaz, E. & Corces, V.G. Architectural proteins: regulators of 3D genome organization in cell fate. Trends Cell Biol. 24, 703–711 (2014).
Ciabrelli, F. & Cavalli, G. Chromatin-driven behavior of topologically associating domains. J. Mol. Biol. 427, 608–625 (2015).
Mallo, M. & Alonso, C.R. The regulation of Hox gene expression during animal development. Development 140, 3951–3963 (2013).
Montavon, T. et al. A regulatory archipelago controls Hox genes transcription in digits. Cell 147, 1132–1145 (2011).
Woltering, J.M., Noordermeer, D., Leleu, M. & Duboule, D. Conservation and divergence of regulatory strategies at Hox loci and the origin of tetrapod digits. PLoS Biol. 12, e1001773 (2014).
Berlivet, S. et al. Clustering of tissue-specific sub-TADs accompanies the regulation of HoxA genes in developing limbs. PLoS Genet. 9, e1004018 (2013).
Lonfat, N., Montavon, T., Darbellay, F., Gitto, S. & Duboule, D. Convergent evolution of complex regulatory landscapes and pleiotropy at Hox loci. Science 346, 1004–1006 (2014).
Dehal, P. & Boore, J.L. Two rounds of whole genome duplication in the ancestral vertebrate. PLoS Biol. 3, e314 (2005).
Lehoczky, J.A., Williams, M.E. & Innis, J.W. Conserved expression domains for genes upstream and within the HoxA and HoxD clusters suggests a long-range enhancer existed before cluster duplication. Evol. Dev. 6, 423–430 (2004).
Maeso, I. et al. An ancient genomic regulatory block conserved across bilaterians and its dismantling in tetrapods by retrogene replacement. Genome Res. 22, 642–655 (2012).
Paps, J., Holland, P.W. & Shimeld, S.M. A genome-wide view of transcription factor gene diversity in chordate evolution: less gene loss in amphioxus? Brief. Funct. Genomics 11, 177–186 (2012).
Bertrand, S. & Escriva, H. Evolutionary crossroads in developmental biology: amphioxus. Development 138, 4819–4830 (2011).
Holland, L.Z. & Onai, T. Early development of cephalochordates (amphioxus). Wiley Interdiscip. Rev. Dev. Biol. 1, 167–183 (2012).
Noordermeer, D. et al. Temporal dynamics and developmental memory of 3D chromatin architecture at Hox gene loci. eLife 3, e02557 (2014).
Noordermeer, D. et al. The dynamic architecture of Hox gene clusters. Science 334, 222–225 (2011).
Pascual-Anaya, J. et al. Broken colinearity of the amphioxus Hox cluster. Evodevo 3, 28 (2012).
Dixon, J.R. et al. Topological domains in mammalian genomes identified by analysis of chromatin interactions. Nature 485, 376–380 (2012).
Ghavi-Helm, Y. et al. Enhancer loops appear stable during development and are associated with paused polymerase. Nature 512, 96–100 (2014).
Gonzalez, F., Duboule, D. & Spitz, F. Transgenic analysis of Hoxd gene regulation during digit development. Dev. Biol. 306, 847–859 (2007).
Gehrke, A.R. et al. Deep conservation of wrist and digit enhancers in fish. Proc. Natl. Acad. Sci. USA 112, 803–808 (2015).
Buenrostro, J.D., Giresi, P.G., Zaba, L.C., Chang, H.Y. & Greenleaf, W.J. Transposition of native chromatin for fast and sensitive epigenomic profiling of open chromatin, DNA-binding proteins and nucleosome position. Nat. Methods 10, 1213–1218 (2013).
Ferrier, D.E., Minguillón, C., Cebrián, C. & Garcia-Fernàndez, J. Amphioxus Evx genes: implications for the evolution of the midbrain-hindbrain boundary and the chordate tailbud. Dev. Biol. 237, 270–281 (2001).
Liu, Y., Schröder, J. & Schmidt, B. Musket: a multistage k-mer spectrum–based error corrector for Illumina sequence data. Bioinformatics 29, 308–315 (2013).
Magoč, T. & Salzberg, S.L. FLASH: fast length adjustment of short reads to improve genome assemblies. Bioinformatics 27, 2957–2963 (2011).
Luo, R. et al. SOAPdenovo2: an empirically improved memory-efficient short-read de novo assembler. Gigascience 1, 18 (2012).
Huang, S. et al. HaploMerger: reconstructing allelic relationships for polymorphic diploid genome assemblies. Genome Res. 22, 1581–1588 (2012).
Haas, B.J. et al. Automated eukaryotic gene structure annotation using EVidenceModeler and the Program to Assemble Spliced Alignments. Genome Biol. 9, R7 (2008).
Stanke, M. et al. AUGUSTUS: ab initio prediction of alternative transcripts. Nucleic Acids Res. 34, W435–W439 (2006).
Parra, G., Bradnam, K. & Korf, I. CEGMA: a pipeline to accurately annotate core genes in eukaryotic genomes. Bioinformatics 23, 1061–1067 (2007).
Slater, G.S. & Birney, E. Automated generation of heuristics for biological sequence comparison. BMC Bioinformatics 6, 31 (2005).
Baughman, K.W. et al. Genomic organization of Hox and ParaHox clusters in the echinoderm, Acanthaster planci. Genesis 52, 952–958 (2014).
Frazer, K.A., Pachter, L., Poliakov, A., Rubin, E.M. & Dubchak, I. VISTA: computational tools for comparative genomics. Nucleic Acids Res. 32, W273–W279 (2004).
Fuentes, M. et al. Insights into spawning behavior and development of the European amphioxus (Branchiostoma lanceolatum). J. Exp. Zool. B Mol. Dev. Evol. 308, 484–493 (2007).
Fuentes, M. et al. Preliminary observations on the spawning conditions of the European amphioxus (Branchiostoma lanceolatum) in captivity. J. Exp. Zool. B Mol. Dev. Evol. 302, 384–391 (2004).
Somorjai, I., Bertrand, S., Camasses, A., Haguenauer, A. & Escriva, H. Evidence for stasis and not genetic piracy in developmental expression patterns of Branchiostoma lanceolatum and Branchiostoma floridae, two amphioxus species that have evolved independently over the course of 200 Myr. Dev. Genes Evol. 218, 703–713 (2008).
Dekker, J., Rippe, K., Dekker, M. & Kleckner, N. Capturing chromosome conformation. Science 295, 1306–1311 (2002).
Hagège, H. et al. Quantitative analysis of chromosome conformation capture assays (3C-qPCR). Nat. Protoc. 2, 1722–1733 (2007).
Splinter, E., de Wit, E., van de Werken, H.J., Klous, P. & de Laat, W. Determining long-range chromatin interactions for selected genomic sites using 4C-seq technology: from fixation to computation. Methods 58, 221–230 (2012).
Rozen, S. & Skaletsky, H. Primer3 on the WWW for general users and for biologist programmers. Methods Mol. Biol. 132, 365–386 (2000).
Smemo, S. et al. Obesity-associated variants within FTO form long-range functional connections with IRX3. Nature 507, 371–375 (2014).
Baù, D. et al. The three-dimensional folding of the α-globin gene domain reveals formation of chromatin globules. Nat. Struct. Mol. Biol. 18, 107–114 (2011).
Russel, D. et al. Putting the pieces together: integrative modeling platform software for structure determination of macromolecular assemblies. PLoS Biol. 10, e1001244 (2012).
Saeed, A.I. et al. TM4: a free, open-source system for microarray data management and analysis. Biotechniques 34, 374–378 (2003).
Lupiáñez, D.G. et al. Disruptions of topological chromatin domains cause pathogenic rewiring of gene-enhancer interactions. Cell 161, 1012–1025 (2015).
van Weerd, J.H. et al. A large permissive regulatory domain exclusively controls Tbx3 expression in the cardiac conduction system. Circ. Res. 115, 432–441 (2014).
Vietri Rudan, M. et al. Comparative Hi-C reveals that CTCF underlies evolution of chromosomal domain architecture. Cell Rep. 10, 1297–1309 (2015).
Hou, C., Li, L., Qin, Z.S. & Corces, V.G. Gene density, transcription, and insulators contribute to the partition of the Drosophila genome into physical domains. Mol. Cell 48, 471–484 (2012).
Zhang, Y. et al. Spatial organization of the mouse genome and its role in recurrent chromosomal translocations. Cell 148, 908–921 (2012).
Bessa, J. et al. Zebrafish enhancer detection (ZED) vector: a new tool to facilitate transgenesis and the functional analysis of cis-regulatory regions in zebrafish. Dev. Dyn. 238, 2409–2417 (2009).
Kawakami, K. Transgenesis and gene trap methods in zebrafish by using the Tol2 transposable element. Methods Cell Biol. 77, 201–222 (2004).
We especially thank J. Pascual-Anaya for help with some figures and for helpful discussions. We would also like to thank F. Casares, I. Almudí and J.R. Martínez-Morales for fruitful discussions. Work was funded by grants from the Ministerio de Economía y Competitividad (BFU2013-41322-P to J.L.G.-S.; Juan de la Cierva postdoctoral contract to I.M.; BFU2014-58449-JIN to J.J.T.); the Andalusian government (BIO-396 to J.L.G.-S.; C2A (EE: 2013/2506) to D.P.D. and I.I.-A.); the European Research Council (ERC; grant 268513) to P.W.H.H. and F.M.; a European Molecular Biology Organization (EMBO) short fellowship to I.M.; the Universidad Pablo de Olavide to J.J.T.; and Conicyt 'Becas Chile' to D.A.
The authors declare no competing financial interests.
Integrated supplementary information
Supplementary Figure 1 Schematic phylogenetic tree showing the main chordate species used in the present study.
Red stars correspond to the two WGD events that occurred in the vertebrate ancestor (2R) and the extra WGD round that happened at the origin of teleost fish (3R). The asterisk indicates that in zebrafish, in contrast to other teleost species, HoxDb has been secondarily lost and only mir10 and the anterior and posterior Hox-neighboring genes remain in this genomic region.
(a–c) Genomic organization of the mouse HoxA and HoxD and elephant shark HoxD clusters. Symbols for Hox, anterior and posterior neighboring genes are colored in black, blue and red, respectively. Several mouse long-range enhancers and their orthologs in elephant shark are represented by green bars. Red arrows indicate the two paralogous CsB enhancers that demonstrate the loss of a Hibadh2 gene from HoxA-neighboring regions and the pseudogenized remnant of mouse Jazf2. (d) VISTA plot of the Jazf2 genomic region in elephant shark (reference sequence) and mouse, showing the mouse Jazf2 pseudoexon (red asterisk) and the ancestral intronic location of mouse regulatory islands I and II. VISTA colored peaks (blue, coding; turquoise, UTR; pink, noncoding; green, mouse regulatory islands) indicate regions of at least 70 bp and ≥65% similarity. The alignment below the plot corresponds to the region indicated by the asterisk, showing several mutations in the mouse Jazf2 sequence, including splice sites (in lower case) and frameshifts.
(a) Microsynteny arrangements around the Hox clusters of different bilaterian species and the ‘ghost’ Hox locus of the placozoan Trichoplax adhaerens. Note that, because of the lack of synteny conservation, we could not infer a consensus for different vertebrate species beyond the genes included in the vertebrate ancestor reconstruction. Thus, for genes beyond these limits, the information displayed in this figure corresponds mainly to the mouse genome. (b) Conserved linkage of vertebrate posterior neighboring genes in amphioxus and non-chordate species. (c) Conserved linkage of amphioxus posterior neighboring genes in non-chordate species. Genes are represented by arrows (white, Hox clusters; blue, chordate anterior neighboring genes; red, vertebrate posterior neighboring genes; orange, amphioxus posterior neighboring genes; green, non-chordate neighboring genes linked to Hox genes in at least two species; black, non-Hox ANTP-class homeobox genes). Question marks represent genes whose status in the vertebrate ancestor could not be inferred. Slashes indicate the presence of genes not represented in the figure. Black circles represent the end of the genomic scaffold. Small black rectangles indicate the presence of predicted gene model(s) with no clear orthologs in other species and that in most cases have multiple additional copies in their corresponding genomes. The black asterisk within the Hox cluster arrow of L. gigantea indicates the reversed orientation of the last Hox posterior gene in this species.
Normalized 4C-seq profiles of the promoters for several Hox and neighboring genes in the zebrafish HoxDa region (labeled as in Fig. 2). The 4C-seq profiles corresponding to neighboring genes are indicated with large brackets at the left margin of the figure. Spider plots are color-coded as in Fig. 2. Green lines indicate the positions of the viewpoints. Dotted lines indicate the genomic region containing the HoxDa cluster. Units on the y axes correspond to normalized interacting counts.
Normalized 4C-seq profiles of the promoters for several Hox and neighboring genes in the amphioxus Hox region (labeled as in Fig. 2). The 4C-seq profiles corresponding to neighboring genes are indicated with large brackets at the left margin of the figure. Spider plots are color-coded as in Fig. 2. Green lines indicate the positions of the viewpoints. Dotted lines indicate the genomic region containing the Hox cluster. Units on the y axes correspond to normalized interacting counts.
Supplementary Figure 6 Temporal dynamics of the 4C-seq interaction profiles of amphioxus Hox genes during development.
The amphioxus Hox genomic region showing one replicate for each of the 4C-seq profiles of the promoters of several Hox genes. The three different developmental stages are colored in red (8 h.p.f. gastrula), black (15 h.p.f. early neurula) and blue (36 h.p.f. larva). Dashed lines indicate the positions of the viewpoints.
(a,b) 4C data (a) were translated into distance restraints that were added to the rest of the restraints (b). (c) The genome was represented as concatenated beads of different size that represented 20 fragments. The size was proportional to the sum of the read counts. (d) Models were optimized, starting from randomized bead positions. (e) After 50,000 iterations, we selected the 200 models with the best score. These models were clustered on the basis of their RMSD. (f,g) The models from the most populated mirror image cluster were superimposed (f) and the virtual Hi-C heat map was generated by calculating the mean distance between all the beads from all the models (g).
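The last step of this workflow, building a virtual Hi-C map from the mean distance between all beads across all models, can be sketched as follows. This is only an illustrative sketch: the function name `virtual_hic`, the nested-list data layout and the toy coordinates are assumptions made for the example and are not taken from the modeling software used in the study.

```python
import math


def virtual_hic(models):
    """Return a bead-by-bead matrix of mean Euclidean distances.

    models: list of models, each a list of (x, y, z) bead coordinates,
    with beads given in the same order in every model (assumed layout).
    """
    n_beads = len(models[0])
    matrix = [[0.0] * n_beads for _ in range(n_beads)]
    for model in models:
        for i in range(n_beads):
            for j in range(i + 1, n_beads):
                d = math.dist(model[i], model[j])
                # Accumulate symmetrically; the diagonal stays at zero.
                matrix[i][j] += d
                matrix[j][i] += d
    n_models = len(models)
    return [[value / n_models for value in row] for row in matrix]


# Toy ensemble: two hypothetical models with three beads each.
toy_models = [
    [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (0.0, 0.0, 2.0)],
    [(0.0, 0.0, 0.0), (0.0, 0.0, 7.0), (0.0, 0.0, 4.0)],
]
heatmap = virtual_hic(toy_models)
```

In such a map, small values mark bead pairs that are close on average in the model ensemble, which is why the result can be read like a contact heat map.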
Supplementary Figure 8 Explanatory cartoon for Hi-C comparisons using the shifting alignment approach.
The alignment of the Hi-C matrices being compared is iteratively shifted by bins of 20 kb (reaching ±25% of the total size of the matrix) to obtain a collection of ‘mock’ coefficients corresponding to misaligned Hi-C maps (from R–n to Rn), together with the coefficient of the correctly aligned comparison (R0). In this situation, R0 is expected to be the highest coefficient.
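A minimal sketch of this shifting comparison is given below. It makes several assumptions that the legend does not specify: both maps are square matrices binned at the same resolution (so one shift step equals one bin), only the region where the shifted maps overlap is compared, and Pearson's coefficient is used for simplicity. The names `pearson` and `shifted_correlations` and the toy matrix are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def shifted_correlations(real, virtual, max_fraction=0.25):
    """Correlate two maps at every shift from -n to +n bins (R-n ... Rn)."""
    size = len(real)
    max_shift = int(size * max_fraction)
    coefficients = {}
    for shift in range(-max_shift, max_shift + 1):
        xs, ys = [], []
        for i in range(size):
            for j in range(i + 1, size):
                si, sj = i + shift, j + shift
                # Keep only cells that exist in both maps after shifting.
                if 0 <= si < size and 0 <= sj < size:
                    xs.append(real[i][j])
                    ys.append(virtual[si][sj])
        coefficients[shift] = pearson(xs, ys)
    return coefficients


def toy_map(size=8):
    """Hypothetical map: distance decay plus one enriched domain (bins 2-5)."""
    rows = []
    for i in range(size):
        row = []
        for j in range(size):
            value = 0.0 if i == j else 1.0 / abs(i - j)
            if i != j and 2 <= i <= 5 and 2 <= j <= 5:
                value += 5.0
            row.append(value)
        rows.append(row)
    return rows


coefficients = shifted_correlations(toy_map(), toy_map())
best_shift = max(coefficients, key=coefficients.get)
```

Here the two toy maps are identical, so the coefficient at shift 0 is the highest, which is the behavior expected of a correctly aligned comparison.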
Supplementary Figure 9 Spearman's and Pearson's correlation comparisons between the experimental and virtual Hi-C data of different loci (HoxD; Med13l-Tbx3-Tbx5-Rbm19; and Wnt6-Ihh-Epha4-Pax3).
Correlations were obtained by shifting the virtual Hi-C across the real Hi-C, as described in Supplementary Figure 8. Position 0 corresponds to correct alignment of the Hi-C matrices. The coefficients correspond to the values in the alignment at position 0.
(a,b) Zebrafish (a) and amphioxus (b) Spearman’s correlation coefficients between the virtual Hi-C maps obtained from the final 3D models and those resulting from the jackknife resampling experiments. For each number of viewpoints eliminated, five different combinations of viewpoint subsets were randomly generated and compared, except in those eliminating a single viewpoint, where all possible combinations were assayed. Black circles correspond to the correlation coefficients of each individual comparison (Supplementary Table 3), and red circles indicate the average for each resampling category.
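The resampling scheme described in this legend can be sketched as below: every single-viewpoint removal is enumerated, and five random combinations are drawn for larger removals. The function name `jackknife_subsets`, the fixed random seed and the viewpoint labels are assumptions for illustration; the modeling and correlation steps that would follow each subset are not shown.

```python
import itertools
import random


def jackknife_subsets(viewpoints, n_removed, n_random=5, seed=0):
    """Return viewpoint subsets left after removing n_removed viewpoints."""
    combos = list(itertools.combinations(viewpoints, n_removed))
    if n_removed == 1:
        # Single removals: assay all possible combinations.
        removed_sets = combos
    else:
        # Larger removals: draw a fixed number of random combinations.
        rng = random.Random(seed)
        removed_sets = rng.sample(combos, min(n_random, len(combos)))
    return [
        [vp for vp in viewpoints if vp not in removed]
        for removed in removed_sets
    ]


# Hypothetical viewpoint labels, not the ones used in the study.
viewpoints = ["vp1", "vp2", "vp3", "vp4", "vp5", "vp6"]
single_removals = jackknife_subsets(viewpoints, n_removed=1)
double_removals = jackknife_subsets(viewpoints, n_removed=2)
```

Each returned subset would then be used to rebuild the models and a virtual Hi-C map, which is compared with the map from the full set of viewpoints.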
(a,b) ATAC-seq profiles of the anterior (a) and posterior (b) regions showing accessible chromatin regions. The ATAC signal is depicted in black, except in cases having overlap with repetitive elements (gray) or transcriptional start sites (green). Elements tested in reporter assays are colored in blue (anterior) and red (posterior). Regions overlapping with the Evx loci ±5 kb are shaded in light red.
(a–f) Nfe2 expression pattern. No regionalized expression could be detected at the 8-cell, blastula and G3 stages. (a,b) In G5 (a) and G7 (b) embryos, Nfe2 is expressed in the mesendoderm of the blastoporal lip and in the presumptive neural plate region. (c) At the N1 stage, Nfe2 is expressed in the mesoderm and neural plate. (d) In N3 stage embryos, expression is detected in the endoderm of the pharynx and in the tailbud. (e) At the T stage, Nfe2 is expressed in the endoderm of the pharynx (forming club-shaped gland and preoral pit). (f) This expression is maintained in L1 stage larva. (g,h) Hnrnpa expression pattern. No regionalized expression could be detected at the 8-cell, blastula and G3 stages. (g) At the G5 stage, Hnrnpa is expressed in the dorsal blastoporal lip. (h) Then, in G7 embryos, expression is ubiquitous. (i) In N1 neurula, expression is detected in the mesoderm and in the neural plate. (j) In later N3 neurula stage embryos, expression is ubiquitous in the mesoderm and endoderm but a stronger level of expression is observed in the pharynx and in the tailbud. Hnrnpa is also expressed in the cerebral vesicle at this stage. (k) At the T stage, expression is observed in the whole gut, in the cerebral vesicle and in some neurons of the neural tube, as well as in the posterior notochord. (l,m) In L1 larvae, Hnrnpa is expressed in the tailbud (l) and at a lower level in the cerebral vesicle, the club-shaped gland and preoral pit (m). (n–t) Cbx1-3-5 expression pattern. No regionalized expression could be detected at the 8-cell, blastula and G3 stages. (n) In G5 stage gastrulae, expression is detected in the mesendoderm of the blastoporal lip and in the dorsal ectoderm. (o) At G7 stage, Cbx1-3-5 is expressed in the mesendoderm and in the presumptive neural plate. (p) In N1 neurulae, expression is observed in the mesendoderm and neural plate. (q–t) From N3 to L1, expression is similar to what is observed for Hnrnpa.
We used the upper Z score and lower Z score (uZ and lZ, represented as dashed lines in blue). Statistically significant data were defined as the ones above the uZ and below the lZ. On the basis of these boundaries, the reach window was established (in purple), an area that covers all fragments between the first (upstream) and last (downstream) fragment above the uZ. Those read counts inside the reach window above the uZ or below the lZ were translated as harmonic distance restraints (yellow), and the rest (red) were translated as harmonic lower-bound distance restraints.
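The classification described in this legend can be illustrated with a short sketch. Several points are assumptions rather than details from the source: Z scores are computed directly on raw read counts with the population standard deviation, every fragment that does not receive a harmonic restraint is given a lower-bound restraint (one reading of "the rest"), and the function name `classify_fragments` and the toy counts are hypothetical. The default thresholds of 0.2 and −0.2 are the uZ and lZ values reported for the best parameter set.

```python
import statistics


def classify_fragments(read_counts, upper_z=0.2, lower_z=-0.2):
    """Return the reach window and a restraint label for each fragment."""
    mean = statistics.mean(read_counts)
    sd = statistics.pstdev(read_counts)
    z_scores = [(count - mean) / sd for count in read_counts]

    # Reach window: from the first to the last fragment above uZ.
    above = [i for i, z in enumerate(z_scores) if z > upper_z]
    if not above:
        return None, ["lower_bound"] * len(read_counts)
    window = (above[0], above[-1])

    labels = []
    for i, z in enumerate(z_scores):
        in_window = window[0] <= i <= window[1]
        significant = z > upper_z or z < lower_z
        if in_window and significant:
            labels.append("harmonic")
        else:
            labels.append("lower_bound")
    return window, labels


# Hypothetical read counts for eight consecutive fragments.
toy_counts = [1, 2, 30, 1, 40, 15, 35, 2]
window, labels = classify_fragments(toy_counts)
```

In this toy example the fragment with 15 reads lies inside the reach window but between the two thresholds, so it is not treated as significant and does not receive a harmonic restraint.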
(a–c) Heat map plots showing the pairwise RMSD among the 200 models. All models of each species were clustered in one of the two mirror-image clusters. Blue squares stand for an RMSD of 0 Å. Red squares stand for the maximum RMSD.
(a,c,e) 3D bar plots comparing the scores of different sets of bins with the maximum distance, the upper Z score and the lower Z score as parameters. The gradient of colors depends on the score, from blue (lowest) to red (highest). The score is the mean of the sum of the differences between the calculated distance for each bead from the 4C data and the measured distance in each model. The lower the score, the smaller the difference between the models and the 4C data. (b,d,f) Heat maps comparing the computed distances from the 4C data for each bead and the mean of the measured distances of the models with the best set of bins: 7,000 for maximum distance, 0.2 for uZ and −0.2 for the lZ.
About this article
Cite this article
Acemel, R., Tena, J., Irastorza-Azcarate, I. et al. A single three-dimensional chromatin compartment in amphioxus indicates a stepwise evolution of vertebrate Hox bimodal regulation. Nat Genet 48, 336–341 (2016). https://doi.org/10.1038/ng.3497