The genome of the jellyfish Clytia hemisphaerica and the evolution of the cnidarian life-cycle

Jellyfish (medusae) are a distinctive life-cycle stage of medusozoan cnidarians. They are major marine predators, with integrated neurosensory, muscular and organ systems. The genetic foundations of this complex form are largely unknown. We report the draft genome of the hydrozoan jellyfish Clytia hemisphaerica and use multiple transcriptomes to determine gene use across life-cycle stages. Medusa, planula larva and polyp are each characterized by distinct transcriptome signatures reflecting abrupt life-cycle transitions and all deploy a mixture of phylogenetically old and new genes. Medusa-specific transcription factors, including many with bilaterian orthologues, associate with diverse neurosensory structures. Compared to Clytia, the polyp-only hydrozoan Hydra has lost many of the medusa-expressed transcription factors, despite similar overall rates of gene content evolution and sequence evolution. Absence of expression and gene loss among Clytia orthologues of genes patterning the anthozoan aboral pole, secondary axis and endomesoderm support simplification of planulae and polyps in Hydrozoa, including loss of bilateral symmetry. Consequently, although the polyp and planula are generally considered the ancestral cnidarian forms, in Clytia the medusa maximally deploys the ancestral cnidarian–bilaterian transcription factor gene complement. The genome of the jellyfish Clytia hemisphaerica and the transcriptomes of its planula, polyp and medusa life stages show genetic changes associated with the evolution of complex life cycles.

considered the ancestral cnidarian forms, in Clytia the medusa maximally deploys ancestral cnidarian-bilaterian transcription factor gene complexity.
In most cnidarians a ciliated, worm-like planula larva settles to produce a polyp. In Anthozoa (corals and anemones), the polyp is the sexually reproductive form, but in the Medusozoa branch of Cnidaria, polyps generally produce sexually reproductive jellyfish by a process of strobilation or budding. Jellyfish (medusae) are gelatinous, pelagic, radially symmetric forms found only in the medusozoans. They show complex physiology and behaviour based on neural integration of welldefined reproductive organs, digestive systems, locomotory striated muscles and sensory structures, even camera eyes in some species. Medusae in many species show some degree of nervous system condensation, notably the nerve rings running around the bell margin [1]. Historical arguments that the medusa is the ancestral cnidarian form are no longer widely supported, as recent molecular phylogenies suggest Anthozoa and Medusozoa are sister groups, favouring a benthic, polyp-like adult ancestor on parsimony grounds [2,3] (supplementary note 1). Candidate gene expression studies have shown parallels between medusa and polyp development [4], and transcriptome comparisons between species with and without medusae have extended candidate gene lists [5], but in general the genetic foundations of complex medusa evolution within the cnidarian lineage are not well understood.
There are four classes of Medusozoa: the Cubozoa (box jellyfish), Scyphozoa (so-called 'true' jellyfish), Staurozoa (stalked 'jellyfish') and Hydrozoa [2,6]. Life cycles in different medusozoan lineages have undergone frequent modifications, including loss of polyp, planula and medusa stages. Hydra, the classical model of animal regeneration, is a hydrozoan characterized by the loss of the planula and medusa stages from the life-cycle. Compared to anthozoan genomes [7][8][9], the Hydra genome is highly diverged and dynamic; it may therefore be untypical of the medusozoa and even the Hydrozoa [10]. Here we describe the genome of Clytia hemisphaerica, a hydrozoan with a typical medusozoan life-cycle, including planula, polyp and medusa stages (Fig. 1). Clytia is easy to maintain and manipulate, and amenable to gene function analysis including Cas9 mediated mutation, allowing mechanistic insight into cellular and developmental processes [6,11,12]. We analyse transcriptomes from all life cycle forms, illuminating the evolution of the planula, polyp and medusa, and demonstrate how the gene complement of the cnidarian-bilaterian ancestor provided the foundation of anatomical complexity in the medusa.

Characteristics of the Clytia genome
We sequenced the Clytia hemisphaerica genome using a whole genome shotgun approach (see methods, tab. S1, fig. S1), giving an assembly with overall length of 445Mb. Staining of DNA in prophase oocytes shows the genome is packaged into 15 chromosome pairs ( fig. S2). We predicted transcript sequences using expressed sequence reads from a comprehensive set of stages and tissues as well as deep sequencing of mixed-stage libraries (tab. 1) giving 26,727 genes and 69,083 transcripts. BUSCO analysis of the presence of 'Universal Single Copy Orthologs' indicates a coverage of 86%, comparable or better than many other recently sequenced nonbilaterian animals (tab. S1). The GC content is 35%, higher than Hydra (29%, [10]) but lower than the anthozoan Nematostella (39%, [7]), with a repeat content of ~41%. Reads mapped to the genome suggested a polymorphism frequency of ~0.9%, likely an underestimate of heterozygosity in wild populations, as genomic DNA and mRNA for transcriptomes was derived from self-crossed laboratory-reared Clytia Z strains -see methods. The complete mitochondrial genome showed the same gene order as the Hydroidolina ancestor [13] (fig. S1).

Patterns of gene gain and loss
We identified groups of orthologs for a selection of animals with completely sequenced genomes, and unicellular eukaryotic outgroups (see methods). A binary matrix of orthologous group ! 2 presence or absence was then used to infer a maximum-likelihood phylogeny that recapitulated the widely accepted major groupings of animals ( fig. 2); Cnidarians were the sister group of the Bilateria, and within the Cnidaria, we recovered the expected monophyletic relationships: corals, anemones, anthozoans and hydrozoans. The hydrozoan branch lengths were the longest within the cnidarians, implying elevated rates of gene gain and loss in their lineage, although branches leading to several other species were noticeably longer, including the ecdysozoan models Caenorhabditis and Drosophila, the ascidian Ciona, as well as the ctenophore Mnemiopsis. Clytia and Hydra branch lengths were similar, suggesting that genome evolution has proceeded at comparable rates in these two hydrozoan lineages. This gene content-based phylogeny positioned sponges (represented by Amphimedon), not ctenophores (represented by Mnemiopsis), as the sister group of all other animals [14][15][16], although confirmation with additional species is required.
Among many examples of gene gain in Clytia, we could identify new multigene families and also instances of Horizontal Gene Transfers (HGT), as illustrated by a UDP Glucose Dehydrogenase (UGDH) gene ( fig. S3). UGDH is required for the biosynthesis of various proteoglycans and so to regulate signalling pathways during metazoan embryonic development [17]. Unusually, the Clytia genome contains two versions of the UGDH gene, including one acquired in Hydrozoa by HGT from a giant virus of the Mimiviridae family and expressed specifically during medusa formation. Interestingly, this UGDH xenolog was lost in the Hydra lineage and replaced by another UGDH acquired through HGT from bacteria (fig. S3) [10]. We also detected numerous gene duplications in the hydrozoan lineage, illustrated by the 39 Innexin gap junction genes ( fig. S4), 14 Green Fluorescent Protein (GFP) and 18 Clytin photoprotein genes ( fig. S5) found in the Clytia genome. The 4 GFPs and 3 Clytins previously reported in Clytia are transcribed from several recently duplicated genes, likely facilitating the levels of protein production needed to achieve the high cytoplasmic concentrations required for energy transfer between Clytins and GFPs [18].
of the previously described clusters of homologous developmental genes conserved between Nematostella and bilaterians, involving Wnt, Fox, NK, ParaHox or Hox family members [22,28,29] could be detected in Clytia (tab. S2) reinforcing the idea of rapid evolution of the genome organization in the common branch of Clytia and Hydra.

Elevated stage specific gene expression in medusae and polyps
Hydrozoan life cycles are characterised by abrupt morphological transitions: metamorphosis from the planula to polyp; and growth and budding of the compex medusa from gonozooid polyps. To better understand global trends in differential gene use across the life cycle we produced a comprehensive replicated transcriptome dataset from 11 samples: early gastrula embryos, planulae at 3 developmental stages, newly-metamorphosed primary polyps, 3 components of the polyp colony (the feeding gastrozooid polyps, budding gonozooids and stolon), freshly-budded baby medusae, and adult male and female medusae (Fig. 1A). We used a novel method to identify stage specific genes, fitting the distribution of log-transformed gene expression for a given library to the sum of two Gaussian distributions, one of which we interpret as an 'on' set and the other an 'off' set (see methods, Fig. 3C, Fig. S7) [30]. By these criteria, polyp and medusa stages expressed more genes than embryo and planula stages, with highest expression of distinct genes at the primary polyp stage (19801 genes) and lowest expression at the early gastrula (13489 genes).
The majority of predicted genes, 84% (22472/26727) are expressed in at least one of our sampled stages (see methods; note that our gene prediction protocol includes data from deep sequencing of other mixed libraries), and 41% (10874/26727) are expressed at all stages. We defined genes as specific to a life-cycle stage if they were 'on' in at least one of the component samples for each main stage (Planula: early gastrula, 1, 2 or 3 day old planula; Polyp: stolon, primary polyp, or polyp head; Medusa: baby, mature or male), and 'off' in all other stages. We also identified genes specifically not expressed at either planula, polyp or medusa stages and so expressed at the two other stages. By these criteria, we found 335 planula, 1534 polyp and 808 medusa specific genes, and 1932, 284 and 981 genes specifically not expressed at planula, polyp and medusa stages respectively. We further filtered these data, requiring that genes also show significant (P<0.001, as defined by DESeq2 contrasts) expression differences between stages defined as 'on' and other stages, allowing a rigorous treatment of the variance between biological replicates. The application of this comparative expression criterion to the initial "stage-specific" gene sets derived by independent "on-off" determination for each sample, reduced the lists to 183 planula, 783 polyp and 614 medusa specific genes, and 1126, 180 and 374 genes specifically not expressed at planula polyp and medusa stages respectively. We can thus conclude that the two adult stages in the Clytia life cycle show greater complexity of gene expression than the planula larva.
In order to determine whether the medusa stage was enriched in genes found only in the medusozoan clade, as might plausibly be expected of an evolutionary novelty, we combined these lists of stage specific genes with a phylogenetic classification. We classified stage specific genes by the phylogenetic extent of the OMA Hierarchical Orthology Group of which they were a member (see methods), or the absence of significant sequence similarity to other species (for 'Clytia ! 4 specific' sequences), and tested for enrichment of phylogenetic classes in each life-cycle stage (fig. S8). By these criteria, all three main life cycle stages (planula, polyp and medusa) were enriched in 'Clytia specific' sequences, indicating that phylogenetically 'new' genes are more likely than 'old' genes to show stage-specific expression, but are not associated with any one life cycle phase. In general, genes that evolved after the cnidarian/bilaterian split were more likely to be expressed specifically in adult (polyp/medusa) stages.

Stage specific transcription factors
To address the nature of the molecular differences between stages, we assessed enrichment of Gene Ontology (GO) terms in stage specific genes relative to the genome as a whole. Planula larvae were found to be significantly enriched in G-protein coupled receptor signalling components, while polyp and medusa were enriched in cell-cell and cell-matrix adhesion class molecules (see tab. S3). Medusa specific genes were unique in being significantly enriched in the "nucleic acid binding transcription factor activity" term.
Confirming the strong qualitative distinction in gene expression profiles between planula, polyp and medusa (see above Fig. 3a and 3b) clustering of transcription factor expression profiles recovers the three major life-cycle stages (Fig. 4A). The majority of transcription factors (TFs) that were specific to a particular stage, using the criteria described above, were specific to the medusa (33, of which 11 are plausibly sex specific, see tab. S4). 12 were polyp specific (e.g. VSX, two HMX orthologs) and 16 were polyp-medusa specific, such that a total of 63 TFs were expressed at polyp and/or medusa stages but not at the planula stage (12.5% of the total TF). Only 3 TFs showed expression specific to the planula (TBX8 [31], a zinc finger containing protein and another T-box containing protein, neither obviously orthologous to other animal genes), while 8 were planula-polyp and 7 planula-medusa specific. This pattern is even more striking in the case of the 64 total homeodomain-containing TFs: 9 were medusa specific, 6 polyp specific and 5 polypmedusa specific, with a total of 31% TF expressed at polyp and/or medusa stages but not at the planula stage. No homeodomain-containing TFs were identified as planula specific, with 1 specific to planula-polyp and 2 to planula-medusa.
Among TFs expressed strongly in the medusa, but poorly at planula stages, we noted a large number with known involvement neural patterning during bilaterian development (Medusa only: Paraxis/TCF15, PDX/XLOX, CDX, TLX, Six1/2, DRGX, FOXQ2 paralogs; Polyp and Medusa: Six3/6, FoxD, FezF, OTX paralogs, HMX, TBX4/5, DMBX, NK2, NK6, Neurogenin1/2/3). We performed in situ hybridization analysis in medusae for a selection of these TF genes, and detected major sites of expression in the manubrium, gonads, nerve rings and tentacle bulbs (Fig. 4c,d). Within these structures, which are involved in mediating and coordinating feeding, spawning and swimming in response to environmental stimuli [1,32,33], expression showed a variety of distinct patterns, including scattered cell populations with putative sensory or effector roles. The variety of cellular distributions indicates a degree of complexity not previously described at the molecular level. We propose that, in Clytia, expression of conserved TFs in the medusa is associated with diverse cell types, notably with the neural and neurosensory functions of a complex nervous system, with continuous expression of certain transcription factors in post-mitotic neurons being necessary to maintain neuronal identity [34]. Members of the Sox, PRDL and Achaete scute (bHLH subfamily) orthology groups, commonly associated with neurogenesis [35,36] are detectable across all life cycle stages in Clytia, so our results are unlikely simply due to a higher production of nerve cells in the medusa.

Discussion
Three lines of evidence suggest that the Clytia genome has undergone a period of rapid evolution since the divergence of Hydrozoa from their common ancestor with Anthozoa ( fig. 5). Firstly, rates of amino acid substitution appear to be elevated in hydrozoan relative to anthozoan cnidarians [44]. Secondly, orthologous gene content analysis shows that hydrozoans have the longest branches within the Cnidaria, with elevated rates of gene gain and loss ( fig. 2). Thirdly, analysis of adjacent gene pairs shows more conservation between Anthozoa and Branchiostoma (as a representative of the Bilateria) than between Hydrozoa and Branchiostoma.
Gene expression analysis and lost developmental genes point to secondarily simplified planula and polyp structures in Clytia. The planula larva, in particular, shows an absence of key apical (i.e. aboral/anterior) and endomesoderm patterning genes considered ancestral on the basis of shared expression patterns in Anthozoa and bilaterian larvae [37,38,42]. Similarly, several genes with roles in patterning the directive axis of the anthozoan planula [21,45] are lost from the Clytia and Hydra genomes (Chordin, Hox2, Gbx, Netrin, unc-5), providing support for loss of bilaterality in medusozoans [24]. Much of the directive axis-patterning gene expression lost in Clytia planulae is, in Nematostella, likely involved in differentiating structures -mesenteries -that are maintained in the adult polyp, supporting the idea that the simple state of the Clytia polyp is secondary.
The medusa stage, as well as being morphologically complex, expresses a notable number of transcription factors that are conserved between cnidarians and bilaterians. These genes are expressed either specifically in the medusa (eg. DRGX, Twist and Pdx), or in both polyp and medusa but not planula stages (eg. Six3/6, Otx and FoxD), with medusa expression patterns suggesting roles in establishment or maintenance of neural cell-type identity. Hydra has lost the medusa from its life-cycle and has lost orthologs of most transcription factors that in Clytia are expressed specifically in the medusa, further supporting the notion that these genes are regulating the identity of cells now restricted to the medusa.
We propose then that, in part, the rapid molecular evolution we observe at the genome scale in Hydrozoa is connected as much to the simplified planula and polyp as the more obvious novelty of the medusa. Genomic and transcriptomic studies of additional medusozoan species, including scyphozoans, which produce medusae by strobilation and whose polyps are less simple than those of hydrozoans, may show whether the expansion of cell type and morphological complexity in the medusa phase, involving both old and new genes, has similarly been offset by reduction of gene usage in planula and polyp stages in other medusozoan lineages.

Animals and extraction of genomic DNA.
A 3-times self-crossed strain (Z4C) 2 (male) was used for genomic DNA extraction, aiming to reduce polymorphisms. The first wild-type Z-strain colony was established using jellyfish sampled in the bay of Villefranche-Sur-Mer (France). Sex in Clytia is influenced by temperature [46] and some young polyp colonies can produce both male and female medusae. Male and female medusae from colony Z were crossed to make colony Z 2 . Two further rounds of self-crossing produced (Z4C) 2 (see fig. S1 for relationships between colonies). For in situ hybridization (and other histological staining) we used a female colony Z4B, a male colony Z10 (offspring of (Z4C) 2 x Z4B) as well as embryos produced by crossing Z10 and Z4B strains. (Z4C) 2 , Z4B and Z10 are maintained as vegetatively growing polyp colonies.
For genomic DNA extraction mature (Z4C) 2 jellyfish were cultured in artificial sea water (RedSea Salt, 37‰ salinity) then in Millipore filtered artificial sea water containing penicillin and streptomycin for 3 to 4 days. They were starved for at least 24 hours. Medusae were snap frozen in liquid nitrogen, ground with mortar and pestle into powder then transferred into a 50 ml Falcon tubes (roughly 50~100 jellyfish/tube). About 20 ml of DNA extraction buffer (200 mM Tris-HCl pH 8.0 and 20 mM EDTA, 0.5 mg/ml proteinase K and 0.1% SDS) were added and incubated at 50°C for 3 hours until the solution became uniform and less viscous. An equal volume of phenol was added, vortexed for 1 minute, centrifuged for 30 minutes at 8000 g, then supernatant was transferred to a new tube. This extraction process was repeated using chloroform. X1/10 volume of 5M NaCl then 2.5 volumes of ethanol were added to the supernatant before centrifugation for 30 min at 8000 g. The DNA precipitate was rinsed with 70% ethanol, dried and dissolved into distilled water. 210 µg of DNA was obtained from 270 male medusae.

Genome sequencing and assembly.
Libraries for Illumina & 454 sequencing were prepared by standard methods (full details in supplementary methods file 1).
Sequence files were error corrected using Musket [47] and assembled using SOAPdenovo2 [48] with a large kmer size of 91 in an effort to separate haplotypes at this stage. We subsequently used Haplomerger2 to collapse haplotypes to a single more contiguous assembly [49]. Further scaffolding was done with L_RNA_Scaffolder using Trinity predicted transcript sequences (see below) [50,51].

RNA extraction, transcriptome sequencing and gene prediction.
RNA samples were prepared from Z4B female and Z10 male medusae and polyps, as well as embryos generated by crossing these medusae. Animals were starved for at least 24 hours before extraction and kept in Millipore filtered artificial sea water containing penicillin and streptomycin. Then they were put in the lysis buffer (Ambion, RNAqueous ® MicroKit), vortexed, immediately frozen in liquid nitrogen, and stored at -80°C until RNA preparation.

! 7
Total RNA was prepared from each sample using the RNAqueous® Microkit or RNAqueous® (Ambion). Treatment with DNase I (Q1 DNAse, Promega) for 20 min at 37°C (2 units per sample) was followed by purification using the RNeasy minElute Cleanup kit (Qiagen). See tab. S5 for total RNA (evaluated using Nanodrop). RNA quality of all samples was checked using the Agilent 2100 Bioanalyzer. The samples used to generate the expression data presented in fig. 3A are described in tab. S5. For the 'mix' sample , purification of mRNA and construction of a non-directional cDNA library were performed by GATC Biotech®, and sequencing was performed on a HiSeq 2500 sequencing system (paired-end 100 cycles). For the other samples, purification of mRNA and construction of a non-directional cDNA library were performed by USC Genomics Center using the Kapa RNA library prep kit, and sequencing was performed using either HiSeq 2500 (single read 50 cycles) or NextSeq (single read 75 cycles).

Transcriptome assembly
We constructed a de novo transcriptome assembly without recourse to genomic data, assembling all RNA-Seq libraries with Trinity [50].

Gene prediction
Genes were predicted from transcriptome data. Using tophat2, we mapped single end RNA-seq reads from libraries of early gastrula; 1, 2 and 3 day old planula; stolon, polyp head, gonozooid, baby medusa, mature medusa, male medusa, growing oocyte and fully grown oocyte to the genomic sequence [52]. In addition we mapped a mixed library made from the above samples but sequenced with 100bp paired end reads, and a further mature medusa library (100bp paired end). Genes were then predicted from these mappings using cufflinks and cuffmerge. Proteins were predicted from these structures using Transdecoder, and the protein encoded with the most exons taken as a representative for gene level analyses.

Protein data sets.
Protein analyses. We constructed a database of metazoan protein coding genes from complete genomes, including the major bilaterian phyla, all non-bilaterian animal phyla (including 6 cnidarian species) and unicellular eukaryotic outgroups. For the majority of species, we used annotation from NCBI, and selected one representative protein per gene, to facilitate subsequent analyses (tab. S6). We used the proteins as the basis for an OMA analysis to identify orthologous groups [53]. We converted the OMA gene OrthologousMatrix.txt file into Nexus format with datatype=restriction and used it as the basis for a MrBayes analysis, using corrections for genes present in fewer than 2 taxa 'lset coding=noabsencesites|nosingletonpresence', as described in [14]. This tree was then used in a subsequent OMA run to produce Hierarchical Orthologous Groups (HOGs). These HOGs were used as the basis for the phylogenetic classification of Clytia genes into one category out of eukaryotic, holozoan, metazoan, planulozoan, cnidarian or hydrozoan, based on the broadest possible ranking of the constituent proteins. Genes were presumed to have evolved in the most recent common ancestor of extant leaves, and leaves under this node where the gene was not present were presumed to be losses, with the minimum number of losses inferred to explain the observed presence and absence. Clytia specific genes were identified as those whose encoded proteins had no phmmer hits to the set of proteins used in the OMA analysis.
Where specific genes are named in the text, orthology assignments were taken from classical phylogenetic analysis (or in a few cases pre-existing sequence database names). Signature domains (e.g. Homeobox, Forkhead, T-box, HLH) were searched against the protein set using Pfam HMM models and hmmsearch of the hmmer3 package, with the database supplied 'gathering' threshold cutoffs [54,55]. Sequence hits were extracted and aligned with MAFFT [56] and a ! 8 phylogeny reconstructed using RAxML with the LG model of protein evolution and gamma correction [57].
Transcription factors were assigned via matches beneath the 'gathering' threshold to Pfam domains contained in the transcriptionfactor.org database [58], with the addition of MH1, COE1_DBD, BTD, LAG1-DNAbind and HMG_Box.
Synteny analyses. Genes were ordered on their scaffolds (using the GFF files described in tab. S6) based on the average of their start and end position, and for each gene, the adjacent genes recorded, ignoring order and orientation but respecting boundaries between scaffolds (i.e. terminal genes had only one neighbour). Between species comparisons were performed using the Orthologous Groups from OMA, to avoid ambiguity from 1:many and many:many genes. When both members of an adjacent pair in one species were orthologous to the members of an adjacent pair in the other species, two genes were recorded as being involved in a CAPO (Conserved Adjacent Pair of Orthologs). A consecutive run of adjacent pairs (i.e. a conserved run of 3 genes) would thus be two pairs, but count as 3 unique genes.
RNA-Seq analyses. RNA-seq reads were aligned to the genome using STAR [60]. Counts of reads per gene were obtained using HTSeq-count. Gene level counts were further analyzed using the DESeq2 R package [61]. An estimate of the mode of row geometric means (rather than the default median) was used to calculate size factors. PCA and heatmaps were generated using regularized logarithms of counts (with blind = F). Bootstrapped hierarchical clustering was performed with pvclust using the default parameters [62]. Significant differences in gene expression were calculated via pairwise contrasts of different 'conditions' (replicated libraries). Planula stages were defined as any of 1,2 or 3 day old planula; Polyp any of stolon, primary polyp, polyp head or gonozooid; medusa any of gonozooid, baby medusa, mature medusa or male medusa. In order to be considered 'up' in planula, polyp or medusa, a gene needed to be significantly up (lfc threshold = 0.0, alt hypothesis = 'greater') in at least one 'condition' of that stage relative to all 'condition' of different stages.
Stage specific expression. To identify genes whose expression is restricted to particular stages we developed a two component pipeline: Firstly we discriminated 'on' vs 'off' genes independently for each stage as defined by absolute expression values, and then we assessed statistically significant differences between stages (which may not be between 'on' and 'off'). To estimate for each sample the relative proportions of genes in the on vs. off categories, following Hebenstreit and Teichmann [30] we fitted each of our length normalized rlog -transformed gene expression datasets, averaged over replicates, to a mixture of two Gaussian distributions using the mixmodel R package [63]. The total number of 'on' genes for a given stage is estimated by multiplying the mixing proportion (lambda) of the 'on' peak by the total number of genes. Individual genes were defined as 'on' if they had a posterior probability > 0.5 of coming from the more highly expressed distribution. This approach avoids choosing an arbitrary "FPKM" (Fragments per Kilobase per Million mapped reads) value as an indicator of expression. Our frequency-distribution based approach defines gene 'on' or 'off' states independently of the total numbers of distinct transcripts expressed in a given sample, unlike FPKM values which are a measure of concentration and so for similarly expressed genes will be relatively higher in samples with low complexity.
GO term enrichment. GO terms were assigned via sequence hits to the PANTHER database using the supplied 'pantherScore2.0.pl' program. Term enrichment was tested using the 'Ontologizer' software with a 'Parent-Child-Union' calculation (the default) and Bonferroni multiple testing correction [64].
Historically, some authors considered the medusa form to be the ancestral state of the cnidarians, with the anthozoans having lost this stage. Under this scenario hydrozoans are the earliest branching cnidarians (i.e. the sister group of other cnidarian taxa) because of their more radial (and hence assumed simpler) morphology, and consequently medusozoans are paraphyletic (to use modern terminology). Anthozoans would then have evolved from within the medusozoan group, and so must have lost the medusa stage [65]. Aside from debatable assumptions, this scenario has no support from molecular phylogenies [2,3]. More recently, some selected gene expression patterns have been used to support the idea of homology between the mesoderm of bilaterians and the entocodon of developing hydrozoan medusae, a 'third' cell layer that forms between ectoderm and endoderm during the early stages of medusa bud development [66][67][68]. Again, such arguments are not easily reconciled with modern phylogenies. Heatmap showing major classes of transcription factor differential expression. Gene names are taken from orthology assignments of this work, with the suffix -like indicating assignment to a class but no precise orthology. In cases where assignment was not clear, names preceded by * are taken from best human Blast hits. Those followed by ** are genes for which an assignment in Clytia existed in Genbank prior to this study (e.g. Cdx /Drgx). Trailing numbers are unique gene identifiers from this project. Expression units are rlog values from DESeq2 with the row means subtracted. Library names are as described in fig. 1  Hoechst staining (blue) of DNA in Clytia oocytes dissected from isolated gonads kept overnight in the dark and fixed, 35 minutes after exposing to light, for antitubulin immunofluorescence in red. The oocytes in a) and b), fixed using formaldehyde, are in the resting, meiotic prophase I arrested state, and t he duplicated chromosome pairs are clearly distinct in the dark oocyte nucleus (GV, Germinal Vesicle). The oocyte in c), fixed in cold methanol, has begun the maturation process, and the chromosomes are gathering on a microtubule aster. In each of 5 oocytes in which the Hoechst-stained lumps were clearly distinct, 15 chromosome packets were counted. Maximum projections of z stacks covering the whole GV or chromosome cluster are shown. Scale bar 50µm.