Non-collinear Hox gene expression in bivalves and the evolution of morphological novelties in mollusks

Hox genes are key developmental regulators that are involved in establishing morphological features during animal ontogeny. They are commonly expressed along the anterior–posterior axis in a staggered, or collinear, fashion. In mollusks, the repertoire of body plans is widely diverse and current data suggest their involvement during development of landmark morphological traits in Conchifera, one of the two major lineages that comprises those taxa that originated from a uni-shelled ancestor (Monoplacophora, Gastropoda, Cephalopoda, Scaphopoda, Bivalvia). For most clades, and bivalves in particular, data on Hox gene expression throughout ontogeny are scarce. We thus investigated Hox expression during development of the quagga mussel, Dreissena rostriformis, to elucidate to which degree they might contribute to specific phenotypic traits as in other conchiferans. The Hox/ParaHox complement of Mollusca typically comprises 14 genes, 13 of which are present in bivalve genomes including Dreissena. We describe here expression of 9 Hox genes and the ParaHox gene Xlox during Dreissena development. Hox expression in Dreissena is first detected in the gastrula stage with widely overlapping expression domains of most genes. In the trochophore stage, Hox gene expression shifts towards more compact, largely mesodermal domains. Only few of these domains can be assigned to specific developing morphological structures such as Hox1 in the shell field and Xlox in the hindgut. We did not find traces of spatial or temporal staggered expression of Hox genes in Dreissena. Our data support the notion that Hox gene expression has been coopted independently, and to varying degrees, into lineage-specific structures in the respective conchiferan clades. The non-collinear mode of Hox expression in Dreissena might be a result of the low degree of body plan regionalization along the bivalve anterior–posterior axis as exemplified by the lack of key morphological traits such as a distinct head, cephalic tentacles, radula apparatus, and a simplified central nervous system.


Materials and methods
Animal collection and cultures. Sexually mature individuals of Dreissena rostriformis were collected in the Danube river in Vienna, Austria (N 48° 14′ 45.812″, O 16° 23′ 38.145″). Collection took place between April and September 2017 and 2018, respectively. Adults were gathered from underneath stones and transferred to the laboratory. Here, the animals were washed, cleaned, and maintained in aquaria with filtered river water (FRW) at 19 °C.
Spawning of animals was induced by exposing 10-20 sexually mature specimens for 15 min to a 10 −3 M solution of serotonin (Sigma-Aldrich, Darmstadt, Germany), followed by one wash and subsequent maintenance in FRW. After approximately 30 min, up to 50% of the treated specimens started to spawn. Oocytes were collected from the water column, inseminated, and cultured in 50 ml glass beakers at 23 °C. After fertilization, water was changed every half an hour for the first three hours and then every six hours to remove excess sperm and to avoid bacterial or fungal infection.

RNA extraction and fixation of developmental stages.
Several hundred individuals of each, cleavage, trochophore, and various veliger stages were stored in RNALater (Lifetechnologies, Vienna, Austria) at − 20 °C. RNA was extracted with an RNeasy Mini Kit (Qiagen, Hilden, Germany) according to the manufacturer's instructions and was stored at − 80 °C.
Embryos and larvae from three developmental stages (gastrula, trochophore, and veliger) were fixed in 4% paraformaldehyde (PFA) in 0.1 M phosphate buffered saline (PBS) for 1 h at room temperature (23 °C). Veliger larvae aged 24 h post fertilization (hpf) or older were relaxed prior to fixation by adding cocaine crystals (Sigma-Aldrich, Darmstadt, Germany) to the FRW. All larvae were subsequently washed in 0.1 M PBS, transferred stepwise to 100% methanol, and stored at -20 °C.
Transcriptome assembly and expression profiling. RNA-seq transcription data collected from developmental stages of interest were assembled previously 41 . Gene expression levels of the 17 libraries were quantified with Kallisto (transcripts per kilobase million, TPM) 42 . Heatmaps showing relative quantitative expression of genes were plotted in R software with the heatmap.2 function from the gplots R package 43 and columns were normalized by z-score ( Fig. 1, Supplementary Table S1).
Sequence alignment and phylogenetic analysis. Amino acid sequences of Hox orthologs were retrieved from published NCBI sequences of other molluscan, lophotrochozoan, and bilaterian gene orthologs. Dreissena Hox candidates were found in the previously assembled transcriptome by blasting against confirmed ortholog sequences. Gene names and GenBank accession numbers used for phylogenetic reconstruction are available in Supplementary Table S1. Dreissena Hox and ParaHox ortholog candidates were confirmed by searching for characteristic motifs in each gene from the quagga mussel transcriptome as previously described 33,41 . Amino acid sequences of candidate Dreissena Hox genes and metazoan orthologs were aligned using MAFFT v7.123b 44 . The trimmed multiple sequence alignment used for the phylogenetic analysis was obtained with Tri-mAl v1.2 with the following parameters: -cons 20 -st 0.8 -gt 0.7-w 1 45 . Model selection with Prottest3 v3.4.2 46 through the Akaike Information Criterion (AIC) determined the Jones-Taylor-Thornton (JTT) model of aminoacid substitution as most appropriate for the alignment and thus was selected for the phylogenetic analysis. The maximum likelihood phylogenetic analysis was carried out with RAxML v8.2.X 47 with gamma-distributed rates and 1,000 non-parametric bootstrap replicates. Bayesian analysis was performed using MrBayes v3.2.6 48 with 25% of sampled trees discarded as burn-in, six rate categories for the gamma distribution, and 30,000,000 generations. The Bayesian phylogenetic run output was analyzed with the R package RWTY 49 . Likelihoods for the model parameters and resulting tree topologies were highlighted as a function of the number of sampled generations during the phylogenetic inference after burn-in of 25% www.nature.com/scientificreports/ phore and veliger). Identified Hox and ParaHox orthologs were used to design gene-specific primers. PCR amplified fragments were later separated by gel electrophoresis and purified with a QIAquick Gel Extraction Kit (Qiagen, Hilden, Germany). The products were cloned into pGEM-T Easy Vectors (Promega, Mannheim, Germany) and used to transform E. coli cells. Colonies were grown overnight, plasmids were later purified with a QIAprep Spin MiniprepKit (Qiagen) and sequenced for verification of insert orientation. For more descriptions of the whole experimental procedure see also 36,37,[51][52][53] . Forward and reverse primers for each gene sequence can be found in Supplementary Table S2. (B) Heat map shows relative normalized expression levels for each gene (Z-score). Genes expressed above the normalization threshold are depicted in graded shades of red, those below the threshold in shades of blue. Time after fertilization and corresponding developmental stages at 23 °C are to the left. Germ layers and major derivatives in animal schemes are depicted in grey (mesoderm), red (endoderm), and white (ectoderm), respectively. Asterisks mark the blastopore/mouth, sf indicates the shell field. www.nature.com/scientificreports/ Probe synthesis and whole mount in situ hybridization. Plasmid inserts were amplified by PCR amplification using M13 primers as described previously 36,52 . Antisense riboprobes were synthesized using digoxigenin-UTP (DIG RNA Labeling Kit, Roche Diagnostics) and SP6/ T7 polymerase (Roche Diagnostics). For whole mount in situ hybridization, animals stored in 100% methanol were rehydrated in 0.1 M PBS and decalcified with PPE (20% PFA + 10 × PBS + 0.5 M EGTA at pH 8 + diethylpyrocarbonate; DEPC-treated H 2 O). Afterwards, samples were treated with proteinase-K at 37 °C for 10 min (10 µg/ml in PTw: 1 × PBS + 0.1% Tween-20). Samples were washed in PTw and post-fixed for 45 min in 4% PFA. Subsequently, samples were washed and transferred to hybridization buffer (50% formamide, 5 × SSC, 50 μg/ml heparin, 500 μg/ml yeast tRNA, 0.1% tween-20, pH 6.0) for 8-10 h at 56-60 °C. Probe hybridization was performed at the same temperature with probe concentrations ranging between 1 and 2 ng/µL for 30-48 h. Washing steps after hybridization were performed in wash buffer (75% hybridization buffer + 25% of SSC (3 M NaCl + 0.3 M saline-sodium citrate; SSC buffer + DEPC H 2 O)) three times for 10 min each, then twice in 50% hybridization buffer + 50% SSC for 10 min each, and finally three times in 25% hybridization buffer + 75% SSC for 7 min (once) and 1xSSC + 0.1%Tween-20 for 5 min each.

Results
Phylogenetic analysis of gene orthologs. Candidate orthologs of all ten Hox genes (Hox7 appears to have been lost as in Crassostrea gigas) and three ParaHox genes were identified from the quagga mussel transcriptome. Multiple sequence alignment of the putative Hox proteins of Dreissena rostriformis and other bilaterians shows high sequence conservation within the homeodomain regions (Supplementary Data 2). Tree topologies obtained from Bayesian inference and maximum likelihood methods are highly consistent and reveal similar clusters among corresponding ortholog groups. All Dreissena gene candidates cluster with either a mollusk or another lophotrochozoan ortholog sequence (Supplementary Table S3, Supplementary Data 3, 4). Dro-Hox1-5, Dro-Lox2, Dro-Lox4, Dro-Lox5, Dro-Post1, and Dro-Post2, together with the ParaHox sequences Dro-Gsx, Dro-Xlox, and Dro-Cdx, were successfully cloned and the respective riboprobes were synthesized. All probes except those for Dro-Hox5, Dro-Post1, Dro-Cdx, and Dro-Gsx yielded results by in situ hybridization. The negative results of the latter were most likely due to low expression levels of these genes in the targeted developmental time points (Supplementary Table S1, Supplementary Data 5).
Hox and ParaHox gene arrangement. Dreissena rostriformis Hox and ParaHox genes are distributed along several scaffolds of the genome, which may either reflect the fragmented state of the assembly or disruption of the respective clusters (Supplementary Table S4, Supplementary Data 6). The Hox genes were distributed over five scaffolds, while two of the three ParaHox genes (Gsx, Xlox) were found on a single scaffold. Several non-Hox genes were found between members of the Hox cluster (between Xlox and Cdx, Hox5 and Lox5, Lox2 and Post2), demonstrating insertions all along the cluster (Fig. 1a

Hox and ParaHox gene expression in Dreissena rostriformis. First detection of Hox and ParaHox
gene expression by in situ hybridization started in the late gastrula stage (i.e. after 10hpf), in which Dreissena is heavily ciliated yet lacks a distinct mesodermal layer. The large invagination of the shell field marks the future dorsal side, and the smaller invagination of the blastopore lies on the opposite, ventral side (Fig. 2). Although this time point was not directly sampled for quantitative analysis, data from 8 and 13hpf old individuals suggest that significant transcription rates start between these stages. Both Hox and ParaHox gene expression is main-  www.nature.com/scientificreports/ tained throughout the trochophore stage (i.e. around 13-16hpf) in which larvae become slightly elongated, the shell field has already evaginated, and the prototroch is distinct (Fig. 3). Hox and ParaHox gene expression was not detected in veliger stages.
Hox 1. Expression of Dro-Hox1 is restricted to the shell field in the late gastrula and in the trochophore stage (Figs. 2a-c, 3a-d). Expression of Hox1 is initially found at the anterior margin of the shell field in gastrula stage.
In the trochophore larva the expression domain expands to all cells of the shell field.
Hox 2. Dro-Hox2 shows strong expression in the median region of the developing mesoderm of the gastrula and is located towards the posterior half of the embryo (Fig. 2d-f). Subsequently, mesodermal expression becomes confined to the median region of the trochophore that corresponds to the mesodermal layer underlying the shell field ( Fig. 3e-h).
Hox 3. Dro-Hox3 is widely expressed in the mesoderm of the gastrula. Its expression domain partially overlaps with Hox2, although Hox3 transcripts are distributed more widely all across the larval body ( Fig. 2g-i). The unspecific stain around the periphery of the gastrula constitutes non-larval material. In the trochophore, Hox3 expression overlaps with most genes except Hox1, Post2, and Xlox. Expression is confined to a mesodermal domain that underlies the shell field ( Fig. 3i-l). Lox 5. In the gastrula stage, the expression domain of Lox5 partially overlaps with Dro-Hox2, Dro-Hox3, and Dro-Hox4. Expression at this stage is restricted to the presumptive ventro-median mesoderm ( Fig. 2s-u). In the late trochophore stage, gene expression occurs in individual mesodermal cell clusters throughout the larval body ( Fig. 3q-t).
Lox 4. In the gastrula, expression of Dro-Lox4 is located in mesodermal tissue that underlies the posterior margin of the shell field ( Fig. 2p-r). In the trochophore stage, Dro-Lox4 is expressed in the dorsal mesoderm that underlies the shell field ( Fig. 3u-x).

Lox 2. Dro-Lox2
shows a clear shift of expression between developmental stages. In the gastrula stage, the expression is in median mesodermal tissue that extends towards the posterior region of the embryo. In the trochophore, transcripts are located in the dorsal mesoderm below the shell field. The pattern overlaps with Dro-Hox2, Dro-Hox3, and, to some extent, with Dro-Lox4 (Figs. 2m-o, 3y-b′).
Post 2. Dro-Post2 expression is located in cells of the posterior margin of the shell field in the gastrula, opposite to Dro-Hox1 (Fig. 2v-x). In the trochophore stage, transcripts are located more posteriorly on the ventral side of the larva, where expression co-localizes with the foot anlage ( Fig. 3c′-e′).
Xlox. Dro-Xlox is first detected in the gastrula stage, in a group of cells corresponding to the anterior region of the developing digestive tract, i.e. in an endodermal domain, and in parts of the emerging anterior mesoderm (Fig. 2y, z). In the trochophore, Dro-Xlox-expressing cells lie adjacent to the developing hindgut ( Fig. 3f′-i′).

Non-staggered Hox gene expression in Dreissena.
Hox genes in bilaterians are predominantly expressed along the anterior-posterior axis. They determine regionalization of morphological structures across the animal body. Since this staggered spatial expression is present in most clades, it is considered ancestral for Bilateria 9-12,54,55 . In bivalves, similar spatially staggered expression of Hox1, Hox4, Lox5, and Post2 were proposed for a scallop, Mizuhopecten yessoensis. However, only the gastrula stage was investigated in this bivalve and data  www.nature.com/scientificreports/ are therefore largely incomplete. Transcriptomic data on this scallop seem to suggest a subcluster-level of temporal staggered expression whereby the first genes of each subcluster are expressed simultaneously, followed by the sequential expression of each subsequent gene within each subcluster 35 . By contrast, stage-specific quantitative transcriptome data for another bivalve, Crassostrea gigas, appears to more closely reflect the results from Dreissena insofar as no evidence of sequential or collinear activation of Hox genes was found 28,56 .
In Dreissena, the relative expression domains of a few genes (Hox1, Post2, Lox4) might be interpreted as "partially spatially staggered" in the gastrula stage. Dro-Lox4 maintains its pattern of expression in the dorsoposterior half of the gastrula and the trochophore, while Dro-Hox1 is expressed anteriorly in the dorsal region of the gastrula. However, during the transition to the trochophore stage, Dro-Hox1 expands throughout the entire shell field. Such a Hox1 expression domain has also been found in gastropods, scaphopods, and the scallop, and might constitute an autapomorphy of Conchifera 26,35,37,40 . All other Dreissena Hox genes display highly dynamic spatial expression profiles with changes in expression domains frequently occurring between the gastrula and trochophore stages. As such, Dro-Post2 expression is initially restricted to the posterior margin of the presumptive shell field, in close proximity to Dro-Hox1, but later extends ventro-posteriorly in the trochophore. This non-staggered expression of Dro-Hox1, Dro-Hox2, Dro-Hox3, Dro-Hox4, Dro-Lox2, Dro-Lox5, Dro-Post2, and Dro-Xlox is also reflected by their non-collinear but rather synchronous temporal activation (Fig. 1, Supplementary Data 5).

Cooption of Hox genes in mollusks.
Conchiferan Hox gene expression is often located in specific developing morphological structures and does not strictly follow an anterior-posterior gradient, although rudiments of an ancestral staggered mode of expression are recognized to varying degrees in individual lineages and developmental stages 16,18,26,36,37,53 (Fig. 4b). Hox2, Hox3, Hox4, Lox5, Lox4, and Lox2 show an overlapping expression in the developing mesoderm in the gastrula stage in Dreissena (Figs. 2, 4a). Subsequently, the expression domains still overlap, but are more confined to dorsal parts of the mesoderm (Figs. 3, 4a). In other conchiferans, Hox genes show more distinct spatial staggering along the anterior-posterior axis. In pre-torsional gastropod larvae, Hox2, Hox3, Hox4, Hox5, and Hox7 expression domains appear somewhat staggered in the developing nervous system, while Hox1, Lox5, Lox4, Lox2, Post2, and Post1 do not. Instead, these genes are expressed in distinct morphological features such as the shell field, foot, prototroch, and the larval apical organ 14,20,26,37,38,40 . A similar situation is found in cephalopods, where Hox1, Hox3, Hox5, Lox4, and Post2 expression is located in the arm crown of the bobtail squid Euprymna scolopes in a somewhat staggered manner, but are also active in the pedal ganglia, the palliovisceral ganglia, and the light organ. Hox3, Hox5, Lox4, and Lox5 are additionally expressed in the developing funnel 21,22 . Trochophore larvae of the scaphopod Antalis entalis likewise show spatially staggered expression of seven out of nine Hox genes, with Hox2, Hox4, Hox5, Lox5, Lox4, and Post1 being expressed in the anlagen of the ganglia, the mantle margin, and the foot 37 . Taken together, it appears that Hox genes have been recruited independently into novel expression domains in various conchiferan lineages and have most likely also acquired novel roles at the cost of their original function in anterior-posterior axis specification.
Interestingly, and strikingly different from any conchiferan investigated to date, the aculiferan polyplacophoran Acanthochitona fascicularis shows almost textbook-like spatial (but not temporal) Hox gene expression along the anterior-posterior axis in the trochophore stage 30,31,38 . Here, Hox genes are not confined to distinct morphological features but instead to well-defined axial territories, thus probably resembling the conserved bilaterian condition 11 . Moreover, the genome of the polyplacophoran Acanthopleura granulata shows evidence of an intact cluster of the 11 Hox genes 57 . This supports the assumption that the last common ancestor of at least Aculifera likewise might have showed staggered and possibly collinear expression of Hox genes, although a final argument cannot be made without data from the aplacophoran clades (Fig. 4b). The deviation from this ancestral condition and deployment into novel roles in the conchiferan lineages might have facilitated the evolution of the immense diversity of body plans in the Conchifera 39 .
Comparative aspects of ParaHox gene expression. ParaHox genes are a set of genes which originated from a duplication of the ancestral metazoan Proto-Hox cluster 58 . They comprise the Cdx, Xlox, and Gsx genes, which are commonly expressed in a staggered fashion in the hindgut, midgut, and foregut, respectively. Thus, it is assumed that ParaHox genes have an ancestral function in digestive tract patterning and regionalization 58,59 . This is supported by staggered expression of these genes along the developing gut in the polychaete annelids Platynereis dumerilii, Alitta virens, and Capitella teleta [60][61][62] . Data from the polyplacophoran Acanthochitona support this notion, with Cdx being expressed in the region of the developing hindgut, but data on other ParaHox genes are still lacking 31 . Once again, the conchiferans deviate from this condition. Although gastropod ParaHox genes appear spatially staggered expressed in both the gut and developing ganglia 27 , Gsx is expressed in the developing nervous system of scaphopods and cephalopods 36 . Interestingly, Dro-Xlox expression in Dreissena resembles more the condition of other bilaterians including deuterostomes, where it is expressed in the midgut, rather than the one of its conchiferan relatives [60][61][62][63][64][65] . Thus, in contrast to the Hox gene condition, our results suggest that in Dreissena, Xlox has retained its ancestral expression domain reminiscent of the last common bilaterian ancestor.

Staggered Hox expression versus body plan regionalization in Mollusca.
The Hox complement of Mollusca comprises 11 genes. These are expressed in a staggered fashion in polyplacophorans, a representative of the aculiferan lineage with a number of putative conserved ancestral traits such as a serially arranged dorsoventral musculature, a nervous system with non-ganglionated longitudinal nerve cords, an unpronounced cephalic region, and an elongated foot that extends more than three quarters along the longitudinal body axis 30 . The non-regionalized anterior-posterior distribution of these morphological traits and the conserved mode of www.nature.com/scientificreports/ staggered Hox gene expression coincides with the situation in the polychaete annelid Platynereis which, despite significant differences with the polyplacophoran body plan such as segmentation and ganglionated nerve cords, likewise shows a largely evenly distributed set of organ systems along its longitudinal axis as well as staggered  www.nature.com/scientificreports/ expression of Hox genes 54 . From this common scheme, conchiferan molluscs have decoupled Hox gene expression from their original position along the anterior-posterior axis in various ways, depending on the degree of body regionalization. As such, scaphopod larvae with a pronounced longitudinal axis with relatively little structural concentration have largely retained the staggered mode of Hox expression, while gastropod veligers and cephalopod embryos with a high degree of organ system concentration, particularly in the anterior region, only show hidden traits of staggered Hox expression 37 . This independent loss (to varying degrees) of staggered Hox gene expression in the scaphopod-gastropod-cephalopod lineages has been replaced by coopted expression of these genes in (sometimes lineage-specific) distinct morphological structures such as the shell, the foot including funnel, the tentacles, or the larval prototroch. The situation in the bivalve Dreissena differs from that of their conchiferan (and polyplacophoran) relatives insofar as Hox genes are detected by in situ hybridization only until the trochophore stage, a larval type with little structural complexity. By the time the complex and regionalized veliger larva starts to establish, Hox gene expression is lost. Thus, the widely overlapping, non-staggered expression domains in the Dreissena trochophore and the low expression level during veliger patterning indicate that these genes might, at best, play a minor role in axis patterning in this bivalve, similar to the gastropods and cephalopods.
In conclusion, the data currently available corroborate the high plasticity of Hox gene expression in Mollusca. The use of functional assays and gene manipulation tools such as RNAi or the CRISPR/Cas9 system could help elucidate the function of individual Hox genes and the molecular basis of the developmental and evolutionary pathways that have resulted in the manifold morphological novelties in conchiferan mollusks.