Conservation of epigenetic regulation by the MLL3/4 tumour suppressor in planarian pluripotent stem cells

Currently, little is known about the evolution of epigenetic regulation in animal stem cells. Here we demonstrate, using the planarian stem cell system to investigate the role of the COMPASS family of MLL3/4 histone methyltransferases that their function as tumor suppressors in mammalian stem cells is conserved over a long evolutionary distance. To investigate the potential conservation of a genome-wide epigenetic regulatory program in animal stem cells, we assess the effects of Mll3/4 loss of function by performing RNA-seq and ChIP-seq on the G2/M planarian stem cell population, part of which contributes to the formation of outgrowths. We find many oncogenes and tumor suppressors among the affected genes that are likely candidates for mediating MLL3/4 tumor suppression function. Our work demonstrates conservation of an important epigenetic regulatory program in animals and highlights the utility of the planarian model system for studying epigenetic regulation.

T he pluripotent adult stem cell population of planarian flatworms is a highly accessible study system to elucidate fundamental aspects of stem cell function 1,2 . These stem cells, collectively known as neoblasts (NBs), bestow these animals with an endless capacity to regenerate all organs and tissues after amputation. Comparisons of stem cell expression profiles and functional data between animals show that some key aspects of stem cell biology are deeply conserved [3][4][5][6][7][8] , while others, like the transcription factors that define pluripotency in mammalian stem cells, appear not to be. Thus, studies of NBs have the potential to inform us about the origins of fundamental stem cell properties that underpin metazoan evolution, such as maintenance of genome stability 9 , self-renewal 7,10 , pluripotency [11][12][13] , differentiation [14][15][16] , and migration 17 . All of these are highly relevant to understanding human disease processes, particularly those leading to cancer.
Currently, very little comparative data exists for the role of epigenetic regulation in animal stem cells. Planarian NBs offer an opportunity to ask whether the cellular and physiological roles of different epigenetic regulators might be conserved between mammalian and other animal stem cells. Additionally, as mutations in many chromatin modifying enzymes are implicated in cancer [18][19][20] , using NBs as a model system may provide fundamental insight into why these mutations lead to cancer, if epigenetic regulatory programs are conserved.
The genome-wide effects of chromatin modifying enzymes make understanding how they contribute to cancer phenotypes very challenging. Complexity in the form of tissue and cell heterogeneity, life history stage and stage of pathology make resolution of epigenetic regulatory cause and effect relationships in vivo very challenging. From this perspective, planarians and their easily accessible NB population may be a very useful model system. The planarian system could be particularly suitable for investigating the early transformative changes in stem cells at the onset of hyperplasia, as the NB identity of all potentially hyperplastic cells is known a priori.
The human MLL proteins are the core members of the highly conserved COMPASS-like (complex of proteins associated with Set1) H3K4 methylase complexes. An extensive research effort has now established the evolutionary history and histone modifying activities of this protein family (Supplementary Figure 1 [21][22][23][24][25][26][27][28][29][30][31][32] ). Perturbation of MLL-mediated H3K4 methylase activity is characteristic of numerous cancer types. While prominent examples include the translocation events widely reported in leukemias involving the Mll1 gene 33,34 , the mutation rate of Mll3 across malignancies of different origin approaches 7%, making Mll3 one of the most commonly mutated genes in cancer 19 . In attempts to model the role of Mll3 in cancer, mice homozygous for a targeted deletion of the Mll3 SET domain were found to succumb to ureter epithelial tumors at high frequency 24 , an effect enhanced in a p53+/− mutational background. Heterozygous deletions of Mll3 in mice also lead to acute myeloid leukemia, as hematopoietic stem cells fail to differentiate correctly and overproliferate, implicating Mll3 in dose-dependent tumor suppression 20 . Recent studies have revealed an increasingly complicated molecular function of MLL3, its closely related paralog MLL4, and their partial Drosophila orthologs-LPT (Lost PHD-fingers of trithorax-related; corresponding to the N-terminus of MLL3/4) and Trr (trithorax-related; corresponding to the C-terminus of MLL3/4) 26 . LPT-Trr/MLL3/4 proteins have a role in transcriptional control via mono-methylating and/or tri-methylating H3K4 at promoters and enhancers 22,23,25,26,30,35 (Supplementary Figure 1).
Links between mutations in Mll3/4, effects on downstream targets of MLL3/4 and human cancers remain to be elucidated. If the role of MLL3/4 in regulating stem cells is conserved in NBs, planarians could provide an informative in vivo system for identifying potential candidate target genes that drive tumor formation. Here we exploit the accessibility of NBs to identify and investigate the role of Mll3/4 orthologs in the planarian Schmidtea mediterranea, and show knockdown leads to NB over proliferation, tissue outgrowths containing proliferative NBs and differentiation defects. Investigating the regulatory effects accompanying this phenotype, we demonstrate mis-regulation of both oncogenes and tumor suppressors, providing a potential explanation for how tumor suppressor function is mediated by MLL3/4.
We performed whole-mount in situ hybridization (WISH) and found that LPT, trr-1 and trr-2 are broadly expressed across many tissues and organs. Gamma irradiation to remove cycling cells in S. mediterranea revealed that the three transcripts are also likely to be expressed in NBs (Fig. 1b). A double fluorescent in situ hybridization (FISH) with the pan-stem cell marker Histone 2B (H2B) confirmed that over 90% of all NBs co-express LPT, trr-1 and trr-2 (Fig. 1c, d). The genes also showed clear expression in the brain, pharynx and other differentiated tissues (Fig. 1b).
Mll3/4 RNAi causes regeneration defects and outgrowths. In order to study the function of planarian Mll3/4, we investigated phenotypes after RNAi-mediated knockdown. Following LPT (RNAi), there was a clear failure to regenerate missing structures, including the eyes and pharynx, with regenerative blastemas smaller than controls (Fig. 2a, b). After 7 days of regeneration we observed that, as well as failure to regenerate missing structures, animals began to form tissue outgrowths (Fig. 2c). Intact (homeostatic) LPT(RNAi) animals also developed outgrowths, but at a lower frequency than regenerates (Fig. 2d).
Following individual knockdown of trr-1 and trr-2, milder differentiation defects were observed compared to LPT(RNAi), with no obvious outgrowths (Supplementary Figure 2b-d), confirming results from an earlier study 27 . However, trr-1/trr-2 double knockdown recapitulated the phenotype of LPT(RNAi), but with higher penetrance and increased severity (Supplementary Figure 2e, f). Functional redundancy between the two trr paralogs likely accounts for the reduced severity after individual knockdown. Double knockdown animals all developed outgrowths and started dying at day 5 post-amputation. Based on these observations, we decided to focus our attention on the LPT (RNAi) phenotype as regeneration defects and the formation of tissue outgrowths were temporally distinct and could be studied consecutively.
A more thorough study of the differentiation properties of LPT (RNAi) animals following amputation showed that the triclad gut structure failed to regenerate secondary and tertiary branches and to extend major anterior and posterior branches (Fig. 2e). Cephalic ganglia (CG) regenerated as smaller structures, the two CG lobes did not join in their anterior ends in LPT(RNAi) animals (Fig. 2f) and the optic chiasm and optic cups were mispatterned and markedly reduced (Fig. 2g, h). We found that 80% of LPT(RNAi) animals did not regenerate any new pharyngeal tissue (Fig. 2i). We interpreted these regenerative defects as being indicative of either a broad failure in stem cell maintenance and/ or differentiation.

MLL3/4 controls differentiation of the epidermis and neurons.
One of the structures most severely affected following loss of LPT function was the brain so we looked at the regeneration of different neuronal subtypes. LPT(RNAi) animals had reduced numbers of GABAergic (Fig. 3a), dopaminergic (Fig. 3b), acetylcholinergic (Fig. 3c) and serotonergic (Fig. 3d)    Epidermal tissue was also affected. Both early (Prog-1 +ve cells) and late (AGAT-1 +ve cells) epidermal progeny were significantly decreased, but not entirely absent, in LPT(RNAi) regenerating animals (Fig. 3e). No such defect was seen in individual trr-1 and trr-2 knockdown animals (Supplementary Figure 3e).
These effects along with defects in pharynx and gut tissues implicate broad NB differentiation defects in LPT(RNAi) animals.
LPT limits normal stem cell proliferation and tissue growth. Aside from impairment of regeneration following LPT(RNAi), the other major phenotype we observed were outgrowths of tissue that appeared at unpredictable positions in regenerating pieces.
Planarian regeneration is characterized by an early burst of increased NB proliferation, 6-12 h after wounding, and a second peak of proliferation, 48 h after amputation 38 . Following LPT (RNAi), we observed significant increases in proliferation at both of these peaks and at 8 days post-amputation, as proliferation failed to return to normal homeostatic levels (Fig. 4a). It was previously reported that Trr-1(RNAi) leads to decreases in  In 8 day-regenerating LPT(RNAi) worms the observed overproliferation is a result of localized clusters of mitotic cells rather than broad increase in proliferation across regenerating animals (Fig. 4b). This is different from previously reported planarian outgrowth phenotypes from our group and others, where hyperplastic stem cells are evenly distributed 39,40 . It seems likely that these mitotic clusters might eventually be responsible for the formation of outgrowths. When we looked at outgrowths, we found mitotic cells usually restricted to mesenchymal tissue, had penetrated into epidermal outgrowths in LPT(RNAi) animals ( Fig. 4c).
In order to understand if ectopically cycling NBs represented the breadth of known stem cell heterogeneity in planarians or only a subset of lineages, we performed FISH for markers of the sigma (collectively pluripotent NBs), zeta (NBs committed to the epidermal lineage) and gamma (NBs committed to the gut lineage) cell populations 41 . We found that all three NB populations are represented in the outgrowths of LPT(RNAi) animals ( Fig. 5a-c). Sigma, zeta, and gamma NBs are not significantly increased in pre-outgrowth LPT(RNAi) animals (Supplementary Figure 4), suggesting that the presence of these cells in outgrowths is not a secondary effect of increased cell number and passive spread of these cell populations, but rather local proliferation at outgrowth sites.
The epidermal progeny, marked by Prog-1 and AGAT-1, were concentrated in the outgrowths of LPT(RNAi) animals, while being in reduced numbers in non-outgrowth tissue (Supplementary Figure 5a). The observed disarray of Prog-1 + and AGAT-1 + cells in outgrowths could be the result of perturbed patterning and polarity of the epidermal layer in LPT(RNAi) animals (Supplementary Figure 5b), as epidermal cells appear to have lost polarity and to be no longer capable of forming a smooth epidermal layer. Furthermore, the average epidermal nuclear size is significantly increased in LPT(RNAi) animals compared to  40 . The epithelial layer in LPT(RNAi) animals also appears less well-defined than that in control animals, with a blurred distinction between epithelium and mesenchyme. Another feature of the LPT(RNAi) phenotype, encountered in a variety of malignancies 42 , are changes in nuclear shape (Supplementary Figure 5d). In summary, LPT controls NB proliferation and restricts stem cells to pre-defined tissue compartments as well as being responsible for the successful differentiation of several lineages. Taken together, our data demonstrate that disturbance of the function of planarian LPT leads to development of both differentiation and proliferation defects (Fig. 6), allowing us to conclude that the function of LPT/trr/Mll3/4 proteins as an epigenetic tumor suppressor function is conserved over a large evolutionary distance.
Transcriptional changes driving proliferation in stem cells. A key insight missing from the literature for Mll3 and Mll4 mutations, is the downstream targets that are mis-regulated in disease states, for example in hematopoietic stem cells that cause leukemia 26 . Given the conserved tumor suppressor function in planarians, we decided to focus on early regeneration when LPT (RNAi) animals that do not yet exhibit any outgrowth phenotype, providing the possibility to describe early regulatory changes that are potentially causal of out growths, rather than consequential. We performed RNA-seq on X1 (G2/M) fluorescence activated cell sorted (FACS) NBs from LPT(RNAi) and GFP(RNAi) planarians at 3 days of regeneration. Our analysis revealed that 540 transcripts are down-regulated (fold change≤−1.5, p < 0.05) and 542 -up-regulated (fold change≥1.5, p < 0.05) in X1 stem cells from LPT(RNAi) animals when compared to controls (Supplementary Data 1).
A recent meta-analysis of all available S. mediterranea RNAseq data allowed classification of all expressed loci in the planarian genome by their relative expression in FACS sorted cell populations representing stem cells, stem cell progeny and differentiated cells 8 . Superimposing the differentially expressed genes following LPT(RNAi) onto a gene expression spectrum reflecting FACS compartments, shows that LPT(RNAi) has a broad effect on gene expression in X1 cells (Fig. 7a), affecting genes normally expressed in different planarian cell types (Fig. 7b).
Analysis of Gene Ontology (GO) terms revealed a clear enrichment for cell cycle and cell division-associated terms in the list of up-regulated genes (Fig. 7c), in agreement with the observed hyper-proliferation in LPT(RNAi) phenotype. The list of down-regulated genes is also enriched for cell cycle-related terms, as well as cell differentiation and metabolism-related processes (Fig. 7c). These findings suggest a broad link between gene expression changes caused by LPT loss of function and NB over-proliferation.
Promoter H3K4 methylation and transcription are correlated. Previous studies tie MLL3/4/LPT-Trr function directly to monoand tri-methylation of H3K4 22-25 and indirectly to trimethylation of H3K27, because the H3K27me3 demethylase UTX is present in the same protein complex 43 . We set out to understand potential epigenetic causes of the transcriptional changes following LPT(RNAi) in planarians. To this end, we performed Chromatin Immunoprecipitation sequencing (ChIPseq) on isolated planarian G2/M stem cells and used Drosophila S2 cells as a spike-in control to normalize for any technical differences between samples 44 . The profile of H3K4me3, H3K4me1 and H3K27me3 at the transcriptional start sites (TSSs) of genes in control X1 cells was in agreement with their conserved roles in transcriptional control 8,25 (Fig. 8a, Supplementary Figure 6), suggesting out methodology was robust. LPT(RNAi) led to only subtle changes in the overall level of H3K4me3 and H3K4me1 at TSSs throughout the genome. A decrease in H3K4me1 and H3K4me3 was apparent proximal to the TSS in genes where expression is normally enriched in differentiated cells (Fig. 8a). Concomitant with this, we also observed an increase in H3K4me1 signal downstream of the predicted TSS for genes in enriched in stem cells (Fig. 8a). For the H3K27me3 mark, no consistent pattern was observed as a result of LPT(RNAi) in any group of genes subdivided by expression profiles (Fig. 8a). We next looked specifically at the promoter histone methylation status of those genes whose transcript levels were affected by  For genes with enriched expression in NBs, we observed a significant inverse correlation between expression following LPT(RNAi) and the amount of TSS-proximal H3K4me1 (Fig. 8b). This indicates that LPT(RNAi) leads to a reduction of this repressive mark at some loci and an upregulation of the cognate transcript expression in stem cells, consistent with the role of H3K4me1 as a repressive mark. For mis-regulated genes not normally enriched in X1 NBs, we observed instead a positive correlation between changes in transcriptional expression (upregulation) and changes in H3K4me3 levels at the TSS and gene bodies (Fig. 8b).
Overall, our data suggest that reductions in H3K4me1 following LPT(RNAi) cause up-regulation of some of the stem cell genes implicated by RNA-seq data from LPT(RNAi) animals.
Our data are consistent with MLL3/4's known role in H3K4 methylation and identify gene expression profiles following LPT (RNAi) that are broadly correlated with the amount of H3K4me1 and H3K4me3 at gene promoters.
Mis-regulation of oncogenes and tumor suppressors. After observing the global changes in expression and histone modification patterns following LPT(RNAi), we wanted to investigate individually mis-regulated genes that could be major contributors to the differentiation and outgrowth phenotypes (Figs. 9, 10). Within our list of up-and down-regulated genes, we saw misregulation of tumor suppressors, oncogenes and developmental genes (Supplementary Data 1). Detailed inspection of changes in promoter patterns of H3K4 methylation revealed example loci   where changes in methylation status were both consistent and inconsistent with changes in transcript levels (Figs. 9a, 10a, Supplementary Figure 7a, Supplementary Figure 9a). For example, we find that the up-regulated expression of the planarian orthologs of the transcription factors Elf5 and pituitary homeobox (pitx) are associated with increased levels of TSS-proximal H3K4me3 signal following LPT(RNAi) (Fig. 9a). Furthermore, up-regulation of some X1-enriched genes, such as pim-2-like, is associated with a decrease in H3K4me1 signal on the TSS, consistent with alleviated repression (Fig. 9a). On the other hand, some transcriptional changes following LPT(RNAi), such as the down-regulation of Ras-responsive element-binding protein 1 (RREBP1), are not correlated with the expected alterations in histone modification patterns on promoters (Fig. 10a). Such examples could potentially represent secondary (not related to histone modifications) or enhancer-dependent changes in the LPT(RNAi) phenotype. In the absence of similar RNA-seq/ChIPseq data in mammals, our data provide an important insight beyond the deep evolutionary conservation of MLL3/4 function. While mis-regulation of well-known oncogenes and tumor suppressors, like p53 (Fig. 10a), would be expected to broadly correlate with mis-regulation of Mll3/4 in cancer gene expression datasets, shared correlative expression changes for some selected genes not previously associated with Mll3/4 loss of function provide some independent evidence of a conserved regulatory program. In the Pomeroy Brain Oncomine dataset 45 , Mll3 and p53 expression levels were both significantly down-regulated (Supplementary Figure 10), supporting the functional link between the two genes established by previous work 24 and emerging in our study. We identified more LPT(RNAi) misregulated genes, which have some known association with growth and/or cancer, but have not been previously implicated in Mll3/4 loss of function phenotypes. An investigation of the Brune 46 and Compagno 47 lymphoma Oncomine datasets demonstrates that these genes are mis-regulated in different cancers in a similar manner to the mis-regulation we observe in planarians stem cells as a consequence of a decreased LPT expression (Supplementary Figure 10).

RNAi
One gene of interest in this data set was the planarian pitx gene ortholog. In planarians, pitx is expressed in the serotonergic neuronal precursor cells 48,49 and is required for their differentiation. Thus, pitx is not directly implicated in planarian stem cell proliferation, but rather in differentiation. Nonetheless, pitx's upregulation was of great interest to us since in human medulloblastomas down-regulation of Mll3 and over-expression of pitx2 are co-occurrences (Pomeroy Brain Oncomine dataset 45 ((www.oncomine.org)) (Fig. 9b). To investigate the cellular basis for pitx overexpression, we performed FISH for this gene in LPT (RNAi) animals. We observed an accumulation of pitx-positive cells in LPT(RNAi) regenerates (Fig. 9c). Given that production of terminally differentiated serotonergic neurons is decreased (Fig. 3d), the increase of pitx-positive cells following LPT(RNAi) marks the accumulation of serotonergic neuronal precursors that fail to differentiate. We conclude that planarian LPT normally regulates pitx-mediated differentiation of serotonergic neurons and that regulation of pitx is a possible example of a conserved feature of MLL3/4 function that is mis-regulated in some cancer types 45 .
Double RNAi validates overexpressed target genes. In the planarian model, up-regulated genes in our data set provided the opportunity to identify genes whose overexpression contributes to the LPT/Mll3/4 loss of function phenotype using double RNAi experiments. In addition to the transcription factor E74-like factor 5 (Elf5), we observed up-regulation of planarian orthologs of other developmental or cancer associated genes. Among them were orthologs of the serine/threonine kinase oncogene pim-2 (two genes called Smed-pim-2 and Smed-pim-2-like), a paralog of the epigenetic regulator Suppressor of zeste (Su(z)12), the transcription factors ETS4 and FoxA1 and an ULK2-like serine/ threonine protein kinase (Fig. 9a, Supplementary Figure 7a). Our ChIP-seq data suggested pim-2, ETS4 and ULK2-like are either indirectly regulated or regulated by LPT at enhancers, while pim-2-like, FoxA1, Su(z)12 paralog and Elf5 appear to be direct targets of LPT as their increased expression was associated with either reduced H3K4me1 or increased H3K4me3 signal at promoters (Fig. 9a, Supplementary Figure 7a).
We decided to focus on these genes because they all have a known role in a wide range of cancers. Overexpression of the cell fate decision determinant Elf5 is a known driving force behind breast cancer progression and metastasis 50 . The PIM family of proteins is involved in the integration of growth and survival signals 51 and their overexpression has been associated with hematological malignancies and solid tumors 52 . Inhibition of PIM2 is a promising avenue for the treatment of multiple myeloma 51 . Su(z)12 is part of the ubiquitous polycomb repressive complex 2 (PRC2) and is overexpressed in numerous cancers, including gastric cancer where Su(z)12 levels were associated with increased metastasis and unfavorable survival prognosis 53 . The ETS family of transcription factors has demonstrated significant involvement in all stages of tumorigenesis 54 , while FoxA1 is known as a "pioneer transcription factor in organogenesis and cancer progression" 55 . Finally, ULK2 is an autophagy regulator overexpressed in prostate cancer cells 56 . This selected panel of genes represents some of the best candidates for major effects amongst those genes with significant up-regulation in expression following LPT(RNAi).
In order to test whether the up-regulated expression of any of these genes is a potentially significant contributor to the LPT (RNAi) outgrowth phenotype, we performed LPT(RNAi) rescue experiments in the form of double RNAi knockdowns (Fig. 9d-f,  Supplementary Figure 7). At 48 h post-amputation, LPT(RNAi) regenerates have a significantly increased NB proliferation (Fig. 4a, b) and so do GFP/LPT(RNAi) double knockdown animals (Fig. 9d, e). Whereas pim-2/LPT(RNAi), ETS4/LPT (RNAi), FoxA1/LPT(RNAi), ULK2-like/LPT(RNAi) and Su(z)12 paralogue/LPT(RNAi) regenerates still have elevated NB proliferation, both pim-2-like/LPT(RNAi) and Elf5/LPT(RNAi) regenerates have a significantly decreased NB proliferation compared to GFP/LPT(RNAi) (Fig. 9d, e and Supplementary Figure 7b). Furthermore, not only did pim-2-like/LPT(RNAi) and Elf5/LPT (RNAi) regenerating animals show improved blastema formation (Supplementary Figure 7c), but also less than half as many animals in these two conditions went on to form outgrowths compared to GFP/LPT(RNAi) (Fig. 9f), demonstrating a rescue of the outgrowth phenotype. Importantly, individual knockdown of Elf5 and pim-2-like did not lead to regenerative, proliferation or outgrowth-related defects (Supplementary Figure 8). These findings suggest that the up-regulation of both pim-2-like and Elf5 is specifically involved in driving the LPT(RNAi) outgrowth phenotype and demonstrate the utility of our data for validating the role of MLL3/4 targets.
Defects caused by transcription factor down-regulation. We chose a selection of genes down regulated by LPT(RNAi) for further investigation to see if they might contribute to the observed LPT(RNAi) phenotype ( Supplementary Figure 9a-d,  Fig. 10a-d). LPT(RNAi) leads to the down-regulation of the tumor suppressor p53, PRDM1-1 and cut-like1 in X1 stem cells (Fig. 10a). These genes' down-regulation is associated with a decrease in H3K4me3 levels on their promoters. On the other hand, RREBP1's down-regulation, does not correlate with the amount of H3K4me3 on its promoter. While the function of p53 in planarians has already been described 57 , the role of PRDM1-1, cut-like1 and RREBP1 remain unexplored in planarian biology. Knockdown of each of these three genes resulted in impaired regeneration, characterized mainly by the inability to form eyes (Fig. 10b). RREBP1(RNAi) resulted in significantly increased proliferation compared to control, while cut-like1(RNAi) and PRDM1-1(RNAi) animals did not show a significant change in mitotic cell numbers (Fig. 10c). All three knockdown conditions showed a decreased number of prog-1-positive epidermal progenitors, but intact smedwi-1-positive cell numbers (Fig. 10d). These findings suggest that the observed differentiation phenotypes are not associated with an inability to maintain the stem cell pool following knockdown. Instead, stem cells seem to be restricted in their ability to differentiate correctly.
Our data suggest that decreased expression of PRDM1-1, cut-like1 and RREBP1 in LPT(RNAi) animals could be contributing to the differentiation defects seen after perturbation of LPT/Trr/ MLL3/4 function, and that downregulation RREBP1 may be contributing to the early over proliferation of stem cells observed in LPT(RNAi).

Discussion
In mammals, Mll3 and Mll4 have been implicated in different malignancy landscapes 19 , with clear evidence for tumor suppressor roles in mammalian systems 20,24 . However, relatively little is known about how these effects are mediated. Our study demonstrates that loss of function of the planarian LPT (an Mll3/ 4 ortholog) also results in the emergence of an outgrowth phenotype characterized by differentiation and proliferation defects. Our work also shows that LPT, TRR-1, and TRR-2 control differentiation to form the gut, eyes, brain, and pharyngeal cellular lineages, Future work in planarians combining ChIP-seq and RNA-seq will allow closer investigation of these and other epigenetic effects on stem cell differentiation.
We found that clusters of mitotic cells preceded the appearance of outgrowths in LPT(RNAi) regenerating animals, possibly preempting where the outgrowths would subsequently form. The observation of clusters of cells and the formation of outgrowths in some, but not all, RNAi animals is evidence of heterogeneity in stem cell responses to LPT(RNAi). This may reflect the stochastic nature of the broad genome-wide epigenetic changes mediated by MLL3/4 proteins that will lead to variability between cells after knockdown, such that only some NBs cycle out of control and cause outgrowths after the initial proliferative peaks associated with regeneration. A contributory cause to outgrowth formation in addition to proliferation could be failure of NBs to differentiate appropriately and instead continue to cycle at inappropriate positions. We also observed that outgrowth tissue contained different classes of stem cells. Among these stem cells, the presence of sigma NBs, thought to include truly pluripotent stem cells 41 , is of particular significance. When mis-regulated, these cells could share fundamental similarities with cancer stem cells (CSCs), thought to be founder cells in human malignancies 58 . CSCs have been described as one of the main factors in cancer aggressiveness and resistance to treatment 59 . Studying such cells in a simple in vivo stem cell model, provided by the planarian system, should bring further insight into important control mechanisms that are mis-regulated in different cancers. Our work here provides a useful example of this approach.
Our data suggest that LPT regulates expression of genes across cell types, including some genes with enriched expression in stem cells. Genes with significant expression differences following LPT (RNAi) were mostly associated with cell proliferation, differentiation, and metabolic processes. A subset of mis-regulated genes where RNA-seq and ChIP-seq data correlate is likely a direct consequence of LPT(RNAi) affecting promoter histone methylation status. Genes with altered expression where there is no such correlation, may represent indirect (secondary) changes or, alternatively, may have enhancers that have altered histone modifications as a result of LPT(RNAi). Future work will develop the use of planarians as a model of epigenetic gene regulation and it should also be possible to study enhancer function and evolution.
Among mis-regulated genes, we saw many tumor suppressors with reduced expression and oncogenes with increased expression. We also found a number of genes, including the transcription factor pitx, which were similarly mis-regulated in LPT (RNAi) planarians and human cancers with reduced Mll3 expression. Together, these data suggest that, as well as physiological function in controlling stem cell proliferation, there may be deep regulatory conservation of MLL3/4 function in animal stem cells. One advantage of our approach is that we were able to sample expression and histone states in NBs at an early time point before tumors formed. This could be an advantage for the identifying targets that act early to drive hyperplasia, rather than later secondary regulatory changes.
As proof of principle that genes mis-regulated by LPT(RNAi) directly contributed to the phenotype, we performed double RNAi experiments. Planarian homologs of the oncogene pim-2, called Smed-pim-2 and Smed-pim-2-like, that were overexpressed in stem cells following LPT(RNAi), were chosen as likely candidates, based on previous data on the roles of these genes from mammals 60 . We found that double RNAi with pim-2-like, was able to ameliorate LPT loss of function over-proliferation and outgrowth phenotypes induced by LPT(RNAi). In addition, double knockdown with the breast cancer oncogene Elf5 50 resulted in an even more dramatic rescue of the LPT loss of function phenotype. This provides strong support for the hypothesis that the over-expression of these two genes was significant in driving stem cell hyperplasia. Future work can now study how these genes function in stem cells and why overexpression leads to over proliferation.
We also identified downstream candidates that could be contributing to the lack of differentiation phenotype following LPT (RNAi). Knockdown of PRDM1-1, cut-like and RREBP-1 (downregulated in LPT(RNAi) animals) indicated their mis-regulation might contribute to the decreased epidermal differentiation observed following LPT knockdown.
Together these experiments, demonstrate the value of our approach for identifying potential downstream targets and implicate regulatory interactions driving the Mll3/4 loss of function phenotype. These targets can now be tested for conservation in mammalian experimental systems.
Overall, our work shows how perturbation of a conserved physiological role of LPT leads to mis-regulation of genes wellknown to control cell proliferation, causing hyperplasia and tumors in planarians. We find other genes that are misregulated in planarians that are also similarly mis-regulated in cancer expression studies that have reduced Mll3 expression. Some of these, like pitx, may represent deeply conserved regulatory interactions. In the absence of similar RNA-seq/ChIPseq data in mammals, our data provide an important insight into Mll3/4 loss of function, as well as revealing a deep evolutionary conservation in animal stem cells. These findings demonstrate the strength of the planarian system for understanding fundamental animal stem cell biology highly relevant to cancer and the potential for investigation of epigenetic mechanisms in stem cells.

Methods
Animal husbandry. Asexual freshwater planarians of the species S. mediterranea were used. The culture was maintained in 1x Montjuic salts water 61 . Planarians were fed organic calf liver once a week. After every feeding, the water was changed. Planarians were starved for 7 days prior to each experiment. They were also starved throughout the duration of each experiment.
RNAi. Double-stranded RNA (dsRNA) was synthesized from DNA fragments cloned in pCRII (Invitrogen) or pGEM-T Easy (Promega) vectors. T7 (Roche) and SP6 (NEB) RNA polymerases were used for transcription of each strand. The two transcription reactions were combined upon ethanol precipitation. RNA was denatured at 68°C and re-annealed at 37°C. Quantification was performed on a 1% agarose gel and Nanodrop spectrophotometer.
For single RNAi experiments a working concentration of 2 μg/μl was used. For double RNAi, each gene's RNA was at a concentration 4 μg/μl, resulting in solution concentration of 2 μg/μl.
DsRNA was delivered via microinjection using Nanoject II apparatus (Drummond Scientific) with 3.5″ Drummond Scientific (Harvard Apparatus) glass capillaries pulled into fine needles on a Flaming/Brown Micropipette Puller (Patterson Scientific). Each animal received around 100 nl dsRNA each day. H2B (RNAi) was performed for three consecutive days, as per Solana et al. 7 protocol. For single and double LPT, trr-1 and trr-2 knockdown, a course of 7 days of microinjections was performed (3 consecutive days + 2 days rest + 4 consecutive days). Set1(RNAi) and utx(RNAi) were performed for 4 consecutive days. For all other single and double knockdowns, a course of 10 days of microinjections was performed (3 consecutive days + 4 days rest + 3 consecutive days).
Primers used for amplification of DNA for dsRNA synthesis can be found in Supplementary Table 1.
In situ hybridization. RNA probes labeled with digoxigenin and fluorescein were generated via anti-sense transcription of DNA cloned in PCRII (Invitrogen) or PGemTEasy (Promega) vector. In situ hybridization was performed after fixation in 4% paraformaldehyde 62 for fluorescent experiments. For LPT, trr-1, trr-2, sigma, zeta, and gamma fluorescent in situ procedures, a pooled probes method was used using PCR products from across transcriptional units as templates for RNA probes 41 . Colorimetric in situ hybridization procedures were performed using alkaline phosphatase based detection of hybridized RNA probes 63 . Primers used for amplification of DNA for RNA probe synthesis can be found in (Supplementary  Table 1).
Imaging and image analysis. Colorimetric images were taken on Zeiss Discovery V8 (Carl Zeiss) microscope with a Canon EOS 600D or Canon EOS 1200D camera. Fluorescent images were taken on either Inverted Olympus FV1000 or FV1200 Confocal microscope. Cells were counted via Adobe Photoshop CS6 or FIJI software and the count was normalized to image area in mm 2 .
Flow cytometry. A simple FACS protocol 8,65 was used on disassociated planarians cells using hoechst, calcein and propidium iodide to sort living cells in G2/M of the cell cycle. A FACS Aria III machine equipped with a violet laser was used for the sort. BD FACSDiva and FlowJo software was used for analysis and gate-setting.
ChIP-seq. In total 600,000-700,000 planarian G2/M cells were FACS-sorted (using 3-day knockdown regenerates) in PBS and pelleted at 4°C. During the pelleting, S2 cells were added (corresponding to roughly 15% of the number of planarian ×1 cells) for the purpose of downstream data normalization 44 . Samples were then used to prepare Illumina sequencing libraries 9 . The process is summarized in Supplementary Figure 12. The libraries were sequenced on an Illumina NextSeq machine. Three biological replicates were prepared. The raw reads are available in the NCBI Short Read Archive (PRJNA338116).
RNA-seq. In total 300,000 G2/M cells were FACS-sorted in RNALater (Ambion) from knockdown and control RNAi animals at 3 days of regeneration. Cells were pelleted at 4°C and Trizol-based total RNA extraction was performed. The amount of total RNA used for each library preparation was 0.8-1 μg. Illumina TruSeq Stranded mRNA LT kit was used for library preparation according to the manufacturers instructions. Libraries were quantified with Qubit, Agilent Bioanalyzer and KAPA Library Quantification qPCR kit. Samples were sequenced on an Illumina NextSeq machine. Two biological replicates were prepared. The raw reads are available in the Short Read Archive (PRJNA338115).
ChIP-seq data analysis. ChIP-seq reads were trimmed with Trimmomatic 0.32 66 and aligned to the S. mediterranea SmedGD asexual genome 1.1 60 and D. melanogaster genome r6.10 67 with BWA mem 0.7.12. Picard tools 1.115 was used to remove read duplicates after mapping. Python scripts were used to filter and separate out read pairs belonging to either genome. ChIP-seq coverage tracks were then generated and normalized according to Orlando et al. in order to account for any technical variation between samples 44 . For more in-depth methods, including code, refer to Supplementary Software.
RNA-seq data analysis. Raw reads were trimmed with Trimmomatic 0.32 66 and pseudo-aligned to a set of asexual genome annotations 8 with Kallisto 0.42 68 . Differential expression was subsequently performed with Sleuth 0.28.1 69 . For more indepth methods, including code, refer to Supplementary Software.
Statistical methods. Wherever cell number was compared between experimental condition and control, a two-tailed t-test assuming unequal variance was used. Each legend states the number of specimens per condition, where relevant. Bar graphs show the mean average and the error bars are always Standard Error of the Mean.
For analysis of RNA-seq data, Wald's test (as part of the Sleuth 69 software) was used for assessing differential expression. Spearman's rank correlation was used for assessing the correlation between RNA-seq and ChIP-seq data. Hypergeometric tests were used for assessing gene enrichment in the RNA-seq data.
Code Availability. Code used in the analysis of data are presented in Supplementary Software.