Spatiotemporal patterning of EpCAM is important for murine embryonic endo- and mesodermal differentiation

Epithelial cell adhesion molecule EpCAM is expressed in pluripotent embryonic stem cells (ESC) in vitro, but is repressed in differentiated cells, except epithelia and carcinomas. Molecular functions of EpCAM, possibly imposing such repression, were primarily studied in malignant cells and might not apply to non-pathologic differentiation. Here, we comprehensively describe timing and rationale for EpCAM regulation in early murine gastrulation and ESC differentiation using single cell RNA-sequencing datasets, in vivo and in vitro models including CRISPR-Cas9-engineered ESC-mutants. We demonstrate expression of EpCAM in inner cell mass, epiblast, primitive/visceral endoderm, and strict repression in the most primitive, nascent Flk1+ mesoderm progenitors at E7.0. Selective expression of EpCAM was confirmed at mid-gestation and perinatal stages. The rationale for strict patterning was studied in ESC differentiation. Gain/loss-of-function demonstrated supportive functions of EpCAM in achieving full pluripotency and guided endodermal differentiation, but repressive functions in mesodermal differentiation as exemplified with cardiomyocyte formation. We further identified embryonic Ras (ERas) as a novel interactor of EpCAM and an EpCAM/ERas/AKT axis that is instrumental in differentiation regulation. Hence, spatiotemporal patterning of EpCAM at the onset of gastrulation, resulting in early segregation of interdependent EpCAM+ endodermal and EpCAM−/vimentin+ mesodermal clusters, represents a novel regulatory feature during ESC differentiation.


Results
EpCAM expression in early murine gastrulation. To gain insight into the regulation pattern of EpCAM during early embryonic development, we re-analyzed two previously published single cell RNA-sequencing datasets of mouse embryos 32,33, spanning developmental time from blastocyst (E3.5) to head fold (E7.75) and including several embryonic and extra-embryonic lineages (see Fig. 1).
EpCAM mRNA expression was highest in cells of the inner cell mass (ICM), which comprises ESC, was slightly reduced but remained high in the epiblast at E4.5, E5.5 and E6.5, and was sustained at high levels throughout early endodermal differentiation, including primitive endoderm (E4.5) and various stages of visceral endoderm (E6.5-6.75), and in the primitive streak (Fig. 1a).
EpCAM was co-expressed with the endodermal marker Foxa2 in cells of the primitive (E4.5) and visceral (E5.5-6.5) endoderm, and in cells of the forming primitive streak (E6.5) ( Fig. 1a and Supplementary Figure 1). Foxa2 mRNA expression was absent in the inner cell mass and early epiblast cells, and was increased in single cells of E6.5 epiblast. Expression of the mesodermal and ectodermal markers vimentin and nestin, respectively, was detectable but low in all stages of early embryonic development. Co-expression of EpCAM, vimentin, and nestin was mostly detected in cells of the forming primitive streak, where vimentin and nestin were expressed at comparably low levels ( Fig. 1a and Supplementary Figure 1).
Next, expression of EpCAM mRNA was analyzed in a second dataset including single cells isolated from the epiblast (E6.5), and from Flk1 + (E7.0), Flk1 + /CD41 + and Flk1 − /CD41 + progeny (E7.5-7.75) 33 . As described in the original publication, ten clusters could be identified in this dataset, which comprise several mesodermal lineages and embryonic blood cells (Fig. 1b). EpCAM mRNA expression was high in cells of the extraembryonic ectoderm, the visceral endoderm and the epiblast at E6.5 (Fig. 1b), thus confirming the aforementioned expression pattern. Nascent mesodermal progenitors and posterior mesoderm cells were characterized by a substantial ~100-fold and ~500-fold down-regulation of EpCAM expression, respectively (Fig. 1b). Further differentiated mesodermal cells including endothelium, pharyngeal mesoderm, blood progenitors and blood cells, were largely devoid of EpCAM mRNA expression, except cells of the allantois, which expressed residual levels comparable to nascent mesodermal progenitors (Fig. 1b). EpCAM was co-expressed with Foxa2 in cells of the visceral endoderm ( Fig. 1b and Supplementary Figure 1), while the expression of mesodermal marker vimentin displayed a complementary pattern of expression with respect to EpCAM (Fig. 1b). Indeed, vimentin expression was expectedly induced in nascent mesodermal progenitors and was sustained in their progeny. Ectodermal marker nestin was expressed at low levels in all differentiation stages analyzed and, thus, showed little co-expression with EpCAM ( Fig. 1b and Supplementary Figure 1).
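The fold changes quoted above can be related to the scale of the box-whisker plots in Fig. 1, which show log10-normalized counts with a pseudocount of 1. The following is a minimal sketch of that relationship, not the analysis code used for the datasets; the function names and the example count values are hypothetical and serve only as an illustration.

```python
import math


def log10_norm(count: float, pseudocount: float = 1.0) -> float:
    """Return log10(count + pseudocount), the scale used in the plots."""
    return math.log10(count + pseudocount)


def fold_change(reference: float, sample: float) -> float:
    """Fold down-regulation of `sample` relative to `reference` (linear scale)."""
    return reference / sample


# Hypothetical normalized counts (illustrative only, not taken from the datasets).
epiblast = 1000.0
nascent_mesoderm = 10.0    # about 100-fold lower than the epiblast value
posterior_mesoderm = 2.0   # about 500-fold lower than the epiblast value

for name, value in [("nascent mesoderm", nascent_mesoderm),
                    ("posterior mesoderm", posterior_mesoderm)]:
    fc = fold_change(epiblast, value)
    shift = log10_norm(epiblast) - log10_norm(value)
    print(f"{name}: {fc:.0f}-fold lower, {shift:.2f} log10 units on the plot")
```

Because of the pseudocount, differences at low counts are compressed: a 100-fold change on the linear scale appears as slightly less than 2 log10 units on the plot.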
Overall, EpCAM expression was associated with pluripotent and endodermal cells throughout early embryonic development, and was negatively correlated with mesodermal marker vimentin owing to an early repression in nascent mesoderm.
EpCAM patterning from mid-gestation to perinatal embryonic development. Based on RNA-sequencing data demonstrating an early patterning of EpCAM expression in endo-versus mesodermal cells, we analyzed the expression of EpCAM protein in C57BL-6 mouse embryos from mid-gestation (E9. 5 Figure 2). Similar expression of EpCAM in endodermal structures (e.g. lungs, colon epithelia), but a lack in mesodermal structures (e.g. heart) was observed at later embryonic E12.5 and E18.5 stages (Fig. 2a,b and Supplementary Figure 2). Selective expression of EpCAM was associated with the endodermal marker Foxa2 in the pharyngeal area, and primitive gut, while both were lacking in vimentin + heart primordium. EpCAM, but not Foxa2, was expressed in the embryonic periderm, which derives from surface ectoderm (itself a derivative of the EpCAM + epiblast) (Fig. 2c). At perinatal stage E18.5, EpCAM was present in epidermis, hair follicles, epithelial lining and root of tongue, larynx, pharynx, auditory channel, salivary glands, liver, lung, epithelial cells of kidney tubules, and colon (Supplementary Figure 2c).
Hence, EpCAM protein expression from mid-gestation to perinatal embryos confirmed the RNA-sequencing expression data of early embryonic developmental stages. In particular, EpCAM displayed a partially overlapping expression pattern with Foxa2 in endodermal tissues, but was mutually exclusive with vimentin in mesoderm.
EpCAM patterning in early 3D-differentiation of ESC. We made use of a hanging-drop 3D-differentiation model to generate embryoid bodies (EB) from E14TG2α ESC (Fig. 3a,b), which closely mimics embryogenesis in vitro and allows genetic manipulations 34 .
Down-regulation of cell surface expression of EpCAM and pluripotency marker SSEA-1 by more than 90% was observed in differentiated EB (day 21) compared to pluripotent ESC (Fig. 3c,d). Loss of EpCAM mRNA by 90% (Fig. 3e) was progressive and slightly delayed compared to core reprogramming factor Oct3/4 (Fig. 3f). Chromatin immunoprecipitation (ChIP) experiments displayed enrichment of polymerase 2 (Pol II) and activating trimethylation of histone 3 at lysine 4 (H3K4) at two sites within the promoter of the murine EPCAM gene in pluripotent ESC (Fig. 3g,i). Control amplifications at the CenpI locus did not show any enrichment for Pol II, H3K4, or H3K27 and input controls revealed comparable levels (Fig. 3h,i). Upon differentiation, Pol II binding and H3K4 trimethylation progressively decreased, while inhibitory trimethylation of histone 3 at lysine 27 (H3K27) increased (Fig. 3i).

[Figure 1 legend] Transcript levels of EpCAM, Foxa2, vimentin and nestin were assessed from RNA-sequencing datasets generated from single cells at day E3.5-6.5 32 (a) and E6.5-7.75 33 (b) of murine embryos. We show a visualization of the datasets with t-distributed stochastic neighbor embedding 67 (upper left panels), by highlighting the clusters defined in the original publications 32,33. The approximate localization of cells included in each dataset is depicted in schemes for each time point (A: anterior; P: posterior; Pr: proximal; D: distal). Transcript levels are depicted as box-whisker plots with log10 normalized counts (adding a pseudocount of 1), and colored according to cell type (lower panels). See Methods for additional details.

[Figure 2 legend, abbreviations] C colon; E eye; ENT ear nose and throat area; FB fore-brain; H heart; HB hind-brain; HP heart primordium; K kidney; L lung; LB limb bud; LP liver primordium; MB mid-brain; OP otic pit; P periderm; PE pharyngeal epithelium; PG primitive gut; So somite.
Data mining of ChIP results confirmed the Pol II and methylation pattern at the EpCAM promoter in undifferentiated Bruce4 ESC. In heart cells of eight-week-old mice, the EPCAM promoter lacked Pol II and showed weak H3K4 and strong H3K27 trimethylation. Comparable chromatin silencing was observed in liver, brain, and spleen, while kidney and thymus displayed all three marks, reflecting EpCAM heterogeneity in these organs (Supplementary Figure 4). Thus, EB differentiation mimics the expression dynamics of EpCAM observed in early embryos, with a strong reduction of protein expression and of transcription due to epigenetic control.
Early segregation of EpCAM + and EpCAM − cell clusters of differentiating ESC. Next, we analyzed the expression pattern of EpCAM during 3D-differentiation in EB at the cellular level. In differentiating EB of E14TG2α ESC, loss of EpCAM expression and segregation of EpCAM + and EpCAM − clusters from day 3.5 onwards resulted in spatiotemporal patterning of EpCAM (Fig. 4a). Typically, a layered margin of flattened visceral endoderm cells expressed EpCAM from day 4.0 onwards, while progressive loss of EpCAM was observed in the remaining EB, resulting in a majority of cells entirely devoid of EpCAM after day 6.0 of differentiation (Fig. 4a). Over time, EpCAM + cells were further restricted and differentiated to prismatic epithelium characterized by a basolateral expression pattern of EpCAM (Fig. 4b). Similar spatiotemporal patterning was observed in EB generated from Bruce4 ESC, although with a slightly delayed time course, which reflected differences in proliferation rates of both cell lines (Supplementary Figure 3e).
Early in ESC differentiation (EB d4.5), EpCAM partially overlapped with Foxa2 (prominently in cells of the visceral endoderm) and was mutually exclusive with vimentin (Fig. 4c,d). Neither strict negative nor positive correlation was observed between EpCAM and nestin (Fig. 4c). Mutually exclusive expression of EpCAM and vimentin was further confirmed with immunofluorescence double-staining of EB at day 4 and 5 (Fig. 4e).
EpCAM regulation in pluripotency and endodermal differentiation. In the following, we addressed the rationale for the observed association of EpCAM with early endoderm and repression in mesoderm. Endodermal differentiation of ESC was induced upon treatment with basic fibroblast growth factor (bFGF) and retinoic acid (RA) as described 35. Differentiation was confirmed by 80% reduction of Oct3/4 and induction of Foxa2, Gata4, Eomes, and Afp (Fig. 5a). In contrast to a >90% reduction during spontaneous differentiation, EpCAM mRNA levels were enhanced by 2.6-fold compared to pluripotent ESC upon endodermal differentiation (Fig. 5a). Hence, EpCAM expression levels in spontaneous and in guided endodermal differentiation differed by a factor of >25, and EpCAM was tightly co-expressed with Foxa2 and Gata4 in endodermal clusters (Fig. 5b).
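The >25-fold figure follows from the two values given relative to pluripotent ESC: a 2.6-fold increase upon guided endodermal differentiation and a reduction of at least 90% upon spontaneous differentiation. A minimal sketch of this arithmetic is shown below; the function name is hypothetical and the calculation only restates the numbers given in the text.

```python
def fold_difference(induced_fold: float, percent_reduction: float) -> float:
    """Ratio between an induced level and a reduced level.

    Both levels are expressed relative to the same baseline (pluripotent ESC = 1.0).
    """
    remaining_fraction = 1.0 - percent_reduction / 100.0
    return induced_fold / remaining_fraction


# 2.6-fold induction (endodermal) versus 90% reduction (spontaneous differentiation).
ratio = fold_difference(2.6, 90.0)
print(f"Endodermal vs. spontaneous differentiation: about {ratio:.0f}-fold")
```

Since the reduction during spontaneous differentiation exceeded 90%, the remaining fraction is below 0.1 and the ratio is at least 26, consistent with the stated factor of >25.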
In order to assess its contribution to pluripotency and differentiation, EpCAM was ectopically expressed from the CMV early promoter in E14TG2α ESC. The CMV promoter was chosen based on its capacity to drive expression of genes in ESC with a retained ability to be (down)-regulated during differentiation 36. Over-expression of EpCAM at the surface of pluripotent ESC by a factor of 20 was feasible (Fig. 5c and Supplementary Figure 6a). However, similar to endogenous EpCAM protein, ectopic expression was reduced by 90% during spontaneous 3D-differentiation compared to the pluripotent state of over-expression, suggesting a necessity for EpCAM reduction during differentiation. It must be noted that EpCAM transfectants retained EpCAM levels higher than those of wild-type ESC after differentiation, owing to the initial over-expression (Supplementary Figure 6b). Unlike endogenous EpCAM mRNA levels, which were consistently reduced by 90% upon spontaneous differentiation, EpCAM mRNA levels in stable transfectants expressing exogenous EpCAM from the CMV promoter were reduced by 43% (Supplementary Figure 6c), suggesting a combination of transcriptional and post-translational regulation of EpCAM expression. Ectopic EpCAM over-expression had no measurable impact on expression of Oct3/4, Sox2 and Nanog in pluripotent and differentiated E14TG2α cells (Fig. 5e), or on morphology and generation rate of EBs (Supplementary Figure 6d,e). Assessment of selected ecto-, meso- and endodermal markers under pluripotency and at intermediate (d8) and late time points (d21) of spontaneous differentiation disclosed significant induction of hepatocytic markers alpha-fetoprotein (Afp) and fibronectin 1 (Fn1) in EpCAM over-expressing ESC (Fig. 5f).
Next, we addressed the influence of a loss-of-function of EpCAM on pluripotency. CRISPR-Cas9-mediated knockouts of EPCAM were generated as E14TG2α single cell clones and were confirmed through genomic DNA sequencing, protein and cell surface expression. Premature stop codons resulted in theoretical proteins with predicted compositions of 28 to 144 N-terminal amino acids (Supplementary Figure 7a), which led to a complete loss of EpCAM at the cell surface and in lysates (Fig. 5g). EpCAM knockout (n = 6 independent clones) resulted in reduced expression of pluripotency genes Oct3/4, Sox2 and Nanog, with reductions of 20-48%, 30-63%, and 57-75%, respectively (Fig. 5h). The effect of EpCAM knockout on the capacity of ESC to differentiate into endodermal tissue was analyzed following bFGF and RA treatment through the assessment of endodermal markers Foxa2, Afp and Eomes. All knockout clones displayed significant >50% reductions of endodermal markers, except for clone #58, which expressed Afp at similar levels and Foxa2 at enhanced levels compared to wild-type E14TG2α ESC (Fig. 5i). Hence, EpCAM expression is maintained during endodermal differentiation, while EpCAM knockout reduces pluripotency and endodermal differentiation capacity.

EpCAM regulation is mandatory for mesodermal differentiation of ESC to cardiomyocytes.
In vivo, EpCAM was strictly lacking in nascent mesodermal progenitors (see Fig. 1) including the cardiac mesoderm (see Fig. 2). Analysis of perinatal E18.5 cardiomyocytes substantiated a lack of EpCAM, Foxa2 and nestin expression, but expression of vimentin (Fig. 6a). Frequent spontaneous 3D-differentiation of ESC to contractile cardiomyocytes 37 was confirmed within E14TG2α EB (Video 1).
EpCAM expression in EBs closely mimicked the strict regulation observed in vivo, with a pattern mutually exclusive to vimentin and a lack of EpCAM in cardiomyocytes. Therefore, we sought to analyze the effects of exogenous over-expression of EpCAM on mesodermal differentiation to cardiomyocytes. Ectopic over-expression of EpCAM reduced the frequency of contracting EB, a functional surrogate of cardiomyocyte differentiation comparable to foetal cardiomyocyte development 38, from an average of 86.5% in wild-type and 79% in vector controls to 17.6% in EpCAM over-expressing ESC (Fig. 6b). Immunohistochemical staining of contracting EB with antibodies specific for EpCAM, epithelial marker CK8/18 and cardiomyocyte marker alpha-cardiac actin (α-CAA) corroborated the close proximity but segregation of EpCAM+/CK8/18+ epithelial cells and EpCAM−/α-CAA+ cardiomyocytes (Fig. 6c). In non-contracting EB from EpCAM transfectants, EpCAM was expressed more evenly and overlapped with CK8/18 in marginal cells, whereas α-CAA was lacking (Fig. 6c). Figure 8). Furthermore, cardiomyocyte inhibition required full-length, uncleaved EpCAM, since only EpCAM-YFP but none of the products of EpCAM cleavage by regulated intramembrane proteolysis (RIP), i.e. EpICD-YFP and EpCAM-CTF-YFP fusion proteins, inhibited the formation of contractile EB (Fig. 6d).
Thus, EpCAM over-expression inhibits cardiomyocyte formation, but RIP of EpCAM is not the molecular basis for the inhibitory effect.
EpCAM knockout impacts on cardiomyocyte differentiation. Complete loss of EpCAM in mesodermal progenitors is required for cardiomyocyte development, but EpCAM simultaneously impacts on pluripotency and endodermal differentiation. Owing to the interdependency of cells during differentiation, we analyzed possible effects of EpCAM knockout on EB contraction. Three control single cell clones that had undergone CRISPR-Cas9 transfection and selection procedures, but displayed wild-type levels (n = 2) or only a minor decrease in levels of EpCAM (n = 1), expressed pluripotency genes Oct3/4 and Nanog at levels comparable to wild-type cells, thus suggesting full pluripotency (Supplementary Figure 6b-e). None of the control clones was impaired in cardiomyocyte formation as measured through the generation of contracting EBs (Supplementary Figure 6f). In contrast, four out of six E14TG2α EpCAM knockout clones were severely impaired in the formation of contracting EB, with contraction rates dropping to 0.1-12.5% (Fig. 6e). Guided mesodermal differentiation of these four knockout clones after treatment of cells with 30 µM CHIR 99021 and 5 µM cyclopamine for 5 days was associated with reduced levels of brachyury, demonstrating diminished capacity to form mesodermal structures (Fig. 6f).
Physical contact with endodermal cells is reportedly decisive during the generation of cardiomyocytes. Initially, mesodermal progenitors require a Mesp1/Wnt5a-dependent activation of cardiovascular development, which is followed by mandatory reduction of Wnt5a and Mesp1 expression, and subsequent induction of Wnt11 for the completion of cardiomyocyte maturation through the instruction by Sox17+/EpCAM+ endodermal cells 40-42. Spontaneous differentiation of wild-type E14TG2α cells was followed in a time course over ten days, and mRNA expression of EpCAM, Wnt5a, Mesp1 (both early regulators), Gata4, Nkx2.5 (both intermediate regulators), Wnt11 (late regulator) and α-CAA (cardiomyocyte marker) was assessed. Time-dependent expression of these genes confirmed the abovementioned sequence of expression, with a peak of Wnt5a, Mesp1 and Gata4 at day 5 of differentiation, followed by a strong or complete loss of expression of Wnt5a and Mesp1 at day 7, respectively. Expression of Gata4 had decreased to approximately 50% at day 10. Starting at day 5, Wnt11, Nkx2.5 and α-CAA expression increased steadily and peaked at day 10 (Fig. 6g). After spontaneous differentiation at day 10, non-contracting knockout clones were characterized by marginally elevated levels of Wnt5a, strongly up-regulated Mesp1 expression and significantly reduced levels of Wnt11, Gata4, Nkx2.5 and α-CAA (Fig. 6h).
Wnt5a and Mesp1 regulation was further substantiated by their exclusive expression in cells of the primitive streak during early gastrulation at day E6.5 (Supplementary Figure 9). Both regulators of initial cardiomyocyte development were further expressed in nascent mesoderm progenitors at day E7.0, but especially Mesp1 was strongly down-regulated in all Flk1 + mesodermal progeny along differentiation (Supplementary Figure 9). A role for EpCAM + /Gata4 + endodermal cells in cardiomyocyte development was further suggested by their co-expression in primitive and visceral endoderm at early stages of gastrulation (Supplementary Figure 9).
Thus, impaired regulation of EpCAM through genetic knockout impacts on endo- and mesodermal differentiation, which are both required for the formation of contracting cardiomyocytes. Ultimately, EpCAM-deficient ESC only partially progressed through mesodermal differentiation, and appeared blocked at a Mesp1-high stage.
EpCAM regulates ESC differentiation via ERas/AKT. EpCAM cleavage products CTF and EpICD did not limit cardiomyocyte formation, suggesting a role for full-length EpCAM in inhibition. Interacting partners of full-length EpCAM were assessed using a combination of stable isotope labeling with amino acids in cell culture (SILAC), immunoprecipitation of YFP- and EpCAM-YFP in murine F9 teratocarcinoma cells, and identification of co-precipitated proteins by LC-MS/MS. ESC-expressed Ras (ERas), a hyperactive version of the small GTPase Ras 43, was reproducibly identified through immunoprecipitation with EpCAM and subsequent mass spectrometry analyses as interaction partner in three independent experiments. EpCAM and ERas interaction was subsequently validated in independent co-immunoprecipitations of lysates from YFP- and EpCAM-YFP-expressing F9 teratoma cells and E14TG2α ESC (Fig. 7a). Similarly to EpCAM, single cell RNA-sequencing analysis revealed that ERas expression was lost in nascent mesodermal progenitors and later stages of mesodermal differentiation (Fig. 7b). Down-regulation of ERas was confirmed during 3D-differentiation of E14TG2α ESC (Fig. 7c) and expression was predominant in marginal cells of EB, whereas cells of inner areas were deprived of ERas at later differentiation stages (Fig. 7d). In vivo, EpCAM and ERas displayed a high degree of co-regulation that resulted in the simultaneous expression in kidney, hair follicles and lung, but complete loss in brain, heart and bones at day E18.5 of embryonic development (Supplementary Figure 10).

[Figure 5 legend, partial] ... indicated time points (n = 3 independent experiments). (g) Cell surface EpCAM expression was measured by FACS and in cell lysates of wild-type and EPCAM knockout clones under pluripotency conditions (n = 3 independent experiments). (h) Oct3/4, Sox2 and Nanog mRNA expression was measured by quantitative PCR in wild-type and EpCAM knockout clones under pluripotency conditions (n = 3 independent experiments). (i) Foxa2, Afp and Eomes mRNA expression was measured by quantitative PCR in wild-type and EpCAM knockout clones after endodermal differentiation upon treatment with RA and bFGF at day 5 as described in Methods section (n = 3 independent experiments). Mean ± SEM; Student's T-test (n = 2 groups) or One-Way ANOVA (n ≥ 3 groups); *p < 0.05, **p < 0.01, ***p < 0.001.
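The figure legends report values as mean ± SEM from n = 3 independent experiments and name Student's t-test for two groups or one-way ANOVA for three or more groups. The sketch below illustrates how such summary values can be computed and how the choice of test depends on the number of groups. It is an illustration under stated assumptions, not the analysis pipeline of this study: the function names and the replicate values are hypothetical, and the SEM is taken as the sample standard deviation divided by the square root of n.

```python
import math
import statistics


def mean_sem(values):
    """Return (mean, SEM), with SEM = sample standard deviation / sqrt(n)."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two replicates are needed for an SEM")
    mean = statistics.fmean(values)
    sem = statistics.stdev(values) / math.sqrt(n)
    return mean, sem


def choose_test(n_groups: int) -> str:
    """Name the test indicated in the legends for a given number of groups."""
    if n_groups < 2:
        raise ValueError("a comparison needs at least two groups")
    return "Student's t-test" if n_groups == 2 else "one-way ANOVA"


# Hypothetical relative expression values from three independent experiments.
replicates = [0.9, 1.0, 1.1]
mean, sem = mean_sem(replicates)
print(f"{mean:.2f} ± {sem:.3f} (n = {len(replicates)})")
print(choose_test(2), "/", choose_test(7))
```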
ERas signaling primarily induces the PI3-kinase/AKT branch 43. Accordingly, over-expression of EpCAM in E14TG2α ESC induced an average 2.2-fold increase in AKT phosphorylation at serine 473 and hyper-activated AKT after insulin-like growth factor treatment (Fig. 7e,f). Conversely, knockout of EpCAM in ESC reduced phosphorylation of AKT by 72.5% on average (Fig. 7g). Next, FLAG-tagged ERas and a constitutively active, myristoylated variant of AKT (myrAKT) were expressed in ESC (Supplementary Figure 7g,h). Both FLAG-ERas and myrAKT significantly reduced percentages of contraction by 30% compared to wild-type and vector controls (Fig. 7h).
In order to assess whether hyperactive ERas could compensate for the differentiation defects observed upon loss-of-function of EpCAM, EpCAM knockout clones #58 and #118, which displayed a retained contraction capacity, were subjected to CRISPR-Cas9-mediated knockout of ERas. Two EpCAM−/ERas− double-knockout clones derived from each of clones #58 and #118 were further analyzed (n = 4). Genomic DNA analysis proved genetic deletions in the ERAS locus, and double-knockout clones lacked ERas protein (Supplementary Figure 7i,j). Double knockout of EpCAM and ERas resulted in complete or substantial impairment of the formation of contracting EB (0-54%) as compared with single-knockout clones #58 and #118 (94% and 93%) (Fig. 7i). Impaired cardiomyocyte formation in double-knockout clones was further accompanied by reduced Gata4 expression (data not shown). Hence, EpCAM/ERas/AKT compose a regulatory signaling axis in ESC and ERas can partially compensate for EpCAM knockout during differentiation.

Discussion
In the present study, we have comprehensively addressed timing and rationale for the strict differential regulation of EpCAM throughout development. RNA-seq datasets 32,33, which we have re-analyzed, first provided a high resolution of the precise timing of EpCAM regulation from early- to mid-gastrulation embryos at the single cell level. Throughout blastocyst and initiating gastrulation stages (E3.5-6.5), EpCAM was retained at high levels in cells of the inner cell mass, primitive and visceral endoderm, epiblast, and primitive streak. Co-expression of EpCAM with endodermal transcription factor Foxa2 was observed in primitive and visceral endoderm, as well as in the primitive streak, although to a lower degree. This co-expression was further substantiated in visceral endoderm and endodermal clusters in later stage embryos and in ESC-derived EB at early time points of 3D-differentiation, when formation of visceral endoderm was reported 44. High expression of EpCAM was retained in epiblast, visceral endoderm and extraembryonic ectoderm at E6.5. At this time point, co-expression of EpCAM with mesodermal marker vimentin was mostly restricted to the primitive streak, where significant but low levels of vimentin expression emerged within the cells that will ultimately undergo gastrulation and generate all three germ layers. Genetic knockout of murine EPCAM in ESC performed in the present study demonstrated that expression in ESC is required for full pluripotency to form cells of all germ layers, which is in accordance with its reported role in human and murine ESC and porcine iPS 3-5,21. Knockout clones further displayed reduced capacity for guided endodermal and mesodermal differentiation, suggesting a requirement for EpCAM function(s) for the generation of primitive and visceral endoderm, and support of mesodermal lineages.
Notably, extensive loss of EpCAM expression was observed in nascent mesodermal progenitors at E7.0. Later stages of mesodermal differentiation were all characterized by complete loss of EpCAM and progressive increase of vimentin. Complete loss of EpCAM was observed in posterior mesoderm cells, demonstrating a very strict and timely down-regulation in the earliest Flk1+ mesodermal lineages as well as in later blood progenitors and embryonic blood cells. Continuation of a strict regulation of EpCAM was verified in mid-gestation and perinatal embryos. Retention in endodermal tissue, but complete loss of EpCAM in meso- and ectodermal tissues was consistent, resulting in segregation of EpCAM+ and EpCAM− cell clusters at E9.5, E12.5 and E18.5. Thus, repression of EpCAM in mesodermal progenitors and retention in endodermal progenitors represents an early event during gastrulation of mouse embryos.
The rationale for this repression of EpCAM in differentiation was thereafter analyzed in a 3D-differentiation model of murine ESC, which mimics aspects of early embryogenesis and allows for genetic manipulations 45. ESC-derived EB represent a valuable approximation of an embryo-like architecture composed of an external primitive/visceral endoderm and internal meso- and ectoderm lineages 34,46, which includes the formation of an anteroposterior axis and a primitive streak based on Wnt signaling 47,48. Spatiotemporal patterning, with segregation but close proximity of EpCAM+ and EpCAM− clusters, was reproduced in EB. Segregation of EpCAM+ and EpCAM− clusters might result from a direct effect of EpCAM on cell adhesion 49 or via its negative impact on E-cadherin-mediated cell-cell contacts 50, which is a central cell adhesion molecule during asymmetric segregation of murine epithelial cells 51 and during gastrulation of zebrafish through a Wnt11-dependent regulation of cohesion of early meso-endoderm 52.
Gain- and loss-of-function manipulations in the present study disclosed that selective expression of EpCAM supports an interdependent differentiation along EpCAM+ endodermal and EpCAM− mesodermal lineages. In the frame of a guided endodermal differentiation protocol of ESC and during spontaneous formation of endodermal clusters, EpCAM was maintained, whereas complete repression of EpCAM was necessary in the mesodermal lineage in order to differentiate to cardiomyocytes (Fig. 8). EpCAM expression was supportive of the complete expression of Gata4, Foxa2 and Afp in ESC-derived endodermal cells. This finding is congruent with a central role of Gata4 in hepatocyte induction 53, and a necessary physical contact of developing cardiomyocytes with endodermal cells 54, more precisely with Gata4-producing Sox17+/EpCAM+ visceral endoderm or hepatocyte-like oval cells, to instruct cardiomyocyte differentiation 40,55. Sox17+/EpCAM+ visceral endodermal cells themselves progress to hepatocyte progenitors 40, further substantiating an interdependence of EpCAM+ and EpCAM− cells in differentiation. Accordingly, high expression of EpCAM is a feature of human hepatocytic stem cells 56,57 and EpCAM is a de-repressor of the Wnt signaling cascade that is required to license hepatic development in zebrafish 58. Knockout of EpCAM in zebrafish induces defective liver development 58, whereas forced expression of EpCAM in ESC fostered transcription of hepatocytic markers Afp and Fn1, as shown here.

[Figure 6 legend, partial] ... and cyclopamine at day 5 (n = 3 independent experiments). (g) Wnt5a, Mesp1, Wnt11, Gata4, Nkx2.5 and α-CAA mRNA expression was measured by quantitative PCR in wild-type E14TG2α ESC after spontaneous differentiation in a kinetic at the indicated time points (n = 3 independent experiments). (h) Wnt5a, Mesp1, Wnt11, Gata4, Nkx2.5 and α-CAA mRNA expression measured by quantitative PCR in wild-type and EpCAM knockout E14TG2α ESC clones after spontaneous differentiation (D10) (n = 3 independent experiments). Markers in (g-h) are all color-coded. Mean ± SEM; Student's T-test (n = 2 groups) or One-Way ANOVA (n ≥ 3 groups); *p < 0.05, **p < 0.01, ***p < 0.001.

[Figure 7 legend, partial] EpCAM/ERas/AKT axis limits cardiomyocyte formation. (a) EpCAM-YFP and endogenous ERas interact in F9 teratoma cells and E14TG2α ESC transfectants expressing EpCAM-YFP or YFP. EpCAM-YFP and YFP were immunoprecipitated from stable F9 and E14TG2α ESC transfectants. Co-precipitated endogenous ERas was detected in immunoblots with specific antibodies. Immunoprecipitation (IP); immunoblot (IB) (n = 3 independent experiments). (b) Transcript levels of ERas were assessed from RNA-sequencing datasets generated from single cells at day E3.5-6.5 32 (left) and E6.5-7.75 33 (right). Transcript levels are depicted as box-whisker plots with log10 normalised counts (adding a pseudocount of 1). See Fig. 1 and Methods for the approximate localization of cell types and for additional details on the datasets. (c) EpCAM and ERas mRNA expression was measured by quantitative PCR in pluripotent E14TG2α ESCs (d0) and EB (d2-14) (n = 3 independent experiments). (d) EpCAM and ERas protein expression was assessed by immunohistochemistry in E14TG2α ESC embryoid bodies (EB d2, d14)

Transcription factor Mesp1 is central to human cardiovascular development 42 and is required for cardiomyocyte development through the regulation of cardiac mesoderm at E6.5, leading to the formation of the first heart field and, subsequently, the heart tube 42,59,60. In the process of migration to form the heart crescent, cardiomyocyte progenitors down-regulate Mesp1 to further promote the generation of mature cardiomyocytes 61. EpCAM knockout clones displayed strongly enhanced levels of Mesp1 and significantly reduced levels of Wnt11 at day 10, a time point when cardiomyocyte differentiation is completed in EB and Mesp1 down-regulation is accomplished. From these knockout studies, we conclude that EpCAM-deficient clones are blocked at a Mesp1+ stage along the differentiation into cardiomyocytes. The resulting reduction of Wnt11 levels will potentially impact on meso-endoderm cohesion 52, and could ultimately impede mesodermal differentiation.
Promoting and restricting effects of EpCAM on differentiation were connected to its interaction with the hyperactive Ras GTPase ERas, which is central to oncogenic and proliferative features of ESC 43,62. In line with the ability of ERas to induce activation of AKT, expression of EpCAM fostered the activating phosphorylation of AKT at serine 473. Over-expression of ERas or an activated version of AKT mimicked the limiting effects of EpCAM on cardiomyocyte formation, although less potently than EpCAM. This is congruent with a reported loss of ERas in E7.5 embryos to facilitate primitive streak and mesoderm generation, but retention in endoderm of the same gestation stage 63. Regulation of ERas was even stricter than that of EpCAM, with a total loss already at the stage of nascent mesodermal progenitors. EpCAM knockout clones with retained capability of cardiomyogenesis were severely impaired after additional ERas knockout. This suggests that ERas can compensate for the loss of EpCAM to support the formation of Gata4+ endodermal cells, which are required for cell non-autonomous inductive signals to cardiomyocyte progenitors (Fig. 8). Besides galectin-1, EpCAM is the second membrane receptor described to interact with ERas 62. Binding of ERas to EpCAM might facilitate initiation of signaling and recruitment of downstream molecules to ERas and, thereby, orchestrate signaling to AKT and further downstream targets. Similar roles of EpCAM and ERas are further substantiated through comparable effects during somatic reprogramming of fibroblasts to induced pluripotent stem cells 20,22,64. Thus, EpCAM/ERas/AKT represents a novel signaling axis in ESC that participates in the regulation of endodermal cells and limits mesodermal differentiation to cardiomyocytes.
In summary, EpCAM expression is tightly regulated at the earliest time points of gastrulation in order to achieve a mandatory spatiotemporal cellular heterogeneity of EpCAM in endo- and mesodermal lineages. Cell-autonomous roles of EpCAM emerge as a licensing factor for endoderm and as a limiting factor of cardiomyocyte development. Simultaneously, cell non-autonomous functions of EpCAM are likely in effect, such that EpCAM− mesodermal cells depend on EpCAM+ Gata4-producing cells during the regulatory interplay of cardiomyocyte development (Fig. 8). Disturbance of this tight control of EpCAM results in perturbation of spontaneous endo- and mesodermal differentiation.

Methods
Biological and technical replicates. Throughout the manuscript, a biological replicate refers to a fully independent experiment performed with newly generated material (e.g. cell lysates, mRNA, etc.), whereas a technical replicate (e.g. mentioned as "duplicates") is a repeated measurement with identical material. Technical repeats address assay accuracy and reproducibility (assay noise), while biological repeats demonstrate reproducibility of assay outcome/results 65.
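The distinction between the two replicate types can be illustrated with a minimal Python sketch. It assumes, as one common convention that the text above does not prescribe, that technical repeats are averaged within each independent experiment and that mean and SEM are then computed across the biological replicates only. The function name and all numbers are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def summarize(biological_replicates):
    """Return (mean, SEM) across biological replicates.

    biological_replicates: list of lists; each inner list holds the technical
    measurements (e.g. duplicates) from one independent experiment.
    Assumption: technical repeats are collapsed by averaging before the
    summary statistics are computed.
    """
    # Technical repeats reflect assay noise, so they are averaged first.
    per_experiment = [mean(tech) for tech in biological_replicates]
    n = len(per_experiment)
    if n < 2:
        raise ValueError("SEM needs at least two biological replicates")
    # SEM = sample standard deviation / sqrt(number of biological replicates)
    sem = stdev(per_experiment) / sqrt(n)
    return mean(per_experiment), sem


# Hypothetical values: three independent experiments measured in duplicate.
example = [[10.0, 12.0], [14.0, 16.0], [18.0, 20.0]]
avg, sem = summarize(example)
```

With this layout, n in "mean ± SEM" counts independent experiments rather than individual measurements.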
Knockout of EpCAM and ERas was conducted using the CRISPR-Cas9-based system (Sigma Aldrich, Munich, Germany). Two guide RNAs (gRNAs) located in exons 2 and 4 of the EPCAM gene and two gRNAs located in exon 1 of ERAS were used in the all-in-one Cas9 and guide RNA expression plasmids. After transfection, plasmid-positive E14TG2α cells were sorted according to GFP expression and deposited as single cells into 96-well plates. Single-cell clones were analyzed through flow cytometry and immunoblotting. Genetic knockout was confirmed by sequencing the corresponding genomic DNA flanking and encompassing the gRNA sequences.

Embryoid body formation and contraction.
To generate EB, 500 cells in 20 µl of Stempan Gmem medium (PAN-Biotech, Aidenbach, Germany) lacking ESGRO® LIF were plated on the lid of a cell culture plate according to 46. After three days, EB were manually transferred to ultra-low attachment plates (Nunc, Wiesbaden, Germany) for four days before transfer to standard 96-well plates for further differentiation up to 21 days. For immunohistochemistry, EB were embedded in Tissue-Tek (Sakura Finetek, Germany), snap-frozen in liquid nitrogen and processed to 4 µm sections. EB contraction was analyzed after 10 days by counting under a microscope in 96-well plates. At this time point, EB had an average size of 466 ± 24 µm (mean ± SEM of ≥3 independent experiments). Contraction is given as the percentage of contracting EBs standardized for the number of EBs formed.
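The contraction readout described above amounts to a simple ratio, sketched here in Python. The function name and the counts are hypothetical and serve only to illustrate the standardization to the number of EBs formed.

```python
def contraction_percentage(contracting_ebs, formed_ebs):
    """Percentage of contracting EBs relative to the number of EBs formed."""
    if formed_ebs <= 0:
        raise ValueError("at least one EB must have formed")
    if not 0 <= contracting_ebs <= formed_ebs:
        raise ValueError("contracting EBs must lie between 0 and EBs formed")
    return 100.0 * contracting_ebs / formed_ebs


# Hypothetical plate: 80 of 96 wells contain an EB, 20 of which contract.
example_percentage = contraction_percentage(20, 80)
```

Dividing by the number of EBs formed, rather than by the number of wells seeded, keeps clones with poor EB formation from appearing less contractile than they are.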
Embryo isolation. Embryos were isolated from the uterus of C57BL/6 wild-type mice at days post coitum E9.5, E12.5, and E18.5. All methods involving animals were carried out in accordance with relevant guidelines and regulations of the animal care licensing committee of the Helmholtz Centre Munich. All experimental protocols were approved by the licensing committee of the Helmholtz Centre Munich and the state administration "Regierung von Oberbayern".
Flow cytometry, immunohistochemistry and immunoblot staining. Cell surface expression of EpCAM and SSEA1 was measured as described 5 . Briefly, cells were stained with the EpCAM-specific antibody (CD326; BD Biosciences; 1:50 dilution in PBS-3% FCS) or the stage-specific mouse embryonic antigen (SSEA)-1-specific antibody (mouse polyclonal antibody MC480; Abcam) for 15 minutes on ice, washed three times in PBS-3% FCS, and stained with fluorescein isothiocyanate-conjugated rabbit anti-mouse secondary antibody (Vector Laboratories; FI-4001). Measurement of cell surface expression of EpCAM was performed in a FACSCalibur device (BD Pharmingen, Heidelberg, Germany).
Samples from EpCAM-YFP and YFP immunoprecipitates were pooled in three independent experiments and proteins were recovered upon heating at 95 °C for 5 min in Laemmli buffer. Immunoprecipitated proteins were separated on SDS-PAGE, trypsinized by in-gel digestion, and analyzed via LC-MS/MS on an LTQ Orbitrap XL coupled to an Ultimate 3000 nano-HPLC. SILAC data analysis was performed using the MaxQuant software as described previously 71. Potential interaction partners were defined as proteins enriched by ≥3-fold, with p-values ≤0.05 and ≥2 unique peptides in all three independent experiments. Statistical analysis used a classical two-sided unpaired t-test on the individual protein intensities per label per sample, with the intensities for replicate #3 divided by 2.33 in order to normalize for differences in protein input in the IP.
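The selection criteria for potential interaction partners can be sketched as a filter in Python. This is an illustration under stated assumptions and not the MaxQuant workflow itself: the fold-enrichment and unique-peptide thresholds are assumed to be checked in each of the three experiments, the p-value is assumed to be a single precomputed value per protein, and all names and numbers are hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class ProteinHit:
    name: str
    fold_enrichment: Sequence[float]  # EpCAM-YFP over YFP, one value per experiment
    unique_peptides: Sequence[int]    # unique peptide count per experiment
    p_value: float                    # precomputed two-sided unpaired t-test


def normalize_replicate3(intensities, factor=2.33):
    """Return a copy with the third replicate (index 2) divided by `factor`."""
    scaled = list(intensities)
    scaled[2] = scaled[2] / factor
    return scaled


def is_candidate(hit, min_fold=3.0, max_p=0.05, min_peptides=2):
    """Apply the three thresholds; fold and peptides must pass in every experiment."""
    return (
        all(f >= min_fold for f in hit.fold_enrichment)
        and all(u >= min_peptides for u in hit.unique_peptides)
        and hit.p_value <= max_p
    )


# Hypothetical proteins: only protein_A passes all three criteria.
hits = [
    ProteinHit("protein_A", (4.2, 3.5, 5.0), (3, 2, 4), 0.01),
    ProteinHit("protein_B", (4.0, 2.5, 6.0), (5, 5, 5), 0.01),   # fold < 3 once
    ProteinHit("protein_C", (8.0, 7.0, 9.0), (2, 1, 3), 0.001),  # 1 peptide once
]
candidates = [h.name for h in hits if is_candidate(h)]
```

Requiring every experiment to pass, instead of the average, is the stricter reading of "in all three independent experiments".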
Quantitative RT-PCR. Total mRNA was prepared using the RNeasy Mini Kit (Qiagen, Hilden, Germany) and reverse transcribed with the QuantiTect Reverse Transcription Kit (Qiagen, Hilden, Germany). cDNA was amplified using SYBR-Green PCR mastermix (Qiagen, Hilden, Germany) and gene-specific primers. Normalization across samples was performed using the average constitutive gene expression of glucuronidase beta (gusb). Gene expression levels were calculated according to the equation 2^(-ΔΔCT), where ΔCT was defined as CT(gene of interest) - CT(endogenous control). Levels of mRNA transcripts were assessed by real-time quantitative PCR with a LightCycler 480 device and LightCycler 480 SYBR Green II Master mix (Roche, Mannheim, Germany).
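As a worked illustration of the relative quantification, the following Python sketch computes 2^(-ΔΔCT) from CT values. It assumes that ΔΔCT is the ΔCT of the sample minus the ΔCT of a reference condition (for example wild-type cells). The choice of reference is not specified above, and the function names and CT values are hypothetical.

```python
def delta_ct(ct_gene, ct_control):
    """Delta CT: CT of the gene of interest minus CT of the endogenous control (gusb)."""
    return ct_gene - ct_control


def relative_expression(ct_gene_sample, ct_gusb_sample, ct_gene_ref, ct_gusb_ref):
    """Relative expression as 2^(-ddCT).

    Assumption: ddCT = dCT(sample) - dCT(reference condition).
    """
    ddct = delta_ct(ct_gene_sample, ct_gusb_sample) - delta_ct(ct_gene_ref, ct_gusb_ref)
    return 2.0 ** (-ddct)


# Hypothetical CT values: the gene crosses threshold two cycles earlier in the
# sample than in the reference, with identical gusb CT in both conditions.
fold_change = relative_expression(
    ct_gene_sample=24.0, ct_gusb_sample=20.0,
    ct_gene_ref=26.0, ct_gusb_ref=20.0,
)
```

In this example ΔCT is 4 for the sample and 6 for the reference, so ΔΔCT is -2 and the relative expression is 2^2 = 4, i.e. a four-fold increase over the reference.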