A defined glycosylation regulatory network modulates total glycome dynamics during pluripotency state transition

Embryonic stem cells (ESCs) and epiblast-like cells (EpiLCs) recapitulate in vitro the first cell lineage decision of the epiblast, allowing characterization of the molecular mechanisms underlying the pluripotent state transition. Here, we performed a comprehensive and comparative analysis of the total glycomes of mouse ESCs and EpiLCs, revealing that overall glycosylation undergoes dramatic changes from early stages of development. Remarkably, we showed for the first time the presence of a developmentally regulated network orchestrating glycosylation changes and identified polycomb repressive complex 2 (PRC2) as a key component involved in this process. Collectively, our findings provide novel insights into the naïve-to-primed pluripotent state transition and advance the understanding of the complex regulation of glycosylation during early mouse embryonic development.

www.nature.com/scientificreports/ mutation of C1GalT1, a key enzyme of the mucin-type O-glycosylation pathway, causes embryonic lethality in mice and alters the localization of neuromuscular junctions and establishment of muscle cell architecture in Drosophila 12,13 . Heparan sulfate (HS) reduction leads to differentiation of mouse ESCs and Drosophila germline stem cells 14,15 . Moreover, O-linked N-acetylglucosamine transferase (Ogt) has been shown to be essential for embryogenesis and early development in several animal models 16,17 , and for maintenance of the naïve state in mouse ESCs 18,19 . The overall glycomic profile has been characterized in mouse ESCs, conventional human ESCs (hESCs) and human induced pluripotent stem cells (hiPSCs), which are in a primed state, tumors and late differentiating cells [20][21][22] . However, a comparative analysis of glycomes during mammalian early embryonic development is currently missing.
Here, we performed a comprehensive and comparative analysis of the glycome of mouse ESCs and EpiLCs. Our findings show that all glycosylation classes undergo dramatic changes, both at the transcriptional and the structural level, from early stages of development. Furthermore, we identified polycomb repressive complex 2 (PRC2), a chromatin-remodeling complex which deposits three methyl groups at histone H3 lysine 27 (H3K27me3) to promote gene repression 23,24 , as one key component involved in the network orchestrating these glycosylation changes. Our results provide direct insights into glycosylation dynamics and contribute to the understanding of glycosylation regulation during mouse early embryonic development.

Induction of EpiLCs from ESCs.
To investigate the glycosylation dynamics occurring at the implantation stage, we induced differentiation of EpiLCs from ESCs. ESCs and EpiLCs are dependent on leukemia inhibitory factor (LIF) and fibroblast growth factor (FGF) signaling, respectively 25,26 . Here, we used a previously established protocol 3 with slight modifications (Fig. 1a). EpiLCs exhibited a typical flattened morphology after 72 h of induction (Fig. 1b). Transcriptional analysis of EpiLCs by real-time PCR showed a negligible change in the pluripotency marker Oct3/4, whereas genes associated with the naïve state, such as Nanog, Esrrb, Klf2, Klf4, Rex1 and Tbx3, were strongly downregulated, together with a striking increase in the primed markers Fgf5 and Otx2 (Fig. 1c), consistent with previous results 3,27 . Accordingly, the Oct3/4 protein level in EpiLCs was similar to that in ESCs, whereas the naïve marker Nanog decreased and the primed marker Otx2 increased. Furthermore, the phosphorylation level of ERK1/2, a downstream kinase involved in FGF signaling 26 , was substantially higher in EpiLCs (Fig. 1d). The expression of these markers was further assessed by immunostaining: Oct3/4 staining slightly decreased during the transition from ESCs to EpiLCs, whereas Nanog was not detected and Otx2 was highly expressed in EpiLCs, further confirming the successful differentiation of EpiLCs from ESCs (Fig. 1e).
Comprehensive and comparative glycome profiling of ESCs and EpiLCs
N-linked glycosylation and free oligosaccharides.
N-linked glycosylation involves the transfer of a tetradecasaccharide core unit from a lipid-linker donor to an asparagine of nascent proteins at the endoplasmic reticulum (ER). Upon transfer to the polypeptide chain, the glycoprotein undergoes trimming and a quality control process to ensure its correct folding. Fully folded glycoproteins are transported to the Golgi where they are subjected to further trimming and processing prior to transportation to the plasma membrane, whereas misfolded proteins are recycled, resulting in the release of free oligosaccharides (FOS) 28,29 (Fig. 2a). N-glycans and FOS composition were quantified by mass spectrometry (MS) analysis. The total amount of N-glycans was similar between ESCs and EpiLCs. Among the detected N-glycan subclasses, namely high mannose-type (HM), pauci-mannose (PM), and complex/hybrid (C/H), HM structures were the most abundant of the detected N-glycans in both ESCs and EpiLCs, confirming previous results obtained in conventional hESCs and hiPSCs 21 . Fucosylation and sialylation of N-glycans diverged dramatically between ESCs and EpiLCs. The total amount of fucosylated N-glycans was higher in EpiLCs compared to ESCs and was characterized by increased levels of both fucosylated PM and C/H structures. Total sialylated N-glycans, mainly present in an α2,6 configuration, were enhanced in EpiLCs (Fig. 2b and Supplementary Fig. S1). FOS amount reportedly increases under conditions of ER stress 29 . A sharp reduction in the total amount and relative percentage of FOS was observed in EpiLCs, indicating a reduction of ER stress upon transition from ESC to EpiLC (Fig. 2c and Supplementary Fig. S2).
Interestingly, a recent report demonstrated that ER stress positively modulates interleukin-6 family expression (including LIF) in mouse astrocytes and macrophages 30 , suggesting a connection between FOS, ER stress, and LIF expression in ESCs. Transcriptional analysis of N-glycosylation pathway-specific genes correlated well with MS data. Indeed, Fut8, the sole enzyme responsible for catalyzing core fucosylation of N-glycans 31 , showed increased expression. Interestingly, a robust enhancement of Uggt2 was observed in EpiLCs, suggesting that the reduction in FOS-associated ER stress is partially mediated by increased expression of enzymes involved in N-glycan quality control and folding (Fig. 2d). To obtain more detailed insights into the N-glycome, we performed FACS profiling using a set of lectins that recognize N-glycan structures with different specificity. As a result, we observed a shift to shorter HM structures by Galanthus nivalis agglutinin (GNA) staining. Moreover, the increase in core fucosylated N-glycans observed at the structural and transcriptional level in EpiLCs was further confirmed by Lens culinaris agglutinin (LCA) staining. In addition, Phaseolus vulgaris erythroagglutinin (PHA-E4) and Phaseolus vulgaris leucoagglutinin (PHA-L4) profiling indicated an enhancement in N-glycans carrying bisecting GlcNAc and 2,6-branching in EpiLCs, suggesting a relevant role for these structures during EpiLC differentiation (Fig. 2e and Supplementary Fig. S3). Together, these data demonstrate that the N-glycome composition greatly diverges between ESCs and EpiLCs. The increase in short HM structures, shown by GNA staining (Fig. 2e), and the core fucosylated N-glycans carrying bisecting GlcNAc and 2,6-branching, indicated by the increase of fucosylated PM (Fig. 2b) and by LCA, PHA-E4, and PHA-L4 staining (Fig. 2e), are more characteristic of the EpiLC N-glycome (Fig. 2f).
up to 19 transferases that link N-acetylgalactosamine (GalNAc) on a serine/threonine residue to form Tn antigen. Tn antigen can be further elongated adopting one of 4 distinct core extensions: T antigen (Core 1), Core 2, Core 3 and Core 4 32 . Non-mucin-type O-glycosylation takes place in the ER, except for the O-linked β-N-acetylglucosamine (O-GlcNAc) addition catalyzed by Ogt, the sole glycosylation that occurs in the cytoplasm and nucleus 33 (Fig. 3a). O-glycome profiling by MS revealed that HexNAc (Tn antigen or O-GlcNAc) was the most prominent O-glycan structure in both ESCs and EpiLCs. Moreover, T antigen and its modified structures were the only detected O-glycans among the mucin-type O-glycosylation core structures in both ESCs and EpiLCs (Fig. 3b and Supplementary Fig. S4), indicating that the T antigen elongation pathway is the most abundant during early embryonic development, in accordance with our previous work 34 . From a transcriptional standpoint, a strong upregulation of both mucin-type and non-mucin-type O-glycan-specific genes was observed in EpiLCs, consistent with the overall enhancement of O-glycan content observed by MS. In particular, the expression of Galnt3,7,13,14,18 and Galntl6, which are involved in the formation of Tn antigen, was dramatically increased (Fig. 3c), suggesting their major involvement in the formation of Tn antigen in EpiLCs. FACS analysis showed a shift from short mucin-type O-glycans, as indicated by Vicia villosa agglutinin-B4 (VVA-B4) and peanut agglutinin (PNA), which recognize Tn and T antigens, respectively, to elongated or Conversely, unlike all the other GAG classes that are synthesized in the Golgi on core proteins, hyaluronan (HA) polymerization occurs at the cell membrane and is not linked to any protein 35 (Fig. 4a). GAG quantification was performed by high-performance liquid chromatography (HPLC).
As a result, we observed a marked increase in the total amount of GAG in EpiLCs, especially HS and CS/DS, which represent the vast majority of detected GAG, whereas HA levels did not differ between ESCs and EpiLCs; KS and 3-O-sulfation of HS could not be detected due to technical limitations. The absolute amount of HS sulfation was considerably higher in EpiLCs compared to ESCs, with a specific increase in mono-sulfated 6S and di-sulfated 2SNS, indicating that variations of the HS sulfation pattern occur from early developmental stages. An overall higher level of CS/DS structures was identified in EpiLCs, with a predominance of Unit A, followed by Unit C and unsulfated Unit O (Fig. 4b and Supplementary Fig. S6), suggesting a role for the 4-O-sulfation pathway (Unit A, B and E) during early embryonic differentiation. Transcriptional analysis of GAG-specific transferases correlated well with HPLC data. Indeed, the enhancement of 2SNS and 6S sulfation in EpiLCs was accompanied by a substantial upregulation of Ndst2-4 and Hs2st1, and of Hs6st2 expression, respectively (Fig. 4c). FACS profiling confirmed the results obtained by HPLC. In addition, increased staining of di-sulfated CS, high-sulfated KS and 3-O-sulfated HS was observed in EpiLCs, consistent with the striking enhancement of Ust and Chst15, Chst1, and Hs3st4-6 expression (Fig. 4d and Supplementary Fig. S7). The HS sulfation pattern has been shown to be critical for growth factor binding and downstream signaling. Consistent with our observations in EpiLCs, N-sulfation and 3-O-sulfation were reported to be required for exit from the naïve pluripotent state via FGF and first apoptosis signal receptor (Fas), respectively 11 . In conclusion, the EpiLC GAG profile is characterized by a dramatic increase in: (i) sulfation of HS, as shown by HPLC (Fig. 4b), and by JM403, Hepss-1, and mCochlin-Fc staining (
Glycosphingolipids.
Glycosphingolipids (GSL) are characterized by the initial addition of glucose or galactose to a ceramide unit to produce glucosylceramide (GlcCer) or galactosylceramide (GalCer), respectively. GlcCer synthesis occurs in the Golgi, where it is further processed to lactosylceramide (LacCer), which is the branching point for the formation of the globo (Gb), ganglio (Gg) and neolacto/lacto ((n)Lc) series. Conversely, GalCer is produced in the ER and is the precursor of the gala-series 36 (Fig. 5a). GSL-glycan analysis was performed by glycoblotting combined with endoglycoceramidase (EGCase) I digestion. GSL composition assessed by MS showed a striking reduction of total GSL in EpiLCs. A shift from Gb and (n)Lc to Gg series was observed during ESC to EpiLC transition ( Fig. 5b and Supplementary Fig. S8). GlcCer and GalCer could not be detected due to the inherent enzymatic specificity of the EGCase used to release the glycan moieties (EGCase I). Analysis of the GSL-specific genes expression showed a marked enhancement of Ugt8a, which is involved in the formation of GalCer, whereas LacCer formation-related transferases were mildly increased or unaltered. The Gb to Gg series switch in EpiLCs observed at the structural level was consistent also at the transcriptional level. A reduction in the expression of the Gb series-specific enzyme B3galnt1 was followed by a robust upregulation of the Gg series-specific enzyme B4galnt1 (Fig. 5c). Moreover, FACS analysis using specific Abs further confirmed the shift from Gb to Gg observed by MS (Fig. 5d), demonstrating that the Gb to Gg series switch previously observed in neurons and embryoid bodies derived from mouse and human ESCs, respectively 37,38 , occurs specifically during the naïve to primed pluripotent state transition. 
The GSL profile dynamically changes during embryonic development; as a result, specific GSL structures, such as stage-specific embryonic antigen (SSEA)-3, SSEA-4 and Forssman antigen, are used as differentiation stage markers 39 . FACS analysis showed that SSEA-3,4 and Forssman antigen staining mildly increased in EpiLCs (Fig. 5d and Supplementary Fig. S9). However, it is worth noting that SSEA-3,4 and Forssman antigen are structures of the Gb series (Fig. 5a), which we showed to be dramatically reduced and to undergo a switch to the Gg series upon ESC differentiation to EpiLCs, thus suggesting that Gg series structures might be more suitable differentiation markers. Together, these data show that the GSL composition shifts from Gb and (n)Lc to Gg series during ESC to EpiLC transition, and demonstrate the presence of SSEA-3,4 and Forssman antigen structures in both ESCs and EpiLCs (Fig. 5e).

Elongation/branching and capping/terminal modifications.
Glycosylation is a stepwise process involving more than 200 distinct glycosyltransferases and related enzymes in mammals. These can be classified as pathway-specific and pathway-non-specific; the latter generally include enzymes involved in biosynthetic steps shared by different glycosylation classes, such as elongation/branching and capping/terminal modifications (Fig. 6a). Transcriptional analysis by real-time PCR showed that B3galt1,2,5 and B4galt2-4, involved in the formation of type I chain (Galβ1-3GlcNAc) and type II chain (Galβ1-4GlcNAc) structures, respectively, were enhanced, whereas the LacdiNAc (GalNAcβ1-4GlcNAc)-specific enzymes B4galnt3,4, which we previously reported to positively regulate LIF signaling in ESCs 40 , were reduced in EpiLCs. Consistent with a higher level of KS (Fig. 4d), B4galt4, required for KS elongation 35 , was markedly increased. Moreover, upregulation of SSEA-3 at the structural level (Fig. 5d) correlated with a considerably higher expression of B3galt5, the enzyme involved in SSEA-3 synthesis on Gb 36 (Fig. 6b). An overall upregulation of sialyltransferases, including St8sia2,4, the genes involved in polysialic acid (PSA) formation 41 , was observed in EpiLCs. Among them, St6gal1,2, the only sialyltransferases involved in N-glycan sialylation 41 , dramatically increased in EpiLCs, demonstrating a correlation between the enhancement of α2,6 sialic acid structures on N-glycans observed by MS (Fig. 2b) and the expression of the corresponding enzymes. Consistent with the GSL shift from Gb to Gg series observed at the structural level (Fig. 5b,d), the expression of sialyltransferases involved in Gg extension, namely St3gal2,5, St6galNAc3,5 and St8sia3, was substantially upregulated. Moreover, we detected a higher expression of St3gal2, the enzyme synthesizing SSEA-4 on Gb 36 , reflecting the increment observed by FACS (Fig. 5d).
Fucosyltransferase expression strongly increased in EpiLCs, except for that of Fut9, the enzyme synthesizing SSEA-1 42 . Remarkably, Fut1,2, involved in the formation of SSEA-5, showed a striking upregulation. Furthermore, sulfotransferase mRNA levels were generally higher in EpiLCs. In particular, EpiLCs showed a considerable increase in Chst2,4 (Fig. 6c), the sole sulfotransferases that catalyze the formation of the extended or branched capped mucin-type O-glycan structure 43 , confirming at the transcriptional level the data obtained by Meca-79 staining (Fig. 3d).
To obtain further insights into the major pathway-non-specific structures we used data obtained by MS and performed a FACS profiling using a combination of different lectins and Abs. Structures characterized by MS and FACS strongly correlated with the expression of the relative enzymes. We observed an increase in type II chain structures in EpiLCs, as indicated by erythrina corallodendron lectin (ECorL) staining, albeit Lewis x (Le x ) type II structure (SSEA-1) was unchanged. SSEA-1 was previously shown to be synthesized by Fut4 and, with higher activity, by Fut9 44 , particularly, within the mouse embryo context 42 . Thus, SSEA-1 unaltered staining can be explained by decreased Fut9 and a dramatic increase in Fut4 expression. Conversely, H1 antigen, a type I structure known as SSEA-5, was substantially enhanced in EpiLCs and only detected at a negligible level in ESCs. This is consistent with the robust expression upregulation of the type I chain transferases B3galt1,2,5 and the fucosyltransferases Fut1,2, demonstrating that type I structures increase during ESC to EpiLC transition. Overall fucosylation and sialylation levels were enhanced in EpiLCs, correlating with the observed increased expression of the fucose and sialic acid nucleotide transporters (Supplementary Fig. S10). Notably, sialic acid configuration shifted from an α2,6 to an α2,3 configuration in EpiLCs (Fig. 6d). Furthermore, PSA structure was specifically enhanced in EpiLCs, reflecting the upregulation of St8sia2,4 expression (Fig. 6e,f and Supplementary Fig. S11).

PRC2 contributes to glycosylation changes during ESC to EpiLC transition.
Since the glycosylation profile dynamically changes during ESC to EpiLC transition, we hypothesized the presence of a defined regulatory network. To identify putative candidates, we performed an in-depth analysis of previously published chromatin immunoprecipitation sequencing (ChIP-seq) datasets obtained in ESCs using the ChIP-Atlas comprehensive database 45 (https://chip-atlas.org), searching for factors that are enriched at the glycosyltransferase promoter regions. As a result, we observed that PRC2 components occupy a great variety of glycosylation-related genes across all glycosylation classes in ESCs (Fig. 7a and Supplementary Fig. S12). PRC2, whose core components are Suz12, Eed, and either Ezh2 or Ezh1, is a chromatin-remodeling complex which catalyzes the H3K27me3 modification to promote gene repression. PRC2 can associate with other factors that regulate its chromatin recruitment, such as Mtf2 or Jarid2, to form two different subtypes of PRC2 named PRC2.1 and PRC2.2, respectively. In addition, PRC2 can synergistically repress gene expression together with PRC1, whose core components include Rnf2 23,24 . Further analysis of previously published ChIP-seq data indicated that PRC2.1, PRC2.2 and Rnf2 (PRC1) act cooperatively to regulate around 27% of the glycosylation-related genes in ESCs (Fig. 7b). Moreover, a global alteration in the epigenetic state of glycosylation-related genes was observed in EpiLCs compared to ESCs, with changes in the PRC2-related histone modification H3K27me3 and its counterpart H3K27ac, which promote gene silencing and activation, respectively; the promoter activation histone marker H3K4me3; and RNA polymerase II binding 23 (Supplementary Fig. S13 and Supplementary Table S1). Together, these data suggest that PRC2 is involved in glycosylation changes occurring during the transition from ESCs to EpiLCs.
To test this hypothesis, we treated ESCs with the PRC2 inhibitor EED226 50 for 48 h and examined the effects on the glycome. EED226 treatment resulted in a considerable reduction of the H3K27me3 modification (Fig. 7c,d). Glycomic profiling was performed by FACS analysis. As a result, we observed significant alterations in a wide range of structures, confirming that PRC2 is involved in glycosylation regulation in ESCs (Supplementary Fig. S14a). Next, we compared the glycosylation alterations observed by FACS during the ESC to EpiLC transition and in ESCs after EED226 treatment. Strikingly, a large number of structures changed in the same direction (increased or decreased) in both EpiLCs and EED226-treated ESCs, indicating a direct modulation by PRC2 (red in Fig. 7e). Conversely, other structures showed an opposite trend in EpiLCs and EED226-treated ESCs, suggesting the presence of other regulatory component(s) (blue in Fig. 7e). Finally, some structures were altered in EpiLCs but unchanged in EED226-treated ESCs, implying the presence of PRC2-independent pathway(s) (black in Fig. 7e). Transcriptional analysis of PRC2 core components by RNA-seq in ESCs and EpiLCs showed a decrease in the expression of the PRC2 core component Eed (Supplementary Fig. S14b). In addition, global H3K27me3 has been previously reported to be drastically reduced and redistributed upon ESC differentiation to EpiLCs 51 . untreated vs EED226 treated samples by RNA-seq.
Similar to the changes observed at the structural level, we observed three different patterns of expression: expression increased or decreased in both EpiLCs and EED226-treated ESCs (43% of the glycosyltransferases and related genes); expression followed opposite trends in EpiLCs and EED226-treated ESCs (22% of the glycosyltransferases and related genes); or expression was unaffected by EED226 treatment but changed during the naïve-to-primed transition (29% of the glycosyltransferases and related genes) (Supplementary Fig. S14c and Supplementary Table S5). Collectively, these data led us to postulate the presence of at least three coordinated pathways that control glycosylation dynamics during EpiLC differentiation (Fig. 7f).
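The three expression patterns described above can be illustrated with a minimal sketch that sorts genes by the direction of change in two comparisons (EpiLCs vs ESCs, and EED226-treated vs untreated ESCs). This is not the analysis pipeline used in the study: the log2 fold-change threshold of 1.0, the function names, the gene labels, and the example values are all hypothetical and serve only to show the logic of the classification.

```python
from collections import Counter


def direction(log2_fc, threshold=1.0):
    """Return +1 (up), -1 (down) or 0 (unchanged) for a log2 fold change.

    The threshold is an assumed cutoff, not a value taken from the study.
    """
    if log2_fc >= threshold:
        return 1
    if log2_fc <= -threshold:
        return -1
    return 0


def classify_gene(epilc_log2_fc, eed226_log2_fc, threshold=1.0):
    """Assign a gene to one of the expression patterns.

    epilc_log2_fc:  EpiLCs vs ESCs
    eed226_log2_fc: EED226-treated vs untreated ESCs
    """
    epi = direction(epilc_log2_fc, threshold)
    inh = direction(eed226_log2_fc, threshold)
    if epi == 0:
        # Not altered during the transition, so outside the three patterns.
        return "unchanged in EpiLCs"
    if inh == 0:
        return "PRC2-independent"
    return "same direction" if epi == inh else "opposite direction"


def summarize(fold_changes, threshold=1.0):
    """Percentage of genes falling in each pattern."""
    counts = Counter(
        classify_gene(epi, inh, threshold) for epi, inh in fold_changes.values()
    )
    total = len(fold_changes)
    return {label: 100.0 * n / total for label, n in counts.items()}


# Hypothetical genes and fold changes, for illustration only.
example = {
    "gene_A": (2.3, 1.8),    # up in both comparisons
    "gene_B": (-1.6, -2.1),  # down in both comparisons
    "gene_C": (1.9, -1.4),   # opposite trends
    "gene_D": (2.5, 0.1),    # changed in EpiLCs only
}
print(summarize(example))
```

Genes that do not change during the ESC to EpiLC transition are kept in a separate category here so that the three patterns are computed only over transition-responsive genes; this is a design choice of the sketch.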
In conclusion, our findings demonstrate for the first time the presence of a developmentally regulated network orchestrating overall glycosylation changes and identified PRC2 as a key component involved in this process.

Discussion
The pivotal role of glycosylation during development and in determining stem cell identity across different species is becoming increasingly clear 11 . Previous reports characterized the glycomic profiles of mouse ESCs, conventional human ESCs (hESCs) and human induced pluripotent stem cells (hiPSCs), which are in a primed state, tumors and late differentiating cells [20][21][22] . In the present study, we performed a comprehensive and comparative analysis to investigate the glycosylation dynamics during the pluripotency state transition from ESCs to EpiLCs, which have been recently suggested to be in an intermediate developmental stage between the naïve state and the primed state, named formative state 52 . As a result, we demonstrated that glycosylation undergoes dramatic alterations from early stages of development, and we identified PRC2 as a key component of the network orchestrating these changes (Fig. 8).
Pluripotent stem cells (PSCs) in the primed state exhibit significant differences compared to naïve ESCs, such as a flat morphology, glycolytic metabolism, slow proliferation, and more closed chromatin 53 . During the naïve-to-primed transition, RAS activation mediates the epithelial-to-mesenchymal transition (EMT), characterized by the switch from epithelial cadherin (Ecad) to neural cadherin (Ncad) 54 , a process similar to cancer progression and tightly regulated by glycosylation 55,56 . Indeed, the function of N-glycosylation in the EMT during tumorigenesis has been widely reported: 1-6 branching structure on Ecad promotes its clearance from the cell surface and inhibits Ncad-mediated cell-cell adhesion. Furthermore, core fucosylation weakens Ecad cell-cell adhesion in lung cancer 55 . In addition, an increased level of sialylated glycans 56 and reduced GSL 57 were documented to correlate with EMT progression. Similarly, we detected enhanced 2,6 branching, core fucosylation (Fig. 2b,e), and total sialylated glycans (Fig. 6d), together with a sharp reduction in GSL (Fig. 5b), suggesting a shared EMT regulation by glycosylation across different biological contexts. RAS activation was also reported to be linked to a metabolic shift from oxidative phosphorylation to glycolysis, and to subsequent chromatin closure, during the naïve-to-primed differentiation 54 . Importantly, RAS is downstream of FGF signaling, which requires N-sulfation of HS. Indeed, Ndst1/2 −/− ESCs are unable to exit from the naïve pluripotent state 11 . Moreover, the amount of Myc, which promotes the proliferative program of ESCs and thus their proliferation speed, is inversely correlated with FGF-ERK activation 58 , underlining the functional importance of FGF signaling regulation by HS. Accordingly, we observed a dramatic increase in Ndst2-4 expression and N-sulfated HS staining (Fig. 4c,d).
In light of previous reports, our data emphasize that dynamic glycosylation changes contribute to, and partially drive, the major phenotypic alterations occurring during the naïve-to-primed transition, thus underlining the importance of mechanistically dissecting the role of glycosylation during developmental transitions in vitro and in vivo.
Comparative analysis of total glycomes allows the identification of pluripotency biomarker candidates 21 . Indeed, our data confirmed previously established pluripotency markers, such as SSEA-1,3,4 and Forssman antigen, which are expressed at similar levels in ESCs and EpiLCs. Moreover, we demonstrated that SSEA-5 is specifically expressed in EpiLCs, extending previous studies performed in conventional hESCs and hiPSCs 59 . In addition, we identified a wide range of novel structures across all glycosylation classes that are specifically enhanced in EpiLCs but not detected, or detected at very low levels, in ESCs, providing additional markers to distinguish between the naïve and primed pluripotent states: bisecting GlcNAc and 2,6 branched tri-/tetraantennary complex N-glycans, extended or branched capped mucin-type O-glycan structure, N-unsubstituted GlcN, N-sulfated GlcN, and 3-O-sulfated HS, CS-A, C, D and E units, and PSA.
Expanded potential stem cells (EPSCs) can contribute to extraembryonic tissues in intraspecies chimeras and hence are totipotent stem cells 60 . Moreover, Dux overexpression converts ESCs into 2-cell-embryo-like (2C-like) cells 61 . The recent establishment of culture conditions to reprogram mouse ESCs into EPSCs or 2C-like cells allowed the in vitro investigation of the totipotent state. Although the most suitable in vitro system to model the totipotent state is still under debate 62 , these models allowed the identification of some molecular features of totipotent cells [60][61][62] , providing an invaluable tool to characterize the totipotent state. Glycosyltransferase expression dramatically diverges between EPSCs/2C-like cells and ESCs in vitro 60,61 , and between the embryo cleavage stage and the ICM in vivo 61 . Thus, it will be of great interest to examine the glycosylation dynamics during the reprogramming process from ESCs to EPSCs and 2C-like cells to establish novel totipotency biomarkers and obtain mechanistic insights into the transition from the totipotent to the pluripotent state.
PRC2 regulates early embryonic specification processes by repressing crucial developmental genes 63 . Indeed, deficiency in the PRC2 core components Eed, Suz12, or Ezh2 results in embryonic lethality around E7.5-E8.5 due to gastrulation defects 24 . Moreover, PRC2 was previously reported to be essential to maintain the primed but not the naïve state of pluripotency both in mouse and human 64 . Here, we showed for the first time that PRC2 contributes to the overall glycosylation alterations occurring during the ESC to EpiLC transition. Recently, ISY1 has been reported to modulate the biogenesis of a large subset of crucial miRNAs during the transition from ESCs to EpiLCs 27 . Moreover, PRC2 represses around 512 developmental regulators in ESCs 63 . Thus, the glycosylation changes we observed in EpiLCs are likely to be the result of the synergistic action of PRC2 on the expression of glycosylation-related genes, together with other component(s) involved both directly and indirectly in the glycosylation pathway and PRC2-independent pathway(s). In addition, PRC1 and PRC2 activity is directly modulated by O-GlcNAc 65 , suggesting a metabolically regulated network controlling the glycomic profile. PRDM14 is a critical pluripotency determinant conserved among mice and humans that suppresses developmental genes in ESCs by binding and recruiting PRC2 to the target genes 66,67 . Given this common pluripotency safeguard mechanism, it will be particularly interesting to investigate the role of PRDM14 in the PRC2-mediated glycosylation network, and whether the observed effects are translatable to human PSCs or are species-specific. Glycosylation is a developmentally coordinated post-translational modification 10,11 . Previous studies have identified glycosylation class-specific key regulators. For instance, hepatocyte nuclear factor 1α (HNF1α) was demonstrated to control N-glycan fucosylation of human plasma proteins 68 .
More recently, autism susceptibility candidate 2 (Auts2) was shown to drive the GSL metabolic switch during neural differentiation from mouse ESCs 37 . Nonetheless, the regulation of overall glycosylation dynamics has remained unknown. Our study identified PRC2 as a key factor involved in the glycosylation changes occurring during the naïve-to-primed transition. Our findings are the first demonstration that the complex glycome alterations occurring during developmental transitions are orchestrated by a defined regulatory network. Consequently, it will be important to characterize the glycomic dynamics in a variety of developmental stages and cell types in order to identify transition-specific glycosylation regulatory components. Glycosylation is involved in a broad range of cellular processes 9 . Not surprisingly, aberrant forms of glycosylation are observed in all types of cancer 56 . Thus, we postulate the existence of glycosylation regulatory networks acting during tumorigenic processes, whose identification will allow the development of novel therapeutic approaches. In conclusion, our findings provide a solid groundwork for further investigations in basic research and translational medicine.

Methods
Cell culture. R1 ESC line 69 was maintained on mouse embryonic fibroblasts (MEFs) that were prepared from embryos at embryonic day 14.5 and inactivated with 10 μg mitomycin C (Sigma-Aldrich). ESCs were maintained in ESC medium consisting of DMEM (Gibco) supplemented with 15% fetal bovine serum (FBS) (Nichirei Biosciences), 1% penicillin/streptomycin (Gibco), 0.1 mM 2-mercaptoethanol (Gibco), 0.1 mM nonessential amino acids (Gibco) and 1000 U/mL LIF (Chemicon International). All animal experiments were approved by the Animal Care and Use Committee for Soka University and were performed in accordance with relevant guidelines and regulations for animal care.

Construction of plasmid for mouse cochlin-Fc fusion protein and its expression in HEK293 cells.
The cDNA encoding mouse cochlin (Genbank: NM_00729.5), which specifically recognizes 3-O-sulfation on HS (unpublished data), was amplified by PCR from mouse spleen cDNA, using the following primers: 5′-GTT CTC TGT GTT TGG GAA CAT-3′ and 5′-TCC TCA AGA GAG CAG CCT CC-3′. To express mouse cochlin fused to human IgG-Fc (mCochlin-Fc), mouse cochlin cDNA was amplified by PCR and inserted between the EcoRI and XhoI sites of a pCAGGS-Fc vector. HEK293 cells were transfected with the purified plasmid using Lipofectamine 2000 and cultured in D10 medium containing 2 μg/mL puromycin for a week. The culture medium was collected, and mCochlin-Fc was purified using a Protein A-Sepharose column (GE Healthcare).

FACS analysis.
A single-cell suspension for staining of cell surface molecules was obtained using 0.02% EDTA/PBS. Subsequently, 2-3 × 10^5 cells were collected and washed in FACS buffer (0.5% BSA (Iwai), 0.1% sodium azide (Sigma-Aldrich) in PBS). After washing, the cell suspension was incubated with Abs or lectins in FACS buffer. For analysis of intracellular molecules, cells were harvested with 0.25% trypsin/EDTA (Thermo Fisher Scientific) and fixed/permeabilized for 30 min with 100% methanol (Wako) before staining. Samples were then analyzed using a BD FACSAria III Cell Sorter (Becton Dickinson). Cells were gated to exclude debris, dead cells (identified by propidium iodide staining; Sigma-Aldrich), and doublets. The primary and secondary Abs and lectins used are listed in Supplementary Table S3.

Sample preparation for glycome analysis of various glycoconjugates.
Cultured ESCs and EpiLCs (> 1 × 10^6 cells) were washed 5 times with PBS and collected using a scraper. Collected cells were resuspended in 100 mL of PBS and homogenized using an Ultrasonic Homogenizer (TAITEC, Saitama, Japan). Cell lysates were precipitated with EtOH, and the proteinous pellet and supernatant fractions were subsequently separated by centrifugation 22,74,75 . The resulting pellet was dissolved in H2O and the cellular protein concentration was measured using a BCA protein assay kit (Thermo Fisher Scientific). The pellet fractions corresponding to 25 μg, 50 μg, and 100 μg of protein were used for N-glycan, O-glycan, and GAG analyses, respectively. The supernatant fraction corresponding to 100 μg of protein was concentrated for GSL and FOS analyses. Glycomic analyses of N-glycans, GSL, and FOS were performed by glycoblotting methods combined with the SALSA procedure 76,77 . O-glycome analysis was performed by the β-elimination in the presence of pyrazolone analogues (BEP) method, and GAG were measured by HPLC as previously described 21 . This methodology allows a comparative analysis of glycomes. The deduced composition and absolute amount of the detected structures are listed in Supplementary Table S4.
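As a worked illustration of the protein-normalized input described above, the volume of dissolved pellet needed for each analysis follows directly from the BCA concentration (volume = protein mass / concentration). The sketch below assumes a concentration expressed in μg/μL; the function name and the example concentration of 2.0 μg/μL are hypothetical and not taken from the study.

```python
# Protein input per analysis (μg), as stated in the sample preparation text.
PROTEIN_INPUT_UG = {"N-glycans": 25.0, "O-glycans": 50.0, "GAG": 100.0}


def aliquot_volumes_ul(conc_ug_per_ul, targets=PROTEIN_INPUT_UG):
    """Volume (μL) of dissolved pellet giving each target protein mass.

    conc_ug_per_ul is the protein concentration measured by BCA assay,
    assumed here to be expressed in μg/μL.
    """
    if conc_ug_per_ul <= 0:
        raise ValueError("protein concentration must be positive")
    return {name: mass / conc_ug_per_ul for name, mass in targets.items()}


# Hypothetical BCA result of 2.0 μg/μL, for illustration only.
for name, volume in aliquot_volumes_ul(2.0).items():
    print(f"{name}: {volume:.1f} μL")
```

Normalizing every analysis to a fixed protein mass, rather than to cell number, is what makes the absolute glycan amounts comparable between ESCs and EpiLCs.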

Data availability
The ChIP-seq datasets analyzed in this study are publicly available at NCBI Sequence Read Archive [Accession Numbers: SRX4488301, SRX4488308, SRX4488293, SRX4488300, SRX4488285, SRX4488292, SRX4488317 and SRX4488324 (Ref. 7); SRX1372665 (Ref. 46); SRX2528911 and SRX3738839 (Ref. 47); SRX2776968 (Ref. 48); SRX191072 (Ref. 49)]. RNA-seq data generated for this study have been deposited in the GEO repository under accession number GSE161278. The data that support the findings of this study are available from the corresponding author upon reasonable request.