Functional dissection of PRC1 subunits RYBP and YAF2 during neural differentiation of embryonic stem cells

Polycomb repressive complex 1 (PRC1) comprises two different complexes: CBX-containing canonical PRC1 (cPRC1) and RYBP/YAF2-containing variant PRC1 (vPRC1). RYBP-vPRC1 or YAF2-vPRC1 catalyzes H2AK119ub through a positive-feedback model; however, whether RYBP and YAF2 have different regulatory functions is still unclear. Here, we show that the expression of RYBP and YAF2 decreases and increases, respectively, during neural differentiation of embryonic stem cells (ESCs). Rybp knockout impairs neural differentiation by activating Wnt signaling and derepressing nonneuroectoderm-associated genes. However, Yaf2 knockout promotes neural differentiation and leads to redistribution of RYBP binding, increases enrichment of RYBP and H2AK119ub on the RYBP-YAF2 cotargeted genes, and prevents ectopic derepression of nonneuroectoderm-associated genes in neural-differentiated cells. Taken together, this study reveals that RYBP and YAF2 function differentially in regulating mESC neural differentiation.


Statistics
For all statistical analyses, confirm that the following items are present in the figure legend, table legend, main text, or Methods section.

- The exact sample size (n) for each experimental group/condition, given as a discrete number and unit of measurement
- A statement on whether measurements were taken from distinct samples or whether the same sample was measured repeatedly
- The statistical test(s) used AND whether they are one- or two-sided. Only common tests should be described solely by name; describe more complex techniques in the Methods section.
- A description of all covariates tested
- A description of any assumptions or corrections, such as tests of normality and adjustment for multiple comparisons
- A full description of the statistical parameters including central tendency (e.g. means) or other basic estimates (e.g. regression coefficient) AND variation (e.g. standard deviation) or associated estimates of uncertainty (e.g. confidence intervals)
- For null hypothesis testing, the test statistic (e.g. F, t, r) with confidence intervals, effect sizes, degrees of freedom and P value noted. Give P values as exact values whenever suitable.
- For Bayesian analysis, information on the choice of priors and Markov chain Monte Carlo settings
- For hierarchical and complex designs, identification of the appropriate level for tests and full reporting of outcomes
- Estimates of effect sizes (e.g. Cohen's d, Pearson's r), indicating how they were calculated

Our web collection on statistics for biologists contains articles on many of the points above.

Software and code
For RNA-seq, the raw paired-end reads were trimmed using Trim_Galore (v0.6.5), then mapped to the mouse mm10 genome and quantified with the STAR-RSEM pipeline using RSEM (v1.2.22). Transcript-level counts were collapsed to gene-level counts using tximport (v1.20.0). DESeq2 (v1.32.0) was used to identify differentially expressed genes. GO analysis was conducted with clusterProfiler (v4.0.0). Bowtie2 (v2.2.5) was used to align ChIP-seq data. Peak annotation was performed with ChIPseeker (v1.28.3). bamCoverage (v3.5.1) from deepTools was used to generate the normalized bigwig files. Accessible regions were identified using MACS2 (v2.2.7.1) without control using the default parameters, and differential analysis was performed using DiffBind (v3.2.4).

Adaptors and low-quality reads were trimmed with Trim_Galore (v0.6.5). For RYBP, YAF2, RING1B and H3K27ac ChIP-seq, the trimmed reads were aligned to the mouse mm10 genome using bowtie2 (v2.2.5) with the parameters "--very-sensitive --end-to-end --no-unal". For H2AK119ub and H3K27me3 ChIP-seq, the trimmed reads were mapped to the mm10 (mouse) and dm6 (Drosophila melanogaster) reference genomes using bowtie2 (v2.2.5) with the parameters "--very-sensitive --end-to-end --no-unal --no-mixed --no-discordant". For EZH2, JARID2 and MTF2 ChIP-seq, the trimmed reads were mapped to the mm10 (mouse) and hg38 (human) reference genomes using bowtie2 (v2.2.5) with the same parameters as above. The mapped reads with a quality lower than 30 were filtered out. Duplicated reads were removed with sambamba (v0.6.7). The reads overlapping with mouse mm10 blacklist regions (http://mitra.stanford.edu/kundaje/akundaje/release/blacklists) were excluded.

For RING1B ChIP-seq, uniquely mapped reads for each sample were subsampled to 10 million, and replicates were subjected to DiffBind (v3.2.4) for differential binding analysis. For calibrated ChIP-seq (cChIP-seq), the reads mapped to the mm10, dm6 or hg38 genome were extracted and calculated for calibration as previously described (ref. 63). bamCoverage (v3.5.1) from deepTools was used to generate the normalized bigwig files with the parameter "--normalizeUsing RPGC" for normal ChIP-seq and "--normalizeUsing RPGC --scaleFactor" for cChIP-seq. The reads mapped to the mm10 genome were used to …

Data
Policy information about availability of data
All manuscripts must include a data availability statement. This statement should provide the following information, where applicable:
- Accession codes, unique identifiers, or web links for publicly available datasets
- A description of any restrictions on data availability
- For clinical datasets or third party data, please ensure that the statement adheres to our policy

Human research participants
Policy information about studies involving human research participants and Sex and Gender in Research.

Recruitment
Ethics oversight
Note that full information on the approval of the study protocol must also be provided in the manuscript.

Field-specific reporting
Please select the one below that is the best fit for your research. If you are not sure, read the appropriate sections before making your selection.

Life sciences Behavioural & social sciences Ecological, evolutionary & environmental sciences
For a reference copy of the document with all sections, see nature.com/documents/nr-reporting-summary-flat.pdf

Life sciences study design
All studies must disclose on these points even when the disclosure is negative.
Genes were defined as targets of both RYBP and YAF2 when both proteins bound within 1 kb of their transcription start sites (TSSs).
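The TSS-proximity rule for RYBP and YAF2 target genes can be illustrated with a minimal sketch. It is not the authors' pipeline (peak annotation in the study was performed with ChIPseeker); it only shows the logic under stated assumptions: "within 1 kb" is read as a symmetric window of ±1,000 bp around the TSS, peaks are 0-based half-open intervals, and the function names, gene names and coordinates are hypothetical.

```python
from collections import defaultdict


def genes_near_peaks(tss_sites, peaks, window=1000):
    """Return the set of genes whose TSS lies within `window` bp of a peak.

    tss_sites: iterable of (gene, chrom, tss_position)
    peaks:     iterable of (chrom, start, end), 0-based half-open intervals
    """
    peaks_by_chrom = defaultdict(list)
    for chrom, start, end in peaks:
        peaks_by_chrom[chrom].append((start, end))

    hits = set()
    for gene, chrom, pos in tss_sites:
        for start, end in peaks_by_chrom.get(chrom, ()):
            # True when the peak overlaps the window [pos - window, pos + window].
            if start <= pos + window and end > pos - window:
                hits.add(gene)
                break
    return hits


def cotargeted_genes(tss_sites, rybp_peaks, yaf2_peaks, window=1000):
    """Genes with both a RYBP peak and a YAF2 peak near the TSS (assumed rule)."""
    tss_sites = list(tss_sites)
    return (genes_near_peaks(tss_sites, rybp_peaks, window)
            & genes_near_peaks(tss_sites, yaf2_peaks, window))


# Hypothetical toy data, for illustration only.
tss_sites = [("GeneA", "chr1", 5000), ("GeneB", "chr1", 20000), ("GeneC", "chr2", 3000)]
rybp_peaks = [("chr1", 4500, 4800), ("chr1", 19000, 19500), ("chr2", 2900, 3100)]
yaf2_peaks = [("chr1", 5900, 6200), ("chr2", 10000, 10500)]

print(cotargeted_genes(tss_sites, rybp_peaks, yaf2_peaks))
```

The nested loop is quadratic in the worst case, which is fine for a toy example; for genome-scale peak sets an interval index or a dedicated annotation tool would be the more practical choice.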
The ATAC-seq, ChIP-seq and RNA-seq data reported in this paper have been deposited in the Genome Sequence Archive database in the National Genomics Data Center (GSA: CRA006987) [https://ngdc.cncb.ac.cn/gsa/browse/CRA006987] and in the Gene Expression Omnibus (GEO) database (GSE213416) [https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?&acc=GSE213416]. Source data are provided with this paper. Any additional information required to reanalyze the data reported in this paper is available from the lead contact upon request.
Use the terms sex (biological attribute) and gender (shaped by social and cultural circumstances) carefully in order to avoid confusing both terms. Indicate if findings apply to only one sex or gender; describe whether sex and gender were considered in study design, whether sex and/or gender was determined based on self-reporting or assigned, and methods used. Provide in the source data disaggregated sex and gender data where this information has been collected, and consent has been obtained for sharing of individual-level data; provide overall numbers in this Reporting Summary. Please state if this information has not been collected. Report sex- and gender-based analyses where performed, justify reasons for lack of sex- and gender-based analysis.
Describe the covariate-relevant population characteristics of the human research participants (e.g. age, genotypic information, past and current diagnosis and treatment categories). If you filled out the behavioural & social sciences study design questions and have nothing to add here, write "See above."
Describe how participants were recruited. Outline any potential self-selection bias or other biases that may be present and how these are likely to impact results.
Identify the organization(s) that approved the study protocol.
At least three independent replicates were used in each experiment, and the replicate number was presented in the figure legends.

None
The results were repeated at least three times as independent replicates.

Materials & experimental systems
Rabbit polyclonal anti-YAF2 (our lab) for ChIP experiments was validated with wild-type and Yaf2 knockout cell lines. The commercial antibodies were validated by following the manufacturer's methods.
46C mESCs were described previously (Ying et al., 2003), and RYBP-KO 46C mESCs were described previously (Liu et al., 2019); both are referenced in the manuscript. The HEK293T cell line was obtained from ATCC and maintained in our lab.
Routine quality control was performed by examining cell morphology under the microscope, and qRT-PCR was performed to test the expression of marker genes.
All cell lines were confirmed negative for Mycoplasma contamination.
No commonly misidentified line was used.
ChIP-seq and ATAC-seq libraries were sequenced on an Illumina NovaSeq 6000. A BD LSRFortessa was used for flow cytometry. ImageJ was used for quantification of signals from Western blots. FlowJo (v10) was used to analyze FACS data. Data were plotted using GraphPad Prism (v8.0.2). Adobe Illustrator 2020 was used to prepare figures.
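For the calibrated ChIP-seq normalization mentioned in the data analysis description, the exact calculation is given only by reference to earlier work and is not reproduced here. The sketch below therefore shows one commonly used form of spike-in calibration as an assumption, not the authors' formula: each sample is scaled by the inverse of its spike-in read count expressed in millions, so that samples with more spike-in reads are scaled down. The resulting number is the kind of value that could be passed to the "--scaleFactor" option of bamCoverage. The function name, sample names and read counts are hypothetical.

```python
def spikein_scale_factor(spikein_reads, per=1_000_000):
    """Assumed calibration: scale factor = per / number of spike-in reads.

    spikein_reads: reads mapped to the spike-in genome (e.g. dm6 or hg38)
    per:           normalization unit; 1e6 gives "per million spike-in reads"
    """
    if spikein_reads <= 0:
        raise ValueError("spike-in read count must be positive")
    return per / spikein_reads


# Hypothetical spike-in read counts per sample, for illustration only.
spikein_counts = {"WT_rep1": 2_000_000, "KO_rep1": 4_000_000}

scale_factors = {
    sample: spikein_scale_factor(count)
    for sample, count in spikein_counts.items()
}

for sample, factor in scale_factors.items():
    print(f"{sample}\t{factor}")
```

Some calibration schemes additionally correct for the spike-in to target ratio measured in the matched input sample; that correction is omitted here to keep the example minimal.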