Role of UPF1-LIN28A interaction during early differentiation of pluripotent stem cells

UPF1 and LIN28A are RNA-binding proteins involved in post-transcriptional regulation and stem cell differentiation. Most studies on UPF1 and LIN28A have focused on the molecular mechanisms of differentiated cells and stem cell differentiation, respectively. We reveal that LIN28A directly interacts with UPF1 before UPF1-UPF2 complexing, thereby reducing UPF1 phosphorylation and inhibiting nonsense-mediated mRNA decay (NMD). We identify the interacting domains of UPF1 and LIN28A; moreover, we develop a peptide that impairs UPF1-LIN28A interaction and augments NMD efficiency. Transcriptome analysis of human pluripotent stem cells (hPSCs) confirms that the levels of NMD targets are significantly regulated by both UPF1 and LIN28A. Inhibiting the UPF1-LIN28A interaction using a CPP-conjugated peptide promotes spontaneous differentiation by repressing the pluripotency of hPSCs during proliferation. Furthermore, the UPF1-LIN28A interaction specifically regulates transcripts involved in ectodermal differentiation. Our study reveals that transcriptome regulation via the UPF1-LIN28A interaction in hPSCs determines cell fate.

For all statistical analyses, confirm that the following items are present in the figure legend, table legend, main text, or Methods section.

The exact sample size (n) for each experimental group/condition, given as a discrete number and unit of measurement
A statement on whether measurements were taken from distinct samples or whether the same sample was measured repeatedly
The statistical test(s) used AND whether they are one- or two-sided. Only common tests should be described solely by name; describe more complex techniques in the Methods section.
A description of all covariates tested
A description of any assumptions or corrections, such as tests of normality and adjustment for multiple comparisons
A full description of the statistical parameters including central tendency (e.g. means) or other basic estimates (e.g. regression coefficient) AND variation (e.g. standard deviation) or associated estimates of uncertainty (e.g. confidence intervals)
For null hypothesis testing, the test statistic (e.g. F, t, r) with confidence intervals, effect sizes, degrees of freedom and P value noted. Give P values as exact values whenever suitable.
For Bayesian analysis, information on the choice of priors and Markov chain Monte Carlo settings
For hierarchical and complex designs, identification of the appropriate level for tests and full reporting of outcomes
Estimates of effect sizes (e.g. Cohen's d, Pearson's r), indicating how they were calculated
Our web collection on statistics for biologists contains articles on many of the points above.

Software and code
Policy information about availability of computer code
Data collection

Data analysis
For manuscripts utilizing custom algorithms or software that are central to the research but not yet described in published literature, software must be made available to editors and reviewers. We strongly encourage code deposition in a community repository (e.g. GitHub). See the Nature Portfolio guidelines for submitting code & software for further information.

For RNA sequencing data, we pre-processed the raw reads from the sequencer to remove low-quality reads and adapter sequences before analysis and aligned the processed reads to Homo sapiens (GRCh37) using HISAT v2.1.0 (ref. 51). HISAT utilises two types of indices for alignment (a global whole-genome index and tens of thousands of small local indexes). These two types of indices are constructed using the same Burrows-Wheeler transform (BWT)/graph FM index (GFM) as Bowtie2. Transcript assembly was performed using StringTie v1.3.4d. Based on this result, the expression abundance of transcripts and genes was calculated as read count or Fragments Per Kilobase of exon per Million fragments mapped (FPKM) value per sample. Differentially expressed genes (DEGs) were analysed by the ratio of FPKM or using DESeq2 with read counts. The log2 fold-change values of the genes were converted into a cumulative frequency curve using the R function 'ecdf' (R v4.0.5). Random genes were selected using the R function 'sample' (R v4.0.5). Gene functional classification and Gene Ontology (GO) analysis were performed using g:Profiler. Sankey diagrams and dot plots for GO analysis were plotted using https://www.bioinformatics.com.cn/srplot, an online platform for data analysis and visualization.
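The FPKM calculation and the cumulative frequency curve of log2 fold-changes described above can be illustrated with a minimal Python sketch. It is not the pipeline used in this study, which relied on StringTie and the R function 'ecdf'. The gene names, FPKM values and the pseudocount of 1 are hypothetical and chosen only for illustration; the source does not state whether a pseudocount was applied.

```python
import math
from bisect import bisect_right


def fpkm(fragments, exon_length_bp, total_mapped_fragments):
    """Fragments Per Kilobase of exon per Million mapped fragments."""
    # 1e9 = 1e3 (bp per kilobase) * 1e6 (fragments per million)
    return fragments * 1e9 / (exon_length_bp * total_mapped_fragments)


def log2_fold_change(treated, control, pseudocount=1.0):
    """log2 ratio of two abundances.

    The pseudocount is an assumption made here to avoid log(0);
    it is not taken from the source.
    """
    return math.log2((treated + pseudocount) / (control + pseudocount))


def ecdf(values):
    """Return F(x) = fraction of values <= x, analogous to R's ecdf()."""
    ordered = sorted(values)
    n = len(ordered)

    def cumulative_fraction(x):
        # bisect_right counts how many sorted values are <= x
        return bisect_right(ordered, x) / n

    return cumulative_fraction


# Hypothetical FPKM values for four genes in two conditions
control_fpkm = {"geneA": 10.0, "geneB": 4.0, "geneC": 0.0, "geneD": 7.0}
knockdown_fpkm = {"geneA": 21.0, "geneB": 4.0, "geneC": 3.0, "geneD": 3.0}

lfc = {
    gene: log2_fold_change(knockdown_fpkm[gene], control_fpkm[gene])
    for gene in control_fpkm
}
F = ecdf(lfc.values())

for gene, value in sorted(lfc.items(), key=lambda item: item[1]):
    print(f"{gene}: log2FC = {value:+.2f}, cumulative fraction = {F(value):.2f}")
```

Plotting F against the sorted log2 fold-changes gives the cumulative frequency curve; a shift of the curve for one gene set relative to randomly selected genes indicates a global change in abundance for that set.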
For SPR data, primary data were collected using the iMSPR measurement tool and data analysis was performed using TraceDrawer v1.9.

The axis labels state the marker and fluorochrome used (e.g. CD4-FITC).
The axis scales are clearly visible. Include numbers along axes only for bottom left plot of group (a 'group' is an analysis of identical markers).
All plots are contour plots with outliers or pseudocolor plots.
A numerical value for number of cells or percentage (with statistics) is provided.
HeLa cells (KCLB, 10002), 293T cells (KCLB, 21573), PA-1 cells (ATCC, CRL-1572), CHA-hES15 (CVCL, 9741), H9 cells (WiCell), and Pro2 (iPSCs kindly gifted by Dr. Kwang-Soo Kim of Harvard University).
Cell lines were purchased from the Korean Cell Line Bank (KCLB) and ATCC. CHA-hES15 and Pro2 cells were gifts from CHA University and Harvard University, respectively, as described in the Methods section.
All cell lines were routinely tested and confirmed to be free of contamination.
Name any commonly misidentified cell lines used in the study and provide a rationale for their use.
Positive cells were determined using flow cytometry (FACSCanto, BD Pharmingen). Data were analysed using FlowJo v10 software.
The proportion of living cells was always higher than 95%.
Cells were selected based on their forward scatter (FSC-A) and side scatter (SSC-A) properties. Subsequently, cell debris and

was used for data collection.