Post-transcriptional remodelling is temporally deregulated during motor neurogenesis in human ALS models

Mutations causing amyotrophic lateral sclerosis (ALS) strongly implicate regulators of RNA-processing that are ubiquitously expressed throughout development. To understand the molecular impact of ALS-causing mutations on early neuronal development and disease, we performed transcriptomic analysis of differentiated human control and VCP-mutant induced pluripotent stem cells (iPSCs) during motor neurogenesis. We identify intron retention (IR) as the predominant splicing change affecting early stages of wild-type neural differentiation, targeting key genes involved in the splicing machinery. Importantly, IR occurs prematurely in VCP-mutant cultures compared with control counterparts; these events are also observed in independent RNAseq datasets from SOD1- and FUS-mutant motor neurons (MNs). Together with related effects on 3’UTR length variation, these findings implicate alternative RNA-processing in regulating distinct stages of lineage restriction from iPSCs to MNs, and reveal a temporal deregulation of such processing by ALS mutations. Thus, ALS-causing mutations perturb the same post-transcriptional mechanisms that underlie human motor neurogenesis. HIGHLIGHTS Intron retention is the main mode of alternative splicing in early differentiation. The ALS-causing VCP mutation leads to premature intron retention. Increased intron retention is seen with multiple ALS-causing mutations. Transcriptional programs are unperturbed despite post-transcriptional defects. eTOC BLURB Luisier et al. identify post-transcriptional changes underlying human motor neurogenesis: extensive variation in 3’ UTR length and intron retention (IR) are the early predominant modes of splicing. The VCP mutation causes IR to occur prematurely during motor neurogenesis and these events are validated in other ALS-causing mutations, SOD1 and FUS.


SUMMARY
Mutations causing amyotrophic lateral sclerosis (ALS) strongly implicate regulators of RNA-processing that are ubiquitously expressed throughout development. To understand the molecular impact of ALS-causing mutations on early neuronal development and disease, we performed transcriptomic analysis of differentiated human control and VCP-mutant induced pluripotent stem cells (iPSCs) during motor neurogenesis. We identify intron retention (IR) as the predominant splicing change affecting early stages of wild-type neural differentiation, targeting key genes involved in the splicing machinery. Importantly, IR occurs prematurely in VCP-mutant cultures compared with control counterparts; these events are also observed in independent RNAseq datasets from SOD1-and FUS-mutant motor neurons (MNs). Together with related effects on 3'UTR length variation, these findings implicate alternative RNAprocessing in regulating distinct stages of lineage restriction from iPSCs to MNs, and reveal a temporal deregulation of such processing by ALS mutations. Thus, ALScausing mutations perturb the same post-transcriptional mechanisms that underlie human motor neurogenesis.

HIGHLIGHTS
• Intron retention is the main mode of alternative splicing in early differentiation.
• The ALS-causing VCP mutation leads to premature intron retention.
• Increased intron retention is seen with multiple ALS-causing mutations.
• Transcriptional programs are unperturbed despite post-transcriptional defects.

eTOC BLURB
Luisier et al. identify post-transcriptional changes underlying human motor neurogenesis: extensive variation in 3' UTR length and intron retention (IR) are the early predominant modes of splicing. The VCP mutation causes IR to occur prematurely during motor neurogenesis and these events are validated in other ALScausing mutations, SOD1 and FUS.

INTRODUCTION
Amyotrophic lateral sclerosis (ALS) is a rapidly progressive and incurable condition, which leads to selective degeneration of motor neurons (MNs). ALS-causing mutations implicate crucial regulators of RNA-processing, which are normally expressed throughout development, in the underlying pathogenesis. This raises the hypothesis that post-transcriptional changes occurring at much earlier stages of life, including neurodevelopment, may play a pivotal role in the underlying molecular pathogenesis of ALS. Understanding the impact of ALS-causing mutations on the early stages of neurodevelopment and disease will allow us to elucidate the initiating molecular events. This may guide the development of new therapies targeting primary mechanisms before the disease progresses too far.
Both development and homeostasis of neurons fundamentally rely on the precise implementation of cell-type and stage-specific alternative RNA processing, including alternative splicing (AS) and alternative polyadenylation (APA) (Mansfield and Keene, 2011;Raj et al., 2014;Tian and Manley, 2017;Yap et al., 2016;Zhang et al., 2016). Several mechanisms of AS are established including exon-skipping, mutually exclusive exons and intron retention (IR). Alternative exon usage features particularly in neurons to regulate protein diversity and function (Nilsen and Graveley, 2010). Neural cell-types exhibit a higher proportion of retained introns compared with other tissues and there is an expanding body of evidence demonstrating a functional role for IR both in neuronal development and homeostasis (Braunschweig et al., 2014;Buckley et al., 2011;Mauger et al., 2016;Yap et al., 2012). Increased IR during neuronal differentiation has been shown to down-regulate the expression of transcripts that are unnecessary for mature neuronal physiology (Braunschweig et al., 2014). A recent study showed evidence for post-transcriptional processing of intron-retaining transcripts in response to neuronal activity (Mauger et al., 2016). APA is an alternative mode of RNAprocessing that generates distinct 3' termini most frequently in the 3' untranslated region (3' UTR) of mRNA, thereby engendering isoforms of variable 3' UTR length. 3' UTRs serve as a key platform in the RNA regulatory network controlling mRNA translational efficiency, localisation and stability (Fabian et al., 2010;Gupta et al., 2014;Will et al., 2013). 3' UTRs harbour extensive tissue-specific length variability that significantly affect their function (Derti et al., 2012). Of all human cell-types, brain-specific isoforms have the longest 3' UTRs (Miura et al., 2013;Wang and Yi, 2014;Zhang et al., 2005). Moreover, mutations perturbing the mRNA secondary structure and sites of mRNA-miRNA interactions can lead to neurodegeneration (Ward and Kellis, 2012). In spite of these findings, the role of 3' UTRs regulation and IR in the context of MN development and disease has remained understudied compared to other forms of alternative splicing.
AS and APA are coordinated by the action of individual or combinations of trans-acting RNA binding proteins (RBPs) that bind to specific sites within (pre-) mRNA (Gabut et al., 2008;Licatalosi et al., 2008). RBPs mediate mRNA splicing, nuclear export and localization. Accumulating evidence implicates RBPs as key regulators of both neurodevelopment and specific forms of neurodegeneration.
Indeed, mutations in several RBPs including TDP-43, FUS and TAF15 have been causally linked to familial form of ALS leading to RNA-processing defects in mouse models (Kapeli et al., 2017;Qiu et al., 2014).
However, given that these RBPs are ubiquitously expressed during development, we hypothesized that aberrant splicing and APA play important roles in both development and disease of the nervous system. Against this background we sought to understand how different modes of AS and APA feature in human motor neurogenesis and to examine systematically the impact of ALS-causing mutations on post-transcriptional remodeling during lineage restriction. By combining directed differentiation of induced pluripotent stem cells (iPSCs) with time-resolved RNA sequencing, we first identified the sequential programme of post-transcriptional remodeling that underlies human motor neurogenesis. This allowed us to compare VCP-mutant (VCP mu ) to control cultures during MN differentiation and identify mutation-dependent splicing deregulation in a developmental stage-specific manner.
Our findings uncover early post-transcriptional perturbations during motor neurogenesis in ALS, which may induce a state of compensated dysfunction that later impacts motor neuronal integrity, contributing to disease onset and/or progression.

Intron retention is the predominant mode of splicing in early motor neurogenesis
To examine post-transcriptional changes during human motor neurogenesis, we performed high-throughput RNA-sequencing (RNA-seq) analyses of polyadenylated RNA isolated from induced pluripotent stem cells (iPSC; day 0), neural precursors (NPC; day 7), 'patterned' precursor motor neurons (ventral spinal cord; pMN; day 14), post-mitotic but electrophysiologically immature motor neurons (MN; day 21), and electrophysiologically mature MNs (mMNs; day 35) derived from two patients with the ALS-causing VCP gene mutation and two healthy controls ( Figure 1A; 31 samples from 5 time-points and 3 genotypes; 2 clones from 2 healthy controls and 3 clones from 2 ALS patients with VCP mutations: R155C and R191Q). Cellular samples from each stage of MN differentiation have been extensively characterised as previously reported (Hall et al., 2017). Using a set of 19 key gene markers of spinal MN maturation and embryonic development we further confirmed a prior finding that iPSC-derived mMNs resemble fetal rather than adult MNs (Ho et al., 2016) (Figures S1A and S1B). Unsupervised hierarchical clustering (Spearman rank correlation and complete clustering) of the 31 samples using 15,989 reliably expressed genes segregated samples based on their developmental stage within the MN lineage rather than by mutant or control genetic background ( Figure 1B).
We next examined the temporal dynamics of alternative splicing (AS) during MN differentiation. Using the RNA-seq pipeline VAST-TOOLS (Irimia et al., 2014) we identified 1,599 and 1,507 AS events over time in control and VCP mu samples respectively (Figures S1C and S1D). Consistent with previous studies, > 60% of AS events at later stages of MN terminal differentiation were alternative exon and intron removal events (Figures 1C and S1C,D) (Yap et al., 2012;Zhang et al., 2016). We found that control and VCP mu samples exhibit a similar progressive increase in cassette exon inclusion over time ( Figure 1D) in genes involved in cellular component organization and axonogenesis (Figure 1E).
In contrast, IR accounted for >65% of AS events at the earlier phase of lineage restriction from NPC to pMN ( Figure 1C). The expression of intron-retaining transcripts increased at the transition from NPC to pMN (i.e. neural patterning) in control samples ( Figure 1F). Over 60% of all IR events were common between VCP and control samples ( Figure S1E, upper left) -these largely involve genes related to RNA-processing and splicing ( Figure 1G). Intriguingly, VCP samples exhibited a striking increase in the expression of intron-retaining transcripts at the earlier developmental transition from iPSC to NPCs (i.e. neural induction; Figure 1F). We found that 53% of IR events started upon patterning in control samples; in contrast 72% of the IR events in VCP mu cultures started upon neural induction (Figure S1E, right). Importantly we found that 60% of the events that started upon neural induction in VCP mu samples matched with those starting later between induction and patterning in control samples. The remaining events were either unique to VCP mu cultures or occurred at a different time point. This result indicates that VCP samples exhibit similar IR activity as seen in control counterparts during differentiation, but at a premature stage.
We next conducted Integrative Genomics Viewer (IGV)-guided manual curation to remove low coverage and spurious IR (e.g. some events annotated as IR were found to be alternative 3'UTR after visual inspection). We identified 167 highly confident IR events and assigned a retention value to every intron relative to expression of flanking exons (see Methods). Examining the degree of IR for these selected 167 events across all experimental samples shows a systematically higher percentage of IR in VCP samples at NPC stage compared to control samples of any stage, and confirmed our aforementioned findings of VCP mutation-related premature IR ( Figure 1H). We next analyzed our high-confidence set of manually curated IR events in two independent transcriptomic data-sets from human embryonic stem cells (Wu et al., 2010;Yao et al., 2017) confirmed that accumulation of IR during early neurogenesis is a generalizable phenomenon observed across diverse experimental models (Figures 1I and S1F).
VCP plays a key role in the cell cycle and an increased percentage of IR may result from an established link between splicing efficiency and cell cycle (Heyn et al., 2015). We therefore examined the possibility that the apparent acceleration in AS programs observed in VCP mu samples might be explained by a mutation-dependent increase in cell-cycle activity. In order to address this possibility, we employed flow cytometry on iPSCs, NPCs and pMNs. These experiments effectively excluded differences in cell cycle between VCP mu and control samples (Figures S1G-I).
Collectively, these findings demonstrate that IR is the predominant splicing change that affects early stages of neural lineage restriction. Importantly, IR occurs prematurely in VCP mu cultures, which cannot be explained by differences in cellcycle activity.

ALS-causing mutations
We next examined the impact of other ALS-causing mutations by analyzing independent transcriptomic data-sets for familial forms of ALS either caused by mutant SOD1 (n=5; 2 patient-derived SOD1A4V and 3 isogenic control MN samples where the mutation has been corrected) (Kiskinis et al., 2014a) or FUS (n=6; 3 patient-derived FUS R521G and 3 unaffected controls MNs) (Kapeli et al., 2016). We confirmed that IR is the predominant mode of splicing in MNs derived from both mutants ( Figure S2A). Importantly, our high-confidence set of 167 IR events affected by premature IR during VCP mu MN differentiation are also affected in FUS and SOD1 mutant MNs (Figure 2A). 74 and 59 IR events exhibit a significant increase in SOD1 and FUS mutant MNs respectively compared to controls (P-value < 0.01; Fisher count test; Figure S2B). These together form a significantly connected network of proteins enriched in RNA-splicing and translation (Figures 2B and 2C). The extent of splicing for those introns significantly retained in both SOD1 mu and FUS mu is depicted in a heatmap in Figure 2D. This demonstrates that the premature IR events identified in our ALS model also feature in MNs carrying other ALS-causing mutations.
Cumulatively, these findings confirm the generalizability of aberrant IR in different genetic forms of ALS.
SFPQ represents the most significant premature IR event identified during motor neurogenesis in our VCP mu cultures and was also found to be highly statistically significant in SOD1 and FUS MNs. The SFPQ intron 8/9, ~9 kb in length, was highly retained during neural patterning in control samples, prematurely (during neural induction) in VCP samples, and in MNs harbouring mutations in FUS and SOD1 ( Figures 3A and 3B). We validated SFPQ intron retention and its temporal deregulation by the VCP-mutation by qPCR analysis of RNA isolated from multiple independent iPSC lines (3 clones from 3 healthy controls and 4 clones from 2 patients with VCP mutations) ( Figure 3C and S3A). The key splicing regulators FUS and DDX39 are similarly strongly affected during early lineage restriction by VCP mutation, and by SOD1 and FUS mutations (Figures 3D-I). Changes in IR event were not accompanied by any significant difference in gene expression levels between VCP mu cultures and controls ( Figure S3B). From these results we conclude that distinct ALS-causing gene mutations lead to similar IR events in genes known to have pivotal roles in motor neurogenesis and ALS (Thomas-Jinu et al., 2017;Vance et al., 2009).

Peaks of transient 3' UTR remodelling and IR occur sequentially in early neurogenesis
Splicing and polyadenylation regulation are often interconnected. To examine whether 3' UTR length varies throughout the distinct stages of MN differentiation and how the VCP mutation affect this process, we first extended the current catalogue of Ensembl 3' UTR annotation using our RNA-seq data (as detailed in the methods).
We then analysed their maximum expressed length over the course of differentiation.
As expected, the 12,364 genes reliably expressed across all time-points exhibit longer 3' UTR in mMNs compared to iPSC for both control and VCP mu samples.
Notably, however, these exhibit shorter 3' UTR in NPC compared to iPSC in control samples only (P-value < 0.01; Wilcoxon test; Figure 4A).
To investigate further the dynamics of 3' UTR processing, we analysed tandem poly(A) sites that were located in the same terminal exon. We used the number of reads mapped to the terminal 100 nt segment of each 3' UTR as a proxy for the expression level of a 3' UTR isoform to analyse the relative use of alternative poly(A) sites in different conditions. Overall we observed that 822 genes undergo significant 3' UTR changes across the entire time-course of differentiation in either control and/or VCP mu cultures ( Figure 4B). In 171 of these genes, we observed increased found that genes with statistically significant 3' UTR lengthening in MNs were enriched in GO terms representing mRNA splicing ( Figure 4C). About 50% of the genes exhibiting 3' UTR lengthening from iPSC to mMNs were shared between control and VCP mu samples ( Figure 4B) with <10 genes displaying significant proximal or distal 3' UTR shifts when directly comparing control to VCP cultures at the at MN stage ( Figure S4A). Although the ALS-causing VCP-mutation overall leads to similar 3' UTR lengthening upon terminal differentiation as control samples, it is noteworthy that in several cases, VCP mu samples exhibited a significant This is further confirmed when directly comparing VCP to control samples at the NPC stage where >200 promoter-distal shifts were detected (Figures S4A and S4B). It is interesting to note that the few 3' UTR-shortening events at the VCP mu NPC stage include the gene HTT, which is required for neuronal development (Kerschbamer and Biagioli, 2015) ( Figure S4C). These data demonstrate that extensive 3' UTR length variation peaks prior to IR and characterises the transition from iPSCs to NPCs. Importantly, VCP samples do not exhibit any apparent similar process, which may either be explained by its absence or premature occurrence, which therefore escapes detection in our experimental paradigm.

Transcriptional programs underlying human motor neurogenesis remain unperturbed by the VCP-mutation
Having identified major differences in post-transcriptional remodelling in ALS samples compared to control, we next sought to understand their functional consequences. Given the finding that VCP mutation deregulates the temporal dynamics of RNA processing during MN differentiation, it is surprising that ALS genetic background does not drive measurable transcriptional differences during motor neurogenesis ( Figure 1B). Noting that hierarchical clustering is often dominated by a single transcriptional program, we applied singular value decomposition to the expression matrix (15,989 genes across 31 samples) to identify major orthogonal transcriptional programs and their associated biological processes underlying motor neurogenesis (Luisier et al., 2014). 69% of the variance in gene expression was captured by the first three components ( Figure  These results highlight the orthogonal cellular components that operate in parallel: progressive neurogenesis (component 1), transient neural induction and patterning (component 2) and terminal differentiation with some contribution to induction (component 3). Importantly, control and VCP-mutant samples behave similarly in these first three components. A multivariate linear analysis confirmed that time rather than VCP mutation is the most explanatory variable in components 1, 2 and 3 ( Figure S5B). In summary, we find that transcriptional programs reflect the developmental trajectory of MNs as specific pathways are activated in a developmental stage-dependent manner. However these developmental programs are not disrupted by the VCP mutation despite post-transcriptional defects, suggesting an underlying robustness of these transcriptional programs and/or compensatory mechanisms.

DISCUSSION
The objectives of this research study were twofold: to resolve the alternative RNA processing events underlying distinct stages of MN lineage restriction and systematically examine the influence of ALS-causing VCP mutation on this process.
In order to achieve this, we integrated the directed differentiation of patient-specific iPSCs into spinal MNs with RNA sequencing and comprehensive bioinformatic examination. We show that a robust splicing program underlies MN development as summarised in Figure 6 (Figure 6). We show that a peak of 3' UTR remodelling, including 3' UTR shortening, precedes IR. Specifically we show that significant 3' UTR length variation affects transcript structure as cells exit pluripotency during neural induction. This precedes the peak of intron-retaining transcript accumulation that occurs upon neural patterning to the ventral spinal cord. Following this, progressive 3' UTR lengthening, cassette exon inclusion and intron splicing then characterise the remaining phases of terminal differentiation to MNs as previously shown (Yap et al., 2012;Zhang et al., 2016).

Transient 3' UTR remodelling characterises neural induction
3' UTRs play key roles in post-transcriptional gene expression regulation and mRNAs expressed in brain tissues generally have longer 3' UTRs compared to other tissues (Miura et al., 2013;Wang and Yi, 2014;Zhang et al., 2005). Indeed it is also established that progressive lengthening of mRNA 3' UTRs is observed during mouse embryonic development (Ji et al., 2009). Here we show extensive 3' UTR length variation during neural induction from human iPSCs. Unexpectedly, we found

Overall relevance of aberrant IR to ALS
We show that the timing of the post-transcriptional program is perturbed in samples with ALS-causing VCP mutation, with a premature IR and to some extent 3' UTR length variation. Interestingly IR was the most prominent change in transcript structure at early stages of lineage restriction and also exhibited striking premature initiation in VCP mutants. Additionally key RNA splicing regulators that were targeted by the accelerated splicing program in VCP mu samples also exhibited widespread IR in MNs harbouring mutations in other ALS-causing genes, including SOD1 and FUS.
While the accelerated splicing program in VCP-mutant samples might be linked to neuronal degeneration, the lack of transcriptional changes indicates that the dysfunctions in IR and 3' end processing do not affect neuronal development in a major way. Thus, the post-transcriptional changes appear to be compensated during the process of neuronal differentiation. This raises the important possibility that early mutation-dependent changes in post-transcriptional remodelling may induce a state of compensated dysfunction, which could culminate in a disease phenotype later once the compensatory mechanisms are diminished. We have previously shown that the phenotype of VCP mutation can be detected only once MNs have begun extending axons (Hall et al., 2017), indicating that the process of differentiation may sensitise the MNs to the pre-existing post-transcriptional defects. Therefore it is reasonable to think that accelerated IR represents subtle manifestations of a dysregulated pathway of the developing neurons, which eventually contributes to the increased vulnerability of the differentiated MNs.
Notably the most significant intron-retaining transcript in ALS samples was SFPQ, which encodes a protein that plays key roles in salient ALS-related pathways including RNA transcription, splicing, 3' end processing and axon viability (Arnold et al., 2013;Clark et al., 2016;Cosker et al., 2016;Danckwardt et al., 2007;Dong et al., 2007;Hall-Pogar et al., 2007;Hirose et al., 2014;Masuda et al., 2016;Patton et al., 1993). SFPQ IR at early stages of MN development is required to ensure the integrity of downstream splicing events, as suggested by IR in control motor neurogenesis. SFPQ IR may however be detrimental at the terminally differentiated MN stage where it plays key role in regulating mRNA axonal transport and axon viability (Cosker et al., 2016;Thomas-Jinu et al., 2017). Additionally SFPQ together with other proteins is required for nuclear paraspeckle integrity. It is therefore 14 reasonable to speculate that SFPQ IR in MN might contribute to paraspeckle formation that has been previously observed at an early stage of the ALS pathogenesis (Lourenco et al., 2015;Nishimoto et al., 2013;Sasaki et al., 2009). SFPQ plays diverse roles at different stages of MN differentiation and therefore IR may have context-dependent functional consequences.
Collectively, these findings are consistent with a model whereby ALS-causing mutations lead to aberrant IR and 3' end processing during motor neurogenesis, which may be a primary molecular event that ultimately contributes to the disruption of motor neuron integrity. Our findings also indicate that these early perturbations in post-transcriptional processing are well compensated, but this state of compensated dysfunction increases the vulnerability of differentiated MNs towards initiating disease onset and progression.  with confirmed VCP mutations (R155C and R191Q; total = 3 different hiPSC lines) and one clone from each of two healthy controls (total = 2 different hiPSC lines). an NPC stage (4-10 weeks) that produces only neurons upon further differentiation;

Contact for reagent and resource sharing
Further information and requests for resources and reagents should be directed to and will be fulfilled by the Lead Contact, Rickie Patani (rickie.patani@ucl.ac.uk).

Ethics Statement
Informed consent was obtained from all patients and healthy controls in this study.
Experimental protocols were all carried out according to approved regulations and guidelines by UCLH's National Hospital for Neurology and Neurosurgery and UCL's Institute of Neurology joint research ethics committee (09/0272).
Details of the lines used in this study are provided in Table S1. Two of the control lines used (control 2 and control 3) are commercially available and were purchased from Coriell (cat. number ND41866*C) and ThermoFisher Scientific (cat. number A18945) respectively.

Cell Culture
Induced PSCs were maintained on Geltrex (Life Technologies) with Essential 8 Medium media (Life Technologies), and passaged using EDTA (Life Technologies, 0.5mM). All cell cultures were maintained at 37C and 5% carbon dioxide.

Gene quantification and unsupervised characterisation of the data-set
The absolute quantification of the genes was performed using HTSeq count ( we plotted the expression on the vertical axis as a function of the time corresponding to each sample on the horizontal axis and coloring all samples corresponding to healthy controls gray, and those corresponding to VCP-mutants in magenta. Next we identified genes whose expression profiles correlated (Pearson correlation between individual gene expression profile and right singular vectors) and contributed (projection of each individual gene expression profile onto right singular vectors) most strongly (either positively or negatively) with the expression profile of the singular vectors. In order to identify representative genes for each singular vector, genes were ranked according to both projection and correlation scores. The highest (most positive scores in both projection and correlation) and lowest (most negative scores in both correlation and projection) motifs were selected for each singular vector using K-mean clustering for downstream Gene Ontology enrichment analysis.

Alternative 3' UTR identification from RNA-seq
Nucleotide-level stranded coverage was obtained for each of the 31 samples using genomecov from the BEDTools suite (Quinlan, 2014). Next continuously transcribed regions were identified using a sliding window across the genome requiring a minimum coverage of 7 reads in more than 80 positions per window of 100 bp; neighbouring regions separated by low-mappable regions were merged as previously described (Miura et al., 2013). Expressed fragments were then associated with overlapping 3' UTR using the latest hg19 Ensembl versions v75 (Flicek et al., 2012).
Isolated expressed regions that did not overlap with any feature were further associated with the closest 3' UTR if (1) the closest annotated feature was nothing but a 3' UTR, (2) if the strand of the expressed region was in line with the strand of the closest 3' UTR, and (3) if the distance to the 3' UTR was less than 10'000 kb which is the range of intragenic distance (data not shown). The resulting extended 3' UTRs were subjected to extensive filtering to exclude potential intragenic transcription, overlapping transcripts, and retained introns as previously described (Miura et al., 2013). We finally intersected the longest 3' UTR segment annotated with RNA-seq data with a poly(A) site annotation built using reads from 3'-end sequencing libraries in human samples (Gruber et al., 2016) to obtain putative pA within longest 3' UTR segment of each transcript.

Identification of changes in the use of proximal and distal poly(A) sites
We then used the number of reads mapped to -300 nt terminal region of each 3' UTR isoforms as a proxy for the 3' UTR isoform expression level. The density of mapped reads in -300 nt terminal region of 3' UTR isoform is bimodal, with a low-density peak probably corresponding to background transcription i.e. 3' UTR isoforms of low abundance or 3' UTR isoforms to which reads were spuriously mapped, and a highdensity peak corresponding to expressed 3' UTR isoforms. In order to identify reliably expressed 3' UTR isoforms in the study, a two-component Gaussian mixture was fitted to the data using the R package mclust(Fraley and Raftery); an isoform was called reliably expressed if in both replicates had less than 1% chance of belonging to the background category.
In order to identify transcripts that show a marked change in the pA site usage between conditions, we scored the differences in proximal-to-distal poly(A) site usage using the following two scores: The statistical significance of the changes in proximal-to-distal poly(A) site ratio between two conditions was assessed by Fisher's exact count test using summed-up raw read counts of promoter-proximal versus promoter-distal 3' UTR isoforms originating either conditions. We adjusted the P-Value controlling for False Discovery Rate (FDR) of 0.01. We restricted our analysis on Ensembl transcripts containing at least two 3' UTRs generated by tandem polyadenylation expressed in conditions of interest. Proximal shifts were then selected when " ≤ −1, ' ≤ −15% and FDR<0.01; distal shift were selected when selected when " ≥ 1, ' ≥ 15% and FDR<0.01.

Splicing analysis
The identification of all classes of alternative splicing (AS) events in motor neuron differentiation was performed with the RNA-seq pipeline vast-tools (Irimia et al., conditions, we required a minimum average ΔPSI (between the paired replicates) of at least 10% and that the transcript targeted by the splicing event in question to be reliably expressed in all samples from the conditions compared i.e enough read coverage in all samples of interest. Intron retention (IR) focussed analysis has next been performed on 167 IR events for which a percentage of IR has been calculated as the fraction of intron mapping reads to the average number of reads mapping to the adjacent 5' and 3' exons normalised to the length of the respective intron and exons. A Fisher count test P-value has been obtained when testing for differential IR between conditions.

Gene ontology enrichment analysis
GO enrichment analysis was performed using classic Fisher test with topGO Bioconductor package (Alexa and Rahnenfuhrer, 2016). Only GO terms containing at least 10 annotated genes were considered. A p-value of 0.05 was used as the level of significance. On the figures, top significant GO terms were manually selected by removing redundant GO terms and terms which contain fewer than 5 significant genes.

Network analysis
Experimental protein-protein interaction information has been retrieved from the STRING data-base (Szklarczyk et al., 2011) andvisualised in Cytoscape (Shannon et al., 2003).

Reverse transcription, qPCR and intron retention validation
Reverse transcription was performed using the Revert Aid First Strand cDNA Synthesis Kit (ThermoFisher Scientific) using 1μg of total RNA and random hexamers. qPCR was performed using the Power SYBR Green Master Mix (ThermoFisher Scientific) and the Agilent Mx3000P QPCR System. Primers used are listed in Table S3. For each intron retention validation to screen three primers pairs were used: (1) primer pair F1 R1 (intron spanning, across exon-exon junction) were used to to analyse gene expression levels; (2) primer pair F2 R2 (one primer on an exon flanking the intron to be analysed, the other on the intron) was used to assess the levels of intron retention; (3) primer pair F3 R2 (both primers on the exons flanking the intron of interest, if possible designed across the exon-exon junction) was used to measure levels of the spliced transcript. RT-minus samples were used as negative controls. Levels of intron retention (primer pair F2R2) were normalised over the expression level of each individual gene (primer pair F1R1). Gene expression levels were measured using the ddCt method using three housekeeping genes (GAPDH, POLR2B and UBE2D3).

Cell Cycle analysis
Cell cycle analysis was performed by flow cytometry according to standard protocols.
Briefly, cells were dissociated using either Accutase or Trypsin (Life Technologies), washed and fixed in suspension using 70% cold Ethanol. Cells were stained using propidium iodide (PI, 50μg/ml, Sigma Aldrich) in the presence of RNase A (10μg/ml, Sigma Aldrich). Cells were analysed using a BD FACS Calibur (BD Bioscience).
Doublets were excluded from analysis and 10,000 events were collected in the single cell gate per sample. Cell cycle data was analysed using the Multicycle module of FCS Express 6 (De Novo Software).