Aberrant interaction of FUS with the U1 snRNA provides a molecular mechanism of FUS induced amyotrophic lateral sclerosis

Mutations in the RNA-binding protein Fused in Sarcoma (FUS) cause early-onset amyotrophic lateral sclerosis (ALS). However, a detailed understanding of central RNA targets of FUS and their implications for disease remains elusive. Here, we use a unique blend of crosslinking and immunoprecipitation (CLIP) and NMR spectroscopy to identify and characterise physiological and pathological RNA targets of FUS. We find that U1 snRNA is the primary RNA target of FUS via its interaction with stem-loop 3 and provide atomic details of this RNA-mediated mode of interaction with the U1 snRNP. Furthermore, we show that ALS-associated FUS aberrantly contacts U1 snRNA at the Sm site with its zinc finger and traps snRNP biogenesis intermediates in human and murine motor neurons. Altogether, we present molecular insights into a FUS toxic gain-of-function involving direct and aberrant RNA-binding and strengthen the link between two motor neuron diseases, ALS and spinal muscular atrophy (SMA).

RNA processing is an essential part of gene expression, and disturbed RNA metabolism has been linked to several neurodegenerative diseases 1 . Among these diseases is amyotrophic lateral sclerosis (ALS), a relentless adult-onset disease characterised by loss of motor neurons in the motor cortex and spinal cord, leading to muscle weakness and eventually paralysis and death 2 . Mutations in the FUS gene are at the origin of FUS-linked ALS, and some are associated with a particularly aggressive disease phenotype with juvenile onset 3,4 . Insoluble FUS inclusions in neurons and glial cells represent the pathological hallmark of this type of disease. FUS pathology is also observed in frontotemporal dementia (FTD), where FUS co-aggregates with the other members of the FET-family, EWS and TAF15, as well as with their nuclear import receptor β2/transportin-1 (TNPO1) 5,6 . In this case, aggregation occurs in the absence of FUS gene mutations and may be caused by aberrant post-translational modifications 7 .
The FUS protein comprises two functional modules. A conserved N-terminal region of low complexity (LC) consisting of QGSY-rich and G-rich domains drives liquid-liquid phase separation (LLPS) and mediates protein-protein interactions 8 . In contrast, the C-terminal module is responsible for nucleic acid binding and contains two globular RNA-binding domains (RBD), an RNA-recognition motif (RRM) and a zinc finger (ZnF), each embedded in RGG-rich sequences 9 . The last 29 amino acids constitute a PY-type nuclear localisation signal (NLS) 10 . As a ubiquitously expressed, predominantly nuclear protein, FUS may regulate gene expression at different levels. Considerable evidence links FUS to the splicing machinery, especially to the U1 snRNP and U11/U12 di-snRNP 11,12 . However, few aspects of these interactions are known. Consistent with a role as splicing factor, loss of FUS induces widespread splicing alterations, affecting both U2-type and U12-type introns 12,13 . While we previously reported that the LC domain of FUS is sufficient to recruit the U12-type spliceosome, how FUS regulates U2-type splicing remains elusive.
ALS-associated mutations typically disrupt the nuclear localisation signal of FUS, leading to cytoplasmic mislocalisation and eventually the formation of aggregates 10 . Recent mouse models strongly suggest that FUS causes motor neuron degeneration through a cytoplasmic toxic gain-of-function, although a reduction of nuclear FUS could contribute to the disease 14-16 . RNA binding is required for full FUS toxicity in various ALS models 17,18 . Hence, a detailed characterisation of both cytoplasmic and nuclear RNA interaction networks of FUS is not only key to better understanding its physiological function but could also provide valuable insights into the molecular mechanisms underlying neurodegeneration in ALS.
In this study, we perform a combination of CLIP experiments to identify physiological as well as pathological RNA targets of FUS. We identify the spliceosomal U1 snRNA as the major FUS target in the nucleus and provide the solution structure for this RNA-mediated mode of U1 snRNP recognition: The RRM of FUS contacts stem-loop 3 (SL3) of the U1 snRNA, which protrudes from the globular core of the particle. Furthermore, we show that in ALS models, FUS aberrantly interacts with the U1 snRNA in the cytoplasm, leading to impaired snRNP biogenesis. These findings provide insights into the mechanism of FUS-dependent splicing regulation and suggest that impaired snRNP biogenesis molecularly links the motor neuron diseases ALS and SMA.

Results
An RBD-centric FUS CLIP approach. We performed CLIP-Seq 19 with three different FUS constructs to comprehensively identify direct RNA targets of FUS on a transcriptome-wide scale (Fig. 1a). Besides the full-length protein, we employed a FUS construct comprising only its RNA-binding module (amino acids 242-526) to study the importance of the LC region (aa 1-241), which enables FUS to form complexes with other hnRNPs ( Fig. 1b and Supplementary Fig. 1a, b). Such cofactors can modulate the binding landscape of an RNA-binding protein in vivo 20 . Finally, we aimed at identifying RNA targets that could be implicated in neurodegeneration by performing CLIP with cytoplasmic ΔLC-FUS harbouring the ALS-associated P525L mutation combined with a heterologous nuclear export signal (NES). All FUS constructs were Twin-Strep tagged and stably introduced in SH-SY5Y neuroblastoma cells by lentiviral transduction. This approach allowed us to purify FUS-RNA complexes to near homogeneity, which reduces the risk of false positives introduced by contaminating RNA-binding proteins 21 . The transgenes were expressed close to physiological levels ( Fig. 1c) and in the background of a FUS knockout (KO) to prevent competition for binding sites with endogenous FUS. Immunofluorescence confirmed the expected nuclear and cytoplasmic localisation, respectively (Fig. 1d-f and Supplementary  Fig. 1c). The CLIP experiments were performed in triplicate each and a no-cross-link control was included to monitor the specificity of the signal. As expected, the autoradiographs revealed strictly cross-link-dependent protein-RNA complexes migrating slightly slower than the free FUS constructs. RNA was isolated from the regions indicated by the red dashed lines and converted into cDNA libraries for high-throughput sequencing. To normalise the CLIP data to the input, we profiled the transcriptomes of all CLIP cell lines by RNA sequencing after depletion of ribosomal RNA.
CLIP-Seq reveals the U1 snRNA as a major FUS target. In agreement with previous CLIP studies, we found that the binding signatures of full-length FUS were evenly distributed along the entire length of transcripts with around 70% of the reads mapping to introns (Fig. 2a and Supplementary Fig. 2a), reflecting binding of FUS to pre-mRNAs 13,22 . In contrast, the percentage of intronic reads was reduced to ~40% in the ΔLC-FUS CLIP, suggesting that the LC domain promotes co-transcriptional binding of FUS to introns. This agrees with our recent finding that liquid-liquid phase separation by FUS is required for its association with chromatin 23 . In particular, the first intron of the hnRNPA2B1 pre-mRNA displays binding that is heavily dependent on the LC domain of FUS (Fig. 2b). Nevertheless, loss of this unstructured domain does not alter the widespread nature of FUS binding to exonic regions as shown for hnRNPA2B1 pre-mRNA or in the long non-coding RNA MALAT1, indicated by the similar CLIP peak distributions (Fig. 2b).
To define FUS-binding sites at single-nucleotide resolution, we exploited a characteristic feature of CLIP-Seq data: peptide remnants that remain cross-linked to the RNA cause the reverse transcriptase to introduce deletions in the library preparation step, thereby allowing for the identification of cross-link sites at single-nucleotide resolution 24 . Indeed, deletions represent the most common type of mutation in our data ( Supplementary  Fig. 2b). Hence, we looked for genomic positions where deletions are supported by at least three independent reads and overall mutation frequency was below 0.5 to exclude false positives due to single-nucleotide polymorphisms (see scheme in Fig. 2c). After clustering of proximal deletions (within a range of ten nucleotides) and only considering deletions identified in all biological replicates, we defined a set of 14,327 highly reproducible FUS-binding sites. These binding sites were preferentially located in single-stranded regions flanked by sequences with increased folding propensity ( Fig. 2d), consistent with the recently described specificity of the FUS RRM for stem-loop structures 9 . To normalise the CLIP data to input RNA-Seq data, we defined 200-nucleotide windows around the cross-link sites and calculated the ratio of CLIP-Seq reads to RNA-Seq reads (Fig. 2c). This analysis yielded significantly overlapping sets of highly enriched transcripts for the three biological replicates ( Supplementary  Fig. 2c). Again, consistent with the FUS-binding specificity, we found that spliceosomal snRNAs, which are rich in stem-loops, were significantly more enriched relative to messenger and long non-coding RNAs (Fig. 2e). Among the snRNAs, the U1 snRNA was the top target and represented by far the most enriched transcript in the whole dataset (Fig. 2f), confirming previous results that linked FUS to the U1 snRNP 11,25,26 and suggesting that the contact occurs via the snRNA. 
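The site-calling and normalisation logic described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the input layout (per-replicate dictionaries of deletion counts, mutation counts and read depth), the function names, the single-linkage clustering and the pseudocount are all assumptions made for illustration. Only the thresholds come from the text: at least three deletion-supporting reads, a mutation frequency below 0.5, a ten-nucleotide clustering range, support in every replicate and 200-nucleotide windows.

```python
def call_crosslink_sites(replicates, min_del_reads=3, max_mut_freq=0.5, max_gap=10):
    """Return (chrom, start, end) clusters of deletions found in every replicate.

    replicates: list of dicts, one per biological replicate (hypothetical layout),
    mapping (chrom, pos) -> (deletion_reads, mutated_reads, read_depth).
    """
    # Step 1: keep positions with enough deletion support and a mutation
    # frequency low enough to make a genomic polymorphism unlikely.
    hits = []
    for rep_idx, table in enumerate(replicates):
        for (chrom, pos), (n_del, n_mut, depth) in table.items():
            if n_del >= min_del_reads and depth > 0 and n_mut / depth < max_mut_freq:
                hits.append((chrom, pos, rep_idx))
    hits.sort()

    # Step 2: merge positions lying within max_gap nucleotides of the previous
    # position on the same chromosome (single-linkage clustering, an assumption).
    clusters = []
    for chrom, pos, rep_idx in hits:
        last = clusters[-1] if clusters else None
        if last and last["chrom"] == chrom and pos - last["end"] <= max_gap:
            last["end"] = pos
            last["reps"].add(rep_idx)
        else:
            clusters.append({"chrom": chrom, "start": pos, "end": pos, "reps": {rep_idx}})

    # Step 3: keep only clusters supported by all replicates.
    n_reps = len(replicates)
    return [(c["chrom"], c["start"], c["end"]) for c in clusters if len(c["reps"]) == n_reps]


def window_enrichment(site_pos, clip_reads, rnaseq_reads, window=200, pseudocount=1.0):
    """Ratio of CLIP-Seq to RNA-Seq reads in a window centred on a cross-link site.

    clip_reads / rnaseq_reads: read midpoint positions on the same transcript.
    The pseudocount (an assumption) avoids division by zero.
    """
    half = window // 2
    lo, hi = site_pos - half, site_pos + half
    n_clip = sum(lo <= p < hi for p in clip_reads)
    n_rna = sum(lo <= p < hi for p in rnaseq_reads)
    return (n_clip + pseudocount) / (n_rna + pseudocount)
```

The sketch omits library-size normalisation and statistical testing, both of which a real analysis would need before comparing enrichment values between transcripts.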
Using RNA-RNP immunoprecipitation (RIP), we found that FUS, but not an RNA-binding deficient FUS mutant, selectively interacts with the U1 snRNA, confirming that FUS exclusively contacts the RNA component of the U1 snRNP (Fig. 2g). In contrast, the interaction with the U11/U12 di-snRNP is RNA-independent, as the U11 snRNA co-purified with both constructs. This is in agreement with our previous study showing that the LC domain of FUS is sufficient to promote U12-type intron splicing and suggests that FUS may have evolved distinct mechanisms to regulate the splicing of U2-type and U12-type introns 12 . We then mapped the precise FUS-binding site on U1 snRNA by computing the number of cross-link-induced deletions for every nucleotide position along the U1 snRNA primary sequence (Fig. 2h) [...] this structural element represents the major contact point between FUS and U1 snRNP. This interaction site is not only accessible in the previous crystal structure of the U1 snRNP 27 but also in the recent cryo-EM structure of the human spliceosomal pre-B complex 28 . Such a mode of spliceosome recognition was never described and prompted us to further explore the molecular details of this interaction.
Fig. 1 An RBD-centric FUS CLIP approach. a Schematic representation of the Twin-Strep-tagged FUS constructs used for CLIP. b In addition to binding as a free molecule, FUS may be recruited to RNA via interactions with other RBPs. This may confer foreign specificity to FUS and promote binding to low-affinity binding sites. c Western blot showing expression of FUS CLIP constructs in comparison to endogenous FUS in the parental SH-SY5Y cell line, detected using anti-FUS antibodies. Tub1A2 served as a loading control. Relative FUS expression levels were determined by densitometry. n = 3. d-f Immunofluorescence confirmed the expected nuclear or cytoplasmic localisation of FUS CLIP constructs, visualised using anti-FUS antibodies. Nuclei were counterstained with DAPI. Scale bar = 10 µm, n = 1. Protein-RNA complexes were purified using Strep-Tactin beads, separated by SDS-PAGE and their purity was assessed by Coomassie staining. Autoradiographs of three biological replicates reveal characteristic shifts above the free protein. RNA was isolated from the regions of the membranes indicated by red dashed lines.
NATURE COMMUNICATIONS | (2020) 11:6341 | https://doi.org/10.1038/s41467-020-20191-3 | www.nature.com/naturecommunications
To further characterise the interaction between FUS and U1 snRNP, we truncated the protein to the region containing only the FUS RRM and RGG2 (aa 260-390, hereafter referred to as FUS RRM). FUS RRM binds to SL3 with a moderate binding affinity of 8 µM (Supplementary Fig. 3c). Upon the formation of the protein-RNA complex in the NMR tube, we observed chemical shift changes mainly occurring at the β-sheet surface of the RRM, at the α1-β2 loop and at the C-terminal extension containing RGG2 (Supplementary Fig. 3d, e). The same experiment was performed using a uniformly 15N-13C-labelled sample of SL3 to monitor the changes on the RNA resonances. The NMR signals (H8/H6 and H1′) corresponding to the 3′-part of the loop (A104-U107) were the most affected (Supplementary Fig. 4a-c).
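Chemical shift changes of this kind are commonly summarised per residue by combining the 1H and 15N shift differences into a single value. The sketch below shows one widespread convention. The text does not state which formula or cut-off was used, so the 15N scaling factor of 0.14, the mean-plus-one-standard-deviation threshold, the input layout and all function names are assumptions.

```python
import math
import statistics


def combined_csp(delta_h_ppm, delta_n_ppm, n_scale=0.14):
    """Combined 1H/15N chemical shift perturbation in ppm.

    n_scale down-weights the 15N dimension; 0.14 is a commonly used value,
    not one taken from the text.
    """
    return math.sqrt(delta_h_ppm ** 2 + (n_scale * delta_n_ppm) ** 2)


def perturbed_residues(free, bound, n_scale=0.14, n_sd=1.0):
    """Residues whose perturbation exceeds mean + n_sd * SD (assumed cut-off).

    free / bound: dicts mapping a residue label to (1H ppm, 15N ppm).
    Residues missing from either state (e.g. broadened peaks) are skipped.
    """
    csp = {
        res: combined_csp(
            bound[res][0] - free[res][0], bound[res][1] - free[res][1], n_scale
        )
        for res in free
        if res in bound
    }
    values = list(csp.values())
    threshold = statistics.mean(values) + n_sd * statistics.pstdev(values)
    return sorted(res for res, value in csp.items() if value > threshold)
```

Mapping the residues returned by such a filter onto the protein sequence is how perturbations are typically assigned to structural elements such as a β-sheet surface or a loop.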
We next solved the solution structure of FUS RRM bound to SL3 using 3569 and 344 intramolecular nuclear Overhauser effect (NOE)-derived distance constraints for the protein and the RNA, respectively, as well as 94 intermolecular NOEs (Supplementary Fig. 4d). The resulting ensemble of 18 NMR structures overlays with a backbone root-mean-square deviation of 0.79 Å (Fig. 3d and Table 1). Three bases are recognised by the RRM β-sheet surface: (i) U105 forms direct hydrogen bonds with the main chain of T326; (ii) G106 stacks against the side chain of F288 and (iii) U107 establishes direct hydrogen bonds with the main and side chain of N284 (Fig. 3e-g). The analysis of the structure revealed that the RRM recognises the YNY motif located at the 3′-part of SL3, similarly to our recent structure of FUS RRM bound to a pre-mRNA stem loop 9 . The structure of the protein-RNA complex also shows that the RNA loop is sandwiched by two other protein elements that contact the RNA. On one side, the long α1-β2 loop inserts into the loop-adjacent major groove and provides additional contacts with the RNA phosphate backbone. On the other side, the beginning of the C-terminal extension folds into a small α-helix, bringing the downstream RGG repeats into contact with the adjacent minor groove (Fig. 3h). Consequently, the interaction between RGG2 and the minor groove may direct the position of the second RNA-binding domain. Our solution structure supports the view that the FUS RRM is the main binding site for the U1 snRNP, while the zinc-finger domain may interact with other RNA molecules.
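As an illustration of what an ensemble root-mean-square deviation summarises, the following sketch computes the average RMSD of each model to the mean coordinates. It assumes that the models are already superimposed and that each model lists the same backbone atoms in the same order. The text does not specify whether the reported value is an RMSD to the mean structure or a pairwise average, so this is one possible definition, and the function name is hypothetical.

```python
import math


def ensemble_rmsd_to_mean(models):
    """Average RMSD (in the input units, e.g. Å) of each model to the mean structure.

    models: list of models, each a list of (x, y, z) tuples for the same atoms
    in the same order, already superimposed (no fitting is done here).
    """
    n_models = len(models)
    n_atoms = len(models[0])

    # Mean position of every atom across the ensemble.
    mean = [
        tuple(sum(model[i][k] for model in models) / n_models for k in range(3))
        for i in range(n_atoms)
    ]

    rmsds = []
    for model in models:
        squared = sum(
            (model[i][k] - mean[i][k]) ** 2 for i in range(n_atoms) for k in range(3)
        )
        rmsds.append(math.sqrt(squared / n_atoms))
    return sum(rmsds) / n_models
```

In practice the superposition step (for example a least-squares fit on the structured residues) strongly affects the result and would have to precede this calculation.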

FUS contacts additional snRNAs during spliceosome assembly. Besides the U1 snRNA, our FUS CLIP data also revealed strong enrichments for other snRNAs, which prompted us to further explore these interactions. Using cross-link analysis, we observed specific binding signatures in the 3′-stem loop (3′-SL) of U4 snRNA, a bulged loop (IL-1) in U5 snRNA and a linear stretch upstream of the GACAGA box in U6 snRNA (Supplementary Fig. 5a, b). Intriguingly, these cross-link sites are solvent exposed and clustered in the spliceosomal pre-B and B complexes, indicating that the interactions occur in the context of the spliceosome as opposed to individual snRNPs (Supplementary Fig. 5c, d). To address if these interactions are also mediated via the RRM domain, we incubated our 15N-labelled FUS-RRM construct with snRNA fragments encompassing U4 3′-ISL (nucleotides 93-109), U5 IL-1 (nucleotides 4-18 followed by a UUCG tetraloop and nucleotides 59-77) and U6 5′-UAUA-CUAA-3′ (nucleotides 21-28). Indeed, we found that all three RNAs induced chemical shift perturbations in the β-sheet surface and α1-β2 loop of the RRM as well as in RGG2, consistent with a direct interaction in vitro (Supplementary Fig. 5e). Overall, our results suggest that FUS employs its RRM to sequentially interact with multiple snRNAs as it escorts the spliceosome through the assembly phase of the splicing cycle.
The FUS-U1 snRNA interaction is altered in the cytoplasm. Consistent with cytoplasmic localisation of the ΔLC-FUS-P525L construct, we did not observe significant binding to intronic regions, and the enrichment of the predominantly nuclear snRNAs was reduced (Supplementary Fig. 6a, b). However, the U1 snRNA was still preferentially targeted among snRNAs (Supplementary Fig. 6c) and displayed an altered binding footprint: while one cross-link cluster confirmed the binding site of the RRM at SL3, we observed an additional cytoplasm-specific peak overlapping a GGU motif adjacent to the Sm site (Fig. 4a), with GGU being the preferred sequence bound by the zinc finger of FUS 9 . Hence, we decided to further study this interaction in vitro using our recombinant FUS-RBD construct and a U1 snRNA fragment encompassing stem-loops 3 and 4 with the intervening Sm site (SL34). Using NMR spectroscopy, we found that addition of SL34 to uniformly 15N-labelled FUS-RBD induced chemical shift perturbations of NMR resonances of the RRM as well as the zinc finger, indicating that both RNA-binding domains are involved in the interaction with the naked U1 SL34 RNA (Fig. 4b and Supplementary Fig. 6d). We then used isothermal titration calorimetry (Fig. 4c) and electrophoretic mobility shift assay (Supplementary Fig. 6e) to determine a binding affinity in the nanomolar range (Kd ~70 nM). Notably, this interaction is sensitive to mutation of the GGU adjacent to the Sm site, confirming the cytoplasm-specific binding site identified by CLIP (Fig. 4c). Hence, we wondered if such a bipartite mode of RNA recognition would be compatible with the assembly of the heptameric Sm ring, which is an essential step during snRNP biogenesis that occurs in the cytoplasm 30 . To assess this, we performed core snRNP assembly assays in a cell-free environment by mixing SL34 with recombinant Sm proteins.
Incubation of the RNA with either FUS or Sm proteins alone led to the formation of two distinct complexes that could be separated by analytical size-exclusion chromatography (Fig. 4d). Intriguingly, titration of FUS to the Sm proteins impaired the formation of core snRNPs in a dose-dependent manner, confirming that the interaction between FUS and the U1 snRNA is incompatible with spontaneous core snRNP assembly in vitro. At equimolar amounts, FUS effectively outcompeted the Sm proteins to associate with the Sm site (Supplementary Fig. 6f). In mature U1 snRNPs, the Sm ring is stabilised by the N-terminal domain of U1-70K, which wraps around this core domain and could prevent nuclear FUS from displacing Sm proteins in the context of pre-mRNA splicing 27 . Consistent with this hypothesis, the presence of U1-70K (aa 1-216) reduces the capacity of FUS to impair core snRNP assembly by ~75% (Supplementary Fig. 6g).
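To give a feel for the difference between the two affinities reported above (about 8 µM for FUS RRM with SL3 and about 70 nM for FUS-RBD with SL34), the following sketch computes the fraction of RNA bound at equilibrium. It assumes a simple 1:1 binding model with no competitors, which is a simplification of the bipartite interaction described in the text. The function name and the example concentrations are hypothetical.

```python
import math


def fraction_rna_bound(protein_total, rna_total, kd):
    """Fraction of RNA in complex for 1:1 binding (exact quadratic solution).

    All three arguments must be in the same concentration unit (e.g. µM).
    """
    b = protein_total + rna_total + kd
    complex_conc = (b - math.sqrt(b * b - 4.0 * protein_total * rna_total)) / 2.0
    return complex_conc / rna_total


# Hypothetical example: 1 µM protein and 0.1 µM RNA at the two reported Kd values.
weak = fraction_rna_bound(1.0, 0.1, 8.0)    # RRM-SL3-like affinity (8 µM)
tight = fraction_rna_bound(1.0, 0.1, 0.07)  # RBD-SL34-like affinity (70 nM)
```

Under these assumed concentrations most of the RNA is bound at the nanomolar affinity and only a small fraction at the micromolar one, which illustrates why engaging the zinc finger in addition to the RRM changes the outcome of a competition for the Sm site.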
FUS and cellular stress impair snRNP biogenesis in ALS models. To further explore the effects of FUS on snRNP biogenesis in a physiological cellular model, we then used CRISPR/Cas9 to target the endogenous FUS locus and generate isogenic hiPSC lines harbouring the ALS-associated P525L mutation as well as complete knockout of the FUS gene using the CRISPR-Trap approach 31 (Fig. 5a). Successful editing was verified by DNA sequencing, and the expression of pluripotency markers was confirmed by immunostaining (Supplementary Fig. 7a, b). To examine the cell type predominantly affected in disease, we then differentiated our hiPSCs to motor neurons (MNs), employing a previously described protocol 32 (Supplementary Fig. 7c). In the absence of stress, a small subset of homozygous MNs formed FUS condensates, and this behaviour was not observed for WT FUS-expressing MNs (Fig. 5b). However, most motor neurons did not display FUS inclusions, possibly due to their developmentally immature state 33 . We therefore induced cytoplasmic FUS condensation with sodium arsenite (SA) treatment for 1 h. The resulting FUS condensates stained positively for U1 snRNA (Fig. 5c) as well as the snRNP-specific import factor Snurportin-1 (Fig. 5b). The specificity of the FISH probe was verified by northern blotting using radiolabelled antisense probes (Supplementary Fig. 7d). This finding corroborates our recent analysis of the RNA content of purified FUS-containing droplets showing the presence of snRNA 23 . Nevertheless, we noted that SA treatment also induced condensation of snRNP intermediates in the absence of FUS, suggesting that stress critically contributes to snRNP biogenesis defects in cellular FUS-linked ALS models. Indeed, Snurportin-1 condensates containing several snRNAs also formed in sodium arsenite-treated hiPSCs and co-stained with the stress granule marker T-cell intracellular antigen-1 receptor (TIAR) (Supplementary Fig. 8a-e).
To circumvent the pitfalls of artificially added stressors and to study snRNP biogenesis in naturally aged tissue, we turned to an animal model and performed RNA fluorescence in situ hybridisation (FISH) in spinal cord sections of 18-month-old 'FUSDelta14' mice. These mice harbour an ALS-associated splice site mutation that deletes the NLS and display a progressive loss of motor neurons in adulthood in the absence of FUS aggregation 16 . Strikingly, the FISH signal for U1 snRNA is clearly increased in the cytoplasm of spinal motor neurons of heterozygous 'FUSDelta14' mice compared to their wild-type littermates (Fig. 5d). In addition, we noted that U1 snRNA penetrates the nucleolus in 'FUSDelta14' mice (Fig. 5e), a behaviour that has been described for the Cajal body marker Coilin and Sm proteins under conditions of impaired snRNP biogenesis 34,35 . Together, these findings show that ALS-associated FUS traps a pool of U1 snRNA in the cytoplasm and disturbs biogenesis of U1 snRNP in vivo.

Discussion
Functional implication of the physiological interaction between FUS and U1 snRNP. The spliceosome is a highly dynamic molecular machinery that assembles on each intron to catalyse its removal from nascent transcripts 36 . In the early stages of this process, the U1 snRNP binds to and defines the future 5′ splice site. Our CLIP data revealed that U1 snRNP is the preferred FUS partner in physiological conditions and that the interaction occurs between FUS RRM and U1 snRNA stem-loop 3, which could have promoted the conservation of this unstructured loop. In contrast, stem-loop 4 of the U1 snRNA, which is also free from U1-specific proteins, harbours a UUCG tetraloop known to adopt a structured conformation incompatible with FUS binding 37 and previously shown to be involved in the general mechanism of splicing through interaction with the U2 snRNP component SF3A1 38 or in alternative splicing via its interaction with the splicing factor polypyrimidine tract-binding protein 1 39 . Indeed, both stem-loops could act as hubs for splicing factors involved in the general mechanism of splicing or in its regulation. The solution structure of the FUS RRM bound to stem-loop 3 reveals that the RRM recognises the YNY motif in the 3′-part of the loop, while two positively charged lysines (K315/K316) in the α1-β2 loop contact the phosphate backbone in the major groove of the stem. These lysine residues have recently been shown to be acetylated, which impairs the ability of FUS to bind RNA 40 . However, the structure further shows how the C-terminal RGG2 repeats wrap around the loop-adjacent minor groove to probably orient the other RNA-binding domain. Since we previously described how FUS achieves high-affinity RNA binding by combining both its RNA-binding domains, we propose that FUS could form a bridge between U1 snRNP and other RNA molecules, such as the pre-mRNA or other spliceosomal RNAs (Fig. 6a).
In agreement, the recent cryo-EM structure of the human pre-B spliceosomal complex revealed proximity between the stem-loop 3 and the exon of the pre-mRNA 28 and our structural simulations suggest that the RRM and the zinc finger of FUS could recognise RNA elements separated by up to 80 Å. Such intermolecular bipartite interactions could explain how FUS positions the U1 snRNP on pre-mRNA to modulate 5′-splice site selection and repress premature polyadenylation 41,42 . In a wider context, our work challenges the idea that splicing factors employ their RNA-binding domains to contact exclusively pre-mRNAs. Given that CLIP studies of numerous splicing factors focused on protein-coding RNAs, re-evaluation of published datasets with respect to snRNA binding could provide novel insights into the mechanisms of splicing regulation.
Impaired snRNP assembly links FUS-ALS to SMA. Disrupted snRNP biogenesis has been linked to motor neuron degeneration in spinal muscular atrophy (SMA), a childhood neuromuscular disorder caused by insufficient levels of survival motor neuron (SMN) protein, whose best-characterised role is to chaperone the assembly of small nuclear ribonucleoproteins (snRNPs) 30 . In particular, strong evidence comes from the findings that only SMN constructs that retain snRNP assembly activity are able to rescue SMA animal models and that injection of purified snRNPs rescues motor axon defects in a zebrafish model 43-45 . Impaired snRNP assembly induces downstream defects in RNA processing, which is thought to promote motor neuron death, though the molecular targets are still unclear. In addition, mutations in the snRNA genes RNU4ATAC and RNU12 as well as TOE1, which encodes an exonuclease involved in snRNP biogenesis, cause distinct syndromes that share a strong neurological component 46-48 . Thus, the central nervous system appears to be particularly vulnerable to alterations in snRNP homeostasis. How exactly altered snRNP levels or profiles lead to disease remains unknown. Several studies have explored the link between ALS and SMA. Genetic evidence suggests that abnormal SMN1/2 gene copy numbers modulate the risk and severity of ALS in humans, whereas SMN overexpression delays motor neuron loss in SOD1(G93A) ALS mice 49-51 . Furthermore, ALS-associated FUS interacts with the SMN protein and sequesters it into cytoplasmic condensates while evoking a loss of SMN-containing nuclear structures called Gemini of Coiled Bodies (GEMs), a hallmark of SMA 15,25,52,53 . An additional link is provided by the transcriptional activator complex ASC-1, whose interaction with the RNA polymerase II machinery is disturbed by SMA-causing mutations in one of its components or ALS-causing mutations in FUS 54 .
In this study, we used CLIP to identify an aberrant intramolecular bipartite interaction between ALS-associated FUS and the U1 snRNA, where the zinc-finger domain contacts a GGU motif in the Sm site and interferes with cytoplasmic U1 snRNP assembly in vitro and in a mouse model of FUS-linked ALS (Fig. 6b). Besides this toxic interaction, we propose that a second mechanism exacerbates the U1 snRNP biogenesis defect in a stress-dependent manner. In hiPSC-derived motor neurons, oxidative stress induced the condensation of core snRNPs into stress granules which co-localised with ALS-associated, but not wild-type FUS. Therefore, ALS-associated FUS could irreversibly sequester snRNP assembly intermediates by promoting the maturation of stress granules into pathological FUS aggregates (Fig. 6b). This is supported by evidence suggesting that stress granules may seed aggregate formation in ALS 10 . In addition to directly contacting cytoplasmic snRNAs, there is also evidence that ALS-associated FUS aberrantly interacts with the assembly factor SMN and sequesters it into cytoplasmic condensates 25,52,53 . Thus, FUS could disturb snRNP biogenesis via different mechanisms.
Altogether, our results are consistent with the adult onset of the disease and provide a mechanistic explanation for the previously reported snRNP biogenesis defects in FUS-linked ALS 25,55-57 . The resulting alterations in snRNP homeostasis could explain how in mice ALS-associated FUS leads to mis-splicing of pre-mRNAs that are not regulated by nuclear FUS, but instead are sensitive to levels of the core splicing machinery, such as the SmB/B' pre-mRNA 15,58,59 . Subsequent work will be required to assess the impact of FUS on steady-state snRNP levels in disease-relevant cell types. Taking into consideration that disease onset is delayed by at least a decade in ALS compared to SMA, mild changes could already significantly contribute to the disease mechanism. Notably, dysregulated snRNP homeostasis has also been linked to TDP-43 and C9ORF72-linked ALS as well as the sporadic form of the disease 60-62 . Although the downstream effects and their impact on the disease remain to be investigated, it is important to note that not only pre-mRNA splicing but also polyadenylation is sensitive to reduced U1 snRNP levels 63 . In summary, our findings provide the molecular details of an RNA-based toxic gain-of-function of FUS in the cytoplasm causing a molecular defect that strengthens the link between FUS-linked ALS and SMA, with both motor neuron disorders displaying cytoplasmic snRNP biogenesis defects.

Methods
NMR spectroscopy. All NMR spectroscopy measurements were performed using Bruker AVIII 600 MHz, AVIII 700 MHz and Avance 900 MHz spectrometers equipped with cryoprobes. The data were processed with Topspin 3.1 (Bruker) and analysed with CARA 64 . All NMR experiments were performed in NMR buffer containing 10 mM sodium phosphate pH 6.8, 50 mM NaCl and 2 mM DTT, at 303 K (for FUS RRM-SL3) or 313 K (for FUS RBD-U1 snRNP/SL3/SL34).

NMR titrations. To monitor the interaction between FUS and U1 snRNP, a 20 μM solution of 2H, 15N, 13C ILV FUS-RBD was titrated with U1 snRNP at 313 K. At each titration step, a 2D 1H-13C HMQC spectrum as well as a 2D 1H-15N TROSY spectrum were recorded. A similar procedure was followed when the 2H, 15N, 13C ILV FUS-RBD was titrated with U1 SL3. To monitor the interaction of FUS-RBD with U1 SL34, a 100 μM solution of 15N-labelled FUS-RBD was titrated with U1 SL34 at 313 K. At each titration step, a 2D 1H-15N TROSY spectrum was recorded. The NMR titration of FUS RRM with U1 SL3 was performed by adding unlabelled aliquots of U1 SL3 into a 100 μM solution of 15N-labelled FUS RRM. At each titration step, a 2D 1H-15N TROSY spectrum was recorded. The NMR titration of U1 SL3 with FUS RRM was performed by adding unlabelled aliquots of FUS RRM into a 100 μM solution of 13C-labelled U1 SL3. The formation of the complex was monitored by recording two 2D 1H-13C HSQC spectra (centred on the aliphatic or aromatic regions of the RNA) after each titration step.
Structure calculations. The resonance assignment of the bound protein was used as input for automatic peak picking and NOESY assignment using ATNOS-CANDID 65 . Resulting peak lists were checked and supplemented manually. RNA and intermolecular NOESY peaks were manually assigned and calibrated. Protein peaks were then re-assigned with the NOEASSIGN module of CYANA3.96 66 and manually checked. Structure calculations were performed with CYANA using a list of unambiguous intramolecular NOE-derived distances for the protein and the RNA, unambiguous intermolecular NOE-derived distances and ambiguous restraints for the C-terminal RGG tail. Due to the strong overlap of arginine and glycine side-chain resonances of the C-terminal RGG tail, intermolecular NOEs involving this tail were treated using ambiguous restraints. Successive calculations allowed us to progressively remove the most violated ambiguous restraints before Cartesian refinement.
In addition, protein hydrogen bonds in secondary structures as well as dihedral angle restraints of the protein backbone, derived from the analysis of the backbone chemical shifts using TALOS+, were included. Finally, RNA base pairing and loose sugar pucker restraints were applied to constrain the double-stranded part of the RNA. Final calculations were performed using CYANA and, out of 500 structures generated, the 50 structures with the lowest target function were further refined in Cartesian space with the SANDER module of AMBER14 67 using the ff14SB force field. The lowest-energy models were then selected.
Analytic size-exclusion experiments. Analytic size-exclusion chromatography experiments were performed using Superdex 200 10/300 GL in 10 mM sodium phosphate buffer pH 6.8, 100 mM NaCl and 2 mM DTT. For the formation of the Sm core, each Sm protein was added to a test tube at a final concentration of 20 μM together with 5 μl of RNAsin (Invitrogen) and incubated at 37°C. After 5 min at 37°C, the RNA was added and incubated for another 5 min at 37°C; the sample volume was then adjusted to 250 μl and the sample was loaded directly onto the size-exclusion column (S200 increase, GE Healthcare). A similar procedure was applied to prepare the FUS-RBD-SL34 complex. For the competition between FUS and Sm proteins for SL34 binding, constant amounts of Sm proteins were incubated with various amounts of FUS-RBD before the addition of the RNA. To test the effect of U1-70K on the competition between FUS and Sm proteins for SL34 binding, we incubated equimolar amounts of Sm proteins, FUS-RBD and U1-70K (1-216) together before adding the RNA to the solution. Data were directly integrated using UNICORN (GE Healthcare) and analysed using GraphPad.
Isothermal titration calorimetry. ITC experiments were performed on a VP-ITC microcalorimeter (Microcal). All proteins and RNA were extensively dialysed in 10 mM sodium phosphate buffer pH 6.8, 50 mM NaCl and 1 mM β-mercaptoethanol. For the titration between FUS RRM and SL3, the protein was concentrated to 600 μM and the RNA to 50 μM. For the titration between FUS-RBD and SL34 (or SL34mut), the protein was concentrated to 60 μM and the RNA to 6 μM. The titrations were performed at 25°C using a single injection of 2 μL followed by 6 μL injections every 300 s with a stirring rate of 307 rpm. Raw data were integrated and analysed using Origin 7.
CLIP-Seq. SH-SY5Y cell lines were grown to 80% confluency in two 15-cm dishes per biological replicate. The cells were washed once with ice-cold PBS, covered with 10 ml of PBS and cross-linked at 254 nm and 150 mJ/cm² using a Bio-Link® crosslinker (Vilber Lourmat, BLX-E). Subsequently, the cells were scraped off the plates and spun down at 500 × g and 4°C for 5 min. After removal of the supernatant, the pellets were shock-frozen in liquid nitrogen and stored at −80°C until use. Cells were lysed in 2 ml RIPA buffer (Thermo Scientific) supplemented with 2× HALT protease inhibitor (Pierce) and 0.5 U/μl RNase inhibitor (Lucigen) and incubated on ice for 15 min. To mask free biotin and biotinylated proteins, 1 U avidin (Novex) was added to the lysate. Then, cellular debris was removed by centrifugation at 15,000 × g and 4°C for 15 min. To partially digest cross-linked RNAs, the cleared lysates were incubated with 15 μL RNaseI (Thermo Scientific) dilution (1:250 in RIPA buffer) and 7.5 μL Turbo DNase (Ambion) at 37°C for 7.5 min and then cooled on ice for 3 min. Per IP, 100 μL of MagStrep Type 3 XT beads (IBA lifesciences) were washed twice with IP wash buffer (50 mM HEPES-NaOH pH 7.3, 300 mM KCl, 0.05% NP-40). Subsequently, the protein-RNA complexes were bound to the beads with head-over-tail rotation for 1.5 h at 4°C. 
After four washes with IP wash buffer, the beads were resuspended in 150 μL dephosphorylation mix (8 U antarctic phosphatase enzyme (NEB) in 1× reaction buffer and 0.5 U/μL NxGen RNase inhibitor) and incubated at 37°C and 900 rpm for 20 min. After that, the beads were washed twice with IP wash buffer and twice with PNK buffer (50 mM Tris-HCl pH 7.6, 150 mM NaCl, 1% NP-40, 1% sodium deoxycholate, 0.1% SDS). Then, 150 μL 5'-phosphorylation mix (100 U T4 PNK (Thermo Scientific), 200 μCi γ-³²P-ATP (3000 Ci/mmol) (Hartmann Analytics), 3 mM ATP in 1× reaction buffer A and 0.5 U/μL NxGen RNase inhibitor) were added, and the samples were incubated at 37°C and 900 rpm for 45 min. Finally, the beads were washed four times with PNK buffer. Protein-RNA complexes were eluted by heating in 45 μL 2× LDS sample buffer (supplemented with 5 mM biotin) at 70°C for 10 min, separated on a NuPage 4-12% Bis-Tris gradient gel in MOPS buffer and transferred to a nitrocellulose membrane (Life Technologies). Paper arrows were dipped in the last PNK wash to mark the 40 kDa and 80 kDa bands on the dried membrane, which was subsequently wrapped in cling foil and exposed to a phosphorimager screen for 38 h. The membrane sections containing the desired protein-RNA complexes were excised using a clean blade, transferred to Eppendorf tubes and ground using a pipette tip. Next, the membrane fragments were incubated with 400 μL proteinase K mix (20% proteinase K (Invitrogen) diluted in water) at 37°C for 30 min. Following the addition of a further 200 μL proteinase K mix, the reactions were incubated for another 30 min. RNA was isolated from the supernatant by addition of 1 volume of acidic phenol:chloroform:isoamyl alcohol (25:24:1) and centrifugation at 13,000 × g for 10 min. The aqueous phases (350 μL) were transferred to fresh tubes and precipitated by addition of 35 μL of 3 M NaOAc pH 4.6, 2 μL GlycoBlue (Ambion) and 1 mL absolute EtOH at −80°C for 30 min. 
Subsequently, the precipitated RNAs were spun down at 16,000 × g and 4°C for 30 min, washed with 75% EtOH and resuspended in 20 μL DEPC water. Library preparation of the CLIP samples was performed at Fasteris SA. Following adapter ligation using the TruSeq small RNA sample preparation kit (Illumina) and 25 cycles of PCR amplification, the libraries were sequenced on the HiSeq 2500 platform (Illumina) using 1 × 125 bp single-end (ΔLC-FUS) and 2 × 125 bp paired-end (FUS and ΔLC-FUS P525L ) cycles. For input RNA sequencing, 2 μg of the total RNA from the cell lines were ribo-depleted using RiboCop (Lexogen) according to the manufacturer's instructions. Library preparation was performed using the TruSeq stranded mRNA library preparation kit (Illumina). Samples were sequenced on a HiSeq 2500 platform using 150 cycles in paired-end (ΔLC-FUS) and single-end mode (FUS and ΔLC-FUS P525L ). CASAVA (v1.8.2) (Illumina) was used to convert Bcl files to FASTQ format.

Fluorescence in situ hybridisation (FISH)/immunofluorescence in cell lines.
Immunofluorescence was performed before FISH. Cells were fixed with 4% PFA for 15 min and then washed twice with PBS and twice with 70°C ethanol before permeabilization in 70% ethanol for 48 h at 4°C. After washing three times for 5 min at room temperature with PBS, the slides were blocked three times for 10 min at room temperature with blocking buffer (1% BSA in PBS supplemented with 2 mM ribonucleoside vanadyl complex (RVC) (Sigma)). Primary antibodies were diluted in blocking buffer and incubated for 1 h at 37°C and 1 h at room temperature. After three 5-min washes with blocking buffer, the secondary antibodies were added for 45 min at room temperature. The cells were then washed three times with PBS and post-fixed with 4% PFA for 5 min at room temperature to cross-link antibodies to their targets. Then, the slides were washed twice with 2× SSC (300 mM NaCl, 30 mM sodium citrate pH 7.0) and incubated with prehybridisation buffer (15% formamide, 10 mM sodium phosphate, 2 mM RVC in 2× SSC, pH 7.0) for 10 min at room temperature. Antisense probes were diluted to 0.5 ng/μl in hybridisation buffer (15% formamide, 10 mM sodium phosphate, 10% dextran sulfate, 0.2% BSA, 0.5 μg/μl Escherichia coli tRNA, 0.5 μg/μl salmon sperm DNA, 2 mM RVC in 2× SSC, pH 7.0) and hybridised to the cells overnight at 42°C. The next day, the cells were washed (all wash steps at 42°C) twice for 30 min with prehybridisation buffer and three times for 10 min in high-stringency wash solution (20% formamide, 2 mM RVC in 0.05× SSC, pH 7.0). After three washes in 2× SSC, the slides were mounted with aqueous Vectashield mounting medium containing DAPI (Vectorlabs). Antibodies are listed in Supplementary Table 2.
Fluorescence in situ hybridisation (FISH) in tissue. Mouse spinal cord tissue was harvested as previously reported 16 according to applicable international, national, and institutional guidelines, including ARRIVE guidelines, for the care and use of animals and according to UK Home Office regulations. FISH was performed before immunofluorescence. To dewax sections, the slides were incubated in xylene three times for 5 min. Then the sections were hydrated in a stepwise manner by incubations in 100% EtOH (2 × 2 min), 90% EtOH (1 × 2 min), 80% EtOH (1 × 2 min) and 70% EtOH (1 × 2 min) and finally distilled water (1 × 5 min). To retrieve antigens, the slides were boiled in 1 L citrate buffer (100 mM citrate pH 6.0) for 20 min in a microwave on high power. The slides were cooled by exchanging the buffer with slowly running cold tap water. Then they were transferred to distilled water (3 × 5 min) and hydrophobic barriers were created using a barrier pen. After one wash in 2× SSC (1 × 5 min), the slides were incubated with prehybridisation buffer (15% formamide, 10 mM sodium phosphate, 2 mM RVC in 2× SSC, pH 7.0) at 42°C for 30 min. The labelled antisense U1 probe was diluted to 500 pg/μL in hybridisation buffer (15% formamide, 10 mM sodium phosphate, 10% dextran sulfate, 0.2% BSA, 0.5 μg/μl Escherichia coli tRNA, 0.5 μg/μl salmon sperm DNA, 2 mM RVC in 2× SSC, pH 7.0) and incubated with the sections overnight at 42°C. The next day, the slides were first washed at 42°C 6 × 15 min with high-stringency wash buffer (10-50% formamide, 0.05× SSC, 2 mM RVC) followed by washes in 0.05× SSC (3 × 10 min, 42°C) and PBS (1 × 5 min, RT). Subsequently, the slides were incubated in Sudan Black for 5 min to reduce tissue autofluorescence. Following a short incubation in PBS to remove the bulk of residual Sudan Black and four additional 5-min washes in PBS, the slides were mounted with aqueous Vectashield mounting medium containing DAPI (Vectorlabs).
Image acquisition. Images of SH-SY5Y cells were obtained with a non-confocal fluorescence microscope (Leica DMI6000 B) using a 60×/1.4 NA oil immersion lens and the LAS X software platform (Leica). Images of hiPSCs and hiPSC-derived motor neurons were taken with a non-confocal Eclipse Ti-2 epifluorescence microscope (Nikon) using the NIS-Elements AR software (Ver 5.01) and either a 20×/dry or 60×/1.4 NA oil immersion lens. Confocal images of hiPSC-derived motor neurons and mouse spinal cord were obtained with a super-resolution VT-iSIM microscope (Nikon) using a 100×/1.49 NA oil immersion lens. Deconvolution was performed with the NIS-Elements AR software (Ver 5.01) using the Richardson/Lucy algorithm with 20 iterations. For printing, brightness and contrast of individual channels were linearly enhanced using the Fiji software 69 .
Electrophoretic mobility shift assay (EMSA). To refold the RNA, SL34 RNA was first diluted to 250 pM in 1× binding buffer (10 mM HEPES pH 7.3, 100 mM KCl, 5 mM MgCl 2 , 10 μg/ml yeast tRNA, 10 μg/ml salmon sperm DNA), incubated at 95°C for 1 min and then at 65°C for 1 min before allowing to cool down slowly to room temperature. For the binding reactions, 2 fmol RNA (100 pM concentration) were mixed with increasing concentrations of the FUS-RBD constructs (up to 2 μM) in 1× binding buffer for 1 h at room temperature. Subsequently, RNA gel loading buffer (5% glycerol, traces of bromophenol blue and xylene cyanol) was added and the protein-RNA complexes were separated on a non-denaturing 0.5× TBE 6% polyacrylamide gel in 0.5× TBE buffer under constant cooling. The gel was then fixed with EMSA fixing solution (5% glycerol, 12% methanol, 10% acetic acid), vacuum dried and exposed to a phosphorimager screen overnight.
Genome editing. Exon 15 of the FUS gene was targeted to introduce the P525L mutation using the pCRISPR-EF1a-SpCas9-P525 plasmid coding for the sgRNA targeting the sequence 5′-GGAGCCAGGCTAATTAATACGG-3′, using a previously described strategy 70 . One day before transfection, 10 µM ROCK inhibitor Y-27632 (Stemcell Technologies) and 2 µM pyrintegrin (Stemcell Technologies) were added to the stem cells. On the day of transfection, six wells of a six-well plate, each containing 90% confluent stem cells in mTeSR1 with ROCK inhibitor and pyrintegrin, were transfected using TransIT®-LT1 Transfection Reagent (Mirus) according to the manufacturer's instructions. Each well was transfected with a total of 5 µg of DNA, comprising 200 ng of pRR-Puro-P525 and 4.8 µg of a mix of pCRISPR-EF1a-SpCas9-P525 and the P525L donor plasmid for HDR. For each well, a different molar ratio of pCRISPR-EF1a-SpCas9-P525 to donor plasmid was used (1:1, 1:3, 1:6, 4:1, 3:1, 2:1). The day after transfection, the medium was changed to mTeSR1 containing 10 µM ROCK inhibitor and 2 µM pyrintegrin supplemented with 5 µl of the HDR enhancer L755507 (Sigma). Two days after transfection, cells were detached using Accutase (Thermo Fisher) and pooled on one 15-cm plate in mTeSR1 containing 10 µM ROCK inhibitor and 2 µM pyrintegrin supplemented with 0.5 µg/ml puromycin. The selection was maintained for one more day, and ROCK inhibitor and pyrintegrin were maintained for 4 more days. Thereafter, colonies growing from single cells were picked, and gDNA was isolated for clone screening using TRIzol according to the manufacturer's instructions. The P525L genomic locus was amplified from the genomic DNA using the KAPA Taq ReadyMix PCR Kit according to the manufacturer's instructions. PCR products were purified over a preparative agarose gel using the Wizard SV Gel and PCR Clean-Up System (Promega). Purified PCR products were sequenced at Microsynth AG.
RNA-seq data analysis. Mapping of raw reads obtained from RNA-seq experiments was accomplished using STAR version 2.5.2a 73 with the parameters of the RNA-seq pipeline for long RNAs provided by the ENCODE Data Coordinating Center and the full ENSEMBL gene annotation version 90 of genome assembly GRCh38.
CLIP-Seq data analysis. Preprocessing: single- and paired-end samples were subjected to 3′ adapter trimming using cutadapt version 1.14 74 . Mapping and additional processing steps: the trimmed reads were mapped using STAR version 2.5.3a 73 with the same parameters as for RNA-seq samples. Putative PCR duplicates were filtered from both single- and paired-end libraries by applying the samtools (version 1.8.3) utilities fixmate and markdup 75 . Intersection of alignments and gene annotation: to infer the location of the aligned reads with respect to specific gene annotation features (exons, introns, etc.), a filtered gene annotation was used. The filtered set only contained entries of genes annotated with support level 1 (all splice junctions supported by at least one trusted mRNA sequence) plus the following classes of non-coding RNAs: snRNA, snoRNA, scaRNA, scRNA, miRNA and lincRNA. If an alignment intersected with multiple annotated features, each feature was counted partially, with a weight proportional to the width of the intersection. The intersections of features with multi-mapping reads were further weighted by 1/(number of mappings of the read). Highly reproducible binding sites: for each sample, a set of highly reproducible FUS-RNA interaction sites was inferred by exploiting the deletions introduced by the reverse transcriptase during cDNA library preparation. First, only deletions not already annotated as SNPs in the ENSEMBL vcf file of gene annotation GRCh38 version 90 were considered. In regions of the alignment where the forward and reverse reads overlapped, deletions were required to be identified in the alignment of both reads. Finally, only those deletion sites where the mutation frequency among all alignments overlapping this site was below 50% were retained. Clustering of deletion sites: individual deletion sites were combined into deletion regions if they were less than 11 nucleotides apart. 
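The clustering step can be illustrated with a minimal Python sketch. It is not the code used in the workflow: the function name `cluster_deletion_sites` is hypothetical, the sites are assumed to lie on a single chromosome and strand, and "less than 11 nucleotides apart" is interpreted here as a coordinate difference of at most 10 between neighbouring sites.

```python
def cluster_deletion_sites(positions, max_gap=10):
    """Merge deletion sites into regions (illustrative sketch only).

    positions: iterable of integer coordinates on one chromosome/strand.
    max_gap:   largest coordinate difference between neighbouring sites
               that still merges them (10 means "less than 11 nt apart").
    Returns a list of (start, end) tuples, inclusive coordinates.
    """
    regions = []
    for pos in sorted(set(positions)):
        if regions and pos - regions[-1][1] <= max_gap:
            # Close enough to the previous site: extend the current region.
            regions[-1][1] = pos
        else:
            # Too far away (or first site): open a new region.
            regions.append([pos, pos])
    return [tuple(region) for region in regions]


example_regions = cluster_deletion_sites([100, 105, 116, 130, 140])
print(example_regions)
```

In this example, sites 100 and 105 merge, site 116 stays separate because it is exactly 11 nucleotides from 105, and sites 130 and 140 merge.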
Enrichment analysis: for the inferred deletion regions (see above), an enrichment ratio of alignments from the CLIP experiment and from a matching RNA-seq experiment was calculated as follows. Each deletion region was extended up- and downstream by 100 nucleotides. For CLIP and RNA-seq samples, the raw numbers of single-end or read-pair alignments with at least one matching nucleotide in the defined region were identified, and a pseudo-count of one was added to both values. Multi-mapping reads were counted towards each of the matching loci as 1/(number of matching loci).
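The counting scheme for a single extended deletion region can be sketched in Python as follows. This is an illustration under stated assumptions rather than the code used in the workflow: the function names are hypothetical, the library size is assumed to be the total number of alignments in each library, and the pseudo-count is added to the region counts before library-size normalisation.

```python
def fractional_count(loci_per_alignment):
    """Sum alignments overlapping a region, weighting each by 1/(number of loci).

    loci_per_alignment: one integer per overlapping alignment, giving the
    number of loci that read maps to (1 for uniquely mapping reads).
    """
    return sum(1.0 / n_loci for n_loci in loci_per_alignment)


def enrichment_score(clip_count, rnaseq_count,
                     clip_library_size, rnaseq_library_size,
                     pseudo_count=1.0):
    """Ratio of library-size-normalised CLIP and RNA-seq counts for one region."""
    clip_norm = (clip_count + pseudo_count) / clip_library_size
    rnaseq_norm = (rnaseq_count + pseudo_count) / rnaseq_library_size
    return clip_norm / rnaseq_norm


# Hypothetical numbers: two uniquely mapping CLIP reads plus one read
# mapping to two loci and one mapping to four loci.
clip_region_count = fractional_count([1, 1, 2, 4])
score = enrichment_score(clip_region_count, 4.0, 1_000_000, 2_000_000)
print(clip_region_count, score)
```

The pseudo-count keeps the ratio defined for regions without RNA-seq coverage, at the cost of slightly compressing the scores of weakly covered regions.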
The enrichment score was calculated as the ratio of library-size-normalised CLIP reads to library-size-normalised RNA-seq reads that mapped to the region surrounding the site. The enrichment score was subsequently used as a metric to sort deletion sites, e.g., to compute the overlap of replicates with respect to transcripts with the highest enrichment scores. Secondary structure analysis: utilising the intersection of inferred binding sites from all three replicates of the full-length construct, secondary structure prediction was done with the RNAfold program of the ViennaRNA Package version 2.4.8 76 . For each condition, a foreground set of binding sites was selected as the intersection of highly reproducible sites from all replicates. A background set was obtained by shuffling the foreground sequences while preserving the dinucleotide frequencies with a tool from the MEME suite version 4.12.0 77 . After obtaining secondary structure predictions for each sequence through RNAfold, a position-wise mean base-pairing score was inferred for both sets independently, assuming the following fold propensities: A: 0.365, C: 0.516, G: 0.663, U: 0.494. The final fold propensity score for each position was calculated as the ratio of the pairing score from the foreground to the pairing score from the background for this position. Nucleotide composition analysis: the same sequences as for the secondary structure analysis were used to calculate the nucleotide frequencies at given positions and plot them as nucleotide composition. The entire data analysis process was executed as a workflow created with Snakemake version 4.3.0 78 .
Statistics and reproducibility. To satisfy the requirements for standard statistical tests and to ensure the robustness of our results, all experiments yielding quantitative data were performed in triplicate, which is the standard for molecular biology experiments. 
Because of the non-linear transformation of Ct values in relative quantifications using the 2^−ΔΔCT method, we performed statistical analyses on log-transformed values and employed the two-sided Welch's t test to account for the unequal variances observed between conditions.
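As an illustration of this procedure, the following Python sketch log-transforms relative quantities and computes Welch's t statistic together with the Welch-Satterthwaite degrees of freedom. It is a minimal sketch, not the analysis code used here: the function names and example values are hypothetical, and a base-2 logarithm is assumed. The two-sided p value would then be read from a t distribution with the returned degrees of freedom, for example with `scipy.stats.ttest_ind(a, b, equal_var=False)`.

```python
import math
from statistics import mean, variance


def log2_transform(relative_quantities):
    """Log2-transform relative quantities (e.g. 2^-ddCt values)."""
    return [math.log2(value) for value in relative_quantities]


def welch_t(sample_a, sample_b):
    """Return Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard errors of the two sample means (unbiased variances).
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    t_stat = (mean(sample_a) - mean(sample_b)) / math.sqrt(se2_a + se2_b)
    dof = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1)
    )
    return t_stat, dof


# Hypothetical triplicates of relative expression for two conditions.
control = log2_transform([0.5, 1.0, 2.0])
treated = log2_transform([2.0, 4.0, 8.0])
t_stat, dof = welch_t(control, treated)
print(t_stat, dof)
```

Working on the log scale makes the values symmetric around the reference condition, so that a two-fold increase and a two-fold decrease contribute equally to the test.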
Reporting summary. Further information on research design is available in the Nature Research Reporting Summary linked to this article.

Data availability
The accession number for the FUS-RRM:SL3 structure reported in this paper is PDB: 6SNJ. The accession number for FUS-RRM:SL3 chemical shifts reported in this paper is BMRB:34427. Input total RNA-Seq data and high-confidence FUS-binding sites inferred from CLIP were uploaded to GEO with the accession number GSE139263. All data supporting the findings of this study are available from the corresponding author upon reasonable request.