Disruption of pathways regulated by Integrator complex in Galloway–Mowat syndrome due to WDR73 mutations

Several studies have reported WDR73 mutations to be causative of Galloway–Mowat syndrome, a rare disorder characterised by the association of neurological defects and renal-glomerular disease. In this study, we demonstrate interaction of WDR73 with the INTS9 and INTS11 components of Integrator, a large multiprotein complex with various roles in RNA metabolism and transcriptional control. We implicate WDR73 in two Integrator-regulated cellular pathways; namely, the processing of uridylate-rich small nuclear RNAs (UsnRNA), and mediating the transcriptional response to epidermal growth factor stimulation. We also show that WDR73 suppression leads to altered expression of genes encoding cell cycle regulatory proteins. Altogether, our results suggest that a range of cellular pathways are perturbed by WDR73 loss-of-function, and support the consensus that proper regulation of UsnRNA maturation, transcription initiation and cell cycle control are all critical in maintaining the health of post-mitotic cells such as glomerular podocytes and neurons, and preventing degenerative disease.

Table S1: Proteins identified by mass spectrometry of GFP immunoprecipitates isolated from a human podocyte cell line stably expressing GFP-WDR73. Table S2: Genes identified by RNA-Seq as significantly differentially expressed in WDR73 KD cells + EGF compared to Ctrl siRNA-treated cells + EGF (fold change 1.2). Table S3: Full list of genes differentially expressed in WDR73 KD cells + EGF compared to Ctrl siRNA-treated cells + EGF. Table S4: Genes identified as significantly differentially expressed in Ctrl siRNA-treated cells + EGF compared to Ctrl siRNA-treated cells -EGF (fold change 2). Table S5: Full list of genes differentially expressed in Ctrl siRNA-treated cells + EGF compared to Ctrl siRNA-treated cells -EGF. Table S6: Genes identified as significantly differentially expressed in WDR73 KD cells -EGF compared to Ctrl siRNA-treated cells -EGF (fold change 1.2).
To obtain neural progenitor cell lines (NPCs), iPSCs were differentiated using a dual SMAD inhibition strategy (Feyeux et al., 2012). To ensure the correct identity of the NPCs used in this study, we performed a flow cytometry-based quality control check. Cells were first detached using trypsin, filtered through a 40µm strainer and counted so that an equal number of cells was used for each cell line tested and for all controls. Trypsin was then inactivated by adding an equal volume of Opti-MEM (GIBCO) 10% FCS, and the cells were centrifuged for 5min at 1,200rpm at 4°C before resuspension in Fc blocking solution [Fc blocking reagent diluted 1/5 in FACS buffer (PBS, 2.5% FCS, 2.5mM EDTA)]. Cells were incubated for 15min on ice before washing in FACS buffer and a second centrifugation at 1,200rpm for 5min at 4°C. The pellets were resuspended in fluorescently-conjugated primary antibodies CD57-FITC (HNK1) and CD271-PE (P75) diluted 1/10 in FACS buffer and incubated on ice for 30min. The cells were then washed again in FACS buffer and centrifuged a final time at 1,200rpm for 5min at 4°C before resuspension in FACS buffer. Cells were then acquired on an LSR Fortessa flow cytometer (BD). Cells passed the quality control check if the population of HNK1+/P75 cells was greater than 80%.
The NPC controls were derived from the iPSC lines, IMAGINi005 (clone #04) and IMAGINi009 (clone #09), and are referred to as Ctrl 05-04 and Ctrl 09-09 in this study.

Nucleo-cytoplasmic fractionation
To perform a nucleo-cytoplasmic fractionation, cells were first lysed in a low salt buffer (10mM Tris-HCl pH7.8, 10mM KCl, 1.5mM MgCl2, 0.5mM DTT plus complete protease inhibitor). Following lysis, cells were incubated on ice for 10min, followed by centrifugation at 13,500rpm for 10min at 4°C. The supernatant, containing the cytoplasmic extract, was removed at this point for subsequent analysis. The pellet, containing the nuclear proteins, was washed once in low salt buffer and again centrifuged for 10min at 4°C. The pellet was then resuspended in 100µl of high salt buffer (20mM Tris-HCl pH7.9, 420mM KCl, 1.5mM MgCl2, 10% glycerol, 0.5mM DTT plus complete protease inhibitor) and allowed to rotate overnight at 4°C. The following day, the lysate was again centrifuged at 13,000rpm for 10min, and the supernatant, consisting of the nuclear extract, removed for analysis.

Proteomic analysis
Proteomic analysis was carried out by the Proteomic Platform 3P5-Necker (Necker Hospital, Paris, France).
NanoLC-MS/MS protein identification and quantification
S-Trap micro spin column (PROTIFI, Huntington, USA) digestion was performed on IP eluates according to the manufacturer's protocol. Briefly, samples were digested with 3µg of trypsin (PROMEGA) at 37°C overnight. After elution, peptides were vacuum dried.
Samples were resuspended in 35 µL of 10% acetonitrile, 0.1% trifluoroacetic acid in high performance liquid chromatography (HPLC)-grade water. For each run, 5 µL was injected in a nanoRSLC-Q Exactive PLUS (RSLC Ultimate 3000) (THERMO SCIENTIFIC, Waltham MA, USA). Peptides were loaded onto a µ-precolumn (Acclaim PepMap 100 C18, cartridge, 300 µm i.d.×5 mm, 5 µm) (THERMO SCIENTIFIC), and were separated on a 50 cm reversed-phase liquid chromatographic column (0.075 mm ID, Acclaim PepMap 100, C18, 2 µm) (THERMO SCIENTIFIC). Chromatography solvents were (A) 0.1% formic acid in water, and (B) 80% acetonitrile, 0.08% formic acid. Peptides were eluted from the column with the following gradient: 5% to 40% B (38 min), then 40% to 80% B (1 min). At 39 min, the gradient was held at 80% B for 4 min and, at 44 min, it returned to 5% B to re-equilibrate the column for 16 min before the next injection. One blank was run between each series to prevent sample cross-contamination. Peptides eluted from the column were analyzed by data-dependent MS/MS, using a top-10 acquisition method. Peptides were fragmented using higher-energy collisional dissociation (HCD). Briefly, the instrument settings were as follows: resolution was set to 70,000 for MS scans and 17,500 for the data-dependent MS/MS scans in order to increase speed. The MS automatic gain control (AGC) target was set to 3 × 10^6 counts with maximum injection time set to 200 ms, while the MS/MS AGC target was set to 1 × 10^5 with maximum injection time set to 120 ms. The MS scan range was from 400 to 2000 m/z. Dynamic exclusion was set to 30 seconds duration.
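
As a reading aid, the elution programme described above can be written out as a simple timeline. The following is a minimal sketch rather than the instrument method file: the segment times and %B values are taken from the text, whereas the linear step from 80% back to 5% B between 43 and 44 min and the helper name `percent_b` are assumptions made for illustration.

```python
# Gradient segments as (start_min, end_min, start_%B, end_%B), taken from the text.
# The 43-44 min return to 5% B is assumed to be linear; the text gives only the end points.
GRADIENT = [
    (0, 38, 5, 40),    # 5% to 40% B over 38 min
    (38, 39, 40, 80),  # 40% to 80% B over 1 min
    (39, 43, 80, 80),  # hold at 80% B for 4 min
    (43, 44, 80, 5),   # return to 5% B (assumed linear step)
    (44, 60, 5, 5),    # re-equilibrate at 5% B for 16 min
]


def percent_b(t_min):
    """Return the programmed %B at time t_min by linear interpolation."""
    for start, end, b0, b1 in GRADIENT:
        if start <= t_min <= end:
            return b0 + (b1 - b0) * (t_min - start) / (end - start)
    raise ValueError("time outside the programmed gradient")


# Total time per injection implied by the programme (in minutes).
total_run_min = GRADIENT[-1][1]
```

Under these assumptions each injection occupies 60 min, of which the first 38 min form the analytical gradient.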

Data processing following LC-MS/MS acquisition
The MS files were processed with the MaxQuant software version 1.5.8.3 and searched with the Andromeda search engine against the Homo sapiens database from Swiss-Prot (07/2017). To search parent mass and fragment ions, we set an initial mass deviation of 4.5 ppm and 20 ppm respectively. The minimum peptide length was set to 7 amino acids and strict specificity for trypsin cleavage was required, allowing up to two missed cleavage sites. Carbamidomethylation (Cys) was set as a fixed modification, whereas oxidation (Met) and N-term acetylation were set as variable modifications. Match between runs was not allowed. Label-free quantification (LFQ) minimum ratio count was set to 1. The false discovery rates (FDRs) at the protein and peptide level were set to 1%. Scores were calculated in MaxQuant as described previously (Cox et al., 2008). The reverse and common contaminant hits were removed from the MaxQuant output. Proteins were quantified according to the MaxQuant label-free algorithm using LFQ intensities (Cox et al., 2014; Luber et al., 2010). Data were analyzed with Perseus software (version 1.6.0.7), freely available at www.perseus-framework.org (Tyanova et al., 2016). The LFQ data were log2-transformed. All proteins identified in at least 5 of the 6 biological replicates per group were submitted to a statistical test (volcano plot, FDR=0.05 and S0=2) after imputation of the missing values from a Gaussian distribution of random numbers with a standard deviation of 30% relative to the standard deviation of the measured values and a downshift of the mean by 3 standard deviations, to simulate the distribution of low signal values. Protein annotations (GO, Keywords) were retrieved directly via Perseus.
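
The imputation step described above can be illustrated with a short sketch. This is not the Perseus implementation: it is a NumPy approximation that assumes imputation is done separately for each sample (column), and the function name `impute_low_signal`, the seed and the example matrix are hypothetical. The width (0.3) and downshift (3) are the values stated in the text.

```python
import numpy as np


def impute_low_signal(log2_lfq, width=0.3, downshift=3.0, seed=0):
    """Replace NaNs with draws from a narrowed, down-shifted Gaussian.

    For each column, missing values are drawn from a normal distribution with
    mean = observed mean - downshift * observed SD and SD = width * observed SD.
    Per-column imputation is an assumption of this sketch.
    """
    rng = np.random.default_rng(seed)
    values = np.asarray(log2_lfq, dtype=float).copy()
    for j in range(values.shape[1]):
        col = values[:, j]  # view into `values`, so assignment edits it in place
        missing = np.isnan(col)
        if not missing.any():
            continue
        mean = np.nanmean(col)
        sd = np.nanstd(col, ddof=1)
        col[missing] = rng.normal(mean - downshift * sd, width * sd, size=missing.sum())
    return values


# Hypothetical log2 LFQ matrix: rows are proteins, columns are samples.
example = np.array([
    [25.0, 26.0],
    [24.0, np.nan],
    [27.0, 25.5],
    [np.nan, 24.5],
])
imputed = impute_low_signal(example)
```

Measured values are left unchanged, and the imputed values fall in the low-intensity tail of each sample, which is the intended behaviour when missingness is assumed to reflect low abundance.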

Flow cytometry
For flow cytometry analysis, an equal number of cells per condition and for all controls were fixed in 2% PFA for 30min on ice. Cells were washed twice in cold PBS, then resuspended in 3ml ice-cold 70% ethanol and kept overnight (at least 24hrs) before further processing. The following day, cells were centrifuged at 1,500rpm for 5min at 4°C and washed twice in PBS before resuspension in a solution containing primary antibody diluted to the required concentration in 100µl PBS 0.5% Triton X-100, 1% BSA. Cells were stained for 30min on ice, then washed and resuspended in an Alexa-488 conjugated secondary antibody (LIFE TECHNOLOGIES) diluted 1/400 in PBS 0.5% Triton X-100, 1% BSA. After a 30min incubation, cells were again washed in PBS, centrifuged for 5min at 1,500rpm and then resuspended in propidium iodide (PI) staining solution (PBS 0.5% Triton X-100, 2mg RNase A and 250mg PI). Cells were incubated at room temperature for 30min with minimal exposure to light before acquisition on the flow cytometer (Kaluza for Gallios, BECKMAN COULTER). Analysis was performed using the Kaluza analysis software (BECKMAN COULTER).

Immunofluorescence
For immunofluorescence experiments, NPCs were seeded onto glass coverslips precoated with Poly-L-Ornithine and laminin at a minimum density of 1 × 10^6 cells/well of a 6-well dish. 24-48hrs after seeding, cells were fixed in 4% paraformaldehyde (PFA) diluted in PBS. After 20min, the PFA was removed and cells were washed once in PBS before a 5min incubation in 30mM glycine (SIGMA) to quench the PFA autofluorescence. Cells were then washed a further 2x in PBS and permeabilised by incubation in 0.5% Triton X-100 diluted in PBS for 5min. Cells were washed 3x in PBS and then incubated in PBS 1% BSA overnight at 4°C. The following day, coverslips were incubated in primary antibody diluted to the appropriate concentration in PBS 1% BSA, and then washed 3x before incubation in an Alexa-488 conjugated secondary antibody (LIFE TECHNOLOGIES) diluted 1/400, also in PBS 1% BSA. The coverslips were then washed a final 2x in PBS, once in water, and finally mounted onto glass slides using Mowiol mounting solution. Images were captured using a LEICA TCS SP8 SMD (Single Molecule Detection) confocal microscope.

Design of qPCR primers for U12 and SNORD3A
To design primers that amplified the long, unprocessed U12 transcript, we first retrieved the mature U12 RNA sequence, plus 200 nucleotides downstream of the 3' end, from BioMart. We cross-checked this sequence with that available on NCBI Gene. As the sequence of the 3' box in U12 has been previously described (Tarn et al., 1995), we placed the reverse primer downstream of this region. With regard to SNORD3A, our goal was to design primers that amplified a long and, in theory, unprocessed transcript. To do this, we used BioMart to retrieve the sequence encoding SNORD3A plus 200 nucleotides downstream of the 3' end. We placed the forward primer in the sequence of the mature SNORD3A transcript, and the reverse primer in the downstream sequence. Using the U1 3' box consensus sequence and the motif-matching software FIMO, we were unable to locate a putative 3' box in this region. However, as 3' cleavage is reported to occur 9-19 nucleotides downstream of the final nucleotide in the mature sequence (Hernandez, 1985), we ensured the reverse primer was beyond this region.
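
The placement rule used for the SNORD3A reverse primer can be expressed as a simple coordinate check. The sketch below is illustrative only: the function name, the 1-based coordinate convention and the example lengths are hypothetical and are not the actual transcript or primer coordinates used in this study. Only the 19-nucleotide upper bound of the cleavage window comes from the text.

```python
def reverse_primer_clears_cleavage_window(mature_length, primer_site_start, window_end=19):
    """Check that a reverse primer lies wholly downstream of the 3' cleavage window.

    mature_length     -- length of the mature RNA within the retrieved sequence
    primer_site_start -- 1-based position, in the retrieved sequence (mature RNA
                         plus downstream flank), of the most upstream base covered
                         by the reverse primer binding site
    window_end        -- last downstream position at which cleavage is expected
                         (19 nt, per the range cited in the text)
    """
    downstream_offset = primer_site_start - mature_length
    return downstream_offset > window_end


# Hypothetical example: a 100 nt mature RNA retrieved with its downstream flank.
ok_site = reverse_primer_clears_cleavage_window(100, 125)       # 25 nt downstream
too_close = reverse_primer_clears_cleavage_window(100, 112)     # 12 nt downstream
```

A primer site that passes this check can only be amplified from transcripts extending past the expected cleavage position, which is the property the design relies on.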

RNA-Sequencing and subsequent analysis
RNA-sequencing was performed on RNA extracted from human immortalised podocytes in which WDR73 was depleted using a commercially available siRNA pool, with podocytes treated with a non-targeting oligonucleotide used as a control. 48hrs following siRNA transfection, cells were subjected to 24hrs starvation in complete media supplemented with 0.1% FCS before stimulation with 100ng/ml EGF for 30min. The experiment was performed in quadruplicate. The three samples in which depletion of WDR73, as determined by RT-qPCR, was most comparable were submitted to the genomic platform of the Imagine Institute for subsequent analysis.
RNA-Seq libraries were prepared from 1µg of total RNA using the Universal Plus mRNA-Seq kit (NUGEN) as recommended by the manufacturer. Briefly, mRNA was captured using polyA+ magnetic beads and then fragmented chemically. Single strand and second strand cDNA were produced and then ligated to Illumina-compatible adapters with Unique Dual Indexes. Following an initial test to determine the suitable number of PCR cycles to apply to each sample, the cDNA was amplified by PCR. To produce oriented or 'stranded' RNA-Seq libraries, a final strand selection was performed. An equimolar pool of the final indexed RNA-Seq libraries was prepared (the NuQuant system from NUGEN was used to facilitate quantification and normalization of the RNA-Seq libraries) and sequenced on a NovaSeq6000 from Illumina (paired-end reads, 100 bases + 100 bases). A total of ~50 million passing-filter paired-end reads were produced per library.
FASTQ files were then mapped to the ENSEMBL Human (GRCh38/hg38) reference using Hisat2 and counted by featureCounts from the Subread R package (R version 3.5.2; 2018-12-20). Read count normalisations and group comparisons were performed by three independent and complementary statistical methods: DESeq2 (version 1.20), edgeR (version 3.22.5) and LimmaVoom (version 3.36.5). Flags were computed from counts normalized to the mean coverage. All normalized counts <20 were considered as background (flag=0) and >=20 as signal (flag=1). The P50 lists used for the statistical analysis comprise the genes showing flag=1 for at least half of the compared samples. The results of the three methods were filtered at p-value <=0.05 and fold changes of 1.2, 1.5 and 2, then compared and grouped by Venn diagram. Functional analyses were carried out using Ingenuity Pathway Analysis (IPA, QIAGEN).
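
The flagging and filtering logic described above can be sketched as follows. This is an illustration under stated assumptions, not the platform's pipeline: all function names and example values are hypothetical, the fold change is assumed to be a linear ratio treated symmetrically for up- and down-regulation, and only the central intersection of the Venn diagram (genes called by all three methods) is computed.

```python
import numpy as np


def p50_mask(norm_counts, threshold=20):
    """Flag counts >= threshold as signal; keep genes flagged in at least half the samples."""
    flags = np.asarray(norm_counts) >= threshold
    return flags.sum(axis=1) >= flags.shape[1] / 2


def passes(pvalue, fold_change, p_cut=0.05, fold_cut=1.2):
    """Apply p-value and fold-change cut-offs (symmetric for down-regulation; assumed)."""
    magnitude = max(fold_change, 1.0 / fold_change)
    return pvalue <= p_cut and magnitude >= fold_cut


def called_by_all(results_by_method, p_cut=0.05, fold_cut=1.2):
    """Return genes passing both cut-offs in every method (centre of the Venn diagram)."""
    gene_sets = [
        {gene for gene, (p, fc) in results.items() if passes(p, fc, p_cut, fold_cut)}
        for results in results_by_method.values()
    ]
    return set.intersection(*gene_sets)


# Hypothetical normalized counts: rows are genes, columns are the compared samples.
counts = [[25, 30, 10, 22], [5, 8, 30, 12], [20, 19, 20, 3]]
mask = p50_mask(counts)

# Hypothetical per-method results: gene -> (p-value, linear fold change).
results = {
    "DESeq2": {"A": (0.01, 1.5), "B": (0.20, 2.0), "C": (0.03, 0.70)},
    "edgeR": {"A": (0.02, 1.4), "B": (0.04, 1.9), "C": (0.01, 0.75)},
    "LimmaVoom": {"A": (0.04, 1.3), "C": (0.06, 0.70)},
}
consensus = called_by_all(results)
```

Raising `fold_cut` to 1.5 or 2 reproduces the stricter thresholds mentioned in the text; genes called by only one or two methods correspond to the outer regions of the Venn diagram and are not returned by this sketch.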