DDX41 coordinates RNA splicing and transcriptional elongation to prevent DNA replication stress in hematopoietic cells

Myeloid malignancies with DDX41 mutations are often associated with bone marrow failure and cytopenia before overt disease manifestation. However, the mechanisms underlying these specific conditions remain elusive. Here, we demonstrate that loss of DDX41 function impairs efficient RNA splicing, resulting in DNA replication stress with excess R-loop formation. Mechanistically, DDX41 binds to the 5ʹ splice site (5ʹSS) of coding RNA and coordinates RNA splicing and transcriptional elongation; loss of DDX41 prevents splicing-coupled transient pausing of RNA polymerase II at 5ʹSS, causing aberrant R-loop formation and transcription-replication collisions. Although the degree of DNA replication stress acquired in S phase is small, cells undergo mitosis with under-replicated DNA remaining, resulting in micronuclei formation and significant DNA damage, thus leading to impaired cell proliferation and genomic instability. These processes may be responsible for disease phenotypes associated with DDX41 mutations.

INTRODUCTION
DDX41 mutations occur in various hematopoietic malignancies, most frequently in acute myeloid leukemia (AML) and myelodysplastic syndromes (MDS) [1-4]. DDX41 encodes a DEAD-box-type RNA helicase that mainly localizes to the nucleus. Proposed biological functions of nuclear DDX41 include R-loop resolution [5,6], small nucleolar RNA processing [7], and ribosomal RNA (rRNA) processing [8]. DDX41 was also found in the spliceosome [9-12]. Notably, some individuals with a germline DDX41 variant in one allele later develop hematopoietic malignancies by acquiring a somatic mutation in the other allele. Most germline DDX41 variants are frameshift, nonsense, or missense mutations that occur throughout the coding region without any hotspots, which suggests that these germline variants cause loss of expression or function. In contrast, somatic mutations are highly concentrated at R525H, located within the helicase domain where DDX41 interacts with ATP. Indeed, our previous study revealed reduced ATPase activity for the helicase domain with the R525H mutation [8]. In addition, compound heterozygous mutations combining a germline variant and the somatic R525H mutation are observed in human AML/MDS, whereas homozygous Ddx41 knockout is lethal in mice [13]. Collectively, somatic mutants are considered to be functionally hypomorphic, and the acquisition of a somatic mutation by cells with a germline variant would be expected to further reduce DDX41 activity, but not to abolish it completely. The average age of disease onset for patients with a germline DDX41 variant is in the 60s, which is comparable to that of patients without the variant [2-4, 14, 15]. However, 40-66% of individuals with a heterozygous germline DDX41 variant present with unexplained cytopenia of one or more hematopoietic lineages before the development of hematopoietic malignancies [1,16].
Patients with DDX41 mutations often exhibit hypoplastic bone marrow, which is relatively characteristic of MDS/AML with this mutation. In addition, DDX41 mutations in MDS/AML are not necessarily mutually exclusive with those in genes encoding typical MDS-related RNA splicing factors [17-19], suggesting unique pathological implications of DDX41 mutations somewhat different from those of other MDS-related RNA splicing factors.
Here, we demonstrate that DDX41 mainly binds to 5ʹ splice sites (SS) of coding RNA and is involved in the formation of activated spliceosomes. DDX41 was responsible for interaction between the spliceosome and transcriptional elongation complex, thereby coordinating the two distinct machineries. Suppression of DDX41 function thus caused altered transcriptional elongation and R-loop accumulation, which led to mild DNA replication stress in S phase. Consequently, cells undergo mitosis without resolving this replication stress, leading to impaired cell proliferation and genomic instability.

RESULTS
DDX41 is an RNA splicing factor that binds mainly to the 5ʹSS
To investigate the role of DDX41 in RNA biogenesis, we analyzed RNAs with which DDX41 may interact by performing ultraviolet (UV) crosslinking, immunoprecipitation (IP), and sequencing (CLIP-seq) [20,21]. We UV-crosslinked K562 cells expressing Myc-tagged wild-type (WT) DDX41 or the R525H mutant, the most frequent somatic mutant found in myeloid malignancies, and sequenced RNAs co-precipitated with Myc-DDX41 (Supplementary Fig. S1A).
First, we mapped the sequenced reads to ribosomal DNA (rDNA) and found that 60.0-65.1% of CLIP-seq reads were uniquely mapped to rDNA, especially to the 18S and 28S rRNA regions, after removing duplicate reads (Supplementary Fig. S1B, C). Therefore, DDX41 likely interacts primarily with mature rRNA and may play a role in ribosome biogenesis, as we suggested previously [8]. We then mapped the remaining sequence reads to the human genome and found that 24.7-28.7% of CLIP-seq reads per sample were uniquely mapped to coding genes (Supplementary Fig. S1B), indicating that DDX41 also binds to coding RNA. DDX41 preferentially bound to 5ʹSS, and somewhat less to 3ʹSS, on coding RNAs (Fig. 1A). Mutant R525H DDX41 also bound to rRNA and coding RNA in the same manner as WT DDX41 (Fig. 1A and Supplementary Fig. S1C), which suggests that the R525H mutant retains RNA-binding activity comparable to that of WT DDX41, although the mutant has reduced ATPase activity [8].
Because DDX41 bound to or near SS on coding RNAs, we analyzed RNA splicing changes using rMATS [22]. We compared RNA sequencing (RNA-seq) data for K562 cells in which DDX41 expression was suppressed by DDX41-specific short hairpin RNAs (shRNA) (shDDX41#1 or shDDX41#2; DDX41-knockdown cells) with data for control cells expressing scrambled shRNA (shScramble). As a reference, we included RNA-seq data for cells expressing SRSF2 P95R, one of the RNA splicing-related mutants found in MDS, in the analysis. Similar trends of RNA splicing changes were observed in both DDX41-knockdown cells and SRSF2 P95R-expressing cells (Fig. 1B, C). However, RNA splicing changes in DDX41-knockdown cells were less frequent than those in SRSF2 P95R-expressing cells (Fig. 1B) and were either increased or decreased for each event type, with no particular trend toward either direction (Fig. 1D and Supplementary Fig. S1D), which differed from the pattern in SRSF2 P95R-expressing cells, in which skipped exon (SE) events were relatively suppressed [23]. Figure 1E shows that only 202 events were shared between the 703 and 593 differentially spliced events in shDDX41#1- and shDDX41#2-expressing cells, respectively. There was also little overlap when events were aggregated across all shared events that could affect the same exons (Supplementary Fig. S1E). No relevant sequence features existed in differentially spliced exons or their 5ʹSS and 3ʹSS (Supplementary Fig. S1F, G), although we found that DDX41 deficiency induced splicing changes in genes partially similar to those seen in myeloid malignancies with mutations in RNA splicing factors, accompanied by impaired mRNA production (Supplementary Fig. S1H, I, J, K).
These results suggest that DDX41 may be involved in RNA splicing and mRNA synthesis by interacting with 5ʹSS, but plays a smaller role in determining RNA splicing position and inclusion/exclusion of exons. Nevertheless, RNA fluorescence in situ hybridization with oligo(dT) probes demonstrated speckle-like signals in DDX41-knockdown cell nuclei (Fig. 1F), suggesting that the loss of DDX41 caused impaired maturation or export of mRNA due to a defect in mRNA synthesis.
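The event filtering and overlap comparison described for Fig. 1B and 1E can be illustrated with a minimal Python sketch. It assumes that each rMATS event has been reduced to a record with a hypothetical `event_id`, an inclusion level difference (`ild`), and an `fdr`; these field names, the helper functions, and the toy records are illustrative assumptions and are not the authors' pipeline. The thresholds (|ILD| > 0.1, FDR < 0.05) and the counts 703, 593, and 202 are taken from the text.

```python
from typing import Dict, Iterable, NamedTuple, Set


class SplicingEvent(NamedTuple):
    """Hypothetical, simplified view of one rMATS event (not the rMATS file format)."""
    event_id: str  # illustrative identifier, e.g. event type plus gene
    ild: float     # inclusion level difference, knockdown vs. control
    fdr: float     # false discovery rate


def significant_ids(events: Iterable[SplicingEvent],
                    ild_cutoff: float = 0.1,
                    fdr_cutoff: float = 0.05) -> Set[str]:
    """Return IDs of events with |ILD| > ild_cutoff and FDR < fdr_cutoff."""
    return {e.event_id for e in events
            if abs(e.ild) > ild_cutoff and e.fdr < fdr_cutoff}


def overlap_summary(n_a: int, n_b: int, n_shared: int) -> Dict[str, float]:
    """Summarize how strongly two sets of significant events overlap."""
    union = n_a + n_b - n_shared
    return {
        "fraction_of_a": n_shared / n_a,
        "fraction_of_b": n_shared / n_b,
        "jaccard": n_shared / union,
    }


# Toy records (invented for illustration only).
toy_kd1 = [
    SplicingEvent("SE:geneA", 0.25, 0.01),     # passes both cutoffs
    SplicingEvent("A5SS:geneB", 0.05, 0.001),  # |ILD| too small
    SplicingEvent("RI:geneC", -0.30, 0.02),    # passes both cutoffs
    SplicingEvent("MXE:geneD", 0.40, 0.20),    # FDR too high
]
toy_kd2 = [
    SplicingEvent("SE:geneA", 0.18, 0.03),
    SplicingEvent("A3SS:geneE", -0.22, 0.04),
]

shared = significant_ids(toy_kd1) & significant_ids(toy_kd2)

# Counts reported in the text for shDDX41#1, shDDX41#2, and their overlap.
reported = overlap_summary(703, 593, 202)
```

With the reported counts, the shared events amount to roughly 29% of the shDDX41#1 events, 34% of the shDDX41#2 events, and a Jaccard index of about 0.18, which is one way to quantify the limited overlap noted above.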

DDX41 interacts with activated spliceosomes
To further investigate the roles of DDX41 in RNA splicing, we analyzed proteins with which DDX41 may interact. FLAG-tagged DDX41 was immunoprecipitated with its interacting proteins from the nuclear fraction of FLAG-DDX41-expressing K562 cells, and we analyzed them by mass spectrometry. Gene ontology (GO) analysis of proteins that specifically interacted with FLAG-DDX41 revealed that DDX41 had a high likelihood of interacting with proteins related to RNA splicing (Fig. 2A and Supplementary Fig. S2), including components of catalytic step 1 and step 2 spliceosomes. Because these spliceosomes include several components of the NineTeen complex (NTC), we investigated interactions of DDX41 with the NTC. In RNA splicing, the NTC, with PRP19 as a core component, joins the spliceosome together with the U4/U6.U5 tri-snRNP and plays essential roles in spliceosome activation and reorganization by exchanging factors in the complex (Fig. 2B) [24]. The NTC also promotes transcriptional elongation by interacting with an RNA polymerase II (Pol II)-containing complex [25,26]. We found that DDX41 interacted with PRP19 and CDC5L, which are core components of the NTC, and with Yju2 (CCDC94), CWC22, and CWC25, which are NTC-related factors, but DDX41 interacted very weakly with CWC27 (Fig. 2C). Interaction between DDX41 and PRP19 was likely not mediated by RNA, because RNase A treatment did not impair the interaction (Fig. 2D). Because CWC27 dissociates from the spliceosome during conversion from the B* to the C complex [27], and CWC25 and Yju2 dissociate during conversion from the C to the C* complex [24] (Fig. 2B), the spliceosome with which DDX41 mainly interacted appeared to be the C complex, which has completed SS recognition. This result agrees with papers reporting that DDX41 occurred in the C complex containing the activated spliceosome [9,10,24,28].

Fig. 1 DDX41 is involved in RNA splicing by binding to 5ʹSS but does not play a major role in SS recognition. A Relative CLIP-seq signals at 5ʹSS and 3ʹSS on coding RNA. Vertical axis: ratio of CLIP sample signal divided by that of input RNA from the same cells. Blue and red lines in top panels indicate relative signal enrichment of CLIP reads from cells expressing Myc-tagged WT DDX41; green and orange lines in bottom panels indicate reads from cells expressing Myc-tagged R525H mutant DDX41. B Quantification of RNA splicing changes in K562 cells expressing shDDX41#1, shDDX41#2, or SRSF2 P95R. We placed splicing events into groups according to rMATS: (1) skipped exon (SE), (2) alternative 5ʹSS (A5SS), (3) alternative 3ʹSS (A3SS), (4) mutually exclusive exons (MXE), and (5) retained intron (RI). The cumulative number of events in each cell group with an inclusion level difference (ILD) > 0.1 or < −0.1 and a false discovery rate (FDR) < 0.05 is shown. C Distribution of RNA splicing events in K562 cells expressing shDDX41#1, shDDX41#2, or SRSF2 P95R compared with control K562 cells. Splicing events were categorized as in B. D Changes in RNA splicing events for SE in DDX41-knockdown cells and SRSF2 P95R-expressing cells. We included splicing events with a 10% minimum change of absolute percent spliced-in index (PSI, which indicates the rate of incorporation of a specific exon into the transcript of a gene) (delta PSI ≥ 0.1) and average reads ≥ 5; those with FDR < 0.05 and ILD < −0.1 or > 0.1 in each group were considered significant and plotted with red or blue dots, respectively. Gray dots are not significant. E Overlap of RNA splicing events among DDX41-knockdown cells and SRSF2 P95R-expressing cells. All significant RNA splicing events (SE, MXE, RI, A5SS, and A3SS) in each cell type were summed, and event overlap among DDX41-knockdown cells and SRSF2 P95R-expressing cells is shown. F Subcellular distribution of poly(A)-tailed RNA in DDX41-knockdown cells. Scale bars: 20 μm.
The factors within the NTC are incorporated into or excluded from the complex depending on the splicing step: core NTC components (PRP19 and CDC5L) remain in the NTC throughout after incorporation of the complex into the spliceosome; CWC25 and Yju2 are incorporated before the B* complex and excluded before the C* complex, and CWC27 is excluded before the C complex. C Interaction of DDX41 with RNA splicing process-specific components in the NTC. Myc-tagged NTC components (Yju2, CWC22, CWC25, CWC27, PRP19, and CDC5L) were expressed with FLAG-tagged DDX41 in HEK293FT cells, and DDX41-interacting proteins were immunoprecipitated with an anti-FLAG antibody. Precipitated proteins were probed with anti-FLAG, anti-Myc, or anti-β-Actin antibody. Left and right panels indicate input and immunoprecipitated samples, respectively. D Non-RNA-mediated interaction of DDX41 with PRP19. FLAG-tagged DDX41 and Myc-tagged PRP19 were expressed in HEK293FT cells, and FLAG-DDX41 was immunoprecipitated with anti-FLAG antibody. Precipitated samples were then treated with 20 μg/ml RNase A for 30 min at 37°C before being probed with anti-FLAG or anti-Myc antibody.
Next, we examined cell cycle progression using U2OS cells stably expressing GFP-tagged histone H2B. DDX41-knockdown cells had a longer mitotic period, with abnormal mitosis and postmitotic nuclear morphologies (Supplementary Fig. S3A and Supplementary Movies 1 to 3). The cells frequently manifested multipolar centrosomes and scattered chromosomes in metaphase and lagging chromosomes and DNA bridges in anaphase (Fig. 3D), which was also the case in HeLa cells and even in noncancerous ARPE-19 epithelial cells (Supplementary Fig. S3B). γ-H2AX-positive micronuclei, a hallmark of genomic instability, were frequently found in DDX41-knockdown K562 cells (Fig. 3E and Supplementary Fig. S3C).
We observed G2/M arrest and impaired DNA replication in DDX41-knockdown cells, as indicated by the increase in cells with double the amount of DNA (4N) and reduced bromodeoxyuridine (BrdU) incorporation (Fig. 3F). Slowed replication fork progression was also observed in a DNA fiber assay (Fig. 3G), accompanied by an enhanced γ-H2AX signal (Fig. 3H), which is consistent with a previous report describing the involvement of RNA processing factors in the DNA damage response [32]. Impaired DNA replication here was not severe enough to induce S-phase arrest (Fig. 3F). However, additional analyses with an anti-phosphorylated histone H3 (pHH3) antibody, which marks mitotic cells, showed that the G2 phase population contributed substantially to G2 and M phase accumulation on day 3 of siDDX41 transfection (Fig. 3I). Thus, problems with DNA replication during S phase, even if not readily apparent, may trigger significant cell cycle changes and abnormal cellular morphology.
In accord with the knockdown experiments, DDX41inh-2 inhibited BrdU incorporation (Fig. 4B). However, the reduction in BrdU incorporation was modest at 48 h after addition of the inhibitor, when cell growth was markedly inhibited, and as long as 72 h was needed before a marked decrease in BrdU incorporation. This finding supports our hypothesis that problems resulting from DNA replication defects, rather than impaired replication itself, were responsible for the marked cell cycle changes.
These cellular changes associated with DDX41 inhibition were similar to mild replication stress observed with low concentrations of aphidicolin (APH), a DNA polymerase α inhibitor (Supplementary Fig. S4C, D) [35,36]; APH treatment at 10 nM induced a slight delay in S-phase progression, as did the DDX41 inhibitor (Fig. 4E and Supplementary Fig. S4E).
We found that DDX41 inhibition for 6 h during S phase increased single-stranded DNA to the same extent as did 10 nM APH (Fig. 4G). Enhanced and prolonged Chk1 activation in S phase was also seen in DDX41inh-2-treated cells (Fig. 4H). These data indicated that mild DNA replication stress was induced by DDX41 inhibition. Nevertheless, an apparent increase in γ-H2AX signal was not observed until cells completed mitosis (Fig. 4H and Supplementary Fig. S4F), suggesting the requirement for mitosis in DDX41-inhibited cells to manifest marked DNA damage.
We also tested how DNA replication stress by DDX41 inhibition affected mitosis by arresting synchronized cells at G2 phase in the presence of DDX41inh-2, followed by its wash-off (Fig. 4I and Supplementary Fig. S4G). Cell cycle analysis showed that cells pretreated with DDX41inh-2 in S/G2 had a delayed return from 4N to 2N (Fig. 4J) and delayed G2-M transition (Fig. 4K), whereas DDX41 inhibition initiated from the end of G2 did not cause these delays (Supplementary Fig. S4H, I). These data indicated that mild DNA replication stress induced by DDX41 inhibition delayed G2-M transition, consistent with our findings showing G2 phase accumulation for asynchronized DDX41-knockdown cells (Fig. 3I).
DNA replication stress by DDX41 inhibition induces mitotic abnormalities and leaves DNA damage in post-mitotic cells
We then analyzed how DNA replication stress by DDX41 inhibition affected mitosis. We analyzed mitosis after 8 h of exposure to DDX41inh-2 during S phase without CDK1 inhibition by RO-3306 (Supplementary Fig. S5A). Although the frequencies of lagging chromosomes and multipolar mitosis were comparable between DDX41inh-2-treated cells and DMSO-treated cells, DDX41inh-2 treatment increased mitotic DNA bridges and ultrafine bridges positive for Bloom syndrome protein (BLM), which strongly suggested an increase in under-replicated DNA in S phase (Fig. 5A) [37]. DDX41 inhibition during S phase led to reduced IdU incorporation (Fig. 5B and Supplementary Fig. S5A) and subsequent G2 arrest in daughter cells (Fig. 5C, D), similar to the results in asynchronized DDX41-knockdown cells (Fig. 3F). We also found that DDX41 inhibition in S phase increased the number of nuclear γ-H2AX foci in G1 daughter cells (Fig. 5E and Supplementary Fig. S5B). In DDX41-knockdown cells, modest γ-H2AX accumulation first occurred in G1 (Fig. 5F, Day 3), before accumulation expanded to all cell cycle phases (Fig. 5F, Day 4). These observations suggest that modest DNA replication stress acquired during S phase by DDX41 inhibition persists and that cells enter mitosis in an under-replicated state; interphase checkpoint mechanisms would overlook replication stress caused by DDX41 inhibition, which would let cells enter mitosis without sufficient repair of DNA lesions.

DDX41 protects DNA from R-loop formation
We postulated that impaired RNA splicing caused by DDX41 deficiency induced formation of an R-loop, a genomic structure that consists of a DNA:RNA hybrid and displaced single-stranded DNA, which in turn causes DNA replication stress [38].
Immunofluorescence staining of cells with S9.6 antibody, which is used to detect R-loops [39], showed increased nuclear S9.6 signals in DDX41-knockdown cells (Fig. 6A). Treatment with actinomycin D (ActD) inhibited this S9.6 signal induction (Fig. 6A), which confirmed that accumulation of R-loops by DDX41 knockdown depended on transcription. Increased nuclear S9.6 signals by DDX41inh-2 were also noted in the first S phase after G1 arrest (Fig. 6B). In support of our hypothesis, overexpression of RNase H1 modified to preferentially localize to the nucleus [40] attenuated γ-H2AX signal induction in DDX41-knockdown cells (Fig. 6C). These data indicated that R-loops aberrantly formed in a transcription-dependent manner were responsible for the DNA damage caused by loss of DDX41 function.
Given this, we immunoprecipitated R-loops from HEK293FT cells expressing a Myc-tagged RNase H1 mutant (D210N) that lacks enzymatic activity but retains R-loop-binding capacity (Supplementary Fig. S6A, B). The 431 R-loop-interacting proteins identified, of which 26-53% overlapped with those identified in other studies [5,41,42], included certain RNA processing factors and Pol II, confirming efficient R-loop recovery (Supplementary Fig. S6B and Supplementary Table S1). However, DDX41 was not identified in the co-precipitates in our assay (Supplementary Fig. S6B and Supplementary Table S1), suggesting that R-loop accumulation upon loss of DDX41 function arises primarily from the R-loop formation process rather than from impaired resolution of the R-loop itself.

DDX41 promotes transcriptional elongation by facilitating cooperation between the spliceosome and transcriptional elongation complex
To investigate the mechanism by which DDX41 suppression caused DNA replication stress with R-loop accumulation, we analyzed the Cancer Dependency Map (DepMap) dataset [43]. In support of our observations (Fig. 3B, C), the dataset showed that knockdown and knockout of DDX41 both broadly affected cell viability (gene effect scores for CRISPR and RNAi were −1.07 ± 0.18 and −0.84 ± 0.21, respectively) (Fig. 7A), which strongly suggests that DDX41 is essential for cell survival. GO analysis of 100 and 47 genes that co-depended with DDX41 in knockdown and knockout screens, respectively, showed that GO terms related to RNA Pol II-mediated transcription and RNA splicing, including catalytic step 2 spliceosome and splicing C complex, were significantly enriched (Fig. 7B, Supplementary Fig. S7A and Supplementary Table S2). These results suggested that DDX41 maintains cell survival through regulation of RNA splicing and Pol II-mediated transcription. Importantly, RNA splicing is tightly coupled with transcriptional elongation [44-46]. Indeed, we found an interaction of DDX41 with NTC components that were also involved in transcriptional elongation (Fig. 2C) [26].
These DepMap data may be reflected in our transcriptome analyses of DDX41-knockdown cells (Fig. 7C). Gene set enrichment analysis (GSEA) [47] showed that gene sets related to transcriptional elongation, RNA helicase activity, and mRNA splicing were enriched among the genes down-regulated in DDX41-knockdown cells (Fig. 7D and Supplementary Fig. S7B). Comparable results were obtained via analyses of published datasets of 451 AML cases (Fig. 7E) [48]. In addition, with the same clinical dataset, we confirmed that the expression of many genes related to catalytic step 2 spliceosome and Pol II-mediated transcriptional elongation was interdependent with DDX41 expression (Supplementary Fig. S7C). Thus, reduced expression of the genes related to these processes in cells with lower DDX41 expression was likely due to negative feedback regulation.
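The stratification of AML cases by DDX41 expression used for Fig. 7E (cases below mean − SD versus above mean + SD, as stated in the figure legend) can be sketched as follows. The function name, the choice of the sample standard deviation, and the toy expression values are assumptions made for illustration; the source does not specify these details, and this is not the authors' code.

```python
from statistics import mean, stdev
from typing import Dict, List


def stratify_by_expression(expr: Dict[str, float]) -> Dict[str, List[str]]:
    """Split cases into low (< mean - SD), high (> mean + SD), and mid groups."""
    values = list(expr.values())
    mu = mean(values)
    sd = stdev(values)  # sample SD; an assumption, the source does not say which SD
    groups: Dict[str, List[str]] = {"low": [], "mid": [], "high": []}
    for case_id, value in expr.items():
        if value < mu - sd:
            groups["low"].append(case_id)
        elif value > mu + sd:
            groups["high"].append(case_id)
        else:
            groups["mid"].append(case_id)
    return groups


# Toy expression values (invented; arbitrary units).
toy_expression = {
    "case1": 2.0,
    "case2": 5.0,
    "case3": 5.5,
    "case4": 4.5,
    "case5": 8.0,
}
groups = stratify_by_expression(toy_expression)
```

Only the "low" and "high" groups would then be compared in the downstream enrichment analysis; cases within one SD of the mean are set aside.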
To investigate the relationship between DDX41-mediated RNA splicing and transcriptional elongation, we performed IP-Western blotting. We found that DDX41 interacted with Pol II phosphorylated at S2 and S5 (pS2 and pS5) in the C-terminal domain (CTD) and that RNA did not mediate the interaction (Fig. 7F), raising the possibility that DDX41 coordinates RNA splicing and transcriptional elongation.
We then performed fractionation-assisted native chromatin IP and sequencing analysis (ChIP-seq) [49] with an antibody against the CTD of Pol II (with only shDDX41#1-expressing K562 cells because few shDDX41#2-expressing cells were available). ChIP-seq data showed that the average distribution of Pol II on the gene body was reduced in DDX41-knockdown cells (Fig. 7G), which may be partially due to reduced Pol II expression in DDX41-knockdown cells (Fig. 7H). We analyzed Pol II occupancy around 5ʹSS and 3ʹSS; interestingly, Pol II had a peak at 5ʹSS in the control cells, which disappeared in DDX41-knockdown cells (Fig. 7I). For 3ʹSS, Pol II signals decreased after exon-intron junctions, but the control and DDX41-knockdown cells showed a comparable shift (Fig. 7J), which confirmed the significance of changes in Pol II occupancy at 5ʹSS. Loss of the 5ʹSS peak in DDX41-knockdown cells did not depend on the difference in gene expression levels between DDX41-knockdown and control cells, because a similar pattern at 5ʹSS was observed even when we performed the same analysis for the exons of genes expressed above the median in DDX41-knockdown cells (Supplementary Fig. S7D). These observations, along with our CLIP-seq data (Fig. 1A) demonstrating selective binding of DDX41 to 5ʹSS, support a unique role for DDX41 around 5ʹSS. We next studied whether the observed signal accumulation of Pol II at 5ʹSS was related to RNA splicing. We selected constitutively skipped exons in adequately transcribed genes in control cells for analysis (Supplementary Fig. S7E). We found no clear peaks at 5ʹSS and 3ʹSS on such skipped exons (Fig. 7K), although a signal fluctuation occurred because of the small number of exons included in the analysis. These observations indicated that Pol II globally paused at 5ʹSS in association with RNA splicing, similar to a previous report [50], but this phenomenon no longer took place when DDX41 expression was decreased, even though RNA splicing changes occurred only in a subset of exons (Fig. 1). This result may be attributed to the reduced interaction of PRP19 with pS2 and pS5 Pol II (Fig. 7L), especially with pS2 Pol II, which is supported by findings from other groups that RNA splicing depends on the interaction of PRP19 with the Pol II CTD [51-54]. All our data suggest that DDX41 coordinates RNA splicing and transcriptional elongation at 5ʹSS, thereby inhibiting aberrant R-loop accumulation and subsequent DNA replication stress.

See Supplementary Fig. S5A for schematic. Bars indicate means; error bars, SD of triplicate samples; two-tailed Welch's t test. C, D Cell cycle arrest at G2 phase in HeLa cells that had been treated with DDX41inh-2 in S phase and had undergone mitosis. Cells were treated as in Fig. 4I. Cell cycle status 18 h after removal of DDX41inh-2 and RO-3306 was identified by PI staining (C). Cells were double-stained with PI and anti-pHH3 antibody to distinguish mitotic cells from cells at G2 (D). E Increase in γ-H2AX foci in G1 HeLa cells that had been treated with DDX41inh-2 in S phase and had undergone mitosis. Cells were stained with anti-γ-H2AX antibody and DAPI. See Supplementary Fig. S5B for schematic. Bars represent means ± SD; n = 151 and 195 for DMSO and DDX41inh-2, respectively; two-tailed Welch's t test. Scale bars: 10 μm. F Increase in γ-H2AX signals primarily occurred at G1 in HeLa cells after DDX41 knockdown. Cells were stained with anti-γ-H2AX and anti-pHH3 antibodies and PI. MFI of γ-H2AX in each cell cycle phase was analyzed with flow cytometry. Bars indicate means; error bars, SD of triplicate samples; two-tailed Student's t test.
immature bone marrow cells isolated from heterozygous (Ddx41 R525H/WT ) and WT (Ddx41 WT/WT ) mice were cultured in the presence of cytokines to support stem/progenitor cell growth, followed by induction of the R525H mutation in Ddx41 R525H/WT cells by adding (Z)-4-hydroxytamoxifen (4-OHT). Induction of the mutant was confirmed by direct sequencing (Supplementary Fig. S8C). As in our previous study [8], induction of the R525H mutation inhibited cell proliferation (Fig. 8A). Moreover, these cells showed increased R-loop formation, increased phosphorylation of replication protein A 32 (RPA32) at serine 4 and 8 residues, which is a marker of fork collapse [55], and increased γ-H2AX signals (Fig. 8B, C), which confirmed that the increased R-loop formation and DNA damage from DDX41 inhibition observed in cancer cell lines also occurred in primary immature hematopoietic cells. Ddx41 R525H/WT cells further showed increased micronucleus formation and abnormal nuclear morphology (Fig. 8D, E), which suggested involvement of loss of Ddx41 function in genomic instability. Finally, we observed R-loops in cultured hematopoietic cells from Ddx41 heterozygous (Ddx41 WT/KO ) mice. Increased R-loop formation was observed in Ddx41 WT/KO cells, which was further enhanced by the induction of the R525H mutation (Ddx41 R525H/KO ) (Supplementary Fig. S8D). These observations suggest that a reduction of DDX41 expression level causes a dose-dependent disturbance and that the R525H mutation is functionally hypomorphic.
In conclusion, our study identified a process by which loss of DDX41 expression or its helicase activity caused spliceosomal dysfunction and impaired transcriptional elongation, thus leading to DNA replication stress that resulted in genomic instability. Our data suggest a model in which DDX41 regulates Pol II pausing at the 5ʹSS with the NTC, and Pol II waits there for RNA splicing to complete, but if DDX41 is deficient, transcriptional elongation machinery may proceed without slowing down at 5ʹSS (Fig. 8F) or may terminate elongation and dissociate from chromatin. Although DNA replication defects in the absence of DDX41 are not necessarily severe, persistence and gradual accumulation of minor replication stress beyond mitosis would cause hematopoiesis failure that is often observed in bone marrow of patients with DDX41 mutation.

DISCUSSION
Mutation or aberrant expression of genes encoding an RNA helicase occurs in various malignancies [56-58], which suggests that abnormal RNA recombination is closely associated with tumorigenesis. In hematopoietic malignancies, thus far, only DDX41, encoding a DEAD-box-type RNA helicase, and DHX15 and DHX34, encoding DEAH-box-type helicases, were reproducibly mutated in AML/MDS [31,59,60]. Characterization of these gene products would provide an excellent model for clarifying the biological roles of RNA helicases and their involvement in leukemogenesis.
Here, we discovered that inhibition of DDX41 led to DNA replication stress with increased R-loop formation. Although the degree of the stress was modest, cells underwent mitosis with under-replicated DNA, resulting in genomic instability and growth inhibition. Hematopoietic stem cells (HSCs) have relatively long cell cycles and divide at a low rate, once every 40 weeks [61]. Therefore, most HSCs remain quiescent in a steady state and have fewer opportunities to enter S phase. However, HSCs become more likely to enter the cell cycle when they are exposed to inflammatory environments or when they are aged. Importantly, cycling aged HSCs in mice have higher levels of replication stress associated with cell cycle defects; such replication stress persists even after the cells re-establish quiescence [62]. Therefore, in the presence of DDX41 mutations, even weak replication stress can accumulate in aged HSCs. In addition, considering the sequential acquisition of DDX41 mutations observed in myeloid malignancies, hematopoietic cells with a heterozygous germline DDX41 variant would have to wait until they acquire a somatic mutation in the other allele to develop a myeloid malignancy, because DNA replication stress acquired in cells with a heterozygous variant may be limited even in aged HSCs. This was also suggested by the phenotypes observed in Ddx41 genetically modified mice [7]. These ideas may explain why carriers of germline DDX41 variants generally develop myeloid malignancy at older ages.
We also showed that DNA damage caused by loss of DDX41 function was dependent on transcription-related R-loop accumulation. Although impaired RNA splicing can lead to R-loop formation [63], the process remains poorly understood. Accumulating evidence implicates the interdependence of RNA splicing and transcriptional elongation [64]. Specifically, a recent study demonstrated that the upstream 5ʹSS remained associated with the transcription machinery during intron synthesis [50], consistent with our observation. Furthermore, Pol II paused at 5ʹSS resumed elongation after completion of RNA splicing [65]. Interestingly, our ChIP-seq analysis with an antibody against Pol II showed that Pol II likely fails to pause at 5ʹSS when not enough DDX41 is available. Taken together, we suggest that Pol II continues aberrant elongation without waiting for RNA splicing to finish, or alternatively, terminates elongation at 5ʹSS when RNA splicing is delayed by DDX41 deficiency. This model might be linked to a new perspective that the main obstacle to replication fork progression is the elongating Pol II engaged in an R-loop [66].
DDX41 depletion induced mild splicing changes without any specific pattern; one possible explanation is that DDX41 promotes efficient splicing at a late step rather than splice-site selection. However, further sequencing analysis of clinical specimens with DDX41 mutations is clearly needed to confirm this, given that the INTS3 intron retention observed in DDX41-knockdown cells was reported to be highly specific to SRSF2 mutation [67]. Since only a subset of R-loops can be hotspots for DNA damage [68], the regions and types of splicing defect may determine the impact on genomic stability. We conclude that the vulnerability in DNA replication that partially remains in daughter cells would be essential to explain the unique phenotype of DDX41-related myeloid malignancies.

Fig. 7 Changes in gene expression and distribution of Pol II by DDX41 knockdown. A Dependence of cell lines on DDX41. We used the DepMap portal (https://depmap.org/portal/). In the density distributions for CRISPR and RNAi data, smaller scores indicate that DDX41 is essential for cell line survival; a score of −1 is comparable to the median of all pan-essential genes. B Genes co-dependent with DDX41 include those related to RNA splicing and Pol II-mediated transcription. Top co-dependent genes with DDX41 (with q values <0.05) identified in CRISPR screening were subjected to GO analysis; results were visualized via g:Profiler (upper). Representative GO terms related to RNA splicing (red) and Pol II-mediated transcription (blue) were numbered (lower). Table S1 gives the complete list. C Gene expression changes by DDX41 knockdown. A hierarchical clustering of 1341 genes that showed expression changes with p < 0.05 in common in shDDX41#1- and shDDX41#2-expressing DDX41-knockdown cells compared with shScramble-expressing control cells was visualized. D Representative gene sets associated with RNA splicing and transcriptional elongation negatively enriched in shDDX41#1- and shDDX41#2-expressing cells. ES, enrichment score; NES, normalized enrichment score. E Representative gene sets associated with transcriptional elongation and RNA processing negatively enriched in DDX41 low-expressing AML cases. The 451 AML cases presented in the article by Tyner et al. [48] were divided into three groups according to the expression level of DDX41, and the transcriptome differences between groups with DDX41 expression levels below mean − SD (DDX41 low) and above mean + SD (DDX41 high) were examined. F Direct interaction of DDX41 with Pol II in HEK293FT cells. Protein extracts from cells expressing FLAG-DDX41 were immunoprecipitated with anti-FLAG antibody or control IgG and then probed with anti-FLAG and anti-Pol II (pS2 and pS5) antibodies.
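The three-way grouping of AML cases described for Fig. 7E (DDX41 expression below mean − SD, above mean + SD, or in between) can be sketched as follows. This is a minimal illustration under stated assumptions, not the pipeline used in the study: the function name, the sample labels, and the expression values are hypothetical, and the use of the sample standard deviation is an assumption.

```python
from statistics import mean, stdev


def split_by_expression(expr, n_sd=1.0):
    """Group samples into 'low', 'mid', and 'high' by mean -/+ n_sd * SD.

    expr: dict mapping a sample label to its expression value.
    The sample SD (n - 1 denominator) is used here; this is an assumption.
    """
    values = list(expr.values())
    mu = mean(values)
    sd = stdev(values)
    low_cut = mu - n_sd * sd
    high_cut = mu + n_sd * sd

    groups = {"low": [], "mid": [], "high": []}
    for sample, value in expr.items():
        if value < low_cut:
            groups["low"].append(sample)
        elif value > high_cut:
            groups["high"].append(sample)
        else:
            groups["mid"].append(sample)
    return groups


# Hypothetical expression values, for illustration only.
example = {"s1": 2.0, "s2": 5.0, "s3": 5.0, "s4": 5.0, "s5": 8.0}
groups = split_by_expression(example)
print(groups)
```

Only the "low" and "high" groups would then be compared in a downstream enrichment analysis; the middle group is set aside, which sharpens the contrast at the cost of sample size.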

Cell cycle synchronization
HeLa cells were synchronized to the late G2 phase or G1/S boundary by double thymidine block with or without CDK1 inhibitor, respectively (see Supplementary Methods).

Flow cytometry
Cell cycle distribution, incorporation of BrdU or IdU, apoptosis and γ-H2AX expression were analyzed via flow cytometry (see Supplementary Methods).

CLIP-seq and ChIP-seq analyses
CLIP-seq was performed according to a previously reported method (Supplementary Fig. S1A) [20,21]. ChIP-seq was performed by applying the fractionation-assisted native ChIP method [49]. Detailed procedures are described in Supplementary Methods.

Mice
Ddx41 R525H cKI mice were crossed with C57BL/6-Gt(ROSA)26Sor tm1(cre/Esr1)Arte (ERT2Cre) mice (Model 6466, Taconic Biosciences) to allow tamoxifen-inducible excision of floxed regions. Ddx41 heterozygous knockout mice (C57BL/6N Ddx41 tm1a) were purchased from the UC Davis KOMP Repository (MMRRC Stock #047340-UCD). All mice were kept according to guidelines of the Institute of Laboratory Animal Science, Hiroshima University. The Animal Care Committee at the Japanese Foundation for Cancer Research approved all murine studies. Detailed procedures for generating Ddx41 R525H cKI mice and experimental procedures with the mice are described in Supplementary Methods.

Statistical analysis
Statistical analysis was performed with R software (version 4.1.2). All statistical details of experiments are given in Supplementary Methods and corresponding figure legends, with p values ≤0.01 being considered statistically significant unless otherwise indicated. Error bars in figures indicate standard deviation (SD).
Additional methods are available in the Supplementary Information.

DATA AVAILABILITY
RNA-seq, ChIP-seq, and CLIP-seq data are available in the sequence read archive (SRA) database (https://ddbj.nig.ac.jp/search) (accession ID for CLIP-seq: DRA011992, for RNA-seq: DRA014267, and for ChIP-seq: DRA014255). The datasets generated during the study are available from the corresponding author upon reasonable request.