A spliceosomal twin intron (stwintron) participates in both exon skipping and evolutionary exon loss

Spliceosomal twin introns (stwintrons) are introns where any of the three consensus sequences involved in splicing is interrupted by another intron (internal intron). In Aspergillus nidulans, a donor-disrupted stwintron (intron-1) is extant in the transcript encoding a reticulon-like protein. The orthologous transcript of Aspergillus niger can be alternatively spliced; the exon downstream of the stwintron can be skipped by excising a sequence that comprises this stwintron, the neighbouring intron-2, and the exon bounded by these. This process involves the use of alternative 3′ splice sites for the internal intron, the resulting alternative intervening sequence being a longer 3′-extended stwintron. In 29 species of Onygenales, a multi-step splicing process occurs in the orthologous transcript, in which a complex intervening sequence including the stwintron and the neighbouring intron-2 generates, by three splicing reactions, a “second order intron” which must then be excised with a fourth splicing event. The gene model in two species can be envisaged as one canonical intron (intron-1) evolved from this complex intervening sequence of nested canonical introns found elsewhere in Onygenales. Postulated splicing intermediates were experimentally verified in one or more species. This work illustrates a role of stwintrons in both alternative splicing and the evolution of intron structure.


In Eukaryotes, the primary transcripts of nuclear genes are frequently interrupted by non-coding sequences (introns) which must be excised to generate a translatable mRNA. A dedicated organelle, the spliceosome, is responsible for this process 1 . The spliceosome is a nuclear ribonucleoprotein complex which includes small nuclear RNAs indispensable for its function 2 . These snRNAs interact with three canonical intronic sequence elements, the "donor" at the 5′ splice site, the "acceptor" at the 3′ splice site and the internal sequence element defining the lariat branchpoint adenosine. Two subsequent trans-esterification reactions involving these three sequence elements are necessary for intron excision, which results in the precise fusion of the two bordering exon sequences and the release of the integral intron sequence in the form of a lariat. The major (U2) and the minor (U12) spliceosomes employ functionally analogous but structurally distinct snRNAs to facilitate recognition of different splice sites but the principles of the two-step splicing reaction are essentially the same for both.
Alternative splicing of spliceosomal introns has been described as a means to generate protein diversity 3 . In humans, it is estimated that the transcripts of ~95% of all multi-exon genes can be alternatively spliced 4 . Different mechanisms (exon skipping, intron retention, the use of alternative 5′ or 3′ splice sites, and mutually exclusive excision of overlapping introns) can lead to different ORFs derived from the same primary transcript. Moreover, alternative splicing may result in an untranslatable mRNA or lead to a premature termination codon, and as such could constitute a mechanism of regulation of gene expression at the post-transcriptional level, coupling alternative splicing with nonsense-mediated mRNA decay 5 . The question arises to what extent alternative splicing occurs as a common, physiological means of regulation of eukaryotic gene expression, or whether the formation of most "nonfunctional" RNA species is merely an inevitable consequence of the inaccuracy inherent to intron excision by the spliceosome 6 .
The intron-exon structure of transcripts of nuclear genes is phylogenetically dynamic, as extant introns can be lost and new introns can arise throughout evolution 7 . Complex intervening sequences have been described which consist of more than one U2 intron 8 . One type of complex intervening sequence is made up of abutting U2 introns which are excised by recursive splicing 9,10 . The second type consists of intronic sequences in which one or more U2 introns are nested within one or more other U2 introns; these can be, but are not necessarily, removed sequentially [11][12][13] . We have described spliceosomal twin introns ("stwintrons") in fungi [14][15][16][17] . These are complex intervening sequences where the "external" intron can only be removed after the excision of the "internal" intron nested within. Stwintrons are thus formally the spliceosomal analogues of the original group II/III twin introns in the Euglena gracilis plastid DNA 18 . Nevertheless, stwintrons are basically one particular type of complex intervening sequence consisting of nested U2 introns, and there are no reasons to suspect that they would not occur in other eukaryotic kingdoms.
The internal U2 intron of a stwintron can be located within any of the three canonical intronic sequences essential for splicing of the (disrupted) external intron. We have characterised two of the three possible classes of stwintrons (Fig. 1), those where the internal intron interrupts the donor (named [D] stwintrons) or the acceptor ([A] stwintrons) of the external intron [14][15][16][17] . We have not published evidence for the existence of the third class, in which the internal intron disrupts the sequence element around the lariat branchpoint adenosine of the external intron ([L] stwintrons).
In higher metazoa (vertebrata in particular), 5′ and 3′ splice sites in the primary transcript are initially paired across each exon ("exon definition"), enabling the existence of large introns alternating with much smaller exons 19,20 . Nevertheless, short introns exist in, e.g., Drosophila melanogaster 21 and zebrafish 22 and their proper excision is preceded by interactions across the intron ("intron definition"). Moreover, the canonical U2 units of complex intervening sequences in higher metazoa (abutted U2 introns; nested U2 introns) 9,10,12,13 are subject to intron defining interactions. In ascomycete fungi 23 , almost all introns are small (often <100 nt), which implies an intron definition mechanism of splicing 24 . In fission yeast (Schizosaccharomyces pombe), it was convincingly demonstrated that splice site pairing is essentially confined to intron definition 25,26 . The need for consecutive U2 splicing reactions to excise a stwintron, necessarily "inside out", demonstrates that splice sites pair via intron definition in filamentous Ascomycota (Pezizomycotina), too. High-throughput transcriptome analysis has nevertheless shown that alternative splicing occurs frequently in fungi ( [27][28][29] , amongst others). The availability of complete genome sequences of more than a thousand species of Ascomycota provides an opportunity to investigate functional aspects of intron gain or loss across a whole phylum. Here, we describe a new [D5,6] stwintron in the model organism Aspergillus nidulans, where the internal intron is situated within the donor element of the external intron, between the fifth and the sixth nucleotide (nt). This stwintron is phylogenetically older than those published previously: it is present across the nine classes of the Pezizomycotina subphylum of which member species have been genome-sequenced.
We demonstrate that this stwintron can be involved in skipping the downstream exon from the transcript through alternative use of 3′ splice sites for its internal U2 intron. In specific taxa, the same stwintron is embedded in an even more complex intervening sequence that requires four splicing reactions to be excised.
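The "inside out" excision of a [D5,6] stwintron can be illustrated with a small sketch. The code below is a toy model only: the exon and intron sequences are invented, and the function names (`excise`, `splice_inside_out`) are hypothetical helpers, not part of any published pipeline. It shows that the donor of the external intron (5′-GUAAG|U) is not contiguous in the primary transcript and is reconstituted only once the internal intron has been removed.

```python
# Toy model of the inside-out excision of a [D5,6] stwintron.
# All sequences are invented for illustration; they are not taken from AN5404.

DONOR = "GUAAGU"  # donor of the external intron, split between nt 5 and 6 (GUAAG|U)
SPLIT = 5         # the internal intron is nested after the fifth donor nucleotide


def excise(rna: str, intron: str) -> str:
    """Simulate one splicing reaction: remove the first occurrence of `intron`.

    Raises ValueError if `intron` is not present as one contiguous sequence.
    """
    start = rna.index(intron)
    return rna[:start] + rna[start + len(intron):]


def splice_inside_out(exon1: str, internal: str, external: str, exon2: str):
    """Return (primary transcript, splicing intermediate, mature mRNA)."""
    if not external.startswith(DONOR):
        raise ValueError("external intron must start with the canonical donor")
    stwintron = external[:SPLIT] + internal + external[SPLIT:]
    primary = exon1 + stwintron + exon2
    intermediate = excise(primary, internal)   # first reaction: internal intron
    mature = excise(intermediate, external)    # second reaction: external intron
    return primary, intermediate, mature


# Invented example sequences (donor ... branchpoint element ... acceptor).
EXON1, EXON2 = "AUGGCC", "GCUUAA"
INTERNAL = "GUAUGUCCUUCCUGCUAACUUUCCAG"
EXTERNAL = "GUAAGUAACCAACACUAACUUCUUAG"

primary, intermediate, mature = splice_inside_out(EXON1, INTERNAL, EXTERNAL, EXON2)
print(primary)
print(intermediate)
print(mature)
```

In this toy transcript the external intron cannot be excised first, because its sequence is interrupted until the internal intron is gone; calling `excise(primary, EXTERNAL)` raises a `ValueError`.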

Results and Discussion
A novel [D5,6] stwintron is extant in A. nidulans. We identified a potential [D5,6] stwintron at locus AN5404 of the A. nidulans genome (http://fungidb.org/fungidb/app/record/gene/AN5404) 30 in which the internal intron splits the external intron between the fifth and the sixth nucleotide (nt) of the donor sequence (5′-GUAAG|U). The stwintron is the most 5′ intervening sequence in a gene comprising five additional canonical U2 introns (Fig. 2a). We have cloned and sequenced cDNAs of the fully spliced mature mRNA of the gene at locus AN5404 (GenBank MK410458). We detected the postulated splicing intermediate, where the internal intron (75 nt) was excised and where the external intron (76 nt) is still present (GenBank MK410459). The typical two-step excision of this [D5,6] stwintron is shown in Fig. 2b. Figure 3 shows the structure of the AN5404 [D5,6] stwintron in the primary transcript, together with the accession numbers of the sequences of the mature mRNA and of the splicing intermediate.
Locus AN5404 is predicted to encode a reticulon-like protein of 326 amino acids (UniProt C8VGL9; GenBank CBF81965). It is assigned by automated annotation to protein family PF02453 which comprises "proteins of unknown function which associate with the endoplasmic reticulum". Reticulon-like proteins (Rtn) are widespread across Eukarya 31 . The A. nidulans RtnA protein is predicted to comprise four transmembrane domains and one coiled-coil domain. The coiled-coil domain (residues 250-300) is clearly detected in the alignment of the 779 orthologues used to infer the protein phylogeny (see below). The termini of the human RTN4 variants were shown to be intrinsically disordered 32 . Orthologues of the reticulon-like protein in the seven species of filamentous fungi experimentally assessed (A. nidulans and six others, see below) are likewise predicted to feature intrinsically disordered regions at both the N- and C-termini.
Occurrence of the [D5,6] stwintron in AN5404 orthologues in the Pezizomycotina subphylum. We have searched the databases for orthologues of the A. nidulans reticulon-like gene (rtnA, locus AN5404) in Ascomycota. The [D5,6] stwintron is present in almost all genome-sequenced species of the Pezizomycotina subphylum, including the early divergent classes of the Pezizomycetes and the Orbiliomycetes. We mined 754 orthologue genes (and the stwintrons within) from species of Pezizomycotina (see Supplementary Table S1). A number of taxa are predicted to have a second [D5,6] stwintron at the position of the second intervening sequence in A. nidulans. This is the case for many species within the classes of the Dothideomycetes, Xylonomycetes and Lecanoromycetes.
We have experimentally confirmed the [D5,6] stwintron in Aspergillus niger (Eurotiomycetes class), Botrytis cinerea (Leotiomycetes), Trichoderma reesei and Neurospora crassa (the latter two, Sordariomycetes) (Fig. 3). In Helminthosporium solani (Dothideomycetes), we have shown the existence of the two [D5,6] stwintrons predicted in the rtnA transcript. In addition, the National Center for Biotechnology Information (NCBI) EST database provided conclusive evidence for the [D5,6] stwintron in Tuber melanosporum (Pezizomycetes), as we found an expressed sequence tag (EST) covering the transcript of the reticulon-like gene (Accession FP417649) from which the predicted internal intron was absent, while the external intron was still present.
In Supplementary Figure S1, a maximum likelihood phylogeny illustrates the evolutionary relations between the orthologous RtnA proteins, broadly in accordance with taxonomy at the level of classes and orders. Some species (<3%) lack the internal intron of the stwintron; a standard intron is extant at the corresponding position. The patchy distribution of internal intron absence indicates multiple, independent (stw)intron loss events.
The [D5,6] stwintron and the loss of its 3′ neighbouring exon. The phase-1 intervening sequences at A. nidulans positions 1 (i.e., the stwintron) and 2 occur in the DNA coding for the N-terminal intrinsically disordered region of the RtnA protein, which is poorly conserved in amino acid sequence and length across the Pezizomycotina subphylum. This apparently relaxed context allows natural variations of the local intron-exon structure of the gene. In multiple lineages in the classes of the Eurotiomycetes, Sordariomycetes, Orbiliomycetes and Pezizomycetes, the intron at A. nidulans position 2 has disappeared together with the exon upstream of it. Among the experimentally investigated fungi (Fig. 3) this situation occurs in T. reesei. The exon downstream of the stwintron is 15 nt long in the black Aspergilli and 24 nt long in all other Aspergilli and Penicillia (see Supplementary Fig. S2). However, in other genera of the Aspergillaceae family, such as Monascus and Xeromyces, the second exon and the second intron are absent from the rtnA gene while one 3′-extended [D5,6] stwintron appears to replace them. In all genome-sequenced species of the Trichocomaceae family, the second intron and exon are also absent. In Supplementary Fig. S2, the various lengths of the exon between the stwintron and the intron in position 2 are given for all analysed species of the Eurotiales and Onygenales sister orders (NB. Here and below we refer to introns and exons in different species by their position in A. nidulans, cf. Fig. 2a).
In all species of Eurotiales, the size of the exon upstream of the position-conserved intron corresponding to intron 3 in A. nidulans is (also) conserved. This suggests that the loss of exonic sequences is due to intronisation 33 of the exon (II) originally between the stwintron and the downstream standard intron corresponding to A. nidulans intron positions 1 and 2 in an ancestor (Fig. 4). The situation could be rationalised by postulating two alternative [D5,6] stwintrons, one as described above (the primary stwintron; splicing reactions strictly observing intron definition) and the second comprising intervening sequence 1 (i.e., the primary stwintron), exon II and intron 2, to be called the secondary stwintron. The excision of this secondary stwintron would result in the fusion of exons I and III after two splicing reactions, with the consequent loss of exon II.
We inspected the relevant stwintron sequences in species of Aspergillaceae to select an amenable species in which the internal intron of the [D5,6] stwintron could theoretically be excised using an alternative 3′-acceptor downstream of the one used to excise the internal intron of the (primary) stwintron. This situation occurs in A. niger, where a 5′-CAG is located 26 nt downstream of the CAG acceptor of the internal intron of the primary stwintron (Fig. 5). The alternative 3′ splice site could be associated with a distinctive branchpoint element of its own. The secondary internal intron would thus be 29 nt longer at its 3′ end, and its removal would result in a canonical donor element (5′-GUAAG|U) that cannot effectually pair with the acceptor of the external intron of the primary stwintron, its putative branchpoint element being closer to this alternative donor than to its associated acceptor. The donor of the external intron could however pair with the next available acceptor and associated branchpoint element downstream, those of the standard intron 2, and the external intron of the secondary [D5,6] stwintron thus would include the sequences of exon II and intron 2. We expect the canonical donor of the secondary external intron (5′-GUAAG|U) to compete with the imperfect donor of the standard intron 2 (5′-GUAAAU) for the same 3′ splice sites (Fig. 5). The ability to compete with the downstream 5′-donor of intron 2 may be influenced by the association of a protein complex (for instance, the Exon Junction Complex 34,35 ) with the donor of the external intron of the [D5,6] stwintron at the exon junction of the internal intron of the stwintron. Employing the same oligonucleotide primers used for the detection of the splicing intermediate of the primary [D5,6] stwintron (cf. Fig. 3), we found a minority of A. niger cDNA clones (<5%) that lacked the 105-nt long alternative internal intron we predicted (GenBank MK410463). Using a pair of exonic PCR primers, we found cDNA from fully spliced mRNA from which the exon II sequences (15 nt) were indeed absent (GenBank MK410462). The alternative splicing of the A. niger rtnA transcript was corroborated by extant RNA reads from the NCBI's Sequence Read Archive (SRA) 36 (not shown).
Figure legend (continued). The A. nidulans transcript cannot be alternatively spliced due to the absence of alternative 3′ splice sites for the internal intron of its stwintron and always retains exon II (top scheme). The A. niger transcript, however, can be alternatively spliced to include, or not, exon II in the mature mRNA (two middle schemes). Finally, in the Monascus ruber transcript, loss of bona fide internal splice sites has led to intronisation of this "skippable" exon (bottom scheme). In the three species, the length of the exon upstream of the intron at A. nidulans position 3 is conserved (i.e., Exon III in the two Aspergilli, Exon II in M. ruber). The internal U2 intron(s) of the stwintron(s) are colour coded in blue letters on a light grey background, the external U2 intron(s) and the standard intron at A. nidulans position 2 in red letters on a dark grey background. Only the sequences of the canonical conserved splice motifs (5′-donor [5′-GURWGH], motif around the branchpoint adenosine [5′-DYURAY], 3′-acceptor [5′-HAG]) are given. Noncoding sequences are in lower case letters. The sequences corresponding to the alternative 3′ acceptors and associated branchpoint elements for the internal intron of the stwintron in A. niger are annotated as "L,1" and "A,1" for the primary stwintron, and "L,2" and "A,2" for the 3′ extended secondary stwintron.
Scientific Reports (2019) 9:9940 | https://doi.org/10.1038/s41598-019-46435-x
Exon skipping in the rtnA transcript of Neurospora crassa. The NCBI databases contain ten ESTs for N. crassa rtnA where exon II (33 nt) is retained but also three ESTs where it has been skipped (Accession numbers of the latter: GH144932, GH135593 & GH146396). The existence of these ESTs strongly suggests that the rtnA transcript is subject to alternative splicing. The likely intermediate of the secondary stwintron excision could be detected amongst the N. crassa RNA reads in the Sequence Read Archive at NCBI 36 . The 5′ end of this alternative internal intron sequence is the donor of the internal intron of the primary [D5,6] stwintron (see Fig. 3) and the 3′ end is the acceptor of the external intron of the primary [D5,6] stwintron (Fig. 6), a variation of the situation revealed in A. niger but with the same end result, skipping of exon II. Supplementary Table S2 provides a list of relevant sequence reads. Excision of this alternative internal intron reconstitutes a new donor (5′-GUAAG|A) that may pair with the next available acceptor and associated branchpoint element downstream, those of intron 2 in the rtnA transcript, to excise the 3′ extended secondary external intron, which consequently would result in exon skipping.
Our results show genuine alternative splicing of the rtnA primary transcript in A. niger and N. crassa, and strongly suggest a key role of the [D5,6] stwintron in this process.
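Whether skipping exon II leaves the downstream reading frame intact follows from simple phase arithmetic, sketched below. The exon II lengths (15, 24 and 33 nt) are those given in the text; the helper names are hypothetical, and applying phase 1 to every species listed is an assumption made for illustration (the text states phase 1 for the A. nidulans positions 1 and 2).

```python
# Phase arithmetic for a skipped exon (a minimal sketch, not the authors' method).
# An intron of phase p lies after the p-th nucleotide of a codon (p = 0, 1 or 2).

EXON_II_LENGTHS = {
    "A. niger (black Aspergilli)": 15,
    "other Aspergilli and Penicillia": 24,
    "N. crassa": 33,
}


def phase_after_exon(upstream_phase: int, exon_len: int) -> int:
    """Phase of the intron that follows an exon of `exon_len` nt."""
    return (upstream_phase + exon_len) % 3


def skipping_preserves_frame(exon_len: int) -> bool:
    """Skipping an exon keeps the downstream frame only if its length is a multiple of 3."""
    return exon_len % 3 == 0


for species, length in EXON_II_LENGTHS.items():
    # Assumption: the upstream intervening sequence is phase 1 in each case.
    print(species, length, phase_after_exon(1, length), skipping_preserves_frame(length))
```

With a phase-1 intervening sequence on either side, each of these exon lengths can be removed without shifting the frame, at the cost of 5, 8 or 11 codons respectively.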

Stepwise formation of a functional intron by multiple splicing events in Onygenales. The reticulon-like gene in all but two (see below) genome-sequenced species of Onygenales (sister order to the Eurotiales, the latter includes the Aspergilli) harbours a complex intervening sequence at the first intron position (NB. Numbering as in A. nidulans). In the primary transcript, the [D5,6] stwintron is present and a canonical U2 intron is seen at the second position. The basic gene model for rtnA in Onygenales is highly reminiscent of that of all the Aspergilli (cf. Fig. 2a), with the last four introns at strictly conserved positions, although the stwintron and the second intron are not necessarily phase-1. For almost half of the species, the length of the small exon II is not a multiple of three (see Supplementary Fig. S2). Moreover, in 12 of the species where the length of exon II is a multiple of three, the exon contains an in-frame stop codon. In many Onygenales species, there is no in-frame start codon upstream of the reticulon domain after excision of the [D5,6] stwintron and the standard intron 2.

Two species, Byssoonygena ceratinophila and Ophidiomyces ophiodiicola, however, have five rather than six introns, all of them canonical. The first intron in these two species is a phase-1 U2 intron which could have resulted from a fusion of the intervening sequences at the first and second position (see below), while the four introns 3′ to it are in the same positions as in the Aspergilli. We postulate that in all other Onygenales species, a complex intervening sequence can be envisaged encompassing a fully functional [D5,6] stwintron (position 1) as well as the canonical U2 intron downstream of it (position 2), plus the sequences separating these as well as sequences abutting them at 5′ and 3′, respectively (Fig. 7). The three latter "intronised" sequences would thus arise from ancestral coding sequences. We shall call this extraordinary intervening sequence a "dual intervening sequence", as it would require multiple, consecutive U2 splicing reactions to be properly excised. In B. ceratinophila and O. ophiodiicola, the "dual intervening sequence" may have morphed into one canonical (U2) intron, probably after losing its internal "intron-exon" structure, yet retaining its phase. This hypothesis is supported by publicly available expression data in various species of Onygenales where the mature mRNA is not interrupted by the "dual intervening sequence" predicted above. These EST and TSA data from NCBI's Expressed Sequence Tags and Transcriptome Shotgun Assembly databases are listed in Supplementary Table S3 and include a TSA for B. ceratinophila rtnA mRNA that confirms the newly formed standard intron at the first intron position in that species.
We demonstrated experimentally the existence of the new "dual complex intervening sequence" in the reticulon-like gene in Malbranchea cinnamomea 37 (GenBank MK421638). Figure 7 shows schematically that in 29 species of Onygenales (including M. cinnamomea), the excision of the [D5,6] stwintron and the canonical U2 intron at A. nidulans position 2 necessarily occurs before the formation of a new functional intron by sequential U2 splicing reactions. We have called this new U2 intron a "second order" intron. In the primary transcript, the "second order" intron would be discontinuous and split in three pieces: its 5′-donor located upstream of the [D5,6] stwintron, its canonical sequence element around the branchpoint adenosine between the [D5,6] stwintron and intron 2, and its 3′-acceptor behind intron 2. Once enabled, the "second order" intron would need to be excised by a fourth (standard) splicing reaction to yield the mature mRNA. In this multi-step splicing sequence, all four consecutive standard splicing reactions are preceded by the definition of a small canonical U2 intron, the default mechanism of splice site selection in Ascomycota.
We have experimentally confirmed the existence of all three proposed splicing intermediates and the mRNA end product in M. cinnamomea (see Supplementary Fig. S3; GenBank MK421639; MK421640; MK421641; MK410473). Variations in the order of U2 splicing are possible; for instance, the standard intron at position 2 (the third U2 intron excised as depicted in Supplementary Fig. S3) could be excised before the removal of the external intron of the [D5,6] stwintron to its 5′ (the second U2 intron excised as depicted in Supplementary Fig. S3) or even before the removal of the internal intron of the stwintron (the first splicing reaction in Supplementary Fig. S3). We cannot exclude that the order of splicing is (occasionally) different as we have not experimentally assessed the alternative patterns, but the last excision must always be that of the "second order" U2 intron. To the best of our knowledge, exon definition 19,20 has never been described in ascomycete fungi. In the case of the Onygenales rtnA transcript, exon defining splice site selection cannot provide an alternative route to excise the entire "dual complex intervening sequence" in one splicing reaction since the three intronic splicing elements of the "second order" U2 intron (5′-donor, element including the lariat branchpoint adenosine, 3′-acceptor) are separated from one another by the canonical U2 introns nested within.
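The ordering constraints described above can be enumerated explicitly. The sketch below encodes only the two constraints stated in the text (the internal intron of the stwintron must precede its external intron, and the "second order" intron must be excised last); the labels and function names are hypothetical and chosen for illustration.

```python
# Enumerate the splicing orders compatible with the constraints stated in the text.
# Labels are illustrative names for the four U2 introns of the Onygenales rtnA
# "dual intervening sequence"; they are not identifiers from the source data.
from itertools import permutations

INTRONS = ("internal", "external", "intron2", "second_order")


def is_valid_order(order) -> bool:
    """Check that internal precedes external and that second_order is excised last."""
    position = {name: i for i, name in enumerate(order)}
    return (
        position["internal"] < position["external"]
        and position["second_order"] == len(order) - 1
    )


def valid_orders():
    """Return every permutation of the four excisions that satisfies both constraints."""
    return [order for order in permutations(INTRONS) if is_valid_order(order)]


DEPICTED_PATH = ("internal", "external", "intron2", "second_order")

for order in valid_orders():
    print(" -> ".join(order))
```

Under these two constraints only three orders remain: the depicted path, and the two variants in which intron 2 is removed before the external or before the internal intron of the stwintron.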
Moreover, we found the proposed "second order" intron as one continuous sequence in three ESTs from Coccidioides immitis and C. posadasii (GenBank GH365468, GH423005; GH438822). In the first two ESTs, all other four canonical introns at the downstream positions (i.e., introns 3-6) are absent, suggesting that the "second order" intron is the last to be removed despite it being (part of) the most 5′ intervening sequence. The third EST (GH438822) still harbours introns 4 and 5, in addition to the "second order" intron. Our results demonstrate the existence of a complex intervening sequence, the "dual intervening sequence", that can only be properly removed by four consecutive U2 splicing reactions each strictly complying with intron defining splice site selection. U2 introns are not necessarily excised co-transcriptionally, because the "second order" intron in Onygenales rtnA does not exist as a functional intron in the primary transcript.
Figure 7. Excision of the first intervening sequence in the primary transcript of the Onygenales rtnA genes requires four consecutive U2 splicing reactions. The "dual intervening sequence" includes the [D5,6] stwintron and intron 2 nested within a discontinuous fourth U2 intron, the "second order" intron. The order of U2 removal is depicted schematically from top (primary transcript) to bottom (mature mRNA). The internal U2 intron of the [D5,6] stwintron is highlighted in blue and would be the first to be excised to result in splicing intermediate 1. Subsequently, the external U2 intron, highlighted in red, would be removed to yield splicing intermediate 2. After removal of the complete stwintron, the standard intron downstream in the primary transcript (intron 2; also highlighted in red) would be excised and splicing intermediate 3 would be formed. At this stage, the "second order" intron is continuous and enabled, and its excision finally would yield the mature mRNA. The sequences that constitute the second order U2 intron, discontinuous in the primary transcript and the first two splicing intermediates, are highlighted in yellow. Note that the order of the first three standard U2 splicing reactions may deviate from the depicted path.

Conclusions
Analysis of rtnA [D5,6] stwintrons in 788 species of fungi revealed novel functional aspects of spliceosomal twin introns. In the order of the Onygenales, a new intron has evolved by intronisations around and in between the pre-extant stwintron and the downstream intron-2 in their reticulon-like gene. This new intervening sequence is complex, as the stwintron and the second intron within remain functional and have to be excised, before the remaining intronic sequences could be excised as one standard U2 intron. Our results confirm that U2 introns are not necessarily excised co-transcriptionally.
In various other Pezizomycotina taxa, the exon downstream of the [D5,6] stwintron is absent from the rtnA gene, with the stwintron seemingly extended 3′ to include the sequences of that exon and of the canonical U2 intron behind it. The apparent intronisation of these exonic sequences can be rationalised as a phylogenetic analogue of the physiological phenomenon of exon skipping in an ancestor species where one of the alternative splice options is lost. The stwintron is implicated in a mechanism of this mode of alternative splicing by its ability to use an alternative acceptor for its internal U2 intron, as we show is the case in A. niger and N. crassa, extending the internal intron such that the 5′ donor of the external intron can no longer effectually pair with its conventionally used acceptor (and associated lariat branch point adenosine). Instead, this distal 5′-donor competes for the 3′ splice site of the downstream standard intron.

Methods
Fungal strains, cultivation and nucleic acid isolation. The fungi employed in this study and the respective growth media used for biomass formation and nucleic acid extraction are listed online in Supplementary  Table S4. All strains were grown in 500-mL Erlenmeyer flasks with 100 mL of growth medium seeded with vegetative spore inocula, in a rotary shaker (Infors HT Multitron) at 200 rotations per min for 24 h. Mycelia were harvested by filtration over sterile Miracloth (Calbiochem). The biomass was washed with distilled water, frozen and ground to powder under liquid nitrogen. For the extraction of genomic DNA and total RNA from deep frozen mycelial powder, Macherey-Nagel NucleoSpin kits (NucleoSpin Plant II and NucleoSpin RNA Plant, respectively) were used.

Reverse transcription PCR (RT-PCR) and sequencing. Reverse transcription was performed with Oligo(dT) as a primer and 1 μg of total RNA as the template using the First Strand cDNA Synthesis Kit (Thermo Scientific). PCR reactions were done with 4 μL of single strand cDNA template and gene-specific oligonucleotide primers (see Supplementary Table S5) using DreamTaq DNA Polymerase (Thermo Scientific). After initial denaturation at 95 °C for 2 min, 40 amplification cycles of 95 °C for 30 s, 56 °C for 1 min, and 72 °C for 1 min were executed, followed by one post-cyclic elongation at 72 °C for 5 min. Amplified DNA fragments were separated in native agarose gels.
To confirm the existence of the predicted stwintron splicing intermediates, we used primer pairs that do not amplify DNA off mature mRNA template. One of the primers spans an intron-exon junction of the putative external intron and contains a few nt at its 3′ end that do not basepair when the external intron is absent. This protocol usually yields two PCR fragments of defined sizes, of which the smaller one corresponds to the splicing intermediate. Both fragments were processed and sequenced. To this end, the amplified cDNA was purified with NucleoSpin Gel & PCR Clean-up (Macherey-Nagel) and subsequently cloned using pGEM-T Easy Vector System I (Promega). Three independent plasmid clones were sequenced over both strands using universal primers (Eurofins Genomics, Ebersberg, Germany). All RT-PCR experiments were done in duplicate, starting with biomass from two independent liquid cultures. Sequences were deposited at GenBank under accession numbers MK410458-MK410473 and MK421638-MK421641. The same material is also available as Supplementary Sequence Data.
Identification of putative A. nidulans stwintrons of the [D5,6] type. The principles of the stwintron sequence motif search in (fungal) whole genome sequence datasets were detailed previously 16 . To recapitulate, we defined five degenerate sequence motifs for the donor, acceptor and lariat branch point sequence elements within a stwintron, including the two hybrid motifs characteristic of the stwintron type (i.e., consisting of nucleotides (nt) of the external as well as of the internal intron). These motifs are based on a statistical consensus for the three conserved intron elements in A. nidulans 23 (i.e., donor, 5′-GTRWGY; branch point motif, 5′-RYTRAY; and acceptor, 5′-YAG), although one can use a more relaxed consensus, for instance, at the first position of the 6-nt element around the lariat branch point adenosine (D instead of R), or the first position of the 3-nt acceptor (H instead of Y). Furthermore, we defined distance ranges separating these five motifs conforming to four principles rooted in our experience in manually calling intron-exon structure in filamentous fungi: [1], the minimum length of an A. nidulans intron is 42 nt; [2], the minimum distance between the lariat branch point element and the acceptor element is 4 nt; [3], the distance between the donor element at 5′ and the lariat branch point element is always larger (and in the majority of cases, considerably larger) than the distance between the latter and the acceptor at 3′; [4], bona fide 5′ and 3′ splice sites are paired across the intron (intron definition); basically, the paired splice sites are those nearest to one another to facilitate excision of the smallest possible intron 25 (albeit always subject to principles [1], [2] and [3]). The mean intron length is reported to be 73 nt in A. nidulans while canonical introns longer than 160 nt are rare 23 . In accordance, we set the distance range between the donor and lariat branch point elements from 25 to 120 nt.
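A motif search of this kind can be sketched with regular expressions. The code below is a simplified illustration, not the published search: it looks for a single canonical intron using the three consensus motifs and the distance rules quoted above, whereas the actual stwintron screen uses five motifs including two hybrid ones. The function names are hypothetical, and measuring each distance as the number of nt between the end of one motif and the start of the next is an assumption.

```python
# Simplified motif search for one canonical intron (illustrative sketch only).
# Assumption: distances are counted as the nt lying between two motifs.
import re

IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "W": "[AT]",
    "D": "[AGT]", "H": "[ACT]", "N": "[ACGT]",
}
DONOR, BRANCH, ACCEPTOR = "GTRWGY", "RYTRAY", "YAG"


def to_regex(motif: str) -> str:
    """Translate a degenerate IUPAC motif into a regular expression."""
    return "".join(IUPAC[base] for base in motif)


def positions(motif: str, seq: str):
    """Start positions of all (possibly overlapping) matches of `motif` in `seq`."""
    return [m.start() for m in re.finditer(f"(?={to_regex(motif)})", seq)]


def find_candidate_introns(seq, min_len=42, donor_to_branch=(25, 120), min_branch_to_acceptor=4):
    """Return sorted (start, end) pairs of candidate introns, end exclusive."""
    low, high = donor_to_branch
    acceptors = positions(ACCEPTOR, seq)
    hits = set()
    for d in positions(DONOR, seq):
        for b in positions(BRANCH, seq):
            gap_db = b - (d + len(DONOR))
            if not low <= gap_db <= high:
                continue
            for a in acceptors:
                gap_ba = a - (b + len(BRANCH))
                if gap_ba < min_branch_to_acceptor:      # principle [2]
                    continue
                end = a + len(ACCEPTOR)
                # principles [3] and [1]; only the nearest eligible acceptor is tried ([4])
                if gap_ba < gap_db and end - d >= min_len:
                    hits.add((d, end))
                break
    return sorted(hits)


# Invented test sequence: exon + 51-nt intron + exon.
TOY = "ATGGCC" + "GTAAGT" + "CCTT" * 7 + "CC" + "GCTAAC" + "TTTCTT" + "CAG" + "GCTTAA"
print(find_candidate_introns(TOY))
```

The lookahead pattern `(?=...)` is used so that overlapping motif occurrences are all reported; relaxing the consensus (for example `DYTRAY` or `HAG`) only requires changing the motif strings.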
The first stwintrons we reported on were encountered serendipitously 14,15 . However, the simple, degenerate sequence motifs we designed allowed us to actively search for candidate stwintrons in whole genome sequences: a screen for [D1,2] stwintrons led to the identification of two bona fide stwintrons 16,17 .
At the onset of the current work, we searched for stwintrons of the [D5,6] type, i.e., where the internal intron is nested within the canonical donor element of the external intron between the latter's fifth and sixth