Genome-wide Identification and characterization of circular RNAs in the rice blast fungus Magnaporthe oryzae

Numerous circRNAs have been identified in different organisms, but little attention has been addressed on fungal circRNAs. Here, we identified a total of 8,848 circRNAs from the model plant pathogenic fungus M. oryzae. 5,840 circRNAs were identified from mycelium, 2,721 circRNAs from conidium, while only 287 circRNAs from both tissues. This indicated that most of the M. oryzae circRNAs were specifically expressed in mycelium or in conidium. Parental genes of circRNAs in mycelium were enriched in basic metabolisms required for normal growth, while in conidium, they were enriched in biogenesis of storages potentially used for infection. M. oryzae circRNAs could also bind to miRNAs, suggesting they may also function as sponges in fungi. This study suggested M. oryzae circRNAs could play important roles in regulation of growth and development.

Validation of M. oryzae circRNAs. We then experimentally validated the predictions of circRNA in M. oryzae. We randomly selected ten circRNAs from the highly expressed ones for validation. These selected circR-NAs were validated by reverse transcription (RT)-PCR method. We designed divergent and convergent primers  (Table S3). Theoretically, divergent primers could amplify the fragment from circular form in RNA samples, but convergent primers couldn't amplify any sequence. The divergent primers also couldn't amplified target fragments in genomic DNA. The PCR products were subsequently analyzed by agarose gel electrophoresis and verified by sequencing (Figs 1d,e and S2). Eight circRNAs (80%) were successfully validated, indicated the identified circRNAs in this study were reliable.
Expression patterns of circRNAs in M. oryzae. In order to reveal expression patterns in different developmental stages, the expression levels of all circRNAs were normalized by using a TPM (transcripts per million) analysis 18 . In total, approximately 95.3% (5840) of mycelium circRNAs and 90.5% (2,721) of conidium circR-NAs were tissue specific, but only 2.7% (287) of the total circRNAs were commonly expressed in both tissues (Fig. 1a). Hierarchical cluster analysis also revealed circRNAs exhibited specific expression patterns in mycelium and conidium of M. oryzae (Fig. 2).
Functional classification of circRNAs. To explore putative functions of circRNAs in different developmental stages of M. oryzae, the parental genes of circRNAs expressed in mycelium or conidium were conducted by Gene Ontology (GO) classification and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis, respectively.
In mycelium, among the biological process classification, the enriched GO terms included single-organism cellular process (GO:0044699), cellular process (GO:0009987), and metabolism process (GO:0008152) such as cellular metabolism process (GO:0044237), nitrogen compound process (GO:0006807) and basic metabolism  (Figs 3a and S3). The KEGG pathway analysis demonstrated the circRNA-host genes in mycelium were significantly enriched in amino acid metabolism (such as tryptophan metabolism, alanine, aspartate and glutamate metabolism, etc.), sugar metabolism, carbon metabolism, and methane metabolism. On the other hand, some biosynthesis pathways, such as biosynthesis of amino acids and secondary metabolisms, as well as the ribosome structure formation were also enriched ( Fig. 4a; Table S4).
In conidium, for the biological process, the enriched GO terms included single-organism process (GO:0044699) and single-organism cellular process (GO:0044763). For the molecular function, the circRNA-host genes were mainly related to catalytic activity (GO:0003824) and binding (GO:0005488), such as heterocyclic compound binding (GO:1901363), organic cyclic compound binding (GO:0097159), ion binding (GO:0043167), and small molecule binding (GO:0036094). Interestingly, many nucleoside, nucleotide, ribonucleoside and ribonucleotide binding activities were found, indicated circRNA could play key roles in regulation of nuclear events in conidium (Figs 3b and S4). The KEGG pathway analysis demonstrated the circRNA-host genes in mycelium were significantly enriched in amino acid metabolism (such as tryptophan metabolism, alanine, aspartate and glutamate metabolism, etc.), sugar metabolism, lipid metabolism and pentose phosphate pathway. For biosynthesis, fatty acid biosynthesis, biosynthesis of amino acids, biosynthesis of secondary metabolites were also enriched. For structure formation, phagosome and protein processing in endoplasmic reticulum were enriched ( Fig. 4b; Table S5).
The GO category and KEGG pathway analyses showed that, in mycelium, host genes of circRNAs were mainly involved in basic metabolisms, which were required for normal growth. While in conidium, circRNAs host genes  circRNA-miRNA interaction analysis. It has been widely reported that circRNAs can bind miRNAs to prevent them from targeting mRNAs and then activate gene expression 8 . To determine whether M. oryzae cir-cRNAs could affect post-transcriptional regulation of function genes by binding to miRNAs, all the circRNAs sequences were used to analyze potential miRNAs binding sites. Because the M. oryzae miRNA has not been reported yet, we used the published miRNAs identified in a filamentous fungus Trichoderma reesei, as potential circRNAs targets. In total, we identified 2,760 (31.19%) circRNAs contained predicted binding sites for 13 miRNAs, and 534 of them contain two to eight miRNA binding sites (Dataset S3). Through Cytoscape software, we construct an entire circRNA-miRNA interaction network based on the interaction predicted by conserved seed-matching sequence between circRNAs and miRNAs (Fig. 5A). We found that Tre-milR-13, Tre-milR-6, and Tre-milR-4 could be bound by 552, 467, and 388 circRNAs, respectively (Fig. 5B). On the other hand, for example, miRNAs Tre-milR-6, Tre-milR-8 and Tre-milR-12 could all be bound by the circRNA mor_circ_01113 (Dataset S3). These interactions indicated that not only one miRNA could be bound by different circRNAs, but also one circRNA could bind different miRNAs. These data suggested the circRNAs could also function as miR-NAs sponges to regulate gene expression in M. oryzae.

Discussion
In the past several years, thousands of circRNAs were identified in humans, animals, plants (including Arabidopsis, rice, and soybean), yeasts (S. cerevisiae and Schizosaccharomyces pombe) 5,10,14,15,19 , and the numbers are growing quickly. Because the circRNAs are largely distributed in different eukaryotic organisms, and their features are quite unique, they could function as important post-transcriptional regulators, which need more extensive research. However, except for yeasts, genome-wide and systematic identification of fungal circRNAs has not been reported so far and their features are largely unknown. In this study we conducted a genome-wide identification of circRNAs in the model plant pathogenic fungus M. oryzae, and analyzed the features of circRNAs in this fungus. Our results demonstrated a large number of circRNAs exist in the fungus. Our data also provide resources for further functional characterization of M. oryzae circRNAs.
In the identified M. oryzae circRNAs, 5840 (66.00%) were specifically derived from mycelium, and 2721 (30.75%) were specifically derived from conidium (Fig. 1A). It seems like that, the numbers of circRNAs in mycelium may be larger than that in conidium, suggesting more unique circRNAs could be required for regulation in mycelium formation. Since only the mycelium and conidium tissues were used to identify circRNAs, the actual number of circRNAs in M. oryzae should be definitely underestimated.
Similar to humans, animals, and plants [3][4][5] , majority of the expression profiles of M. oryzae circRNAs were specifically expressed in different tissues (Fig. 2). In 8,848 identified circRNAs, only 287 (3.24%) were identified in both of the mycelium and conidium. This phenomenon suggested that specific developmental stage could be regulated by corresponding circRNAs. More interestingly, the GO category and KEGG pathway analyses showed that, in mycelium, parental genes of circRNAs were mainly involved in basic metabolisms related to normal growth. While in conidium, parental genes of circRNAs were basically involved in biogenesis of different storages (especially glycogen, lipid, and kinds of amino acids), which could be utilized for the appressorium-mediated fungal infection process. These data suggested the circRNAs could play important roles in mycelium and conidium development in M. oryzae. However, it is more interesting to evaluate functions of circRNAs during appressorium formation and invasive growth in the host cells.
We also identified 2,760 (31.2%) circRNAs contained potential binding sites for 13 miRNAs target mimics in fungus 20 . These circRNAs-miRNA interactions could function as potential post-transcriptional regulators like miRNA sponges. However, the binding events and biological functions of circRNAs-miRNA interactions need experimental validation. On the other hand, no miRNA binding sites can be predicted in other 68.8% circRNAs, what regulatory mechanisms existed in these circRNAs should be addresses in the future.
In this study, by using high-throughput RNA sequencing technology and bioinformatics tools, we were able to identify and characterize the M. oryzae circRNAs from mycelium and conidium. We analyzed their features, expression patterns and potential functions. In the future, it is required to explore and compare circRNAs from fungal appressorium formation stage and infection hyphae formation stage. Expression patterns of circRNAs under distinct stresses also need research. Importantly, determine roles and regulatory mechanisms of key cir-cRNAs, especially the pathogenicity-related circRNAs, are required. For example, whether M. oryzae circRNAs can bind miRNAs from itself or host is an interesting area. Our study not only reveals the features of circRNAs in fungi, but also provided resource of circRNAs for further study, such as the regulatory mechanisms and evolution of circRNAs in different organisms.

Methods
Materials preparation. The M. oryzae strain P131 was used for test 21 . Mycelium were collected from twoday-old culture in liquid CM at 180 rpm (28 °C). Conidia were washed down from colony grown on seven-dayold oatmeal agar plates and were harvested by filtering through five layers of lens wiping paper.
Libraries construction for circRNA sequencing. The total RNAs were isolated from mycelium and conidium tissues by using TRIzol reagent (Invitrogen, CA, USA) according to the manufacturer's procedure. Concentration and purity of the total RNA were detected with a NanoDrop spectrophotometer ND-1000 (NanoDrop Technologies, Wilmington, DE, USA). Around 10 µg total RNA was used for removing of rRNA by the Epicentre Ribo-Zero Gold Kit (Illumina, San Diego, USA). Next, the RNAs were treated with 10 U/μg RNase R (Epicentre, Madison, WI) to remove linear RNA at 37 °C for one hour. By using an mRNA-Seq sample preparation kit (Illumina, San Diego, USA), the remaining RNAs were used to construct cDNA libraries, which were subsequently used to perform circRNA sequencing by an Illumina Hiseq2500 platform.
Identification of circRNAs in M.oryzae. M. oryzae genome sequence and gene annotations of P131 strain were downloaded from NCBI Genome Database (www.ncbi.nlm.nih.gov/genome) under the accession number AHZT00000000 22 . Adapter sequences and low quality reads were removed from the raw reads. Then the high-quality clean reads were used for circRNAs identification according to pipline described in Fig. S1. In brief, index of the reference genome was built using Bowtie (v2.0.6) 23 , and paired-end clean reads were multiply mapped with the M.oryzae reference genome through Tophat2 (v2.1.0) algorithm 24 . The unmapped reads were kept, and 20-mers sequences from 5′ and 3′ end of these reads were used to align reference genome again, by using a software Bowtie v2.0.6 22 . The anchored sequences were subsequently analyzed by the find_circ software 2 , then the complete reads can be aligned with breakpoints flanked by GU/AG splice sites. At last, back-spliced reads with more than two supporting reads were identified as circRNAs.
Validation of circRNAs in M. oryzae. Genomic DNA was extracted from mycelium and conidium tissues by the cetyl-trimethylammonium bromide (CTAB) method 25 . Total RNA was isolated from mycelium and conidium tissues using TRIzol reagent (Invitrogen, CA, USA) according to the manufacturer's method. Total RNA samples were treated with DNase I to remove DNA contamination (Takara, Dalian, China). The cDNA were obtained by reverse transcription using PrimeScript RT-PCR Kit (Takara, Dalian, China) with random primers. The divergent and convergent PCR primers were designed for circRNA validation. Divergent primers were designed on the flanking sequences of head-to-tail splicing sites of circRNAs to detect the circRNAs (Table S3). Convergent primers were used as positive controls to detect linear transcripts. Divergent PCR amplification were performed by 36 cycles, while convergent PCR amplification were performed by 26 cycles. The PCR products were dissected by gel electrophoresis and purified by the QIA quick Gel Extraction Kit (Qiagen, Beijing, China). Sanger sequencing was performed subsequently.

Expression level analysis.
To reveal expression patterns in distinct developmental stages, the expression levels of all circRNAs were normalized by using TPM (transcripts per million) algorithm 18 : Normalized expression = (mapped reads)/(total reads) * 1000000. Differential expression of two samples was performed by using DESeq. 2 26 . P-values was adjusted by Benjamini & Hochberg method 27 . By default, the threshold of corrected p-value for differential expression was set to 0.05.