Author Correction: The Phytophthora cactorum genome provides insights into the adaptation to host defense compounds and fungicides

An amendment to this paper has been published and can be accessed via a link at the top of the paper.

The combination of de novo prediction and homology-based comparisons identified 56.7 Mb of repetitive elements in the P. cactorum genome (Table 1; Supplementary Table 2), accounting for about 46.7% of the assembled genome. Of the repeats in the P. cactorum genome, 45.3% were transposable elements (TEs), of which 20.3% were long terminal repeats (LTRs) (Supplementary Table 3). The P. cactorum draft genome has more repeat sequences than P. capsici (19%), P. sojae (39%), and P. ramorum (28%), but fewer than P. infestans (74%). We predicted 27,981 protein-coding genes in the assembled genome using a combination of homology-based and ab initio methods (Table 1). The average coding length was 1692.53 bp, and the average number of exons per gene was 3.45. P. cactorum showed a noticeable expansion of gene content compared to P. capsici (19,805), P. infestans (17,797), P. sojae (16,988), and P. ramorum (14,451). The gene density in P. cactorum (241/Mb) was lower than in P. capsici (268/Mb) but higher than in the other Phytophthora species (74/Mb in P. infestans, 179/Mb in P. sojae, and 222/Mb in P. ramorum). Gene structure-based evaluation was performed to confirm the annotation of protein-coding genes (Supplementary Figs S5 and 6b,c). The analysis of local gene density in P. cactorum showed that most genes had intergenic regions 400 bp to 15 kb long. The main distribution of flanking distances is wider in P. cactorum than in the other three sequenced genomes (Supplementary Fig. 6a). In addition, 63% of the predicted genes (17,566) showed detectable expression (FPKM > 0.05) when ~2.2 Gb of RNA-seq data 26 were aligned to our genome (Supplementary Table 4). In total, 25,225, 11,533, 10,480 and 13,287 of the predicted genes were assigned a functional annotation in the NR, Swiss-Prot, KEGG, and InterProScan databases, respectively (Supplementary Table 5).
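As a rough consistency check on the figures above, gene density is gene count divided by assembly size, so a gene count and a density together imply an assembly size. The sketch below does that arithmetic with the numbers quoted in this section. The function name is illustrative, and the resulting sizes are back-calculations, not values reported in Table 1.

```python
# Back-of-envelope check: genes / (genes per Mb) = implied assembly size in Mb.
# Gene counts and densities are the values quoted in the text; the function
# name is hypothetical and the outputs are implied, not reported, sizes.


def implied_assembly_size_mb(gene_count: int, genes_per_mb: float) -> float:
    """Return the assembly size (Mb) implied by a gene count and a gene density."""
    if genes_per_mb <= 0:
        raise ValueError("gene density must be positive")
    return gene_count / genes_per_mb


# (predicted protein-coding genes, genes per Mb) as quoted in the text
SPECIES = {
    "P. cactorum": (27_981, 241),
    "P. capsici": (19_805, 268),
    "P. infestans": (17_797, 74),
    "P. sojae": (16_988, 179),
    "P. ramorum": (14_451, 222),
}

implied = {
    name: implied_assembly_size_mb(genes, density)
    for name, (genes, density) in SPECIES.items()
}

# Second route for P. cactorum: 56.7 Mb of repeats is about 46.7% of the assembly.
size_from_repeats_mb = 56.7 / 0.467

if __name__ == "__main__":
    for name, size in implied.items():
        print(f"{name}: ~{size:.1f} Mb implied by gene density")
    print(f"P. cactorum: ~{size_from_repeats_mb:.1f} Mb implied by repeat content")
```

For P. cactorum the two routes give roughly 116 Mb and 121 Mb, which suggests the quoted gene density and repeat fraction are broadly consistent with each other.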
An overview of annotated ncRNA is shown in Supplementary Table 6. A total of 6,731, 5,947, 143, and 218 tRNAs were identified in P. cactorum, P. infestans, P. sojae, and P. ramorum, respectively. Leu-tRNA, Glu-tRNA, and Pro-tRNA were the most abundant tRNAs in P. cactorum. In addition, 376 rRNAs and 376 snRNAs were predicted in P. cactorum. Two mature miRNAs and four potential target genes of these miRNAs were identified (Table 1; Supplementary Table 7). These four target genes encoded a nuclear pore complex protein, a poly(A) polymerase, an acid/auxin permease, and an unknown protein, respectively.
Comparative genomics and evolution of Phytophthora species. Gene family clustering analysis of eight Phytophthora species identified 11,674 gene families with a total of 19,783 genes in P. cactorum (Table 1; Fig. 1a). The numbers of single-copy orthologs in eight Phytophthora species were comparable. P. cactorum had 8,198 unclustered genes and 893 unique gene families (2,310 unique paralogs) (Supplementary Table 8). Among the genes unique to P. cactorum, the majority were enriched in defense response, cell cycle, interaction between organisms, peptidyl-amino acid modification, regulation of cell cycle, and TOR signaling pathway in the biological process category (Supplementary Table 9). The Venn diagram showed that the eight Phytophthora species shared a common core set of 3,205 gene families (Fig. 1b). The number of P. cactorum-specific gene families was 2,383 (Fig. 1b).
To systematically study the evolutionary dynamics of Stramenopile species, a species phylogeny was inferred from 5,103 single-copy orthologous genes among 16 species, which included red algae (Chondrus crispus) and green algae (Chlamydomonas reinhardtii) (Fig. 2). The phylogenetic analysis revealed that red algae and green algae were grouped into one branch. The pathogenic oomycetes (such as Phytophthora, Pythium, and Saprolegnia) were separated from the nonpathogenic stramenochromes (such as Thalassiosira, Aureococcus, and Nannochloropsis). P. cactorum was more closely related to wide host range species (P. parasitica, P. infestans, and P. capsici) than to other sequenced Phytophthora species. Furthermore, the phylogeny mostly resembles the known topology of the Stramenopile tree 27,28. The exact topology of the eight Phytophthora species based on genomic data is highly consistent with the phylogenetic relationships of Phytophthora species reported by Blair et al. 29 and Runge et al. 30 using multi-locus analysis. With the sequencing of more oomycete genomes, the true phylogeny between Pythiaceae and Peronosporaceae in Peronosporales will gradually become clear. The estimated divergence time between P. parasitica, P. infestans and P. cactorum was 221.4 (138.6-342.4) million years ago (MYA) (Fig. 2). This most comprehensive and robust study of Phytophthora relationships to date based on genomic data will provide a phylogenetic framework for interpreting the evolutionary events of the genus.
The expansion of gene families has frequently been reported to be directly or indirectly involved in pathogenicity in fungal pathogens 19,22,28,31-33. In P. cactorum, 3,493 and 4,150 gene families had undergone expansion and contraction, respectively (Fig. 2). The number of expanded gene families in P. cactorum was the largest among Phytophthora species, whereas the number of contracted gene families was relatively small. The expanded genes in P. cactorum were enriched in membrane, cytoskeleton, transport, carbohydrate metabolism, nucleotide binding, transporter activity, transferase activity, and hydrolase activity (p < 0.01, Supplementary Table 10). The contracted genes in P. cactorum were enriched in cellular catabolism, DNA metabolism, chromosome, nucleic acid binding, and nucleotide binding (p < 0.01, Supplementary Table 11). Gene family expansion is likely driven by transposons 33. The expansion of gene families in P. cactorum may therefore be due to the large number of transposable elements in its genome. However, more analysis is needed to elucidate the relationship between transposons and gene family expansion.
Whole-genome duplication (WGD) followed by gene loss has been found in most eudicots and is regarded as the major evolutionary force that gives rise to gene neofunctionalization in both plants and animals 34. It has been speculated that the increased genome size and gene content of Phytophthora species may be due to WGD or their divergent repertoires of transposable elements 19,35,36. WGD analysis of three Phytophthora species indicated that both P. cactorum and P. infestans have experienced a recent WGD event (Fig. 1c). This WGD event helps clarify some of the evolutionary history of Phytophthora species. The divergence between P. cactorum and P. infestans occurred later than that between P. cactorum and P. sojae, which is consistent with the phylogenetic tree (Fig. 2). Some evidence also indicated that the WGD in P. cactorum was followed by gene loss (Supplementary Fig. 7).
Positive selection has been proposed to contribute to fitness. About 428 to 634 P. cactorum genes were identified as positively selected genes in comparisons with other Phytophthora species (p < 0.05; Supplementary Table 12). GO enrichment analysis demonstrated that most of these genes in P. cactorum were involved in cellular macromolecule metabolism, hydrolase activity, and transferase activity (Supplementary Table 13).
P. cactorum, P. parasitica, and P. capsici have a wide host range, whereas P. sojae has a narrow host range. It has been reported that oomycete pathogens have a large and diverse repertoire of expanded gene families 19,22,28,37. Most of the expanded gene families were reported to be directly or indirectly involved in pathogenicity, such as glycoside hydrolases 28,38 or secreted proteins characterized by the presence of either the RXLR or the LXLFLAK (Crinkler) motifs 19,39-41. Comparative genomic analysis demonstrated that the numbers of pathogenicity-associated genes in P. cactorum, including proteinase inhibitors, protein toxins, secretomes, and secondary metabolite biosynthesis genes, were similar to or smaller than the numbers reported for P. capsici, P. sojae, and P. parasitica. In contrast, the numbers of genes encoding CAZymes and involved in detoxification metabolism were increased in the genome of P. cactorum (Table 2). Thus, the divergence of wide host range species and narrow host range species may be partly associated with the expansion and contraction of genes encoding detoxification enzymes and CAZymes.
For successful infection, phytopathogenic microorganisms must adapt to the plant defense system through detoxification or direct utilization of plant defense compounds 42,43. In this study, we found that P. cactorum grew well with ginsenosides as the sole carbon source (Supplementary Fig. 1). The four tested Phytophthora species showed similar abilities to utilize glucose, xylan, pectin, cellulose, and gum guar. However, P. cactorum, P. capsici, and P. parasitica showed a higher ability to utilize ginsenosides than P. sojae (Supplementary Fig. 2). Ginsenosides can be hydrolyzed by microbial CAZymes 44. Based on genomic analysis, we predicted 901 genes that putatively encode CAZymes in P. cactorum. This number was larger than those of the other three sequenced Phytophthora species (Table 2). In particular, the GH, GT, and PL families were expanded in P. cactorum.
Microbes have evolved the ability to detoxify xenobiotics through enzymes and transporters 45-47. We identified the ATP-binding cassette (ABC) transporter and major facilitator superfamily (MFS) families, as well as cytochrome P450s (CYPs), peroxidases (PODs), glutathione S-transferases (GSTs), methyltransferases (MTRs), and dehydrogenases in the P. cactorum genome (Table 2). Our previous proteomic analysis also showed that P. cactorum employed detoxification enzymes to tolerate a series of fungicides 13. These data imply that the expansion of detoxification metabolism genes gives P. cactorum a higher detoxification ability against host defense compounds or commercial fungicides. However, more genetic studies are needed to elucidate the function and relationship of these key genes.
Transcriptome. Transcriptomes from four Phytophthora species (P. cactorum, P. capsici, P. parasitica, and P. sojae) exposed to ginsenosides for 24 hours contained a total of 267 (Supplementary Table 14 Fig. 8). A subset of the up-regulated DEGs in P. cactorum and P. capsici was involved in detoxification metabolism and glycosyl hydrolase activity, highlighting the potential roles of these genes in adaptation to ginsenosides. Furthermore, time-course transcriptome analyses identified 179, 355, 45, 270, 341, and 127 DEGs in P. cactorum after exposure to ginsenosides for 1 h, 3 h, 6 h, 12 h, 24 h, and 48 h, respectively (Supplementary Fig. 9). The GO classifications showed that nine GO terms were unique to up-regulated genes, including drug binding and transporter, transcription-related factor, and enzyme activator GO terms (Supplementary Fig. 10). Further analyses showed that a series of glycoside hydrolase and transferase genes as well as detoxification-related genes were induced after exposure to ginsenosides (Fig. 3; Supplementary Tables 18 and 19). A previous study demonstrated that ginsenosides could be hydrolyzed by microbial glycosyl hydrolases to release glycosyl groups as nutrients for microbes 44. Although the function of the ginsenoside-induced detoxification-related genes in P. cactorum remains to be confirmed by further genetic studies, these genes have frequently been reported in chemoresistance 45-47. Thus, P. cactorum may not only detoxify ginsenosides through detoxification-related enzymes but also utilize them as nutrients for growth through glycosyl hydrolases. This may be an important strategy for P. cactorum to infect P. notoginseng.
Transcription factors regulate gene expression, and protein kinases regulate cellular activities by phosphorylating target proteins in response to internal or external signals. We identified a total of 566 transcription factors and 536 protein kinases in P. cactorum. These numbers were smaller than those found in P. parasitica (689 transcription factors, 577 protein kinases), but larger than those found in P. capsici (458, 406), P. sojae (431, 423), P. infestans (381, 413), P. ramorum (367, 399), and P. kernoviae (252, 231) (Supplementary Tables 20 and 21). The C2H2, MYB-related and SET transcription factors were comparatively abundant in P. cactorum, as were the CAMK and TKL protein kinase groups. P. cactorum has an expanded RLK-Pelle protein group of 22 members. There were also a large number of unclassified kinases, suggesting that they perform novel functions in P. cactorum. After exposure to ginsenosides, a set of transcription factor and protein kinase genes were significantly up- or down-regulated (Supplementary Tables 18 and 19), which may facilitate the adaptation of P. cactorum to defense compounds.

Conclusions
In summary, we sequenced the P. cactorum genome using third-generation single-molecule real-time (SMRT) sequencing technology and revealed the relationship between P. cactorum and other sequenced Phytophthora species. Comparative genomic analyses identified the expansion of gene families associated with detoxification and carbohydrate-active enzymes (CAZymes) acting against plant defense compounds. These genes may give P. cactorum a high capacity to tolerate or utilize plant defense compounds and commercial fungicides. This may partly explain the pathogenicity of P. cactorum on a broad range of hosts. Together, our genomic analyses provide insights into the adaptive mechanisms of P. cactorum to plant defense compounds and fungicides, which will facilitate future studies on pathogenesis and disease management.

Methods
DNA isolation, sequencing and assembly. P. cactorum was isolated from an infected P. notoginseng in Yunnan. Mycelia were harvested after the strain had grown in 150 mL of carrot liquid medium in a 500-mL culture flask on a shaker for three days at room temperature. Genomic DNA used for sequencing was then extracted from the mycelia using the Omega Fungal DNA Kit according to the manufacturer's instructions. Briefly, fresh fungal tissue was disrupted and then lysed in a specially formulated buffer containing detergent. Contaminants were further removed after DNA precipitation using isopropanol. Binding conditions were then adjusted and the sample was applied to a spin column. Trace contaminants such as residual polysaccharides were removed with two rapid wash steps, and pure DNA was eluted using low ionic strength buffer. A total of 50 mg of DNA was used to construct the PacBio sequencing libraries.
Sequencing was then performed to produce raw reads. In total, 4.84 Gb of filtered subreads were obtained for P. cactorum from the raw data. Although the error rate of single-molecule reads is high, we obtained a considerable depth of long reads (43×), as required for self-correction and subsequent de novo assembly of the genome. Automatic assembly was performed using the PBcR pipeline of Celera Assembler version 8.3rc1 18. Synteny was assessed by aligning the P. cactorum scaffolds to the genomes of P. infestans, P. sojae and P. capsici using NUCmer in MUMmer 3.23 48.

Repeats annotation. First, we searched for tandem repeats across the genome using the program Tandem Repeat Finder (TRF) 49. The transposable elements (TEs) in the genome were identified by a combination of homology-based and de novo approaches. For homology-based prediction, known repeats were identified using

Gene prediction and functional annotation. We used the MAKER2 53 pipeline to predict protein-coding genes in the genome. First, the genome was repeat-masked using the result of the repeat annotation. The masked genome was then fed to the MAKER2 pipeline with GeneMark-ES, FGENESH, Augustus and SNAP as the ab initio gene predictors. For evidence-driven gene prediction, cDNA of P. infestans and proteins of six related species from Ensembl (release-28, P. infestans, P. kernoviae, P. lateralis, P. parasitica, P. ramorum, and P. sojae) were fed to the pipeline. Gene functions were assigned according to the best-match alignment using BLASTp against the NR, Swiss-Prot and KEGG databases. InterProScan functional analysis and Gene Ontology IDs were obtained using InterProScan 54.
Non-coding gene annotation. tRNAscan-SE 55, with settings for eukaryotic tRNAs, was used for tRNA annotation. We used a homology-based method to identify rRNA, with rRNA sequence data downloaded from the Rfam 56 database as the reference. INFERNAL 57 was used to identify snRNA.

Gene family cluster.
To identify different sets of gene clusters, the protein-coding gene sequences of 16 species were downloaded from Ensembl and JGI (http://genome.jgi.doe.gov) and used to locate gene clusters. Pairwise alignment was conducted using BLASTp with an E-value cutoff of 10 −5. The OrthoMCL package 58 was then run on the BLASTp output with default parameters to identify the gene family clusters, and the final paralogous and orthologous genes were defined using the MCL software in OrthoMCL.
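The E-value cutoff described above can be sketched as a simple filter over BLAST tabular output. This is an illustration only: it assumes the standard 12-column tabular format (query ID in column 1, subject ID in column 2, E-value in column 11), treats the cutoff as inclusive, and uses a hypothetical function name and made-up gene IDs. OrthoMCL applies its own filtering internally, so this is not the authors' pipeline.

```python
# Minimal sketch: keep BLASTp hits at or below an E-value cutoff.
# Assumes standard 12-column tabular output; function name and example
# gene identifiers are hypothetical.
from typing import Iterable, List, Tuple


def filter_hits_by_evalue(
    lines: Iterable[str], max_evalue: float = 1e-5
) -> List[Tuple[str, str, float]]:
    """Return (query, subject, evalue) for hits with evalue <= max_evalue."""
    kept = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.split("\t")
        if len(fields) < 12:
            raise ValueError(f"expected 12 tab-separated columns, got {len(fields)}")
        evalue = float(fields[10])  # column 11 holds the E-value
        if evalue <= max_evalue:
            kept.append((fields[0], fields[1], evalue))
    return kept


# Made-up example rows in tabular layout
EXAMPLE = [
    "geneA\tgeneB\t85.0\t200\t30\t0\t1\t200\t1\t200\t1e-50\t350",
    "geneA\tgeneC\t30.0\t60\t42\t1\t5\t64\t10\t69\t0.002\t35.4",
    "geneD\tgeneE\t55.0\t150\t67\t2\t1\t150\t3\t152\t1e-05\t120",
]

hits = filter_hits_by_evalue(EXAMPLE)

if __name__ == "__main__":
    for query, subject, evalue in hits:
        print(query, subject, evalue)
```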
Phylogenetic tree construction. The single-copy orthologous genes defined by OrthoMCL 58 were collected and aligned using MUSCLE 59, and the aligned sequences were extracted and fed to MrBayes (http://mrbayes.sourceforge.net) to infer the species phylogeny using a maximum likelihood (ML) approach. To estimate the divergence time of each species, known divergence times between these species were collected from http://www.timetree.org/. The topology of the ML tree was fed to MCMCTREE in PAML version 4.4 60 to construct a divergence time tree and calculate the divergence times. Based on the calculated phylogeny and the divergence times, CAFE 61 (Computational Analysis of Gene Family Evolution, version 2.1), a tool based on the stochastic birth and death model for the statistical analysis of the evolution of gene family size, was applied to identify gene families that had undergone expansion and/or contraction. GO enrichment was performed with Ontologizer 2.0 62 using a one-sided Fisher's exact test and the Parent-Child-Union method, with a p-value cut-off of 0.01. All genes with GO annotation were used as the reference set, and the genes that had undergone expansion or contraction were used as the study set.
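The one-sided Fisher's exact test behind the enrichment step can be illustrated with a short standard-library sketch. It computes the upper-tail hypergeometric probability of seeing at least the observed number of annotated genes in the study set. This is the plain term-for-term form: the Parent-Child-Union method in Ontologizer additionally conditions on the annotations of parent terms, which is not reproduced here. The function name and the example counts are hypothetical.

```python
# Minimal sketch of a one-sided (over-representation) Fisher's exact test.
# Plain term-for-term version; the Parent-Child-Union adjustment used by
# Ontologizer is NOT implemented. Names and counts are illustrative.
from math import comb


def enrichment_p_value(
    study_hits: int, study_size: int, population_hits: int, population_size: int
) -> float:
    """P(X >= study_hits) when drawing study_size genes without replacement
    from a population in which population_hits genes carry the GO term."""
    if not 0 <= study_hits <= study_size <= population_size:
        raise ValueError("inconsistent counts")
    if not study_hits <= population_hits <= population_size:
        raise ValueError("inconsistent counts")
    max_hits = min(study_size, population_hits)
    total = comb(population_size, study_size)
    tail = sum(
        comb(population_hits, k)
        * comb(population_size - population_hits, study_size - k)
        for k in range(study_hits, max_hits + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Hypothetical term: 40 of 1,000 reference genes annotated,
    # 12 of 100 study-set genes annotated.
    p = enrichment_p_value(12, 100, 40, 1000)
    print(f"p = {p:.3g}, enriched at 0.01 cut-off: {p < 0.01}")
```

Exact integer arithmetic with `math.comb` avoids floating-point underflow in the individual terms; only the final ratio is converted to a float.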
Detection of positively selected genes. To detect genes under positive selection, BLASTn was performed to align the coding sequence (CDS) libraries of P. infestans, P. lateralis, P. capsici, P. ramorum, P. kernoviae, P. parasitica and P. sojae against the P. cactorum CDS library in order to find the gene pairs with the best alignments. The resulting orthologous gene pairs were aligned again using lastz with the default parameters in preparation for KaKs_Calculator 1.2 63, which yielded the Ka/Ks ratio of each gene pair; genes with a Ka/Ks ratio > 1 were considered positively selected (significance, p-value < 0.05). GO enrichment was performed with Ontologizer 2.0 62 using a one-sided Fisher's exact test and the Parent-Child-Union method, with a p-value cut-off of 0.05. All genes of P. cactorum were used as the reference set, and all positively selected genes in P. cactorum were used as the study set.  66 . Secondary metabolism genes were annotated based on their genomic context and domain content using the automatic web-based software SMURF (www.jcvi.org/smurf/) 67. The gene families of potential infection-related genes were scanned using HMMER with HMMs against the Pfam families (E-value cutoff of 0.01, PF00067 for cytochrome P450, PF05630 for NPP1 family). NPP1 family candidates were further checked for the presence of an N-terminal signal peptide. Transporters were identified by scanning for the Pfam domains representing the two ABC transporter domains (PF00005 and PF00664), assisted by manual inspection. Statistics of other proteins in Table 2 were based on the InterProScan annotation.
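The selection criterion given under "Detection of positively selected genes" (Ka/Ks ratio > 1 with p-value < 0.05) can be sketched as a filter over per-pair results. The record layout, function name, and example values below are hypothetical and do not reflect the actual KaKs_Calculator output format; pairs with Ks of zero are skipped because the ratio is undefined, which is an assumption of this sketch.

```python
# Minimal sketch: flag gene pairs as positively selected when Ka/Ks > 1
# and the reported p-value is below 0.05. Record layout and example
# values are hypothetical.
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class PairResult:
    gene: str       # P. cactorum gene identifier (made up here)
    ka: float       # nonsynonymous substitution rate
    ks: float       # synonymous substitution rate
    p_value: float  # significance reported for the pair


def positively_selected(
    results: Iterable[PairResult], ratio_cutoff: float = 1.0, alpha: float = 0.05
) -> List[str]:
    """Return gene IDs with Ka/Ks > ratio_cutoff and p_value < alpha."""
    selected = []
    for r in results:
        if r.ks <= 0:
            continue  # Ka/Ks undefined when Ks is zero; skipped in this sketch
        if r.ka / r.ks > ratio_cutoff and r.p_value < alpha:
            selected.append(r.gene)
    return selected


EXAMPLE = [
    PairResult("g001", ka=0.30, ks=0.20, p_value=0.01),  # ratio 1.5, significant
    PairResult("g002", ka=0.30, ks=0.20, p_value=0.20),  # ratio 1.5, not significant
    PairResult("g003", ka=0.05, ks=0.50, p_value=0.001), # ratio 0.1, purifying
    PairResult("g004", ka=0.10, ks=0.00, p_value=0.01),  # undefined ratio
]

selected_genes = positively_selected(EXAMPLE)

if __name__ == "__main__":
    print(selected_genes)
```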
The families of CRN effectors in the P. cactorum, P. sojae, P. capsici and P. parasitica datasets were initially predicted based on BLASTp comparisons (E-value cut-off of 10 −5 ) against the collection of CRN effectors of Phytophthora species and NCBI databases, and candidates were then checked for the presence of an N-terminal signal peptide.

The sensitivity test of Phytophthora species to plant defense compounds and fungicides. Cultures of P. cactorum, P. capsici, P. parasitica, and P. sojae growing on carrot agar medium (CA) plates were transferred onto new CA plates amended with crude ginsenosides at concentrations of 0, 0.10, 0.50, 1.0, 5.0, and 10.0 g L −1 or fungicides at the following concentrations: fluopicolide, dimethomorph and flumorph, at 0, 0.1, 0.5, 1.0, and 2.0 mg L −1 ; pyraclostrobin and kresoxim-methyl, at 0, 0.1, 0.5, 1.0, 2.0, and 5.0 mg L −1 ; cymoxanil, at 0, 10, 20, 40, and 80 mg L −1 ; metalaxyl-M, at 0, 0.1, 1.0, 10.0, and 50.0 mg L −1 . Fungicides or ginsenosides were dissolved in methanol (OmniSolv, HPLC grade) to prepare stock solutions. To prepare agar plates supplemented with serial dilutions of fungicides or ginsenosides, the stock solutions were added to CA medium (200 g boiled carrot and 15 g agar in a total volume of 1 L of distilled water) once the medium had cooled to 50 °C. The final concentration of methanol in any tested medium was limited to 0.1% (vol/vol). The experiment was performed three times with four replicates, and plates were incubated for 4 days in the dark at 25 °C. The diameters of the colonies were measured perpendicularly. The ginsenosides were extracted from three-year-cultivated P. notoginseng roots with MeOH:H 2 O (80:20) and identified by HPLC-MS as described previously 70.

RXLR effector prediction.
Growth profiles of Phytophthora species on single carbon source. Citrus pectin, glucose, xylan, cellulose, gum guar and crude ginsenosides (all from Sigma) were used as single carbon source separately in agar medium, to evaluate the growth of different Phytophthora species. Inocula of P. cactorum and other Phytophthora species (P. capsici, P. parasitica, and P. sojae) were placed on these media and incubated at 25 °C for four days. These tests were repeated three times, and the results were analysed to evaluate mycelium growth ability on different single carbon source.
RNA-seq. P. cactorum and other Phytophthora species (P. capsici, P. parasitica, and P. sojae) were grown from mycelial inocula at 27 °C (150 mL of carrot liquid medium in a 500-mL shaker culture flask) shaking at 115 rpm. Ten fresh plugs (5 mm in diameter) were taken from the growing edge of a CA culture and transferred into 150 mL of medium in each shaker culture flask, and the mixture was incubated in an orbital shaker (ZHWY-111B, Shanghai ZHICHENG Analytical Instruments Manufacturing Co., Ltd.) at 150 rpm at 25 °C. After 24 h of pre-incubation, ginsenoside stock solution was added to the medium to a final concentration of 1.0 mg L −1 . For the control culture, only methanol was added. Mycelia were collected at 24 h after exposure to ginsenosides. For P. cactorum, mycelia were collected at 1, 3, 6, 12, 24, and 48 h after treatment with ginsenosides. The mycelial mat was then separated from the medium by filtration, quickly washed three times with deionized water, immediately frozen in liquid nitrogen, and lyophilized. Each treatment had two independent replicates.
3 μg of total RNA per sample was used as input material for the RNA sample preparation. Beads with oligo (dT) were used to isolate poly(A) mRNA from total RNA. RNA sequencing libraries were constructed from these mRNA using the TruSeq RNA Sample Preparation Kit (Illumina, San Diego, USA). Briefly, the Elution 2-Frag-Prime (94 °C for 8 minutes, 4 °C hold) was used to elute, fragment and prime the mRNA with Elute, Prime, Fragment Mix (Illumina). First strand cDNA synthesis was performed with First Strand Master Mix and SuperScript II mix (ratio: 1 µl SuperScript II/7 µl First Strand Master Mix) (Invitrogen). The second strand was synthesized with Second Strand Master Mix (Illumina) and Ampure XP beads (Illumina) were used to separate the double-stranded (ds) cDNA from the 2nd strand reaction mix. After end repair and the addition of a 3′-dA overhang, the cDNA was ligated to Illumina PE adapter oligo mix (Illumina), and size-selected for 400 ± 10% bp fragments by gel purification. After 15 cycles of PCR amplification, the paired-end libraries were sequenced using the paired-end sequencing module (150 bp at each end) of the Illumina HiSeq 4000 platform.
The raw reads were processed with Trimmomatic (Version 0.32) 71 to remove reads containing adapters, reads containing poly-N, and low-quality reads, yielding clean data for downstream analyses. The trimmed clean reads were aligned to the corresponding reference genome (the P. parasitica and P. sojae genomes were downloaded from Ensembl, and the P. capsici genome was downloaded from the JGI database) using TopHat2 72 with default settings. Calculation of gene expression levels and identification of differentially expressed genes (DEGs) between the time-course (for P. cactorum) or treatment (for P. parasitica, P. sojae and P. capsici) groups and the control groups were conducted using Cufflinks v2.2.1 73. Fragments per kilobase of exon per million fragments mapped (FPKM) were used to normalize RNA-seq fragment counts and estimate the relative abundance of each gene. The Cuffdiff package in Cufflinks was used to perform pairwise comparisons of the expression of each gene between treatments and control in the four species and to report DEGs and transcripts. DEGs were called based on a p-value < 0.05 and at least a 2-fold change between the two FPKM values.

Data availability. All raw genome sequence data have been deposited in the Short Read Archive (SRA) at NCBI under accession number SRR3386345 (PRJNA318145). Raw RNA-seq data have been deposited in the SRA under accession number SRP111895.
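The FPKM normalization and the DEG thresholds described in the RNA-seq analysis above (p-value < 0.05 and at least a 2-fold change) can be illustrated with the textbook formula. This is a simplified sketch, not a reimplementation of Cufflinks or Cuffdiff, which estimate abundances and test statistics with their own models. Function names, the handling of zero FPKM values, and the example numbers are assumptions made for illustration.

```python
# Minimal sketch of the textbook FPKM formula and the stated DEG thresholds.
# Not a reimplementation of Cufflinks/Cuffdiff; names, zero-handling and
# example numbers are illustrative assumptions.


def fpkm(fragments: int, exon_length_bp: int, total_mapped_fragments: int) -> float:
    """Fragments per kilobase of exon per million fragments mapped."""
    if exon_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and library size must be positive")
    return fragments * 1e9 / (exon_length_bp * total_mapped_fragments)


def is_deg(
    fpkm_control: float,
    fpkm_treated: float,
    p_value: float,
    alpha: float = 0.05,
    min_fold: float = 2.0,
) -> bool:
    """True if p_value < alpha and expression changes at least min_fold
    in either direction."""
    if p_value >= alpha:
        return False
    low, high = sorted((fpkm_control, fpkm_treated))
    if high == 0:
        return False  # not expressed in either condition
    if low == 0:
        return True   # assumption: on/off change counts as exceeding the fold cutoff
    return high / low >= min_fold


if __name__ == "__main__":
    # Hypothetical gene: 500 fragments, 2,000 bp of exon, 10 million mapped fragments
    control = fpkm(500, 2000, 10_000_000)
    treated = fpkm(1500, 2000, 10_000_000)
    print(control, treated, is_deg(control, treated, p_value=0.01))
```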