Genomic identification of WRKY transcription factors in carrot (Daucus carota) and analysis of evolution and homologous groups for plants

WRKY transcription factors belong to one of the largest transcription factor families. These factors possess functions in plant growth and development, signal transduction, and stress response. Here, we identified 95 DcWRKY genes in carrot based on the carrot genomic and transcriptomic data, and divided them into three groups. Phylogenetic analysis of WRKY proteins from carrot and Arabidopsis divided these proteins into seven subgroups. To elucidate the evolution and distribution of WRKY transcription factors in different species, we constructed a schematic of the phylogenetic tree and compared the WRKY family factors among 22 species, which including plants, slime mold and protozoan. An in-depth study was performed to clarify the homologous factor groups of nine divergent taxa in lower and higher plants. Based on the orthologous factors between carrot and Arabidopsis, 38 DcWRKY proteins were calculated to interact with other proteins in the carrot genome. Yeast two-hybrid assay showed that DcWRKY20 can interact with DcMAPK1 and DcMAPK4. The expression patterns of the selected DcWRKY genes based on transcriptome data and qRT-PCR suggested that those selected DcWRKY genes are involved in root development, biotic and abiotic stress response. This comprehensive analysis provides a basis for investigating the evolution and function of WRKY genes.

Transcriptional regulation is the most important link for regulating gene expression in plants. Transcription factors are involved in controlling many important biological processes in the gene transcription regulatory network 1,2 . In the plant genome, a large proportion of genes belong to transcription factors, and at least 58 transcription factor families have been determined 3 . In Arabidopsis and rice, approximately 8% and 4% of genes were identified as transcription factors, respectively 4,5 .
The WRKY family, whose name is derived from the highly conserved WRKY domain, is one of the largest transcription factor families 6 . The WRKY domain contains about 60 amino acids, comprising a highly conserved short peptide WRKYGQK and adjacent C 2 H 2 or C 2 HC zinc finger structure. The conserved amino acid WRKYGQK also consists of various forms, such as WRKYGKK, WRKYDQK, and WRKYDHK 7,8 . Based on the number of WRKY domains and the type of zinc finger, the WRKY family can be divided into three groups, namely, groups I, II, and III. Group I contains two WRKY domains and a C 2 H 2 zinc finger type (C-X 4-5 -C-X 22-23 -H-X 1 -H). Group II contains one WRKY domain and a C 2 H 2 type zinc finger; this group can be further divided into five subgroups (IIa, IIb, IIc, IId, and IIe). Group III also contains only one domain but has a C 2 HC zinc finger type (C-X 7 -C-X 23 -H-X 1 -C) 6 . Wu and his colleagues further analyzed the WRKY family in Arabidopsis and rice and revealed the distribution of group II into three subgroups (IIa+ b, IIc, and IId+ e) based on sequence similarities 9 .
The first reported WRKY gene, SPF1, was cloned from sweet potato in 1994 10 . Since then, several WRKY genes have been found in parsley 11 , Arabidopsis 12 , rice 13 , and even in green algae 14 . The completed genome sequencing of many plants has resulted in a more comprehensive identification of WRKY genes. The number of WRKY members in different species varies greatly. Thus far, 72 members have been found in Arabidopsis 15 , 100 in rice 9 , 41 in Physcomitrella patens 3,16 , and 19 in Selaginella moellendorffii 3 . Evolution analysis showed that group I was the oldest group, and groups II and III originated from group I 9 , 17 . In the later study, WRKY factors were identified in the nonplant species Dictyostelium discoideum and Giardia lamblia, suggesting that WRKY factors have a very ancient origin 14 .
A large number of WRKY factors were found, but the functions of only a small number of these factors have been analyzed further. These WRKY factors are involved in seed germination, plant growth and development, signal transduction, and metabolic regulation [18][19][20] . They also participate in biotic and abiotic stress responses, such as freezing, salinity, drought, and pathogen infection 21,22 . A WRKY transcription factor, ABO3, mediates plant responses to ABA and drought stress in Arabidopsis 23 . Overexpression of two wheat WRKY genes, namely, TaWRKY2 and TaWRKY19, confer tolerance to salt, drought, and freezing stresses in transgenic plants 24 . Similarly, Zhou et al. revealed that three GmWRKY genes from soybean play differential roles in abiotic stress tolerance in transgenic Arabidopsis plants 25 . Several WRKY genes are regulated by miRNAs 26,27 . In sunflower, miR396 regulates HaWRKY6 in response to high-temperature damage 27 .
Carrot is an important economic crop in the family Apiaceae and is rich in carotene and various nutrients. Studies on carrot transcription factors are rarely reported because of the lack of carrot genomic data 28,29 . The publication of the carrot genome draft database (CarrotDB) provides resources to make a bioinformatics identification and analysis on WRKY transcription factors. The genomic and transcriptomic database for carrot was built by our group (Lab of Apiaceae Plant Genetics and Germplasm Enhancement, http://apiaceae.njau.edu. cn/carrotdb/index.php) 30 . In this paper, we identified 95 DcWRKY genes in the genomic and transcriptomic carrot database and focused on the evolution and duplication of WRKY genes on different species. A total of 71 DcWRKY genes were detected to have expression on carrot root development based on transcriptome data. Moreover, the expression analysis of several DcWRKY genes under different stresses showed that DcWRKY genes participated in the abiotic stress response. Our results provide a basis for studying the evolution and function of WRKY transcription factors.

Identification of DcWRKY transcription factors in the D. carota genome.
To identify all the WRKY factors in the carrot genome, we employed the HMM profile of the WRKY domain (PF03106) as a query to search against the database using HMMER3.0 and BLAST. A total of 95 nonredundant genes were assigned as WRKY genes and termed DcWRKY1 to DcWRKY95 (Table 1 and S1). The lengths of the DcWRKY proteins ranged from 101 to 865 amino acids, with an average of 333 amino acids. The highly conserved domain WRKYGQK was present in 88 DcWRKY proteins, whereas the remaining seven proteins contained WRKYGKK, WRKYDHK, or WRKYDQK domain. Seventeen genes were identified to group I, which contained two WRKY domains and had a zinc finger motif of C 2 H 2 type (C-X 4 -C-X 22-23 -H-X 1 -H). Sixty-seven DcWRKY members contained a zinc finger motif of C 2 H 2 type (C-X 4-5 -C-X 23 -H-X 1 -H), which were classified as group II. Group III had 11 members, which contained a C 2 HC(C-X 7 -C-X 23 -H-X 1 -C) zinc finger.
Phylogenetic relationship and structure of the carrot DcWRKYs. To investigate the classification and phylogenetic relationship of the WRKY proteins in carrot and Arabidopsis, we used the domain region of the WRKY proteins from carrot and Arabidopsis to construct a phylogenetic tree. Based on the phylogenetic tree ( Fig. 1), all the DcWRKY factors could be divided into three groups. Group I is an independent branch, whereas groups II and III showed a relatively close relationship. Group II, in particular, could be further divided into three main groups with five subgroups. Subgroups IIa and IIb were separated from one clade, and IIe and IId clustered to a branch.
The exon-intron structure analysis was performed to gain more insight into the DcWRKY genes. High variation was observed in numbers of exons and introns among DcWRKY genes ( Fig. 2A,B). Forty-two DcWRKYs had two introns and accounted for the largest proportion, followed by eighteen DcWRKYs possessed only one intron and twelve DcWRKYs had three introns. DcWRKY genes belonging to the same group seemed to have similar exon-intron structures. For example, eight genes (DcWRKY59, 67,15,28,41,82,91,95) did not contain any intron, and four of them (DcWRKY59, 67, 15, and 28) were classified into group I, three of them (DcWRKY82, 91, and 95) were classified into group IIc. Group III contained eleven members, nine genes had two introns. For subgroups IIb, most genes contain four introns, while most IId and IIe genes had two introns. These results showed a strong correlation between exon-intron structure and phylogenetic relationship, which providing an additional foundation to support the classification.
The conserved motifs were predicted by MEME program to explore the diversity in each group. As illustrated in Fig. 2C,D, motifs 1, 3, and 5 contained a WRKYGQK sequence. The DcWRKY proteins that share a similar motif composition were clustered into the same group. For example, most members of groups IIa and IIb contained motifs 1, 2, 6, and 8, whereas group IId and IIe shared motifs 1 and 2. Motifs 1 and 2 were also present in group IIc. Motifs 2, 3, 4, 5, and 7 were present in group I, which contained two WRKY domains, whereas group III possessed motifs 3 and 4.
Physicochemical analysis of deduced DcWRKY proteins. To analyze the physical and chemical characterizations of the carrot DcWRKYs, we calculated all 95 DcWRKY proteins using the Protparam tool (Table S1). The values of theoretical pI ranged from 4.61 to 10.04, and the average for all proteins was 7.24. No distinct difference was found between the percentage of positive and negative amino acids in all groups except for group IId, in which the percentage of positive amino acids was two-fold higher compared with negative amino acids. The content of aliphatic amino acids in all the proteins was very high, which accounted for an average of 16%, whereas the percentage of aromatic amino acids was only 7%. The average value of the aliphatic index reached 59    Continued which suggested that the DcWRKY proteins contained rich aliphatic amino acids. Almost all DcWRKY proteins were calculated to be unstable proteins, only seven DcWRKY proteins were considered to be stable with aliphatic index values of less than 40. The grand average of hydropathicity of all DcWRKYs was less than zero, indicating that DcWRKY proteins were hydrophilic. Most of the physical/chemical properties of DcWRKY proteins were quite similar, but several differences were still observed, which may be due to the nonconserved regions in the protein sequences.
The cis-regulatory elements in all DcWRKY genes promoters were analyzed using the online software PlantCARE based the carrot genome data. A number of different kinds of cis elements were found, and the 10 most common elements were represented in Fig. 3. These elements included a fungal elicitor responsive element W-box, two light responsive elements (G-box and Sp1 elements), three hormone responsive elements (CGTCA-motif, ERF, and ABRE elements), a motif named Skn-1 associated with endosperm expression, a stress induction responding site TC-rich, a drought responsive element MBS, and a heat stress responsive site HSE. Most DcWRKY genes contained more than one cis element in their promoter regions. WRKY proteins usually functioned as transcriptional regulators by binding to W-box ( (C/T)TGAC(T/C)) to regulate defense-related genes 11 . We found that several DcWRKY genes also contained W-box element in their promoter regions. The  same findings were identified in Chinese cabbage and Arabidopsis 31,32 , suggesting that these DcWRKY genes may be regulated by other DcWRKY genes or self-regulated mechanisms. A lot of studies have also reported that WRKY factors were responsive to various stresses including drought, cold and salinity 14,33 , that may due to the upstream genes specificity bind the corresponding cis element to regulate the expression of WRKY genes.

Subcellular localization of DcWRKYs. Subcellular location analysis on the combination of WoLF PROST
and TargetP showed that most DcWRKY proteins were located in the nucleus (Table S1). To examine the subcellular localization of DcWRKY proteins, the coding sequences of three predicted nucleus-localized WRKYs were fused to the N-terminus of GFP and expression in tobacco cells via biolistic bombardment. The GFP fluorescence was observed only in the nucleus of transformant cells (Fig. 4), indicating that DcWRKY45, DcWRKY11 and DcWRKY80 were localized to the nucleus in vivo.
Evolution and distribution of WRKY family factors among different species. The WRKY family factors are commonly found in plants, but are also reported in two nonphotosynthetic organisms, namely, D. discoideum 14 and G. lamblia 34 . We constructed a schematic of the phylogenetic tree in eukaryote evolution. Based on the whole-genome level, the number of WRKYs in each species was counted (Fig. 5)  paralogous gene pairs were found in P. abies (237), followed by apple (102), P. patens (67), and carrot (56). Species with large genome seems to show correspondingly great paralogous genes, except for grape. Grape had only seven paralogous gene pairs. A total of 28/23 and 21/19 orthologous/coorthologous gene pairs were identified in carrotrice and carrot-Arabidopsis. Numerous orthologous/coorthologous gene pairs were also found between carrot and other dicotyledons. However, a small number of homologous genes were found between the gymnosperm P. abies and other plants, although the former has a relatively large WRKY family (72 WRKY transcription factors) and has the most number of paralogous gene pairs (237 pairs). These relationships indicate that the proteins that belonged to the same group may have the same functions in the regulation network. DcWRKY20 protein, which belonged to group IId, showed significant correlations with 17 proteins, indicating that DcWRKY20 probably plays an important role in biological regulation mechanisms by inducing other genes. Moreover, two DcMAPK proteins Dck19436 (DcMAPK1) and Dck06213 (DcMAPK4) respectively showed potential regulation relationships with nine DcWRKY proteins, including DcWRKY20.

Interaction network of
To assess whether there were interactions between DcWRKY20 and two DcMAPK proteins, the fusion plasmids in the vectors pGADT7 and pGBKT7 were co-transformed into the yeast strain AH109 and the interaction was quantified with X-α -Gal assays. Figure 7 showed that co-expression of the DcWRKY20 with DcMAPK1 or DcMAPK4 resulted in X-α -galactosidase expression activity. The positive control showed that pGBKT7-53 can interact with pGADT7-T, and the negative control did not display such expression. These results demonstrated that DcWRKY20 interacted directly with DcMAPK1 and DcMAPK4 in yeast.

Expression analysis of DcWRKY genes during root development in carrot.
The transcript abundances of DcWRKY genes during development in carrot were analyzed through the calculation of transcriptome data (SRR2177455). All DcWRKY genes were surveyed and 71 DcWRKY genes were found to have the expression. As showed in Fig. 8, the DcWRKY genes showed a broad expression during different developmental stages. Some genes, like DcWRKY6, 8,24,31,88, and 95, showed relatively high expression levels in all stages, while DcWRKY1, 23, 53, and 89 expressed very low. Throughout the Fig. 7, we found that the stage with 25 d had a large difference compared to other three stages, while a similar pattern was observed between 60 d and 90 d.
The WRKY genes in Arabidopsis, such as AtWRKY6 35 , AtWRKY44 36 , and AtWRKY75 37 , which have been confirmed to play important roles in plant development. According to our homologous analysis and transcriptome data, several DcWRKY genes were selected for qRT-PCR validation. As illustrated in Fig. 9, the expression patterns of those eleven DcWRKY genes had a greater difference at four development stages. Meanwhile, the qRT-PCR analysis and transcriptome data of most genes were consistent. The expression levels of four genes (DcWRKY2, 64, 69, and 88) increased over the process of carrot development. Among them, DcWRKY2 and DcWRKY64 showed significant expression changes at the initial stage and maintained higher expression levels at the later stages, while the expression of two others (DcWRKY69 and DcWRKY88) was highly variable. DcWRKY11 and DcWRKY95 exhibited a decrease pattern of expression at all four stages (Fig. 8), and qRT-PCR verified this result (Fig. 9). The expression levels of DcWRKY6 and DcWRKY8 at the initial stage were significantly decreased, while they showed a significant increase in the later stages.

Expression profiles of DcWRKY genes under biotic stresses in carrot.
In regards to DcWRKY genes response to biotic stresses, fourteen DcWRKY genes which their orthologous genes in Arabidopsis involved in biotic stress were selected to determine the expression profiles after treatment with whitefly and aphids infections, respectively. As shown in Fig. 10, those selected DcWRKY genes were sensitive to biotic stresses. Nine of the tested DcWRKY genes exhibited a high level of accumulation after subjected to aphid stress, especially DcWRKY5, followed by DcWRKY31, 8, 30 and 1. The expressions of DcWRKY8, 28, 30, 31 and 74 were induced after inoculation with whitefly. Among them, the effect whitefly infection on DcWRKY28 and DcWRKY30 response were much stronger than others. DcWRKY8, 30, and 31 were found to be both significantly upregulated in two biotic stresses, whereas weak expressions were observed for DcWRKY80 and DcWRKY90.  (Table 1), we selected 12 DcWRKY genes from each group to analyze their expression patterns under four abiotic stresses (cold, heat, salt, and drought) in carrot. Figure 11 shows that the members of group I, DcWRKY27 and DcWRKY30, had only one motif difference in structure but exhibited different responses to abiotic stresses.  In heat and cold treatments, DcWRKY27 was evidently upregulated and maintained a high expression level during 24 h, whereas DcWRKY30 was not sensitive to temperature treatment. Moreover, the expression level of DcWRKY27 also increased in salt and drought treatments; the expression level in salt treatment increased by almost 80 times. However, the trend in the expression of DcWRKY30 initially increased and reached a maximum after 2 h and then decreased in salt and drought conditions. Based on the further classification of group II (IIa+ IIb, IIc, and IId+ IIe), we found that the phenomenon of the same subgroup genes that showed a similar expression trend was common in group II, for example, DcWRKY45 (IIa)/DcWRKY1 (IIb), DcWRKY5 (IIc)/DcWRKY11 (IIc), and DcWRKY58 (IId)/DcWRKY88 (IId). The members of group II seemed to be more sensitive to drought and salt stresses. In particular, under salt stress, DcWRKY1 and DcWRKY18 increased by 14-and 12-fold, respectively.
DcWRKY8 and DcWRKY10, which both belonged to group III, had a motif difference. The expression levels of the DcWRKY8 and DcWRKY10 genes slightly changed under cold and heat conditions. Moreover, DcWRKY8 responded rapidly to salt stress response, but both genes had a similar change trend under drought treatment: an initial increase and peak at 2 h followed by a decrease.  Table 2. Numbers of paralogous, orthologous, and coorthologous gene pairs among nine plant species.
The numbers on the diagonal represent the paralogous genes of each species, the numbers located before "/" represent the orthologous genes, and the numbers located after "/" represent the coorthologous genes. Cre:  9 , but almost no reports on carrot, an Apiaceae plant, have been found. In this study, 95 genes were identified to encode WRKY transcription factors, and they were divided into three groups based on the similarity of structure and motif. Among all motifs, motifs 1, 3, and 5 contained a WRKYGQK sequence. Motifs 3 and 5 were present in group I members, while motif 1 was found in groups II and III. In addition, comparative structural analysis of DcWRKYs revealed that DcWRKYs in the same group shared similar exon-intron structures. The analysis on structures of DcWRKY genes might provide a way to find out which group of WRKY genes might be of a more ancient origin. A phylogenetic tree of WRKY transcription factors from carrot and the dicotyledonous model plant Arabidopsis was constructed. The result was consistent with domain and zinc finger type classifications of carrot WRKY transcription factors. Basing on the current genomic data, we built a model diagram for the origin and evolution of WRKY family transcription factors. Some species that can represent different branches on the plant's evolutionary tree were selected to analyze the WRKY family. As illustrated in Fig. 5, group I existed in all species, including the two nonplant species G. lamblia and D. discoideum. However, groups II and III only seemed to be specific for the green plant lineage and expand with the evolution of higher plants 17,40 . The results expound and support that all the WRKY groups may have evolved prior to the moss lineage. Group I may have originated before the origin of eukaryotes about 1.5 billion years ago 14 .
The events of gene duplication and loss are the driving forces during species evolution. With genome amplification, the number and density of WRKY factors are greatly increased. The algae C. reinhardtii and V. carteri both contain only two WRKY factors, whereas the moss P. patens and the fern S. moellendorffii have 41 and 19 WRKY members, respectively. Groups II and III appeared before the origin of the moss plant and have been duplicated many times with plant evolution, which accounted for a large proportion in higher plants. In this study, we selected several monocotyledonous and dicotyledonous plants to compare the duplication and evolution of WRKY factors. The duplication of WRKY factors in monocotyledonous and dicotyledonous plants was independent, and the duplication and diversification of WRKY factors occurred before the differentiation of monocots and eudicots. The number of group III members in monocotyledonous plants was evidently higher than that in dicotyledonous plants. This result indicates that group III seems to be more active in duplication and may have more function in monocots. Gene replication has been confirmed as one of the reasons that lead to new gene functions 41 . Therefore, we can infer that WRKY factors were expended during plant evolution. The unique gene duplication events in different species revealed the species specificity in the evolution of WRKY factors.
The homologous WRKY genes were compared to further analyze the genetic relationship. A total of 73 homologous gene groups belonging to nine distinctly divergent groups were identified based on the sequence similarity. The number of homologous genes showed a significant difference among different species, which may be influenced by the genetic distance between the analyzed species. Numerous homologous gene pairs were found between carrot and other dicotyledons because of their close relationship. However, few homologous gene pairs were identified between carrot and other monocotyledonous and lower plants. This study also shows that species with large genome indicates more paralogous genes. This phenomenon may be contributed by the plant's whole genome duplication (WGD). The WGD event is an important evolutionary feature of plant genome, which can explain the results of gene duplication or loss 42,43 . Previous studies showed that Arabidopsis has undergone three WGD, and rice also has undergone at least one WGD event 5,44 . Furthermore, the number of paralogous genes in grapevine is much less than that in other higher species, which could be due to grapevines' failure to undergo any WGD event 45 . Carrot may also have experienced one or more rounds of WGD events, which could have led to the DcWRKY factors being expended.
Previous studies founded that WRKY transcription factors have complex regulatory networks in response to biotic and abiotic stresses. Among the networks, the WRKY factors are controlled by different levels, including direct positive or negative control by WRKY factors, regulation via other transcription factors or proteins, and the small-RNA-WRKY interactome 21 . In this study, an interaction network between DcWRKYs and other carrot proteins was constructed. Thirty-eight DcWRKY proteins showed co-expression relationships with other proteins in carrot genome, indicating that these DcWRKYs interact with other proteins to modulate stress resistance. The DcWRKY20, was identified in a yeast-two-hybrid system that interacted directly with two mitogen-activated protein kinases DcMAPK1 and DcMAPK4. In Arabidopsis, MAPKs are important regulating factors and act by phosphorylating transcription factors, which subsequently activates transcription of other genes, including WRKY genes 46,47 . In Arabidopsis, WRKY33 was subject to post-translational modification by MAPK4 that was involved in pathogens and SA-mediated responses 48 . The phosphorylation of OsWRKY30 by MAPKs was crucial in order for OsWRKY30 to perform its biological function in rice 49 . Here we found that DcWRKY20 can interact with MAPKs, speculating that DcWRKY20 activity may be regulated by post-translational modifications with MAPKs. We also confirmed the W-box binding activity of several DcWRKY factors by yeast one-hybrid system. However, no obviously interactions of these DcWRKYs and the W-box element were observed in yeast cells (data not shown), suggesting that they may not have direct control relationship. Also the absence of interaction with W-box in yeast may be due to the absence of required co-factors, such as phosphorylation 11 , zinc-ions 50 .  Evidence has demonstrated that WRKY transcription factor are involved in plant growth and development. The WRKY transcription factor in rice, OsWRKY78, can regulates stem elongation and seed development 51 . Overexpression OsWRKY31 gene enhances disease resistance and affects root growth and auxin response in transgenic rice plants 52 . GhWRKY15, a member of the WRKY family identified from cotton, is involved in disease resistance and plant development 53 . Expression analyses in carrot root development helped to screen WRKY genes which may be involved in carrot root development. The DcWRKY3, DcWRKY8 DcWRKY24, DcWRKY64, and DcWRKY88, were abundantly expressed at all four stages of development in carrot root. Notably, the orthologous genes of DcWRKY3 and DcWRKY8 in Arabidopsis, AtWRKY6 (AT1G62300.1) and AtWRKY53 (AT4G23810.1), which have been confirmed to play an important role in leaf development 35,54 . This result suggested that DcWRKY3 and DcWRKY8 may have similar functions to their orthologous Arabidopsis genes during plant development.
Transcriptome analysis revealed that a large number of drought, cold, and high-salinity stress-responsive genes, including numerous WRKY genes, were identified in Arabidopsis 39 . The comparative analysis of DcWRKY genes with their homologous AtWRKY genes helped to predict the potential functions of DcWRKY proteins. Following homologous gene annotations in Arabidopsis, we deduced the functional roles of DcWRKYs. A qRT-PCR experiment showed that the expression profile of DcWRKY genes agreed well with the WRKY genes in Arabidopsis. The two Arabidopsis WRKY genes AtWRKY33 (AT2G38470.1) and AtWRKY40 (AT1G80840.1) were induced by biotic and abiotic stresses 55,56 . Their orthologous genes in carrot are DcWRKY30 and DcWRKY31, which were also found to be evidently upregulated under drought, salt and pathogenic stresses. As plants evolved from lower to higher, plants have established a series of mechanisms of plant growth and development, metabolic regulation, and stress response. The rapid expansion of the WRKY gene family may be a way to meet the requirements for these pathways. Most stress-resistance traits are often controlled by multiple genes 57,58 . The expression patterns of several DcWRKY genes significantly changed, indicating that these genes all appeared to be involved in biotic and abiotic stress response. We expect that future research will reveal the accurate regulation mechanisms of WRKY genes in signaling pathways and stress responses.

Materials and Methods
Identification of putative WRKY genes in carrot. The genome sequences of carrot (Daucus carota L. cv. Kuroda) were downloaded from the Carrot Genome Project web site (http://apiaceae.njau.edu.cn/carrthe otdb/index.php) 30 , and the transcriptome sequences were downloaded from NCBI SRA database 59 and Carrot Genome Project web site 30 . The hidden Markov model (HMM) of the WRKY domain (PF03106) was downloaded from the Sanger database (http://pfam.sanger.ac.uk/family/WRKY). All the putative carrot DcWRKY factors were obtained via screening carrot genome sequences and transcriptome sequences by HMMER 3.0 software (http://hmmer.janelia.org/) using default parameters. The sequences were then submitted to the NCBI database (http://ncbi.nlm.nih.gov/) to search for WRKY domain. All the nonredundant gene sequences encoding complete WRKY domains were considered as putative WRKY genes.
Sequence alignments and phylogenetic analysis. The database of the Arabidopsis WRKY family factors was downloaded from the plant transcription factor database (http://www.arabidopsis.org/). The WRKY family databases of other species were downloaded from the plant transcription factor database (http://planttfdb.cbi.pku.edu.cn/) 3 . All WRKY factors were classified into subgroups based on the sequence alignments of WRKY proteins in carrot and Arabidopsis using ClustalW with default parameters 60 . The domain region of the WRKY proteins from carrot and Arabidopsis were used to construct a phylogenetic tree with MEGA5.0 using the neighbor-joining method with 1000 bootstrap replicates 61 .

Subcellular localization.
To test the predicted nuclear localization patterns, the full length of three DcWRKY genes (DcWRKY11, DcWRKY45, DcWRKY80) without the stop codon were amplified with primers as shown in Table S2. The PCR products were inserted into a GFP-fusion expression vector PA7, and transferred to tobacco leaves to determine the subcellular localization. Empty vector 35S::GFP was used as control. Fluorescence images of GFP fusion proteins were observed using the LSM780 confocal microscopy imaging system (Zeiss, Germany).
Yeast two-hybrid assay. For the yeast two-hybrid assay, the DcWRKY20 cDNA fragment was cloned into the pGBKT7 vector, and the DcMAPK1 and DcMAPK4 cDNA fragments were cloned into the pGADT7 vector, respectively. Unique amplified primers for cloning these three genes were represented in Table S2. These vectors were co-transformed into the yeast AH109 strain, and the transformants were identified with X-α -gal according to a manufacture's protocol (Clontech, Takara, Dalian, China). Yeast co-transformation with pGBK7-53 and pGADT7-T was used as a positive interaction control, and co-transformation with pGBKT7-Lam and pGADT7-T was used as a negative control.
Transcript abundance analysis. Transcript abundance analysis in carrot root development was based on transcriptome data (SRR2177455). Transcript abundance analysis of DcWRKY genes were estimated by calculating read density as 'reads per kilobase of exon model per million mapped reads' (RPKM) 69 . The heatmap was performed by Multiexperiment Viewer software (http://www.tm4.org/) 70 . Plant materials, abiotic and biotic treatments, and qRT-PCR. Experimental samples were obtained from two-month-old 'Kurodagosun' carrot seedlings, which were grown in pots in a controlled-environment growth chamber. The temperature was set at 4 °C for the cold treatment and 38 °C for the heat treatment. Salt and drought treatments were carried out by irrigating with 200 mM NaCl and 20% PEG6000, respectively. Seedlings irrigated with sterile water were used as blank control. For plant biotic damage, whitefly was mechanically inoculated on the carrot leaves by rubbing the leaf with the virus, and adult aphids settled on carrot plants for the aphid tests. After 7 days, infective-stage leaves were collected and used for bioassays. All samples were immediately frozen in liquid nitrogen and stored at − 70 °C.
Total RNAs were extracted from the samples using the total RNA kit (Tiangen, Beijing, China) and then reverse transcribed into cDNAs using the PrimeScript RT reagent kit (TaKaRa, Dalian, China). Three independent PCR reactions were carried out for each gene using the MyiQ Single-Color RT-PCR detection system (Bio-Rad, Hercules, CA, USA). The PCR amplification profile was as follows: 95 °C for 30 s and 40 cycles of 94 °C for 5 s and 60 °C for 30 s, at last a melting curve (65-95 °C, at increments of 0.5 °C) was generated to check the amplification specificity. The relative gene expression was calculated with the 2 −ΔΔCT method 71 . All primers used are shown in Table S2. The DcTUB gene was used as an internal control to normalize the expression of DcWRKY genes.