Genome-wide identification of AP2/ERF transcription factor-encoding genes in California poppy (Eschscholzia californica) and their expression profiles in response to methyl jasmonate

With respect to the biosynthesis of plant alkaloids, that of benzylisoquinoline alkaloids (BIAs) has been the most investigated at the molecular level. Previous investigations have shown that the biosynthesis of BIAs is comprehensively regulated by WRKY and bHLH transcription factors, while promoter analyses of biosynthesis enzyme-encoding genes have also implicated the involvement of members of the APETALA2/ethylene responsive factor (AP2/ERF) superfamily. To investigate the physiological roles of AP2/ERF transcription factors in BIA biosynthesis, 134 AP2/ERF genes were annotated using the draft genome sequence data of Eschscholzia californica (California poppy) together with transcriptomic data. Phylogenetic analysis revealed that these genes could be classified into 20 AP2, 5 RAV, 47 DREB, 60 ERF and 2 Soloist family members. Gene structure, conserved motif and orthologous analyses were also carried out. Gene expression profiling via RNA sequencing in response to methyl jasmonate (MeJA) indicated that approximately 20 EcAP2/ERF genes, including 10 group IX genes, were upregulated by MeJA, with an increase in the expression of the transcription factor-encoding gene EcbHLH1 and the biosynthesis enzyme-encoding genes Ec6OMT and EcCYP719A5. Further quantitative RT-PCR confirmed the MeJA responsiveness of the EcAP2/ERF genes, i.e., the increased expression of 9 group IX, 2 group X and 2 group III ERF subfamily genes. Transactivation activity of group IX EcAP2/ERFs was also confirmed by a luciferase reporter assay in conjunction with the promoters of the Ec6OMT and EcCYP719A5 genes. The physiological roles of AP2/ERF genes in BIA biosynthesis and their evolution in the regulation of alkaloid biosynthesis are discussed.

www.nature.com/scientificreports/ Multiple sequence alignment, phylogenetic analysis, and classification of EcAP2/ERF proteins. To classify the 134 EcAP2/ERF proteins, we performed a multiple sequence alignment of the AP2/ERF domain of the 134 EcAP2/ERF proteins and of each group of A. thaliana AP2/ERF proteins and then constructed a phylogenetic tree using the neighbour-joining (NJ) method ( Fig. 1, Supplementary Fig. S1). According to the phylogenetic tree and homology search, 5 proteins containing one AP2/ERF domain were closely related to AP2 family proteins. Thus, 20 proteins were ultimately assigned to the AP2 family. Five B3 domain-containing proteins were classified in the RAV family; however, 47 proteins were identified in the DREB subfamily, and 60 proteins belonged to the ERF subfamily. EcAP2/ERF5 and EcAP2/ERF127 showed homology with AT4G13040, which is involved in salicylic acid (SA)-mediated plant defence 24 ; therefore, they were classified as members of the Soloist family. The DREB and ERF subfamilies were further divided into eleven groups, groups I-X and VI-L 16 . The numbers of proteins in these groups are listed in Table 1. No protein belonging to the Xb-L group was found in E. californica. EcAP2/ERF61, which is located near group VIII, was assigned to a single group based on the phylogenetic tree, which is shown in Fig. 2. Compared with the number of AP2/ERF subfamilies in A. Gene structure and motif composition of the EcAP2/ERF family. To compare the genomic DNA sequences of EcAP2/ERF genes, we determined their intron and exon structures (Fig. 2). Among the 134 genes, 37 had introns, 19 AP2 family genes (except for EcAP2/ERF115) had at least two exons, the Soloist genes had 4-5 introns, and 16 genes in DREB and ERF subfamilies had 1-5 introns. Most of the ERF subfamily genes (85.0%) were, however, intronless. Particularly, all genes in the group III, VI, VI-L and IX subfamilies had no introns, whereas all group VII genes had more than one intron. This exon-intron structural pattern is highly consistent with that of other plant species, such as castor bean, Brachypodium distachyon and moso bamboo [25][26][27] .
To investigate the potential motifs of AP2/ERF proteins in each family, we analysed their amino acid sequences using online MEME analysis (Figs. 3,4,5). Motifs related to the AP2/ERF domain (AR1, AR2, AR3, AR4 or AR7) were found in all AP2 and RAV family proteins (Fig. 3). AR6 and AR8, which are components of the B3 domain, were found only in RAV family proteins. Several AP2 family proteins had additional motifs, AR9, AR10, AR11 and AR12. As shown in Fig. 4, all DREB subfamily (groups I-IV) proteins had D1, D2 and D3 motifs, which are related to the AP2/ERF domain. All group III proteins and two group II proteins (EcAP2/ERF103 and EcAP2/ ERF132) had motif D4 located in the C-terminal region of the AP2/ERF domain. Some group III proteins contained D5, D6, D7, D8 and D9 motifs in the N-terminal region or C-terminal region, whereas motif D11 was found in group I, II and IV proteins. Furthermore, motif D10 was located in the N-terminal region of group IV proteins, and motif D12 was found in a portion of group II and III proteins.
Similar to the DREB subfamily proteins, all the ERF subfamily proteins (V-X, VI-like) had E1 and E2 motifs, which correspond to the AP2/ERF domain (Fig. 5). Motifs E4 and E12 were found only in group VI proteins, while motifs E7, E8 and E10 were found only in group VII proteins. Motifs E6 and E9 were found only in group IX proteins. Moreover, motif E5 was found in a portion of group V, VIII and VI-L proteins, and motif E11 was found in a portion of group VI and VI-L proteins (Fig. 5). These motifs might be important for the function of each group of proteins.

Orthologous relationships of EcAP2/ERFs between E. californica and A. thaliana.
To investigate the orthologous relationship of EcAP2/ERFs, we carried out a reciprocal BLAST search using predicted fulllength amino acid sequences of the California poppy and Arabidopsis AP2/ERF proteins as queries. A total of 49 EcAP2/ERFs reciprocally hit Arabidopsis proteins with ≥ 45% sequence similarity and ≤ 1e−20 e-value (Supplementary Table S2). Additionally, a phylogenetic tree using full-length EcAP2/ERF and AtAP2/ERF proteins was generated to validate the orthologous relationship ( Supplementary Fig. S2). As a result, 19 EcAP2/ERF proteins were overlapped with the reciprocal best hit proteins and 3 additional EcAP2/ERFs were also predicted as orthologous proteins. These orthologous AP2/ERF proteins between E. californica and A. thaliana might likely have a similar function. www.nature.com/scientificreports/ Transcriptome-based expression profiling of EcAP2/ERF genes in MeJA-treated E. californica seedlings. Alkaloid production has been well known to be induced by MeJA, a phytohormone related to the defence response 28,29 . In fact, many studies have revealed that the expression of genes involved in BIA biosynthe- www.nature.com/scientificreports/ sis, including within California poppy, is transiently induced in response to MeJA 8,22,23,30 . To explore the EcAP2/ ERF transcription factors involved in the regulation of BIA biosynthesis, the expression profiles of EcAP2/ERF, BIA transcription factor-encoding genes (EcbHLH1-1 and EcbHLH1-2) and biosynthesis enzyme-encoding genes (Ec6OMT and EcCYP719A5) in response to MeJA treatment were examined using RNA-seq data (Fig. 6). As shown in Fig. 6A, the expression of EcAP2/ERF72 in the AP2 family and EcAP2/ERF121 in the RAV family increased in response to MeJA; the change was greater than onefold after the data were log 2 transformed compared to the data of the mock control at 0 h. The expression of EcbHLH1-1, EcbHLH1-2, Ec6OMT and EcCYP719A5 showed a clear increase after treatment with MeJA, as previously reported 8,23 . Interestingly, the expression levels of many genes in the AP2 and RAV families were downregulated under MeJA conditions.
In the DREB subfamily, more than 20 genes were highly induced by MeJA; however, many of them were also induced under the mock conditions. The expression of EcAP2/ERF77, EcAP2/ERF79 and EcAP2/ERF92 in group III was clearly upregulated by MeJA; in particular, EcAP2/ERF92 showed the highest expression increase (fold-change > 7) after 12 h (Fig. 6B). EcAP2/ERF49 in group II and EcAP2/ERF130 in group IV also showed high expression increases (fold-change > 1). The expression levels of EcAP2/ERF50, EcAP2/ERF103 and EcAP2/ERF132 also increased in response to MeJA; however, their basal expression levels in California poppy seedlings were low. www.nature.com/scientificreports/ In the ERF subfamily, 9 genes in group IX were highly induced (fold-change > 1) during 0.5-6 h (Fig. 6C). In particular, EcAP2/ERF17 showed a more than approximately fivefold increase in expression after 6 h. On the other hand, the expression levels of EcAP2/ERF8 and EcAP2/ERF10 in group IX decrease in response to MeJA. The expression levels of EcAP2/ERF20 and EcAP2/ERF26 in group X and EcAP2/ERF43 in group VIII were also apparently induced by MeJA. However, the expression profiles of the EcAP2/ERF genes in groups V, VI, VII and VI-L were not largely altered in response to MeJA.  www.nature.com/scientificreports/ were also analysed, as shown in Fig. 6, and a clear response to MeJA was confirmed, as previously described. As shown in Fig. 7  Relative expression

Discussion
The AP2/ERF superfamily is one of the largest groups of plant-specific transcription factors involved in plant developmental processes as well as biotic and/or abiotic stress response and various plant hormone signalling 18 . Members of the AP2/ERF superfamily have been studied at genome-wide level in various plant species, such as Arabidopsis, soybean, rice, carrot, moso bamboo, Chinese jujube and grape 15,26,[31][32][33][34][35] . However, this study is the first report on the genome-wide identification of AP2/ERF transcription factor-encoding genes from E. californica, a BIA-producing plant. In this study, 134 AP2/ERF members were identified from the California poppy genome (  Fig. S2 and Table S2). These findings indicate that the numbers of AP2/ERF transcription factor-encoding genes vary among plant species, and their functions have were expanded throughout evolution. The genome of California poppy, a basal eudicot member of the Papaveraceae family, may indicate the evolutionary history of AP2/ERF family genes in land plants. As shown by the summary of the number and gene density (per megabase) of AP2/ ERF family genes from various plant species (Supplementary Table S3), the percentage of each family gene in E. californica is relatively close to that in other plant species. The gene density of E. californica, 0.267, is close to that of Medicago truncatula and is relatively low compared to that of other plant species. Since several EcAP2/ ERF genes containing incomplete AP2/ERF domain sequences were omitted in this study, however, the exact number of AP2/ERF genes in the California poppy genome may be higher. Gene structural analysis showed that 72.4% of EcAP2/ERF genes had no introns, while the AP2 subfamily genes exhibited complex structures; that is, nearly all AP2 family genes in California poppy harboured introns ranging from 1 to 9 (Fig. 2). This tendency is consistent with phenomena observed in other plant species [25][26][27] . The conserved exon-intron structure of AP2 family genes suggests that gene duplication might have occurred after the evolution of introns, and the small number of AP2 family genes might be due to the intron structure, which impairs the ability to duplicate. On the other hand, the lack of introns in many DREB and ERF subfamily genes might suggest the need for these genes for a quick response to environmental stresses during development.
Conserved domains and motifs of transcription factors are generally important for their transcriptional activity, protein-protein interactions and DNA-binding activity. Motif analysis revealed that each family and subgroup of proteins shared similar motifs, which might be related to their specific functions (Figs. 3, 4, 5). It is also possible to speculate about the function of EcAP2/ERF proteins compared with those of the motifs of Arabidopsis and rice AP2/ERF proteins 16 . Since motif D9 was found in only subgroup IIIc proteins, it might confer a specific function to those proteins. The D8 motif found in a subset of group III proteins contains an LWSY sequence at the C-terminus, which is also conserved in OsDREB1A, B and C and in AtDREB1A as an LWSY motif 16 . These proteins have been reported to be involved in high salt, drought and cold tolerance 39 . Further, a DSVWR sequence in motif D4, which is similar to the conserved DSAWR sequence found in Arabidopsis DREB proteins, is present in group II and III proteins; however, the function of this motif is still unknown. Motif E5 contains an FDLNxxP  www.nature.com/scientificreports/ sequence, which is well known as an EAR motif 40 . Since the EAR motif affords repression activity, EcAP2/ERF proteins that harbour E5 motifs might function as transcriptional repressors. Motifs E4, E11 and E12 were found mainly in group VI proteins, which suggests that these motifs might exhibit characteristic features of group VI proteins. Motif E8 is conserved in group VII proteins and contains a TPEISS sequence, which is considered a putative MAP kinase phosphorylation site 16 . Thus, the function of group VII proteins containing the E8 motif might be regulated by MAP kinase-related signalling. Motifs E6 and E9 are present in group IX proteins, which can be divided into three subgroups, IXa, IXb and IXc 16 . Motif E6 is present in subgroup IXb proteins, while motif E9 is present in subgroup IXa and IXc proteins. It is possible that these motifs might provide specific functions of each subgroup of proteins and might be related to the transactivation activity of group IX proteins, as shown in Fig. 8. Motifs AR1, AR2, AR3, AR4 and AR7 in AP2 and RAV family proteins are included in AP2/ERF domain sequences. The other motifs, such as AR5, AR9, AR10, AR11 and AR12, contain unique sequences that might confer specific functions to these proteins. Plant hormone signaling has an important influence on various plant processes including the biosynthesis of secondary metabolites 28,29 . Whereas alkaloid biosynthesis has been also reported to be affected by various plant hormones 2,41,42 , BIA biosynthesis is quite well-known to be controlled by JA-signalling involved in chemical defence against microorganisms and herbivores in the plants, including California poppy 8,22,23 . Potential AP2/ ERF transcription factors involved in the regulation of BIA biosynthesis in California poppy were characterized by analysis of their gene expression profiles via RNA sequencing and qRT-PCR based on MeJA-responsiveness. The results revealed that 14 EcAP2/ERF genes were upregulated in response to MeJA, though 2 genes showed no significant difference (Figs. 6 and 7). In particular, the expression of 9 of 13 group IX EcAP2/ERF genes clearly increased in response to MeJA within 4 h. Compared to the expression pattern of Ec6OMT and EcCYP719A5, the expression pattern of many group IX genes was upregulated rapidly more but then decreased. This quick MeJA response is relatively similar to the response of EcbHLH1-1 and EcbHLH1-2, both of which are potential master regulators in BIA biosynthesis.
Group IX AP2/ERF transcription factors have been reported to be involved in the regulation of various secondary metabolic pathways 20 . Octadecanoid derivative-responsive Catharanthus AP2-domain (ORCA) 2 and 3 were first isolated from Catharanthus roseus as MeJA-responsive group IX AP2/ERF transcriptional regulators of alkaloid biosynthesis 43,44 . In particular, ORCA3 is considered a master regulator of monoterpenoid indole alkaloid (MIA) biosynthesis 45 . In addition, the following proteins have been identified and characterized in the last decade: NtERF189, a master regulator of nicotine biosynthesis in Nicotiana tabacum 46 ; AaERF1, AaERF2 and AaORA, positive regulators of artemisinin biosynthesis in Artemisia annua 47,48 ; GAME9/JRE4, a master regulator of steroidal glycoalkaloid biosynthesis in tomato and potato 49,50 ; and OpERF2, a possible regulator of camptothecin biosynthesis in Ophiorrhiza pumila 51 . These group IX proteins showed high similarity. These findings strongly suggest that MeJA-responsive group IX EcAP2/ERF transcription factors are involved in the positive regulation of BIA biosynthesis. In fact, in the present study, a transient LUC reporter assay indicated that four group IX EcAP2/ERF proteins were clearly upregulated, reflecting transcriptional activity within the promoters of biosynthesis genes (Fig. 8).
Interestingly, tobacco group IX AP2/ERF genes, including NtERF189, are known to be clustered in the nicotine production regulatory locus called NIC2 46 . Furthermore, ORCA gene clusters were identified from the C. roseus genomic scaffold 52,53 . A recent study revealed that these AP2/ERF transcription factors encoded by the ORCA gene cluster mutually regulate the expression of clustered genes and MIA biosynthesis 54 . To examine whether EcAP2/ERF genes were also clustered in the California poppy genome, we searched for the location of each EcAP2/ERF gene in the genome scaffolds. The results revealed that the EcAP2/ERF2, EcAP2/ERF3 and EcAP2/ ERF4 genes, which showed high upregulation in response to MeJA, co-localized in the same genome scaffold (Supplementary Table S4). These EcAP2/ERF proteins showed the highest similarity to Arabidopsis AtERF5 in the orthologous analysis (Supplementary Table S2). Further, EcAP2/ERF2, EcAP2/ERF3 and EcAP2/ERF4 exhibited relatively high transactivation activity in the reporter assay. Our findings strongly suggest that these clustered EcAP2/ERF genes might be derived from common origin through gene duplication and these AP2/ ERF transcription factors play an important role in the regulation of BIA biosynthesis. Additional functional characterization of EcAP2/ERF transcription factors and elucidation of the transcriptional network involved in several transcription factors, including AP2/ERF proteins, are ongoing.

Materials and methods
Identification of AP2/ERF superfamily genes in the E. californica draft genome. Based on the annotated gene information available at the Eschscholzia Genome DataBase (https ://eschs cholz ia.kazus a.or.jp), 178 AP2/ERF genes were first isolated from the California poppy draft genome. After removing redundant and incomplete ORF sequences, the SMART database (https ://smart .embl-heide lberf .de/) was used to further eliminate sequences that did not contain a complete AP2/ERF domain. All the genes were also validated using the PhytoMetaSyn transcriptomic database (https ://bioin forma tics.tugra z.at/phyto metas yn/) 55  www.nature.com/scientificreports/ method, Jones-Thornton-Taylor (JTT) model, and 1000 bootstrap replications using MEGA 7.0 software (https ://www.megas oftwa re.net/) 56 .
Gene structure and conserved motif analysis. Gene structure was visualized using the online Gene Structure Display Server (GSDS) (https ://gsds.cbi.pku.edu.cn/) based on predicted full-length CDS regions with their corresponding genomic sequences. Conserved motifs of AP2 and RAV family members and DREB and ERF subfamily proteins were predicted using the MEME Suite web server version 5.1.0 (https ://meme-suite .org/) with the following parameters: maximum number of motifs of 12 and optimum motif width ≥ 6 and ≤ 50 57 . The topology of the phylogenetic trees was determined with the AP2/ERF domain using MEGA 7.0 software.

Orthologous analysis.
A reciprocal BLAST was carried out to find the orthologous relationship between California poppy and Arabidopsis AP2/ERF proteins. Best hits with more than 50% sequence coverage were picked up. Un-rooted phylogenetic tree using full-length EcAP2/ERF and AtAP2/ERF proteins was generated with the NJ method, p-distance model, and 1000 bootstrap replications using MEGA7.0.
RNA sequencing and expression profiling. California poppy seedlings were grown in Linsmaier-Skoog (LS) media (pH 5.7, 0.8% agar) at 23 °C under continuous light (50-100 µmol photons/m 2 /s) for 10 days, after which they were transferred to LS media supplemented with 100 µM MeJA or 0.1% dimethylsulfoxide (DMSO) as a control. Total RNA was extracted from 5 seedlings at 0, 0.5, 1, 3, 6 and 12 h after treatment using an RNeasy Plant Mini Kit (Qiagen, Hilden, Germany) and subsequently sequenced as 101 bp + 101 bp paired-ends on an Illumina HiSeq 2500 by Hokkaido System Science Co., Ltd. (Hokkaido, Japan). Read trimming using the cutadapt and Trimmomatic programs, read mapping using the TopHat/Bowtie program, and expression analysis via calculations of fragments per kilobase of exon model per million mapped fragments (FPKM) values using Cufflinks were also performed. Hierarchical clustering and construction of heat maps were performed using MeV software (https ://mev.tm4.org/).

Gene expression analysis by real-time PCR.
Eighteen seedlings of California poppy that were grown as described above were treated with 100 µM MeJA for 0, 1, 4 and 24 h. After dividing them into three sample groups (each with 6 seedlings), total RNA was extracted from each group using an RNeasy Plant Mini Kit. Single-stranded cDNA was synthesized from 1 µg of total RNA using ReverTra Ace qPCR RT Master Mix in conjunction with gDNA Remover (Toyobo, Osaka, Japan). Quantitative RT-PCR was performed with specific primer pairs (Supplementary Table S6) using Thunderbird SYBR qPCR Mix (Toyobo, Osaka, Japan) and a Light-Cycler 96 system (Roche, Basel, Switzerland). The PCR conditions were 95 °C for 1 min, followed by 45 cycles of 95 °C for 15 s, 60 °C for 20 s and 72 °C for 30 s. The data were calculated by the comparative 2 −ΔΔCt method, and the relative expression levels were standardized by the amplification of the β-actin gene as an internal control.
LUC reporter assay. The promoter sequences of Ec6OMT (507 bp) and EcCYP719A5 (621 bp) were isolated from California poppy genomic DNA and inserted into a pDR196-LUC vector, yielding a promoter::LUC reporter construct. The full-length cDNAs of group IX EcAP2/ERF genes were fused to the CaMV 35S promoter in the pBI221 vector, which subsequently served as an effector construct. A dual-LUC reporter assay was then performed using C. japonica protoplasts as previously described 58 .