Genome-wide characterization of bZIP gene family identifies potential members involved in flavonoids biosynthesis in Ginkgo biloba L.

Ginkgo biloba L. is an ancient relict plant with rich pharmacological activity and nutritional value, and its main physiologically active components are flavonoids and terpene lactones. The bZIP gene family is one of the largest gene families in plants and regulates many processes including pathogen defense, secondary metabolism, stress response, seed maturation, and flower development. In this study, genome-wide distribution of the bZIP transcription factors was screened from G. biloba database in silico analysis. A total of 40 bZIP genes were identified in G. biloba and were divided into 10 subclasses. GbbZIP members in the same group share a similar gene structure, number of introns and exons, and motif distribution. Analysis of tissue expression pattern based on transcriptome indicated that GbbZIP08 and GbbZIP15 were most highly expressed in mature leaf. And the expression level of GbbZIP13 was high in all eight tissues. Correlation analysis and phylogenetic tree analysis suggested that GbbZIP08 and GbbZIP15 might be involved in the flavonoid biosynthesis. The transcriptional levels of 20 GbbZIP genes after SA, MeJA, and low temperature treatment were analyzed by qRT-PCR. The expression level of GbbZIP08 was significantly upregulated under 4°C. Protein–protein interaction network analysis indicated that GbbZIP09 might participate in seed germination by interacting with GbbZIP32. Based on transcriptome and degradome data, we found that 32 out of 117 miRNAs were annotated to 17 miRNA families. The results of this study may provide a theoretical foundation for the functional validation of GbbZIP genes in the future.

www.nature.com/scientificreports/ terminus consists of a heptad repeat of leucine or other bulky hydrophobic amino acids 13,14 . To bind DNA, two subunits adhere via interaction between the hydrophobic sides of their helices, which creates a superimposing coiled-coil structure 15 . ACGT is the core sequence recognized by bZIP TF, and the corresponding cis-acting elements are A-box, G-box, and C-box 14,16 . Flavonoid biosynthesis involved in multiple enzymes in plants, and its accumulation is not determined by a single key enzyme. Besides, the process is also regulated by many transcription factors. A bZIP transcription factor was found to be involved in the metabolism of flavonoids. HY5 belongs to the H subclass of the bZIP gene family, which was proved to be involved in plant photomorphogenesis 17,18 . Subsequently, it was shown that HY5 was also involved in the anthocyanin metabolism process, which belongs to flavonoids 19 . Qiu et al. obtained SlHY5 frameshift mutated and found that the anthocyanin content in the mutant was lower than that in wildtype tomato 20 . HY5 can improve plant cold tolerance through direct regulation of CBF or indirect regulation of MYB15 21 .Anthocyanins accumulation could be inhibited by Gibberellins at low temperature in plants, while the expression level of the gibberellin signaling pathway gene GA2ox1 showed HY5-dependent 22 . In strawberry, HY5-bHLH9 heterodimer formed to regulate light-dependent coloration and anthocyanin biosynthesis, suggesting that HY5 can be involved in plant anthocyanin metabolism by acting together with other genes 23 . Nguyen et al. 24 found HY5 binds directly to the MYBD promoter in this process to promote anthocyanin accumulation. Zhang et al. 25 found that HY5 or HYH TFs promote anthocyanin accumulation by upregulating DFR, a key gene downstream of the anthocyanin biosynthesis pathway in A. thaliana. The bZIP TF was shown to affect the biosynthesis of anthocyanins in red raspberries 26 . Furthermore, the bZIP TF HY5 in regulates anthocyanin synthesis through the activation of AtMYB75 transcription in A. thaliana 27 . Droge et al. 28 isolated the G/HBF-1 (bZIP) TF from soybean and found that G/HBF-1 can bind to the H-box and G-box in the promoter region of the CHS gene to activate CHS gene expression and substantially increase phytoalexin level, thereby improving disease resistance in soybean. Akagi et al. 29 found that DkbZIP5 was involved in proanthocyanidin biosynthesis by binding to the ABRE element of the promoter region of DkMYB4 in persimmon callus. Fan et al. 15 identified a total of 135 RsbZIP genes from radish genome and found that RsbZIP011 and RsbZIP102 were potential participants in the radish anthocyanin synthesis pathway. Therefore, regulating the expression of TFs is another effective way to increase flavonoid content.
The whole genome sequence of G. biloba has been published in 2016. The bZIP gene family is important in plant development and stress response. However, the bZIP transcription factors have not been reported in G. biloba yet. In the present study, the whole genome sequence of G. biloba was used to screen for the GbbZIP gene family. We analyzed the physicochemical property, phylogenetic relationship, chromosome localization, gene structure, motifs, tissue expression pattern, protein-protein interaction network, miRNAs of GbbZIP gene family and their expressional patterns under SA, MeJA, and low temperature treatment. The result will provide a theoretical basis for subsequent studies on the mechanism of action and function of the bZIP gene family in G. biloba.

Results
Identification of bZIP genes in G. biloba. A total of 65 protein sequences were identified from the G. biloba genome using HMMER 3.0 software. 15 bZIP proteins were found to contain no bZIP domains or incomplete bZIP domains by further CDD and Pfam analysis, so they were subsequently removed. The remained 40 proteins containing bZIP domains were named GbbZIP01-GbbZIP40. The proteins encoded by the 40 GbbZIP genes ranged from 66 (GbbZIP34) to 681 (GbbZIP22) amino acids and from 7.90 kDa to 74.44 kDa in relative molecular weight. The isoelectric point was predicted to range from 5.16 to 9.75. As can be seen in Table 1, the GRAVY scores of 40 GbbZIP proteins were negative, indicating they are hydrophilic. 38 GbbZIPs are expressed in cell nucleus. GbbZIP12 and GbbZIP27 are located in the chloroplast and plastid, respectively. Multiple sequence alignment of the conserved domain sequences of 40 GbbZIP members showed that the basic region is composed of N-(X)7-R/K, and the zipper domain is composed of the heptapeptide repeat of leucine (L) or related hydrophobic amino acids (Fig. 1).
To investigate the phylogenetic relationship among the bZIP family members, a phylogenetic tree was gen- Gene structure and motif analysis. To gain insights into the structures of GbbZIP genes, their exon/ intron organization was investigated. As shown in Fig. 4b, the number of exons in GbbZIPs ranged from 1 to 12. The gene structure of the same subfamily was relatively conserved. For example, four members in group E contained the same number and length of exons. Out of the six members in group A, five are composed of a long exon, and three short exons. A higher number of exons were found in D and G subclasses, while fewer introns were found in F and S subclasses. The gene structure of subclass A was highly uniform, but the number of introns in subclass H varied considerably. www.nature.com/scientificreports/ The motifs of GbbZIPs were analyzed using the MEME website and the results were shown in Fig. 4c. A total of 20 motifs were found in GbbZIPs. Using the PFAM database, motif 1 and motif 9 (Table S2) were identified as the core conserved domains of GbbZIP proteins. Motif 13 and 4 were identified as DOG1 (Delay of Germination 1), which appears to be a highly specific controller seed dormancy 33 . Motif 12 and 18 were identified as MFMR (Multifunctional mosaic region), and other motifs contained no specific annotation information 34 . In addition to the core conserved domains, different conserved motifs were also present in different subfamilies. For example, Motif 3, 5, and 7 are only present in subfamily A; motif 4, 11, and 13 are distributed in subfamily D; motif 6, 12, 15, 17, and 18 exist in subfamily G; motifs 8 and motif 10 occur in subfamilies F and E, respectively. Motif 9 was detected only in subclass G and S, suggesting similar functions between the two subclasses that were in the same branch of the evolutionary tree.
Promoter analysis. The sequence 2000 bp upstream of each GbbZIP gene was analysed in PlantCARE website, and the results were visualized with TBtools v1.09854. The detailed cis-acting element information can be found in Table S3. As shown in Fig. 5, the most frequently occurring cis-acting elements contain light-, hormone-, abiotic stress-, and growth-and development-responsive element. G-box, ACE, and GT1 motifs are www.nature.com/scientificreports/ light-responsive elements, while ABRE, TGA, CGTCA-motif, and TCA-element are ABA-, auxin-, MeJA-, and SA-responsive elements, respectively. Gibberellin responsive element contains GARE-motif and P-box. Among them, the light-responsive element appeared most frequently (158 times) and was the most widely distributed in GbbZIP gene promoters. Nine GbbZIPs contained low-temperature-responsive elements. In addition, other elements include anoxic inducibility (ARE), circadian control, defense and stress (TC-rich), endosperm expression (GCN4-motif), and meristem expression (CAT-box) elements.

Analysis of the expression pattern and correlation.
The expression pattern of the GbbZIP gene family was analyzed based on FPKM from G. biloba transcription database. As shown in Fig. 6a, 36 GbbZIPs are expressed in eight tissues, except for four genes (GbbZIP09, GbbZIP24, GbbZIP32, and GbbZIP38) that are not expressed in mature leaf and root. GbbZIP13, GbbZIP18, and GbbZIP31 showed high expression characteristics in all eight tissues. GbbZIP38, GbbZIP24, GbbZIP26, and GbbZIP32 showed very low expression levels in all tissues. The expression profiles of GbbZIP11, GbbZIP12, GbbZIP16, and GbbZIP36 were significantly higher in microstrobilus than in other tissues. The expression levels of GbbZIP01, GbbZIP02, GbbZIP21, and GbbZIP25 were lower in mature leaf than in other tissues.
Based on correlation analysis between on the FPKM of GbbZIPs and the flavonoid content in the eight tissues of G. biloba, we obtained two GbbZIPs (GbbZIP08 and GbbZIP15) with correlation coefficients greater than 0.8 (Fig. 6b, Table S4, Table S6, Table S7). These two genes were assigned to subclass H in the evolutionary tree analysis.  Prediction and analysis of miRNA regulating GbbZIPs. In order to further understand the GbbZIP genes in G. biloba, we combined transcriptome and degradome sequencing results to screen for 117 miRNA members regulating 33 GbbZIPs. Based on the transcriptome annotation results, a total of 32 miRNAs were matched to 17 miRNA families (Table S5). Fig. 8 showed that 9 out of 40 GbbZIPs were each targeted by only one corresponding miRNA. GbbZIP36, GbbZIP31, GbbZIP11, and GbbZIP32 were targeted by 15, 11, 13, and 9 miR-NAs, respectively. Interestingly, some miRNAs targeted more than one GbbZIP gene, such as novel_miR_582 (GbbZIP18 and GbbZIP40), novel_miR_2803 (GbbZIP21 and GbbZIP28), novel_miR_2362 (GbbZIP36 and GbbZIP39), novel_miR_1493 (GbbZIP01 and GbbZIP27), and novel_miR_2752 (GbbZIP03 and GbbZIP10). GbbZIP08 and GbbZIP15 were targeted by novel_miR_850 and novel_miR_715, respectively.
Differential expression levels of GbbZIPs with hormone treatment. Based on the results of cisacting element analysis, genes containing a large number of cis-acting elements which are related to SA-, MeJA-, and low temperature-responses were selected for qRT-PCR analysis, and the results were visualized in the form of a heatmap. As shown in the Fig. 9c, most GbbZIPs responded negatively to SA, only the expression levels of GbbZIP08 and GbbZIP33 first increased and then decreased. GbbZIP25, GbbZIP16, GbbZIP11, GbbZIP11, GbbZIP26, GbbZIP10, and GbbZIP29 were downregulated within 24 h after MeJA treatment then increased at 48 h (Fig. 9a). The expression levels of GbbZIP23 and GbbZIP35 peaked at 6 h after MeJA treatment and  www.nature.com/scientificreports/ then started to decrease (Fig. 9a). With low-temperature treatment, the transcription levels of GbbZIP08 and GbbZIP14 were upregulated, while GbbZIP26 increased and then decreased, and reached a peak at 6 h (Fig. 9b).

Discussion
G. biloba is an ancient economic tree species that originated in China and it has a very strong ability to adapt to harsh environments. Therefore, the genetic information of G. biloba is of great research value in plant growth and development, pest and disease defense, abiotic stress, and tree evolution. To date, the bZIP gene family has been identified in A. thaliana (78 bZIPs) 30 , Populus (86 bZIPs) 37 , Raphanus sativus (135 bZIPs) 15 , Solanum tuberosum (80 bZIPs) 38 , Brassica napus (247 bZIPs) 38 , and Olea europaea (103 bZIPs) 39 . However, the bZIP transcription factors in G. biloba have not been reported yet. In this study, a total of 40 GbbZIP members were identified based on the genomic data of G. biloba. Compared to the species above, G. biloba contains the fewest bZIP members with 40 while B. napus contains the most with 247. We used phylogenetic analysis to classify bZIP genes into 10 groups-40 GbbZIP genes, along with 69 bZIP genes from A. thaliana, and 109 from M. domestica (Fig. 3). The GbbZIP members were not detected in subgroup J, B, and M, which implying that these members were derived gradually during the evolutionary process. In G. biloba, the largest number of GbbZIPs were found in subclasses S, A, and I subgroups. It is similar to A. thaliana 30 39 which contains the most bZIP members in S, A, I, and D subclasses. In the Fig. 3, GbbZIP03 was clustered closely with AtbZIP51 (VIP1), which is a putative host factor in Agrobacterium-mediated T-DNA transfer 41 . Furthermore, GbbZIPs in the same group share a similar gene structure, number of introns and exons, and motif distribution. HY5 (Elongated Hypocotyl 5) is a major regulator of photomorphogenesis. Studies in apple indicated that MdHY5 promoted anthocyanin accumulation by regulating the expression of MdMYB10 and downstream anthocyanin biosynthesis genes 42 . SlHY5 in Solanum lycopersicum could directly recognize and bind to the G-box and ACGT-containing element in the promoters of anthocyanin biosynthesis genes, such as CHS1, CHS2, and HDR 43 . It is apparent in the Fig. 6b, the positive correlation between the expression of GbbZIP15 and GbbZIP08 and the content of flavonoids was the highest. In addition, GbbZIP15 and GbbZIP08 was close to AtbZIP56 and MdbZIP67  Protein-protein interaction network for GbbZIPs was analyzed by the STRING website (http:// stringdb. org/) using the full-length protein sequence 16,36 . Empty nodes indicate proteins of unknown 3D structure.
Filled nodes indicate that some 3D structure is known or predicted. proteins were found to be the most similar to GbbZIP08 and GbbZIP15 proteins (Table S8, Table S9). AtbZIP56 (HY5) in subclass H has been reported to play an important role in anthocyanin biosynthesis 24 . MdWRKY11 transcription factor was reported to bind to the W-box in the MdbZIP67 promoter and thus participated in the anthocyanin biosynthesis in M. domestica 44 . The result implied that GbbZIP15 and GbbZIP08 might be involved in anthocyanin biosynthesis of G. biloba. SA is a crucial internal signaling molecule needed for the induction of plant defense responses upon attack of a variety of pathogens 45 . The expression of GbbZIP33 was increased by 20-fold 3 h after SA treatment. However, no significant changes were observed in the expression of GbbZIP33 after MeJA and low-temperature treatment. In Cassava, MebZIP3 was largely increased while MebZIP5 expression was decreased after SA treatment 46 . The trend of GbbZIP08 was relatively consistent in MeJA and SA treatment. Its expression level first increased, then decreased, and peaked 3 h after treatment. It showed a similar trend with CsBZIP40 in sweet orange. The expression level of CsBZIP40 increased significantly from 12 h and then decreased after 36 h After MeJA treatment 47 . With low-temperature treatment, GbbZIP08 showed an increasing trend from 0 h to 8 h, and the expression was significantly increased. Similarly, the result was also found in apple. MdbZIP67 (MdHY5) overexpression www.nature.com/scientificreports/ improved the cold resistance in apple and transgenic A. thaliana 48 . It indicated that GbbZIP08 might participate in the cold stress in G. biloba. The expression level of GbbZIP29 was upregulated under low-temperature stress while the low-temperature-responsive element (CCG AAA ) was found in GbbZIP29 In the promoter analysis.
In the protein-protein interaction analysis, it revealed that 33 out 40 GbbZIP members were on the proteinprotein interaction network (Fig. 7). BZIP53 (AT3G62420) and bZIP44 (AT1G75390) both involved in the positive regulation of seed germination through MAN7 gene activation 49 . Therefore, the interaction between BZIP53 (homology of GbbZIP09) and bZIP44 (homology of GbbZIP32) might participate in seed germination. GbbZIP08 and GbbZIP15 were speculated as homologous genes of HY5 (AtbZIP56). It has been reported that HY5 was involved in modulating seed germination and seedling development in response to abscisic acid and salinity 50 . HYH (HY5 homology) has a similar functions as HY5 in plant photomorphogenesis and anthocyanin accumulation 22,51 . It was speculated that GbbZIP34 (homology of HYH) was involved in flavonoid biosynthesis in G. biloba by interacting with GbbZIP08 and GbbZIP15 (homology of HY5).
The reported miRNAs associated with anthocyanin biosynthesis include miR858a, miR828, and miR156. In tomato, miR828 repressed anthocyanin expression by cleaving the structural gene SlMYB7-like 52 . 32 out of 117 miRNAs were annotated to 17 families (Table S5). In this study, GbbZIP08 and GbbZIP15 were targeted by novel_ miR_850 and novel_miR_715, respectively. The miRNAs (novel_miR_1493, novel_miR_30, novel_miR_3054, novel_miR_3292, novel_miR_443, novel_miR_462, and novel_miR_82) were annotated as miR1314 family and they all targeted GbbZIP37. MIR1314 appears gymnosperm-specific and is likely to exist as a luster 53 . GbbZIP06 was targeted by novel_miR_1493 and novel_miR_18 which were identified as miR397. Overexpression of native Musa-miR397 enhanced plant biomass without compromising abiotic stress tolerance in banana 54 . The miRNAs (novel_miR_2243, novel_miR_3072, novel_miR_2847, and novel_miR_775) were annotated as miR159 in this study. It was reported that miR159 inhibited the root growth by repressed the expression of MYB33, MYB65 and MYB101 in A. thaliana 55 .

Conclusions
In the current study, the genome-wide distribution of the GbbZIP gene family was identified in G. biloba. We conducted a preliminary analysis of the physicochemical properties, chromosome localization, evolutionary relationship, tissue-specific expression pattern, protein-protein interaction, and miRNAs. We verified the response of 20 GbbZIPs to MeJA, SA, and low temperature through qRT-PCR. Based on the correlation analysis and the phylogenetic analysis, GbbZIP08 and GbbZIP15 were inferred to be HY5 genes involved in anthocyanin biosynthesis in G. biloba. The results will be important for the further functional characterization of GbbZIP08 and GbbZIP15 in G. biloba.

Methods
Plant materials and treatments. Annual G. biloba seedlings were used in this study. The G. biloba seeds were derived from Ginkgo Resource Garden at the western campus of Yangtze University (E:112.15373°, N:30.357146°) and they were carried out in accordance with relevant institutional, national or international www.nature.com/scientificreports/ guidelines and regulation. Before sowing, G. biloba seeds were immersed in tap water for 3 days to absorb enough water. Then, they were taken out and covered with wet gauze to be promoted germination. When the radicle of the seeds grew to 1 cm, they were planted in pots (15 cm in height and 17 cm in diameter) and grown in a greenhouse (E:112.152283°, N:30.358268°). We watered the seedlings every 5 days. When the seedlings developed 5-6 leaves after 2 month, the leaves of seedlings were treated with 10 mM salicylic acid (SA), 1 mM methyl jasmonate (MeJA) and 4 °C, respectively 56,57 . Exogenous hormones were sprayed on the seedling leaves in this study. Leaves treated with water were used as the control. Three replicates were selected for each treatment, and each replicate contained 10 seedlings. After three different treatments, leaves of annual G. biloba seedlings in each treatment were collected at 0, 3, 6, 12, 24, and 48 h, separately 58 .
Identification of GbbZIP genes. The  Analysis of gene structure, conserved motif, and promoter. To further analyze the distribution of the GbbZIP genes in G. biloba, GbbZIP protein motifs were predicted on the website MEME (https:// memesuite. org/ meme/) 4 . To acquire the gene structure, the nucleus sequences of GbbZIP genes were submitted to GSDS website (http:// gsds. gao-lab. org/) 69 . The 2000 bp upstream sequences of the transcription start site in GbbZIPs were extracted from the G. biloba genome database, and their promoters were predicted on the Plant-CARE website (http:// bioin forma tics. psb. ugent. be/ webto ols/ plant care/ html/) 70 .

Expression pattern analysis based on transcriptome and correlation analysis.
To investigate the expression pattern of the GbbZIP gene family, we used FPKM of GbbZIP genes in eight tissues (root, stem, immature leaf, mature leaf, microstrobilus, ovulate strobilus, immature fruit, and mature fruit) based on the transcriptome data 71 . Subsequently, the correlation index between the FPKM of GbbZIP genes and the content of flavonoid in eight tissues was calculated (p < 0.05) by Pearson's correlation method using the OmicShare tools (http:// www. omics hare. com/ tools).
Protein-protein interaction network. The protein sequences of 40 GbbZIP genes were extracted and sent to the STRING website (http:// string-db. org/). The interaction network of GbbZIP proteins was predicted in the STRING website using a model plant A. thaliana 16,72 . The result was imported into Adobe Illustrator 2020 software for modification. miRNA target prediction of GbbZIP genes. Based on transcriptome and degradation data, miRNAs targeting the GbbZIP genes were screened from the annotation files 6,71 . The screening results were visualized using OmicShareTools (http:// www. omics hare. com/ tools), a free online platform.
qRT-PCR analysis. Total RNA of G. biloba was extracted using the MiniBEST Plant RNA Extraction Kit (TaKaRa, Dalian) in accordance with the manufacturer's instructions. First-strand cDNA was synthesized using HiScript II 1st Strand cDNA Synthesis Kit (Vazyme, Nanjing). qRT-PCR was performed according to the procedure of ChamQ Universal SYBR qPCR Master Mix (Vazyme, Nanjing). The Integrated DNA Technologies website (https:// sg. idtdna. com/ pages) was used to design the primers (Table S1). The relative expression level was normalized to the GbGAPDH (GenBank ID: L26924.1) gene and calculated using the 2 −∆∆Ct method 73 .
Statistical analysis. Statistical analyses were performed with IBM SPSS software (Version 22). The relative expression level was presented as the mean ± standard error (SE). Duncan's multiple range tests were conducted to examine signifcant diferences among means at p < 0.05.