Characteristics of banana B genome MADS-box family demonstrate their roles in fruit development, ripening, and stress

MADS-box genes are critical regulators of growth and development in flowering plants. Sequencing of the Musa balbisiana (B) genome has provided a platform for the systematic analysis of the MADS-box gene family in this important banana ancestor. Seventy-seven MADS-box genes, comprising 18 type I and 59 type II, were identified under strict criteria in the banana (Pisang Klutuk Wulung, PKW, 2n = 2x = 22) B genome and mapped to its chromosomes. Evolutionary analysis suggested that the M. balbisiana MCM1-AGAMOUS-DEFICIENS-SRF (MbMADS) genes can be organized into the MIKCc, MIKC*, Mα, Mβ, and Mγ groups according to their phylogeny. MIKCc was further divided into 10 subfamilies according to conserved motif and gene structure analyses. The well-defined MADS-box genes highlight gene birth and death in banana. MbMADSes originated from the same ancestor as MaMADSes. Transcriptome analysis in cultivated banana (ABB) revealed that MbMADSes were conserved and differentially expressed in several organs, at various fruit development and ripening stages, and under stress treatments, indicating that these genes participate in fruit development, ripening, and stress responses. Of note, SEP/AGL2 and AG, as well as several other type II MADS-box genes, including members of the STMADS11 and TM3/SOC1 subfamilies, showed elevated expression throughout banana fruit development, ripening, and stress treatments, suggesting novel roles in controlling fruit development and ripening. According to the co-expression network analysis, MbMADS75 interacted with bZIP and seven other transcription factors to perform its function. This systematic analysis identifies candidate MbMADSes genes involved in fruit development, ripening, and stress for further functional studies in plants, improves our understanding of the transcriptional regulation of MbMADSes genes, and provides a basis for the genetic modification of MADS-mediated fruit development, ripening, and stress responses.

Although an analysis of the MADS-box genes in the banana A genome is available 13 , a descriptive genome-wide phylogenetic and functional characterization of MADS-box genes in the banana B genome is still missing. To advance our knowledge of the characteristics of MADS-box genes in the banana B genome and to support further studies on this pivotal transcription factor family, we present an in-depth analysis of the number, correspondence with MaMADSes, phylogeny, location, structure, expression, and co-expression network of MADS-box genes in the recently released B genome database 31 . We find that the type I genes and the SEP/AGL2, AGL17, and DEF/GLO subfamilies are more strongly contracted, and the SQUA/AP1, TM3/SOC1, and STMADS11 subfamilies are more expanded, in the B genome than in the A genome. We therefore hypothesize that efficient utilization and extensive sub- and neofunctionalization in these subfamilies are responsible for the extensive distribution of banana.

Results and discussion
Seventy-seven MADS-box genes are preferentially placed on the banana B genome. To strictly identify banana MADS-box genes in the B genome, we searched the banana B genome database with MADS-box sequences from the banana A genome as queries using BLAST, Hidden Markov Model searches, Swiss-Prot, and Clusters of Orthologous Genes (COG) functional annotation. After comprehensive consideration, we identified 77 putative MADS-box members from the banana B genome. Analysis of the conserved motifs verified that the identified MbMADSes possessed the conserved MADS domain, the defining attribute of the MADS-box family. The 77 predicted banana MADS-box proteins ranged from 64 (MbMADS58) to 818 (MbMADS32) amino acid residues in length, from 7.4 (MbMADS58) to 87.6 (MbMADS32) kDa in relative molecular mass, and from 5.1 to 11.5 in isoelectric point (Supplementary Table S1). To characterize the evolutionary relationships among MADS-box genes from the banana A and B genomes, a maximum likelihood (ML) evolutionary tree was constructed (Fig. 1; Supplementary Table S1). Using the genome database (http://banana-genome.cirad.fr/) (released in 2019), we found that the 77 MbMADSes were localized on 11 chromosomes. Chromosome 5 carried the most genes (10; 13.0%), followed by chromosomes 1, 8, 10, and 11 with eight each (10.4%), chromosomes 2, 3, and 4 with seven each (9.1%), and chromosome 6 with five (6.5%); only four MADS-box genes were localized on chromosome 7.
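The chromosome shares quoted above follow from dividing each count by the 77 identified genes. The following is a minimal sketch of that arithmetic, using only the per-chromosome counts itemized in the text; the variable and function names are illustrative assumptions, and chromosome 9 is omitted because the text does not give a count for it.

```python
# Per-chromosome MbMADS gene counts as itemized in the text.
# Chromosome 9 is not itemized there, so it is left out (assumption: unknown).
TOTAL_GENES = 77
counts = {5: 10, 1: 8, 8: 8, 10: 8, 11: 8, 2: 7, 3: 7, 4: 7, 6: 5, 7: 4}


def share_percent(count, total=TOTAL_GENES):
    """Return the share of `count` in `total`, as a percentage with one decimal."""
    return round(100.0 * count / total, 1)


shares = {chrom: share_percent(n) for chrom, n in counts.items()}
listed_total = sum(counts.values())  # genes accounted for by the itemized counts

for chrom in sorted(counts):
    print(f"chromosome {chrom}: {counts[chrom]} genes ({shares[chrom]}%)")
print(f"itemized genes: {listed_total} of {TOTAL_GENES}")
```

The itemized counts cover 72 of the 77 genes, so the remaining genes are not assigned here.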
Due to gene duplication-transposition, the gene birth and death rate for type I MADS-box is higher than type II 32 . Most type I genes are functionally redundant or silent and only partly required for regulating coenocytic development, while their other functions remain elusive 33,34 . In comparison with MaMADSes, the number of type I MbMADSes was greatly decreased, with 13 fewer than MaMADSes 13 . This result suggests that type I MADS-box in the B genome shows a higher gene death rate and that the B genome banana efficiently uses type I to regulate female gametogenesis and seed development, which is consistent with the report that the banana B genome has greater gene family contraction and loss than the A genome 31 .
The number of type II MbMADSes was relatively stable, with only six fewer than in MaMADSes. The SQUA/AP1, TM3/SOC1, and STMADS11 subfamilies displayed slight gene expansion, with two, two, and one more gene, respectively, than in the A genome. The SEP/AGL2, AGL17, DEF/GLO, GGM13, and MIKC* subfamilies displayed slight gene contraction, with three, three, three, one, and one fewer gene, respectively, than in the A genome. The AG, AGL12, and OsMADS32-like subfamilies remained unchanged. These results indicate that type II MbMADSes were more conserved than type I, and that mild sub- and neofunctionalization in these subfamilies may be linked to the complex morphology and environmental distribution of banana.
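The net difference of six type II genes can be checked against the per-subfamily changes listed in this paragraph. The sketch below is only a bookkeeping illustration; the dictionary layout and variable names are assumptions, while the numbers are those stated in the text (B genome minus A genome).

```python
# Change in gene number per type II subfamily (B genome minus A genome),
# taken from the paragraph above.
subfamily_delta = {
    "SQUA/AP1": +2, "TM3/SOC1": +2, "STMADS11": +1,      # expanded
    "SEP/AGL2": -3, "AGL17": -3, "DEF/GLO": -3,          # contracted
    "GGM13": -1, "MIKC*": -1,                            # contracted
    "AG": 0, "AGL12": 0, "OsMADS32-like": 0,             # unchanged
}

expanded = sum(d for d in subfamily_delta.values() if d > 0)
contracted = sum(d for d in subfamily_delta.values() if d < 0)
net_change = expanded + contracted

print(f"gained: {expanded}, lost: {-contracted}, net: {net_change}")
```

Five genes gained and eleven lost give a net change of minus six, which agrees with the stated difference between type II MbMADSes and MaMADSes.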
As with MaMADSes, MIKCc subfamilies such as STK and FLC are absent from the banana B genome (Fig. 1). FLC subfamily genes determine flowering time 35 . The death of the FLC subfamily might be consistent with the tropical character of banana, the flowering of which does not need low-temperature stimulation, allowing the plant to flower at any time. STK subfamily genes control ovule development 36 .
Scientific Reports | (2020) 10:20840 | https://doi.org/10.1038/s41598-020-77870-w www.nature.com/scientificreports/
The death of STK corroborates the greatly decreased number of developing ovules and the key evolutionary step of long-term selection for seedless fruits in wild banana, which caused sterility and improved the palatability of wild seedy banana fruits 37 . During banana development, 20-70% of ovules are lost. Even in cases where the ovules appear normal, approximately 50% of the normal ovules remain unfertilized despite a sufficient supply of pollen to the stigma 38 . The remaining developing ovules might be controlled by other subfamilies such as AG, SQUA/AP1, and TM3/SOC1. Together, this highlights that banana can realize its evolutionary advantages and fully utilize its MADS-box genes in flower and fruit development.
Conserved and variable structure exhibit adaptability. The structure of a MADS-box gene decides its function. Generally, MADS-box genes consist of four domains: MADS, I, K, and C. The I domain may be responsible for protein dimer formation. The K domain is responsible for protein dimerization, and the C terminal domain may be responsible for transcriptional activation and protein complex formation 39,40 .
To characterize the MbMADSes proteins, we used MEME software to identify 10 conserved motifs in total, and we used the InterPro database to annotate them (Fig. 2). The exon-intron structure was obtained with the gene structure display server (GSDS) (http://gsds.cbi.pku.edu.cn/). All the MbMADSes proteins contain the conserved MADS domain (Motif 1). In terms of both the conserved domain and exon-intron analysis, type I MbMADSes were the simplest MADS-box proteins and contained two to five motifs. Fifteen of the 18 type I MbMADSes (all except MbMADS9, 52, and 73) contained a MADS domain (Motif 1) and a variable C terminal domain (Motif 7, 8, 9, or 10). Consistent with the domain analysis, 14 of the 18 type I MbMADSes (all except MbMADS9, 18, 52, and 73) were intronless (Fig. 3). This simple structure might facilitate their role in evolution and the regulation of seed development 33,34 .
Compared with type I, the structure of type II MbMADSes was more complex, and most contained six to seven motifs. Ninety-three percent (55/59) of type II MbMADSes, except for MIKC*, contained Motif 3 with or [...] Fig. 4c), exist in banana and rice, suggesting that ancestral MADSes genes were present before banana and rice diverged (Fig. 4). Synteny was observed in the A and B genome divergence (Fig. 1; Supplementary Table S2). For example, MaMADS81 and 103 are sister to MbMADS20 and 69, respectively. A total of 73 gene pairs were formed between the banana A and B genomes in Fig. 1, among which 65 pairs were formed by MbMADSes with [...] Table S2).
A one-to-one correspondence was generated, with 75% of pairs located on the same chromosome and 25% on different chromosomes. For example, MbMADS6 and MaMADS9 are sisters both located on chromosome 1, while MbMADS2 and MaMADS25 are sisters located on chromosomes 1 and 3, respectively (Fig. 5; Supplementary Table S2). This close evolutionary relationship suggests that the A and B genome bananas originated from the same ancestor and that a number of syntenic events occurred in these lineages, leading to the syntenic divergence of banana A and B before the polyploidization of banana (Fig. 1). The one-to-one correspondence between different chromosomes indicates that chromosomal crossover, exchange, and recombination, as well as the activity of transposable elements and long terminal repeat retrotransposons, might have occurred during divergence from the common ancestor 42,43 . This result corroborates the report that irregularities, including bridges, fragments, and lagging univalents, can be detected in a significant proportion of clones during microsporogenesis and the second meiotic division 3 . Seven gene pairs were formed by MaMADSes themselves, and one gene pair was formed by MbMADS7 and MbMADS61. The close proximity of these genes to each other suggests that subfamily expansion may have proceeded via tandem duplications (Fig. 1).

Conserved and differential expression profiles of MbMADSes genes in Banana (ABB).
To evaluate the organ-specific expression characteristics of MADS genes in banana (ABB), the roots, leaves, flowers, and fruits were subjected to RNA-seq analysis. Of the 77 MbMADSes genes, 68 (all except MbMADS3, 8, 17, 31, 37, 50, 58, 59, and 69) were expressed in at least one examined organ (Fig. 6; Supplementary [...] The finding that so many MbMADSes were highly expressed in the roots agrees with the review reporting that MADS-box genes are fundamental in root development 44 . The elevated expression levels and gene numbers in the flowers and fruits imply that MbMADSes play more significant roles in the flowers and fruits than in other organs, in line with earlier studies showing that MADS-box transcription factors are pivotal in flower and fruit development 45,46 . The phenotypes of the fruit development and ripening process are shown in Fig. 7a. As the fruit ripened, the ripening-related physiological parameters changed significantly. Ethylene production increased significantly and reached its highest value of 21.5 ng g⁻¹ h⁻¹, while fruit pulp firmness decreased greatly and reached its lowest value of 0 at 6 DPH (Fig. 7b,c). Moreover, the color values a, b, and L gradually increased and reached their highest values of 7.2, 39.8, and 68.5, respectively, at 6 DPH (Fig. 7d). These results are consistent with our recent report (Wang et al., 2019) 31 .
To evaluate the contribution of MbMADSes genes to fruit development and ripening, their expression was evaluated in fruits sampled at 0, 20, and 80 days after flowering (DAF) and at 0, 3, and 6 days postharvest (DPH) (Fig. 7e; Supplementary Table S4). Among the 77 MbMADSes, 68 genes (all except MbMADS3, 8, 17, 31, 37, 50, 58, 59, and 69) were differentially expressed at various fruit development and ripening stages. Sixty-three (92. [...] , and 1174, respectively; this suggests that these genes might function prominently in the developmental and ripening processes of banana fruit. These results align closely with reports that the AG and SEP subfamilies are key regulators of fruit development and ripening 21,22,47 . The finding that MbMADS36 was highly expressed in both flower and fruit development could be explained by the morphologically indistinguishable pistillate and staminate flowers that are biseriately arranged in a cluster 3 .

Expression profiles of MbMADSes genes under abiotic and biotic stresses in Banana (ABB).
Banana is a valuable fruit of tropical and subtropical environments and can adapt to environmental stresses 48,49 .
The prolonged process of banana evolution represents the long history of plant domestication. Banana propagates vegetatively by divisions known as "pups" or "suckers." As the scale of production increased, the sterile cultivars were grown in close proximity in large quantities, resulting in attack by pathogens because of a lack of genetic diversity 50 . Fusarium oxysporum f. sp. cubense tropical race 4 (Foc TR4) is believed to cause a major and destructive disease of banana, ranking in the top six of significant global plant diseases 51 . Foc TR4 targets banana plant roots and colonizes the vascular system of the rhizome and pseudostem. Within 5-6 months of planting, distinctive internal and external wilting symptoms can typically be observed 52 . Thus, understanding the molecular mechanisms of abiotic stress responses and Foc TR4 infection is a priority for the sustainable development of the banana industry.
These eight interacting transcription factors were validated by quantitative real-time PCR (qRT-PCR) (Fig. 9b). Moreover, MbMADS75 and bZIP (Mb_10_t14180.1) were selected to test their interaction by yeast two-hybrid assay (Fig. 9c). This pair was selected because bZIP transcription factors are crucially implicated in plant development and responses to numerous stresses 64,65 . The qRT-PCR results demonstrated that the eight interacting genes had the same expression pattern as MbMADS75, except that the WRKY gene was down-regulated by Foc TR4, which requires further investigation. The yeast two-hybrid assay demonstrated that MbMADS75 could interact with bZIP (Mb_10_t14180.1) to perform its function.
In conclusion, we identified a total of 77 MADS-box genes from the banana (Pisang Klutuk Wulung, PKW) B genome. We classified these into the MIKCc, MIKC*, Mα, Mβ, and Mγ groups according to the phylogeny, and also organized MIKCc into 10 subfamilies. The well-defined MbMADSes highlight gene birth and death in banana. MbMADSes originate from the same ancestor as MaMADSes. Major genes that demonstrated high expression in fruit development, ripening, and the stress treatments were part of the SEP/AGL2 and AG subfamilies. Several type I and other subfamilies, including the TM3/SOC1 and STMADS11 MbMADSes genes, demonstrated high expression during banana fruit growth, ripening, and stresses, which suggests novel roles in controlling fruit development, ripening, and stress responses. Interactive network analysis indicated that MbMADS75 interacted with bZIP and seven other transcription factors to perform its functions. These findings contribute greatly to our understanding of the contributions of MbMADSes to the regulation of banana fruit development, ripening, and environmental adaptation processes, and enable further breeding and genetic improvements in agriculture.

Methods
[...] 69 ; Following this, we carried out BLAST searches to establish the anticipated MbMADSes in the banana database, using all Arabidopsis and rice MADSes as queries. We ultimately assessed each of the candidate protein sequences using the CDD (http://www.ncbi.nlm.nih.gov/cdd/) and PFAM (http://pfam.sanger.ac.uk/) databases. Then, we used multiple sequence alignments to verify the conserved domains of the predicted MbMADSes proteins. Further, we used Clustal X 2.0 to perform sequence alignments of the full-length MADSes proteins from banana, Arabidopsis, and rice. A maximum likelihood (ML) evolutionary tree with 1000 bootstrap replicates was produced in MEGA 7.0 software to assess the phylogenetic relationships 70 .
Protein characteristics and sequence analyses. The molecular weights and isoelectric points of the predicted MbMADSes proteins were predicted with the ExPASy proteomics server (http://expasy.org/). Using the MEME program (http://meme.nbcr.net/meme/cgi-bin/meme.cgi), we identified the conserved motifs in the full-length banana MADS proteins with the following parameters: a maximum of 10 motifs and an optimum motif width of 6 to 50. We subsequently annotated all identified motifs using InterProScan (http://www.ebi.ac.uk/Tools/pfa/iprscan/). We identified the gene structures of banana MbMADSes using the GSDS program.
Quantitative real-time PCR analysis. Changes in the transcript levels of MbMADS75 and the eight interacting genes were evaluated by qRT-PCR on a Stratagene Mx3000P Real-Time PCR system with SYBR Premix Ex Taq (TaKaRa, Japan). The PCR amplification conditions for each reaction were as follows: 10 min at 95 °C, followed by 40 cycles of 10 s at 95 °C, 15 s at 50 °C, and 30 s at 72 °C. Relative expression levels of the target genes were estimated using the 2^(−ΔΔCt) method 74 . The reaction specificity of every primer pair was evaluated by qRT-PCR melting curve analysis, agarose gel electrophoresis, and sequencing of the PCR products (Supplementary Table S7). MaRPS2 (HQ853246) and MaUBQ2 (HQ853254) were used as internal controls to normalize the target gene expression 75 . Each treatment sample had a corresponding regularly watered control, with three independent biological replicates for each sample. We sampled treatment and control plants at every time point for expression analysis and compared the relative expression levels of genes at each treatment time point with those at the corresponding time point under normal conditions.
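The 2^(−ΔΔCt) calculation named above can be sketched as follows. This is a minimal illustration, not the authors' analysis script: the function name and the Ct values are hypothetical, and averaging the Ct values of the two internal controls before normalization is an assumption about how the two reference genes were combined.

```python
from statistics import mean


def relative_expression(ct_target_treated, ct_refs_treated,
                        ct_target_control, ct_refs_control):
    """Fold change of a target gene (treated vs. control) by the 2^(-ddCt) method.

    Assumption: the Ct values of the reference genes are averaged before use.
    """
    # dCt: target Ct normalized to the reference genes within each sample
    d_ct_treated = ct_target_treated - mean(ct_refs_treated)
    d_ct_control = ct_target_control - mean(ct_refs_control)
    # ddCt: treated sample relative to its control
    dd_ct = d_ct_treated - d_ct_control
    return 2.0 ** (-dd_ct)


# Hypothetical Ct values for one target gene and two reference genes
fold = relative_expression(
    ct_target_treated=22.0, ct_refs_treated=[18.0, 19.0],
    ct_target_control=24.0, ct_refs_control=[18.0, 19.0],
)
print(f"relative expression: {fold:.2f}")
```

With these made-up values the target amplifies two cycles earlier in the treated sample at unchanged reference Ct, which corresponds to a fourfold increase.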

Regulatory network construction.
Based on the B genome database 32 and transcriptome analysis, we selected MbMADS75, which was prominently expressed throughout the fruit development and ripening process, as the "from node" and its interacting proteins as the "to nodes" to establish a gene regulatory network diagram using Cytoscape software (version 3.4.0).
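One way to prepare such a "from node" and "to node" table for import into Cytoscape is a simple interaction format (SIF) edge list. The sketch below is an assumption about the data layout rather than the procedure actually used: only MbMADS75 and the bZIP identifier Mb_10_t14180.1 come from the text, while the other partner names and the "co_expressed" relation label are hypothetical placeholders.

```python
def build_edges(from_node, to_nodes, relation="co_expressed"):
    """Return (source, relation, target) triples from one hub to each partner."""
    return [(from_node, relation, target) for target in to_nodes]


def to_sif(edges):
    """Serialize edges as tab-delimited SIF lines: source, relation, target."""
    return "\n".join(f"{src}\t{rel}\t{dst}" for src, rel, dst in edges)


# bZIP identifier from the text; partner_2 .. partner_8 are placeholders
partners = ["Mb_10_t14180.1"] + [f"partner_{i}" for i in range(2, 9)]
edges = build_edges("MbMADS75", partners)
sif_text = to_sif(edges)
print(sif_text)
```

The resulting text can be saved with a .sif extension and loaded as a network file, giving a star-shaped network centered on MbMADS75.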