Stable reference genes for RT-qPCR analysis of gene expression in the Musa acuminata-Pseudocercospora musae interaction

Leaf pathogens are limiting factors in banana (Musa spp.) production, with Pseudocercospora spp. responsible for the important Sigatoka disease complex. In order to investigate cellular processes and genes involved in host defence responses, quantitative real-time PCR (RT-qPCR) is an analytical technique for gene expression quantification. Reliable RT-qPCR data, however, requires that reference genes for normalization of mRNA levels in samples are validated under the conditions employed for expression analysis of target genes. We evaluated the stability of potential reference genes ACT1, α-TUB, UBQ1, UBQ2, GAPDH, EF1α, APT and RAN. Total RNA was extracted from leaf tissues of Musa acuminata genotypes Calcutta 4 (resistant) and Cavendish Grande Naine (susceptible), both subjected to P. musae infection. Expression stability was determined with NormFinder, BestKeeper, geNorm and RefFinder algorithms. UBQ2 and RAN were the most stable across all M. acuminata samples, whereas when considering inoculated and non-inoculated leaf samples, APT and UBQ2 were appropriate for normalization in Calcutta 4, with RAN and α-TUB most stable in Cavendish Grande Naine. This first study of reference genes for relative quantification of target gene expression in the M. acuminata-P. musae interaction will enable reliable analysis of gene expression in this pathosystem, benefiting elucidation of disease resistance mechanisms.

for RNA sequencing (RNAseq) and transcriptome analysis. To date, a number of large scale studies have been conducted in Musa to further understanding of defense responses and genes involved in response to different phytopathogens and pests, such as Fusarium oxysporum f. sp. cubense tropical race 4 [6][7][8] and the root-knot nematode Meloidogyne incognita 9 . In relation to foliar pathogens, to date, Illumina-based RNAseq transcriptome profiling of the host-pathogen interaction has been conducted with the Sigatoka leaf spot pathogen P. fijiensis 10 , with a characterization of unigenes expressed during interaction with P. musae determined through 454 sequencing of cDNA 11 .
In order to accurately quantify and validate gene expression derived from in silico NGS data, the reverse transcription quantitative real-time polymerase chain reaction (RT-qPCR) is today recognized as the benchmark approach for such work, given its' specificity, sensitivity and rapidity 12,13 . For this, however, RNA sample quality, cDNA preparation and dilution, and PCR sample preparation and reaction execution must be standardized. In accordance with the Minimum Information for Publication of Real-Time Quantitative PCR Experiments (MIQE) guidelines 14 , normalization of expression of each target gene to that of a reference gene is also required to correct data for variations in samples that occur during cDNA preparation, with reference genes also subject to the same effect of cDNA quality 15,16 . For such normalization, the employment of appropriate reference genes for each organism, tissue type and time point is therefore essential. The ideal characteristic of a reference gene is a stable expression for a tissue type, regardless of developmental stage or experimental treatment conditions 15,17 . Genes that are involved in basal cell activities in plants have been assumed to offer constitutive expression across tissues and independent of experimental conditions. Commonly employed genes have included actin and tubulin, given their role in the cell cytoskeleton, genes involved in protein synthesis (elongation factor) and protein degradation (ubiquitin), and genes involved in glucose metabolism (glyceraldeide-3-phosphate dehydrogenase). With such housekeeping genes, however, a stable transcript response is not necessarily guaranteed across different experimental conditions, as observed in numerous plant species [18][19][20] and can lead to inaccuracies in results, masking true interpretation of gene expression 16,21 .
In order to facilitate the identification of stable reference genes, a number of mathematical algorithms have been developed, including the programs geNorm, NormFinder and BestKeeper [22][23][24] . These have been widely employed in reference gene development across monocotyledonous plant species, including, for example, rice 25 , maize 26 , sugarcane 27 and Setaria 28 .
In the case of Musa, Chen and colleagues 29 conducted a study towards validation of reference genes for use in Cavendish bananas, with focus on fruit and peel at different developmental stages, after hormone treatment, and following fruit exposure to abiotic and biotic stress 29 . Genes were also validated using pooled plant tissues comprising root, leaf, flower, peel and pulp. Similarly, Podevin et al. 30 investigated the stability of reference genes for analysis of gene expression in leaf and meristem tissues across varieties grown in vitro, under greenhouse conditions and in leaf disc material 30 . Zhang et al. 31 also identified stable reference genes for root tissues in cultivars during interaction with Fusarium oxysporum f. sp. cubense race 1 and race 4 strains 31 . These data all highlight the importance of selection of multiple reference genes for normalization in accord with plant genotype, tissues, development stage and specific experimental conditions.
In this study, the stability of eight candidate reference genes for transcript normalization in whole plant leaf tissues was analyzed in two M. acuminata genotypes, contrasting in resistance to the foliar pathogen P. musae. The reliability of the most and least stable genes was determined through normalization of the relative expression of the target gene encoding RuBisCO activase (RCA). The selected reference genes will enable accurate gene expression analysis in this important pathosystem.

Results
Analysis of primer specificity and efficiency. A total of eight candidate genes were selected as potential reference genes in this study. Information on the candidate genes, in relation to accession number, primer sequences and amplification are summarized in Table 1. Specificity of each of the primers to a single gene locus was confirmed by dissociation curve analysis, with a single peak observed for each primer pair at the expected primer annealing temperature (Fig. 1), together with an absence of any signal in negative controls lacking template cDNA. PCR amplification efficiency values ranged from 90.55 to 98.21% across the candidate genes, as calculated by LinRegPCR software.
Analysis of expression levels of candidate reference genes. For each candidate reference gene, quantification cycle (Cq) values were presented for the expression data from each sample. Cq values ranged from 19.6 to 29.6, with broad differences observed between the candidate reference genes, highlighting the importance of statistical approaches for stability ranking in identification of accurate reference genes. The ACT1 gene (amplified with primer pair 2 [pp2]) displayed the lowest Cq value, indicating that it was more expressed than the other candidate genes. Conversely, the UBQ1 gene displayed the highest Cq values, indicating lowest gene expression. The Cq value ranges for the reference genes evaluated in the different Musa samples are displayed in Fig. 2, with a summary of all Cq values obtained for each treatment for the potential reference genes provided in Supplementary Table S1. expression stability. The performance of the candidate reference genes was assessed in the four biological samples, divided into three experimental sets; global pooled M. acuminata Calcutta 4 and Cavendish Grande Naine leaf material (from non-inoculated and inoculated plants); pooled M. acuminata Calcutta 4 leaf material (from non-inoculated and inoculated plants); and pooled M. acuminata Cavendish Grande Naine leaf material (from non-inoculated and inoculated plants). Stability in expression and ranking of the eight candidate reference genes, amplified with nine primer pairs, was determined using mathematical algorithms geNorm, NormFinder www.nature.com/scientificreports www.nature.com/scientificreports/ and BestKeeper. Re-ranking of genes was conducted using RefFinder. Summarized results from all analyses are presented in Table 2.
On the basis of geNorm analysis, many candidate reference genes showed stability values below the default limit of 0.5 when employed on data for different M. acuminata samples (Table 2, Fig. 3). UBQ2, RAN, UBQ1 and EF1α genes displayed stability values below 0.5 when employed on data for pooled M. acuminata Calcutta 4 and Cavendish Grande Naine leaf samples (from both non-inoculated and P. musae-inoculated, sample name 'Global'), indicating greatest stability. In the case of samples from M. acuminata Calcutta 4 leaf material, either from non-inoculated or P. musae-inoculated plants (sample 'Global C4'), all genes displayed stability values below the accepted threshold. APT and UBQ2 were highlighted as the most stably expressed genes, followed by UBQ1 and ACT1pp2. For samples from M. acuminata Cavendish Grande Naine leaf material (non-inoculated or P. musae-inoculated) (sample 'Global CAV'), all genes once again displayed M-values below the accepted threshold, with RAN and GAPDH genes most stable, followed by α-TUB and EF1α.
Calculation of pairwise variation was employed to determine the minimum number of reference genes required for stable normalization of gene expression data. According to Vandesompele and coworkers 22 , a conventional normalization of RT-qPCR based on a single reference gene can potentially lead to a more than 6-fold normalization error. Given this, geNorm automatically takes into account that at least two reference genes are employed for normalization. For all M. acuminata sample types (Global, pooled non-inoculated or P. musae-inoculated M. acuminata Calcutta 4 and Cavendish Grande Naine leaf samples; Global C4, pooled non-inoculated or P. musae-inoculated M. acuminata Calcutta 4 leaf samples; Global CAV, pooled non-inoculated or P. musae-inoculated M. acuminata Cavendish Grande Naine leaf samples; C4 mock, pooled non-inoculated M. acuminata Calcutta 4 leaf samples; C4 infected, P. musae-inoculated M. acuminata Calcutta 4 leaf samples; CAV mock, pooled non-inoculated M. acuminata Cavendish Grande Naine leaf samples; and CAV infected, P. musae-inoculated M. acuminata Cavendish Grande Naine leaf samples), the pairwise variation value of V2/3 was below the default cut-off value of 0.15, indicating that two reference genes are sufficient for accurate normalization of the gene expression data (Fig. 4). Values of V above the default limit, indicate that one or several extra reference genes would be required to improve RT-qPCR analysis 22 . Here, data revealed that the inclusion of further reference genes would not have a significant effect on accurate normalization of target gene expression.
NormFinder-based analysis provided gene stability ranking data with a certain overlap to that obtained with geNorm ( Table 2). When considering the combined dataset 'Global' , based on pooled M. acuminata Calcutta 4 and Cavendish Grande Naine leaf samples (non-inoculated or P. musae-inoculated), the four most stably expressed genes were in agreement, with UBQ1, UBQ2, EF1α and RAN genes ranked in order of greatest stability. In the case of data samples from M. acuminata Calcutta 4 leaf material, either from non-inoculated or P. musae-inoculated plants (sample 'Global C4'), by contrast, only the UBQ2 gene was highly ranked with both algorithms. In the case of analysis with the algorithm NormFinder, this gene ranked as the most stably expressed, followed by α-TUB, EF1α and RAN. When data was analyzed according to samples from M. acuminata Cavendish Grande Naine leaf material (non-inoculated or P. musae-inoculated) (sample 'Global CAV'), three of the four most highly ranked stable genes were again in agreement with geNorm data, with RAN ranked as most stable, followed by GAPDH, α-TUB and UBQ2.  www.nature.com/scientificreports www.nature.com/scientificreports/ ranked stable genes were in agreement, with the EF1α gene ranked most stable, followed by ACT1pp2, RAN and UBQ2. When considering the data samples 'Global C4' , three of the most highly ranked stable genes were in agreement with results obtained using NormFinder and two in agreement with geNorm-derived ranking. Here, the RAN gene ranked as most stable, followed by APT, UBQ2 and EF1α. In the case of data derived from the samples 'Global CAV' , two genes most highly ranked in terms of expression stability were in agreement with ranking data obtained using NormFinder and another two in agreement with ranking data obtained using geNorm. The four most highly ranked genes, in order of expression stability, were α-TUB, ACT1pp2, APT and GAPDH.  www.nature.com/scientificreports www.nature.com/scientificreports/ RefFinder was employed to enable a comprehensive re-ranking of expression stability in the candidate reference genes based on integration of the data from the previous three algorithms. When taking into account the complete 'Global' dataset, for both M. acuminata Calcutta 4 and M. acuminata Cavendish Grande Naine leaf material (non-inoculated and P. musae-inoculated), the four most highly ranked candidate reference genes, in order of stability, were: UBQ2, EF1α, RAN and GAPDH. In the case of the 'Global C4' dataset, the most stable candidate reference gene was APT, followed by UBQ2, RAN and EF1α. For the 'Global CAV' dataset, the most stable candidate reference gene was α-TUB, followed by RAN, UBQ2 and EF1α (Table 2).
A summary of all candidate reference gene rankings with the mathematical algorithms is provided in Supplementary Table S2. testing selected reference genes for normalization of M. acuminata genes expressed in leaves. For validation of reference gene ranking for leaf tissues, the relative expression of RuBisCO activase (RCA) was normalized against the reference genes. This catalytic chaperone is involved in modulating RuBisCO, a key enzyme in the plant photosynthetic pathway. Although three RCA genes were identified in the M. acuminata DH Pahang reference genome, only RCA3 could be amplified by conventional PCR in both M. acuminata Calcutta 4 and M. acuminata Cavendish Grande Naine. When considering the complete 'global' samples, the mathematical programs identified UBQ2 and RAN together as the most stable genes, with α-TUB and ACT1pp1 as the least stable reference genes. The former pair were evaluated for normalization of RT-qPCR expression profiles in M. acuminata Calcutta 4 and M. acuminata Cavendish Grande Naine leaf samples subjected to light and dark treatments. In both M. acuminata genotypes, RCA3 expression was down-regulated in samples from leaves covered with aluminium foil. Transcript levels of RCA3 in the three biological replicates showed that when employing the two most stable reference genes, variation between samples was reduced, whereas the same analysis using the two least stable genes introduced considerable variation in fold change in relative expression in samples across the biological replicates (Fig. 5).

Discussion
In this study, genes encoding proteins involved in basal cell activities in plants were evaluated as potential reference genes for normalization of gene expression data in M. acuminata leaf material through a multi-algorithm based analysis of expression stability.
The stability of reference genes in Musa has so far been evaluated across a limited number of specific experimental conditions. Chen and coworkers 29 tested candidate reference genes in Cavendish subgroup material, with stability determined across plant tissues, developmental stages, and following hormone treatment, abiotic stress and biotic stress after fruit inoculation with Colletotrichum musae 29 . Podevin and colleagues (2012) also evaluated expression stability of reference genes in genotypes subjected to different physiological growth conditions, osmotic stress, treatment with acetone and following biotic stress with the leaf pathogen Pseudocercospora fijiensis in cultivar M. acuminata Tuu Gia 30 . Zhang and colleagues 31 also evaluated the stability of reference genes in Cavendish Grande Naine root tissues infected with Fusarium oxysporum f. sp. cubense 31 . The present work is the first describing the stability of reference genes for leaf materials during the interaction with P. musae, within the context of accurate analysis of gene expression in genotypes contrasting in resistance to the pathogen.
Multi-algorithm analysis of candidate reference gene expression stability revealed differences in the top-ranked genes observed between the results generated with each algorithm, although there was greater consistency in identification, and therefore exclusion, of the least stable candidate genes. Differences in gene stability ranking between algorithms is common [45][46][47][48] , reflecting data analysis methods for each algorithm.
In our study, combinations of stable reference genes were identified for each condition employed. When considering all samples (Global set), analyses with all the mathematical algorithms consistently revealed UBQ2 and RAN ranked amongst the most stable genes, with α-TUB and ACT1pp1 ranked as the least stable. Previously, Chen and co-workers 29 also recommended UBQ2 and RAN as the most stable general reference genes for Cavendish bananas. Although UBQ genes have been widely employed in RT-qPCR normalization across different plant species 49 , including monocot species, variation in their stability is apparent according to physiological conditions employed. In Oryza sativa, for example, UBQ5 was among the most stable reference gene for plants exposed to hormone treatment, saline and drought stresses, whereas UBQ10 was among the least stable genes observed 50 . UBQ1 was also identified as a stable reference gene for rice plants subjected to temperature, saline and hormone treatment, fungal infection and defense signaling compounds 51,52 . In Saccharum sp., UBQ1 also presented stable expression in leaf tissues 27 . RAN is a ubiquitously expressed gene that encodes a small soluble GTP-binding protein which plays roles in vegetative growth and stress tolerance in Arabidopsis 53 . Such high ranking of RAN as a reference gene has also been described in Cucumis melo 54 and Corchorus capsularis 55 . According to Podevin and colleagues (2012), by contrast, ACT1, EF1α and β-TUB were identified as the most stable reference genes for leaf samples of a Cavendish genotype, with L2 and 25 S the least stable genes. In the same work ACT1 and β-TUB were identified as the most stable genes for leaf samples inoculated with P. fijiensis 30 . If we consider our sample groups, where leaf material was inoculated with P. musae, APT and GAPDH were always among the most stable genes for Calcutta 4 samples, while GAPDH and EF1α were the most stable genes for Cavendish Grande Naine. Such results highlight how expression stability of reference genes varies according to genotype, tissue or pathogen employed in biotic stress investigation.
Primer design must also be considered in order to avoid unspecific amplification and formation of secondary structures that may hamper the polymerase chain reaction. Since actin is commonly used as a reference gene for RT-qPCR, we designed two primer pairs for ACT1, each annealing at different positions within the target gene sequence, in order to test whether they would present different stability patterns. Both primer pairs resulted in similar size amplicons, presented similar Tms (°C), and showed similar amplification efficiencies (%) ( Table 1). Results revealed, however, that gene expression following amplification with primer pair ACT1pp2 was more stable than that observed using primer pair ACT1pp1. Such results highlight that in addition to gene selection as reference for RT-qPCR analysis, target gene region for amplification, primer annealing temperature, and amplicon size can all influence the efficiency of an experiment 56 . In our study, we restricted amplicon size difference to a maximum of 66 bp, with all selected genes transcribed by RNA polymerase II, the same enzyme involved in gene expression of most of the target genes studied by our group. Such standardization of annealing temperature, amplicon size and consideration of RNA polymerase enzyme for reference and target gene transcription is the first for Musa reference genes for RT-qPCR. www.nature.com/scientificreports www.nature.com/scientificreports/ RuBisCO is a key enzyme of the photosynthetic CO 2 assimilation and photorespiratory pathways 57 . This enzyme comprises a large subunit, encoded by a plastidial gene, and smaller subunits, encoded by nuclear genes 58,59 . Activity can be regulated by RuBisCO activase (RCA), a chloroplast protein encoded by the nuclear genome. This enzyme facilitates dissociation of the sugar phosphates from RuBisCO, decreasing CO 2 concentration for catalyzing the carbamylation of its catalytic lysine 60,61 . Previous studies have shown that RCA participates in the developmental stages of the leaves, increasing activity in mature tissues 62 . Silencing of an RCA isoform has also been shown to result in a reduction in photosynthesis 63,64 . Jiang and coworkers 65 identified two RCA genes in sweet potato, where mRNA levels varied according to the intensity of light or heat 65 . Here, we identified three RCA genes in the M. acuminata reference (DH Pahang) genome, with RCA3 amplifiable by conventional PCR using genomic DNA and cDNA from M. acuminata Calcutta 4 and Cavendish Grande Naine genotypes. In order to validate the stability of the reference genes, we examined the relative expression of RCA3 in both genotypes subjected to light/dark treatments. When normalization of RT-qPCR was performed using the most stable reference genes (UBQ2 and RAN), a reduced relative expression of RCA3 was observed for dark treated leaf samples, with little variation between cDNA samples for biological replicates. By contrast, when normalization was conducted with the least stable reference genes (α-TUB and ACT1pp1), an over-estimation of transcript levels was observed in certain samples. These results clearly indicated that different reference genes influenced the relative expression levels of target genes under the specific conditions employed in this study. Similar such inaccurate estimation of relative expression of target genes during validation of potential reference genes has been shown across different plant taxa, for example the grass species Urochloa brizantha, and Coffea spp 66,67 .
In summary, this is the first report detailing validation of reference genes in M. acuminata genotypes contrasting in resistance to P. musae. As determined using the program geNorm, of the eight candidate reference genes tested, a minimum number of two genes were always sufficient to provide the necessary accuracy for RT-qPCR analysis under the experimental conditions employed. Considering the frequency of the top ranked stably expressed reference genes, UBQ2 and RAN are appropriate for normalization in studies that consider P. musae-inoculated or non-inoculated leaf samples for both M. acuminata genotypes Calcutta 4 and Cavendish Grande Naine. The various reference gene sets developed here for the genotypes contrasting in resistance to the Sigatoka leaf spot pathogen P. musae will enable accurate analysis of gene expression and pathways activated in this pathosystem, advancing understanding of host responses to this important foliar pathogen. For cDNA synthesis, pools containing equimolar amounts of total RNA from three biological replicates were prepared. In total, eight pools were prepared, one for each experimental condition (Table 1). A total of 1 μg of total RNA was reverse transcribed to cDNA using Super Script II RT and Oligo(dT) [16][17][18] primers (Invitrogen, Carlsbad, CA, USA). primer design. For the design of potential reference genes with stable expression in M. acuminata leaf tissues, candidate genes were selected that encode proteins involved in basal cell activites in diverse plant species, including Musa (Table 1). OligoPerfect TM Designer (Thermo Fischer Scientific, Waltham, MA, USA) was employed for design of specific primers for each candidate reference gene. Expected target amplicons varied from 80 to 150 bp, with all primers designed to specifically amplify predicted exon-exon junctions, guaranteeing amplification from cDNA rather than potential contaminant genomic DNA. Primer specificity was initially tested by electronic PCR against a local database of RNAseq data for Musa and against the reference whole genome sequence for M. acuminata, genotype DH Pahang at Phytozome (phytozome.jgi.doe.gov) 10 . Specificity and efficiency was subsequently tested against cDNA originating from leaf tissues of M. acuminata CAV and C4 (data not shown). Primer sequences that resulted in specific amplification are listed in Table 1 Baseline correction and efficiency determination. Raw ΔRn data was analysed using the LinRegPCR program, version 2017.1, with all calculations performed according to the manufacturer's instructions. Baseline fluorescence was corrected by reconstruction of the log-linear phase. Data was then grouped and applied to determine RT-qPCR mean efficiency for each gene. The same program was subsequently used to calculate the average quantification cycles (Cqs) per sample 68 . expression stability and relative expression analysis. Expression data for each candidate reference gene was visualized as a quantification cycle (Cq) value from each RNA sample, as obtained from LinRegPCR analysis. This value represents the number of amplification cycles required to reach a default threshold value for detection during the exponential amplification phase of the PCR reaction.

Methods
The expression stability of each candidate reference gene was determined using the algorithms geNorm 22 , NormFinder 23 and BestKeeper 24 , according to the default software parameters. GeNorm and NormFinder use the Cq data to calculate Relative Quantities (RQ) by application of the formula RQ = 2 (ddCq) . GeNorm evaluates expression stability by calculating an average M-value, which is defined as the pairwise variation of one gene against all other genes. Those genes presenting the lowest M-values below a threshold of 0.5 have the most stable expression. The program also calculates a V-value, a pairwise variation (Vn/n + 1) between each reference gene, to determine the optimal number of genes required for normalization. In the case of NormFinder, variation between groups of genes at both the inter-and intra-group level is calculated with ANOVA and used to generate a stability value (SV) for each reference gene, with the most stable genes presenting the lowest SV values below the default limit of 0.5. As such, this method can be more appropriate for providing estimates of expression variation from subsets of different sample types 23 . BestKeeper is a tool that calculates the geometric mean of Cq values, with the most stable genes indicated by high correlation coefficients and low standard deviations. Raw Cq values are used to calculate the geometric mean and rank the reference genes according to the lowest stability values. A global re-ranking of the candidate genes stability was conducted using the web-based tool RefFinder (http://www. leonxie.com/referencegene.php). RefFinder allows integration of data generated with all the algorithms geNorm, NormFinder, Bestkeeper and delta Ct method, without considering RT-qPCR efficiencies. Through assigning a weight to each individual gene, a geometric mean weight of the candidate genes can be calculated, which can then be used to rank the stability of each individual reference gene.
RCA3 expression analysis. The youngest fully emerged leaf from 3 month old plants was subjected to a photoperiod of 12/12 (L/D) at 25 °C for 7 days. In a second group, leaves were covered in aluminium foil paper and maintained at 25 °C for 7 days. After this period, leaves were flash frozen in liquid nitrogen and preserved at −80 °C until use. Three independent replicates were collected for each leaf sample. Total RNA, cDNA synthesis and real time PCR were performed as described previously.

Data Availability
All data generated and analyzed during the study is included in the published article and its Supplementary Information files.