Evaluation of reference genes at different developmental stages for quantitative real-time PCR in Aedes aegypti

The mosquito Aedes aegypti (Ae. aegypti) is the most notorious vector of illness-causing viruses such as Dengue, Chikugunya, and Zika. Although numerous genetic expression studies utilizing quantitative real-time PCR (qPCR) have been conducted with regards to Ae. aegypti, a panel of genes to be used suitably as references for the purpose of expression-level normalization within this epidemiologically important insect is presently lacking. Here, the usability of seven widely-utilized reference genes i.e. actin (ACT), eukaryotic elongation factor 1 alpha (eEF1α), alpha tubulin (α-tubulin), ribosomal proteins L8, L32 and S17 (RPL8, RPL32 and RPS17), and glyceraldeyde 3-phosphate dehydrogenase (GAPDH) were investigated. Expression patterns of the reference genes were observed in sixteen pre-determined developmental stages and in cell culture. Gene stability was inferred from qPCR data through three freely available algorithms i.e. BestKeeper, geNorm, and NormFinder. The consensus rankings generated from stability values provided by these programs suggest a combination of at least two genes for normalization. ACT and RPS17 are the most dependably expressed reference genes and therefore, we propose an ACT/RPS17 combination for normalization in all Ae. aegypti derived samples. GAPDH performed least desirably, and is thus not a recommended reference gene. This study emphasizes the importance of validating reference genes in Ae. aegypti for qPCR based research.


Results and Discussion
Primer evaluation and amplification efficiency of candidate reference genes. Seven candidate genes from three functional classes were investigated: (i) structure-related genes: ACT and α-tubulin, (ii) ribosomal and protein-production genes: eEF1a, RPS17, RPL8 and RPL32, and (iii) metabolism-related gene: GAPDH. Primer pairs were evaluated through standard curve generation with serially diluted pooled cDNA. Efficiency(E) and linear regression coefficient (R 2 ) values are observed to determine the performance of the designed primers in detecting and amplifying cDNA at very high to very low concentrations. All recorded acceptable E values between 92.5 and 100.8% with R 2 values ranging from 0.979 to 0.997 (Table 1). Amplification specificity was displayed through the production of a singular peak in melt-curve analysis. Purified qPCR products were sequenced to show specificity and accuracy whereby (a) each primer produced a singular sequence output, and (b) the sequence aligns with the cDNA of the expected gene through BLAST (https://blast.ncbi.nlm.nih.gov/Blast.cgi). Sequences are available on GenBank's BankIt depository with the accession numbers KY000701 to KY000707. Post-run 2% agarose gel further confirmed single amplicon production (see Supplementary Figure S1B).
Expression levels and sample integrity. Expression levels were quantified and individual candidate variability in any developmental stage, cell culture, and throughout all assays are summarized as Box-Whisker plots in Fig. 1. The highest recorded Ct value amongst the seven genes was by RPS17 at 15.02, whereas the lowest was by α-tubulin at 33.55. RPS17 and α-tubulin each produced the highest and lowest mean Ct values of 17.86 and 23.83, respectively, when all sample types (n = 17) were taken into account. The other candidate genes were also expressed at high moderate to low moderate levels, with mean Ct values of 18.94, 19.61, 19.71, 20.12, 21.32, and 23.65 each to RPL32, RPL8, eEF1α, ACT, and GAPDH. The target gene CTPsyn had a mean Ct of 22.50, with values ranging from 18.99 to 26.32. This means that comparatively, CTPsyn displayed a narrower Ct variation than six out of the seven candidate genes i.e. ACT, eEF1α, α-tubulin, GAPDH, RPL8 and RPL32 ( integrity is inferred from the intrinsic variation (InVar) score as generated by the BestKeeper algorithm. Removal of samples with scores in excess of ± 3.0 is recommended 27 . In Var scores for most of the individual developmental stages were low and did not exceed the proposed exclusion value. Triplicate variability was acceptable for all samples. As expected, InVar scores were found to be excessively high when all Ct values were pooled. This strengthens the opinion that no single gene would satisfy all stages, and that each would likely have a unique set of reference genes most suitable for normalization.
BestKeeper analysis. BestKeeper estimates the standard deviation (SD) value of each candidate gene from raw Ct numbers. An SD > 1 signifies that the variations in expression of a gene within a sample of the same origin are high, and thus indicating its instability. Our data demonstrated that not all candidates were stable across all samples (see Supplementary Table S2). Expression appears to vary most within adult tissues. All barring RPS17 were unstably expressed in adult male samples. RPL8, α-tubulin, eEF1α and RPL32 showed high SDs in non-blood fed female adults. Additionally, ACT was shown to be unstable in 48 to 72 hour embryos; α-tubulin in 6 to 9 hour embryos; eEF1α in 48 to 72 hour embryos and fourth instar larvae; RPL8 throughout the 24 to 72 hour embryonic periods as well as cell samples; RPL32 in 24 to 48 hour embryos and both first instar and third instar larvae; and GAPDH in 48 to 72 hour embryos, all larval stages, and cell culture. The target gene CTPsyn gave relatively low SD values, only exceeding 1 in 48 to 72 hour embryos and Aag2 cells. Ranking of genes is based on the value given as BestKeeper vs Pearson correlation of coefficient. The closer this value is to 1, the greater the reliability of the gene. A third component of the program's statistical analysis is a P-value, where P < 0.05 indicates the correlation of a candidate gene to the BestKeeper index calculated as the geometric mean of the Ct values. Our data showed that for all instances where the BestKeeper vs Pearson correlation of coefficient value is above 0.67, the gene would be significantly correlated to the BestKeeper index. With genes carrying SD values of above 1 excluded, RPS17 is most reliable for the 48 to 72 hour embryos, first instar larvae and adult male stages; α-tubulin for 0 to 3 hour and 24 to 48 h embryos as well as fourth instar larvae; eEF1α for both 6 to 9 hour and 18 to 24 hour embryos, along with third instar larvae; ACT for the 9 to 12 hour embryonic stage as well as both non-blood fed and blood-fed adult female samples; RPL8 for 3 to 6 hour and 12 to 18 hour embryos, second instar larvae, and pupae; and RPL32 for Aag2 cell culture samples. However, for comparison purposes, all genes are included regardless of SD values. Rankings are shown in Table 2. geNorm analysis. geNorm determines the expression stability of selected candidate genes based on a data comprised of relative values, i.e. the degree of fold differences observed between Ct values of a sample set in relation to the lowest recorded value. Two assessment outcomes are provided by the software. The first is an average expression stability score as symbolized by M. The higher the M-value of a gene, the less stable it is perceived to be. This value should fall below 1.5. Rankings based on the M-value are in Table 3. Single-normalizer strategies can reliably apply RPL8 or eEF1α for embryo-derived samples of any time point within the first 24 hours, and either ACT or RPS17 for samples from 24 through to 72 hours. RPS17, α-tubulin or RPL8 appear suitable for larval stages; RPS17 or eEF1α for pupae samples; ACT or RPL32 for adult stages, and eEF1α or RPL32 for cell culture samples. A summarization of rankings in charts as provided by the software is as shown in Fig. 2A. The second outcome from geNorm is a pairwise variation or V value which estimates the effect of a gene addition event 10 . The proposed cut-off value is 0.15. Our data showed that for ten out of the seventeen sample types, two reference genes may be enough for normalization of target gene expression (Fig. 2B). In 0 to 3 hour embryos, the addition of a third gene is recommended for proper normalization. For the other developmental stages, none of the gene inclusion events resulted in satisfactory V values. NormFinder analysis. Similar to geNorm, the data utilized by this program is based on relative values, and not raw Ct data. The algorithm produces a stability value for each gene where a lower value indicates greater stability. The program does not make suggestions for a cut-off value 28 . Rankings formed by NormFinder are summarized in Table 4. ACT and eEF1α are interchangeable in terms of usability as the reference gene for embryos aged between 0 to 48 hours and all adult samples, whereas RPL32 is best for embryos aged between 48 and 72 hours. RPS17 is the best performing gene in both first and second instar larvae as well as pupae; α-tubulin for the third and fourth larval stages, and eEF1α for cell culture samples. GAPDH and α-tubulin are not recommended for embryonic, larval nor cell samples, and RPS17 should not be the normalizer for adults. In the absence of group identifiers, it is presumed that the two genes with the lowest stability value within a sample set would provide the best combination for two-reference gene normalization strategies 29 . Consensus list of reference genes. Consensus rankings are obtained through geometrically averaging the weights assigned to each gene (in the form of stability values from geNorm and NormFinder, and a function of 1-((BestKeeper vs. Pearson correlation coefficient value) from BestKeeper) as generated by the three programs. All genes are included regardless of BestKeeper SD values. Results are summarized in Table 5. The three top-ranked genes in the consensus list for any developmental stage are typically considered to be most reliable e.g. eEF1α, RPS17, GAPDH for pupal samples; α-tubulin, RPL8 and RPS17 for both the third and fourth instar stages; and RPL32 alongsideeEF1α andRPS17 in Aag2 cell culture samples. However, as these vary greatly from one developmental stage to the next, reliability across sample types is also assessed on the basis of overall frequency at which a gene appears amongst the top-three. ACT and RPS17 are the most reliable, with a frequency of 0.216(11/51) and 0.196 (10/51), respectively.

Validation of consensus rankings using CTPsyn target gene.
To evaluate consensus ranking outcomes, an assumptive analysis was undertaken. Relative expression of the target gene CTPsyn was investigated ineach sample type. This gene was chosen due to its expected expression stability. Therefore, it is assumed that the 'true' fold change value for any developmental stage is derived solely from the △ Ct of CTPsyn (2 ΔCt CTPsyn ). Fold-change as predicted by a normalizing gene(s) is calculated with the Livak method i.e. 2 −ΔΔCt (see Supplementary Table S3). Outcomes of normalization against single, top-two consensus-ranked, top-three consensus-ranked, as well as ACT/RPS17 gene combinations are as shown in Table 6. For thirteen out of the sixteen developmental stages as well as cell samples, a single-normalizer strategy estimated CTPsyn fold-change most effectively. RPS17 was the best performing gene for six points of development including in blood-fed females, despite being ranked last within its consensus. RPL8, RPL32 and α-tubulin are each the best normalizer for two, whereas eEF1α and GAPDH each normalized best for one developmental stage. This suggests that if chosen carefully, application of one reference gene may be sufficiently robust for gene expression evaluation. Our analysis also demonstrated that the usage of sequentially ranked genes in combinations of three is preferable to two in estimating fold-change values for most sample types. However, an ACT/RPS17 pairing outperformed both two and three-gene combinations in eleven developmental stages as well as cell culture, suggesting that these two genes together could provide proper normalization regardless of sample origin.

Discussion
As qPCR increasingly becomes the method of choice in gene expression-focused studies, the need for reliable reference genes grows ever more urgent 6,8 . Misinformed selection of reference genes could lead to false positives or false negatives, effectively masking the true nature of a gene's expression patterns 30    has been recognized as an important vector of viral diseases for years, it is only recently that research involving the insect has shifted focus to the host back from the viruses it transmits. The Ae. aegypti genome is now fully sequenced and annotated 31 , and its developmental transcriptome is described 32 . Nonetheless, studies with Ae. aegypti involving qPCR often adopt genes already exhaustingly utilized as the reference genes for insects of other genera for normalization purposes. In most, the lack of dependable information regarding reference genes for Ae. aegypti has limited normalization to be against a singular reference gene, a practice which could lead to inaccurate data interpretation 33,34 . Now that it is thrusted into the spotlight as a major proponent of global epidemic threats, in-depth molecular research into the mosquito's life cycle is vigorously ongoing and thus, it is high time that a detailed panel of reference genes uniquely catering to Ae. aegypti is defined. In this study, a total of sixteen candidate genes including 18S, ATP5C1, PGK1, TBP and RNAPII were initially identified as candidates. However, these were gradually eliminated due to several factors, including the lack of    introns; primer design difficulties due to consecutive base runs spanning exon-exon boundaries; and low basal expression levels (Ct ≥ 30) even at a high starting cDNA concentration discommending the gene from usage as a reference. Remaining genes such as ribosomal proteins RPL8, RPS17 and RPL32 were chosen on the basis of frequency of appearance in Ae. aegypti literature 19,21,[35][36][37][38] . GAPDH, ACT, α-tubulin and eEF1α were selected as available transcriptome data have demonstrated their level expression across the diverse developmental states of this insect 32 .
Our study sampled subgroups comprising of sixteen pre-determined stages of development in Ae. aegypti as well as Aag2cells. Seven candidate genes were selected and their stability within each developmental stage is evaluated. Our data resulted in three differing rankings from the three evaluation programs employed, i.e. BestKeeper, geNorm, and NormFinder. Nonetheless, some degree of congruence was clearly demonstrated. Results from geNorm and NormFinder were similar in seven developmental stages as well as cell culture. The rankings generated from BestKeeper and NormFinder were alike in 3 to 6 hour and 18 to 24 hour embryos, whereas overlaps were seen between BestKeeper and geNorm in the 12 to 18 hour embryonic samples and blood-fed female adults. Good overall congruency was observed for male and non-blood fed female adults across all three programs. Conversely, major disagreements appeared in candidate gene rankings in 48 to 72 hour embryos (see Supplementary Table S3). As a whole, results generated by BestKeeper tended to contradict those of geNorm and NormFinder both, as was especially noticeable for results in second and third instar larvae. Disparities are to be expected, as each program is based off its own unique algorithm 39,40 . Moreover, many of the candidate genes in this study fall within the same functional groups. Algorithmic dissimilarities and the resulting differences in sensitivity each program would have towards co-regulated reference genes may have led to this observation 10,27,28,41 . BestKeeper also has more considerations in the form of InVar, SD, and P-values, all of which contribute to the BestKeeper vs. Pearson correlation coefficient value. These compounding factors result in the obvious differences in final outcomes.
Findings additionally reiterate the notion that there is no universal reference gene stable enough to counterbalance all age and developmental-point imposed variations in gene expression 42 . The candidate genes investigated here showed considerable variation in expression across different samples. Four displayed a range exceeding 10 Ct, with total standard deviation values of above ± 2.2. Most exhibited a larger Ct value range than the target gene used for validation i.e. CTPsyn. Although this could be due to the possibility that CTPsyn -as a synthetase enzyme producing the constantly in-demand CTP molecule -may on its own be a reference gene, the significantly high standard deviations as seen in several samples emphasizes the necessity for an assessment of reference genes in accordance to situational parameters. However, it is impractical to evaluate a large number of candidate genes for every minor qPCR procedure. With developmental stages grouped by tissue and morphological characteristics, the consensus ranking suggests certain two-gene combinations for normalization. Throughout the critical first 24 hours of embryonic development, a pairing between eEF1α and either ACT or RPL8 is optimum. A combination of RPS17 and ACT is best for embryos aged 24 to 72 hours. RPS17 could also provide ideal normalization when simultaneously applied with α-tubulin during the first, third and fourth instar larval stages; with RPL8 in second instar larvae; and with eEF1α with pupal tissue. ACT along with α-tubulin should normalize Table 6

. Degree of difference between 'true' CTPsyn fold-change value avs as estimated with either singlenormalizer genes, or different combinations of top-ranked genes, as denoted. 2 −ΔCt
CTPsyn is the presumptive 'true' fold change. Δ Δ Ct is the product of change when overall geometric mean (GM) of CTPsyn Ct values is compared to Ct value GM of CTPsyn in individual developmental stages.
Scientific RepoRts | 7:43618 | DOI: 10.1038/srep43618 adult female expression levels regardless of blood-feeding status, whereas RPL32 and eEF1α are suggested for samples of male adults and Aag2 cells. The genes least recommended as reference for egg or embryo-derived tissue samples are α-tubulin and GAPDH. For any larval stage, the usage of RPL32, eEF1α or GAPDH is unadvisable. RPL32 should be avoided when evaluating gene expression in samples from pupal tissue; RPS17 in male and blood fed female adult tissue; and GAPDH in samples derived from non-blood fed female adults as well as Aag2 cells. The frequency with which GAPDH appeared at the lower end of consensus rankings was surprisingly high, as the reliability of the gene as a reference in arthropods has been shown time and time again [43][44][45] . Nevertheless, given the high standard deviation and overall instability of gene as displayed by our data, we are comfortable in suggesting caution when utilizing GAPDH in regards to Ae. aegypti.
When normalizing expression of a target gene against a reference(s), the objective is to minimalize the normalization factors (NF) value. In geNorm, stepwise inclusion of the next best gene is given as a score denoted as pairwise variation or V value. When this falls below a 0.15 threshold, it suggests that the addition event will only slightly contribute towards decreasing the NF value, and may thus be unnecessary. The V values as generated from our data indicate that in ten out of the seventeen sample subgroups, the combination of two genes should be sufficient for normalization. However, our analysis with the target gene CTPsyn demonstrated that a top-two gene combination was the best normalizer only for second instar larval tissue. Although this same strategy performed well in general for all developmental stages and cell culture, in fourteen of the seventeen subgroups, a singular gene adequately provided proper normalization, though the most effective gene in these instances is never the top-ranked gene. This pairwise variation analysis additionally displayed that no consecutive gene inclusion events satisfied the 0.15 threshold value in 6 to 9 hour embryos, 24 to 48 hour embryos, first, third and fourth instar larvae, as well as adult male tissue samples. Several studies have suggested that as the threshold cutoff point serves as a guide rather than a rule, observing the changing trends as gene inclusion proceeds is more indicative of the ideal number of normalizing genes than the actual value themselves 46,47 . Others propose that the utilization of the three genes with the lowest M values should most appropriately assist with normalization 10,[48][49][50] . It is therefore of great interest to note that when validated against CTPsyn, triple-combos of top-ranked genes consistently gave more accurate estimations of fold-change than double-combos, and together with findings from previous studies 41,51 , indicate that applying the suggested minimum of three genes may simultaneously be the most practicable and useful strategy.
Although certain candidates may indeed be able to serve as the sole reference gene for certain developmental stages, our validation analysis also showed that normalization power across most samples did not adhere to its consensus ranking. Top genes often deviate quite significantly from the 'true' fold-change value on their own. Selecting the most preferable reference gene may thus devolve into a matter of guesswork and luck. Furthermore, as the data set becomes vulnerable to the variables encountered along the qPCR process 10,17 , dependence on a singular reference gene for normalization remains an undesired practice. This is especially true when considering the complex nature of a sample set such as ours. The sixteen developmental stages in this study represent a series of transitions in the Ae. aegypti growth environment i.e. from terrestrial (egg or embryo), to aquatic (larval and pupal stages), and back to terrestrial (adult). Though in theory a 'reference' gene should not be influenced by such circumstantial stress, this is often not the case. Changes undergone by the organism throughout these periods may exacerbate gene expression variability amongst the sample subgroups. This weakens the gene's stability, thus impairing its ability to reduce the NF value. Such an observation supports the claim that the suitability of reference genes could be experimentally exclusive, and that a panel of candidates should be simultaneously assessed within the confines of the variables of an assay.
As aforementioned, this may not however be feasible in scientific practice. Therefore, we proposed ACT/RPS17 combination for general usage in normalization practices for Ae. aegypti, regardless of tissue sample origins. This pair of genes appeared with the highest frequencies within the consensus top-three ranks of this study, alluding to their overall stability and dependability in countering developmentally-influenced variation. ACT encodes for a component of the cytoskeleton. Its importance in upholding structural integrity ensures that the gene is expressed at moderately high levels within every cell type. Over the years, ACT has acquired a rather bad reputation due to several instances whereby its seemingly excessive usage as a reference gene was proven to be unjustified. Regardless, ACT has been ranked as the most stable reference gene in a number of validation studies 43,52,53 . The gene RPS17 produces S17, a protein component of the 40S ribosomal subunit. It has long been used as the reference gene in Ae. aegypti transcriptional profiling 36,[54][55][56] , and is relied upon as the reference gene in numerous studies involving insects [57][58][59] . Readily available transcriptomics data exhibited minimal ACT expression variation throughout Ae. aegypti development 32 . However, their levels within the mosquito head appear to be moderately affected by rhythmic circadian cycle changes 60 . Conversely, the same studies reported rather significant RPS17 fluctuations during developmental progression, but minimal changes due to light-dark switches. These clashes in situations where ACT/RPS17 are more likely to vacillate perhaps allows the genes to counterbalance one another as they act together to normalize gene expression.
Nonetheless, improvements could be made in the future through increasing the number and characteristic variability of candidate genes in further reference gene validation studies. In Drosophila, it has been shown that a larger panel of reference genes is required as the sample size and inherent complexity grows [61][62][63] . Selection of gene types is also crucial. Here, the candidate genes can be separated into only three categories: (i) genes linked to ribosomal functions and/or protein production, (ii) structural genes producing components of the cellular protein scaffold, and (iii) metabolism-related genes. In the future, including other common reference genes for evaluation such as genes encoding ubiquitin proteins, phosphatases e.g. PP2A, and oxygen-radical metabolizing proteins e.g. SOD and CAT, could improve normalization and consequently the integrity of gene expression studies in Ae. aegypti [64][65][66] .

Conclusion
To our best knowledge, this is the first study of its kind in Ae. aegypti. Through the utilization of algorithms specifically conceptualized for reference gene validation, a suitable panel of genes most robust for normalization are identified for each developmental stage and Aag2 cell culture. Our results show that although a singular reference gene may suffice for interpretation of target gene expression in most stages, a combination of at least two genes is recommended to minimize the effects of variables upon the data set and for consistency of normalization. Application of three genes for normalization is optimum. Based on consensus rankings, the proposed combinations are RPL8, ACT and eEF1α for early embryos between 0 to 24 hours post-oviposition; ACT, RPS17 and eEF1α for embryos 24 to 48 hours of age; RPS17, RPL32 and ACT for late embryos (aged between 48 to 72 hours); RPS17, α-tubulin, and ACT are recommended for first instar larvae; RPS17, RPL8 and GAPDH for second instar larvae; α-tubulin, RPL8 and RPS17 for both third and fourth instar larval samples; eEF1α, RPS17 and GAPDH for pupal samples; RPL32, eEF1α and ACT for adult male tissue; ACT, α-tubulin, and RPS17 for non-blood fed female adults; ACT, α-tubulin and RPS17 for blood-fed female adults; and finally, RPL32 along with eEF1α and RPS17 for Aag2 cell samples. GAPDH is ranked lowest for most developmental stages, and is thus not to be used as a reference gene. Overall, inferred stability suggests ACT and RPS17 as the most dependably expressed reference genes and therefore, an ACT/RPS17 combination is expected to provide robust normalization for genetic expression studies in all Ae. aegypti derived samples. These findings will benefit normalization practices in Ae. aegypti, and may additionally serve as a resource for screening reference genes in closely-related arthropods.

Materials and Methods
Rearing and sample collection. Food was removed from 5 to 7 DAE adults a full day before being blood-fed using artificial membrane blood feeding system. 20 females were collected 6 hours after the blood meal as post-blood meal samples. The remaining mosquitoes were returned to normal conditions for females to lay eggs. Around 250 embryos were collected at each of the following time points post-oviposition: 0 to 3 hours (0-3 h), 3 to 6 hours (3-6 h), 6 to 9 hours (6-9 h), 9 to 12 hours (9-12 h), 12 to 18 hours (12-18 h RNA extraction and quality assurance. This study attempts to adhere to the Minimum Information for Publication of Quantitative Real-Time PCR guidelines or MIQE 17 . Samples collected throughout rearing were immediately stored in TRIzol ® reagent (Invitrogen ™ , Ambion ™ , Life Technologies) at − 20 °C. Total RNA extraction was done within two days of collection with a protocol previously described for mosquito tissue samples 67 . As much of the culturing media was aspirated away from Aag2 samples, prior to RNA extraction as described by Abcam ® 68 . Extracts were quantified on the Hellma ® Analytics TrayCell system in the SmartSpec Plus Spectrophotometer (Bio-Rad Laboratories, California). Those with an A260:A280 value between 1.75 and 2.05 were used immediately for downstream procedures. All showed clear 18S banding and minimal smearing in 1.0% agarose gel. An RNA gel with randomly chosen first bioreplicate extracts of differing degrees of freshness is shown as Supplementary Figure S1A. Extracts were kept at − 20 °C for the duration of the experiment.
Reference gene selection, primer design, and primer validation. All genes are also commonly utilized reference genes in qPCR protocols. Primers for RPS17 were previously described 19 . Others were designed on the Primer 3 software (bioinfo.ut.ee/primer3-0.4.0/). Restrictive parameters for primer selection were: melting temperatures between 59.0 °C and 61.0 °C, GC content between 40 and 60%, nucleotide length between 18 and 24, and amplicon length of between 150 to 225 bases. Regions spanning exon-exon boundaries were specified for each primer pair. Other settings were kept at default. PCR product was confirmed in silico on the Sequence Manipulation Suite website (http://www.bioinformatics.org/sms2/) to be a singular amplicon from only the mature mRNA, and not the genomic DNA sequence. All genes, accession numbers, primer sequences and amplicon size used for this study is listed in Table 1.
Reverse transcription and qPCR. Reverse transcription with 1 μ g of total RNA was carried out in 20 μ l reactions using the iScript Reverse Transcription Supermix (Bio-Rad Laboratories, California; cat. no. 1708840) according to manufacturer's protocol. Undiluted cDNA from all nine-developmental stages were pooled and serially diluted to the factor of 5 (1:1, 1:5, 1:25, 1: 125, 1:625 and 1:1875) for standard curve generation on the BioRad CFX96 qPCR platform. Optimum qPCR reactions were carried out in 10 μ l reactions using iTaq ™ Universal SYBR ® Green Supermix (Bio-Rad Laboratories, California; cat. no. 1725120), ~1 ng total cDNA, and 500 nM each of forward and reverse primers. The standard run protocol is initial denaturation at 95 °C for 2. Data mining and selection of reference gene candidates with algorithms: geNorm, BestKeeper, and NormFinder. To assess the stability of candidate reference genes, publicly available evaluation tools i.e.
BestKeeper (http://www.gene-quantification.com/bestkeeper.html) 27,70 , geNorm 10 and Normfinder 28 were utilized. The BestKeeper algorithm has been adapted for usage in Excel. The program generates a ranking through repeated pairwise correlation and regression analysis of a gene against all the other tested candidates. Up to ten genes can be evaluated at any one time for expression variations in up to one hundred samples. Raw data of Ct values (annotated as CP) and PCR efficiency of the primers were used to determine the correlation between each candidate gene and the index, expressed in the form of a coefficient of determination 70 . For geNorm and NormFinder, raw data was converted into linear values relative to the lowest Ct recorded for each candidate gene. In geNorm, the stability of a gene is assessed through the consistency of its expression ratio across all samples. The software generates both a stability value i.e. M, and a pairwise variation value i.e. V. M represents the average variation in transcript levels of a gene in comparison to all other candidate genes, achieved through a repeated process of stepwise exclusion commencing from the least stable gene. Pairwise variation estimates the effect of including another gene 10 sequentially as per the established M-value rankings through the formula of V n /V n + 1. A threshold of 0.15 is set; a V value below this would mean that an additional reference gene would not improve normalization. NormFinder is a mixed-effects model statistical analysis which estimates the stability value of a gene as a function of the approximate expression variation it would impose onto the target gene data during normalization 28 . The lower this value is, the less variation one would introduce to a normalization exercise should the candidate gene be used as a reference. It also estimates the variation between sample subgroups of the sample set. The BestKeeper vs. Pearson correlation coefficient value, geNorm M value, and NormFinder stability value are perceived as weightage. Geometric means i.e. central tendencies of these weightages for a candidate gene forms the basis for generation of a consensus ranking.
Evaluation of results through target gene normalization. CTPsyn of Ae. aegypti was utilized as the target gene for candidate reference gene evaluation. This gene encodes for the enzyme CTP Synthetase, which converts UTP, ATP and glutamine into cytidine triphosphate (CTP) molecules 71 . Human isoforms of CTPsyn have been identified as potential reference genes 72 . Although the expression levels of the gene in insects and Ae. aegypti in particular are undefined, as a housekeeping gene, it is expected to be expressed stably across all developmental stages. For this reason, CTPsyn is chosen as the 'target' gene for the purpose of this analysis. Primer sequences for CTPsyn were forward 5′ TTCCCCATTGCTACCCGAAC and reverse 5′ GAAAACCCTTCCCCAGCGTA. The expected product size is 180 bp. 'True' fold-change is based solely on CTPsyn, in the function of 2 −ΔCt CTPsyn . Fold changes estimated from normalization with (a) different genes, and (b) differing combinations and number of genes was calculated according to the Livak method i.e. 2 −ΔΔCt 73 . The degree of difference between the value predicted by normalizer(s) against fold change of CTPsyn △ Ct is the basis for evaluation of the effectiveness of the candidate reference gene(s) in normalization.