Introduction

Choroid plexus (ChP) is a highly vascularized tissue located within the four brain ventricles. It is comprised of multiple cell types, including: epithelial, endothelial, mesenchymal and immune cells, with epithelial cells constituting the majority1. The epithelial layer encases connective stroma and a highly permeable fenestrated capillary network2,3, producing 70%–80% of the cerebrospinal fluid (CSF)4,5 and forming the blood-cerebrospinal fluid barrier (BCSFB)4,5. ChP is an understudied area in neuroscience but it is attracting more attention as its developmental function is gradually elucidated and its role in neuropathology is increasingly noticed6. Historically, ChP-CSF were only known to function as “cushion” (physical protection through buoyancy) and “sink” (removal of brain metabolites through CSF drainage) of the brain4. Recently, it has been demonstrated that CSF composition and ChP-derived factors, such as signalling and trophic molecules, play indispensable functions for brain development, brain homeostasis and adult neural stem cell niches. For example, it has been demonstrated that ChP has the intrinsic ability to sense external changes associated with CNS activity7,8 and it is also well known that ChP produces and releases Otx2, an essential factor implicated in both the onset and the closure of visual critical period9. In addition, ChP also controls the transportation of many blood-derived factors (nutrients, proteins, hormones, inorganic compounds, etc.) into the brain and it is emerging as the neuroimmune gateway regulating central nervous system (CNS) immune-surveillance6. Interestingly, changes in ChP-CSF structure and function have been linked to neurodegenerative diseases such as Alzheimer’s disease10, to neurodevelopmental disorders11 such as autism-spectrum disorder12 and schizophrenia13 and also to neuroimmune disease such as multiple sclerosis14. This increasing interest in ChP parallels the need for gene expression studies of ChP in different experimental contexts.

Despite the advances in high-throughput transcriptomic technologies like microarray and RNA sequencing15, reverse transcription quantitative real-time PCR (RT-qPCR) remains a popular method for measuring mRNA expression level, especially when the number of target genes is moderate. RT-qPCR is conceptually and technically simple, economical and fast yet still highly sensitive, accurate and reproducible16. Considered as the “gold standard” for gene expression analysis, RT-qPCR is utilised to validate microarray and RNA sequencing results17. Two major RT-qPCR quantification methods have been developed and are widely used: absolute and relative/comparative quantification. Absolute quantification allows the inference of transcript number from a standard/calibration curve, which is constructed as RT-qPCR fluorescence signals corresponding to serial dilutions of a known sample (cloned or synthetic cDNA). This method, therefore, relies on the externally-built curve and fails to consider inter- and intra-sample variabilities. Relative quantification addresses this limitation by measuring target gene expression relative to the expression of a reference gene within that sample. The ideal reference genes used in a RT-qPCR experiment are required to have constant expression levels regardless of biological differences and experimental conditions18. Housekeeping genes, constitutive genes required for the maintenance of basic cellular function, are therefore, often used for this purpose. However, such ideal reference genes have yet to be discovered19, which has led to a rise in literature evaluating the stability of reference genes specific for each species, tissue, cell type and condition of interest20,21,22,23,24.

To our knowledge, reference gene stability has not yet been assessed in ChP, potentially undermining RT-qPCR experiments on this tissue. Here, we selected and examined the stability of 12 well-established housekeeping genes: Actb, Atp5f1 (also known as Atp5pb), B2m, Gapdh, Hprt1 (also known as Hprt), Pgk1, Rer1, Rpl13a, Rpl27, Sdha, Tbp, Ubc in ChP of Mus musculus (house mouse). Using not only descriptive statistics but also a combination of 4 most popular stability assessment algorithms for reference genes: BestKeeper25, DeltaCq26, geNorm18 and NormFinder27, we tested the 12 reference genes in 2 experimental panels: Developmental and Light/Dark rearing panels and identified selective combinations of stable reference genes. Finally, we validated their effect as a normalisation factor to the expression of selective ChP markers, such as Transthyretin (Ttr)28 and Orthodenticle homeobox 2 (Otx2)29.

Results

Candidate reference genes, qPCR amplification experiment and descriptive statistics

The 12 candidate reference genes used in this study were selected based on their distinct cellular function and on their extensive use in neuroscience researches18,22,23,30,31. In particular, we selected genes belonging to different functional classes to reduce the possibility that their response to the same experimental condition is co-regulated. We examined genes involved in the cellular cytoskeleton (Actb), in transcription or translation (Tbp, Rpl13a, Rpl27), in cellular metabolism (Gapdh, Sdha, Hprt1, Pgk1, Atp5f1), and in protein degradation (Ubc) in addition to ubiquitous and common cellular components, such as the major histocompatibility complex class I component (B2m) and a structural membrane protein of the Golgi apparatus (Rer1). Detailed information for each primer pair is presented in Table 1. To ensure there was no undesired product during amplification, we first examined primer specificity in silico using NCBI PrimerBlast32 and later confirmed it by melting curve analysis. The results show one single sharp peak for each primer pair in wells containing cDNA and no signal in negative control wells, indicating target-specific amplification (Supplementary Figure S1).

Table 1 Summarized information of 12 candidate reference genes and 2 target genes. Tm: melting temperature (calculated by NCBI Primer-Blast with default settings for Primer Parameters).

To better evaluate the stability of these housekeeping genes in various conditions, we established two experimental panels: (i) Developmental panel, consisting of ChP tissues from mice at postnatal day (P)0 (at birth), P15 (at eye opening), P30 (early puberty) and P60 (adulthood)33; (ii) Light/Dark rearing panel, consisting of ChP tissues from P60 mice reared in normal Light/Dark condition (Ctrl), completely in dark from birth (D), completely in dark then exposed to light for 1h (D-1hL), 4 h (D-4hL) and 24 h (D-24hL) (Supplementary Figure S2, Supplementary Table S1).

Due to the large amount of reactions per panel, each panel was divided into three 384-well plates with experimental setup following sample maximization approach, where each plate is prioritized to include all samples rather than all genes34. Inter-run calibration, therefore, is required to minimize variations between different runs (instrument-, reagent-, experimenter-related variations)34. Inter-run calibrators (IRC) were assigned to the amplification of Gapdh (Supplementary Figure S3).

We used quantification cycle (Cq) with the same meaning as cycle threshold (Ct ) or crossing point (Cp ) as recommended by RMDL consortium35. To acquire PCR efficiency (E) and coefficient of determination (R2) without standard curve, we used LinRegPCR36 which performs baseline correction for each sample individually and calculates E and R2 by fitting a regression line to a subset of data points in the log-linear phase. To combine the data of 3 plates from the same panel without inter-run variations, LinRegPCR output from 3 plates were then loaded into Factor-qPCR37, which determines the multiplicative factors and removes the systematic bias between different RT-qPCR runs. All data discussed below has been sequentially processed with both LinRegPCR and Factor-qPCR.

Descriptive statistics for the RT-qPCR amplification of both panels is presented in Table 2 and Fig. 1. The coefficient of determination (R2) shows how well the semi-log plot of Cq-log [cDNA concentration] fits to the linear regression model and indicates the presence of RT-qPCR inhibitors. We found that all primer pairs had R2 as 0.999 or 1.000, which is greater than 0.98 as recommended38 (Table 2). The amplification efficiency ranged within acceptable values, from 80.15 to 85.85 (Table 2). Mean Cq values and standard deviations (SD) were also calculated in Table 2. The range of Cq values (from 20.37 ± 0.49 to 27.93 ± 0.47) showed that all reference genes were expressed in ChP tissues (Table 2). We also confirmed the gene expression by in situ hybridization results from Allen Mouse Brain Atlas39. Actb was the most abundantly expressed gene in both experimental panels with the lowest mean Cq: 20.78 ± 0.44 in the Developmental and 20.37 ± 0.49 in the Light/Dark rearing panel, respectively. On the contrary, Tbp had the lowest expression level, which was reflected in its highest Cq of nearly 28. The genes that had the smallest and biggest variations were Rpl27 (SD = 0.38) and Sdha (SD = 0.85) in the Developmental panel and Pgk1 (SD = 0.41) and Rer1 (SD = 0.58) in the Light/Dark rearing panel (Table 2).

Table 2 Descriptive statistics of Cq values and qPCR reactions. SD: standard deviation; R2: coefficient of determination; E: amplification efficiency; R2 and E were calculated using LinRegPCR software version 2020.0.
Figure 1
figure 1

Descriptive statistics of 12 candidate reference genes in two sets of choroid plexus samples. Box-and-whisker plot showing Cq values of 12 housekeeping genes examined in the Developmental panel, displayed in different age groups (a) or displayed as all samples in the panel (c). Cq values of 12 housekeeping genes examined in the Light/Dark rearing panel, displayed in different rearing conditions (b) or displayed as all samples in the panel (d). Whisker indicates value range, the line inside the box was plotted at median value. (n = 5 mice per group (see Material and Method, Supplementary Table S1 for more details). Data is presented as Mean ± SD, adjusted p-values are indicated as *p ≤ 0.05, **p ≤ 0.01, ***p ≤ 0.001, ****p ≤ 0.0001. Cq: quantification cycle.

Figure 1a, b visualised Cq values for each housekeeping gene at different postnatal ages and in specific rearing conditions. In the Developmental panel, Cq values of different age groups fluctuated differently depending on genes. In particular, Cq of Gapdh, Atp5f1, Pgk1, Hprt1 and Sdha appear to be higher at P0 compared to later ages, whereas Cq values of the remaining housekeeping genes, did not show visible differences between P0 and later ages. Cq of Atp5f1 and Ubc showed a large variation at P60, while the remaining genes appeared more consistent at this stage (Fig. 1a). In the Light/Dark rearing panel, Cq values of the 12 genes exhibited a common pattern across all the rearing conditions. All reference genes appear to be quite stable in the Ctrl and in the D-24hL conditions, whereas the D group showed a high variability across different genes. Generally, Cq values gradually decreased in the group order: Ctrl > D > D-1 h > D-4 h > D-24 h and this applied for all genes in the panel (Fig. 1b).

To better visualize how the expression of each gene varies within each experimental paradigm, we plotted Cq values of each gene as an entire set of samples per panel (Fig. 1c,d). These graphs showed that the variation of Cq values within each gene was maximum around 2 cycles per gene. Regardless of the experimental panels, Cq values for each gene appeared to fall into common ranges. In particular, Tbp and Rer1 always have Cq between 26 – 29 whereas the other genes have the majority of Cq values between 20 – 25.

Expression stability of candidate reference genes

To evaluate the expression stability of the 12 candidate genes, we applied the four most popular algorithms: BestKeeper25, DeltaCq26, geNorm18 and NormFinder27. We started by using RefFinder40, an online tool which incorporates all these algorithms and enables a fast, convenient analysis. However, some discrepancies were previously reported when comparing the stability results calculated by RefFinder and the original software of geNorm and NormFinder (BestKeeper and DeltaCq results were consistent)20. We, therefore, used the standalone geNorm (through qBase34 software) and NormFinder (R-based version) to verify RefFinder results, confirming minor differences in the stability ranking (Supplementary Table S2). All data discussed below were acquired using geNorm and NormFinder results through their original software whereas the BestKeeper and DeltaCq results were directly extracted from RefFinder.

geNorm algorithm calculates expression stability (M) and the ideal number of needed housekeeping genes (V) based on the concept that two ideal reference genes should have identical expression ratio in all samples, regardless of experiment conditions or tissue/cell line of origin18. A lower M value, calculated as average pairwise variation between one certain gene and other candidate reference genes, indicates a more stable expression. M lower than 0.5 is usually observed for stably expressed genes in homogeneous samples34. V value, displayed as Vn/n+1, is the pairwise variation between two normalization factors. In brief, it calculates the necessity to include one more reference gene to the previous set of more stable reference genes. A V value greater than 0.15 suggests that the added gene has a significant effect and should be included for a more reliable normalisation factor18. In both experimental panels, M values were well below 0.5 for all reference genes demonstrating their stable expression in ChP. In the Developmental panel, Rpl27 (M = 0.098) and Rpl13a (M = 0.104) were the two most stable genes, whereas Sdha (M = 0.452) and Hprt1 (M = 0.48) were the least stable ones. In the Light/Dark rearing panel, Hprt1 (M = 0.115) and Rpl13a (M = 0.115) were the two most stable genes; on the contrary Sdha (M = 0.264) and B2m (M = 0.279) were the least stable genes (Fig. 2a,c; Table 3). As all V values were below 0.15 and did not show significant difference within each panel, the use of the two most stable reference genes (Rpl27, Rpl13a in Developmental panel and Hprt1, Rpl13a in Light/Dark rearing panel) were considered sufficient (Fig. 2b,d).

Figure 2
figure 2

geNorm stability analysis. M value represents the gene expression stability in the Developmental (a) and in the Light/Dark rearing (c) panels. V value suggests the optimal number of reference genes per qPCR experiment in the Developmental (b) and in the Light/Dark rearing (d) panels.

Table 3 Stability values and rankings of ChP from the Development and the Light/Dark rearing panels calculated by four algorithms: geNorm, NormFinder, BestKeeper, DeltaCq and the overall ranking. Gene’s stability decreases from 1st to 12th rank.

NormFinder applies the “model-based approach to estimation of expression variation”27 instead of pairwise comparison approach like the other methods. The algorithm estimates both intra- and inter-group variation, then combines them into the stability value ρ, which enables the addition of two sources of variation and confer a measure of systematic error. As a result, genes with smaller ρ exhibit higher stability27. In the Developmental panel, Gapdh was the most stable gene (ρ = 0.161) followed by Rer1 (ρ = 0.207), whereas Sdha (ρ = 0.469) and Hprt1 (ρ = 0.473) were the most inconsistent genes. For the Light/Dark rearing panel, Hprt1 was the most stable gene (ρ = 0.121) followed by Atp5f1 (ρ = 0.123). The least stable genes were Sdha (ρ = 0.190) and B2m (ρ = 0.212) (Table 3).

BestKeeper suggests that genes with SD greater than 1 are considered inconsistent25. Fortunately, all 12 genes studied have SD smaller than 1 and qualified for subsequent analysis. The algorithm computes Pearson’s correlation coefficient (r) between each gene and BestKeeper Index (geometric mean of Cp values), and genes with greater r values have more stable expression25. In the Developmental panel, the most stable genes were Rpl13a (r = 0.965) and B2m (r = 0.953) whereas the least stable genes were Atp5f1 (r = 0.651) and Ubc (r = 0.472). In the Light/Dark rearing panel, the most stable genes were Rpl27 (r = 0.962) and Gapdh (r = 0.962), there least stable genes were Sdha (r = 0.874 and B2m (r = 0.818) (Table 3).

The DeltaCq method, like geNorm, calculates pairwise comparisons of genes but with simpler mathematical calculations and without compromising accuracy. Specifically, it computes mean SD of differences in Cq (\(\Delta\)Cq) values of a given gene compared to all other genes in the studied list26. Genes with smaller mean SD have more stable expression. In the Developmental panel, the most stable genes by this method were Gapdh (Mean SD = 0.39) and Rer1 (Mean SD = 0.40), the least stable genes were Sdha (Mean SD = 0.60) and Hprt1 (Mean SD = 0.62). In the Light/Dark rearing panel, the range was smaller and the most stable genes were Hprt1 (Mean SD = 0.24) and Rpl27 (Mean SD = 0.25); at the bottom of the ranking were Sdha (Mean SD = 0.34) and B2m (Mean SD = 0.35) (Table 3).

To visually illustrate how consistent the results of the described four algorithms are, we plotted the graph of genes and stability values in Fig. 3 with lower stability values indicating more stable expression. In general, the variations in gene stability between methods appeared to be smaller in the Light/Dark rearing panel than in the Developmental panel. When comparing stability rankings calculated by the 4 methods, the results were more consistent at both ends of the chart (Light/Dark rearing panel) or within the lower values (Developmental panel). The middle of both graphs was slightly fluctuating, indicating that the algorithms are less consistent to each other when the difference in the gene stability is less significant. This inconsistency in stability ranking of the 4 methods (Table 3, Fig. 3) is the result of differences in either mathematics or in the concepts of gene stability on which the algorithms were established. Similar inconsistencies have also been reported in other studies20,24,30,41,42.

Figure 3
figure 3

Conformity in rankings of reference genes by 4 methods. Stability ranking of 12 reference genes in the Developmental (a) and in the Light/Dark rearing (b) panels, calculated per assessment method (geNorm, triangle; NormFinder, hexagon; BestKeeper, square; DeltaCq, diamond) and as geometric mean of the 4 rankings (Overall, full black circle) (see Table 3 for more details).

To draw a conclusion on stability testing results, overall ranking was determined by calculating geometric mean of rankings (GMR) from all four methods (Table 3, Fig. 3), with smaller GMR indicating more stable expression. We identified Rer1 (GMR = 2.783) and Rpl13a (GMR = 2.783) as the two most stable genes in the Developmental panel, whereas Sdha (GMR = 9.453) and Hprt1 (GMR = 10.847) were the two least stable genes. In the Light/Dark rearing panel, the two most stable genes were Hprt1 (GMR = 1.189) and Rpl27 (GMR = 2.213), whereas the two least stable ones were Sdha (GMR = 10.462) and B2m (GMR = 11.465) (Table 3).

Altogether these data suggest that the expression level of frequently used housekeeping genes could vary depending on different experimental conditions, supporting the notion that a preliminary study to find optimal reference genes should be performed whenever possible20,21,30,43.

The stability ranking of genes in both panels is summarised in Fig. 4, which was divided into the five top ranked genes and the rest. Within the five top ranks, we identified three housekeeping gene candidates, Tbp, Rpl13a and Rpl27, which were considered relatively stable across the two studied conditions and would also be used for the following validation experiments.

Figure 4
figure 4

Summary of gene stability ranking. The 2 columns illustrate gene stability order of the 2 experimental panels with decreasing stability from top to bottom. Three genes (bold) were among the top 5 most stable for both panels.

The effect of different reference genes on relative expression of target genes

To determine the impact of reference gene choice on gene expression results, we used different combinations of housekeeping genes to calculate the expression of Ttr in the Developmental panel, and Otx2 in the Light/Dark rearing panel. Ttr, encoding transporter protein transthyretin (formerly called prealbumin), is a ChP marker28 and its expression is expected to change among groups in the Developmental panel. Orthodenticle homeobox 2 (Otx2), is a homebox gene encoding for a transcription factor synthesized and secreted by ChP29, which has proven roles in regulating critical period in the visual system44,45,46,47. We, therefore, selected Otx2 as a good target gene for the Light/Dark rearing panel.

For averaging the reference genes, we used geometric mean rather than arithmetic mean as it is suggested to offer better control over possible outliers and abundance differences between different genes18. We normalised Ttr expression to the two most stable genes in Developmental panel, Rer1, Rpl13a (Fig. 5a), to the most common reference gene, Gapdh (used in many ChP studies48,49,50) (Fig. 5b), to the two least stable genes in the same panel, Sdha, Hprt1 (Fig. 5c), and to different combinations of any two among three stable genes, Rpl13a, Rpl27, Tbp (Fig. 5d). Normalization with Rer1 and Rpl13a (the most stable) nearly doubled the Ttr relative expression level between P0 and P15, followed by a steadier rise until P60 (Fig. 5a). Although it is still possible to see the increase in Ttr expression throughout development with the commonly used reference Gapdh, the difference was less obvious and less statistically powerful. In addition, the statistical difference between P0 and P15 was lost (Fig. 5b). When the least stable pair Sdha and Hprt1 was used, Ttr relative expression level was flattened and there was no difference between the four developmental stages (Fig. 5c). Finally, Fig. 5d shows the comparison between the best developmental reference genes [Rer1, Rpl13a] and 3 pairs of good reference gene pan-panels [Rpl13a, Rpl27], [Rpl13a, Tbp] and [Rpl27, Tbp]. Although with moderately higher fold change expression of Ttr, the later 3 pairs gave highly similar results compared to using [Rer1, Rpl13a] (Fig. 5d), which demonstrates their suitability for being used as suboptimal reference genes in the Developmental panel. Finally, we tested whether using Gapdh in combination with another more stable gene, namely Rer1, Rpl13a or Tbp, would improve the normalization of Ttr mRNA expression. Despite relatively lower fold change values, the results showed a good “rescue effect” with expression patterns similar to those seen by normalising to the most stable genes [Rer1, Rpl13a] (Supplementary Fig. S4a).

Figure 5
figure 5

Relative expression of Ttr normalised to different reference genes. Ttr relative expression compared to the most stable reference genes in the Developmental panel: Rer1, Rpl13a (a); to Gapdh (b) and to the least stable genes in the Developmental panel: Sdha, Hprt1 (c). Ttr relative expression normalized to Rer1, Rpl13a versus different combination of the 3 common stable genes Tbp, Rpl13a, Rpl27 (d). Data is presented as Mean ± SD, adjusted p-values are indicated as *p ≤ 0.05, **p ≤ 0.01, ***p ≤ 0.001, ****p ≤ 0.0001.

Similarly, in the Light/Dark rearing panel when we normalized Otx2 expression to Hprt1 and Rpl27, the two most stable genes, we noticed a reduction of Otx2 expression in the D group compared to the Ctrl group. However, Otx2 level was steady across all dark rearing conditions, D, D-1hL, D-4hL and D-24hL (Fig. 6a). Normalisation to Gapdh gave heavily distorted results with Otx2 expression levels unchanged between the Ctrl and the D samples, and increased expression in D-1hL, D-4hL and D-24hL samples (Fig. 6b). Strikingly, when Sdha and B2m (the least stable genes) were used Otx2 expression was significantly increased in all dark rearing groups (D, D-1hL, D-4hL and D-24hL) compared to the Ctrl group. Otx2 mRNA level was highest at D-1hL (Fig. 6c). Finally, Fig. 6d displays the comparison between the best genes in the Light/Dark rearing panel [Hprt1, Rpl27] and 3 pairs of good reference gene pan-panels [Rpl13a, Rpl27], [Rpl13a, Tbp] and [Rpl27, Tbp]. Although there were some differences in the results, the overall trend of Otx2 mRNA expression level was maintained across different experimental groups, which shows their competency as suboptimal reference genes for this panel. Again, we tested whether using Gapdh in combination with another more stable gene, namely Hprt1, Rpl27 or Tbp, would improve the normalization of Otx2 mRNA expression. Unlike the case of Ttr, the addition of 1 more stable gene to the reference pair did not improve Otx2 measurement to the pattern obtained using the most stable genes [Hprt1, Rpl27] (Supplementary Fig. S4b), which could be due to the difference in the stability rankings of Gapdh between the 2 panels. In fact, Gapdh ranked 2nd in the Developmental panel and 6th in the Light/Dark rearing panel, which suggests that its expression might be affected by different light/dark rearing regimes (Fig. 4).

Figure 6
figure 6

Relative expression of Otx2 normalised to different reference genes. Otx2 relative expression compared to the most stable reference genes in the Light/Dark rearing panel: Hprt1, Rpl27 (a); to Gapdh (b) and to the least stable genes in the Light/Dark rearing panel: B2m, Sdha (c). Otx2 relative expression normalized to Hprt1, Rpl27 versus different combination of the 3 common stable genes Tbp, Rpl13a, Rpl27 (d). Data is presented as Mean ± SD, adjusted p-values are indicated as *p ≤ 0.05.

Altogether, these data emphasized the significant impact of reference gene choice on the expression levels of target genes. Our data also suggest that Rpl13a, Rpl27 and Tbp are relatively stable housekeeping genes in ChP across two experimental paradigms and that any combination among them has the potential to be a good choice of reference genes for ChP.

Discussion

ChP is emerging as an underestimated but important brain tissue, playing crucial roles in development, homeostasis and protection of the CNS. Therefore, the growing body of ChP research would benefit from reproducible and accurate protocols for measuring ChP gene expression. For RT-qPCR, this requires reference genes whose expression is stable across developmental stages and/or treatment conditions. In this study we investigated the expression stability of 12 common housekeeping genes (Actb, Atp5f1, B2m, Gapdh, Hprt1, Pgk1, Rer1, Rpl13a, Rpl27, Sdha, Tbp and Ubc) in ChP across developmental stages (Developmental panel) or brain activity (Light/Dark rearing panel). We analysed expression data of housekeeping genes using both descriptive statistics (Table 2, Fig. 1) and more sophisticated analysis by implementing specific algorithms built on different perceptions of housekeeping gene stability (Table 3, Figs. 2, 3). Our data indicate that different experimental paradigms require selective reference genes, confirming the importance of identifying suitable reference genes in relative quantitative RT-qPCR.

An initial evaluation of the housekeeping gene stability using descriptive statistics, measuring mean Cq and SD already highlighted variations between groups in the same panel and across panels (Fig. 1). For instance, Cq values for Tbp, Rer1, Rpl13a and Rpl27 were relatively stable across developmental stages, while Atp5f1, Hprt1, Pgk1 and Sdha showed significantly higher Cq for P0 versus older ages. This method is usually a tempting way of evaluating gene stability due to its simplicity but it is valid only to a certain extent. In fact, its apparent simplicity belies the biological and technical variations. In order to find a more robust measure of gene stability, we employed 4 widely used application packages: geNorm18, BestKeeper25, DeltaCq26, and NormFinder27. Differences in the methodology used by each algorithm may cause a few inconsistencies and make the result “relative” (Table 3, Fig. 3). For example, in the Light/Dark rearing panel, all algorithms agreed that Sdha and B2m were the least stable genes, 3 of the 4 algorithms ranked Hprt1 as the most stable gene but Rpl27 received 4 different rankings from the 4 methods (although all within the top 4) (Table 3). Therefore, in order to incorporate results from all 4 methods, we took geometric means of the 4 rankings to generate overall final stability values, which conclude that Rer1, Rpl13a were the 2 most stable genes in the Developmental panel while Hprt1 and Rpl27 were the 2 most stable genes in the Light/Dark rearing panel.

Another important issue to take into consideration is the number of reference genes that is sufficient to obtain reliable results without wasting materials. The use of one single reference gene is generally not advisable as Vandesompele et al. (2002) showed that normalisation using only one reference gene would lead to erroneous results up to 3.0 fold in 25% of cases and 6.4 fold in 10% of cases18. geNorm’s V values demonstrated that the optimal number of reference genes for ChP in each experimental panel is 2 when selected from the most stable genes (Fig. 2b, d).

We also noticed that a subset of genes, Rpl13a, Rpl27 and Tbp were among the top 5 most stable genes in both panels (Fig. 4). Interestingly, the use of a combination of these genes across the two panels was enough to maintain the expression pattern as observed when using the most stable genes. In fact, the normalized expression level of Ttr and Otx2 was comparable when using the most stable pairs for each panel or a combination of 2 among the 3 commonly stable genes, Rpl13a, Rpl27, Tbp, demonstrating that these are suboptimal but acceptable reference genes in both panels (Figs. 5d, 6d). On the contrary, when comparing the expression level normalized to the most stable, the least stable, and Gapdh, we always visualized a stark contrast, showing how significant the choice of reference genes could alter target gene expression readings (Figs. 5a–c, 6a–c).

Standing out as commonly stable genes in both panels, Tbp, Rpl13a and Rpl27 are all involved in transcription/translation machinery. While Tbp encodes TATA-box binding protein – an important part of the eukaryotic transcription initiation complex51, Rpl13a and Rpl27 encode ribosomal protein L13a and L27, which are components of the 60S ribosome52. Although it is not always the case, Tbp and ribosomal proteins often appeared as stable candidates in many other reference gene analyses18,42,53. However, despite their high stability rankings, we do not recommend a reference set only containing Rpl13a and Rpl27 as they belong to the same family and may be subject to co-regulation. On the other side, genes involved in cellular metabolism, such as Atp5f1 (encoding subunit B in peripheral stalk of mitochondrial ATP synthase)54, Sdha (encoding an enzyme of tricarboxylic acid cycle)55, Pgk1 and Gapdh (encoding enzymes of the glycolytic pathway)56,57 appear to be quite unstable in ChP across the conditions examined, suggesting that they may be modulated by the used experimental paradigms. In general, metabolic activities have been shown to change in response to external manipulation, such as photic manipulations58 and developmental stages59, which may explain why Atp5f1, Sdha, Pgk1 and Gapdh expression is relatively unstable in this study and tend to rank at the bottom. Another common family of reference gene includes structural genes, such as Actb (encoding actin protein of cytoskeletal structure) and tubulin (encoding microtubules). It is well-known that tubulin genes are essential for the development and function of neurons60, however it has been reported that tubulin mRNA can be unstable under physiological changes61; therefore, we decided to not include it in our study. On the other hand, we did include Actb, one of the most popular gene in the literature. However, in our experimental condition it only ranked 7th in the Developmental panel and 10th in the Light/Dark rearing panel, which is in agreement with many other studies showing that structural cellular components, such as Actb are not stable housekeeping genes and therefore not good candidate reference genes30,41,62.

Finally, Gapdh, the most popular housekeeping gene in the literature, was not an ideal candidate in our study either, where it ranked 2nd in the Developmental panel and 6th in the Light/Dark rearing panel. Gapdh instability could be attributed to the fact that it is a metabolic gene and sensitive to our study’s experimental conditions as mentioned above. Despite its abundance in ChP homogenate, the mediocre quality of Gapdh, as a reference gene, was reflected in the RT-qPCR experiment with Ttr (Fig. 5a, b), in which the increase in Ttr expression across development was observed but with a reduced statistical significance, especially at early developmental stages (P0-P15). In the Light/Dark rearing panel, where Gapdh was found in a lower stability ranking (Fig. 4), the normalization to Gapdh even gave a false sense of Otx2 expression compared to most stable genes in the same panel (Fig. 6a, b). In agreement with several previous studies23,30,41,42, we noticed that Gapdh may not be a reliable reference gene for the analysis of ChP. Here, we suggest that its use should be considered in specific experimental context and be combined with or even replaced by other housekeeping genes when appropriate.

In summary, we recommend the use of a minimum of 2 reference genes for ChP RT-qPCR. In particular, Rer1 and Rpl13a appear to be stable in the ChP across postnatal ages, whereas Hprt1 and Rpl27 are mainly stable when neuronal activity is manipulated. We also showed that Rpl13a, Rpl27 and Tbp were relatively stable genes across both experimental conditions in this study. Our results demonstrated that the expression of housekeeping genes in ChP could change depending on experimental settings and that the choice of reference genes can have a great impact on the measured expression levels of target genes. Therefore, it is worthwhile to investigate the stability of candidate reference genes in the context of the specific experiment (i.e. species, cell types, treatment, etc.) whenever possible.

Materials and methods

Selection of candidate reference genes and primers

14 genes were examined, including 12 housekeeping genes and 2 target genes. All primer pairs were adapted from the literature (Table 1) with the following criteria: primer length 18–24 bp, amplicon size 50–150 bp, melting temperature (Tm) 56–60 °C, GC content 40–60%, no secondary structures formed at annealing temperature (56 °C). Primer specificity was checked in silico by NCBI-PrimerBlast32. Primers were synthesised by Sigma Aldrich.

Animal samples collection

All animal experiments were approved by the local governing regional council (Regierungspräsidium Karlsruhe, Germany). All methods were carried out following the German Animal Welfare Act regulations. Animal studies are reported in compliance with the ARRIVE guidelines.

Inbred C57Bl6/J mice were purchased from Janvier labs. Light/Dark rearing mice were socially housed in groups of two to five animals. All mice were housed in static cages. No environmental enrichment toys were added to the cages, only extra Kimwipes tissue. Light-reared mice were maintained on a 12 hour light/dark cycle, whereas dark-reared litters were kept in complete darkness from birth until adulthood (P60). A subset of them was re-exposed to light for different times (Supplementary Fig. S1). Mice were housed in standard housing conditions and received ad libitum food and water.

Newborn pups (P0) were decapitated while animals at P15, P30 and P60 were sacrificed by cervical dislocation. Both male and female animals were included. Freshly harvested brains were kept in Harvesting media63 (DMEM-F12 supplemented with 10% FBS, 2 mM L-Glutamine, 50 μg/ml Gentamycin (all reagents are from Gibco)) on ice, then ChP tissue was isolated from lateral and fourth ventricles, pooled together and immediately frozen in liquid nitrogen and stored at − 80 °C. ChP isolation was performed on ice and supported by stereomicroscope Stemi DV4 (Zeiss).

Total RNA isolation and cDNA synthesis

Total RNA was extracted from ChP tissue using RNeasy Mini Kit (Qiagen 74,104) following manufacturer’s protocol. Tissue was disrupted and homogenised using Lysis buffer RTL and QIAshredder spin columns (Qiagen 79,654). Genomic DNA was removed using RNase-free DNase I (Qiagen 79,254) for 15 min at room temperature. RNA was eluted from the column with 30 µl RNase free water and stored at − 80 °C. RNA quality and concentration were measured by UV spectrophotometry on NanoDrop One (Thermo Scientific). RNA yield for each biological sample ranged from 800 to 3300 ng. The desired absorbance ratios A260/A280 and A260/A230 were 1.8–2.2. RNA samples with absorbance ratios below 1.8 were precipitated using 100% ethanol, ammonium acetate and glycogen (Thermo Scientific R0551) following manufacturer’s protocol. For cDNA synthesis, 270–990 ng RNA each sample was used with High-Capacity RNA-to-cDNA kit (Thermo Scientific 4,387,406). cDNA samples were stored at − 20 °C.

Reverse transcription quantitative real-time PCR (RT-qPCR) experiment

For expression stability of housekeeping genes experiment, each panel was divided into two 384-well plates with experimental setup following the sample maximization approach34. In the Developmental panel, there were 4 age groups, 5 biological replicates per group (sample number n = 20). In the Light/Dark rearing panel, there were 5 groups, 5 biological replicates each group (sample number n = 25) (Supplementary Table S1 ). Within each panel, the amplification of Gapdh in 4–5 samples were assigned as inter-run calibrators (IRC)34 (see Supplementary Fig. S3 for the plate layout). For the RT-qPCR experiment of target genes’ expression—Ttr and Otx2—with different normalisation factors, all reactions for each gene were carried out on one single 384-well plate, 3 biological replicates each group. The number of technical replicates was 3 for all reactions. No template controls (NTC) were included in all experiments.

RT-q PCR was performed using PowerUp SYBR Green Master Mix (Thermo Scientific A25742) following manufacturer’s protocol with 50 ng cDNA, 20 µM each primer in a 10 µl reaction volume. Thermal protocol consisted of UDG activation at 50 °C in 2 min, Dual-Lock DNA polymerase activation at 95 °C in 2 min, 40 cycles of Denaturation at 95 °C in 15 s, Annealing at 56 °C in 15 s and Extension at 72 °C in 1 min. Melting curve thermal protocol was run right after finishing amplification: 95 °C in 15 s (ramp rate 1.6 °C/s), 60 °C in 1 min (ramp rate 1.6 °C/s), 95 °C in 15 s (ramp rate 0.15 °C/s). All RT-qPCR experiments were performed on LightCycler 480 (Roche).

Raw data processing

Non-baseline-corrected RT-qPCR raw data was extracted from the machine to provide input for LinRegPCR36 (version 2020.0). The software performed baseline correction for each sample individually, calculated amplification efficiency (E), quantification cycle (Cq) and coefficient of determination (R2) by fitting a linear regression model to log-linear phase. Technical replicates were next examined with the allowed maximum variation of Cq as 0.5 (default threshold in qBase34), unqualified replicate was eliminated. LinRegPCR output from two plates of the same panel was then loaded into Factor-qPCR37 (version 2020.0) to remove systematic bias between different RT-qPCR runs. Finally, arithmetic mean of technical replicates was taken as the Cq value representing biological samples.

Expression stability analysis

Gene expression stability was assessed using RefFinder40 (https://www.heartcure.com.au/reffinder/), a web-based tool integrating four algorithms: BestKeeper25, DeltaCq26, geNorm18 and NormFinder27. The standalone software of geNorm (qBase + 34 from Biogazelle) and NormFinder (R-based, version 5) were used in parallel. For all software, input data was Cq values, output were different types of stability values (BestKeeper: r, DeltaCq: Mean SD of mean \(\Delta\)Cq, geNorm: M, NormFinder: ρ) and stability rankings. A lower rank indicates a more stable gene. An overall ranking was determined for each gene by calculating geometric mean of rankings from all 4 methods.

Relative expression of Ttr and Otx2, statistical analysis and data visualisation

For expression analysis of Ttr, geometric mean of P0 samples was used as Calibrator. For expression analysis of Otx2, geometric mean of Ctrl samples was used as Calibrator. Relative expression level was calculated by Pfaffl method64 (or efficiency method), which uses amplification efficiency E and difference of cycle of quantification \(\Delta\)Cq from unknown samples and Calibrator.

To evaluate statistical difference in target genes’ RNA level between different groups, one-way ANOVA followed by Tukey’s multiple comparisons was applied. Statistical analysis and graphs construction were performed in GraphPad Prism (version 8.4.0). Data was presented as Mean ± SD.