Assessment of common housekeeping genes as reference for gene expression studies using RT-qPCR in mouse choroid plexus

Choroid plexus (ChP), a vascularized secretory epithelium located in all brain ventricles, plays critical roles in development, homeostasis and brain repair. Reverse transcription quantitative real-time PCR (RT-qPCR) is a popular and useful technique for measuring gene expression changes and also widely used in ChP studies. However, the reliability of RT-qPCR data is strongly dependent on the choice of reference genes, which are supposed to be stable across all samples. In this study, we validated the expression of 12 well established housekeeping genes in ChP in 2 independent experimental paradigms by using popular stability testing algorithms: BestKeeper, DeltaCq, geNorm and NormFinder. Rer1 and Rpl13a were identified as the most stable genes throughout mouse ChP development, while Hprt1 and Rpl27 were the most stable genes across conditions in a mouse sensory deprivation experiment. In addition, Rpl13a, Rpl27 and Tbp were mutually among the top five most stable genes in both experiments. Normalisation of Ttr and Otx2 expression levels using different housekeeping gene combinations demonstrated the profound effect of reference gene choice on target gene expression. Our study emphasized the importance of validating and selecting stable housekeeping genes under specific experimental conditions.


Results
Candidate reference genes, qPCR amplification experiment and descriptive statistics. The 12 candidate reference genes used in this study were selected based on their distinct cellular function and on their extensive use in neuroscience researches 18,22,23,30,31 . In particular, we selected genes belonging to different functional classes to reduce the possibility that their response to the same experimental condition is co-regulated. We examined genes involved in the cellular cytoskeleton (Actb), in transcription or translation (Tbp, Rpl13a, Rpl27), in cellular metabolism (Gapdh, Sdha, Hprt1, Pgk1, Atp5f1), and in protein degradation (Ubc) in addition to ubiquitous and common cellular components, such as the major histocompatibility complex class I component (B2m) and a structural membrane protein of the Golgi apparatus (Rer1). Detailed information for each primer pair is presented in Table 1. To ensure there was no undesired product during amplification, we first examined primer specificity in silico using NCBI PrimerBlast 32 and later confirmed it by melting curve analysis. The results show one single sharp peak for each primer pair in wells containing cDNA and no signal in negative control wells, indicating target-specific amplification (Supplementary Figure S1). www.nature.com/scientificreports/ To better evaluate the stability of these housekeeping genes in various conditions, we established two experimental panels: (i) Developmental panel, consisting of ChP tissues from mice at postnatal day (P)0 (at birth), P15 (at eye opening), P30 (early puberty) and P60 (adulthood) 33 ; (ii) Light/Dark rearing panel, consisting of ChP tissues from P60 mice reared in normal Light/Dark condition (Ctrl), completely in dark from birth (D), completely in dark then exposed to light for 1h (D-1hL), 4 h (D-4hL) and 24 h (D-24hL) (Supplementary Figure S2, Supplementary Table S1).
Due to the large amount of reactions per panel, each panel was divided into three 384-well plates with experimental setup following sample maximization approach, where each plate is prioritized to include all samples rather than all genes 34 . Inter-run calibration, therefore, is required to minimize variations between different runs (instrument-, reagent-, experimenter-related variations) 34 . Inter-run calibrators (IRC) were assigned to the amplification of Gapdh (Supplementary Figure S3).
We used quantification cycle (Cq) with the same meaning as cycle threshold (Ct ) or crossing point (Cp ) as recommended by RMDL consortium 35 . To acquire PCR efficiency (E) and coefficient of determination (R 2 ) without standard curve, we used LinRegPCR 36 which performs baseline correction for each sample individually and calculates E and R 2 by fitting a regression line to a subset of data points in the log-linear phase. To combine the data of 3 plates from the same panel without inter-run variations, LinRegPCR output from 3 plates were then loaded into Factor-qPCR 37 , which determines the multiplicative factors and removes the systematic bias between different RT-qPCR runs. All data discussed below has been sequentially processed with both LinReg-PCR and Factor-qPCR.
Descriptive statistics for the RT-qPCR amplification of both panels is presented in Table 2 and Fig. 1. The coefficient of determination (R 2 ) shows how well the semi-log plot of Cq-log [cDNA concentration] fits to the linear regression model and indicates the presence of RT-qPCR inhibitors. We found that all primer pairs had R 2 as 0.999 or 1.000, which is greater than 0.98 as recommended 38 (Table 2). The amplification efficiency ranged within acceptable values, from 80.15 to 85.85 (Table 2). Mean Cq values and standard deviations (SD) were also calculated in Table 2. The range of Cq values (from 20.37 ± 0.49 to 27.93 ± 0.47) showed that all reference genes were expressed in ChP tissues (Table 2). We also confirmed the gene expression by in situ hybridization results from Allen Mouse Brain Atlas 39 . Actb was the most abundantly expressed gene in both experimental panels with the lowest mean Cq: 20.78 ± 0.44 in the Developmental and 20.37 ± 0.49 in the Light/Dark rearing panel, respectively. On the contrary, Tbp had the lowest expression level, which was reflected in its highest Cq of nearly 28. The genes that had the smallest and biggest variations were Rpl27 (SD = 0.38) and Sdha (SD = 0.85) in the Developmental panel and Pgk1 (SD = 0.41) and Rer1 (SD = 0.58) in the Light/Dark rearing panel (Table 2). Figure 1a, b visualised Cq values for each housekeeping gene at different postnatal ages and in specific rearing conditions. In the Developmental panel, Cq values of different age groups fluctuated differently depending on genes. In particular, Cq of Gapdh, Atp5f1, Pgk1, Hprt1 and Sdha appear to be higher at P0 compared to later ages, whereas Cq values of the remaining housekeeping genes, did not show visible differences between P0 and later ages. Cq of Atp5f1 and Ubc showed a large variation at P60, while the remaining genes appeared more consistent at this stage (Fig. 1a). In the Light/Dark rearing panel, Cq values of the 12 genes exhibited a common pattern across all the rearing conditions. All reference genes appear to be quite stable in the Ctrl and in the D-24hL conditions, whereas the D group showed a high variability across different genes. Generally, Cq values gradually decreased in the group order: Ctrl > D > D-1 h > D-4 h > D-24 h and this applied for all genes in the panel (Fig. 1b).
To better visualize how the expression of each gene varies within each experimental paradigm, we plotted Cq values of each gene as an entire set of samples per panel (Fig. 1c,d). These graphs showed that the variation of Cq values within each gene was maximum around 2 cycles per gene. Regardless of the experimental panels, Cq values for each gene appeared to fall into common ranges. In particular, Tbp and Rer1 always have Cq between 26 -29 whereas the other genes have the majority of Cq values between 20 -25. We started by using RefFinder 40 , an online tool which incorporates all these algorithms and enables a fast, convenient analysis. However, some discrepancies were previously reported when comparing the stability results calculated by RefFinder and the original software of geNorm and NormFinder (BestKeeper and DeltaCq results were consistent) 20 . We, therefore, used the standalone geNorm (through qBase 34 software) and NormFinder (R-based version) to verify RefFinder results, confirming minor differences in the stability ranking (Supplementary Table S2). All data discussed below were acquired using geNorm and NormFinder results through their original software whereas the BestKeeper and DeltaCq results were directly extracted from RefFinder. geNorm algorithm calculates expression stability (M) and the ideal number of needed housekeeping genes (V) based on the concept that two ideal reference genes should have identical expression ratio in all samples, regardless of experiment conditions or tissue/cell line of origin 18 . A lower M value, calculated as average pairwise variation between one certain gene and other candidate reference genes, indicates a more stable expression. M lower than 0.5 is usually observed for stably expressed genes in homogeneous samples 34 . V value, displayed as V n/n+1 , is the pairwise variation between two normalization factors. In brief, it calculates the necessity to include one more reference gene to the previous set of more stable reference genes. A V value greater than 0.15 suggests that the added gene has a significant effect and should be included for a more reliable normalisation factor 18 Table 3). As all V values were below 0.15 and did not show significant difference within each panel, the use of the two most stable reference genes (Rpl27, Rpl13a in Developmental panel and Hprt1, Rpl13a in Light/Dark rearing panel) were considered sufficient (Fig. 2b,d).
NormFinder applies the "model-based approach to estimation of expression variation" 27 instead of pairwise comparison approach like the other methods. The algorithm estimates both intra-and inter-group variation, then combines them into the stability value ρ, which enables the addition of two sources of variation and confer a measure of systematic error. As a result, genes with smaller ρ exhibit higher stability 27 . In the Developmental www.nature.com/scientificreports/ panel, Gapdh was the most stable gene (ρ = 0.161) followed by Rer1 (ρ = 0.207), whereas Sdha (ρ = 0.469) and Hprt1 (ρ = 0.473) were the most inconsistent genes. For the Light/Dark rearing panel, Hprt1 was the most stable gene (ρ = 0.121) followed by Atp5f1 (ρ = 0.123). The least stable genes were Sdha (ρ = 0.190) and B2m (ρ = 0.212) ( Table 3).
BestKeeper suggests that genes with SD greater than 1 are considered inconsistent 25 . Fortunately, all 12 genes studied have SD smaller than 1 and qualified for subsequent analysis. The algorithm computes Pearson's correlation coefficient (r) between each gene and BestKeeper Index (geometric mean of Cp values), and genes with greater r values have more stable expression 25 . In the Developmental panel, the most stable genes were Rpl13a (r = 0.965) and B2m (r = 0.953) whereas the least stable genes were Atp5f1 (r = 0.651) and Ubc (r = 0.472). In the Light/Dark rearing panel, the most stable genes were Rpl27 (r = 0.962) and Gapdh (r = 0.962), there least stable genes were Sdha (r = 0.874 and B2m (r = 0.818) ( Table 3).
The DeltaCq method, like geNorm, calculates pairwise comparisons of genes but with simpler mathematical calculations and without compromising accuracy. Specifically, it computes mean SD of differences in Cq ( Cq) values of a given gene compared to all other genes in the studied list 26 . Genes with smaller mean SD have more stable expression. In the Developmental panel, the most stable genes by this method were Gapdh (Mean SD = 0.39) and Rer1 (Mean SD = 0.40), the least stable genes were Sdha (Mean SD = 0.60) and Hprt1 (Mean SD = 0.62). In the Light/Dark rearing panel, the range was smaller and the most stable genes were Hprt1 (Mean SD = 0.24) and Rpl27 (Mean SD = 0.25); at the bottom of the ranking were Sdha (Mean SD = 0.34) and B2m (Mean SD = 0.35) ( Table 3).
To visually illustrate how consistent the results of the described four algorithms are, we plotted the graph of genes and stability values in Fig. 3 with lower stability values indicating more stable expression. In general, the variations in gene stability between methods appeared to be smaller in the Light/Dark rearing panel than in the Developmental panel. When comparing stability rankings calculated by the 4 methods, the results were more consistent at both ends of the chart (Light/Dark rearing panel) or within the lower values (Developmental panel). The middle of both graphs was slightly fluctuating, indicating that the algorithms are less consistent to each other when the difference in the gene stability is less significant. This inconsistency in stability ranking of the Table 3. Stability values and rankings of ChP from the Development and the Light/Dark rearing panels calculated by four algorithms: geNorm, NormFinder, BestKeeper, DeltaCq and the overall ranking. Gene's stability decreases from 1 st to 12 th rank.  Fig. 3) is the result of differences in either mathematics or in the concepts of gene stability on which the algorithms were established. Similar inconsistencies have also been reported in other studies 20,24,30,41,42 .
To draw a conclusion on stability testing results, overall ranking was determined by calculating geometric mean of rankings (GMR) from all four methods (Table 3, Fig. 3), with smaller GMR indicating more stable expression. We identified Rer1 (GMR = 2.783) and Rpl13a (GMR = 2.783) as the two most stable genes in the Developmental panel, whereas Sdha (GMR = 9.453) and Hprt1 (GMR = 10.847) were the two least stable genes. In the Light/Dark rearing panel, the two most stable genes were Hprt1 (GMR = 1.189) and Rpl27 (GMR = 2.213), whereas the two least stable ones were Sdha (GMR = 10.462) and B2m (GMR = 11.465) ( Table 3).
Altogether these data suggest that the expression level of frequently used housekeeping genes could vary depending on different experimental conditions, supporting the notion that a preliminary study to find optimal reference genes should be performed whenever possible 20,21,30,43 .
The stability ranking of genes in both panels is summarised in Fig. 4, which was divided into the five top ranked genes and the rest. Within the five top ranks, we identified three housekeeping gene candidates, Tbp, Rpl13a and Rpl27, which were considered relatively stable across the two studied conditions and would also be used for the following validation experiments. Ttr, encoding transporter protein transthyretin (formerly called prealbumin), is a ChP marker 28 and its expression is expected to change among groups in the Developmental panel. Orthodenticle homeobox 2 (Otx2), is a homebox gene encoding for a transcription factor synthesized and secreted by ChP 29 , which has proven roles in regulating critical period in the visual system [44][45][46][47] . We, therefore, selected Otx2 as a good target gene for the Light/Dark rearing panel.
For averaging the reference genes, we used geometric mean rather than arithmetic mean as it is suggested to offer better control over possible outliers and abundance differences between different genes 18 . We normalised  Table 3 for more details). www.nature.com/scientificreports/  www.nature.com/scientificreports/ Ttr expression to the two most stable genes in Developmental panel, Rer1, Rpl13a (Fig. 5a), to the most common reference gene, Gapdh (used in many ChP studies 48-50 ) (Fig. 5b), to the two least stable genes in the same panel, Sdha, Hprt1 (Fig. 5c), and to different combinations of any two among three stable genes, Rpl13a, Rpl27, Tbp (Fig. 5d). Normalization with Rer1 and Rpl13a (the most stable) nearly doubled the Ttr relative expression level between P0 and P15, followed by a steadier rise until P60 (Fig. 5a). Although it is still possible to see the increase in Ttr expression throughout development with the commonly used reference Gapdh, the difference was less obvious and less statistically powerful. In addition, the statistical difference between P0 and P15 was lost (Fig. 5b). When the least stable pair Sdha and Hprt1 was used, Ttr relative expression level was flattened and there was no difference between the four developmental stages (Fig. 5c). Finally, Fig. 5d (Fig. 5d), which demonstrates their suitability for being used as suboptimal reference genes in the Developmental panel. Finally, we tested whether using Gapdh in combination with another more stable gene, namely Rer1, Rpl13a or Tbp, would improve the normalization of Ttr mRNA expression. Despite relatively lower fold change values, the results showed a good "rescue effect" with expression patterns similar to those seen by normalising to the most stable genes [Rer1, Rpl13a] (Supplementary Fig. S4a). Similarly, in the Light/Dark rearing panel when we normalized Otx2 expression to Hprt1 and Rpl27, the two most stable genes, we noticed a reduction of Otx2 expression in the D group compared to the Ctrl group. However, Otx2 level was steady across all dark rearing conditions, D, D-1hL, D-4hL and D-24hL (Fig. 6a). Normalisation to Gapdh gave heavily distorted results with Otx2 expression levels unchanged between the Ctrl and the D samples, and increased expression in D-1hL, D-4hL and D-24hL samples (Fig. 6b). Strikingly, when Sdha www.nature.com/scientificreports/ and B2m (the least stable genes) were used Otx2 expression was significantly increased in all dark rearing groups (D, D-1hL, D-4hL and D-24hL) compared to the Ctrl group. Otx2 mRNA level was highest at D-1hL (Fig. 6c). Finally, Fig. 6d Fig. S4b), which could be due to the difference in the stability rankings of Gapdh between the 2 panels. In fact, Gapdh ranked 2 nd in the Developmental panel and 6 th in the Light/Dark rearing panel, which suggests that its expression might be affected by different light/dark rearing regimes (Fig. 4). Altogether, these data emphasized the significant impact of reference gene choice on the expression levels of target genes. Our data also suggest that Rpl13a, Rpl27 and Tbp are relatively stable housekeeping genes in ChP across two experimental paradigms and that any combination among them has the potential to be a good choice of reference genes for ChP.

Discussion
ChP is emerging as an underestimated but important brain tissue, playing crucial roles in development, homeostasis and protection of the CNS. Therefore, the growing body of ChP research would benefit from reproducible and accurate protocols for measuring ChP gene expression. For RT-qPCR, this requires reference genes whose expression is stable across developmental stages and/or treatment conditions. In this study we investigated the expression stability of 12 common housekeeping genes (Actb, Atp5f1, B2m, Gapdh, Hprt1, Pgk1, Rer1, Rpl13a, Rpl27, Sdha, Tbp and Ubc) in ChP across developmental stages (Developmental panel) or brain activity (Light/ Dark rearing panel). We analysed expression data of housekeeping genes using both descriptive statistics ( Table 2, Fig. 1) and more sophisticated analysis by implementing specific algorithms built on different perceptions of housekeeping gene stability (Table 3, Figs. 2, 3). Our data indicate that different experimental paradigms require selective reference genes, confirming the importance of identifying suitable reference genes in relative quantitative RT-qPCR.
An initial evaluation of the housekeeping gene stability using descriptive statistics, measuring mean Cq and SD already highlighted variations between groups in the same panel and across panels (Fig. 1). For instance, Cq values for Tbp, Rer1, Rpl13a and Rpl27 were relatively stable across developmental stages, while Atp5f1, Hprt1, Pgk1 and Sdha showed significantly higher Cq for P0 versus older ages. This method is usually a tempting way of evaluating gene stability due to its simplicity but it is valid only to a certain extent. In fact, its apparent simplicity belies the biological and technical variations. In order to find a more robust measure of gene stability, we employed 4 widely used application packages: geNorm 18 , BestKeeper 25 , DeltaCq 26 , and NormFinder 27 . Differences in the methodology used by each algorithm may cause a few inconsistencies and make the result "relative" ( Table 3, Fig. 3). For example, in the Light/Dark rearing panel, all algorithms agreed that Sdha and B2m were the least stable genes, 3 of the 4 algorithms ranked Hprt1 as the most stable gene but Rpl27 received 4 different rankings from the 4 methods (although all within the top 4) ( Table 3). Therefore, in order to incorporate results from all 4 methods, we took geometric means of the 4 rankings to generate overall final stability values, which conclude that Rer1, Rpl13a were the 2 most stable genes in the Developmental panel while Hprt1 and Rpl27 were the 2 most stable genes in the Light/Dark rearing panel.
Another important issue to take into consideration is the number of reference genes that is sufficient to obtain reliable results without wasting materials. The use of one single reference gene is generally not advisable as Vandesompele et al. (2002) showed that normalisation using only one reference gene would lead to erroneous results up to 3.0 fold in 25% of cases and 6.4 fold in 10% of cases 18 . geNorm's V values demonstrated that the optimal number of reference genes for ChP in each experimental panel is 2 when selected from the most stable genes (Fig. 2b, d).
We also noticed that a subset of genes, Rpl13a, Rpl27 and Tbp were among the top 5 most stable genes in both panels (Fig. 4). Interestingly, the use of a combination of these genes across the two panels was enough to maintain the expression pattern as observed when using the most stable genes. In fact, the normalized expression level of Ttr and Otx2 was comparable when using the most stable pairs for each panel or a combination of 2 among the 3 commonly stable genes, Rpl13a, Rpl27, Tbp, demonstrating that these are suboptimal but acceptable reference genes in both panels (Figs. 5d, 6d). On the contrary, when comparing the expression level normalized to the most stable, the least stable, and Gapdh, we always visualized a stark contrast, showing how significant the choice of reference genes could alter target gene expression readings (Figs. 5a-c, 6a-c).
Standing out as commonly stable genes in both panels, Tbp, Rpl13a and Rpl27 are all involved in transcription/translation machinery. While Tbp encodes TATA-box binding protein -an important part of the eukaryotic transcription initiation complex 51 , Rpl13a and Rpl27 encode ribosomal protein L13a and L27, which are components of the 60S ribosome 52 . Although it is not always the case, Tbp and ribosomal proteins often appeared as stable candidates in many other reference gene analyses 18,42,53 . However, despite their high stability rankings, we do not recommend a reference set only containing Rpl13a and Rpl27 as they belong to the same family and may be subject to co-regulation. On the other side, genes involved in cellular metabolism, such as Atp5f1 (encoding subunit B in peripheral stalk of mitochondrial ATP synthase) 54 , Sdha (encoding an enzyme of tricarboxylic acid cycle) 55 , Pgk1 and Gapdh (encoding enzymes of the glycolytic pathway) 56,57 appear to be quite unstable in ChP across the conditions examined, suggesting that they may be modulated by the used experimental paradigms. In general, metabolic activities have been shown to change in response to external manipulation, such as photic www.nature.com/scientificreports/ manipulations 58 and developmental stages 59 , which may explain why Atp5f1, Sdha, Pgk1 and Gapdh expression is relatively unstable in this study and tend to rank at the bottom. Another common family of reference gene includes structural genes, such as Actb (encoding actin protein of cytoskeletal structure) and tubulin (encoding microtubules). It is well-known that tubulin genes are essential for the development and function of neurons 60 , however it has been reported that tubulin mRNA can be unstable under physiological changes 61 ; therefore, we decided to not include it in our study. On the other hand, we did include Actb, one of the most popular gene in the literature. However, in our experimental condition it only ranked 7 th in the Developmental panel and 10 th in the Light/Dark rearing panel, which is in agreement with many other studies showing that structural cellular components, such as Actb are not stable housekeeping genes and therefore not good candidate reference genes 30,41,62 . Finally, Gapdh, the most popular housekeeping gene in the literature, was not an ideal candidate in our study either, where it ranked 2nd in the Developmental panel and 6th in the Light/Dark rearing panel. Gapdh instability could be attributed to the fact that it is a metabolic gene and sensitive to our study's experimental conditions as mentioned above. Despite its abundance in ChP homogenate, the mediocre quality of Gapdh, as a reference gene, was reflected in the RT-qPCR experiment with Ttr (Fig. 5a, b), in which the increase in Ttr expression across development was observed but with a reduced statistical significance, especially at early developmental stages (P0-P15). In the Light/Dark rearing panel, where Gapdh was found in a lower stability ranking (Fig. 4), the normalization to Gapdh even gave a false sense of Otx2 expression compared to most stable genes in the same panel (Fig. 6a, b). In agreement with several previous studies 23,30,41,42 , we noticed that Gapdh may not be a reliable reference gene for the analysis of ChP. Here, we suggest that its use should be considered in specific experimental context and be combined with or even replaced by other housekeeping genes when appropriate.
In summary, we recommend the use of a minimum of 2 reference genes for ChP RT-qPCR. In particular, Rer1 and Rpl13a appear to be stable in the ChP across postnatal ages, whereas Hprt1 and Rpl27 are mainly stable when neuronal activity is manipulated. We also showed that Rpl13a, Rpl27 and Tbp were relatively stable genes across both experimental conditions in this study. Our results demonstrated that the expression of housekeeping genes in ChP could change depending on experimental settings and that the choice of reference genes can have a great impact on the measured expression levels of target genes. Therefore, it is worthwhile to investigate the stability of candidate reference genes in the context of the specific experiment (i.e. species, cell types, treatment, etc.) whenever possible.

Materials and methods
Selection of candidate reference genes and primers. 14 genes were examined, including 12 housekeeping genes and 2 target genes. All primer pairs were adapted from the literature (Table 1) with the following criteria: primer length 18-24 bp, amplicon size 50-150 bp, melting temperature (Tm) 56-60 °C, GC content 40-60%, no secondary structures formed at annealing temperature (56 °C). Primer specificity was checked in silico by NCBI-PrimerBlast 32 . Primers were synthesised by Sigma Aldrich.

Animal samples collection.
All animal experiments were approved by the local governing regional council (Regierungspräsidium Karlsruhe, Germany). All methods were carried out following the German Animal Welfare Act regulations. Animal studies are reported in compliance with the ARRIVE guidelines.
Inbred C57Bl6/J mice were purchased from Janvier labs. Light/Dark rearing mice were socially housed in groups of two to five animals. All mice were housed in static cages. No environmental enrichment toys were added to the cages, only extra Kimwipes tissue. Light-reared mice were maintained on a 12 hour light/dark cycle, whereas dark-reared litters were kept in complete darkness from birth until adulthood (P60). A subset of them was re-exposed to light for different times (Supplementary Fig. S1). Mice were housed in standard housing conditions and received ad libitum food and water.
Newborn pups (P0) were decapitated while animals at P15, P30 and P60 were sacrificed by cervical dislocation. Both male and female animals were included. Freshly harvested brains were kept in Harvesting media 63 (DMEM-F12 supplemented with 10% FBS, 2 mM L-Glutamine, 50 μg/ml Gentamycin (all reagents are from Gibco)) on ice, then ChP tissue was isolated from lateral and fourth ventricles, pooled together and immediately frozen in liquid nitrogen and stored at − 80 °C. ChP isolation was performed on ice and supported by stereomicroscope Stemi DV4 (Zeiss).
Total RNA isolation and cDNA synthesis. Total RNA was extracted from ChP tissue using RNeasy Mini Kit (Qiagen 74,104) following manufacturer's protocol. Tissue was disrupted and homogenised using Lysis buffer RTL and QIAshredder spin columns (Qiagen 79,654). Genomic DNA was removed using RNase-free DNase I (Qiagen 79,254) for 15 min at room temperature. RNA was eluted from the column with 30 µl RNase free water and stored at − 80 °C. RNA quality and concentration were measured by UV spectrophotometry on NanoDrop One (Thermo Scientific). RNA yield for each biological sample ranged from 800 to 3300 ng. The desired absorbance ratios A 260 /A 280 and A 260 /A 230 were 1.8-2.2. RNA samples with absorbance ratios below 1.8 were precipitated using 100% ethanol, ammonium acetate and glycogen (Thermo Scientific R0551) following manufacturer's protocol. For cDNA synthesis, 270-990 ng RNA each sample was used with High-Capacity RNAto-cDNA kit (Thermo Scientific 4,387,406). cDNA samples were stored at − 20 °C.

Reverse transcription quantitative real-time PCR (RT-qPCR) experiment.
For expression stability of housekeeping genes experiment, each panel was divided into two 384-well plates with experimental setup following the sample maximization approach 34  www.nature.com/scientificreports/ amplification of Gapdh in 4-5 samples were assigned as inter-run calibrators (IRC) 34 (see Supplementary Fig. S3 for the plate layout). For the RT-qPCR experiment of target genes' expression-Ttr and Otx2-with different normalisation factors, all reactions for each gene were carried out on one single 384-well plate, 3 biological replicates each group. The number of technical replicates was 3 for all reactions. No template controls (NTC) were included in all experiments. RT-q PCR was performed using PowerUp SYBR Green Master Mix (Thermo Scientific A25742) following manufacturer's protocol with 50 ng cDNA, 20 µM each primer in a 10 µl reaction volume. Thermal protocol consisted of UDG activation at 50 °C in 2 min, Dual-Lock DNA polymerase activation at 95 °C in 2 min, 40 cycles of Denaturation at 95 °C in 15 s, Annealing at 56 °C in 15 s and Extension at 72 °C in 1 min. Melting curve thermal protocol was run right after finishing amplification: 95 °C in 15 s (ramp rate 1.6 °C/s), 60 °C in 1 min (ramp rate 1.6 °C/s), 95 °C in 15 s (ramp rate 0.15 °C/s). All RT-qPCR experiments were performed on LightCycler 480 (Roche).
Raw data processing. Non-baseline-corrected RT-qPCR raw data was extracted from the machine to provide input for LinRegPCR 36 (version 2020.0). The software performed baseline correction for each sample individually, calculated amplification efficiency (E), quantification cycle (Cq) and coefficient of determination (R 2 ) by fitting a linear regression model to log-linear phase. Technical replicates were next examined with the allowed maximum variation of Cq as 0.5 (default threshold in qBase 34 ), unqualified replicate was eliminated. LinRegPCR output from two plates of the same panel was then loaded into Factor-qPCR 37 (version 2020.0) to remove systematic bias between different RT-qPCR runs. Finally, arithmetic mean of technical replicates was taken as the Cq value representing biological samples.
Expression stability analysis. Gene expression stability was assessed using RefFinder 40 (https ://www. heart cure.com.au/reffi nder/), a web-based tool integrating four algorithms: BestKeeper 25 , DeltaCq 26 , geNorm 18 and NormFinder 27 . The standalone software of geNorm (qBase + 34 from Biogazelle) and NormFinder (R-based, version 5) were used in parallel. For all software, input data was Cq values, output were different types of stability values (BestKeeper: r, DeltaCq: Mean SD of mean Cq, geNorm: M, NormFinder: ρ) and stability rankings. A lower rank indicates a more stable gene. An overall ranking was determined for each gene by calculating geometric mean of rankings from all 4 methods.

Relative expression of Ttr and Otx2, statistical analysis and data visualisation. For expression
analysis of Ttr, geometric mean of P0 samples was used as Calibrator. For expression analysis of Otx2, geometric mean of Ctrl samples was used as Calibrator. Relative expression level was calculated by Pfaffl method 64 (or efficiency method), which uses amplification efficiency E and difference of cycle of quantification Cq from unknown samples and Calibrator.
To evaluate statistical difference in target genes' RNA level between different groups, one-way ANOVA followed by Tukey's multiple comparisons was applied. Statistical analysis and graphs construction were performed in GraphPad Prism (version 8.4.0). Data was presented as Mean ± SD.