System-specific periodicity in quantitative real-time polymerase chain reaction data questions threshold-based quantitation

Real-time quantitative polymerase chain reaction (qPCR) data are found to display periodic patterns in the fluorescence intensity as a function of sample number for fixed cycle number. This behavior is seen for technical replicate datasets recorded on several different commercial instruments; it occurs in the baseline region and typically increases with increasing cycle number in the growth and plateau regions. Autocorrelation analysis reveals periodicities of 12 for 96-well systems and 24 for a 384-well system, indicating a correlation with block architecture. Passive dye experiments show that the effect may be from optical detector bias. Importantly, the signal periodicity manifests as periodicity in quantification cycle (Cq) values when these are estimated by the widely applied fixed threshold approach, but not when scale-insensitive markers like first- and second-derivative maxima are used. Accordingly, any scale variability in the growth curves will lead to bias in constant-threshold-based Cqs, making it mandatory that workers should either use scale-insensitive Cqs or normalize their growth curves to constant amplitude before applying the constant threshold method.

Scientific Reports | 6:38951 | DOI: 10.1038/srep38951

[…] in sample k. This is exemplified in Fig. 1A and B: the black boxes denote F10, F20 and F40 for all 379 samples of a published technical replicate dataset.
We recently showed preliminary results 20 for such an examination of a published large scale technical replicate dataset 18 that revealed a pronounced regular and periodic pattern in the between-sample fluorescence signals for both early and late fixed cycle numbers. Subsequently, we have observed similar periodicity in many technical replicate datasets recorded with other qPCR platforms. Here, we employed autocorrelation analysis, a technique from the field of time series analysis that can reveal regularly occurring patterns in one-dimensional data. By this means, we have uncovered a periodicity of 24/12 in the qPCR raw data of 384/96-well microtiter plate systems that clearly corresponds to the plate architecture and/or optical read-out technology. This effect occurs at all cycle numbers and is typically stronger for cycles in the growth and plateau regions, so that the classical "baselining" fails to remove periodicities beyond the baseline region.
When Cq values are obtained using fixed threshold methods, the persisting plateau periodicity propagates to them, resulting in periodic Cq values. However, this effect pertains to any systematic plateau phase scattering, periodic or not, which makes it mandatory to use quantitation methods that are not directly influenced by the magnitude of the plateau phase. First- and second-derivative maxima methods 4,8 are a viable choice because they are mathematically scale-independent and deliver random, non-periodic Cq values in the presence of periodic plateau phases. These findings lead to the simple conclusion that threshold-based qPCR quantitation should be avoided altogether. To this end, we have developed a web application for users to examine their own qPCR data with respect to periodic and other non-random patterns.

Results
Periodicities in published and own technical replicate qPCR data. In a recent study 20 , we demonstrated periodic patterns in raw fluorescence values at cycles 1 and 45 of the '94-replicates-4-dilutions' dataset 18 . A closer inspection of the '380-replicates' dataset in Ruijter et al. 18 revealed that the raw fluorescence values (Fig. 1A) are dispersed within a highly variable window of magnitudes (baseline region: 4000-5500, plateau region: 9000-15000). When F 10 is plotted for all 379 samples, an added Loess smoothing line uncovers a clearly periodic pattern of the fluorescence values (Fig. 1C). After baselining the data with a linear model of F 1…10 (Fig. 1B), the periodic pattern in F 10 is completely removed and the fluorescence values exhibit a random-like pattern (Fig. 1D). However, for fluorescence values at later cycles, such as F 20 in the exponential growth region (Fig. 1E) or F 40 in the plateau region (Fig. 1F), the same periodic pattern is evident, showing that baselining does not compensate for intrinsic patterns beyond the baseline region. The same periodic pattern is present in each cycle of the data from the exponential phase onwards without exception. These observations indicate that there is periodic scale variability in the data, as otherwise baselining would correct the whole curve for periodicity.
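The baselining argument above can be illustrated with a minimal simulation sketch. It assumes, purely for illustration, that each well's raw signal is an idealized logistic curve multiplied by a well-specific periodic gain; the gain model, the parameter values and the function names (`logistic`, `baseline_linear`) are hypothetical and are not taken from the analysed datasets.

```python
import math

def logistic(x, b=-0.5, c=4000.0, d=9000.0, e=25.0):
    """Idealized sigmoidal qPCR curve (illustrative parameter values)."""
    return c + (d - c) / (1.0 + math.exp(b * (x - e)))

def baseline_linear(curve, n_base=10):
    """Subtract a least-squares line fitted to the first n_base cycles."""
    xs = range(1, n_base + 1)
    ys = curve[:n_base]
    mx = sum(xs) / n_base
    my = sum(ys) / n_base
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    # curve[i] holds cycle i + 1
    return [y - (intercept + slope * (i + 1)) for i, y in enumerate(curve)]

cycles = range(1, 46)
# Assumed multiplicative gain per well: period 24, amplitude +/-10 %.
gains = [1.0 + 0.1 * math.sin(2 * math.pi * k / 24) for k in range(96)]
raw = [[g * logistic(x) for x in cycles] for g in gains]
baselined = [baseline_linear(curve) for curve in raw]

f10 = [curve[9] for curve in baselined]    # cycle 10, baseline region
f40 = [curve[39] for curve in baselined]   # cycle 40, plateau region
spread_f10 = max(f10) - min(f10)
spread_f40 = max(f40) - min(f40)
print(f"spread at cycle 10 after baselining: {spread_f10:.2f}")
print(f"spread at cycle 40 after baselining: {spread_f40:.2f}")
```

Under this assumed gain model, subtracting the baseline line removes the well-to-well pattern where the signal is nearly flat, but the multiplicative scale difference survives in the plateau region, which is the behaviour the text describes for F10 versus F40.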
To further investigate these findings on other qPCR systems, we generated five additional replicate datasets that differed in amplicon (VIM, GAPDH, S27), chemistry (SybrGreen I, EvaGreen), and qPCR instrument (CFX96, Rotorgene, iQ5, StepOne, LC96). We then used our developed analysis pipeline based on autocorrelation analysis (Fig. 2, see also Material & Methods) to uncover putative periodicities intrinsic to these datasets, including the '380-replicates' dataset 18 . The latter, corresponding with the observations in Fig. 1, exhibits strong periodicity of ~24 that manifests in a distinct correlogram pattern (Fig. 3, top left). In this case, the Runs test for non-randomness and Ljung-Box test for autocorrelation are highly significant. Strong periodicities (Fig. 3) were also detectable for the 'VIM CFX96' and 'GAPDH StepOne' datasets, both with a period ~12-13, while 'S27 Rotorgene' , 'VIM iQ5' and 'GAPDH LC96' displayed negligible systematic patterns (with insignificant non-randomness tests for the latter two). The results from these five datasets suggest that strong periodicity is associated with some, but not all (i.e., not iQ5 or LC96) block-based systems.
Figure 3. Detection of periodicity in baselined qPCR data acquired by six different hardware systems. Based on the analysis pipeline defined in Fig. 2 […]

Propagation of plateau phase periodicities to Cq values. We next investigated the effect of periodic fluorescence on the estimation of threshold-based (Ct) and SDM-based (CqSDM) Cq values for the '380-replicates' dataset. The rationale is that baselined periodic fluorescence F_i,k over all samples k at a fixed cycle i implies periodic threshold cycle values Ct at a fixed threshold fluorescence Ft by propagation through the inverse function. This follows from the following mathematical considerations: suppose a sigmoidal function, such as a four-parameter sigmoidal model (1), is fitted to qPCR data, and Ct values are estimated by the corresponding inverse function at threshold fluorescence Ft. Then, an increasing parameter d (upper asymptote) decreases Ct as the second term of the inverse function grows in magnitude (b being negative, e being the first-derivative maximum cycle). Indeed, estimated Ct values for the '380-replicates' set at Ft = 500 (exponential region) exhibit strong periodicity (Fig. 4A) with exactly the same pattern as the fluorescence values for this dataset in Fig. 3.
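The model equation and its inverse are not reproduced in the text above. As a hedged reconstruction, one four-parameter form that matches the stated parameter roles (c lower asymptote, d upper asymptote, b negative, e the first-derivative maximum cycle) is the logistic model below; it is an assumption made for illustration and may differ from the parameterization the authors actually fitted.

```latex
% Assumed four-parameter logistic form for the model referred to as (1)
F(x) = c + \frac{d - c}{1 + \exp\bigl(b\,(x - e)\bigr)} \tag{1}

% Inverse of (1) at threshold fluorescence F_t
C_t = e + \frac{1}{b}\,\ln\!\left(\frac{d - c}{F_t - c} - 1\right)

% Sensitivity of C_t to the upper asymptote d
\frac{\partial C_t}{\partial d} = \frac{1}{b\,(d - F_t)} < 0
\quad \text{for } b < 0,\; c < F_t < d
```

Under this assumed form, any well-to-well variation in d shifts Ct in the opposite direction, so a periodic plateau height translates into a periodic Ct at fixed Ft.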
As the overall scale of the qPCR curves drives the periodicity (it is strongest in the plateau phase, compare Fig. 1F), these results recommend the application of a scale-independent Cq marker to neutralize such effects. The SDM is a viable choice for the following reason: an SDM-based CqSDM value corresponds to the cycle number x at which the third derivative of (1) has its positive zero root, and it is therefore scale-insensitive, as the parameters for both the lower asymptote c and the upper asymptote d cancel out. The same holds for the first-derivative maximum (not shown). Hence, CqSDM is mathematically and physically decoupled from an overall periodic scaling of the fluorescence values. This is confirmed by the actual results: the CqSDM values of the '380-replicates' dataset are random and non-periodic (Fig. 4B; Runs test and Ljung-Box test are insignificant). We then analysed a further technical replicate dataset consisting of seven 10-fold dilutions with 12 replicates each 34, which was previously created with the widely used Lightcycler 480 system. The raw qPCR fluorescence values were fitted with a five-parameter sigmoidal model 8,24, Ct values were estimated, rescaled into the interval [0, 1] (to compensate for the shift in Ct values between dilution steps) and finally interrogated by autocorrelation analysis (Supplemental Fig. 3). A clear periodicity with a period of 12 is evident in the rescaled Ct values, demonstrating that periodic patterns can be extracted from replicate dilution series after rescaling and that this system delivers periodic data.
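The scale-insensitivity of the SDM can be checked numerically with a short sketch. It assumes that model (1) has the logistic form F(x) = c + (d − c)/(1 + exp(b(x − e))), for which the relevant root of the third derivative is x = e + ln(2 + √3)/b; this form, the parameter values and all function names are illustrative assumptions, not the authors' code.

```python
import math

def logistic(x, b, c, d, e):
    """Assumed four-parameter logistic model."""
    return c + (d - c) / (1.0 + math.exp(b * (x - e)))

def ct_threshold(ft, b, c, d, e):
    """Threshold cycle from the inverse of the assumed logistic model."""
    return e + math.log((d - c) / (ft - c) - 1.0) / b

def cq_sdm_closed_form(b, e):
    """Second-derivative maximum for b < 0: depends on b and e only."""
    return e + math.log(2.0 + math.sqrt(3.0)) / b

def cq_sdm_numeric(b, c, d, e, lo=1.0, hi=45.0, step=0.01):
    """Grid search for the maximum of a central-difference second derivative."""
    best_x, best_val = lo, -math.inf
    n = int(round((hi - lo) / step))
    for j in range(1, n):
        x = lo + j * step
        d2 = (logistic(x + step, b, c, d, e)
              - 2.0 * logistic(x, b, c, d, e)
              + logistic(x - step, b, c, d, e)) / step ** 2
        if d2 > best_val:
            best_x, best_val = x, d2
    return best_x

b, c, e, ft = -0.5, 0.0, 25.0, 500.0
plateaus = [4500.0, 5000.0, 5500.0]   # only the upper asymptote d varies
ct_values = [ct_threshold(ft, b, c, d, e) for d in plateaus]
sdm_values = [cq_sdm_numeric(b, c, d, e) for d in plateaus]

for d, ct, sdm in zip(plateaus, ct_values, sdm_values):
    print(f"d = {d:.0f}: Ct = {ct:.3f}, CqSDM = {sdm:.2f}")
```

In this sketch the threshold cycle moves with the plateau height while the numerically located second-derivative maximum stays on the same grid point, close to the closed-form value.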
A potential alternative to using scale-insensitive C q methods is to rescale all curves to the same final fluorescence magnitude before using threshold-based C t estimation. This normalization approach was initially advocated by Larionov and coworkers 26 , who demonstrated improved standard curve regression statistics after such normalization. Indeed, normalizing the fluorescence values within the interval [0, 1] has the same effect as SDM-based estimation: all periodicity and non-randomness in C t values is removed (Fig. 4C), although it appears that normalization does not completely remove intrinsic autocorrelation (Ljung-Box test p-value = 0.1). These results also pertain to all other datasets presenting periodicity (data not shown).
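The normalization idea can be sketched as follows. The example assumes baselined logistic curves that differ only by a multiplicative, periodic gain; the gain model, the thresholds (500 on the raw scale, 0.1 on the normalized scale) and the helper names are hypothetical choices made for illustration.

```python
import math

def logistic(x, b=-0.5, c=0.0, d=5000.0, e=25.0):
    """Idealized baselined qPCR curve (illustrative parameter values)."""
    return c + (d - c) / (1.0 + math.exp(b * (x - e)))

def threshold_cycle(curve, ft):
    """Fractional cycle at which the curve first crosses ft (linear interpolation)."""
    for i in range(1, len(curve)):
        lo, hi = curve[i - 1], curve[i]
        if lo < ft <= hi:
            # curve[i - 1] holds cycle i, curve[i] holds cycle i + 1
            return i + (ft - lo) / (hi - lo)
    return float("nan")

def normalize_unit(curve):
    """Rescale a curve into the interval [0, 1]."""
    lo, hi = min(curve), max(curve)
    return [(f - lo) / (hi - lo) for f in curve]

# Assumed periodic gain per well: period 12, amplitude +/-10 %.
gains = [1.0 + 0.1 * math.sin(2 * math.pi * k / 12) for k in range(96)]
curves = [[g * logistic(x) for x in range(1, 46)] for g in gains]

ct_raw = [threshold_cycle(cv, 500.0) for cv in curves]
ct_norm = [threshold_cycle(normalize_unit(cv), 0.1) for cv in curves]
spread_raw = max(ct_raw) - min(ct_raw)
spread_norm = max(ct_norm) - min(ct_norm)
print(f"Ct spread, raw curves:        {spread_raw:.3f} cycles")
print(f"Ct spread, normalized curves: {spread_norm:.2e} cycles")
```

As noted in the Discussion, this only works when the plateau is actually recorded (or reliably estimated), since the maximum of a truncated curve is not a stable reference.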
Cq value periodicities acquired by published algorithms and vendor's software. In a next step, we interrogated scale-sensitivity and periodicity in published data, where Ct/Cq values are available from a variety of quantitation methods. We extended these considerations to the Cq estimation procedures of the methods compared in Ruijter et al. 18 by similarly analysing the supplied Cq, E and F0 values for the '380-replicates' data provided in their supplement. Specifically, we looked for putative periodicity in these parameters obtained from six different qPCR quantitation methods: LinRegPCR, FPKM, DART, FPLM, Miner, and 5PSM (Supplemental File 3). Four of these (LinRegPCR, FPKM, DART and FPLM) deliver periodic Cq values, while two (Miner, 5PSM) do not (Supplemental Fig. 1A). The estimated efficiencies exhibit no periodicity (Supplemental Fig. 1B), while the F0 values are periodic for LinRegPCR and FPLM (Supplemental Fig. 1C). The F0 values from the mechanistic MAK2 model display strong periodicity, while the Cq values from the Cy0 method are random (Supplemental Fig. 1D+E). These observations clearly confirm that methods employing first- or second-derivative maxima (Miner, 5PSM, Cy0) yield non-periodic Cq values, which tallies with our results and mathematical derivations. It is also a logical consequence that F0 values estimated from F0 = Fq/E^Cq using periodic Cq values and non-periodic E values are likewise periodic. In contrast, all E values are non-periodic, or only slightly periodic, when calculated by E = F(Ct)/F(Ct − 1), as both numerator and denominator exhibit the same periodicity (albeit with a shift of one cycle). A potential cause for the lack of periodicity in SDM-based methods could be an increased dispersion (variance) of Cq values, which would obscure any periodic patterns. We therefore reanalysed the different Cq quantification methods from Ruijter et al. 18 with respect to the dispersion of their calculated Cq values (Supplemental Fig. 1F).
We found that the three SDM-based methods (Cy 0 , Miner, 5PSM) deliver C q values with lower dispersion, which manifests in narrower boxplot boxes (in blue) and lower coefficients of variation (c.v., in blue). These results are not surprising as they constitute a re-evaluation of similar C q dispersion results (compare Figure 6B in Ruijter et al. 18 ) and tally with the observations from a replicate dilution set (compare Fig. 3 in Tellinghuisen & Spiess 20 ). Moreover, they should largely be a consequence of the already mentioned decreased sensitivity of these three SDM-based methods to the overall plateau phase scattering.
As the above data were fitted with author-developed algorithms, we inspected whether Cq values also exhibit periodicity when obtained from the actual output of qPCR system analysis software. Indeed, using the 'VIM.CFX96' data to calculate Cq values with the CFX Manager™ software, we observed highly periodic Cq values from both supplied quantitation methods, "Manual threshold" and "Nonlinear regression" (Supplemental Fig. 2).

Effect of periodic Cq values on calibration curve-derived efficiency and copy number estimation.
The presence of periodic Cq values is likely to entail a quantification bias that depends on the location of the selected Cq values within the periodic pattern. To address the question on how large this selection bias can be, we conducted an iterative analysis on the '94-replicates-4-dilutions' dataset 18 . We previously demonstrated that this dataset also exhibits extensive periodicity in fluorescence values 20 . Similar to the '380-replicates' dataset analysed in this work, a fixed threshold estimation of the lowest dilution set (15000 copies) at F t = 500 results in 94 periodic C t values (Supplemental Fig. 4A). Using all combinations of the two most extreme C t values of the lowest (15000 copies) and highest (15 copies) dilution as well as all 94 C t values of the two intermediate dilutions (1500 and 150 copies), we created 34968 linear regressions for calibration-based absolute quantitation (Supplemental Fig. 4B). Efficiencies calculated from the slopes of the regression curves were spread within a window of 1.79 to 2.19 (Supplemental Fig. 4C), while copy numbers estimated at C t = 30 varied from 28 to 86 (Supplemental Fig. 4D). These findings demonstrate that i) efficiency estimation is highly dependent on the combination of C t values used for constructing the regression curve and ii) estimated copy numbers for unknowns must be viewed with caution as they can spread over a large interval.
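How a shift in the selected Ct values propagates into a calibration curve can be sketched in a few lines. The example assumes the common convention in which the efficiency is the per-cycle amplification factor, E = 10^(−1/slope), for a regression of Ct on log10(copies); the Ct values, the intercept of 38 and the ±0.3 cycle shifts are invented for illustration and are not the values from the '94-replicates-4-dilutions' dataset.

```python
import math

def linear_fit(x, y):
    """Ordinary least-squares slope and intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx

def calibrate(copies, ct):
    """Return (efficiency, function mapping a Ct value to estimated copies)."""
    slope, intercept = linear_fit([math.log10(n) for n in copies], ct)
    efficiency = 10.0 ** (-1.0 / slope)
    return efficiency, (lambda c: 10.0 ** ((c - intercept) / slope))

copies = [15000, 1500, 150, 15]
ideal_slope = -1.0 / math.log10(2.0)            # perfect doubling per cycle
ct_ideal = [38.0 + ideal_slope * math.log10(n) for n in copies]
eff_ideal, predict_ideal = calibrate(copies, ct_ideal)

# Hypothetical selection bias: the two outer dilutions are picked from
# opposite extremes of a periodic Ct pattern (shifts of -0.3 and +0.3 cycles).
ct_biased = [ct_ideal[0] - 0.3, ct_ideal[1], ct_ideal[2], ct_ideal[3] + 0.3]
eff_biased, predict_biased = calibrate(copies, ct_biased)

print(f"efficiency, ideal Ct values:  {eff_ideal:.3f}")
print(f"efficiency, biased Ct values: {eff_biased:.3f}")
print(f"copies at Ct = 30, ideal vs biased: "
      f"{predict_ideal(30.0):.1f} vs {predict_biased(30.0):.1f}")
```

Even sub-cycle shifts at the ends of the dilution series change the slope, and with it both the efficiency estimate and the copy number read off the curve.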
Factors contributing to qPCR periodicities. In principle, at least three factors could contribute to such periodicity effects: i) uneven thermal distribution of the Peltier block system, resulting in well-to-well differences in E, which in turn influence the amount of amplicon formed, ii) bias and heterogeneity of the optical detection system and iii) pipetting-induced patterns, e.g. from uneven and tip-dependent deposition with multichannel pipettes. The first of these cannot account for the observed periodicity in the baseline fluorescence, where there is negligible signal from the amplicons. To address sources ii) and iii), we performed a simple experiment in which qPCR mastermixes without template, but containing SybrGreen, ROX or 150 nM Oligo-dT20-Cy5, were cycled and scanned in the corresponding channels (Fig. 5). The mastermix was deposited in all 96 wells with a single-channel pipettor in order to avoid any periodic volume differences (source iii). We selected the CFX96 (Biorad) system because of its strong periodicity in fluorescence values during qPCR amplification (Fig. 3). Even in the absence of amplification, ROX (Fig. 5A), Cy5 (Fig. 5C) and, to a lesser extent, SYBR Green (Fig. 5B) displayed highly similar periodic fluorescence patterns at cycles 1, 10 and 20. These results support source ii), optical detection effects, as the primary source of the periodicity, consistent with the same being responsible for overall scale variability in the amplification profiles.

Discussion
qPCR amplification profiles commonly display significant variability in their intensity scale. Using autocorrelation analysis, we have shown that for many block-based instruments, such effects are periodic in the sample number even after baselining, leading to similar periodicity in threshold-based estimates of the quantification cycle Cq. The observed periodicities of ~12 for 96-well block systems and ~24 for a 384-well system suggest a correlation with block architecture (number of columns). Our passive dye experiments indicate that this effect is very likely due to optical detector bias; however, positional block temperature effects on dye fluorescence magnitude may also play a role 33. Because fluorescence periodicity occurs in the absence of any DNA template, we rule out possible influences on qPCR amplification efficiency, as proposed for positional bias in Cq and melting-curve-derived Tm values 21,22,29. The periodicity of ROX fluorescence suggests normalizing SybrGreen fluorescence by ROX fluorescence through F_i,k(SYBR)/F_i,k(ROX), which is a widely applied procedure. Applying this approach to the 'VIM.CFX96' data with the corresponding ROX fluorescence at cycle 20 certainly decreases the magnitude of the observed periodicities (Fig. 5D, lower panel) from a range of [−100, 100] to [−0.02, 0.02]; however, the periodicity as such persists. In addition, we have observed a decrease in ROX fluorescence during cycling (Supplemental File 4), which may pose a problem for this approach. At this point it must be emphasized that we discovered highly periodic Cq values obtained from hardware systems that are claimed to have an optical detection system which eliminates the need to use passive reference dyes, such as the CFX96 (Biorad) system (compare vendor's info 32). Finally, positional differences in qPCR efficiency would likely result in position-dependent amplicon yield.
In another experiment (data not shown), we found no correlation between Fmax and amplicon yield (as obtained from capillary electrophoresis), similar to observations by others 26. While heterogeneity in the robotic pipetting systems might also play a role 27,28, this cannot explain the periodicity observed when we used a single-channel pipette to charge the wells. The most plausible explanation may be found in the different optical architectures of the qPCR systems. However, at this point we are not in a position to give a definitive explanation of which optical factors (e.g. spherical aberration of the lens/mirror system or "optical vignetting" in the field-of-view periphery) drive the periodicity. For instance, block-based systems that did not show such effects (iQ5 and LC96) conduct simultaneous optical measurements of all samples (CCD camera and per-well fibre optics, respectively), instead of acquiring signals through a column-wise scanning optical shuttle (CFX384, CFX96, StepOne). On the other hand, another CCD camera/mirror system (LC480) clearly exhibited periodicity, so that a fixed detection architecture is not necessarily free of periodic fluorescence readouts.
These detection bias effects are clearly undesirable, and it must be recognized that their effects on C q can be completely eliminated by using a scale-insensitive definition for C q (e.g., SDM, Cy0, relative threshold) or by normalizing the data to constant scale 26 . For the latter case, it is necessary to record data well into the plateau region, or to use a whole-curve fitting method that reliably estimates the plateau. Many vendors include an SDM option for C q in their software, but the virtues of this and other scale-insensitive C q markers over C t have been underappreciated. It is further worth noting that the first two approaches also neutralize effects of true variability in the amplicon yield from random cycle-to-cycle variation in the amplification efficiency 30 . Most importantly, the observed periodicity, while interesting in itself, is merely an indicator that any kind of scale variability, periodic or random, propagates into threshold-based C t values. Consequently, the widespread application of fixed threshold-based quantitation is highly questionable, although it is established as the most commonly used qPCR quantification method. We see various reasons for this: a) it was the first method introduced and implemented in vendors' software during the dawn of qPCR technology, so that scientists might perceive it as robust and well-tested, b) it seems more intuitive and familiar to base the analysis on a single fixed parameter, similar to other analytic methods, c) unawareness of scale effects on fixed location indices in a sigmoidal curve, and d) lack of implementation in some qPCR software.
We advise researchers to use our approach to examine their qPCR data for periodicity, as this problem appears to be unacknowledged by the instrument vendors. To the best of our knowledge, none of the existing and previously reviewed qPCR software 31 can identify periodic patterns in qPCR data. Our web application, www.smorfland.uni.wroc.pl/shiny/period_app/, fills this gap, making it easy for users to examine their data for periodic and other non-random patterns (more details in Supplemental Fig. 5).

Materials and Methods
Datasets. For the analysis of periodicity in this work, we have employed one published 384-reaction technical replicate dataset ('380-replicates' 18) and five new 72 to 96-reaction technical replicate datasets, obtained with different amplicons, qPCR hardware systems, and detection chemistries 23.

Autocorrelation analysis. The principal approach used in this work is as follows: using either raw fluorescence values F_i at a selected cycle number i, or Cq values estimated at a defined fluorescence threshold level F_Cq (Fig. 2A), we fit a quadratic model of the form $\hat{y}_i = a x_i^2 + b x_i + c$ to the data (Fig. 2B). The rationale behind this approach is our observation of curvature and slope in F_i and Cq values with statistically significant quadratic coefficients. A Loess smoother with a span of 0.1 is then applied to the residuals $R_i = y_i - \hat{y}_i$ of the fit for the single purpose of an initial visualization of periodic patterns (Fig. 2C). A Wald-Wolfowitz (Runs) test and a Ljung-Box test are applied to the residuals in order to estimate the significance of non-randomness and autocorrelation, respectively. We then use the residuals R_1, R_2, …, R_i from the fit to calculate the autocorrelation r_k at lags k = 1, 2, …, n. In a final step (Fig. 2D), we create the correlogram of all autocorrelations r_k and use the automatic peak detection R procedure findpeaks to identify the period. The complete pipeline is implemented in the CheckPeriod function of Supplemental File 2.
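The core of this pipeline (quadratic detrending, autocorrelation of the residuals, peak search in the correlogram) can be sketched as follows. This is not the authors' CheckPeriod function, which is written in R. It is a minimal Python sketch that assumes the standard sample autocorrelation estimator, r_k = Σ_{i=1..n−k} (R_i − R̄)(R_{i+k} − R̄) / Σ_{i=1..n} (R_i − R̄)², replaces findpeaks with a simple first-local-maximum rule, and omits the Loess smoother and the Runs and Ljung-Box tests. The synthetic signal and all function names are hypothetical.

```python
import math
import random

def solve_linear(A, v):
    """Gaussian elimination with partial pivoting for a small linear system."""
    m = [row[:] + [vi] for row, vi in zip(A, v)]
    size = len(m)
    for col in range(size):
        piv = max(range(col, size), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, size):
            f = m[r][col] / m[col][col]
            for cc in range(col, size + 1):
                m[r][cc] -= f * m[col][cc]
    x = [0.0] * size
    for r in range(size - 1, -1, -1):
        tail = sum(m[r][cc] * x[cc] for cc in range(r + 1, size))
        x[r] = (m[r][size] - tail) / m[r][r]
    return x

def quadratic_residuals(y):
    """Residuals of a least-squares quadratic fit over the sample index."""
    n = len(y)
    # Rescale the index to [-1, 1] to keep the normal equations well conditioned.
    t = [2.0 * i / (n - 1) - 1.0 for i in range(n)]
    rows = [[ti * ti, ti, 1.0] for ti in t]
    A = [[sum(r[p] * r[q] for r in rows) for q in range(3)] for p in range(3)]
    v = [sum(r[p] * yi for r, yi in zip(rows, y)) for p in range(3)]
    beta = solve_linear(A, v)
    return [yi - sum(bp * rp for bp, rp in zip(beta, r)) for r, yi in zip(rows, y)]

def autocorrelation(res, max_lag):
    """Sample autocorrelation r_k for lags k = 1 .. max_lag."""
    n = len(res)
    mean = sum(res) / n
    dev = [r - mean for r in res]
    denom = sum(d * d for d in dev)
    return [sum(dev[i] * dev[i + k] for i in range(n - k)) / denom
            for k in range(1, max_lag + 1)]

def first_peak_lag(r, min_height):
    """Lag of the first local maximum above min_height (r[0] is lag 1)."""
    for j in range(1, len(r) - 1):
        if r[j] > r[j - 1] and r[j] > r[j + 1] and r[j] > min_height:
            return j + 1
    return None

# Synthetic example: quadratic trend + period-24 pattern + small noise.
random.seed(1)
n = 384
signal = [4000 + 0.5 * k + 0.002 * k * k
          + 100 * math.sin(2 * math.pi * k / 24) + random.gauss(0, 1)
          for k in range(n)]
res = quadratic_residuals(signal)
r = autocorrelation(res, max_lag=n // 4)
period = first_peak_lag(r, min_height=2 / math.sqrt(n))
print("detected period:", period)
```

The height cut-off of 2/√n is a common rule-of-thumb bound for white noise and is used here only to skip insignificant bumps in the correlogram; real data may need a more robust peak finder.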
Estimation of other curve parameters. Cq values based on a defined threshold fluorescence Ft, termed Ct in the following, were estimated by the inverse functions of the sigmoidal models. Cq values based on second-derivative maxima (CqSDM) were calculated by finding the cycle corresponding to the maximum value of the second-derivative function. When fitting with cubic splines, the root of the inverse or of the third derivative of the spline function was employed to estimate Ct or CqSDM, respectively.
The maximum fluorescence F max of a curve ("plateau phase") was based on parameter d (upper asymptote) of a five-parameter sigmoidal model 8 .
Computational aspects and reproducibility. All analyses in this work were conducted with the R statistical programming environment (www.r-project.org). The qpcR package 24 was used for qPCR curve fitting and parameter estimation. To comply with the increasing need for computational reproducibility 25 , we provide data in Supplemental File 1 and the R script/workspace in Supplemental File 2, from which readers can reproduce all our figures.
Web-based analysis of periodicity in qPCR data. A web-based analysis platform for investigating qPCR periodicity was developed with the Shiny framework for R 25 . Here the user can upload her/his qPCR data, either fluorescence values at a defined cycle or C q values, and analyse the data with the pipeline given in Fig. 2. The web application is to be found at www.smorfland.uni.wroc.pl/shiny/period_app/. Overview screenshots of this application are given in Supplemental Fig. 5.