Temperature fluctuations in a changing climate: an ensemble-based experimental approach

There is an ongoing debate in the literature about whether the present global warming is increasing local and global temperature variability. The central methodological issues of this debate relate to the proper treatment of normalised temperature anomalies and trends in the studied time series which may be difficult to separate from time-evolving fluctuations. Some argue that temperature variability is indeed increasing globally, whereas others conclude it is decreasing or remains practically unchanged. Meanwhile, a consensus appears to emerge that local variability in certain regions (e.g. Western Europe and North America) has indeed been increasing in the past 40 years. Here we investigate the nature of connections between external forcing and climate variability conceptually by using a laboratory-scale minimal model of mid-latitude atmospheric thermal convection subject to continuously decreasing ‘equator-to-pole’ temperature contrast ΔT, mimicking climate change. The analysis of temperature records from an ensemble of experimental runs (‘realisations’) all driven by identical time-dependent external forcing reveals that the collective variability of the ensemble and that of individual realisations may be markedly different – a property to be considered when interpreting climate records.

to yield approximate dynamical similarity to the terrestrial atmosphere in terms of thermal Rossby number, Ro T , and Taylor number, Ta (Fig. 1c, Methods) 16,17 . We log simultaneously (sampling rate 1 Hz, differential resolution 0.05 K) point-wise local temperature values via five digital co-rotating thermometers, three of which penetrates into the free top surface of the working fluid cavity from above, spaced uniformly along a radius (Fig. 1b). The spatial average, T i (t), of these signals from three different 'latitudes' is used here as a surrogate for 'meridional' mean temperature (index i refers to the i-th ensemble member, i.e. experimental run). Since there is no azimuthal bias in the annulus -as there is, e.g. in the terrestrial atmosphere, due to land-ocean differences, topographical effects, etc. -we would expect the statistical properties of temperature fluctuations to be the same at different azimuths. Thus, it is safe to assume that such a longitudinal average can also be considered a proper surrogate of the global average. The other two identical sensors measure the forcing temperatures at the center (inner cylinder) and in the outer sidewall, whose difference ΔT quantifies the temperature contrast driving the sideways convection.
The novelty of our experiments lies in the procedure of intentionally changing the thermal boundary conditions in time, while keeping the rotation rate fixed (so that a 'day' i.e. one revolution of the tank takes P = 3 s). After a 'base period' of ca. 2600 revolutions of constant ΔT the cooling element at the center is turned off. Following this abrupt change in heat flux T i (t) is kept logged for another 3000 revolutions of time, corresponding to a 'global warming' scenario with gradually increasing polar temperatures. We note, that it is generally accepted that the North-South temperature contrast has been decreasing (and will continue to decrease) in the Northern Hemisphere due to climate change as reported, e.g. in the latest assessment report of IPCC 20 . The recent alarming findings 21 about the rapidly melting Arctic also underline the existence of this phenomenon, showing twice as fast warming of the Arctic as that of the global mean.

Results
Based on our criteria for the external forcing sequence ΔT i (t) to be accepted as 'identical' (Methods) the analysis was restricted to nine experimental runs and 10 000 s of continuous data from each of them with the onset of 'climate change' (hereafter marked as time zero, t = 0) occurring exactly at half time in all cases. The forcing ΔT(t) in each considered realisation (Fig. 2a) follows an exponential decay with characteristic timescale τ = 1085 s for t > 0. The system's response T i (t) in each run, and even their ensemble average 〈T〉(t) shows significant fluctuations ( Fig. 2b and c) due to the geostrophic turbulent flow dominated by irregular cyclonic (warm) and anticyclonic (cold) vortices 18 .
Addressing variability in the system we first demonstrate the difference between the 'traditional' measuresbased on single realisations -and the ensemble statistics through the example of standard deviations. We find that the centered running variances (within 501 s long windows) of the residuals of T i (t) following a 4-degree polynomial detrending in the different realisations may exhibit seemingly opposite tendencies (Fig. 3a), and are thus not representative. In the two chosen paths, one reaches the largest variability in the t < 0 base period. Although the Figure 1. Thermal convection in planetary atmospheres and in the laboratory. (a) Schematic diagram of the mid-latitude atmosphere of Earth, illustrating the basic boundary condition with a meridional temperature contrast ΔT between the warm equator (red) and a polar region (blue). The system is rotating at angular velocity Ω. (b) Sketch of the differentially heated rotating annulus with its geometric parameters (a = 4.5 cm, b = 12 cm, d = 4.5 cm) for which the boundary conditions are similar to those of the real atmosphere: warm outer rim (red), cold inner rim (blue). The locations of the three co-rotating thermometers, which were submerged by 0.5 cm into the bulk from above the water surface are also shown (black dots). The average of these signals at each time t yielded 'meridional mean temperature' T(t). (c) Schematic regime diagram for rotating laterally heated systems 16,17  statistical comparison of the t < 0 and t > 0 intervals yields no significant difference, the mean and median indeed are somewhat smaller in the latter case (not shown). Thus -if only this particular record was known -one could speculate that the fluctuations of temperature generally decreased in the 'climate change' phase compared to the base period. The other exemplary case shows just the opposite trend: a slight, statistically insignificant increase in mean variability after t = 0.
Meanwhile, in terms of the ensemble variance σ e , i.e. the standard deviation of the nine considered realisations T i (t)(i = 1, …, 9) around 〈T〉(t) at each time instant t (Fig. 3a), the system's real sensitivity to changing ΔT is revealed (Fig. 3b). The mean of σ e (t) shifts significantly by ca. 6.5% from 0.35 to 0.38 °C at t > 0 and, more strikingly, the histogram changes from left-modal (skewness: 0.37) to right-modal (skewness: −0.10) after the initiation of 'climate change' . This result indicates that the paths of the realisations differ from each other more in the presence of nonstationary forcing than in the base period: even if the transition is hardly noticable in the variance patterns of one single realisation its effect on the whole ensemble is apparent.
The typical time difference τ c between successive local extrema of the fluctuating temperature records T i (t) serves as a measure of the temporal variability of the 'weather' in the system. The local maxima (minima) indicate the crossing of cyclonic (anticyclonic) eddies at the thermometer locations. We calculate the peak-to-peak time differences for each ensemble members, after a removal of a 5th order polynomial trend and applying a 61-point running mean for smoothing. The statistics of the obtained values of τ c combined from all experimental runs shows a significant shift when comparing the t < 0 and t > 0 periods; the mean increased from 199.2 s to 214.7 s (by around 8%), the median from 183 s to 209 s (by around 14%). This finding is consistent with the theoretical expectations: smaller ΔT yields cyclonic and anticyclonic eddies of smaller size, scaling with the so-called Rossby deformation radius L R , i.e. proportionally to the square root of the imposed temperature gradient 22 Whereas the drift velocity c of baroclinic eddies is determined by the thermal wind balance and scales as c ∝ ΔT μ , where the exponent μ has been found to be between 0.88 and 1.17 in earlier experiments 19,23 , μ = 1 being the theoretical value. Thus, the crossing timescale is expected to follow a τ c ≈ L R /c ∝ ΔT 0.5−μ dependence, yielding an increasing trend with decreasing ΔT(t) in time. Even in this respect, single-realisation statistics could be misleading: due to the (geostrophic) turbulent nature of the flow the values of τ c exhibit large variance in all cases that can easily suppress the slight trend.
Further exploring temporal correlations we apply the method of detrended fluctuation analysis (DFA) 24, 25 , a strandard procedure for measuring the variability of a signal around its local trend in time windows of length n samples as a function of n. DFA4 removes local (cubic) trends, thus it is more suitable for our present purpose than e.g. Fourier transforms, since DFA4 can readily handle nonstationarities (Methods). The DFA4 spectra from all realisations (Fig. 4a) follow the same scaling properties, exhibiting power law-type scaling with two scale breaks. Below t n ≈ 40 s and above t n ≈ 400 s the scaling exponents are δ = 0.87 and 1.1, respectively, implying 1/f noise-like correlated fluctuations. Between these crossover points δ = 2.1 is found, characteristic for geostrophic turbulence 26,27 : it can be shown that if the DFA4 spectra exhibit power-law scaling then the Fourier power spectrum of the time series in the frequency domain, S(ω) also does, following S(ω) ∝ ω −β , where β = 2δ − 1 connects the two exponents 28 , yielding in the present case, β ≈ 3. This is in good agreement with the theoretical result for isotropic geostrophic (two dimensional) turbulence 29 . It is to be noted that the ΔT-dependence of the exponent β has been analyzed via comparing the ensemble-averaged power spectra of different (overlapping) sections of the time series T i (t), but no trend could be established (for more details, we refer to the Supplementary information). Thus, it can be stated that the 'quality' of geostrophic turbulence did not change throughout the 'climate change' period.
Concerning the differences between the stationary (t < 0) and 'changing' (t > 0) records (turquoise and orange curves in Fig. 4a, respectively), their fluctuations up to a window size t n * ≈ 160 s are perfectly identical in the statistical sense. This is also apparent from the averages of the two sets of spectra in Fig. 4a (red and black thick curves). On the t n > t n * scale, however, the fluctuations of the 'changing' records are significantly larger. Note, that this timescale is still about an order of magnitude below τ = 1085 s i.e. the characteristic time of the 'climate change' ΔT(t > 0), but is of the same order as the empirical delay time of ∼200 s of the dynamics estimated from the crossover point of the linear temperature trends of 〈T〉(t) in the two periods (Fig. 2d).
Also shown are the DFA4 spectra of the ensemble averages 〈T〉(t < 0) (blue line) and 〈T〉(t > 0) (green line) following the same scaling and the same separation of the stationary and 'changing' branches at t n * , as discussed above. Multiplying the fluctuation spectra of the ensemble averages by = N 3 (N = 9 being the sample size of the ensemble) yields perfect match with the average of the single-realisation spectra. This property shows that the fluctuations of different realisations are perfectly uncorrelated on all time scales t n < τ: uncorrelated fluctuations average out following N 1/ , whereas 'ensemble-correlated' fluctuations would remain unaffected by the ensemble averaging. Here the latter are absent; no 'collective variability' can be identified in the ensemble, despite of the identical forcing sequence ΔT(t). Obviously, on the time scale of τ collective behaviour does exist -the trend itself -but such large time windows are not sampled properly and are not evaluated in the spectra. The lack of collective fluctuations on the sub-τ scales highlights the largely nonlinear nature of the system's response to changing forcing.
The time-scale-dependence of variability amplification caused by 'climate change' is visualized in Fig. 4b, where the ratios of DFA4 fluctuation spectra in the 'changing' phase relative to the 'base period' of the same run -and those of the ensemble average -are plotted. (Due to the logarithmic vertical axis this practically reflects the differences of the respective graphs in Fig. 4a). Here again it becomes manifest that 'climate change' does not affect the variability on the time scales below t n * from the ensemble average point of view (the average amplification is close to zero), still, one can also easily spot individual realisations with either markedly increased or decreased variability in this spectral band as well. Above t n * all realisations exhibit clearly amplified variability. For the ensemble average it reaches a maximum increase of 47% at around t n * and stays around 20% for the t n > t n * timescales up to t n ≈ 800 s.
To determine the statistical significance of the above results, we have carried out Monte Carlo statistical testing using a standard inverse-Fourier surrogate data method 30 (Methods). The null-hypothesis of the testing is that there are no fundamental changes in the dynamics of fluctuations between the t < 0 'base period' and the t > 0 'warming phase' . If this was the case, the fluctuations during the warming would exhibit very similar distribution and spectral properties as in the base period, superimposed onto a warming trend. In order to model this hypothesis, 10 model 'warming' time series were created for each of the 9 ensemble members (i.e. 90 time series in total) using the Fourier amplitude spectra of their corresponding 'base periods' but shuffling their phases. The resulting model series were then superimposed onto a polynomial warming trend, imitating the temporal development of the ensemble average 〈T〉(t > 0), shown in Fig. 2a, to yield a realistic increasing trend (Methods).
Comparing the DFA4 spectra of the model series to their respective base periods in the considered timescale range yields the 'amplification factors' shown with turquoise curves in Fig. 5. The green curves corresponding to the actual ensemble members and the thick black curve denoting the ensemble average are repeated from Fig. 4b. The red dashed curve shows the mean of the model results and the dotted curves represent the ±3σ interval. The vertical domain covered by the turquoise curves can be understood as a measure of variability that is due to finite-size effects and the imposed trend itself. It is apparent, however, that the measured ensemble data follow a markedly different distribution, thus the null-hypothesis in the considered timescale-range of t n > t n * can be rejected with a high confidence. Towards the larger timescales, comparable to the typical eddy-crossing times τ c and also to the characteristic time of baroclinic adjustment (as mentioned earlier), a clear increase of fluctuations can be observed, indicating real dynamical differences, not merely statistical artifacts.

Discussion
The present work provides, to the best of our knowledge, the first results from any laboratory experiment aiming to model the effects of climate change on mid-latitude atmospheric circulation. The authors do not claim that the lessons learned from the presented experimental minimal model could be directly applied or compared to the processes of the Earth system and the ongoing climate change. Perfect hydrodynamic similarity is impossible to achieve, thus the ratios between all of the relevant timescales (corresponding to the rotation, baroclinic adjustment, crossing time of cyclones, the changing of the temperature contrast ΔT) cannot be set to scale properly. Nevertheless, the studied model as a dynamical system does share some important features with the climate system on the conceptual level: both are rotating, turbulent hydrodynamic systems, driven by the incoming differential heat fluxes, a forcing that changes in time. Due to the time-dependence of the forcing, these systems cannot reach an equilibrium state. Therefore, if one intends to survey the variability between the possible outcomes of such a process at any time instant, it is essential to consider a whole ensemble of realisations, subject to the same forcing scenario and differing only in their initial conditions.
Despite of the large variability of the ensemble that is due to the nonlinear nature of the processes and the finite length of the studied records, the fluid dynamical interpretation of the observed flow phenomena is relatively straightforward. The system is in the state of well-developed geostrophic turbulence, that yields a power-law scaling in the power spectra of the fluctuations in both the wavenumber-and the frequency domain (Supplementary). The characteristic size of the cyclonic and anticyclonic eddies (corresponding to warm and cold temperature anomalies, respectively) tends to decrease as the 'meridional' temperature gradient drops, in agreement with the theoretical expectations. In parallel, the zonal drift velocities decrease even faster during the process, therefore the characteristic timescale of 'weather change' at a fixed measurement location increases significantly. This timescale is of the same order as the typical response time of the flow to the changes in the forcing (baroclinic adjustment) therefore fluctuations were found to increase markedly in this spectral band.
'One experiment is no experiment' has been the mantra of researchers for ages, but the idea behind the saying has always been the separation of measurement errors from significant signals. Here, however, the fluctuations are just as inherent, fully deterministic and dominant features of the underlying nonlinear processes -just like in the Earth system -as the large-scale trends themselves. The reason for the increasing ensemble variance lies in the system's extreme sensitivity to initial conditions -a ubiquitous property of chaotic, long-range correlated systems. The authors firmly believe that the only proper approach for carrying out laboratory experiments on non-stationary turbulence would be conducting and systematically evaluating, ensembles of runs. In observational climatology this is not a viable option; we have only one Earth. Yet, the present experimental demonstration may help to increase awareness of the fact that a climate-like dynamical system can undergo a transition towards larger variability even without noticeable effects on the temporal fluctuations of one particular realisation. This message applies to the GCM community as well: climate variability information from a single numerical run (e.g. CO 2 doubling scenario) could be misleading as it does not necessarily represent the full complexity of the underlying ensemble dynamics.

Methods
Non-dimensional parameters, hydrodynamic similarity. In large-scale environmental flows Rossby number Ro ≡ U/(2|Ω|L) -with U being the magnitude of the horizontal flow velocity, L the horizontal extent of the domain and Ω the angular frequency of the planetary rotation -quantifies the characteristic ratio of hydrodynamic acceleration and Coriolis acceleration. In the dynamics of atmospheric convection the thermal boundary conditions and the relationship ρ(T) between the density and temperature of the fluid parcels are of fundamental importance just as well. A convenient nondimensional combination for quantifying all these factors is the thermal Rossby number Ro T (or Hide number), defined as where α is the coefficient of volumetric thermal expansion for the fluid, d is the vertical scale, and ΔT is the 'meridional' temperature contrast 22 . For our calculations the annular gapwidth b − a was taken as horizontal scale L for the experiments. Besides Ro T the kinematic viscosity ν of the medium also plays an important role in the dynamics; it introduces a 'viscous cutoff ' that dissipates too weak thermal winds and also damps the baroclinic instability of larger wavenumbers. This effect is parametrised by Taylor number Ta that accounts for the ratio of rotational and viscous effects, and reads as 2 Ro T and Ta are used in tandem to characterize the different dynamical regimes in rotating, thermally driven systems, such as planetary atmospheres and their minimal models in the laboratory (Fig. 1c).
Experimental procedures, data selection. For a detailed description of the experimental wave tank and the heating and cooling mechanism we refer to ref. 19. The temperature records were obtained using an ALMEMO temperature sensor array of NiCr sensors with a relative resolution of 0.05 K and 1 Hz sampling rate. The sensors were fixed onto a co-rotating mast above the free surface of the rotating annulus, and penetrated by 0.5 cm into the water surface. The data was transported in real-time via the co-rotating data aquisition module ALMEMO 8590-9, equipped with UHF/Bluetooth antenna. The initial temperature of the working fluid (de-ionised water) was set to 24.5 ± 0.5 °C before each measurement. After switching on the heating thermostats for the differential heating a transient period of 7600 s followed in order to reach quasi-equilibrium dynamics in each experimental run. Only after this period we started to log the data of the 5000 s long 'base period' . The nine experimental runs considered in this work were selected based on the criterion that the forcing time series ΔT(t) of each realisation must not deviate by more than 0.3 °C from the ensemble average 〈ΔT〉 at any time t (two of the original 11 experiments were thus excluded). The thermographic images of Fig. 2c were obtained by an InfraTec VarioCam infrared camera mounted above the set-up, operating in the spectral wavelength range of 7.5-14 μm. These thermograms can be considered to represent surface temperature structures, since the penetration depth of this wavelength range into water is less than a millimeter. The images were taken during an additional experiment following the same forcing sequence, but with the thermometers removed from the working fluid for the sake of visibility. Therefore this run was not a member of the ensemble.
Detrended fluctuation analysis. DFAp 24, 28 is a robust and easily implemented analysis of the temporal scaling properties of a fluctuating and non-stationary bounded time series x t . Firstly a summation is applied to yield a cumulated (unbounded) time series X t : where 〈x〉 denotes the mean of the time series. Next, the profile is divided into non-overlapping time windows Y j of length n and for each a local least square polynomial fit ξ(p − 1) j of order p − 1 is calculated. Finally, the fluctuation is obtained as the root-mean-square deviation from the trend as where N is the number of n-sized windows of the time series. Note, that care must be taken to the fact that the congruence between N and the length of the time series is often not zero. To preserve the remaining section the applied algorithm repeats the same dividing procedure from the end of x t , thus, practically 2N segments are generated and the applied fluctuations are combined accordingly. We determined the DFAp fluctuation functions with p = 2 … 8 for the time series T i (t) and observed that no significant differences appear between the spectra for p > 4, therefore we limited our presentation of the results for the DFA4 computations only.
Surrogate data for the statistical testing. The surrogate data for the model time series were generated using the method developed by Schreiber and Schmitz and described in ref. 30. The implementation of the algorithm is included in the open source software package TiSeAn 3.0.1 for nonlinear time series analysis 31 whose routine 'surrogates' have been used for the present work. The principle of the method is the following: if the null hypothesis was true, the typical realisations of the process are expected to share the same power spectrum and amplitude distribution, thus such model time series need to be generated. This is carried out iteratively by the following procedure from the prescribed distribution and Fourier spectra of the actual data. } is obtained. In a given iteration step, the shuffled data {x n (i) } is brought to the desired sample power spectrum by taking the Fourier transform of {x n (i) }, replacing the squared amplitudes {S k 2,(i) } by {S k 2 } and then transforming back. The phases of the complex Fourier components are kept. This step enforces the correct spectrum but usually the distribution will be modified. Therefore, in the next step the resulting series is rank-ordered to assume exactly the values taken by {x n }. Then, the spectrum of the resulting {x n (i+1) } will be modified again. These steps have to be repeated several times; at each iteration stage the remaining discrepancy between the obtained and the desired spectra and distributions is checked and the iterations continue until a given accuracy is reached. For finite N a convergence in the strict sense is not expected. Eventually, the transformation towards the correct spectrum will result in a change which is too small to cause a reordering of the values. Thus, after rescaling, the sequence is not changed.
For each resulting model series, an increasing (5th order polynomial) warming trend was added. The properties of this warming trend were derived from fitting the polynomial formula to 〈T〉(t) in the 'climate change' period (t > 0). Thus, 90 model time series were obtained -10 for each ensemble member -inheriting the power spectra and the rank-ordering of the original corresponding base period (t < 0) data.