The spatial and temporal domains of modern ecology

To understand ecological phenomena, it is necessary to observe their behaviour across multiple spatial and temporal scales. Since this need was first highlighted in the 1980s, technology has opened previously inaccessible scales to observation. To help determine whether there have been corresponding changes in the scales observed by modern ecologists, we analysed the resolution, extent, interval and duration of observations (excluding experiments) in 348 studies published between 2004 and 2014. We found that observational scales were generally narrow, because ecologists still primarily use conventional field techniques. In the spatial domain, most observations had resolutions ≤1 m² and extents ≤10,000 ha. In the temporal domain, most observations were either unreplicated or infrequently repeated (>1 month interval) and ≤1 year in duration. Compared with studies conducted before 2004, observational durations and resolutions appear largely unchanged, but intervals have become finer and extents larger. We also found a large gulf between the scales at which phenomena are actually observed and the scales those observations ostensibly represent, raising concerns about observational comprehensiveness. Furthermore, most studies did not clearly report scale, suggesting that it remains a minor concern. Ecologists can better understand the scales represented by observations by incorporating autocorrelation measures, while journals can promote attentiveness to scale by implementing scale-reporting standards. Analysing the spatial and temporal extents of 348 ecological studies published between 2004 and 2014, the authors show that although the average study interval and extent have increased, resolution and duration have remained largely unchanged.

The scales at which ecosystems are observed play a critical role in shaping our understanding of their structure and function [1][2][3]. Ecological patterns emerge from temporal and spatial domains that may be coarser or finer than the processes that shape them, which means that investigation across multiple scales is essential for understanding ecological phenomena 1,4. This awareness has grown rapidly since the 1980s 5, accelerated by the need to understand how changes in the global climate, ocean and land systems are affecting everything from individual populations 6 to entire biomes 7, while technological advances in areas such as remote sensing and genetics are making it ever easier to quantify ecological features across a broad and increasing range of scales 2,5.
Given the growing awareness of scale, expanding data-gathering capabilities and the fact that the most comprehensive (and arguably best-known) meta-analyses 8,9 of ecological research scales were published nearly 30 years ago (but see refs 4,10 for more recent reviews), it is both timely and important to assess the scales of contemporary ecological investigation. To address this need, we quantified the spatial and temporal domains of empirical observations that were reported within recently (2004-2014) published ecological studies. We define domain as the distribution of observations within the spectrum of one or more scale dimensions (note: this definition differs from the 'domain of scale' 3 , which is 'a portion of the scale spectrum within which process-pattern relationships are consistent regardless of scale'), and empirical observations as ecological observations collected under uncontrolled or non-manipulated conditions. Empirical observations are critical for developing and testing the models that explain why ecological patterns vary in time and space 1,8 ; therefore, the spatio-temporal domains of observations provide an important indicator of the field's progress towards achieving a holistic, predictive understanding of ecosystems 1,2 .
Our study focused on two dimensions of spatial scale (that is, resolution (grain) and extent) and two of temporal scale (that is, interval and duration) (Table 1). We analysed the observational domains within each of these four dimensions and between pairs of these dimensions. We also assessed two additional dimensions: actual extent (the summed area of spatial replicates) and actual duration (the summed observational time of temporal replicates). We used these to evaluate how much the actual scales of observation (that is, how much space and time are covered by the measurement) differ from the scales they ostensibly represent. These differences may impact how effectively observations characterize ecological phenomena. For one, an increasing gap between actual and ostensible observational scales implies greater interpolation or extrapolation of observed measurements, raising the odds of over-leveraging data. Furthermore, since natural systems are frequently complex, nonlinear and non-random [11][12][13], a larger gap increases the likelihood of data challenges such as censoring (sensu 14) as phenomena may resolve themselves in the space or time between replicates.

Results
We reviewed 348 papers randomly selected from 42,918 published between 2004 and 2014 in the top 30 ecology-themed journals. We extracted scale data from 378 observations of 'natural' (that is, non-experimentally manipulated) ecological features reported within 133 of the reviewed papers (plus an additional 62 cited as the source of observations). Most sampled observations were collected using conventional field methods (80%), followed by automated in situ sensing techniques (12.4%), remote sensing (6.9%) and palaeoreconstruction (<0.8%).
Observational domains within two dimensions. Contrasting resolution with interval revealed that most temporally replicated observations had resolutions of 10 cm² to 1 m² and were revisited at daily to yearly intervals (Fig. 2a). A less dense, oblong concentration of observations bounded on the upper left by monthly to yearly observations at 100 m² resolution and on the lower right by near-daily to monthly observations with 1-10 ha resolution is also evident. The four observational methods had substantially different domains, as indicated by the locations of their median values (see Supplementary Fig. 2): the median domain of field observations had 0.1-1 m² resolution and a monthly interval, whereas remote observations had a coarser median resolution (1,000 m²) but finer median interval (~1 day). Palaeoreconstructions and automated sensing were both finely resolved (median between 10 cm² and 0.01 m²), but automated approaches had an hourly to daily median interval compared with a multi-decadal interval for palaeoreconstructions.
Comparing the interval and duration of temporally replicated observations showed that most observations had daily to decadal intervals and durations of one month to one decade (Fig. 2b). Interval appears to increase with duration; observations lasting one month to one year tend to have daily to monthly intervals, while those lasting one year to one decade tend to have yearly to decadal intervals. This tendency is reflected in the domain medians of the primary observational methods: automated sensing had the finest median interval (hour-day) and shortest duration (month-year), followed by remote sensing (~1 day and 1 year, respectively), field observations (1 month and ~1 year, respectively) and finally palaeoreconstructions (1 decade and millennium, respectively).
Contrasting the two spatial dimensions shows a primary concentration of observations of 10 cm² to nearly 100 m² resolution with extents ranging between ~1,000 and 1,000,000 ha (Fig. 2c). Another prominent concentration consists of higher-resolution (1 cm² to 1 m²), smaller-extent (10-1,000 ha) observations, beneath which lies a third, fainter concentration of 1-1,000 cm² resolution and 1,000 m² to <10 ha extent. These three concentrations suggest that observational extent increases with resolution, which is further evident in the median domain values (and kernel densities; Supplementary Fig. 2) of automated (0.01 m² resolution, 100 ha extent), field (0.1-1 m² resolution, 1,000-10,000 ha extent) and remote (1,000 m² resolution, 1-10 million ha extent) observations. Palaeoreconstructions were the exception, having very fine median resolution (0.01 m²) but large extent (1 million ha), possibly an artefact of small sample size.
There are two primary observational domains within the contrast between duration and extent. The first consists of observations lasting 1 month to 1 decade with extents of 10-1,000 ha, while the second is defined by observations of 1 year to several decades that cover 10,000-1,000,000 ha (Fig. 2d). Three other notable but lesser concentrations are also evident, including small-area observations (0.1-1 ha) covering 1 month to 1 decade, and short-duration, temporally unreplicated observations (≤ 1 day) of either 1-100 ha or 10,000-1,000,000 ha. The median observation from automated sensing (1 year duration, 100 ha extent) lies near the centre of the first major concentration, while the median extents of field (1,000-10,000 ha) and remote (1-10 million ha) observations bound the second major concentration at its upper and lower extents, with the median duration of both observational types falling between 1 month and 1 year.

Differences between actual and ostensible scales.
Observational extent was on average 5.6 orders of magnitude larger than actual extent (Fig. 3a). This difference increased with extent, reaching a maximum of 8.3 between 100 million and 1 billion ha of extent, then falling to 3 orders of magnitude between 1 and 10 billion ha (these extents comprised < 2% of observations, which were primarily collected with remote sensing). Remote observations had the smallest mean difference magnitude (1.9), compared with ≥ 5.7 for the other three methods (Supplementary Fig. 3).
The difference magnitudes between observational duration and actual duration were somewhat smaller, averaging 3.4 and ranging from ~2 for the shortest durations (hour-day) to >4 for observations lasting 1 decade to 1 century (Fig. 3b). As with extent, the difference fell substantially for the longest durations (century to millennia), as these domains were covered by palaeoreconstructions (Supplementary Fig. 3), which show little difference between actual and ostensible duration because coring techniques capture continuous temporal records. The mean difference magnitudes for the other three observational methods ranged from just over 3 (field and automated sensing) to nearly 6 (remote observations).

Potential biases and uncertainties in quantifying scales.
Our results were potentially influenced by several methodological issues. First, most studies did not precisely report observational scales, so we had to estimate, rather than simply record, scale values for most observations (we estimated 63, 60, 69, 36, 64 and 83% of resolution, extent, actual extent, interval, duration and actual duration values, respectively). Estimation errors may therefore have biased our findings. We attempted to quantify and account for this error by assessing between-observer variability and incorporating this uncertainty into our resampling methodology (Supplementary Results). The resulting confidence intervals (Fig. 1) suggest that estimation errors did not unduly influence our findings.
Our scale-estimation protocols may also have introduced bias, particularly our protocol for estimating resolution (the smallest areal unit of complete measurement). We selected this definition for the sake of consistency, but some papers reported resolution as a larger area in which sub-samples were taken. For these, our estimates were finer than what the studies' authors considered to be the resolution. Our results would also be somewhat different if we had included observations from experiments. For example, average resolution and duration would probably be finer 8,9. Additionally, the token one second (Supplementary Methods) we used to represent the duration of remotely sensed temporal replicates (which are effectively instantaneous) caused us to underestimate the differences between their durations and actual durations (Supplementary Fig. 3). However, the relatively small number of remote observations suggests that the impact of this bias on our overall findings was negligible.
It is also possible that our findings misrepresent observational domains because of sampling error. Although we randomized our sample to ensure representativeness, we reviewed just 0.8% of the papers published during our study period. Our sample may therefore under- or over-represent observational coverage in certain domains, particularly for specific methods. This possibility is greatest for palaeoreconstructions, where the small sample size probably resulted in an overestimate of typical observational extent. Finally, our omission of papers published after 2014 could also have biased our findings. Although our sample size was too small to assign statistical significance, we found a possible positive trend in the use of remote observations and a corresponding decline in field observations over the course of our study period. If these trends were not spurious, they suggest that including studies from 2015-2017 would result in a somewhat larger relative sample of remote observations, which could slightly increase the mean observational extent (see Supplementary Results).

Discussion
Our results suggest that modern ecology's observational domains are fairly narrow and that ecologists still primarily rely on conventional field-based observational techniques. In the spatial dimensions, most observations have resolutions ≤1 m² and extents ≤10,000 ha (Fig. 1a,b). In the temporal dimensions, most observations are either unreplicated or relatively infrequent (>1 month interval; Fig. 1c), and have relatively short durations (≤1 year; Fig. 1d).
Contrasting observational dimensions reveals that larger extents are associated with larger spatial replicates (Fig. 2c), while longer durations are associated with longer intervals (Fig. 2b). The latter association reflects a cost-imposed tradeoff between sampling frequency and temporal duration that is characteristic of field observations, but also appears to affect the other three methods, as evidenced by their relative domain locations. A similar tradeoff is illustrated by the inverse relationship between resolution and interval (Fig. 2a), which primarily relates to field observations, where larger spatial replicates demand greater effort, reducing sampling frequency 9. Less obvious is the opposite tradeoff that affects remote observation (Supplementary Fig. 2), where finer resolution (necessary for detail) typically necessitates longer intervals 15.
As a result of these tradeoffs, there are several notable observational gaps, specifically within the domains defined by high-frequency (daily to sub-daily intervals) observations with high to moderate resolutions (>1 m² to 100 ha; Fig. 2a) and decadal or longer durations (Fig. 2b). Another gap is evident in the high-to-moderate-resolution, large-extent (1 million to 10 billion ha) domain (Fig. 2c).
Have these domains changed since the seminal papers on scale first appeared in the late 1980s? 1,3,8 A comprehensive answer would require a similar analysis focused on earlier literature, but the data provided by three previous studies offer partial insight. The first dataset consists of duration values that ref. 8 [...] The mean interval was 178 days, compared with 684 days in our sample, but the eightieth percentile value in our study was 169 days compared with 329 days in theirs. Extent in our sample was substantially larger according to multiple summary statistics, including the mean (368,403 ha versus 114,965,072 ha in our study), median (9 ha versus 5,051 ha) and ninetieth percentile (136,000 ha versus 46,424,808 ha; this value is smaller than the mean, which is skewed by a small number of very-large-extent observations).
Although limited due to methodological differences (for example, a focus on experiments versus unmanipulated systems), these comparisons suggest that the duration and resolution of ecological observations have changed little in the past 30 years, but observational frequency and extent have both increased. A weak positive trend in our data also suggests that the mean extent of ecological observations is steadily increasing (Supplementary Fig. 5), which probably corresponds to increasing use of remote sensing (Supplementary Fig. 4).
Despite this apparent increase in observational extent, there remains a large gulf between the areas that ecologists actually observe and the areas their observations are intended to represent (Fig. 3a). A substantial discrepancy also exists between the amount of time spent observing phenomena and the time spans those observations theoretically represent (Fig. 3b). These differences between the actual and ostensible scales of observation have implications for ecological understanding, as the unobserved portions of space and time may contain important patterns and processes that are not captured by replicates, due to phenomenon-dependent factors such as autocorrelation and representativeness of the sampling scheme [16][17][18][19][20] . Brief, infrequent snapshots, or fine-grained, spatially sparse replicates, may be sufficient to characterize many phenomena (for example, annual changes in tree cover are well-represented by low-frequency satellite imaging 21 ), but may be inadequate for more dynamic phenomena. For example, wildfire extent and duration can be mapped by daily return satellites 22,23 , but the instantaneous nature of the imaging means that they cannot be used to observe fire behaviour 24 . To capture such behaviour, long periods of continuous observation may be more important than frequent repeats for understanding the dynamics.
It is therefore important to examine whether the scales of the phenomena being observed are adequately captured by the design of replicates. Our methods suggest one possible procedure for assessing the scale representativeness of observations, which is to (1) calculate the autocorrelation (spatial or temporal) within the observations (for example, using a semi-variogram), (2) find the threshold distance (or time) below which a suitably strong correlation (for example, r = 0.7) will exist between neighbouring sampled values, (3) add that distance (or time) to the sample resolution (or duration) and (4) recalculate actual extent (or duration) using the adjusted resolution (or sampling duration). The difference between this autocorrelation-adjusted actual extent (or actual duration) and extent (or duration) may provide a useful additional measure of how well the replicates represent the intended scale of observation. Although increasing spatial or temporal coverage may not always be the goal of a study, if the gap between actual and ostensible values remains large, alternative sampling methods may be used to close it. For example, remote sensing provides wall-to-wall spatial coverage of a study area, erasing the difference between actual extent and extent. Furthermore, the interval of high-resolution imaging (higher resolution is preferred in images as it allows individual features to be better discerned 25,26) is now approaching daily to sub-daily scales 27,28, allowing improved representation of spatial and temporal dynamics. For phenomena that cannot be measured from space, either because they are not visible or because they require continuous observation, new approaches for collecting in situ or near-surface observations (for example, low-cost wireless sensors 10,29,30, citizen observers 31 and autonomous vehicles 32) can be used to increase the spatial and temporal coverage of observations.

Nature Ecology & Evolution

The aforementioned insights regarding modern observational domains must be tempered by the uncertainty within our own scale estimates, as detailed above. However, most of this uncertainty is attributable to unclear reporting of scale values in the majority of papers we reviewed (a problem also noted in geography studies 33). This tendency towards vague documentation offers one final insight: despite decades of accumulated knowledge regarding its importance [1][2][3]34, scale appears to remain a low priority throughout much of the ecological discipline. Beyond contributing to the broader problem of scientific reproducibility 35, inattentiveness to scale increases the risk that observations inadequately represent the phenomenon of interest, thereby limiting the generalizability of any derived ecological knowledge 3,33,34. To mitigate this problem, we recommend that ecological journals require authors to quantify and clearly report the values of resolution, extent, interval and duration. Fortunately, some journals already appear to be implementing such policies. For example, Global Ecology and Biogeography now requires information on the spatial, temporal and taxonomic scale of studies to be in the abstract (a policy adopted in early 2016).
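The four-step procedure outlined earlier in this Discussion for assessing scale representativeness can be illustrated for the temporal case. The Python sketch below is illustrative only and was not part of the analysis reported here. It swaps in a simple lag autocorrelation function for the semi-variogram mentioned in the text, assumes a regularly sampled series, and uses hypothetical function names. It also assumes that the adjusted actual duration should be capped at the full duration.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation of a regularly spaced series at a given lag."""
    n = len(series)
    mean = sum(series) / n
    var = sum((v - mean) ** 2 for v in series)
    cov = sum((series[i] - mean) * (series[i + lag] - mean) for i in range(n - lag))
    return cov / var


def correlation_range(series, interval, threshold=0.7):
    """Steps 1-2: largest time separation over which r stays >= threshold.

    `interval` is the time between consecutive samples; the result is in
    the same units. Returns 0 if even lag 1 falls below the threshold.
    """
    lag = 0
    while lag + 1 < len(series) and autocorrelation(series, lag + 1) >= threshold:
        lag += 1
    return lag * interval


def adjusted_actual_duration(sampling_duration, n_replicates, corr_range, duration):
    """Steps 3-4: widen each replicate by the correlation range and re-sum.

    Capping at `duration` is an assumption made here so that the adjusted
    value can never exceed the ostensible duration.
    """
    return min((sampling_duration + corr_range) * n_replicates, duration)


# Hypothetical example: a smoothly varying daily series (interval = 1 day).
smooth = [float(i) for i in range(100)]
reach = correlation_range(smooth, interval=1.0)
adjusted = adjusted_actual_duration(1.0 / 24, len(smooth), reach, duration=99.0)
```

The gap between `adjusted` and the ostensible duration then plays the role of the autocorrelation-adjusted difference described in the text; a strongly autocorrelated series shrinks that gap, whereas a rapidly fluctuating one leaves it unchanged.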

Looking forwards.
Our study suggests that the concept of scale has yet to fully permeate the discipline of ecology. Evidence for this assertion lies in the continued narrowness of ecology's observational scale domains and the poor documentation of scale dimensions in the literature. However, the increasing extent of ecological observations, enabled by remote sensing and presumably motivated by many ecologists' appreciation of scale-related issues, suggests that ecology's scale domains are gradually changing. In the coming years, the accelerating gains in technology and analytical methods will allow researchers new and unprecedented capabilities to peer into, and thus close, the prominent holes in observational domains. A renewed, discipline-wide focus on scale's importance, including the adoption of stricter scale-reporting standards by journals, will help to spur ecologists to address these gaps, while fostering the improved transferability of knowledge within the discipline.

Methods
Paper selection and review. We used the 2012 Web of Science impact factors to select the 30 highest-ranked ecology-themed journals that published studies with an observational component, excluding journals devoted to reviews, meta-analyses, or laboratory, cellular or experimental studies. To select a representative sample of recent ecology studies, we downloaded the metadata for all papers published in the selected journals (Supplementary Table 1) between 2004 and 2014.
Our study involved 6 different observers (those reviewing the papers to extract the observational scales), each of whom was given a randomly selected batch of 500 titles. A separate set of 20 papers was also randomly selected and given to all observers to review independently. This was to (1) calibrate the interpretations and extraction of scale-related information between observers and (2) estimate between-observer variance.
Each observer first reviewed the papers in the calibration set and then commenced reviewing papers in their individual random draws, beginning at the top of the list and then proceeding until at least 20 eligible papers describing ecological observations were reviewed. In cases where the reviewed papers used observations that were described in another publication, we reviewed those source papers to extract the observational dimensions. We excluded papers that were opinion or perspectives pieces (unless they presented or used existing observational data), or theoretical studies based on generated data. We also did not collect scale information from papers (or the relevant parts of papers) describing experimental manipulations because experiments tend to be of limited extent, duration and resolution due to their higher logistical costs 8,9. Including data from experiments would therefore probably have biased our findings towards finer scales, while minimizing the impact that new observing methods (for example, satellite imaging and wireless sensing) may have had in expanding the scales of ecological investigation 10,36,37. A bibliography of the reviewed papers appears in the Supplementary Information.
Estimating observational scales. We recorded six primary dimensions of ecological observations: three related to space and three related to time. The space-related dimensions were resolution, extent and actual extent. Here, extent was primarily defined as the area falling within a perimeter defined by the outermost spatial replicates, while actual extent was the summed area of all spatial replicates (that is, N × resolution, where N is the number of spatial replicates, which we also recorded), or the area that ecologists observe in practice.
In assessing spatial scales, our analysis only considered the Cartesian plane; we did not calculate the z (or depth) dimension, although this dimension is of greater importance for certain sub-disciplines of ecology (for example, depth profiles in marine ecology). In some cases (primarily palaeoecological studies), values extracted from the z dimension provided temporal information that was used to calculate both the interval and the duration of the observation.
For time dimensions, we extracted information related to the observational interval, duration and actual duration. Duration was defined as the time between the first and last temporal replicate, whereas actual duration quantifies the amount of time spent observing a particular location, which we calculated by multiplying the sampling duration (the time spent collecting a single temporal replicate) by the number of temporal replicates.
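The definitions of actual extent and actual duration given above reduce to simple products. The following Python sketch illustrates them with hypothetical function names and invented example values (not data from any reviewed study); units are whatever the caller supplies.

```python
def actual_extent(resolution, n_spatial_replicates):
    """Summed area of all spatial replicates: N x resolution."""
    return n_spatial_replicates * resolution


def actual_duration(sampling_duration, n_temporal_replicates):
    """Total time spent observing: sampling duration x number of temporal replicates."""
    return n_temporal_replicates * sampling_duration


# Hypothetical study: fifty 1 m^2 quadrats, each surveyed for one hour
# (1/24 day) on twelve monthly visits.
area_observed_m2 = actual_extent(resolution=1.0, n_spatial_replicates=50)
time_observed_days = actual_duration(sampling_duration=1.0 / 24, n_temporal_replicates=12)
```

In this invented case, the quadrats cover 50 m² in total and each location is watched for half a day, even if the study ostensibly represents a much larger area over a full year.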
A full definition of all dimensions and how they were recorded is contained within a list of frequently asked questions (see Supplementary Methods), which was provided to each observer for initial study and reference, and adapted as necessary during the course of the study to ensure methodological consistency.
To account for potential differences in scales related to methodology, we classified each observation according to the following broad categories: field methods (manual in situ data collection), automated (in situ) sensing, remote sensing/other geographic data (hereafter remote observations) and palaeoreconstruction approaches. We also recorded when observations were reported in any study with an unclear or missing scale value.

Calibration and consistency.
Most studies did not explicitly report values for all the assessed scales, and thus interpretation and judgement had to be applied to develop reasonable estimates for their values. The frequently asked questions (Supplementary Methods) provided the protocol we followed, and were initially developed following consultation between observers before reviewing commenced. We conducted an iterative process of calibration to ensure consistency and reliability of the estimates. First, we used the calibration set to calculate between-observer variability with respect to paper selection/rejection and the estimation of scales. Based on this, the lead author reviewed individual records in each observer's calibration set, flagged values where the estimation procedure departed from the protocol and returned these to observers for re-estimation without providing an estimate of the actual value. Instead, the relevant section of the protocol was highlighted, and further explanation and clarifying discussion were undertaken as needed. The protocol language was adjusted for clarity during this process, and new items were added to cover circumstances that had not been addressed by the initial version. The variability measures were recalculated after each iteration.
To ensure consistency within the main analysis, the lead author also reviewed each observer's results from their individual draw of papers and flagged values that appeared to deviate from the protocol for re-review by the observer. Revised values were re-inspected, and in some cases a secondary review of particular papers was undertaken to cross-check the estimated scales.
Scale-estimation uncertainty. Two major and related sources of uncertainty affected our estimation of observational scales: (1) unclear documentation of observational scales in the reviewed studies; and (2) variation between observers in estimating observational scales (largely in cases where scales were not explicitly reported). To account for these uncertainties, we first quantified the between-observer variability in scale estimates (expressed as the coefficient of variation), which was constructed from each observer's final reported calibration set results. We then used the coefficients of variation for each dimension as the basis for randomly perturbing, over the course of 1,000 iterations, the scale values for each of the sampled observations. For each observational dimension at each iteration, we perturbed its observer-estimated scale value by: (1) randomly selecting (from a uniform distribution) a percentage value p that fell between 100 + y and 100 − y (where y was the dimension-specific coefficient of variation, expressed as a percentage) and (2) multiplying the scale value by the corresponding proportion (p / 100). The perturbation occasionally resulted in physically impossible values (for example, interval or actual duration longer than duration, or actual extent larger than extent). In these cases, we capped the perturbed value in the smaller of the two dimensions (that is, resolution or interval) so that it equalled the corresponding value in the larger (that is, extent or duration). We used the resulting set of perturbed observations to quantify uncertainty within our scale estimates.
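The perturbation and capping steps described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than the code used in the study: the function names are hypothetical, the coefficients of variation in the example are invented, and a seeded generator is used only to make the example repeatable.

```python
import random


def perturb(value, cv_percent, rng):
    """Scale `value` by p/100, with p drawn uniformly from [100 - y, 100 + y]."""
    p = rng.uniform(100.0 - cv_percent, 100.0 + cv_percent)
    return value * p / 100.0


def perturb_pair(smaller, larger, cv_smaller, cv_larger, rng):
    """Perturb a nested pair (for example, interval within duration).

    If the perturbed smaller-dimension value exceeds the perturbed
    larger-dimension value, it is capped to equal the larger one.
    """
    s = perturb(smaller, cv_smaller, rng)
    l = perturb(larger, cv_larger, rng)
    return min(s, l), l


# Hypothetical observation: 300-day interval within a 365-day duration,
# with invented coefficients of variation of 30% for both dimensions.
rng = random.Random(42)
ensemble = [perturb_pair(300.0, 365.0, 30.0, 30.0, rng) for _ in range(1000)]
```

Each element of `ensemble` is one perturbed version of the observation; repeating this for every sampled observation yields the kind of 1,000-member ensemble described in the text.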
In addition to the scale-estimation coefficient of variation, we also examined how well observers agreed regarding paper inclusion/exclusion, and how many extractable observations there were per included paper (see Supplementary Results).
Analyses. To characterize the scale domains of observations, we first log-transformed (base-10) the scale values within the 1,000 member perturbed ensemble to account for the large range in values. To examine the distributions of observational scales within individual dimensions (Fig. 1), we first constructed relative frequency histograms for each of the 1,000 transformed ensemble members for each dimension and then plotted the bin means across all members, as well as the upper and lower 2.5th percentile values for each bin. This produced a histogram of observational scales within each dimension that accounted for scale-estimation uncertainty.
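As an illustration of the ensemble histogram summary described above, the following Python sketch computes per-bin means and 2.5th/97.5th percentiles across ensemble members. It is a simplified stand-in, not the study's R code: the function names are hypothetical, the percentile uses linear interpolation (an assumption), and values falling outside the supplied bin edges are simply not counted.

```python
import math


def relative_histogram(log_values, edges):
    """Relative frequency of values per bin; bins are [edge_i, edge_i+1)."""
    counts = [0] * (len(edges) - 1)
    for v in log_values:
        for i in range(len(counts)):
            if edges[i] <= v < edges[i + 1]:
                counts[i] += 1
                break
    return [c / len(log_values) for c in counts]


def percentile(sorted_values, q):
    """Percentile by linear interpolation between order statistics."""
    pos = (len(sorted_values) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    return sorted_values[lo] + (sorted_values[hi] - sorted_values[lo]) * (pos - lo)


def summarise_ensemble(members, edges):
    """Return (mean, 2.5th percentile, 97.5th percentile) for each bin."""
    histograms = [relative_histogram(m, edges) for m in members]
    summary = []
    for b in range(len(edges) - 1):
        column = sorted(h[b] for h in histograms)
        mean = sum(column) / len(column)
        summary.append((mean, percentile(column, 2.5), percentile(column, 97.5)))
    return summary


# Two tiny hypothetical ensemble members of log10-transformed scale values.
members = [[0.5, 1.5], [0.5, 0.5]]
summary = summarise_ensemble(members, edges=[0.0, 1.0, 2.0])
```

Plotting the first element of each tuple as bar heights and the other two as error bars would give a histogram of the kind described for Fig. 1.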
To evaluate the distributions of observations within two scale dimensions (Fig. 2), we used the splancs package 38 of R 39 to calculate a kernel density estimate of the log-transformed values across all ensemble members, using a bandwidth of 1 on a 0.1 resolution image to provide a smoothed result that served to more effectively highlight domains in which ecological observations were concentrated.
Bandwidths of varying resolutions were tested on kernel density estimates of sampling interval versus plot resolution to test how sensitive our results were to the bandwidth value (see Supplementary Results). For comparisons involving interval, we removed temporally unreplicated observations because these lacked interval values.
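To make the kernel density step concrete, the following Python sketch evaluates a two-dimensional kernel estimate on a regular grid with a bandwidth of 1 and a cell size of 0.1 (both in log10 units). It is only a sketch under assumptions: it uses a quartic (biweight) kernel, which we assume is comparable to the one applied by the R package, returns the summed kernel contributions rather than a normalized probability density, and uses hypothetical names and invented points.

```python
import math


def quartic_kde(points, x_range, y_range, bandwidth=1.0, cell=0.1):
    """Sum of quartic kernels centred on each point, evaluated at cell centres."""
    nx = int(round((x_range[1] - x_range[0]) / cell))
    ny = int(round((y_range[1] - y_range[0]) / cell))
    xs = [x_range[0] + (i + 0.5) * cell for i in range(nx)]
    ys = [y_range[0] + (j + 0.5) * cell for j in range(ny)]
    grid = [[0.0] * nx for _ in range(ny)]
    norm = 3.0 / (math.pi * bandwidth ** 2)  # makes each kernel integrate to 1
    for px, py in points:
        for j, gy in enumerate(ys):
            for i, gx in enumerate(xs):
                u2 = ((gx - px) ** 2 + (gy - py) ** 2) / bandwidth ** 2
                if u2 < 1.0:  # kernel is zero beyond one bandwidth
                    grid[j][i] += norm * (1.0 - u2) ** 2
    return xs, ys, grid


# Hypothetical log10(interval in days) versus log10(resolution in m^2) pairs.
points = [(0.0, -1.0), (1.5, 0.0), (1.5, 0.2), (2.5, 2.0)]
xs, ys, grid = quartic_kde(points, x_range=(-2.0, 4.0), y_range=(-3.0, 4.0))
```

Re-running the function with different `bandwidth` values is one simple way to explore the bandwidth sensitivity mentioned above.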
To compare the differences between actual extent and extent and actual duration and duration (Fig. 3), we calculated the magnitude of difference (decade) between each pair as:

$$\mathrm{decade} = \log_{10}\left(\frac{x}{y}\right) = \log_{10} x - \log_{10} y$$

where x is either extent or duration and y is actual extent or actual duration, respectively. We then evaluated how the magnitudes of difference varied with increasing values of extent/duration, using box plots to summarize decades within the same bins used to summarize the frequency distributions of the extent and duration of observations (Fig. 1b,d). Decades were calculated for each pair for all bootstrap replicates. We plotted the box plots against their corresponding bin means to evaluate how these differences varied with scale (Fig. 3).
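The magnitude of difference can be illustrated with a short Python sketch. It assumes the measure is the difference between the base-10 logarithms of the ostensible and actual values (that is, the number of orders of magnitude separating them); the function name and example values are hypothetical.

```python
import math


def decade_difference(ostensible, actual):
    """Orders of magnitude separating an ostensible scale from its actual scale."""
    return math.log10(ostensible) - math.log10(actual)


# Hypothetical study: fifty 1 m^2 quadrats (actual extent 50 m^2)
# spread across a 100 ha (1,000,000 m^2) study area.
gap = decade_difference(ostensible=1_000_000.0, actual=50.0)
```

Here `gap` is a little under 4.5, meaning the study area is more than four orders of magnitude larger than the area actually measured.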

Trends in methods and scale.
To evaluate the potential impact that excluding studies from 2015-2017 would have on our findings, we analysed the trends in (1) ecological observing methods and (2) typical scales of ecological observations over the 10-year period. For the former assessment, we calculated the percentage of observations made using remote sensing, general field methods and automated in situ methods, and fitted a linear regression between these percentages and the publication year, weighting the regression by the total number of observations in each year. For the second analysis, we applied the same regression approach to the four primary dimensions (resolution, extent, interval and duration) to assess whether there were any trends in observational scales. The regressions and resulting code for trend extrapolations can be found in the 'additional analyses' vignette in the accompanying R package/code repository (available at https://github.com/agroimpacts/ecoscales).
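A weighted linear regression of this kind can be sketched with the closed-form weighted least-squares solution. The code below is not the code from the repository's vignette: the percentages, the yearly observation counts and the function name are invented for illustration.

```python
def weighted_linear_fit(x, y, w):
    """Slope and intercept minimizing sum(w_i * (y_i - a - b * x_i) ** 2)."""
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    slope = sxy / sxx
    return slope, ybar - slope * xbar

# Hypothetical yearly data: percentage of observations made by remote sensing
# and the total number of observations per publication year (the weights).
years = list(range(2004, 2015))
pct_remote = [8.0, 9.5, 9.0, 11.0, 10.5, 12.0, 13.5, 12.5, 14.0, 15.5, 15.0]
n_obs = [30, 42, 35, 50, 38, 61, 47, 55, 40, 66, 58]

slope, intercept = weighted_linear_fit(years, pct_remote, n_obs)

# Extrapolate the fitted trend to a year outside the sampled period.
pct_2017 = intercept + slope * 2017
```

Weighting by the yearly observation count means that years contributing more observations have more influence on the fitted trend.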
Extracting and analysing data from earlier meta-analyses. To compare the results of our analysis with the observational scales of earlier ecological studies, we used graph capture software (https://automeris.io/WebPlotDigitizer/) to extract the data values from figure 6.1 of ref. 8, figure 1 of ref. 9 and figure 2 of ref. 10. To maintain as much comparability as possible with our inclusion criteria, we excluded experimental studies in the data from ref. 8, as well as the values of any studies exceeding 100 years' duration (no upper time bound was provided for these), leaving duration values for 419 (out of 623) studies. Since ref. 8 presented duration values as a histogram, we calculated the mean duration across all studies as the weighted (by number of observations per bin) mean of bin centre-point values (that is, the weighted mean of the bin means). We also excluded 4 (of 29) observation values from the data in ref. 10 on observational extent and frequency, which, in contrast with the other 25, were not randomly selected. Ref. 10 also used irregular scales for both x (frequency) and y (extent) axes; therefore, we had to visually estimate the scale values for each data point after graphical extraction, and converted their extent values (in km) to hectares and their frequency values to intervals. Ref. 9 presented resolution as plot diameters (m), which we squared to make them comparable to our resolution metric.
Calculations of scale values from these studies can be found in the 'additional analyses' vignette in the accompanying R package/code repository (available at https://github.com/agroimpacts/ecoscales).
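The two simplest conversions described above, the weighted mean of histogram bin centre-points and the squaring of plot diameters, can be illustrated as follows. The bin edges, counts and diameters are made up and do not reproduce the values extracted from refs 8 or 9; the function name is hypothetical.

```python
def weighted_bin_mean(bin_edges, counts):
    """Mean of bin centre-points, weighted by the number of studies per bin."""
    centres = [(lo + hi) / 2 for lo, hi in zip(bin_edges[:-1], bin_edges[1:])]
    return sum(c * n for c, n in zip(centres, counts)) / sum(counts)

# Hypothetical duration histogram: bin edges in years, studies per bin.
edges = [0, 1, 5, 10]
counts = [10, 5, 5]
mean_duration = weighted_bin_mean(edges, counts)

# Hypothetical plot diameters (m), squared to give a resolution in m^2.
diameters_m = [0.5, 2.0, 10.0]
resolutions_m2 = [d ** 2 for d in diameters_m]
```

With these made-up numbers the bin centres are 0.5, 3 and 7.5 years, so the weighted mean duration is (10 × 0.5 + 5 × 3 + 5 × 7.5) / 20 = 2.875 years.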
Reporting Summary. Further information on experimental design is available in the Nature Research Reporting Summary linked to this article.
Code availability. The code supporting this manuscript is available online at https://github.com/agroimpacts/ecoscales.
Data availability. The data supporting this manuscript are available online at https://github.com/agroimpacts/ecoscales.

Replication
Describe whether the experimental findings were reliably reproduced. None.

Randomization
Describe how samples/organisms/participants were allocated into experimental groups.
Journal titles were randomly selected from the full list of titles downloaded through Endnote.

Blinding
Describe whether the investigators were blinded to group allocation during data collection and/or analysis.

N/A