Introduction

One of the most fundamental and oldest objectives of ecology is to understand the factors and mechanisms causing fluctuations in the number of animals in a given area (Elton, 1942; Andrewartha and Birch, 1954; Lack, 1954; Royama, 1992; Coulson et al., 2004). This search for underlying mechanisms has recently acquired a special importance as many natural populations have suffered dramatic declines (Hunter, 2002) because of persecution, exploitation or habitat loss (Beissinger and Snyder, 1992; Casey and Myers, 1998; Krüger et al., 2001, 2010; Ferrer et al., 2003; Pimm et al., 2014).

Traditionally, animal populations have been assessed by means of direct census counts for conservation and management purposes (Luikart et al., 2010). However, an influential contribution of evolutionary theory to conservation biology has been the development of a framework for predicting the fate of small populations (Palstra and Ruzzante, 2008). Central parameters to this framework are the population census size (Nc) and effective population size (Ne). Ne, which is defined as the number of individuals in an ideal population experiencing the same rate of random genetic change over time as the actual population (Crow et al., 1970), is a particularly important quantity as it is inversely proportional to the loss of genetic diversity because of inbreeding and genetic drift in finite, randomly mating populations (Nunney and Elam, 1994; Frankham, 2005; Charlesworth, 2009). Thus, low Ne values are often interpreted as providing an indication of increased extinction risk (Newman and Pilson, 1997).
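The inverse relationship between Ne and the loss of genetic diversity can be made concrete with the standard drift expectation for an ideal population, in which expected heterozygosity declines by a factor of 1 - 1/(2Ne) per generation. The following is a minimal sketch under that idealised assumption; the function name and the example Ne values are illustrative and are not taken from the study.

```python
def expected_heterozygosity(h0: float, ne: float, generations: int) -> float:
    """Expected heterozygosity after drift in an ideal population of size ne.

    Assumes a closed, randomly mating population of constant size with
    discrete generations (an idealisation, not the buzzard population).
    """
    if ne <= 0:
        raise ValueError("ne must be positive")
    return h0 * (1.0 - 1.0 / (2.0 * ne)) ** generations


# Fraction of the initial heterozygosity retained after 10 generations
# for three illustrative (hypothetical) effective population sizes.
for ne in (25, 100, 500):
    retained = expected_heterozygosity(1.0, ne, 10)
    print(f"Ne={ne}: retained fraction {retained:.3f}")
```

Smaller Ne values lose diversity faster, which is why low Ne is commonly read as a warning sign.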

Although ascertaining Ne is crucial to conservation and management, it is still rather difficult to obtain reliable estimates in real populations because of constraints on collecting enough demographic data to directly measure Ne (Waples, 2005; Luikart et al., 2010). Consequently, molecular genetic approaches have become increasingly popular in recent years (Waples and Do, 2010). A variety of different approaches are now available to estimate Ne from genetic marker data (reviewed by Leberg, 2005; Wang, 2005; Palstra and Ruzzante, 2008; Luikart et al., 2010). These can be broadly classified into single-sample and temporal approaches. The former estimate the effective population size from properties of a single sample of individuals, most often using the unbiased linkage disequilibrium (LD) method of Hill (1981) that is based on the premise that in small populations with few parents, random genetic drift generates nonrandom associations between alleles at different loci. In contrast, temporal approaches exploit the fact that drift is stronger in smaller populations by quantifying differences in allele frequencies between two or more samples collected at different time points. Applied to discrete cohorts of offspring, these approaches strictly estimate the effective number of parents that produced the sample, or Nb (Schwartz et al., 1999; Leberg, 2005; Waples and Do, 2010; Waples and England, 2011; Waples et al., 2014).
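As a rough illustration of the premise behind the LD method, for unlinked loci in a randomly mating population the expected squared correlation between alleles at different loci is approximately 1/(3Ne) + 1/S, where S is the number of individuals sampled (Hill, 1981). The sketch below simply inverts this approximation. It omits the bias corrections used by later implementations, and the function name and input values are hypothetical.

```python
def ld_ne_estimate(mean_r2: float, sample_size: int) -> float:
    """Invert E[r^2] ~ 1/(3*Ne) + 1/S to obtain a crude Ne estimate.

    mean_r2     : mean squared correlation across pairs of unlinked loci
    sample_size : number of individuals sampled (S)
    Returns infinity when sampling noise alone accounts for the observed LD.
    """
    drift_r2 = mean_r2 - 1.0 / sample_size  # remove the sampling contribution
    if drift_r2 <= 0:
        return float("inf")
    return 1.0 / (3.0 * drift_r2)


# Hypothetical input: with S = 50, a mean r^2 of 1/300 + 1/50 corresponds to Ne = 100.
print(ld_ne_estimate(1.0 / 300.0 + 1.0 / 50.0, 50))
```

The subtraction of 1/S shows why small samples are problematic: the sampling term can easily swamp the drift signal when Ne is large.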

Both sets of approaches have advantages and disadvantages. Temporal approaches are often considered to be the most accurate, assuming that samples can be gathered sufficiently far apart in time (Leberg, 2005; Luikart et al., 2010). However, they require at least two non-overlapping generations to be sampled, and this is often impossible for species with long generation times such as many large vertebrates (Luikart et al., 2010). Single-sample approaches have therefore increased in popularity as they can provide a snapshot of a population without the need to collect multiple samples. Precision also appears to be reasonably high, at least for the LD method, when using the 10–20 genetic markers and ~50 samples typical of most studies, as long as Ne is less than 500 (Waples and Do, 2010). However, single-sample approaches can be sensitive to gene flow (Gilbert and Whitlock, 2015) and there is also a need for studies that are long enough to quantify temporal variability when applying temporal methods (Palstra and Ruzzante, 2008).

A long-term study of common buzzards (Buteo buteo) provides an excellent opportunity to explore the temporal dynamics of the effective population size of an intensively studied vertebrate population. This species is a common, medium-sized bird of prey that breeds across the entire Palaearctic from southwestern Europe to Japan, where it preys mainly on microtine rodents (del Hoyo et al., 1994). A population in Eastern Westphalia, Germany, has been intensively monitored since 1989, with nests having been climbed and all chicks marked to allow individual recognition since 2002. Our census data suggest that this population was relatively stable at 200 breeding adults since the early 2000s, but began to increase in 2010 towards a peak of 516 in 2012 (see Results).

Common buzzards also exhibit a striking plumage polymorphism (Cramp and Simmons, 1980; Krüger et al., 2001) that serves as a phenotypic marker and thus provides an interesting additional dimension (Chakarov et al., 2008; Chakarov et al., 2013). Three colour morphs have been described that differ in their levels of plumage melanisation, termed ‘light’, ‘intermediate’ and ‘dark’. Plumage morph is fixed throughout an individual’s lifetime, follows Mendelian expectations for a single locus with two alleles (Krüger et al., 2001) and is associated with a mutation in the melanocortin-1 receptor (MA Pointer et al., unpublished data). The intermediately melanised morph, which is presumed to be heterozygous at the colour locus, is on average longer lived and has the greatest lifetime reproductive success (Chakarov et al., 2008). However, the plumage morphs do not differ significantly in genome-wide heterozygosity measured using 18 microsatellites (Boerner et al., 2013), suggesting that heterozygote advantage is not a genome-wide phenomenon. The three morphs are also genetically undifferentiated, despite buzzards tending to mate with partners of the same morph as their mother (Boerner et al., 2013).

Here, we genotyped 1622 common buzzard chicks comprising 12 complete cohorts at 15 polymorphic microsatellite loci. The resulting data were used to evaluate the comparability of single-sample and temporal estimators, as well as to explore temporal patterns in relation to the observed demography of the population.

Materials and methods

Study site

This study was conducted in a 300 km² study area (8°25'E and 52°6'N) in Eastern Westphalia, Germany (Figure 1). It consists of two 125 km² grid squares and 50 km² of edge areas. The dominant land cover in the study area is the Teutoburger Wald, a low-level forested mountain region reaching a height of 315 m above sea level. The second most abundant land cover type is a cultivated landscape to the north and south.

Figure 1

Location of the study area in Germany together with a detailed map (inset) of the 300 km² study area, with human settlements shown in orange, forest patches in green and agricultural areas in cream. A full colour version of this figure is available at the Heredity journal online.

Collection of census and breeding success data

Census dynamics of common buzzards were monitored from 1989 to 2013 inclusive. All forest patches were visited in late winter to look for territorial pairs and nests. During the breeding season, these forest patches were visited again and checked for activity of the study species. This includes breeding pairs (occupying a nest and showing signs of egg-laying activity) as well as nonbreeding pairs that just occupy a territory. Hence, our buzzard census data include breeding and nonbreeding pairs but the number of ‘floating’ individuals that do not hold a territory cannot be reliably counted and recorded.

Each active nest was visited at least 3 and up to 10 times a year to determine breeding success (success or failure) and brood size (number of chicks fledged) for successful breeding attempts. From 1989 to 2001 inclusive, data were collected through careful and intensive observation from the ground. In subsequent years, between 85 and 99% of all successful nests were climbed and the chicks were ringed, normally in late May and early June.

Blood sampling

Buzzard nests were climbed with a rope-climbing technique and once the climber was at the nest, chicks were lowered to the ground using another rope. On the ground, they were ringed, biometric measures were recorded and a 0.5 ml blood sample was taken from the brachial vein with a syringe or needles and capillaries. Sample sizes are shown in Table 1. Blood was transferred into 1.5 ml screw-cap tubes filled with 1.0 ml ethanol or phosphate-buffered saline–EDTA buffer. Back at the laboratory, all tubes were stored at −20 °C.

Table 1 Sample sizes of common buzzards with the number of analysed samples (N), census population size (Nc), number of breeders (Nb), estimated effective number of parents that produced each cohort, minimal allele frequency cutoff (Pcrit) and 95% confidence intervals (CIs)

Microsatellite genotyping

Total genomic DNA was extracted from 10–20 μl of each sample using a standard chloroform extraction protocol and genotyped at 15 previously developed microsatellite loci (Johnson et al., 2005). All but two of these loci map to different chromosomes in the zebra finch (Taeniopygia guttata) and are therefore unlikely to be physically linked (Table 2). The microsatellites were PCR amplified in a single multiplexed reaction using a Type It Kit (Qiagen GmbH, Hilden, NW, Germany). The following PCR profile was used: one cycle of 5 min at 94 °C; 24 cycles of 30 s at 94 °C, 90 s at 56 °C and 30 s at 72 °C; and one final cycle of 15 min at 72 °C. Fluorescently labelled PCR products were then resolved by electrophoresis on an ABI 3730xl capillary sequencer (Applied Biosystems, Carlsbad, CA, USA) and allele sizes were scored automatically using GeneMarker version 2.6.2 (Softgenetics, State College, PA, USA). To ensure high genotype quality, all traces were manually inspected and any obvious scoring errors were adjusted accordingly.

Table 2 Details of the 12 microsatellite loci used in this study together with their polymorphism characteristics in 1419 common buzzards

Genetic data analyses

Tests for deviation from Hardy–Weinberg equilibrium and LD were implemented using Genepop version 4.3 (Rousset, 2008), specifying 10 000 dememorisations, 1000 batches and 10 000 iterations per batch. Adjustment of P-values for the false discovery rate with an α-level of 0.05 was carried out on all tabulated results using the program q-value version 1.38.0 (Storey, 2002). Genepop was also used to calculate observed and expected heterozygosities at each of the microsatellite loci.
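To illustrate what a table-wide false discovery rate adjustment does, the sketch below applies the Benjamini-Hochberg step-up procedure to a list of P-values. This is a simpler stand-in and not the q-value procedure of Storey (2002) used in the study, which additionally estimates the proportion of true null hypotheses. The function name and the example P-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P-values in the original order."""
    m = len(p_values)
    # Indices of the P-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical P-values from four tests; a test is called significant
# when its adjusted value falls below the chosen alpha level of 0.05.
example = [0.01, 0.04, 0.03, 0.005]
significant = [q < 0.05 for q in benjamini_hochberg(example)]
print(significant)
```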

To test for population structure, we used two complementary approaches. First, Structure version 2.3.4 (Pritchard et al., 2000) was used to test for the presence of distinct genetic clusters without prior knowledge of the sampling locations of individuals. This program uses a Bayesian clustering approach to determine the most likely number of genetically distinct clusters in a sample (K) by subdividing the data set in a way that maximises Hardy–Weinberg equilibrium and minimises LD within the resulting clusters. We ran 20 independent runs for K=1–10 using 1 000 000 Markov chain Monte Carlo iterations after a burn-in of 500 000 with the correlated allele frequencies model and assuming admixture. The most likely number of groups was evaluated using the maximal average value of Ln P(D), a model choice criterion that estimates the posterior probability of the data. Second, we used hierarchical analyses of molecular variance within GenAlEx version 6.5 (Peakall and Smouse, 2012) to test for genetic differences between buzzards sampled to the north and south of the ridge of the Teutoburger Wald.
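The model choice step described above, picking the K with the highest average Ln P(D) across replicate runs, can be sketched as follows. The input layout (a dictionary mapping each K to the Ln P(D) values of its runs), the function name and the numbers are assumptions made for illustration; they are not output from the study.

```python
from statistics import mean


def best_k(ln_prob_by_k):
    """Return the K with the maximal mean Ln P(D) and the per-K means.

    ln_prob_by_k maps each candidate K to a list of Ln P(D) values,
    one per independent run.
    """
    averages = {k: mean(values) for k, values in ln_prob_by_k.items()}
    chosen = max(averages, key=averages.get)
    return chosen, averages


# Made-up Ln P(D) values for two replicate runs at each of three K values.
runs = {
    1: [-100.0, -102.0],
    2: [-110.0, -108.0],
    3: [-120.0, -125.0],
}
k, means = best_k(runs)
print(k, means)
```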

Single-sample estimators

We used the software NeEstimator version 2.01 (Do et al., 2014) to implement the LD approach of Waples and Do (2008). As the inclusion of rare alleles can upwardly bias LD-based estimates (Waples and Do, 2010), we followed the authors’ recommendation of choosing the minor allele frequency threshold (Pcrit) to be the larger of 0.02 or a value that screens out single copy alleles (Waples and Do, 2010). To explore sensitivity, we also repeated the analysis with Pcrit=0.01 and 0.05 while applying the same criterion as above to ensure that single copy alleles were not counted. The 95% confidence intervals (CIs) were derived using the ‘parametric’ option that implements a χ² approximation (Waples, 2006).
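The Pcrit criterion can be illustrated with a little arithmetic: in a sample of S diploid individuals a single copy allele has a frequency of 1/(2S), so a threshold only screens out such alleles if it exceeds that value. The sketch below assumes that alleles with a frequency below Pcrit are the ones excluded; the function names and allele counts are hypothetical and this is not NeEstimator code.

```python
def screens_out_singletons(pcrit: float, n_individuals: int) -> bool:
    """True if alleles present as a single copy fall below the threshold."""
    return pcrit > 1.0 / (2 * n_individuals)


def retained_alleles(allele_counts, pcrit):
    """Keep alleles whose sample frequency is at least pcrit.

    allele_counts maps allele labels to the number of gene copies observed.
    """
    total = sum(allele_counts.values())
    return {a: c for a, c in allele_counts.items() if c / total >= pcrit}


# With 20 individuals a singleton has frequency 0.025, so Pcrit = 0.02
# would not remove it; with 50 individuals (frequency 0.01) it would.
print(screens_out_singletons(0.02, 20), screens_out_singletons(0.02, 50))

# Hypothetical counts at one locus in 50 individuals (100 gene copies).
print(retained_alleles({"A": 60, "B": 39, "C": 1}, 0.02))
```

This is why a fixed threshold of 0.02 is only sufficient once more than 25 individuals have been sampled, and smaller cohorts need a larger cutoff.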

Temporal estimators

We used the temporal approach of Wang and Whitlock (2003) as this is the least sensitive of the temporal estimators to immigration (Gilbert and Whitlock, 2015). The software MLNE 1.0 (Wang and Whitlock, 2003) was used to generate both the maximum-likelihood and moment estimators, assuming that the population is not at equilibrium. For this analysis, we specified a maximum Ne of 5000, a monitor value of four and six threads. We estimated the generation time as the mean age at maturity plus the mean reproductive lifespan (IUCN Standards and Petitions Subcommittee, 2016). As the breeding lifespan of an adult female buzzard is on average 2.75 years and most buzzards recruit as breeding adults at 2 years of age (Krüger and Lindström, 2001), our estimate is 4.75 years. For the temporal analysis, we therefore took the allele frequencies from 2002 and 2013 and assumed two generations between samples. To explore sensitivity to the number of generations assumed to separate the samples, we also repeated this analysis specifying between one and five generations.
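The arithmetic behind the assumed number of generations can be written out explicitly. The values below are those given in the text; the variable names are our own.

```python
age_at_maturity = 2.0       # years, age at which most buzzards recruit
breeding_lifespan = 2.75    # years, mean breeding lifespan of an adult female

# Generation time = mean age at maturity + mean reproductive lifespan.
generation_time = age_at_maturity + breeding_lifespan

# Time elapsed between the first (2002) and last (2013) sampled cohorts.
years_between_samples = 2013 - 2002

generations_elapsed = years_between_samples / generation_time
print(f"generation time: {generation_time} years")
print(f"generations between samples: {generations_elapsed:.2f}")
```

The 11 years separating the samples correspond to a little over two generations, which is consistent with assuming two generations and with exploring a range of one to five as a sensitivity check.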

Results

We genotyped 1622 common buzzard individuals sampled over 12 years at 15 microsatellite loci. Individuals that failed to produce interpretable genotypes at three or more loci were discarded, leaving a total of 1419 individuals for the data analysis (Table 1). The loci were moderately variable, carrying an average of 9.75 alleles (Table 2). Tests for deviations of each locus from Hardy–Weinberg equilibrium in each of the years revealed a number of deviations that remained significant following table-wide false discovery rate correction for multiple tests (Supplementary Table S1). Two of the loci (Bbu 35 and 46) deviated significantly from Hardy–Weinberg equilibrium in 9 out of 12 years and a further locus (Bbu 26) deviated significantly in 7 years. We therefore took the conservative measure of excluding all three of these loci from further analyses. Tests for LD among the remaining 12 loci revealed a number of significant associations (Figure 2). However, no pairs of loci were consistently in LD across multiple years, the extent of LD varied from year to year (Figure 2) and all but two of the loci mapped to different chromosomes in the zebra finch (Table 2), suggesting that these associations are unlikely to be due to physical linkage.
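The individual-level quality filter described above (discarding individuals with missing genotypes at three or more loci) could be expressed as in the following sketch. The data layout, in which each individual maps to a list of genotypes with None marking a failed locus, and the identifiers are assumptions made purely for illustration.

```python
def filter_individuals(genotypes, max_missing=2):
    """Keep individuals with at most max_missing loci lacking a genotype."""
    kept = {}
    for individual, loci in genotypes.items():
        n_missing = sum(1 for genotype in loci if genotype is None)
        if n_missing <= max_missing:
            kept[individual] = loci
    return kept


# Hypothetical genotypes at four loci (allele sizes in base pairs).
example = {
    "chick_1": [(120, 124), (98, 98), (201, 205), (150, 152)],
    "chick_2": [None, (98, 100), None, (150, 150)],
    "chick_3": [None, None, None, (152, 152)],
}
print(sorted(filter_individuals(example)))
```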

Figure 2

Summary of pairwise LD tests conducted within Genepop (Rousset, 2008) for each of the 12 successive years. Locus numbers are given in the same order as shown in Table 2.

Buzzard morph effective population sizes

To test for differences in the effective population sizes of the three buzzard plumage morphs, we implemented the LD method (Hill, 1981; Waples, 2006; Waples and Do, 2010). For this analysis, it was necessary to pool individuals across years because of the low frequency of the dark morph. Sensitivity to the minor allele frequency cutoff was analysed by generating effective population size estimates for Pcrit=0.01, 0.02 and 0.05. The resulting estimates, which were reasonably robust to the Pcrit value used, were lowest for the dark morph, intermediate for the light morph and highest for the intermediate morph (Figure 3 and Table 1), reflecting their frequencies in the wider population.

Figure 3

Single sample Ne estimates and their associated 95% confidence intervals based on the LD approach of Waples and Do (2008) implemented in NeEstimator (Do et al., 2014) for Pcrit values of 0.01, 0.02 and 0.05 respectively.

Temporal estimators

Based on allele frequencies from 2002 and 2013 and assuming two generations between the samples, MLNE produced likelihood and moment-based estimates of 185.7 and 122.5 respectively (Figure 4 and Table 1). Exploring sensitivity to the number of generations assumed to separate the samples, we found that the estimates increased gradually from one to five generations (Figure 4).
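The dependence of temporal estimates on the assumed number of generations can be seen in the classic moment estimator, in which Ne is approximately t / (2[F - 1/(2S0) - 1/(2St)]), where F is the standardised variance in allele frequency change, t is the number of generations and S0 and St are the two sample sizes. The sketch below uses this textbook form as an assumption for illustration; it is not the estimator implemented in MLNE, and the F value and sample sizes are made up.

```python
def temporal_ne(f_stat: float, generations: float, s0: int, st: int) -> float:
    """Classic moment-based temporal estimate of effective size.

    f_stat      : standardised variance in allele frequency change
    generations : number of generations assumed between the samples
    s0, st      : number of individuals in the first and second sample
    """
    # Remove the part of F expected from sampling a finite number of individuals.
    drift = f_stat - 1.0 / (2 * s0) - 1.0 / (2 * st)
    if drift <= 0:
        return float("inf")
    return generations / (2.0 * drift)


# Hypothetical inputs: the same F interpreted under one to five generations.
for t in range(1, 6):
    print(t, round(temporal_ne(0.03, t, 100, 100), 1))
```

For a fixed amount of allele frequency change, the estimate scales linearly with the number of generations assumed, so a gradual increase across one to five generations is the expected behaviour of this class of estimator.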

Figure 4

Likelihood and moment-based temporal estimates obtained using the approach of Wang and Whitlock (2003), based on the comparison of 2002 with 2013 and with an increasing number of generations from one to five assumed to separate the samples.

Temporal patterns

With a 12-year time series and the majority of breeding buzzards observed and their offspring sampled, we could explore temporal changes in the estimated effective number of parents in relation to the observed dynamics of the study population (Figure 5 and Table 1). The results were reasonably insensitive to Pcrit and showed appreciable variation over the course of the study. In particular, the estimate was typically in the order of 25–100 for the period leading up to and including 2009, then increased towards a peak of 250–500 in 2011–2012. This increase broadly coincides with a period of rapid population growth in which Nc more than doubled between 2009 and 2012 (Figure 5b and Table 1) and the frequency of the light morph also increased (Figure 5c; F1,8=26.26, P<0.001 for the period 2002–2013 inclusive).

Figure 5

Temporal variation in the common buzzard study population between 2002 and 2013. (a) Annual estimates of the effective number of parents with 95% confidence intervals based on the LD method (Waples and Do, 2008) with Pcrit values of 0.01, 0.02 and 0.05, respectively. (b) Observed number of parents (Nb, black line) and census population size (Nc, dashed line). (c) Proportion of the light morph in the population. (d) Ratios of Nb to Nc (light grey bars), of the estimated effective number of parents to Nc (dark grey bars) and of the estimated effective number of parents to Nb (black bars) based on a Pcrit value of 0.02.

For every year of the study, the estimated effective number of parents was smaller than the corresponding number of breeders (Nb) which, in turn, was smaller than the census size (Nc). This is reflected in the Nb/Nc ratios shown in Figure 5d that were consistently <1 and fell to between 0.83 and 0.55 for the period 2004–2009, when the population experienced particularly low breeding success. Consequently, the ratios of the estimated effective number of parents to Nc and to Nb (shown for Pcrit=0.02 in Figure 5d) were also at their lowest during this period. Analysing all of the years together, we observed positive but nonsignificant correlations between the estimate (based on Pcrit=0.02) and both Nc (F1,8=1.93, P=0.20) and Nb (F1,8=2.14, P=0.18). The ratio of the estimate to Nc did not correlate significantly with Nc (F1,8=0.10, P=0.76).

Population structure

As population structure can cause inaccuracies in Ne estimation (Waples and England, 2011; Neel et al., 2013; Gilbert and Whitlock, 2015), we used two approaches to test for the presence of cryptic population structure within the study area. Arguably the most versatile tests of population structure need not rely on knowledge of where individuals were sampled. Consequently, we first implemented a Bayesian cluster analysis using the program STRUCTURE (Pritchard et al., 2000) to determine whether any genetic structure could be detected in the absence of a priori geographic data. The resulting posterior probabilities were highly concordant among replicate runs, with the highest average value indicating the most likely number of clusters, K. The average log likelihood value climbed steadily with increasing K to peak at K=20 (Supplementary Figure S3). However, when this analysis was repeated separately for each of the twelve years, the maximal average log likelihood values were mainly associated with K=1, suggesting a lack of detectable population structure within years (Supplementary Figure S4). The only exceptions were 2005 and 2013, for which the most likely genetic structure consisted of six and four clusters respectively. Visual inspection of individual cluster memberships for these years revealed considerable admixture and no clear evidence for the presence of distinct sub-populations.

As a further test of population structure that exploits prior information on sampling locations, we implemented an analysis of molecular variance. For this analysis, we compared animals sampled to the north and south of the Teutoburger Wald, a low mountain region that bisects the study area. Around 1% of the variance was partitioned between the north and south (F=0.03, P=0.001) indicating the presence of weak but statistically significant population substructure. To test whether this has an effect on the effective population size estimates, we repeated the temporal analysis shown in Figure 5a separately for the northern and southern sub-populations. The same overall trend was observed for the northern sub-population, whereas values were both lower and less variable over time for the southern sub-population (Supplementary Figure S1A). This is consistent with census data (Supplementary Figure S1B) and suggests that the overall pattern shown in Supplementary Figure S1 is driven by changes in the larger northern sub-population.

Sensitivity to sample size

Sample sizes were generally lower in the early part of the study, which appears to be reflected to some extent in the corresponding effective population size estimates. We therefore tested for a relationship between the annual estimate and sample size. This was not statistically significant (F1,10=2.76, P=0.13), suggesting that sample size does not have a strong effect on the magnitude of the estimates. To further explore whether our conclusions could be affected by variation in sample size, we generated estimates for pooled data corresponding to the periods 2002–2009 and 2010–2013 inclusive. To mimic sampling effects, we then randomly selected differently sized subsets of individuals 10 times each. The resulting values were consistently greater for the latter period (Supplementary Figure S2) and no clear relationship was found between the estimate and sample size for either period.
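The resampling scheme described above can be sketched as follows. The estimator is passed in as a callback because the actual estimates were produced with external software; the function name, the subset sizes and the placeholder estimator used in the usage example are all hypothetical.

```python
import random


def subsample_estimates(individuals, sizes, estimator, replicates=10, seed=0):
    """Apply an estimator to random subsets of several sizes.

    individuals : list of sampled individuals (any representation)
    sizes       : iterable of subset sizes to draw
    estimator   : callable taking a list of individuals, returning a number
    replicates  : number of random subsets drawn per size
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    results = {}
    for size in sizes:
        results[size] = [
            estimator(rng.sample(individuals, size))  # sampling without replacement
            for _ in range(replicates)
        ]
    return results


# Usage with a placeholder estimator that simply counts individuals.
pooled = [f"chick_{i}" for i in range(200)]
out = subsample_estimates(pooled, sizes=(50, 100, 150), estimator=len)
print({size: values[:3] for size, values in out.items()})
```

Comparing the spread of estimates within each subset size with the difference between periods indicates whether an apparent increase could be explained by sample size alone.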

Discussion

Long-term genetic studies are essential for understanding temporal variation in the effective population size of natural populations. We therefore generated a large microsatellite data set for an intensively monitored common buzzard population in northern Germany. The 12-year duration of the study allowed us to use both single-sample and temporal approaches that yielded comparable estimates, at least for the latter part of the study. Analysis of 12 successive cohorts also uncovered appreciable temporal heterogeneity, with the annual estimate varying by a factor of 14 over the course of the study.

We used multiple approaches to estimate the effective population size of our study population of common buzzards. However, because the population has expanded in recent years, we took the conservative measure of focussing on those estimators that are least sensitive to immigration into the focal population (Gilbert and Whitlock, 2015). We found that the results of the LD method were reasonably consistent across a range of Pcrit values from 0.01 to 0.05. One relevant feature of common buzzards is the presence of three distinct colour morphs that differ markedly in their frequency in the study population (Krüger et al., 2001). Accordingly, the estimate was smallest for the dark morph, intermediate for the light morph and largest for the intermediate morph. Although heterozygote advantage operates at the colour locus in this population, with the intermediate morph having the greatest average lifetime reproductive success (Chakarov et al., 2008), we recently showed that the three morphs do not differ significantly in their genome-wide heterozygosity, nor are they genetically differentiated from one another (Boerner et al., 2013). Variation in the effective population sizes of the morphs is therefore unlikely to be an artefact of differences in genome-wide heterozygosity or population substructure and instead appears to be a reflection of the relative frequencies of the morphs in the population.

We also used the LD method to generate annual effective population size estimates. These generally had rather small 95% confidence intervals, consistent with a simulation study (Waples and Do, 2008) showing that the LD method can generate precise estimates with 10–20 microsatellites and 50 individuals sampled where Ne is less than 500. The estimates themselves varied by a factor of 14 over the duration of the study and were consistently lower during 2002–2009. The smallest estimate of 25.5 was obtained for 2008, a year in which breeding success was particularly low. Afterwards, the estimates steadily increased towards a peak of 367.2 in 2011. Although the exact causes of this increase are not known, census data indicate that the population more than doubled from 2009 to 2012, whereas over the same period the frequency of the light morph increased from 30 to 41%. Hence, temporal patterns in the estimates appear to capture underlying population processes that are also reflected in the census data.
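The factor of 14 quoted above follows directly from the smallest and largest annual estimates given in the text, as this short calculation shows (variable names are our own).

```python
lowest_estimate = 25.5    # smallest annual estimate (2008)
highest_estimate = 367.2  # peak annual estimate (2011)

fold_change = highest_estimate / lowest_estimate
print(f"fold change: {fold_change:.1f}")
```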

Possible causes for population growth include increased survival and local recruitment, and immigration from further afield. However, the former is unlikely to have played a major role as, despite having fitted almost all fledged chicks from the study area with clearly visible wing tags, we have not observed an increase in the number of local birds recruiting into the population (Chakarov et al., 2013). In fact, many new territories have been founded by individuals without wing tags (O Krüger, personal communication). As we are confident that these individuals are not local, the only explanation remaining is that population growth is attributable at least in part to immigration. This is also consistent with changes over time both in the frequency of the light morph and in the proportion of pairwise comparisons among loci yielding significant LD values (see below).

As the LD approach will be affected by the presence of physically linked markers, we checked for LD among the 12 microsatellites in each of the years. We did not find consistent patterns of association between particular pairs of loci, and this agrees with the fact that most of the markers map to different chromosomes in the zebra finch. However, the proportion of pairwise comparisons yielding significant test statistics varied from year to year, being mostly low but reaching highs of 53% in 2010 and 38% in 2012 (Figure 2). Although this could partly reflect larger sample sizes for these two particular years, over the same period we found very little variation in either observed heterozygosity (Table 2) or standardised allelic richness (Table 3), suggesting that genetic diversity has not altered appreciably. Similarly, the results of the Structure analysis suggest that there has been no apparent increase in the amount of population structure over time in parallel with the Ne estimates. Thus, it seems likely that NeEstimator is detecting changes in LD within the population. These changes could potentially be a consequence of gene flow into the focal population, as this is known to create LD when allele frequencies are unequal among populations exchanging migrants (Gilbert and Whitlock, 2015).

Table 3 Standardised allelic richness with s.d. for the 12 microsatellite loci used in this study for each year

Although it has been estimated that 80% of buzzards recruit to within 20 km of their natal territories (Zang et al., 1989) and 96% within 100 km (Walls and Kenward, 1998), our genetic data are not consistent with localised immigration from the vicinity of our study population, as the high mobility of common buzzards should reduce population structure over this scale. However, longer-distance dispersal does happen regularly in this species (Cramp and Simmons, 1980; Kenward et al., 2001) and buzzard dispersal is strongly influenced by weather patterns (Walls et al., 2005). In particular, harsh winters are known to severely affect buzzards (Cramp and Simmons, 1980) and the winter of 2009–2010 was among the harshest in recent years, which might have induced significant dispersal events. On current evidence, it remains speculative where the influx of light birds may have come from, as there is no clear evidence for clinal variation in buzzard morph frequency.

One caveat to the LD method is that LD can be generated by many different phenomena, from inbreeding through population structure to immigration (Luikart et al., 2010). Inbreeding is unlikely to be important in this system, partly because the species is monogamous but also because a previous study found no evidence for inbreeding (Boerner et al., 2013). However, population structure could potentially be present as the study area is bisected by the Teutoburger Wald. To check for the presence of discrete populations within the study area, we therefore used Bayesian cluster analysis and analysis of molecular variance. The Structure results hinted at the presence of multiple clusters when all of the data were analysed together but generally indicated a lack of structure when the years were analysed separately. Such a pattern could potentially arise because buzzard pairs often breed across multiple years, meaning that a stronger signal of family structure may be present in the full data set relative to individual years.

In contrast, analysis of molecular variance uncovered a small but significant genetic difference between buzzards breeding to the north and south of the Teutoburger Wald. This most likely reflects differences in habitat suitability, as similar differences between the northern and southern parts of our study area have been documented for buzzard survival in interaction with local weather patterns (Jonker et al., 2014). This is probably a reflection of the predominance of Scots pine forest on sandy soils in the southern area that is known to be a suboptimal habitat for buzzards (Krüger, 2004).

Sensitivity of effective population size estimators to violation of the assumption of discrete, non-overlapping generations is an issue that hampers many studies of natural populations (Waples et al., 2014; Kamath et al., 2015). Although samples that combine multiple cohorts (strictly, as many cohorts as there are in a generation) estimate Ne, single-cohort samples are usually thought of as providing information about the effective number of parents that produced the sample, that is, Nb (Schwartz et al., 1999; Leberg, 2005; Waples and Do, 2010; Waples and England, 2011; Waples et al., 2014). With knowledge of the ratio of Nb/Ne, which can be estimated from life history information (Waples et al., 2013), it is possible to directly estimate Ne. However, this was not possible for our study population because the majority of breeding individuals are of unknown age.

Another important caveat is that for the temporal method the number of generations between samples affects both precision and bias (Waples and Yokota, 2007; Gilbert and Whitlock, 2015). Precision can be reasonably good with as few as two generations separating samples, but age structure can cause bias by violating the assumption of discrete generations (Jorde and Ryman, 1995; Wang et al., 2010). Although the magnitude and direction of bias depend on the species’ life history and the type of samples taken, as a guideline it has been suggested that at least 3–5 generations are needed to minimise bias (Waples and Yokota, 2007). In the case of our study population, wing-tag data indicate that local recruitment is negligible, suggesting that there will be little if any overlap between individuals sampled as chicks at the beginning of the study and adults that contributed chicks towards the population in the later part of the study. However, to quantify the magnitude of bias will require at least one more decade of genetic sampling.

Despite the above caveats, we believe that our main conclusions are unlikely to be strongly affected by the presence of overlapping generations for three main reasons. First, we are more interested in the relative values of the estimates than in their absolute values. This is why we estimated Ne based on pooled samples for the different colour morphs, as the dark morph in particular is too infrequent to be able to generate meaningful estimates separately for each of the years. Second, although the results of the temporal analysis may be subject to bias (see above) and vary with the precise estimator used and the number of generations assumed to separate the samples, arguably the temporal estimates are reasonably consistent with the single-sample Ne estimates obtained for the latter years of the study, as also shown by Miller and Waits (2003) and Rowe and Beebee (2004). Third, any temporal variation in Ne across the study will tend to be dampened by individuals who contribute towards estimates in successive years, making our temporal analysis, if anything, somewhat conservative.

One potential issue with our study is that sample sizes varied from year to year and were generally lower in the first half of the study, where the corresponding Ne values were also smaller. We therefore tested whether the conclusion of a temporal increase in Ne was robust to sample size variation. For this analysis, we pooled data for the periods 2002–2009 and 2010–2013 inclusive, and then calculated Ne for each period using differently sized random subsamples. Regardless of sample size, consistently larger estimates were obtained for the latter period, in support of a genuine temporal increase in Ne. Consistent with the observation that annual Ne did not correlate significantly with sample size, we also found no indication of an increase in Ne with the number of randomly selected individuals.
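
The logic of this subsampling test can be sketched as follows. This is a minimal illustration rather than the analysis pipeline used here: the function name `subsample_estimates`, the number of replicates and the `estimator` callable (a stand-in for whichever Ne estimator is applied to a list of genotyped individuals) are all hypothetical.

```python
import random
import statistics


def subsample_estimates(individuals, sizes, estimator, replicates=100, seed=0):
    """Median estimate for each subsample size.

    individuals: pooled list of genotyped individuals for one period.
    sizes: subsample sizes to evaluate; sizes larger than the pool are skipped.
    estimator: hypothetical callable mapping a list of individuals to an
        Ne estimate (the real estimator is not specified here).
    """
    rng = random.Random(seed)  # fixed seed so the draws are repeatable
    medians = {}
    for n in sizes:
        if n > len(individuals):
            continue
        # Draw without replacement and re-estimate for each replicate.
        values = [estimator(rng.sample(individuals, n)) for _ in range(replicates)]
        medians[n] = statistics.median(values)
    return medians
```

Running the function separately on the two pooled periods and comparing the medians at matched subsample sizes would show whether the later period yields larger estimates irrespective of the number of individuals drawn.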

A handful of previous studies have used similar approaches to explore temporal patterns in the effective population size of natural populations, although for species like bears with relatively long generation times this requires the analysis of museum specimens (see, for example, Miller and Waits, 2003). In one study, 20 years of archived chinook salmon scales were analysed to reveal a long-term decrease in Ne despite the census size having increased by a factor of five (Shrimpton and Heath, 2003). Another study used a time series of European brown bear samples to document a temporal increase in Ne (Skrbinšek et al., 2012) that is very similar to the one we observe. They concluded that this could be related to population growth, but lacked census data with which to test this.

An important quantity in conservation genetics is the ratio of Ne to Nc (Palstra and Ruzzante, 2008) and there is considerable interest in whether this might change over time. In natural salmonid populations, species with lower census sizes tend to have higher Ne to Nc ratios, a finding that has been attributed to ‘genetic compensation’, a buffering mechanism that could help to retain genetic diversity in small populations (Palstra and Ruzzante, 2008). The same has also been reported within species using time series data from wild Atlantic salmon and steelhead trout populations (Ardren and Kapuscinski, 2003). We found no evidence of compensation in this common buzzard population, but this makes sense given the species has very low reproductive skew. In contrast, salmonids have much higher reproductive skew and thus compensation can occur, for instance, if juvenile parr have increased reproductive success at low densities (Palstra and Ruzzante, 2008).

Finally, it is worth considering the practical implications of our findings. Estimating Ne separately for each of the years resulted in a range of values that mostly fell above the inbreeding avoidance criterion of 50 proposed by Franklin (1980), Lande (1988), Franklin and Frankham (1998) as well as Lynch and Lande (1998). This is consistent with recent work on the same buzzard population that found no evidence of inbreeding (Boerner et al., 2013). However, estimates for 2006–2008 inclusive fell below the proposed threshold and thus very different conclusions could be reached depending on the year in question. Taken together with the documented increase in census size, our results suggest that our buzzard population is not at imminent risk of extinction, and argue that caution is warranted when drawing firm conclusions on the basis of a single sample.

Conclusions

Effective population size monitoring is advocated in conservation and management programmes (Schwartz et al., 1999; Leberg, 2005; Schwartz et al., 2007), yet collecting long-term observational and genetic data from natural populations presents a major challenge. We generated a microsatellite data set spanning over a decade for an intensively monitored buzzard population that uncovered marked temporal variation in Ne. Further long-term studies of natural populations are needed in order to generalise our findings to other species and ecological contexts.

Data archiving

Data sets used in the analysis of this paper are available from the Dryad Digital Repository: http://dx.doi.org/10.5061/dryad.jr107.