Introduction

Rapid land use changes, most severely the loss or fragmentation of large, homogeneous stretches of habitat, constitute a main driver of the ongoing biodiversity loss (Pimm and Raven, 2000; Mace and Purvis, 2008; Laurance, 2010). One of the processes underlying biodiversity loss is the increase of the relative impact of demographic stochasticity in small and isolated remnant populations (Melbourne and Hastings, 2008) that may lead to a cascade of interacting genetic and demographic factors (Allendorf et al., 2013). Fluctuations in population size (Frankham, 1995; Vucetich et al., 1997), unequal sex ratios (Frankham, 1995) or large variance in reproductive success (Nunney, 1996; Storz et al., 2001) may all reduce the effective size (Ne; Wright, 1931) relative to the absolute number of individuals in a population (Caballero, 1994; Leberg, 2005). The Ne of a population is defined as the number of breeding individuals in an idealized population that experience the same amount of random genetic drift or the same amount of inbreeding as the population under consideration (Wright, 1931). A reduction of Ne, in turn, may therefore result in increased genetic drift and inbreeding depression (Reed and Frankham, 2003) and reduced genetic diversity (Frankham, 1996). The latter has been associated with reductions in various fitness traits such as growth, survival and disease resistance (Falconer and Mackay, 1996; Reed and Frankham, 2003), and the feedback spiral between effective population size and genetic diversity may ultimately increase the probability of population extinction (Allendorf et al., 2013). Immigration from neighboring populations, however, may dampen demographic and genetic effects of population subdivision through demographic population growth and increased genetic variation as a result of gene flow (Hanski and Gilpin, 1991; Clobert et al., 2001).

Although effective population size is considered a key parameter when aiming to understand evolutionary processes (Charlesworth, 2009) or predict the viability of endangered populations (Frankham, 2005), its estimation remains difficult in practice (Luikart et al., 2010; Waples and Do, 2010; Gilbert and Whitlock, 2015). Contemporary Ne estimates based on genetic methods can be obtained from a population sampled at one point in time (so-called single sample estimators) or from multiple temporal samples (temporal estimators). The two approaches, however, do not estimate the same conceptual Ne: single sample estimators are related to ‘inbreeding Ne’ and provide an estimate of the effective number of breeders, whereas temporal estimators are based on the premise that temporal variance in neutral allele frequencies, and therefore the amount of random genetic drift, is inversely proportional to the effective population size and, as such, estimate the harmonic mean Ne over time (also called ‘variance Ne’) (Waples, 2005; Luikart et al., 2010). Each method is expected to perform differently depending on population structure, gene flow, population size and sampling effort, and can be considered to provide independent information (Waples, 2005; Luikart et al., 2010; Barker, 2011; Holleley et al., 2014; Gilbert and Whitlock, 2015). However, both estimates are expected to converge unless populations are permanently subdivided or are decreasing or increasing in size (Felsenstein, 1971; Wang, 1997a, 1997b).

Despite the increased focus on urban ecology (see, for example, Gil and Brumm, 2013; Inger et al., 2014) nourished by the unprecedented rates of urban sprawling and the knowledge that urban species will comprise a significant component of future global biodiversity (Müller et al., 2010), Ne estimates of species typical for urbanized areas remain scant. Furthermore, most urban studies focus on current levels of genetic variation, differentiation or gene flow to infer population health or predict population trends (see, for example, Vangestel et al., 2012; Saarikivi et al., 2013; Brashear et al., 2015). Yet, a more powerful approach relies on a temporal comparison of genetic data from current populations with those collected in the same locations before a population decline or subdivision (Schwartz et al., 2007; Habel et al., 2013), for example, through the use of museum specimens (Wandeler et al., 2007; see Kekkonen et al., 2011a for an overview on avian studies that compare historical and contemporary samples), as the direct consequences of demographic changes can be evaluated. Indeed, knowledge of the genetic structure and diversity before a decline allows one to assess to what extent current genetic patterns are a direct consequence of such decline, rather than reflecting species-specific properties (Matocq and Villablanca, 2001). Besides, genetic time series are also required for temporal Ne estimation methods (Palstra and Ruzzante, 2008) that are considered to be more accurate and precise (Gilbert and Whitlock, 2015).

Although long regarded as a thriving and ubiquitous urban species (Chace and Walsh, 2004), house sparrows (Passer domesticus) have suffered a dramatic decline in abundance and distribution across large parts of Europe (Hole et al., 2002; Chamberlain et al., 2007; De Laet and Summers-Smith, 2007; Inger et al., 2014; http://bd.eionet.europa.eu/article12/summary?period=1&subject=A620) and evidence has mounted that these reductions vary considerably across locations and in their timing (De Laet and Summers-Smith, 2007; Shaw et al., 2008). The rapid decline in sparrows has been partly attributed to agricultural intensification in rural areas and the loss of green spaces in urban areas, both resulting in reduced food availability (Chamberlain et al., 2007; De Laet and Summers-Smith, 2007). House sparrows are among the most sedentary of all temperate passerines, with juveniles dispersing in a ‘stepping-stone’ manner, postnatal dispersal distances being typically short (1.0–1.7 km; Anderson, 2006) and adult birds exhibiting high breeding site fidelity (Summers-Smith, 1988; Heij and Moeliker, 1990; Anderson, 2006). Such philopatric behavior may result in local extinctions not being compensated by recolonization, and hence may transform contiguous populations into patchy ones, a pattern currently observed in highly built-up areas (Shaw et al., 2008; Vangestel et al., 2011). Despite this highly sedentary behavior of house sparrows, several studies failed to detect large-scale genetic differentiation in the absence of geographical barriers (Fleischer, 1983; Parkin and Cole, 1984; Kekkonen et al., 2011b; but see Jensen et al., 2013 for the island effect), even after a severe population decline (Kekkonen et al., 2011a; Schrey et al., 2011). This lack of differentiation may be explained by the fact that even a few individuals dispersing in a stepping-stone pattern suffice to maintain genetic homogeneity across large geographic distances (Allendorf, 1983).
Although some studies have investigated the genetic structure of house sparrows, few have targeted the genetic consequences of the population crash, especially in terms of demography and effective population sizes (Schrey et al., 2011; Vangestel et al., 2012; Baalsrud et al., 2014); only a single study (Kekkonen et al., 2011a) has used a spatiotemporal sampling design to study the effect of past sparrow declines on contemporary genetic patterns, but without assessing Ne.

As elsewhere in Europe, the Netherlands experienced a dramatic house sparrow decline over the past decades starting around 1980 (detailed overview in Heij, 2006). Before that year, only a single localized population decline was recorded in 1928, probably as a result of a pathogen outbreak (Anonymous, 1928 in Heij, 2006). Although a survey (1973–1977) conducted by SOVON still yielded mean population densities of 100 pairs per ha and a total Dutch population estimate of 1 000 000–2 000 000 pairs (Teixeira, 1979), local population declines south of Amsterdam (Het Gooi, Vechtstreek) were reported from 1981 onwards (Woldendorp, 1981) and subsequently went into alarming population crashes, in particular in larger cities such as Rotterdam, where local densities of originally 10 pairs per ha completely disappeared within only a few years (Van der Poel, 1998; Heij, 2006). As a result, the Dutch breeding populations suffered a dramatic reduction of 50% between 1980 and 2002 (Heij, 2006). During the past decade, population sizes appear to have stabilized to post-decline numbers (Hustings et al., 2004). To test whether this well-documented decline also resulted in genetic signatures of population reduction and subdivision, we here conduct a longitudinal comparative study between historical (pre decline) and contemporary (post decline) populations to determine whether, and to what extent: (1) genetic differentiation has increased over time, (2) contemporary populations are subjected to increased levels of genetic erosion and (3) effective population sizes decreased in a comparable way as census population sizes.

Materials and methods

Population sampling

Genetic samples were obtained from seven Dutch populations before (museum specimens) and after (wild-caught individuals) the population decline (details in Table 1 and Figure 1). A total of 187 (pre decline) individuals had been collected between 1906 and 1981 and were sampled at the Naturalis Biodiversity Centre (Leiden, The Netherlands), where they are currently kept as study skins. From each specimen, one toe pad was removed for DNA extraction. A total of 177 (post decline) individuals were trapped in 2011 with standard mist nets at the same locations where historical samples had been collected. Upon capture, each bird was ringed, standard morphological measurements were taken and a small sample of body feathers was collected to extract DNA. After processing, all birds were released at their original site of capture. Although only adult birds were sampled in both pre- and post-decline periods, this did not exclude the occasional sampling of close relatives. However, although nonrandom sampling may potentially bias metrics of genetic diversity, such effects are supposed to be minimal as a previous study on house sparrow populations in a similar habitat reported only a small proportion of close kin within a population. Moreover, such small numbers of close relatives had no measurable effect on estimates of genetic diversity at the population level (Vangestel et al., 2012).

Table 1 Number of Passer domesticus individuals sampled per locality and per time period
Figure 1 Map showing the seven sampling locations in the Netherlands: Amsterdam (Am), Berg-en-Terblijt (B&T), Leiden (Le), Twello (Tw), Voorschoten (Vo), Wilp (Wi) and Zoetermeer (Zo). Urbanized areas, extracted from the CORINE land cover data, are shown in gray.

DNA extraction, PCR and microsatellite selection

We used the NucleoSpin Tissue kit (Macherey-Nagel, Düren, Germany) to extract DNA from feathers and toe pads. A large set of microsatellite primers is available for this species and has been successfully applied in previous house sparrow studies (Neumann and Wetton, 1996; Griffith et al., 2007; Dawson et al., 2010; Vangestel et al., 2011, 2012). We genotyped all individuals at nine preselected microsatellite loci characterized by low-complexity peak patterns. Primers were assembled into three multiplexes, each containing three different primer pairs. The first multiplex reaction contained primers Pdo10 (Griffith et al., 2007), Pdo19 and Pdo22 (Dawson et al., 2012); the second one contained primer Pdo47 (Dawson et al., 2012), Pdoμ1 (Neumann and Wetton, 1996) and TG04-12 (Dawson et al., 2010); the third one contained Pdo16, Pdo32 (Dawson et al., 2012) and TG01-048 (Dawson et al., 2010). PCR reactions were performed on a 2720 Thermal Cycler (Applied Biosystems, Foster City, CA, USA) in 4.5 μl volumes, containing 1.5 μl of genomic DNA, 1.5 μl Qiagen Multiplex PCR Mastermix (Qiagen, Venlo, The Netherlands) and 1.5 μl primer mix (0.2 μM each). The PCR profile consisted of an initial denaturation step of 15 min at 95 °C, followed by 35 cycles of 30 s at 94 °C, 90 s at 57 °C and 60 s at 72 °C. Finally, an elongation step of 30 min at 60 °C was included. The PCR products were separated and visualized with an ABI 3130XL Genetic Analyzer (Applied Biosystems). Genotypes were scored with GENEIOUS 7.0.5 (Kearse et al., 2012).

MICRO-CHECKER 2.2.3 (Van Oosterhout et al., 2006) was used (10 000 Monte Carlo simulations and 95% confidence intervals) to identify scoring errors due to stuttering, differential amplification of size-variant alleles causing large allele dropout or presence of null alleles. Locus Pdo32 showed evidence for null alleles in 6 out of 14 population samples: Le (pre decline), Wi (pre- and post-decline), B&T (pre- and post-decline) and Tw (post decline). As the pattern was not consistent across populations and because pairwise FST estimates were similar with and without this locus (see further), we maintained it for subsequent analyses. After Bonferroni correction for multiple testing, none of the locus pairs showed significant linkage disequilibrium and only locus Pdo32 showed a significant deficit in heterozygotes compared with Hardy–Weinberg expectations (GENEPOP 4.2.1; Raymond and Rousset, 1995).

Genetic diversity and population structure

Patterns of genetic diversity were quantified by allelic richness corrected for sample size using FSTAT 2.9.3 (Goudet, 1995), by observed (Ho) and expected (He) heterozygosity using ARLEQUIN 3.5 (Excoffier et al., 2005), and by the number of private alleles using GENALEX 6.501 (Peakall and Smouse, 2012). Statistical significance levels were assessed using a Wilcoxon signed-rank test. GENEPOP was used to estimate pairwise FST (θ; Weir and Cockerham, 1984) as a measure of between-population genetic differentiation. Temporal changes in genetic differentiation were assessed by plotting pairwise FST values among post-decline samples against those obtained from pre-decline samples. Residual values were calculated as the difference between post-decline FST values and the identity line (FST pre decline equals FST post decline), and allowed us to assess whether genetic structure increased (positive residuals) or decreased (negative residuals) over time. We conducted a Mantel test between matrices of pre-decline pairwise FST values and the pairwise residuals to test whether the size of genetic change was related to initial levels of historical differentiation.
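The residual and Mantel steps described above can be sketched as follows. This is a minimal illustration, not the software used in the study: the two 4×4 FST matrices are hypothetical values, the function names are our own, and the permutation scheme (jointly shuffling rows and columns of one matrix) is the standard Mantel procedure rather than a confirmed detail of the original analysis.

```python
import random
from itertools import combinations

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def mantel(a, b, n_perm=999, seed=1):
    """Two-sided Mantel test between symmetric matrices a and b.

    Rows and columns of b are permuted jointly; the P value is the share
    of permutations with |r| at least as large as the observed |r|.
    """
    n = len(a)
    pairs = list(combinations(range(n), 2))
    xa = [a[i][j] for i, j in pairs]
    r_obs = pearson(xa, [b[i][j] for i, j in pairs])
    rng = random.Random(seed)
    idx = list(range(n))
    hits = 1  # the observed arrangement counts as one permutation
    for _ in range(n_perm):
        rng.shuffle(idx)
        r = pearson(xa, [b[idx[i]][idx[j]] for i, j in pairs])
        if abs(r) >= abs(r_obs) - 1e-12:
            hits += 1
    return r_obs, hits / (n_perm + 1)

# Hypothetical pairwise FST matrices for four populations (not study data).
fst_pre = [[0.000, 0.001, 0.004, 0.010],
           [0.001, 0.000, 0.003, 0.012],
           [0.004, 0.003, 0.000, 0.008],
           [0.010, 0.012, 0.008, 0.000]]
fst_post = [[0.000, 0.009, 0.010, 0.012],
            [0.009, 0.000, 0.008, 0.013],
            [0.010, 0.008, 0.000, 0.011],
            [0.012, 0.013, 0.011, 0.000]]

# Residual from the identity line: positive means differentiation increased.
residuals = [[post - pre for pre, post in zip(row_pre, row_post)]
             for row_pre, row_post in zip(fst_pre, fst_post)]

r_obs, p_value = mantel(fst_pre, residuals)
```

A negative `r_obs` here would indicate that pairs with low initial differentiation changed the most, which is the pattern the test is designed to detect.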

Spatial genetic structure for each period was assessed using the Bayesian clustering method implemented in the R package GENELAND 4.0.5 (Guillot et al., 2005b; R 3.0.3, R Core team, 2014). GENELAND uses a colored Poisson-Voronoi tessellation to model the a priori distribution of population clusters across space that yields a decrease in the probability that two individuals belong to the same population with geographic distance (Guillot et al., 2005a). First, we estimated the number of clusters K under the spatial model with correlated allele frequencies (Guillot, 2008) and without admixture, using 10 independent Markov chain Monte Carlo runs of 500 000 iterations each with a thinning interval of 50 and a post burn-in of 2000 iterations. Priors for the number of clusters K were set from 1 to 7, that is, reflecting the number of populations. Second, we repeated this procedure by fixing K to the inferred value in order to estimate allele frequencies and cluster locations. Third, we estimated the admixture proportions conditioned by the data and parameter estimates obtained from the nonadmixture run with the highest log-posterior probability by using the same Markov chain Monte Carlo parameters (Guedj and Guillot, 2011). Admixture proportion plots were constructed using DISTRUCT 1.1 (Rosenberg, 2003).

Fine-scaled patterns of genetic structure were assessed using the spatial autocorrelation analysis in SPAGEDI 1.4 (Hardy and Vekemans, 2002) that quantifies the association between matrices of pairwise genetic and spatial distances (Vekemans and Hardy, 2004). We therefore used the individual pairwise kinship coefficient Fij (Loiselle et al., 1995) as a measure of correlation between allelic states. To reduce bias due to unequal sample sizes, distance classes were defined in such a way that the number of pairwise comparisons within each distance interval was approximately constant, that is, 0–5 km, 6–45 km, 46–115 km, 116–160 km and 161–180 km. Confidence intervals for each average Fij were calculated by permuting multilocus genotypes and spatial coordinates (10 000 iterations) under the null hypothesis of no genetic structure.
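One way to define distance classes with approximately equal numbers of pairwise comparisons is to place the class boundaries at quantiles of the sorted pairwise distances. The sketch below illustrates that idea only; the function names and the distances are hypothetical, and it is not a description of how SPAGEDI or the authors set the boundaries. With tied distances the counts per class may be unequal.

```python
def equal_count_bounds(distances, n_classes):
    """Upper bounds of distance classes holding roughly equal numbers of pairs."""
    if not 1 <= n_classes <= len(distances):
        raise ValueError("n_classes must be between 1 and the number of pairs")
    d = sorted(distances)
    n = len(d)
    # The k-th bound is the largest distance among the first k/n_classes of pairs.
    return [d[(k * n) // n_classes - 1] for k in range(1, n_classes + 1)]

def count_per_class(distances, bounds):
    """Number of pairs falling in each class (upper bound inclusive)."""
    counts = [0] * len(bounds)
    for x in distances:
        for i, upper in enumerate(bounds):
            if x <= upper:
                counts[i] += 1
                break
    return counts

# Hypothetical pairwise distances in km (not study data).
pair_km = [1, 2, 3, 4, 10, 20, 30, 40, 100, 110, 120, 130]
bounds = equal_count_bounds(pair_km, 3)
counts = count_per_class(pair_km, bounds)
```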

Gene flow

Gene flow between post-decline populations was inferred by estimating previous generation migration rates with BIMR 1.0 (Faubet and Gaggiotti, 2008). The sampling scheme over the pre-decline period did not allow us to estimate gene flow between pre-decline populations. BIMR uses the multilocus genetic disequilibrium in migrants and their recent descendants to infer the proportion of immigrants in a given population and, as such, relaxes the assumption of Hardy–Weinberg equilibrium (Faubet and Gaggiotti, 2008). To allow convergence, we only estimated migration rates between four populations/clusters based on their location, that is, Am, B&T, Le-Vo-Zo and Tw-Wi, and ran 10 replicates. Each replicate started with 20 short pilot runs of 1000 iterations followed by 500 000 iterations discarded as burn-in. We then ran 1 000 000 iterations from which samples were drawn every 50 iterations for a total sample size of 20 000 samples for each replicate. We used the model with correlated allele frequencies (F-model) that accounts for population admixture that may have taken place before the last generation of migration, as this procedure is believed to improve migration estimates (Faubet and Gaggiotti, 2008). Parameter estimates were selected from the run with the highest log-likelihood value.

Effective population size

We estimated effective population sizes for each population separately. We used single sample estimators to investigate changes in Ne over time and temporal approaches to study the variance in allele frequencies generated by genetic drift. We used three different single sample estimators to estimate Ne for both time periods. (1) The sibship method implemented in Colony2 2.0.5.9 (Wang, 2009) estimates NeSib based on a sibship assignment analysis. We used the full-likelihood method with a weak prior probability assuming random mating and monogamy. (2) The linkage disequilibrium method implemented in NEESTIMATOR 2.1 (Do et al., 2014) estimates NeLD through the linkage disequilibrium that arises because of genetic drift. For this we used a minimum allele frequency of 0.02 and generated 95% confidence intervals by jackknifing. (3) The approximate Bayesian computation (ABC) method implemented in ONESAMP (Tallmon et al., 2008) estimates NeABC by comparing eight summary statistics (including linkage disequilibrium). The minimum and maximum values for Ne were set to 2 and 10 000, respectively. To assess whether Ne declined over time, we conducted for each method a one-sided nonparametric sign test in StatXact (version 5.0.3, Cytel Software, Cambridge, MA, USA), utilizing the paired block design to account for potential biases introduced through differences in time spans between pre- and post-decline sampling events across locations.
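The logic of a one-sided paired sign test can be illustrated with an exact binomial calculation. The sketch below is an assumption-laden illustration: the function name and the paired Ne values are hypothetical, and the exact binomial form shown here is not necessarily the variant implemented in StatXact, so it is not expected to reproduce the Z and P values reported in the Results.

```python
from math import comb

def sign_test_decrease(pre, post):
    """One-sided exact sign test for a decrease from pre to post.

    Ties are dropped. Under the null hypothesis, decreases and increases
    are equally likely, so the number of decreases is Binomial(n, 0.5).
    Returns (number of decreases, number of non-tied pairs, P value).
    """
    diffs = [b - a for a, b in zip(pre, post) if b != a]
    n = len(diffs)
    k = sum(1 for d in diffs if d < 0)
    p = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return k, n, p

# Hypothetical paired Ne point estimates for eight locations (not study data).
ne_pre = [120, 95, 80, 150, 60, 110, 70, 130]
ne_post = [60, 50, 85, 90, 40, 70, 30, 100]

n_decrease, n_pairs, p_one_sided = sign_test_decrease(ne_pre, ne_post)
```

Because the test uses only the direction of each paired change, it is insensitive to the very different scales of the Ne estimates across locations.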

As temporal methods use samples of the same population obtained at multiple time points to estimate Ne, only one estimate can be obtained for each population. We generated estimates from four different temporal methods. (1) CONE 1.01 is based on the coalescent of gene copies drawn in the second period and uses Monte Carlo computations to calculate the likelihoods of specific Ne (Berthier et al., 2002; Anderson, 2005). Here, we used a similar range (minimum 2, maximum 10 000) as for ONESAMP and a sampling interval of 5 and ran the model for 5000 Monte Carlo chains. (2) TEMPOFS estimates genetic drift between temporally spaced samples using the Fs measure of allele frequency change (Jorde and Ryman, 2007). In this program we employed sampling plan II and assumed a generation time of 2 years. (3) MLNE (Wang and Whitlock, 2003) uses a maximum likelihood approach to estimate drift between temporally spaced populations. Given that pre-decline samples were collected at different times, it was not possible to estimate migration between these populations and the option of no gene flow was hence selected. Finally, we used (4) the moment-based estimator implemented in MLNE. In order to obtain one estimate for the single sample method and one for the temporal method for each population, estimates of each method were combined by calculating the harmonic means and s.d. values (s.d.=sqrt((mean(1/x))^(−4) × var(1/x)/length(x)), where x refers to the array of point estimates of Ne for a population). However, it is important to note that the s.d. values of the harmonic means are only based on the point estimates and not their confidence intervals.
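The harmonic mean and its s.d. as given in the formula above can be computed as in the following sketch. The function name and the three Ne values are hypothetical, and `var` is taken to be the sample variance (denominator n − 1), which is an assumption based on the R-style notation of the formula.

```python
from statistics import mean, variance

def harmonic_mean_with_sd(estimates):
    """Harmonic mean of Ne point estimates and its approximate s.d.

    Implements s.d. = sqrt(mean(1/x)^(-4) * var(1/x) / length(x)),
    where var is assumed to be the sample variance (n - 1 denominator).
    Requires at least two estimates.
    """
    inv = [1.0 / x for x in estimates]
    m = mean(inv)
    hm = 1.0 / m
    sd = (m ** -4 * variance(inv) / len(inv)) ** 0.5
    return hm, sd

# Hypothetical point estimates from three methods (not study data).
ne_estimates = [100, 200, 400]
hm, sd = harmonic_mean_with_sd(ne_estimates)
```

As noted in the text, this s.d. reflects only the spread among point estimates and ignores the confidence interval of each individual estimate.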

Population bottlenecks

Recent decreases in Ne were investigated using BOTTLENECK 1.2.02 (Piry et al., 1999) that generates the distribution of He under mutation–drift equilibrium for each locus and population under the assumption that reductions in the number of alleles precede those in heterozygosity (resulting in heterozygosity excess) in recently bottlenecked populations. Data were simulated under the infinite allele model, the stepwise mutation model (representing two extreme mutation models; Cornuet and Luikart, 1997) and the two-phase model by combining 90% single- and 10% multi-step mutations, with a variance of 30 among multiple-step mutations (10 000 replications). Expected values were compared with observed heterozygosity levels calculated from observed allele frequencies (Nei et al., 1975). Wilcoxon signed-rank tests were applied to assess statistical significance of heterozygote excess. Finally, we inspected distribution patterns of allele frequencies to assess mode shifts from low allele frequency classes to intermediate ones, considered indicative for recent bottlenecks (Luikart et al., 1998).
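Two of the quantities used above can be illustrated with a short sketch: expected heterozygosity computed from observed allele frequencies, and a simple check for a mode shift in the allele frequency distribution. The function names, the 0.1-wide frequency classes and the example frequencies are assumptions for illustration; the optional small-sample correction is a commonly used form and is not confirmed as the one applied by BOTTLENECK.

```python
from math import ceil

def expected_heterozygosity(freqs, n_gene_copies=None):
    """He = 1 - sum(p_i^2), optionally scaled by n / (n - 1) for sample size."""
    h = 1.0 - sum(p * p for p in freqs)
    if n_gene_copies is not None:
        h *= n_gene_copies / (n_gene_copies - 1)
    return h

def allele_frequency_classes(freqs, width=0.1):
    """Count alleles per frequency class (0, 0.1], (0.1, 0.2], ..., (0.9, 1.0]."""
    n_classes = round(1 / width)
    counts = [0] * n_classes
    for p in freqs:
        if p > 0:
            # Small tolerance keeps boundary values such as 0.3 in (0.2, 0.3].
            idx = min(max(ceil(p / width - 1e-9) - 1, 0), n_classes - 1)
            counts[idx] += 1
    return counts

def is_mode_shifted(freqs):
    """True when the lowest frequency class is no longer the most populated."""
    counts = allele_frequency_classes(freqs)
    return counts[0] < max(counts)

# Hypothetical allele frequencies at one locus (not study data).
l_shaped = [0.5, 0.3, 0.1, 0.05, 0.05]   # many rare alleles
shifted = [0.4, 0.35, 0.25]              # rare alleles lost

he = expected_heterozygosity(l_shaped)
```

In practice the allele frequency classes would be pooled across loci within a population before the distribution is inspected.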

Results

Genetic diversity and population structure

Levels of allelic richness, observed and expected heterozygosity and the number of private alleles did not significantly differ between periods (two-tailed probabilities: all >0.05) and estimates of different populations were highly similar (Table 2). Pairwise FST values ranged from −0.0031 to 0.0152 (mean FST±s.d.: 0.0037±0.0048) for pre-decline samples, and from −0.0023 to 0.0216 (mean FST±s.d.: 0.0085±0.0068) for post-decline samples. Comparing levels of genetic differentiation between periods indicated a temporal shift in the magnitude of pairwise fixation indices. The majority of pairwise FST values increased over time and this effect was more pronounced when populations were characterized by low levels of differentiation in the past as indicated by the Mantel test, that is, differences between pre- and post-decline FST values were negatively correlated with pre-decline measures of differentiation (r=−0.67, P=0.004; Figure 2 and Supplementary Appendix 1).

Table 2 Estimates of allelic richness (AR), observed heterozygosity (Ho), expected heterozygosity (He) and number of private alleles (PA) in seven Dutch populations of Passer domesticus pre decline (period 1) and post decline (period 2)
Figure 2 Comparison of pairwise FST in pre- and post-decline samples of P. domesticus. The identity line represents the case where pre- and post-decline pairwise FST would be equal.

Spatial genetic structure was analyzed using GENELAND. For both periods and all replicates, the posterior distribution of K under the nonadmixture model showed the highest posterior probability for K=2 (Supplementary Appendix 2). Yet, for pre-decline samples, post estimation of cluster membership yielded complete admixture of individuals in all replicates (Figure 3a), suggesting a single admixing population. In the post-decline samples, however, there was an almost equal posterior probability for K=3 (Supplementary Appendix 2). To discriminate between the two models, we repeated the procedure while fixing K=3 instead of K=2 (see Materials and methods for details) and computed the modal population for each pixel. As one of the three inferred clusters did not appear to be the modal population for any pixel (see also Supplementary Appendix 3), this cluster can be regarded as a ghost population (sensu Guillot, 2008) and was subsequently ignored. Under the K=2 model, the first cluster comprised populations Am, Le, Vo and Zo and the second one comprised Tw, Wi and B&T (Figure 3b).

Figure 3 Genetic structure plot of (a) pre-decline and (b) post-decline samples as inferred from Bayesian genetic clustering for K=2. Each bar represents an individual partitioned according to its probability of assignment to a cluster.

During both periods, a significant positive genetic autocorrelation for distance pairs below 5 km emerged (Supplementary Appendix 4), whereas Fij values were not significant beyond this distance class. On average, Fij values tended to be higher for post-decline samples (Fij (95% confidence interval): 0.007 (0.00443–0.0099)) compared with pre-decline samples (0.004 (0.0016–0.0067)). Although the pre- and post-decline confidence intervals overlapped extensively, the mean Fij value of each period was not included within the confidence interval of its temporal counterpart, hence suggesting at least subtle differences in kinship coefficients.

Gene flow

Immigration rates in post-decline samples estimated from the replicates with the highest log-likelihood ranged from 0.111 (0.007; 0.594) (from Le-Vo-Zo into B&T) to 0.282 (0.009; 0.726) (from B&T into Tw-Wi). Proportions of nonmigrant individuals ranged from 0.351 (0.035; 0.937) (Am) to 0.534 (0.081; 0.915) (B&T).

Effective population sizes and bottlenecks

Single sample estimates of effective population sizes varied both among populations and estimators (Figure 4 and Supplementary Appendix 5). NeABC generally yielded smaller estimates than NeLD and also had smaller confidence intervals, suggesting a higher precision of the estimates, whereas NeSib in most cases yielded intermediate estimates. For the pre-decline period, some populations had negative estimates of NeLD. Negative Ne estimates occur when the genetic data can be explained entirely by sampling error without invoking genetic drift (Waples and Do, 2010). However, they were included in the harmonic means (Table 3) in order to avoid downward bias in the composite estimate of Ne as suggested by the software developers (Waples and Do, 2010). NeSib decreased significantly over time (Z=1.89, P=0.029); all populations, with the exception of Leiden, showed a reduction in NeSib. Such a pattern, however, could not be confirmed by temporal changes in NeABC (Z=−1.134, P=0.13). Although three out of four locations showed a decrease in NeLD estimates, no formal statistical test could be conducted as three locations returned negative estimates.

Figure 4 Estimates of effective population size of seven Dutch P. domesticus populations. The left side of each plot corresponds to pre- and post-decline estimates based on three single sample methods (NeSib, NeABC and NeLD). Percentages of change in Ne for each method are represented below the post-decline estimates. The right side of each plot corresponds to the variance effective population size estimates (on another scale) based on three temporal methods (TempoFS, MLNE and Moment-based method; CONE estimates were excluded as they were all infinite). Error bars show 95% confidence intervals. Negative, infinite and large estimates are replaced by ‘n.a.’. All estimated values are reported in Supplementary Appendix 5.

Table 3 Harmonic means of effective population size of seven Dutch Passer domesticus populations calculated from three single sample estimates (NeSib, NeABC and NeLD) and three temporal estimates (TEMPOFS, MLNE and moment-based method; CONE estimates were excluded as they were all infinite)

Temporal estimates differed largely among methods; generally, estimates were much larger than for single sample estimators and many estimates included infinity or the maximum allowed value in their confidence intervals (Figure 4 and Supplementary Appendix 5). The CONE method (results not shown) gave infinite estimates for Ne across all populations. Hence, we calculated harmonic means based on the three other methods only (Table 3). The smallest Ne was estimated for Vo with a harmonic mean of 642 (±131 s.d.) individuals, whereas Wi showed the largest Ne with a harmonic mean of 6 094 (±3112 s.d.) individuals. We could not generate a reliable mean for the Amsterdam population as the MLNE estimate reached the maximum allowed Ne.

Based on the shape of the allele frequency distributions, none of the populations appeared to have experienced a recent reduction in effective population size (Supplementary Appendix 6). Results from the comparison between the expected heterozygosity calculated from observed allele frequencies and the heterozygosity expected at mutation–drift equilibrium strongly hinged on the mutation model used, probably because of the violation of the assumption of closed populations without migration. For the infinite allele model, all populations except Wi post decline showed signs of a bottleneck, whereas for the two-phase model, only Am pre decline was significant. Under the stepwise mutation model, Tw and Wi pre decline and Zo post decline showed a significant deficiency of heterozygosity.

Discussion

Spatiotemporal analysis of house sparrow genetics within an urbanized Dutch landscape revealed a progressive decrease in effective population sizes and genetic connectivity over time, whereas genetic diversity remained largely constant. Genetic differentiation among populations was low at both time points, yet tended to increase after the population decline, a trend corroborated by a decrease in genetic admixture. Although our study design did not allow us to fully discriminate between the relative effects of gene flow and genetic drift, we argue that longitudinal analysis of historic museum specimens and contemporary samples collected in the same locations improves the statistical inference of key genetic and demographic parameters, such as effective population sizes.

Pairwise FST values revealed increasing genetic differentiation over time, although values remained low during both periods and were not affected by between-location differences in pre-decline sampling intervals (Supplementary Appendix 7). These results are largely in line with an earlier study on Finnish house sparrows (Kekkonen et al., 2011a). However, contrary to these authors, the spatially explicit clustering analysis applied in our study also provided evidence for a temporal decrease in genetic admixture, possibly because of the a priori assumption of spatial dependence of individuals that is thought to be biologically sound (Guillot et al., 2005a). Indeed, analyses based on this assumption earlier proved to perform well under weak levels of population differentiation (Guillot, 2008; Safner et al., 2011), in particular for detecting recent barriers to gene flow (Coulon et al., 2006; Safner et al., 2011; Blair et al., 2012). In our study, post-decline population clusters most likely resulted from a progressive connectivity loss at the landscape level, rather than from a distinct geographical barrier to dispersal (Jensen et al., 2013). At a smaller spatial scale, Vangestel et al. (2011) earlier revealed higher average levels of genetic relatedness among house sparrows from more urbanized areas in Flanders, most likely reflecting reduced dispersal in more built-up habitats. Post-decline populations from the western Dutch cluster are embedded within a highly urbanized area on lowland peat (on or below sea level) with large cities and many new townships, whereas those belonging to the eastern cluster are located on sandy soils (well above sea level) and consist of small townships in a semi-open landscape. Between both regions, the landscape is partly forested, but mainly consists of agricultural landscapes interspersed with urban areas.

Contrary to our expectation, however, the presumed temporal loss in connectivity among the Dutch populations did not coincide with a reduction in genetic diversity. Such an apparent discrepancy between different genetic signatures of population subdivision may have multiple (nonexclusive) reasons. First, reduced populations may still retain a large proportion of the original genetic variation; this pattern, however, is more common in species with long generation times (Hailer et al., 2006; Lippé et al., 2006). Second, after a demographic bottleneck, genetic diversity reaches a new equilibrium at a much slower rate than genetic differentiation does. This may result in a lag phase between changes in census population size and their signature in genetic diversity (Varvio et al., 1986; Habel et al., 2015). As such, increased levels of genetic differentiation, as shown in this study, might be a forerunner of strong genetic erosion in the near future (Kekkonen et al., 2011a; Habel et al., 2015).
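The slow response of genetic diversity to a decline can be illustrated with the classical expectation that heterozygosity in an idealized population decreases by a factor of 1 − 1/(2Ne) per generation. The Python sketch below applies this textbook result only; it assumes constant Ne and no mutation or gene flow, the function name is hypothetical, and the Ne values and generation counts are illustrative assumptions rather than estimates from this study.

```python
def heterozygosity_retained(ne: float, generations: int) -> float:
    """Expected fraction of initial heterozygosity remaining after drift.

    Classical expectation H_t / H_0 = (1 - 1/(2*Ne))**t for an idealized
    population with constant Ne, no mutation and no gene flow.
    """
    if ne < 0.5:
        raise ValueError("ne must be at least 0.5")
    if generations < 0:
        raise ValueError("generations must be non-negative")
    return (1.0 - 1.0 / (2.0 * ne)) ** generations


# Illustrative values only (assumed, not taken from the study).
for ne in (50, 500):
    for t in (10, 20):
        retained = heterozygosity_retained(ne, t)
        print(f"Ne={ne}, generations={t}: fraction retained={retained:.3f}")
```

Under these simplifying assumptions, heterozygosity erodes slowly unless Ne is very small, which is consistent with a lag between a demographic decline and a detectable loss of genetic diversity.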

In contrast to the maintained levels of genetic variation, NeSib and NeLD tended to decrease in most populations after the demographic population decline, although this trend was not corroborated by the NeABC estimates. Most reductions in post-decline effective population size were in line with the 41–66% reduction in census sizes that Dutch house sparrow populations suffered between 1984 and 2012 (http://bd.eionet.europa.eu/article12/summary?period=1&subject=A620; The state of Europe’s common birds 2007). Although temporal Ne estimates were about one order of magnitude higher than the corresponding single sample estimates, both methods in general revealed higher Ne values in populations Wi, Tw and B&T, and hence lower presumed levels of genetic drift. However, as statistical Ne estimates are thought to be biased by the presence of migration (which may result in either under- or overestimation of the true Ne; Wang and Whitlock, 2003; Waples and England, 2011), the observed trends in Ne may also be attributed to temporal shifts in gene flow, or to combined changes in genetic drift and gene flow. Furthermore, strong disagreement between variance and inbreeding Ne may point toward changes in demography, which is in line with the observed severe population declines (Wang, 1997b).

One limitation of our data is the long time frame within which the pre-decline samples were obtained. If strong demographic changes occurred within this time frame, our pre-decline estimates may be biased; however, the literature gives no indication of pre-decline fluctuations in population size. Along the same lines, genetic signatures of demographic bottlenecks were equivocal and strongly depended on the presumed underlying mutation model. Again, several nonexclusive explanations may account for this. First, Ne to Nc ratios may have shifted over time, with an increased proportion of individuals taking part in reproduction after the population decline (Frankham, 1995). Apart from the fact that such a shift was not predicted a priori, it is also rather unlikely in our case, as population counts were based on the number of breeding males. Second, substantial heterozygosity loss is only expected in the presence of a severe and sudden bottleneck (Lozier and Cameron, 2009), whereas a more gradual population decline, such as that observed in European house sparrows, would result in low statistical power to detect heterozygosity excess if present (Cornuet and Luikart, 1997). Third, one of the key assumptions underlying the tests implemented in BOTTLENECK is the strict absence of dispersal (Broquet et al., 2010), which does not correspond with the low levels of FST and high levels of population admixture present in our study populations. Although the effects of violating this assumption are not yet well documented, high levels of gene flow may indeed blur the genetic signature of population bottlenecks.

Conclusion

Although population genetics provides us with a variety of proxies to estimate demographic changes and population structure, these do not provide direct tests of temporal variation in population structure. The latter requires the analysis of population samples taken at consecutive points in time, that is, before and after demographic events (Habel et al., 2013). Although such sampling may require long-term field studies, museum collections can provide a convenient alternative. The use of museum specimens in this study was particularly relevant to quantify changes in genetic population structure following the reported house sparrow decline after the 1970s. As shown by Kekkonen et al. (2011a) and in this study, current house sparrow populations are still characterized by high levels of genetic connectivity and diversity. Such an observation alone may lead one to conclude that the demographic population decline had no (or very little) effect on the genetic makeup of house sparrows. Yet, our temporal analysis revealed a clear genetic signature of population subdivision over time. Conversely, in a spatiotemporal study on the Hawaiian Goose (Branta sandvicensis), Paxinos et al. (2002) showed that the contemporary low mitochondrial DNA variability was not the result of a post-1800 bottleneck, but rather reflected a species-specific property, as low variability was already present in pre-decline samples. Using a similar line of reasoning, Callens et al. (2011) showed that levels of historical decrease in mobility, rather than contemporary mobility, best explained among-species variation in sensitivity to tropical forest fragmentation. We hence conclude that longitudinal surveys of genetic population structure, such as the one reported here, support the unique value of natural history collections as underutilized biological resources.

Data archiving

Data available from the Dryad Digital Repository: http://dx.doi.org/10.5061/dryad.h138f.