Introduction

The major climate fluctuations of the Quaternary period, most recently the Pleistocene glaciations, played a key role in shaping today’s populations (Hewitt, 2000; 2004). Temperate species were unable to survive in their pre-glaciation range and persisted instead in ice-free regions known as refugia. A combination of fossil evidence, pollen data, molecular data and sediment cores supports the presence of multiple refugia during the last glacial maximum (LGM; the Wisconsin glaciation) in the mid-latitude United States and in Beringia (Pielou, 1991; Provan and Bennett, 2008; Shafer et al., 2010b). A number of smaller refugia are purported to have existed along the periphery of the ice sheets. The now submerged Atlantic Shelf refugium in eastern North America (Pielou, 1991) probably supported numerous tree (Walter and Epperson, 2001; Jaramillo-Correa et al., 2004) and bird (Gill et al., 1993; Zink et al., 2003) species.

The boreal chickadee (Poecile hudsonicus) is a small, resident, boreal songbird found throughout Canada and the northern United States (Figure 1; Ficken et al., 1996). Banding records (Canadian Bird Banding Office) are sparse, with all recaptured birds found within 40 km of the initial banding site (n=21; recaptures occurred 4–60 months after initial banding). The boreal chickadee specialises on trees with soft heartwood, such as black and white spruce (Picea mariana and Picea glauca) and balsam fir (Abies balsamea), and is found in northern boreal forests (Ficken et al., 1996). The current distribution was almost entirely glaciated during the LGM and includes both potential refugia (Beringia (Alaska), the Atlantic shelf and areas south of the ice sheets) and physical barriers (the Rocky Mountains, the Wrangell and Chugach Mountains (Alaska), the Cabot Strait (between Newfoundland and Nova Scotia) and the Strait of Belle Isle (between Newfoundland and Quebec/Labrador)). Barriers such as water may restrict dispersal, as boreal chickadees are absent from the large western islands (that is, Vancouver Island, Haida Gwaii and the Alexander Archipelago), and mountains may be restricting their western distribution, as they are not found west of the Coast Range in British Columbia (Ficken et al., 1996).

Figure 1

The distribution of the boreal chickadee across North America (shaded), and the geographical distribution of the three clusters found in BAPS v5.2: eastern (black), western (white) and continent-wide (grey) groups. Sampling sites are as follows: AKA (Alaska Anchorage), AKF (Alaska Fairbanks), AKW (Alaska Wrangell St Elias), NBC (northern British Columbia), CBC (central British Columbia), CAB (central Alberta), SAB (southern Alberta), SK (Saskatchewan), NON (northern Ontario), NQC (northern Quebec), NY (New York), NSNB (Nova Scotia and New Brunswick), NL (Newfoundland), and LAB (Labrador). Physical barriers are shown by triangles (mountains) and dashes (bodies of water). Distribution modified from Birds of North America online (Ficken et al., 1996).

Gill et al. (1993) examined populations of North American chickadees using mitochondrial DNA (mtDNA) restriction fragment-length polymorphisms; in the boreal chickadee, they looked at 37 birds from seven populations. Little genetic differentiation was seen in the continental populations (n=25), which roughly correspond to the previously glaciated region, and a unique haplotype was shared among all Atlantic Canada birds (n=12). The present study aims to expand upon these findings using an intensive, range-wide sampling regime and more sensitive mtDNA sequencing and microsatellite analyses in combination with spatio-geographic modelling. The combination of nuclear and mitochondrial markers will allow for a more complete picture of the chickadee’s history than is currently available and avoids the limitations of single gene inferences. Population genetic and phylogeographic studies are increasingly employing niche modelling as an additional tool to assess historical areas of suitable habitat (for example, Graham and Burg, 2012; Ralston and Kirchman, 2012; van Els et al., 2012).

Although a number of studies have examined the genetic structure of boreal forest trees (Walter and Epperson, 2001; Jaramillo-Correa et al., 2004; Anderson et al., 2006), few have looked at their avian inhabitants, especially at the continental scale (Ralston and Kirchman, 2012; van Els et al., 2012). This study aims to assess the population genetic structure of the boreal chickadee across its range and to understand what role the Pleistocene glaciations and contemporary physical barriers have in this wholly boreal species. For this study we combined both genetic analyses and spatial modelling to ask:

1) What is the population genetic structure of the boreal chickadee?

2) Did the boreal chickadee survive in a single or multiple glacial refugia, and can the location of the refugia be identified? As the current distribution of boreal chickadees is widespread and includes both Alaska (Beringia) and Nova Scotia and Newfoundland (the putative Atlantic Shelf refugium), we predict that the populations expanded from multiple glacial refugia.

3) Do contemporary physical barriers impede dispersal and gene flow in the boreal chickadee? Mountains and large expanses of water have been shown to prevent dispersal in many species (for example, Cronin et al., 2005; Hearn et al., 2006), and different subspecies of boreal chickadees are found on either side of several barriers (for example, Rocky Mountains and Strait of Belle Isle; see Ficken et al., 1996). We predict reduced gene flow across these physical barriers.

Materials and methods

Sample collection

Two hundred and thirty blood samples were collected from 11 sampling locations over four breeding seasons (2007–2010; Figure 1, Supplementary Table S1). Sampling locations, hereafter referred to as populations, were limited to a 50-km radius, where possible, with no obvious barriers to dispersal within a sampling site. Fifty-three museum samples (tissue and toe-pads) augmented sample sizes from field sites and added three further sampling locations (Figure 1, Supplementary Tables S1 and S2). DNA was extracted using a modified chelex procedure (Walsh et al., 1991).

Lab protocols

MtDNA

Two mtDNA fragments were amplified: the control region partial domain I and II (CR) and the ATPase coding region (ATP), which contains ATPase 6, ATPase 8 and the tRNA lysine. The control region was amplified using the primers LmochCR1/H1015chCR (Lait et al., 2012). Some museum samples (toe-pads) required a semi-nested PCR using the primers L26chCR/H1015chCR, followed by LmochCR2/H1015chCR (Supplementary Table S3). The ATP gene fragment was amplified using universal avian primers L8929COII/H9855ATP6 (Sorenson et al., 1999), and toe-pad samples were amplified in two shorter, overlapping fragments using H534chATP/L8929COII and L298chATP/H9855ATP6 (Supplementary Table S3). Samples were sequenced on a 3130 Genetic Analyzer (Applied Biosystems, Foster City, CA, USA), and museum samples were sent to Genome Quebec for sequencing (McGill University, Montreal, QC, Canada).

Microsatellites

A small number of individuals from geographically distant populations (AKA and NL) were screened using 20 microsatellite primer pairs developed in a number of other passerines with an M13-tailed forward primer (see Burg et al., 2005). Of those that yielded PCR products, five loci were monomorphic (BT6, Escu1, Escu3, Pocc2 and Titgata88; Hanotte et al., 1994; Bensch et al., 1997; Otter et al., 2001; Wang et al., 2005), and eight were polymorphic (Escu4, Escu6, Pat14, Pat43, Pdo5, Ppi2, Titgata02 and Titgata39; Hanotte et al., 1994; Otter et al., 1998; Griffith et al., 1999; Martinez et al., 1999; Wang et al., 2005; Supplementary Table S3). The PCR used a two-step annealing process with eight cycles at T1 and 31 cycles at T2. For four of the loci (Escu6, Pat14, Ppi2 and Titgata39), the second step was reduced from 31 to 25 cycles (Supplementary Table S3). The PCR products were run on a 6% polyacrylamide gel using a LI-COR 4300 DNA Analyzer (LI-COR Inc., Lincoln, NE, USA). Alleles were scored manually by visual inspection and were independently confirmed by a second person. Four controls of known size were included on each gel, and a ‘mystery plate’ of 44 individuals was run as an additional test for consistency.

Genetic analyses

MtDNA

All analyses were done on a concatenated sequence of the CR and ATP fragments and on the two fragments separately; the results were similar and only the concatenated results are discussed further. The sequences were aligned in MEGA v5.05 (Tamura et al., 2011), and haplotypes were assigned manually and confirmed using TCS v1.21 (Clement et al., 2000). Nucleotide and haplotype diversity were calculated in DNASP v5 (Librado and Rozas, 2009). Pairwise genetic differences (p distances; ΦST; 10 000 permutations) were calculated in ARLEQUIN v3.11 (Excoffier et al., 2005), and a modified false discovery rate (FDR) correction (Benjamini and Yekutieli, 2001) was applied. A Mantel test was performed in GENEPOP v4.0.10 to test for isolation-by-distance (Raymond and Rousset, 1995; Rousset, 2008) using geographical distances calculated in GEOGRAPHIC DISTANCE MATRIX GENERATOR v1.2.3 (Ersts, 2010) and linearised ΦST values. Significance was tested using 10 000 permutations.
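For reference, haplotype diversity is conventionally computed with Nei's unbiased estimator, h = n/(n−1) × (1 − Σ p_i²), where p_i is the frequency of haplotype i in a sample of n sequences. The sketch below is a minimal illustration of that formula and not the DNASP implementation; the function name and the toy haplotype labels are hypothetical.

```python
from collections import Counter


def haplotype_diversity(haplotypes):
    """Nei's unbiased haplotype diversity for one population sample.

    `haplotypes` is a list with one haplotype label per sequenced individual.
    """
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two sequences")
    counts = Counter(haplotypes)
    # Sum of squared haplotype frequencies (probability of drawing the same
    # haplotype twice with replacement).
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)


# Toy samples (hypothetical labels, not data from this study).
all_unique = ["h1", "h2", "h3", "h4", "h5"]
one_common = ["h1", "h1", "h1", "h2", "h3"]

print(haplotype_diversity(all_unique))
print(haplotype_diversity(one_common))
```

A sample in which every sequence carries a different haplotype gives h = 1, which is how a value of 1.000 arises for a population.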

A statistical parsimony network was constructed in TCS v1.21 (Clement et al., 2000) with a 95% connection limit. A spatial analysis of molecular variance (SAMOVA) was run to identify genetic clusters independent of sampling locations using both geographical and genetic data (K=2–13; 100 iterations; Dupanloup et al., 2002). To test if any structure detected was due to historical separation in refugia or to contemporary barriers, a series of hierarchical AMOVAs (10 000 permutations) were run in ARLEQUIN v3.11 (Excoffier et al., 1992, 2005). The scenarios tested included two or three refugia (east and west; east, central and west; and east, southern (SAB) and west), one, two or three contemporary barriers (mountain ranges and water barriers as shown in Figure 1) and combinations of refugia and barriers.

A principal coordinates analysis (PCoA) was performed in GENALEX v6.3 (Peakall and Smouse, 2006) on both individuals and population pairwise ΦST values. BAPS (Bayesian Analysis of Population Structure) v5.2 (Corander et al., 2008) was used to assign individuals to K clusters based on genetic data, with no a priori population information. As per Corander et al. (2008), the analyses used the clustering with linked loci option (Corander and Tang, 2007) and varied K (K=1–14).

Approximate divergence dates and relative migration rate estimates were calculated using the isolation with migration model as implemented in IMA v3.5 (Hey and Nielsen, 2007). IMA allows two populations to be compared; as such, the eastern and western groups identified by SAMOVA were tested. Multiple initial runs were performed to identify upper parameter bounds and an ideal heating scheme. The three final runs were performed using the HKY model, 500 000 burn-in and 1 000 000 post burn-in MCMC (Markov Chain Monte Carlo) steps, geometric heating, different random number seeds and a weighted average divergence rate of 1.7% per Myr (see below) with a low (1.0% per Myr) and a high (5.0% per Myr) rate.

The control region and ATP divergence rates were estimated using an average cytochrome b (cyt b) rate of 2.1% per Myr (Weir and Schluter, 2008) and a combination of Poecile cyt b, ATP and CR sequences (Uimaniemi et al., 2003; Gill et al., 2005; Lait et al., 2012). The calculated divergence rate was 1.78% per Myr for ATP, 2.62% per Myr for control region domain I and 0.95% per Myr for domain II. Although these values are considerably lower than traditionally considered for the control region, similar rates have recently been calibrated in a number of parids (Kvist et al., 2001; Päckert et al., 2007) and other birds (Zink et al., 2003; Lerner et al., 2011). Although using sequences to calculate divergence times is not as ideal as using fossil calibrations, the calculations do show that cyt b, ATP and the control region have similar divergence rates in Poecile.

Microsatellites

Only individuals with genotypes from at least six loci were included in the analyses with the exception of museum samples from NON, NQC and NY where samples with up to four (50%) missing loci were included in some analyses. The SAB population was not genotyped. MICRO-CHECKER v2.2.3 was used to detect input errors, allelic dropout, slippage stutter or null alleles (Van Oosterhout et al., 2004). Exact tests were run in GENEPOP v4.0.10 (Raymond and Rousset, 1995; Rousset, 2008) to check for linkage disequilibrium and deviations from Hardy–Weinberg equilibrium using modified Markov chain parameters (1000 batches, 10 000 iterations and 10 000 dememorisation steps). All P-values were corrected for multiple tests using the modified FDR method (Benjamini and Yekutieli, 2001).
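The Benjamini and Yekutieli (2001) correction scales the significance level by a harmonic sum over the number of tests. The sketch below shows the single-threshold form that is often used in population genetics, rather than the full step-up procedure. The nominal α of 0.05 is an assumption (it is not stated in the text), and the function names are illustrative.

```python
def modified_fdr_threshold(n_tests, alpha=0.05):
    """Critical P-value: alpha divided by the harmonic sum 1 + 1/2 + ... + 1/n_tests.

    alpha=0.05 is an assumed nominal level, not a value taken from the text.
    """
    if n_tests < 1:
        raise ValueError("n_tests must be a positive integer")
    harmonic = sum(1.0 / i for i in range(1, n_tests + 1))
    return alpha / harmonic


def n_pairwise(n_populations):
    """Number of distinct population pairs."""
    return n_populations * (n_populations - 1) // 2


# Pairwise comparisons among 14 populations, and a set of 338 tests.
for n_tests in (n_pairwise(14), 338):
    print(n_tests, modified_fdr_threshold(n_tests))
```

Under the assumed α of 0.05, the 91 pairwise comparisons among 14 populations give a threshold of about 0.0098, and 338 tests give about 0.0078, values that round to 0.01 and 0.008.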

Population genetic diversity was compared using expected heterozygosity and allelic richness. Observed (HO) and expected (HE) heterozygosities were calculated in ARLEQUIN v3.11 (Excoffier et al., 2005). Allelic richness (AR) and private allelic richness (PAR) were calculated in HP-RARE v1.1 using rarefaction (Kalinowski, 2005). Both global and population pairwise FST and RST values were calculated in ARLEQUIN v3.11 (100 000 permutations; Excoffier et al., 2005); Jost’s Dest (Jost, 2008) was calculated in GENALEX v6.5b3 (Peakall and Smouse, 2012). All comparisons involving NON and NQC were run using four and five loci, respectively (excluding Escu6, Ppi2 and Pdo5 in both, as well as Escu4 in NON), and all other populations were analysed using both six (excluding Ppi2 and Pdo5; see Results) and eight loci. A modified FDR correction was applied (Benjamini and Yekutieli, 2001).
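Rarefaction makes allelic richness comparable across samples of unequal size by asking how many distinct alleles would be expected in a random subsample of g gene copies. A minimal sketch of the standard rarefaction formula for a single locus follows; it is not the HP-RARE code, the function name is hypothetical, and the allele counts are toy values rather than data from this study.

```python
from math import comb


def rarefied_allelic_richness(allele_counts, g):
    """Expected number of distinct alleles in a random subsample of g gene copies.

    `allele_counts` holds the number of copies of each allele observed at one
    locus in one population.
    """
    n = sum(allele_counts)
    if not 0 < g <= n:
        raise ValueError("g must be between 1 and the number of gene copies sampled")
    total = comb(n, g)
    # For each allele: 1 minus the probability that the subsample misses it.
    # math.comb returns 0 when g exceeds n - c, so the allele is then certain
    # to be included.
    return sum(1.0 - comb(n - c, g) / total for c in allele_counts)


# Toy locus with three alleles observed 5, 3 and 2 times (10 gene copies).
toy_counts = [5, 3, 2]
for g in (1, 2, 10):
    print(g, rarefied_allelic_richness(toy_counts, g))
```

Choosing g as the smallest sample size across populations means that every population is rarefied to the same number of gene copies before richness values are compared.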

Clustering analyses were run to delineate groupings based on individual genotypes using STRUCTURE v2.3 (Pritchard et al., 2000). STRUCTURE was run using uncorrelated allele frequencies and the admixture model, both with and without sampling locations as priors. The program was run for K=1–13 (10 runs each) with 100 000 burn-in and 300 000 post burn-in MCMC steps. The results of the 10 runs were averaged using STRUCTURE HARVESTER v0.6.6 (Earl and vonHoldt, 2012), and the most probable value of K was calculated using both the highest penalised log likelihood (Pritchard et al., 2000) and ΔK (Evanno et al., 2005). All runs were repeated for the 12 mainland populations excluding NL to test for additional substructure (see Results). In order to visualise the pattern of population genetic structure, a PCoA on population summary statistics (FST) was performed in GENALEX v6.3 (Peakall and Smouse, 2006), with significance tested in PCA-GEN v1.2.1 (Goudet, 1999).
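The ΔK statistic of Evanno et al. (2005) is the absolute second-order rate of change of the log likelihood with respect to K, divided by the standard deviation of the log likelihood across replicate runs at that K. The sketch below is a simplified illustration that takes the second difference of the mean log likelihoods. It is not the STRUCTURE HARVESTER code, the function name is hypothetical, and the log-likelihood values are made-up toy numbers, not results from this study.

```python
from statistics import mean, stdev


def delta_k(log_likelihoods):
    """Return {K: delta K} from a dict mapping K to replicate ln Pr(X|K) values.

    Delta K needs both neighbours (K-1 and K+1), so it is undefined for the
    smallest and largest K tested.
    """
    means = {k: mean(values) for k, values in log_likelihoods.items()}
    result = {}
    for k in sorted(log_likelihoods):
        if k - 1 in means and k + 1 in means:
            second_diff = abs(means[k + 1] - 2 * means[k] + means[k - 1])
            result[k] = second_diff / stdev(log_likelihoods[k])
    return result


# Made-up replicate log likelihoods for K = 1..4 (two runs each).
toy_runs = {
    1: [-100.0, -102.0],
    2: [-80.0, -82.0],
    3: [-79.0, -81.0],
    4: [-78.5, -80.5],
}
toy_delta_k = delta_k(toy_runs)
best_k = max(toy_delta_k, key=toy_delta_k.get)
print(toy_delta_k, best_k)
```

Because ΔK cannot be evaluated at K=1, it cannot by itself distinguish a single panmictic group from two or more clusters, which is one reason to consider the log likelihood criterion alongside it.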

Ecological niche modelling

Both contemporary (ecological niche modelling) and historical (palaeo-distributional modelling) patterns of species distribution were estimated in MAXENT v3.3.3 using a maximum entropy statistical model on presence-only occurrence data (Phillips et al., 2006; Phillips and Dudík, 2008). The model estimates the potential distribution of a species by assuming that the species is found in its preferred environmental conditions and that the current niche requirements are reflective of those in the past and/or future (that is, niche fidelity; Phillips et al., 2006; Richards et al., 2007).

Occurrence records comprised sampling locations from this study (n=279) and sightings downloaded from the Global Biodiversity Information Facility data portal (n=9624; GBIF data portal, 2011) and the Avian Knowledge Network (n=22 252; Avian Knowledge Network, 2009). The models were based on 19 WorldClim climatic variables (see Supplementary Table S4; Hijmans et al., 2005) extrapolated from GIS (Geographical Information Systems) layers as described in Carstens et al. (2007). The variables included averages, extremes and ranges of temperature and precipitation and were available for both the contemporary timescale and estimates for the LGM (ca. 21 kya). The Model for Interdisciplinary Research on Climate (MIROC) climate layers used as the past climate estimates were provided by the Palaeoclimate Modelling Intercomparison Project Phase II (Braconnot et al., 2007). Ten replicates were run using the cross-validation method.

Results

MtDNA

Two fragments of mtDNA were successfully amplified for 281 samples from 14 populations (GenBank accession numbers: JN654564-JN654638 (control region); JN654639-JN654699 (ATP)). A 766-bp fragment of the control region contained 56 variable sites, almost two-thirds of which were found in domain I (34 of the 56). A 923-bp fragment of coding DNA (ATP) contained 53 variable sites with 32 synonymous and 18 non-synonymous substitutions (and three in the tRNA). There were no unexpected stop codons. Both the CR and ATP fragments contained fixed differences between eastern (LAB, NL, NSNB and NY) and western (AKA, AKF, AKW, NBC, CBC, CAB and SAB) populations (with few exceptions). The concatenated fragment contained 105 transitions, three transversions and one insertion/deletion (in the control region). No nucleotide position had more than two base variants.

One hundred and twenty-seven haplotypes were identified, 26 shared between two or more individuals and 101 unique (Supplementary Table S5). No haplotypes were shared among the most eastern populations (LAB and NL) and western populations (AK, BC and AB). Haplotype diversity was high in all populations (0.600–1.000), with the highest values seen in AKF (0.986), SAB (1.000), NON (0.990) and LAB (0.935), and the lowest in NY (0.600) and NL (0.800; Supplementary Table S5). Nucleotide diversity ranged from 0.001 to 0.003 (Supplementary Table S5). Although both haplotype and nucleotide diversities were lower in NL, it had the second highest number of private haplotypes. Pairwise ΦST values (Table 1) showed NL to be significantly different from all other populations, and AKA and AKW were significantly different from all but each other. Following the modified FDR correction (Pcrit=0.01), 70 of 91 pairwise differences were significant, with most non-significant values found between neighbouring populations. A significant correlation between genetic and geographical distance (isolation-by-distance) was found (r2=0.51; P<0.001).

Table 1 Population pairwise differences based on 100 000 permutations for mtDNA (ΦST, above diagonal) and microsatellites (FST, below diagonal)

The statistical parsimony network showed very little overlap among the eastern (haplotypes P, R–Z) and western (haplotypes A–O) populations, whereas the central populations (SK, NON and NQC) were present throughout the network (Figure 2). Similar patterns were seen when the ATP and CR fragments were analysed separately (not shown). The SAMOVA detected the presence of two groups (ΦCT=0.305, P<0.001): a western group (AKA, AKF, AKW, NBC, CBC, CAB, SAB, SK and NON) and an eastern group (NQC, NSNB, LAB, NY and NL). In the hierarchical AMOVA, the largest ΦCT value was obtained when populations were clustered into the eastern and western groups as above (ΦCT=0.307, P<0.001), with the next highest when SAB was included as a third group (ΦCT=0.299, P<0.001; Supplementary Table S6). Separation of the samples into the eastern and western groups plus NL as a third group was also supported (ΦCT=0.281, P<0.001). The pattern could not be explained by contemporary barriers alone (highest ΦCT=0.194, P=0.008; Supplementary Table S6). Both the PCoA on the individual samples (not shown) and pairwise ΦST values (Figure 3a) supported the separation of the eastern and western populations, with central populations (SK, NON and NQC) falling intermediately. BAPS separated the samples into three clusters: one primarily in the east (CAB, SK, NON, NQC, NSNB, LAB, NY and NL), one in the west (AKA, AKF, AKW, NBC, CBC, CAB, SK and NON), and a continent-wide group (all populations; see Figures 1 and 2).

Figure 2

Statistical parsimony network of the concatenated mtDNA sequences. Each square represents an individual, the groups of squares (A–Z) represent shared haplotypes, each individual is colour-coded by population and the circles represent inferred haplotypes. The inset (top right) shows the network with only the two putative refugia (AKF and NL) coloured. The haplotypes which make up the BAPS clusters (Figure 1) are shown by the dashed (western—white), grey (continental—grey) and black (eastern—black) outlines.

Figure 3

PCoA based on the population pairwise (a) ΦST values and (b) FST values. With the mtDNA, most of the variation was explained by the first coordinate (74.0% coordinate 1, 10.0% coordinate 2 and 8.8% coordinate 3). With the microsatellite data, the partition of NL from the rest of the populations was along coordinate 1 (42.5%), while coordinate 2 (24.8%) separated the eastern and western populations. Populations are colour-coded as per Figure 2.

Approximate divergence times were calculated between the western (AKA, AKF, AKW, NBC, CBC, CAB, SAB, SK and NON) and eastern (LAB, NL, NSNB, NY and NQC) groups identified by SAMOVA. Marginal distribution curves were unimodal and approached 0 on each side. The mean separation date between the east and west was 78.6 kya (60.4–96.0 kya 90% HPD). The high (5% per Myr) and low (1% per Myr) rates gave dates of 26.7 kya (20.5–32.6 kya 90% HPD) and 133.3 kya (102.3–162.8 kya 90% HPD), respectively. Divergence estimates increased when the central (SK, NON and NQC) populations were excluded (104.3 kya; 65.6–141.8 kya 90% HPD). Gene flow estimates between western and eastern populations were relatively low (west to east=0.0021; east to west=0.0033 migrations per 1000 generations per gene copy); estimates dropped considerably when the three central populations were removed from the analysis (west to east=0.0012; east to west=0.0006).
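Because a divergence time is obtained by dividing a rate-scaled estimate by the assumed divergence rate, a date calculated under one rate can be approximately converted to another rate by inverse scaling. The sketch below illustrates that arithmetic only; it is not part of the IMA analysis, the function name is hypothetical, and it assumes the same underlying scaled estimate for every rate.

```python
def rescale_divergence_time(time_kya, rate_used, new_rate):
    """Convert a divergence time estimated under `rate_used` to `new_rate`.

    Rates are in % per Myr; the time scales inversely with the rate.
    """
    if rate_used <= 0 or new_rate <= 0:
        raise ValueError("rates must be positive")
    return time_kya * rate_used / new_rate


# Mean east/west separation date reported under the 1.7% per Myr rate.
mean_kya = 78.6
for rate in (5.0, 1.0):
    print(rate, round(rescale_divergence_time(mean_kya, 1.7, rate), 1))
```

Inverse scaling of the 78.6 kya mean gives values close to the reported 26.7 kya (high rate) and 133.3 kya (low rate); small differences are to be expected if the three rates were applied to separate runs.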

Microsatellites

After excluding individuals due to missing data, 260 individuals from 13 populations were successfully genotyped. Genotyping of Pdo5 and Ppi2 was not possible for museum samples from NON, NQC and many from NY, most probably due to the degraded nature of the toe-pad samples. All other populations had little or no missing data. MICRO-CHECKER v2.2.3 suggested the presence of null alleles in at least one of Pdo5 or Ppi2 for 7 of the 10 populations tested. Exact tests showed departures from Hardy–Weinberg equilibrium, after correction for multiple tests (Pcrit=0.01), at two loci: Pdo5 (AKA, AKF, NL; P<0.01) and Ppi2 (AKA, AKW, CBC, NSNB, NL; P<0.01). When testing for linkage, only 4 of 338 tests were significant (Pcrit=0.008; P<0.001). Pdo5 and Ppi2 were excluded from final analyses.

Observed heterozygosity (HO) was similar to expected heterozygosity (HE) across all populations for five of the six loci (Supplementary Table S7). Pat14 had significantly lower heterozygosity than expected (P<0.01). Observed heterozygosity ranged from 0.400 to 1.000 across all loci. Allelic richness was similar across populations (3.01–3.37, average 3.20; Supplementary Table S7), with private allelic richness highest in NSNB (0.51) and NL (0.50) relative to an average of 0.34.

Global FST, based on six loci, was 0.015 and highly significant (P<0.001). Population pairwise FST values showed significant differences between NL and all other populations (Table 1); otherwise, pairwise values were generally not significant (only 6 of the 66 non-NL comparisons were significant after modified FDR correction). A similar pattern was seen with both RST values and Jost’s Dest values (Supplementary Table S8) and when FST was calculated using all eight loci (global FST=0.013, P<0.001). To ensure that the missing data in NON, NQC and the small sample size of NY were not affecting the overall results, pairwise comparisons were also run excluding these populations. Results were similar and are not shown. A weak, but significant, isolation-by-distance pattern was present in the nuclear data (r2=0.05, P=0.002).

STRUCTURE identified two clusters: NL and all other populations (Figure 4). The two groups were supported by both log likelihood penalised tests (Bayes factor=1.00, ln Pr (X | K)=−7162) and ΔK. All Q values (ancestry coefficients) were >0.50, and 95% of individuals were assigned with Q>0.75, the exceptions being mostly NON samples (Figure 4). No additional substructure was seen when NL was removed from the analysis (not shown). The PCoA (Figure 3b) showed little genetic structure among the populations with the exception of NL. The broken-stick method employed in PCA-GEN v1.2.1 (Goudet, 1999) found only coordinate 1 to be significant (P<0.05).

Figure 4

Bayesian clustering analysis run in STRUCTURE v2.3 with six microsatellite loci for K=2. Each vertical line represents an individual, and the y axis denotes the cluster membership (Q).

Ecological niche modelling

The maximum entropy model performed significantly better than random, as shown by both the binomial test of omission (tested rate close to predicted rate; P<0.001) and the receiver operating characteristic analysis (area under the curve=0.829±0.004). The contribution of each of the environmental layers varied considerably (Supplementary Table S4). The present distribution predicted by the model (Figure 5a) resembles the current range of the boreal chickadee (Figure 1), with the northern extent well predicted and the southern boundary moved to the south. The predicted LGM distribution (ca. 21 kya; Figure 5b) shows that suitable habitat was probably present in Alaska, along the western coast of North America, in the southern United States and in Atlantic Canada.

Figure 5

The predicted distribution of the boreal chickadee using maximum entropy in MAXENT v3.3.3 for the (a) present and (b) LGM. The darker shading represents the most probable locations. S.d. was low for both timeframes.

Discussion

Significant population differentiation was found in boreal chickadees. At the level of populations, most of the pairwise ΦST values were significant, suggesting restricted maternal gene flow (Table 1). The non-significant comparisons involved adjacent populations and support an isolation-by-distance pattern whereby gene flow is restricted by geographical distance. By contrast, nuclear data support the genetic isolation of NL from mainland populations; however, few of the other populations are significantly differentiated from one another (Table 1, Figures 3 and 4). Uncovering different patterns with mtDNA and microsatellite data is not uncommon in phylogeographic and population genetic studies (see Zink and Barrowclough, 2008). Several factors could explain the dissimilar patterns between the two types of molecular markers, including the age of the population and mode of inheritance (female vs biparental inheritance).

Several species show evidence of sex-biased dispersal (Gibbs et al., 2000; Milot et al., 2000; Helbig et al., 2001; Austin et al., 2003; Bensch et al., 2006), including the yellow warbler which, although a long-distance migrant, showed patterns of population differentiation similar to those of the boreal chickadee at comparable sampling sites (Gibbs et al., 2000; Milot et al., 2000). Yellow warbler populations also showed higher levels of population differentiation at mtDNA genes than at microsatellite loci. Gibbs et al. (2000) attributed the patterns to male-biased dispersal but also examined the possibility that populations may be too young to be in equilibrium, so that differences between markers are due to recent ancestry. Hutchison and Templeton (1999) evaluated the relative influence of drift and migration on population differentiation and concluded that once isolated populations are in migration-drift equilibrium they may show isolation-by-distance. If either genetic drift or gene flow is higher relative to the other, then at large geographical distances, no isolation-by-distance pattern will be present. Boreal chickadee populations show significant isolation-by-distance patterns at both nuclear and organellar markers, suggesting the populations are in equilibrium, leaving the possibility of male-biased dispersal to explain the observed patterns. Higher gene flow in males could also explain why the east/west split detected among mainland populations with mtDNA is absent from the nuclear data (Table 1, Figures 3 and 4). The significant differentiation of NL from the mainland may be due to other factors, such as the presence of physical barriers (see below), that affect both male- and female-mediated dispersal.

The mtDNA data (haplotype distribution, high ΦST values, hierarchical AMOVA, significant SAMOVA, IMA and PCoA) support the separation of the eastern (Atlantic Canada and NY) and western (AK, BC and AB) groups of boreal chickadees with a central cline. The gradient seen is probably the result of a secondary contact zone with admixture of multiple lineages (Taberlet et al., 1998; Petit et al., 2003). Although boreal tree species commonly show evidence of a phylogenetic break in the Great Lakes region similar to that found in the boreal chickadee (Jaramillo-Correa et al., 2009), it is less commonly seen in avian species. North American birds tend to show genetic breaks further west across the Rocky Mountains between British Columbia and Alberta (Burg et al., 2005; Toews and Irwin, 2008) or on the east coast (Gill et al., 1993; Holder et al., 1999). The mtDNA pattern seen in the boreal chickadee more closely resembles that of the black and white spruce, two species with which it is closely associated.

Impact of glaciation

Pleistocene glaciations strongly influenced the population structure and distribution of many temperate species. In the case of the boreal chickadee, palaeo-modelling indicates suitable habitat was present in multiple refugia (Figure 5b) and divergence estimates suggest the eastern and western mtDNA lineages diverged during the last glaciation. Specifically, isolation with migration estimates (60–96 kya) place the divergence of the two lineages within the Wisconsin glaciation. When the ice sheets formed over 100 kya (Gillespie et al., 2004), the physical separation of the boreal chickadee populations occurred, thereby initiating genetic divergence in this species.

The location of suitable boreal chickadee habitat at the LGM in Beringia, the west coast of North America, the southern United States and Newfoundland (Figure 5b) is consistent with putative refugia described in other species (Chappell et al., 2004; Jaramillo-Correa et al., 2004; Colbeck et al., 2008; de Lafontaine et al., 2010; Graham and Burg, 2012; Pulgarín-R and Burg, 2012). Patterns of genetic diversity in the boreal chickadee were as expected if multiple refugial populations merged following prolonged isolation, with higher diversity and private haplotypes in or near putative refugia (AKF and Atlantic Canada; Supplementary Table S5) and evidence of secondary mixing in the centre portion of their range (that is, SK, NON and NQC; Taberlet et al., 1998; Petit et al., 2003). Unlike populations in the west and east, which contained only western or eastern haplotypes, respectively, central populations contained a subset of both haplotypes (Figure 2, Supplementary Table S5). If boreal chickadees survived in a single central refugium and populations expanded outwards, levels of genetic diversity would be higher in the centre of the range and would decrease with increasing distance (for example, diversity would be lower in AKF and NL). Newly colonised populations would contain a subset of haplotypes found in the central area, a pattern seen in blackpoll warblers (Ralston and Kirchman, 2012). Several lines of evidence support the presence of multiple Pleistocene refugia, namely Alaska and the Atlantic Coast, for the boreal chickadee.

AKF, a population in the area of Beringia, contains high levels of mtDNA diversity, a large number of private haplotypes and almost all of the haplotypes shared among western populations (Figure 2, Supplementary Table S5). Although the majority of species thought to have persisted in Beringia are restricted to either the Northwest or Arctic regions (for example, Holder et al., 1999), evidence supports the use of this refugium by several widespread species, including the wolverine (Chappell et al., 2004) and red fox (Aubry et al., 2009), as well as black spruce and white spruce, two boreal trees upon which the boreal chickadee relies heavily (Jaramillo-Correa et al., 2004; Anderson et al., 2006; Gerardi et al., 2010).
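To make the two summary statistics used in this argument concrete, the following minimal sketch computes haplotype diversity and lists private haplotypes per population. It assumes Nei's unbiased estimator, h = n/(n-1) × (1 - Σp_i²), which is a common choice but is not confirmed here as the estimator used for Supplementary Table S5. The population labels and haplotype names are hypothetical toy values, not data from this study.

```python
from collections import Counter


def haplotype_diversity(haplotypes):
    """Nei's unbiased haplotype diversity: n/(n-1) * (1 - sum of squared frequencies).

    Assumed estimator for illustration only.
    """
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two sequences")
    counts = Counter(haplotypes)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1 - sum_sq)


def private_haplotypes(samples):
    """Return, for each population, the haplotypes found in no other population."""
    result = {}
    for pop, haps in samples.items():
        elsewhere = set()
        for other_pop, other_haps in samples.items():
            if other_pop != pop:
                elsewhere.update(other_haps)
        result[pop] = set(haps) - elsewhere
    return result


# Hypothetical toy data: one haplotype label per sampled individual.
toy = {
    "POP_A": ["h1", "h2", "h3", "h4"],
    "POP_B": ["h1", "h1", "h1", "h5"],
}

diversity = {pop: haplotype_diversity(haps) for pop, haps in toy.items()}
private = private_haplotypes(toy)
```

In this toy case POP_A, in which every individual carries a different haplotype, has the higher diversity and more private haplotypes, the pattern expected in or near a refugium.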

Suitable habitat also existed east of the ice sheets in north-eastern North America (Figure 5b). The persistence of the boreal chickadee in an Atlantic Shelf refugium is concordant with patterns seen in a number of plants (Walter and Epperson, 2001; Jaramillo-Correa et al., 2004; de Lafontaine et al., 2010) and birds (Zink and Dittmann, 1993; Zink et al., 2003; Colbeck et al., 2008; van Els et al., 2012). Although both mtDNA and microsatellite analyses support the separation of NL as a distinct population (Table 1, Figure 4), it is unlikely to have been the sole refugium on the east coast. In contrast to AKF, NL has lower nucleotide and haplotype diversities (Supplementary Table S5); therefore, while we cannot exclude the possibility that individuals from NL may have emigrated following the retreat of the Laurentide ice sheet, it is unlikely that the island alone acted as a source population. Boreal chickadees probably persisted on the Atlantic shelf or the north-eastern edge of the United States and, as the ice sheets melted, birds on NL became isolated and have remained so. MtDNA patterns in multiple boreal chickadee populations from Atlantic Canada, namely NSNB, LAB and NL, are consistent with refugial populations and it is probable that all served as, or were located near, the eastern refugia. For example, NSNB and LAB both have high genetic diversity, NSNB and NL populations contain a large number of private alleles and NL has a large number of private haplotypes.

Following the retreat of the ice sheets, individuals would have expanded into the previously glaciated regions. For the boreal chickadee, this includes areas of central Canada. Populations in these areas contain a mixture of haplotypes found in Alaska and Atlantic Canada, suggestive of secondary contact. A similar pattern exists in the yellow warbler with western (BC and AK) and eastern (MB to NL) haplotypes occurring in the Great Plains and Canadian Prairies (Milot et al., 2000).

Physical barriers

A number of physical barriers restrict dispersal in boreal chickadees, including mountains, large bodies of water and geographical distance. The Rocky Mountains are often the site of phylogenetic breaks (Toews and Irwin, 2008) or suture zones (Brelsford and Irwin, 2009; Flockhart and Wiebe, 2009) in North American species. However, they do not appear to restrict gene flow in boreal chickadees. Chickadee populations on either side of the Rocky Mountains (for example, CAB and CBC, Table 1) are not genetically differentiated from one another. The prevalence of suitable habitat and treed corridors through the multitude of valleys and passes of the Northern Rockies prevents genetic isolation. Further north, however, the Alaskan mountain ranges may be restricting gene flow (for example, between AKA and AKF, Table 1).

On the east coast, both the Cabot Strait and the Strait of Belle Isle are acting as barriers to gene flow in the boreal chickadee, isolating NL from the mainland populations. As with other island populations, Newfoundland hosts a number of endemic species and subspecies (for example, Cronin et al., 2005; Hearn et al., 2006), and genetically distinct populations are found in a number of species, including birds (Gill et al., 1993; Zink and Dittmann, 1993), mammals (Cronin et al., 2005) and plants (Gerardi et al., 2010). A combination of the physical water barrier and inhospitable conditions (harsh weather and lack of suitable habitat) probably contributes to the continued genetic isolation of boreal chickadees on Newfoundland.

The third type of physical barrier restricting gene flow in boreal chickadees is geographical distance. Both mtDNA and microsatellite data show significant isolation-by-distance, and short-distance (<40 km) dispersal is supported by banding records. Reduced dispersal is consistent with other non-migratory, resident species, including both birds (Burg et al., 2005; Graham and Burg, 2012; Lait et al., 2012) and mammals (Chappell et al., 2004; Aubry et al., 2009; Shafer et al., 2010a).
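Isolation-by-distance is commonly assessed with a Mantel test, which correlates a matrix of pairwise geographical distances with a matrix of pairwise genetic distances and judges significance by permuting population labels. The sketch below illustrates that general approach only; this section does not state which test or software was used, so the choice of a Mantel test, the function names and the toy matrices are all assumptions made for illustration.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def upper_triangle(matrix, order):
    """Flatten the upper triangle of a square matrix after reordering rows and columns."""
    k = len(order)
    return [matrix[order[i]][order[j]] for i in range(k) for j in range(i + 1, k)]


def mantel(geo, gen, permutations=999, seed=0):
    """One-tailed Mantel test: returns (observed r, permutation p-value)."""
    identity = list(range(len(geo)))
    geo_vec = upper_triangle(geo, identity)
    r_obs = pearson(geo_vec, upper_triangle(gen, identity))
    rng = random.Random(seed)
    hits = 0
    for _ in range(permutations):
        perm = identity[:]
        rng.shuffle(perm)  # relabel populations in the genetic matrix only
        if pearson(geo_vec, upper_triangle(gen, perm)) >= r_obs:
            hits += 1
    return r_obs, (hits + 1) / (permutations + 1)


# Hypothetical toy data: four populations along a transect (arbitrary units),
# with genetic distance set exactly proportional to geographical distance.
positions = [0, 1, 2, 4]
geo = [[abs(a - b) for b in positions] for a in positions]
gen = [[0.01 * d for d in row] for row in geo]

r, p = mantel(geo, gen)
```

With only four populations there are just 24 possible relabellings, so the p-value cannot become very small; real analyses would use all sampled populations and typically a dedicated population genetics package.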

Comparing genetic data to morphological taxonomy

Three to five subspecies of boreal chickadees have been described based on morphological and plumage differences (Ridgway, 1904; American Ornithologists’ Union, 1957). The most widely accepted designation includes five described subspecies: P. h. hudsonicus (AK to ON), P. h. columbianus (west of Rocky Mountains), P. h. cascadensis (northern Cascade Mountains), P. h. littoralis (QC to Atlantic coast) and P. h. rabbittsi (NL) (American Ornithologists’ Union, 1957). Our genetic data do not support the separation of populations on the western and eastern edges of the Rocky Mountains (P. h. columbianus and P. h. hudsonicus) but do support both the separation of populations west and east of Hudson Bay (P. h. hudsonicus and P. h. littoralis) and a distinct Newfoundland subspecies (P. h. rabbittsi). No samples were collected from P. h. cascadensis.

Conclusions

The genetic patterns seen in the boreal chickadee support historical separation in Beringian and Atlantic coastal refugia with subsequent expansion into central Canada. Colonisation of previously glaciated areas probably followed the spread of spruce species as the patterns are congruent with black and white spruce, supporting niche conservatism in this songbird. The close association between these species suggests that they may be useful bioindicators for each other. A combination of mountains, water barriers and physical distance is influencing dispersal among contemporary chickadee populations. Within the mainland populations, male-biased dispersal is acting to homogenise populations while mtDNA differences are retained. The exception to this is Newfoundland, where both sets of molecular markers support genetic isolation of this island population. The various impacts of physical barriers in different areas (for example, mountains in Alaska vs western Canada) highlight the importance of including matrix quality as well as habitat features when looking at dispersal barriers.

Data archiving

Sequence data have been submitted to GenBank: JN654564-JN654638 (control region); JN654639-JN654699 (ATP).

Genotype data have been submitted to Dryad: doi:10.5061/dryad.82hs7. Data files: Microsatellite raw data of six loci for boreal chickadees across North America.