Unraveling the recent demographic trajectory of threatened species using genetic data may complement observational studies and help detect recent population declines, which is particularly useful when a species is hard to monitor in the field. Genetic data also provide estimates of the current levels of genetic diversity and effective size of populations, which may alert us to changes in the evolvability of species facing rapid environmental change.

Environmental changes are particularly acute in the Arctic, a sensitive region that experiences larger negative effects from climate change than elsewhere globally, including cryosphere change, increases in air and water temperatures, ocean acidification, changes in precipitation levels, and sea-level rise (ACIA 2004; Box et al. 2019; Meredith et al. 2020). Species might cope with these changes by shifting their distribution to track suitable conditions, acclimating through phenotypic plasticity, or evolving adaptations to the new local conditions (Parmesan 2006; Gienapp et al. 2008; Chen et al. 2011; Hoffmann and Sgrò 2011). Depending on the speed and extent of environmental changes, a range shift can be accompanied by a range contraction (Arenas et al. 2011). Such a contraction usually implies a decline in population size that can be followed by an increased level of genetic drift that may erode genetic diversity within populations (McInerny et al. 2009; Arenas et al. 2011; Collevatti et al. 2011; Alsos et al. 2012; Rubidge et al. 2012; Garnier and Lewis 2016). Evaluating trends in population abundance is thus critical to understand how arctic species are reacting to threats (McRae et al. 2012). This difficult task may benefit from indirect genetic approaches, which will also provide information directly relevant to the evolutionary potential of populations.

Microsatellite markers have been widely used in conservation genetic studies testing for recent bottlenecks, using a variety of methods (Nyström et al. 2006; Peery et al. 2012; Xenikoudakis et al. 2015). One of the most widely used so far has been the heterozygosity-excess test, which compares gene diversity (expected heterozygosity, HE) in a focal population to the heterozygosity expected (HEq) in a population at mutation-drift equilibrium (Cornuet and Luikart 1996). More detailed information (e.g., strength and timing of the contraction) can be obtained by calculating the likelihood of the observed genetic data given a coalescent model allowing for variations in population size (Beaumont 1999; Leblois et al. 2014; Rousset et al. 2018). These methods, however, suffer from the limitations of microsatellites (low number of markers and thus limited information in the genetic data, and relatively complex mutation process).

New powerful tools have emerged following the next‐generation sequencing (NGS) technical revolution (Shafer et al. 2015; Andrews et al. 2016), including methods designed to infer current and past population size (e.g., Li and Durbin 2011; Liu and Fu 2015; Boitard et al. 2016). These methods aim to make the best of the information contained in genome-wide data (e.g., genome-wide single nucleotide polymorphism (SNP) variants or complete genomes) to infer complex demographic histories.

All genetic inference methods make several simplifying assumptions. For instance, the effects of the mutation process (Williamson-Natesan 2005; Peery et al. 2012; Leblois et al. 2014), extreme variance in reproductive success (Hoban et al. 2013), and internal population structure and connectivity (Broquet et al. 2010; Chikhi et al. 2010) are important to take into account. The complexity induced by overlapping generations in iteroparous species could also affect genetic signals of bottleneck, but this effect has, to the best of our knowledge, never been specifically examined (but see, e.g., Storz et al. (2002) for a relevant discussion, and Parreira and Chikhi (2015) and Parreira et al. (2020) for examples of models that include overlapping generations to investigate the genetic consequences of social structure in mammals).

The effective size NE of a population for an iteroparous species with overlapping generations depends on generation time TG and lifetime variance in reproductive success among individuals Vk (following notation from Waples 2016). Assuming constant population size and a stable age distribution, Hill (1972) obtained NE = 4N1TG/(Vk+2), where N1 is the number of new individuals entering the population at each time step. The parameters in this equation result from the variation in individual survival and fertility among age classes (Felsenstein 1971; Hill 1972; see also Rousset 1999; Laporte and Charlesworth 2002, and Yannic et al. 2016 for models of structured populations). Although it was shown that the ratio of effective to census population size is mainly determined by age at maturity and adult lifespan (Waples et al. 2013) or adult survival (Waples 2016), precisely predicting the effective population size of a long-lived iteroparous species is difficult because it requires detailed demographic data on the species’ vital rates. Considering the simplest situation where newborns become reproductively mature within one breeding cycle (say 1 year) and adults have constant fecundity and survive to the next breeding cycle with probability v (constant across age classes), then TG = 1/(1−v) and NE = N/(1+v) or NE = N/(2−1/TG) where N is the number of adults in the population (Nunney 1993; Nunney and Elam 1994; Yannic et al. 2016, and in agreement with Felsenstein 1971; Orive 1993; Waples 2016). With this simplistic model (detailed in Yannic et al. 2016), we see that iteroparity and overlapping generations reduce effective population size by a factor that approaches 2 as adult survival increases. Regarding the heterozygosity-excess method to detect bottlenecks mentioned above, the disequilibrium signal was found to peak around NE generations past population size reduction, with NE the post-bottleneck effective population size (Cornuet and Luikart 1996). 
Hence, a demographic bottleneck in a population with overlapping generations should produce a signal of heterozygosity excess earlier (with time expressed in generations) than in a population with discrete non-overlapping generations. Of course, in units of absolute time, the whole dynamics of the bottleneck signature will appear scaled up by a factor equal to generation length TG.
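The simple constant-survival model above can be checked numerically. The following is a minimal sketch, not the authors' code: the function names and the survival values used in the demonstration are illustrative assumptions, and only the two relationships TG = 1/(1 − v) and NE = N/(1 + v) come from the text.

```python
def generation_time(v: float) -> float:
    """Generation time TG = 1 / (1 - v) for constant adult survival v."""
    if not 0.0 <= v < 1.0:
        raise ValueError("adult survival v must lie in [0, 1)")
    return 1.0 / (1.0 - v)


def effective_size(n_adults: float, v: float) -> float:
    """Effective size NE = N / (1 + v), which equals N / (2 - 1/TG)."""
    if not 0.0 <= v < 1.0:
        raise ValueError("adult survival v must lie in [0, 1)")
    return n_adults / (1.0 + v)


if __name__ == "__main__":
    n_adults = 1000  # hypothetical number of adults
    for v in (0.0, 0.5, 0.9):  # illustrative survival values, not estimates
        tg = generation_time(v)
        ne = effective_size(n_adults, v)
        # N / NE is the reduction factor; it tends to 2 as v tends to 1.
        print(f"v={v:.2f}  TG={tg:.2f}  NE={ne:.1f}  N/NE={n_adults / ne:.3f}")
```

Running the loop shows the reduction factor N/NE rising from 1 (no overlap) toward 2 as adult survival increases, which is the behavior described above.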

Yet the intensity and temporal dynamics of bottleneck signatures are in fact complex functions of the severity of the demographic decline, mutation models and rates, and even sample size (Cornuet and Luikart 1996). Moreover, more realistic age structures (accounting for delayed maturity and variation in survival and fecundity with age) will bring additional complexity that makes it difficult to obtain explicit predictions about the behavior of bottleneck detection methods applied to long-lived iteroparous species.

A case in point, the ivory gull (Pagophila eburnea) is a long-lived high-arctic bird that has delayed sexual maturity and high annual adult survival (estimated at 0.86 ± 0.04, Stenhouse et al. 2004; maximum observed life span of 28 years, Mallory et al. 2012). The ivory gull is closely associated with sea ice throughout the year (Spencer et al. 2014; Gilg et al. 2016), and has a patchy breeding range at the high latitudes of the Atlantic Arctic, i.e., in Canada, Greenland, Norway (Svalbard), and the Western Russian Arctic. The global population size was estimated at 38,000–52,000 mature individuals (BirdLife International 2018) and has been revised downward to 14,400–19,900 mature individuals by the most recent surveys (see Table 1). The ivory gull drew attention following reports suggesting that the species had declined by 70% since the 1980s in Canada (Gilchrist and Mallory 2005) and by 40% from 2009 to 2019 in Svalbard, Norway (Strøm et al. 2020). The global population is now considered to be declining (BirdLife International 2018), although trends differ across breeding regions: populations are thought to be stable in Greenland (Gilg et al. 2009; Boertmann et al. 2020) and Russia (Gavrilo and Martynova 2017). The three main identified threats that support its “Near Threatened” status according to the IUCN (BirdLife International 2018) are: (i) global warming inducing sea ice decline, i.e., decline of ivory gulls’ main habitat; (ii) exposure to contaminants such as persistent organic pollutants and heavy metals in the environment (Braune et al. 2006; Miljeteig et al. 2009; Lucia et al. 2015); and (iii) extensive development of human activities, e.g., resources exploitation (oil, gas, mineral), or shipping lanes (Gilg et al. 2012; Fort et al. 2013; Yurkowski et al. 2019).

Table 1 Sampling of ivory gull. Nµsat and NSNP give the number of sampled individuals used for microsatellites and SNPs analyses, respectively.

The primary objective of this study was to investigate the recent demographic trajectory of ivory gull populations worldwide. To achieve this goal, we analyzed microsatellite and SNP data obtained from a large part of the species’ range, predicting that a decline in population size should have produced a genetic bottleneck signature. Before analyzing such signatures, we first explored population structure to check if the genetic homogeneity reported by Yannic et al. (2016) using microsatellites would withstand SNP examination. We then applied three bottleneck inference methods with the microsatellites (heterozygosity-excess, M-ratio, and coalescent-based maximum likelihood inference) and one approximate Bayesian computation (ABC) method with the SNPs. Our second objective was to explore the effect of age structure on genetic signatures of a bottleneck. Focusing on the microsatellite heterozygosity-excess method (Cornuet and Luikart 1996), we used stochastic simulations to look at the dynamics of genetic signals in a long-lived species experiencing a demographic bottleneck. Finally, our third objective was to evaluate the current effective size (NE) of the ivory gull, a key evolutionary parameter in conservation biology (Hohenlohe et al. 2021). We approached this objective using the linkage disequilibrium method (Waples and Do 2010), both with microsatellite and SNP data.

Materials and methods

Sample collection and genotyping

We analyzed genetic data obtained for individuals sampled from the global breeding range of the species in summers 2006–2012 (Table 1, Fig. 1; Yannic et al. 2016). Full sampling protocols are described in Yannic et al. (2011; 2016). Three non-destructive DNA sampling methods (buccal swabs, plucked feathers, and blood) and a non-invasive sampling method (shed feathers) were used. Here we considered data from adult birds only (n = 271), but we distinguished individuals that were seen to be incubating eggs or feeding chicks (hereafter called “breeding” individuals) and individuals for which we had no information on their breeding status (i.e., “unknown status”, possibly including visiting breeding birds from other colonies, or prospectors, Volkov and de Korte (2000)). Then, we characterized each sampling site by a single type of birds (i.e., “breeding” or “unknown” status), except Station Nord, where both categories of birds were sampled (Table 1). Thus, we considered 15 distinct samples, which we hereafter consistently called “populations” for the sake of simplicity.

Fig. 1: Map of the study area illustrating the high-Arctic distribution of ivory gull (Pagophila eburnea) colonies.

Sampling localities are indicated by the numbers given in Table 1. Small orange dots depict known breeding sites (Gilchrist et al. 2008). Dashed lines show wintering grounds (variable during the winter and among years according to the extension of the sea ice). The background map represents the maximum sea ice extent in July between 1979 and 2013 (light blue) and the sea ice extent in July 2013 (dark blue) (data from the National Snow and Ice Data Centre, Boulder, Colorado;

Our microsatellite dataset was a subset of the multilocus genotypes obtained using 22 microsatellites previously published in Yannic et al. (2016). Because the demographic inference methods that we use assume typical microsatellite mutation models, we selected the 15 loci (4 dinucleotides: loci A111, A115, A129 and A132 and 11 tetranucleotides: loci B103, B125, C6, C7, D1, D103, D110, D126, D5, D6, and D9), that presented allele size distribution expected for dinucleotides (i.e., multiple of 2 bp units) or tetranucleotides (i.e., multiple of 4 bp units). The number of alleles observed and heterozygosity for each of the 15-microsatellite loci are provided in Table S1.

We used a second set of data consisting of SNP genotypes obtained for a subset of individuals, which were generated specifically for this study using a ddRAD-seq approach (see Supplementary Materials). This method was used only with high quality DNA extracts and thus limited to 6 of our populations (circled in red in Fig. 1).

SNP genotyping

At the end of the de novo SNP calling and filtering procedure, the dataset encompassed 5912 SNPs distributed over 3490 RAD loci genotyped for 87 adult birds (9–20 ind. per population, Table 1). We discarded 9 birds with >10% of missing data. Error rates estimated on the 5912 SNPs obtained from replicated samples were 0.009 ± 0.003 and dropped to 0.001 ± 0.003 when we considered a single SNP per locus (n = 3490 SNPs). The characteristics of this dataset (number of individuals, number of SNPs, proportion of missing data, and error rates) should be well suited to analyze variations in population size and current effective population size (Nunziata and Weisrock 2018; see e.g., Marandel et al. 2020).

Genetic structure

We estimated the genetic differentiation among all populations by estimating global and pairwise FST at two spatial scales: among populations (n = 15) or among regions (see Fig. 1). We also estimated FST considering only the “breeding” birds to test whether spatial genetic structure could have been affected by including “visiting” birds (see Yannic et al. 2016). For microsatellite loci, we used the online Genepop v.4.7 software (Raymond and Rousset 1995; Rousset 2008) to estimate FST according to Weir and Cockerham (1984), among all populations and regions. We tested genotypic differentiation using Markov chain algorithms with default parameters in Genepop. With the SNP data, the same analysis was performed on the subset of populations for which SNPs were available (six populations and three regions; Fig. 1, see also Table 1), using Genepop v.4.6 called through the R package strataG (Archer et al. 2017).

Demographic inference

To test for a recent population genetic bottleneck we combined different approaches, using three alternative methods for the microsatellite dataset and a single approach for the SNP dataset.

Microsatellites: heterozygosity excess

First, we tested for a recent genetic bottleneck using the heterozygosity-excess method presented in Luikart and Cornuet (1998), and implemented in the software Bottleneck v.1.2.02 (Cornuet and Luikart 1996; Piry et al. 1999). This method compares expected heterozygosity HE in an empirical sample to the heterozygosity HEq that is expected in a population at mutation-drift equilibrium given the number of alleles observed in the sample. Strong reductions in NE are followed by a sharp decrease in the number of alleles (rare alleles being quickly lost) while heterozygosity HE decreases less rapidly. A transitory excess in HE (measured as ∆H = HE − HEq) is therefore expected in recently bottlenecked populations (and a transitory deficit in HE is expected in case of population expansion). Microsatellites generally evolve under the classic stepwise mutation model (SMM; by the gain or loss of a single repeat unit), but mutations of several repeat units may occasionally occur. Therefore, we used a mixed Two-Phase mutational Model (TPM), which best fits the mutation processes in microsatellites (Di Rienzo et al. 1994). However, because the heterozygosity-excess method is very sensitive to mutation models, we ran the analyses assuming that the probability of single-step mutations is 0.70 (TPM70) or 0.95 (TPM95), as suggested by Miller et al. (2012) and Piry et al. (1999), respectively. The variance of the geometric distribution for the multi-step mutations was set to 12 in both cases. We used 1000 coalescent simulations and one-tailed Wilcoxon signed-rank tests to test for an excess of heterozygosity considering the whole population (n = 271 ind.), each region (n = 23–106 ind.) and each population (but only the nine populations with n ≥ 10 individuals, Table 1).

Although we observed a genetic homogeneity across the species distribution range (see Yannic et al. 2016 and the results below), we considered that the observed disparity in demographic trends among regions called for an investigation of genetic bottleneck inference at the regional and population scales too. However, to ensure the detection of genetic bottlenecks is not sensitive to sample size according to the spatial scale considered, we randomly generated sets of individuals of various size (11 ≤ n ≤ 250), sampled among the 271 ivory gull genotypes. These sample sizes roughly correspond to the size of the different sets of individuals considered in this study at different spatial scales (i.e., populations, regions, and whole population). Sampling was replicated 10 times for each sample size to estimate standard errors (se) around ∆H. The effect of sample size on bottleneck detection was then estimated with the software Bottleneck considering the two mutation models: (A) TPM70 and (B) TPM95. We report mean ∆H ± se as well as the probability (p value for one-tailed Wilcoxon test) of detecting an excess in HE for each sample size and each mutation model.

Because we tested the same hypothesis several times on different datasets (i.e., at different scales) and using different mutation models (TPM70 and TPM95), we provided, in addition to the original p values, adjusted p values for multiple comparisons obtained using the Benjamini and Hochberg (1995) false discovery rate procedure with initial α = 0.05 (hereafter denoted fdr-p).
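For readers unfamiliar with the Benjamini and Hochberg adjustment, the step-up procedure can be sketched in a few lines. This is an illustrative implementation with a hypothetical function name and made-up p values, not the software used in the study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values, in the input order."""
    m = len(p_values)
    # Indices of the p values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down to the smallest; each raw value is
    # scaled by m / rank, and the running minimum enforces monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


if __name__ == "__main__":
    raw = [0.01, 0.04, 0.03, 0.5]  # hypothetical p values
    for p, q in zip(raw, benjamini_hochberg(raw)):
        print(f"p = {p:.3f}  ->  adjusted p = {q:.4f}")
```

An adjusted value is compared directly with α = 0.05, so a test with raw p < 0.05 can become non-significant once the number of tests is accounted for.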

Microsatellites: M-ratio

Second, we used an alternative method based on the ratio (M) of the number of microsatellite alleles (k) to the range in allelic size (r) (Garza and Williamson 2001). It is expected that M is smaller in populations that have experienced a reduction in size (M < 0.68; Garza and Williamson 2001). We ran this analysis in R using the mRatio function implemented in the package strataG (Archer et al. 2017). The method was applied to the same sets of individuals and spatial scales as described above.
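As an illustration of the statistic, the sketch below computes M for a single locus and averages it across loci. It assumes that r is counted as the number of repeat-unit positions spanned by the observed alleles (largest minus smallest allele size, in repeat units, plus one), which is a common reading of Garza and Williamson (2001); the function names and allele sizes are hypothetical, and the code is not the strataG implementation.

```python
def m_ratio(allele_sizes_bp, repeat_length):
    """M = k / r for one locus.

    k is the number of distinct alleles; r is the number of repeat-unit
    positions between the smallest and largest allele, inclusive
    (an assumption about how the allelic range is counted).
    """
    alleles = sorted(set(allele_sizes_bp))
    if not alleles:
        raise ValueError("no alleles supplied")
    span_bp = alleles[-1] - alleles[0]
    if span_bp % repeat_length != 0:
        raise ValueError("allele sizes are not multiples of the repeat length")
    k = len(alleles)
    r = span_bp // repeat_length + 1
    return k / r


def mean_m_ratio(loci):
    """Average M over loci given as (allele_sizes_bp, repeat_length) pairs."""
    values = [m_ratio(sizes, repeat) for sizes, repeat in loci]
    return sum(values) / len(values)


if __name__ == "__main__":
    # Hypothetical tetranucleotide locus with one unoccupied size class (112 bp).
    print(m_ratio([100, 104, 108, 116], repeat_length=4))
```

Gaps in the allele size distribution, such as the missing 112 bp class in the example, lower M, which is why a bottleneck that removes rare alleles is expected to reduce the ratio.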

Microsatellites: Migraine

Third, we used a more general method described in Leblois et al. (2014), that aims to detect and date past changes in population size. This method, implemented in the software Migraine (Leblois et al. 2014; Rousset et al. 2018), uses simulations to estimate the likelihood of demographic parameters given observed genetic data. We used the demographic model considering one population with a single past variation in population size (OnePopVarSize), which considers a single population of ancestral size NE,past that instantly changed to the current size NE some T generations in the past. The population may have experienced a bottleneck (Nratio = NE/NE,past < 1) or an expansion (Nratio > 1). Contrary to the two methods presented above, Migraine allows microsatellites to be characterized by different mutation models within the same dataset. In addition, it considers a generalized stepwise mutation model (GSM) where the number of repeats added or removed by a mutation follows a geometric distribution with parameter pGSM, which is estimated by the method. We ran preliminary analyses where we allowed dinucleotides (n = 4) and tetranucleotides (n = 11) to follow either one of two models (SMM or GSM). We concluded that the method gave the most reliable results when considering only the tetranucleotides, under GSM (see discussion). We then ran Migraine using 4 iterations of 10,000 points and 10,000 trees per point (detailed settings in supplementary material) to infer parameters 4NEμ, 4NE,pastμ and the composite parameter Nratio. Migraine also estimated pGSM and Tμ (noted T in Migraine).

SNPs: Demographic inference with linked selection (DILS)

Finally, we analyzed our ddRAD-seq genotypes (filtered dataset: 3490 loci) with a general method (DILS: Demographic Inferences with Linked Selection) newly developed by Fraïsse et al. (2021) that compares the genetic properties of the data at hand to those of a large number of simulated demographic scenarios to identify which historical scenario is most likely to have produced the observed data (ABC framework). We considered a single population of ancestral size NE,past that is allowed to vary in size instantaneously at some time T in the past to reach its current size NE. We ran three runs where we allowed NE,past and NE to take any value from 0 to 10⁵ individuals, and population size change could have happened at any time between 0 (i.e., present) and 10⁵ generations in the past. The three runs differed only in the minimum allelic frequency (maf) used to generate the dataset used in DILS (no maf, maf = 0.01, and maf = 0.02). Other parameters followed the authors’ recommendations: mutation rate set to 10⁻⁸, ratio of recombination (intra-RAD locus) and mutation set to 0.8 (Fraïsse et al. 2021). Since our filtered dataset contained at least 77 out of 87 individuals genotyped per SNP, we set the minimum number of haploid copies required to process a locus (Nmin) to 154. Similarly, since our smallest RAD sequence was 150 bp long, we set the minimum sequence length to Lmin = 150 bp (complete settings file in supplementary material). We retained results from the optimized posterior and using the random-forest method implemented in DILS.

Simulation of microsatellite heterozygosity excess when generations overlap

We used simulations to explore the effect of age structure on genetic signatures of a demographic decline. With the ivory gull case in mind, we designed simplified simulations that would help us interpret the signal of heterozygosity excess observed with our microsatellite dataset.

We used a modified version of Nemo (Guillaume and Rougemont 2006), called Nemo-age (Cotto et al. 2020), an individual-based, genetically explicit and stochastic population computer program where populations are age-structured. This tool creates stochastic forward-in-time simulations of population demography and genetic markers. We designed a life cycle where chicks become adults after one-time unit (year) and adults survive from 1 year to the next with probability ν. We simulated two bottleneck scenarios for a population that took one of three age structures. Simulation parameters were chosen so that diversity θ = 4NEμ would be not too small even at the new equilibrium after bottleneck (so that microsatellite HE and k remain meaningful) and not too large (θ ≤ 1) so that analytical expressions for ∆H are not biased (especially under SMM, see below).

In the first scenario, the population was reduced from 1000 to 250 individuals within a single time unit (and then remained at that size), while it was reduced from 1000 to 500 in our second scenario. In each case we ran simulations with adult survival ν set to 0 (no overlapping generations), 0.8 (i.e., a value like that expected for ivory gull) or 0.95 (allowing us to look at the effect of a very pronounced age structure, see Fig. S3 for an example of resulting age distributions). Increasing adult survival has two consequences. First, it will increase generation time (TG), defined as the mean age of adults (given that all adults have the same fecundity and there is no sex-biased survival in the simulations). We obtained TG = 1 year when ν = 0, TG = 4.5 years when ν = 0.8, and TG = 13.4 years with ν = 0.95 (these values estimated empirically from the simulations agreed with theoretical predictions, see supplementary material). Second, increasing ν also means that generations will be overlapping.

We simulated 20 microsatellite markers neutrally evolving under the conditions of the infinite allele model (IAM) or SMM with mutation rate μ = 2.5 × 10⁻⁴ per locus per generation (Estoup and Angers 1998). Using these markers, we estimated heterozygosity excess ∆H every 10 years and we averaged these values over 100 simulation replicates. To produce as many ∆H estimates, we used the number of alleles k and gene diversity HE reported by Nemo-age and estimated HEq analytically from the number of alleles k observed in a sample of size n (number of gene copies sampled). Under IAM, the relationship is \(k = \sum_{i = 0}^{n - 1} \frac{\theta}{\theta + i}\) with θ = HEq/(1 − HEq) (Ewens 1972), and for SMM it is (Kimura and Ohta 1978): \(k = \frac{\theta + \beta}{\beta} \left[ 1 - \prod_{i = 0}^{n - 1} \frac{\theta + i}{\theta + \beta + i} \right]\), where β = θ(1 − HEq)/HEq − 1 and \(\theta = \frac{1}{2} \left[ \frac{1}{(1 - H_{\mathrm{Eq}})^2} - 1 \right]\).
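The analytical step described above (finding the equilibrium heterozygosity HEq that corresponds to an observed number of alleles k in n gene copies) has no closed-form inverse, so it has to be solved numerically. The sketch below is one possible way to do it, using bisection on HEq with the sums and products running over i = 0 to n − 1. The function names, the bisection solver, and the search bounds are assumptions made for illustration; they are not taken from the study's scripts.

```python
import math


def theta_from_heq(heq, model):
    """Theta = 4*NE*mu implied by an equilibrium heterozygosity."""
    if model == "IAM":
        return heq / (1.0 - heq)
    if model == "SMM":
        return 0.5 * (1.0 / (1.0 - heq) ** 2 - 1.0)
    raise ValueError("model must be 'IAM' or 'SMM'")


def expected_k(heq, n, model):
    """Expected number of alleles in n gene copies at equilibrium."""
    theta = theta_from_heq(heq, model)
    if model == "IAM":
        return sum(theta / (theta + i) for i in range(n))
    beta = theta * (1.0 - heq) / heq - 1.0
    # Product over i = 0..n-1, accumulated on the log scale for stability.
    log_prod = sum(math.log((theta + i) / (theta + beta + i)) for i in range(n))
    return (theta + beta) / beta * (1.0 - math.exp(log_prod))


def heq_from_k(k, n, model, tol=1e-10):
    """Solve expected_k(HEq) = k for HEq by bisection (assumes monotonicity)."""
    lo, hi = 1e-4, 1.0 - 1e-4  # arbitrary search bounds
    if not expected_k(lo, n, model) < k < expected_k(hi, n, model):
        raise ValueError("k is outside the range reachable for this n")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if expected_k(mid, n, model) < k:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    k_obs, n_copies, he_obs = 5.0, 60, 0.70  # hypothetical locus summary
    for model in ("IAM", "SMM"):
        heq = heq_from_k(k_obs, n_copies, model)
        print(f"{model}: HEq = {heq:.3f}, delta H = {he_obs - heq:+.3f}")
```

The heterozygosity excess then follows as ∆H = HE − HEq, with HE the gene diversity observed in the same sample.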

Because these relationships may be slightly biased in some conditions, for ten additional replicates, we also asked Nemo-age to create Genepop files from the simulated markers 20 years before the bottleneck (i.e., at mutation-drift equilibrium) and 20 years after the bottleneck (i.e., imitating the situation suspected for the ivory gull). These files were then used to estimate ∆H and test for its significance in the software Bottleneck. Additional technical details for the simulation settings are given in supplementary material.

Effective population size

Effective population size (NE) was estimated with microsatellites for each population (n = 15), each region (Canada, Greenland, Norway, and Russia), and the whole population, using NeEstimator v.2.1 software (Do et al. 2014). We used the linkage disequilibrium method considering a random mating model. Following Waples and Do (2010), we excluded rare alleles to limit estimation bias. To do so, we used two methods, first excluding singletons only (i.e., alleles represented by a single copy in the population), and second excluding alleles with a frequency ≤ 0.02 (maf, called Pcrit in NeEstimator). We assumed that all loci are physically unlinked (as verified in Yannic et al. 2016). The same analyses were performed using SNP data but considering only the six populations or three regions for which we had such data (Table 1) and using maf = 0.05 (Nunziata and Weisrock 2018; see Marandel et al. 2020).


Results

Genetic structure

A low level of genetic differentiation was observed among populations, both with microsatellites (15 populations, FST,μsat = 0.0044, p < 0.05) and SNPs (6 populations, FST,snp = 0.0043; p < 0.001). These results confirm the findings of Yannic et al. (2016) and suggest that bottleneck analyses can be run considering a single global population of ivory gulls. Similar conclusions were obtained when measuring the differentiation using breeding individuals only (FST,μsat = 0.0006, NS; FST,snp = 0.0024, NS) or at the scale of regions (FST,μsat = 0.0035, p < 0.001; FST,snp = 0.0057, p < 0.001). Pairwise FST between populations or regions varied between −0.0036 and 0.0092 (Tables S2 and S3).

Demographic inference

Microsatellites: heterozygosity excess

Assuming a TPM95 model, we did not detect any departure from mutation-drift equilibrium in the whole ivory gull population (∆H = −0.016; one-tailed Wilcoxon test, p = 0.93 and fdr-p = 0.97; Fig. 2 and Table S4). If we considered a TPM70 model, we found a significant but slight excess of HE (∆H = 0.027; one-tailed Wilcoxon test, p = 0.04), which became non-significant after correction for multiple tests (fdr-p = 0.13; Table S4).

Fig. 2: Pairwise comparison of expected heterozygosity (HE) versus expected heterozygosity at mutation-drift equilibrium (HEq) in the whole ivory gull population, obtained with Bottleneck using 15 microsatellites and considering two mutational models: (A) TPM70 and (B) TPM95. The diagonal represents equality (HE = HEq), i.e., a population at mutation-drift equilibrium. Above the diagonal, the population shows a heterozygosity excess, i.e., a signature of demographic decline; below it, a heterozygosity deficit, i.e., a signature of demographic expansion. Gray dots represent each locus (n = 15) and the black dot is the mean ± se over loci.

At the scale of regions (Fig. 3), we did not detect any excess of heterozygosity under TPM95 (all one-tailed Wilcoxon tests, p > 0.05 and fdr-p > 0.05; Table S4). Using TPM70, Greenland was the only region with a significant signal of bottleneck (ΔH = 0.021; one-tailed Wilcoxon test, p = 0.01, but fdr-p = 0.07; Table S4).

Fig. 3: Heterozygosity-excess (ΔH) at 15 microsatellite loci in four breeding regions of ivory gull, obtained with Bottleneck software for two mutational models: (A) TPM70 and (B) TPM95. The average ΔH over all loci is represented by a black dot, and locus-specific values are represented by smaller gray dots. The dashed line represents equality between observed and expected heterozygosity, i.e., ∆H = HE − HEq = 0. Symbols (*) and (NS) indicate that the one-tailed probability of HE > HEq, i.e., heterozygosity-excess, is significant (p value < 0.05) or not significant, using Wilcoxon’s test. The p values obtained after correction for multiple tests are provided in Table S4.

At the population scale, we found a significant heterozygosity excess under TPM70 in four populations (1_StNo, 2_StBr, 8_Rudo and 15_AlEl, ΔH > 0; one-tailed Wilcoxon tests, p < 0.05; Fig. S1), but this signal was significant with TPM95 for the population of Alert on Ellesmere Island, Canada only (15_AlEl, ΔH = 0.015; one-tailed Wilcoxon test, p = 0.001). After correction for multiple tests, a significant excess of heterozygosity was still observed in the Alert population with TPM70 (fdr-p = 0.02) but not with TPM95 (fdr-p = 0.13).

Resampling analyses indicated that ∆H estimates and associated p values are sensitive to sample size. Under TPM70, ∆H estimates became slightly underestimated as sample sizes decreased (Fig. 4A). The estimates also became more variable for the smallest sample sizes (e.g., n ≤ 25) but remained underestimated in all cases. Under TPM95, sample size had essentially no impact on ∆H estimates (Fig. 4A). Sub-sampling had more contrasting effects on the results of the Wilcoxon test used by Bottleneck (Fig. 4B). Increasing sample size increased power under TPM70, but Fig. 4B shows that increasing sample size moved the p value away from significance under TPM95.

Fig. 4: Effect of sample size on (A) the estimation of ΔH (mean ± standard error) and (B) one-tailed Wilcoxon test p values obtained with two different mutation models (TPM70 and TPM95), estimated with the software Bottleneck. Dashed lines correspond to HE = HEq and p value = 0.05, respectively.

Microsatellites: M-ratio

M-ratio was 0.956 for the whole population, and regardless of the scale of analysis, M values were always higher than the critical value defined by Garza and Williamson (2001), i.e., M = 0.68 (Fig. S2), suggesting no genetic bottleneck at the considered scales for ivory gull.

Microsatellites: Migraine

Migraine estimated Nratio at 3.45 (95%CI [1.88–6.17]), with current effective population size 4NEμ = 50.92 (95%CI [35.76–94.47]), about three times larger than ancestral population size 4NE,pastμ = 14.76 (95%CI [7.64–25.37]) (see Fig. 5). The value of Tμ was estimated at 0.76 (95%CI [0.21–2.21]), suggesting that the expansion occurred a very long time ago (even considering high microsatellite mutation rates, see discussion).

Fig. 5: Likelihood profile of parameters 4NEμ and 4NE,pastμ estimated by Migraine (note the log scale on both axes).

The color bar represents the likelihood of parameter values. The estimates that have the highest likelihood suggest that the ivory gull population has been increasing.

Finally, pGSM was estimated at 0.0319 (95%CI [0.00–0.129]), indicating that the mutation process of the microsatellites is very close to SMM and TPM95 (with this parameter value, the probability of a single-step mutation is g(1) ≈ 0.968, while a two-step mutation will happen with probability g(2) ≈ 0.031, and g(3) ≈ 0.001).
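The step probabilities quoted above follow from the geometric distribution that defines the GSM, g(k) = (1 − pGSM) × pGSM^(k − 1). A short sketch, with a hypothetical function name, reproduces them from the estimated pGSM:

```python
def gsm_step_probability(k, p_gsm):
    """Probability that a mutation changes allele length by k repeats
    under a generalized stepwise model with geometric parameter p_gsm."""
    if k < 1:
        raise ValueError("k must be a positive number of repeats")
    return (1.0 - p_gsm) * p_gsm ** (k - 1)


p_gsm = 0.0319  # point estimate reported in the text
for k in (1, 2, 3):
    print(f"g({k}) = {gsm_step_probability(k, p_gsm):.3f}")
```

With pGSM this small, nearly all mutations are single-step, which is why the inferred process sits close to a strict SMM.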


The analysis of SNP data with DILS consistently found that the most likely demographic scenario was an ancient expansion, regardless of the maf used in the analysis. While DILS provides absolute values for each demographic parameter (NE, NE,past, T), these values depend directly on the mutation rate assumed. Hence here we report the ratio NE/NE,past, which was found to be 6.13 (95%CI [4.16–8.76]) with maf = 0, 3.67 (95%CI [2.56–5.60]) with maf = 0.01, and 10.60 (95%CI [6.76–19]) with maf = 0.02. The lower limit for this ratio (estimated from 95% highest posterior density intervals) was never below 2.5. These results, obtained with the optimized posterior, random-forest options of DILS, did not differ much when using alternative computation methods (NE/NE,past for each maf was always in [3.63–8.80] when using non-optimized posterior, and/or neural network estimation).

The timing of the change in population size was very sensitive to the choice of maf (T = 9270 generations with maf = 0, T = 20338 gen. with maf = 0.01, and T = 67601 gen. with maf = 0.02). Moreover, these estimates are directly impacted by the value chosen for the mutation rate (here 10⁻⁸). But even if this value is over- or underestimated tenfold, all results from DILS point to an ancient population expansion (>900 generations ago).

Simulation of microsatellite heterozygosity excess when generations overlap

The dynamics of ∆H obtained from simulations are presented in Fig. 6 (strong bottleneck scenario, 1000 → 250) and Fig. S4 (mild bottleneck, 1000 → 500). To understand the effect of age structure, the results are presented with time expressed in years (top panels in Figs. 6 and S4) and generations (bottom panels).

Fig. 6: Temporal dynamics of ΔH calculated from stochastic simulations of microsatellite markers.

Markers evolved under IAM (A, C) or SMM (B, D) in a single population that went from 1000 to 250 individuals within a single generation at time 0. Top and bottom panels show the same data with time expressed either in years (A, B) or generations (C, D). The dark blue curve corresponds to a population without overlapping generations (adult survival ν = 0, generation time T = 1 year), while the light blue and gray curves correspond to adult survival ν = 0.8 (T = 4.5 years) and ν = 0.95 (T = 13.4 years). Increased generation time logically resulted in a delayed signal of genetic bottleneck (A, B). In addition to that effect, in the IAM case, age structure and overlapping generations also resulted in a reduced bottleneck signal and an additional delay not explained by changes in generation time (C).

Increasing adult survival resulted in a delayed genetic bottleneck signal (in absolute time units), but the delay was not simply proportional to the increase in generation time. For instance, under IAM in the strong bottleneck scenario (Fig. 6A and C), the maximum ∆H value was observed 80 years after the bottleneck in the case of a semelparous, annual species (i.e., v = 0, max signal 80 generations post-bottleneck), while it was 220 years post-bottleneck (49 generations) with v = 0.8, and 670 years (50 generations) with v = 0.95. Hence in a long-lived species with overlapping generations, the signal peaked later in absolute time, but still much sooner than predicted by generation time alone. This is at least in part because the dynamics of the heterozygosity excess depend on the post-bottleneck NE, which is reduced in an iteroparous life cycle where generations overlap. Although our stochastic simulations depart slightly from the simplistic conditions where NE = N/(1 + v) (see introduction and supplementary material), in the example above the difference in the timing of the ∆H peak (80/49 = 1.6) appears relatively well predicted by the difference in post-bottleneck NE predicted by N/(2 − 1/TG): with N = 250, v = 0.8 and TG = 4.5, we have NE ≈ 140, that is, 1.8 times smaller than the post-bottleneck NE when v = 0 and TG = 1. As expected, increasing v to 0.95 did not have a strong effect on this timing, since the effect on NE is asymptotic (with v = 0.95 and TG = 13.4, NE ≈ 130).
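The arithmetic behind these post-bottleneck NE values can be checked with a few lines. The sketch below simply evaluates N/(2 − 1/TG) for the three simulated life cycles; the function name is hypothetical and the generation times are those given in the Fig. 6 caption.

```python
def ne_overlapping(n, generation_time):
    """Approximate effective size N / (2 - 1/TG) used in the text."""
    return n / (2.0 - 1.0 / generation_time)


n_post = 250  # census size after the strong bottleneck
ne_annual = ne_overlapping(n_post, 1.0)  # v = 0, TG = 1
for survival, tg in ((0.8, 4.5), (0.95, 13.4)):
    ne = ne_overlapping(n_post, tg)
    print(f"v = {survival}: NE = {ne:.1f}, reduction = {ne_annual / ne:.2f}x")
```

Because 1/TG tends to zero as generation time grows, NE approaches N/2, which is why moving from v = 0.8 to v = 0.95 changes NE very little.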

The strength of the bottleneck signal was also affected by age structure. Using the same example as above (strong bottleneck, IAM, Fig. 6A and C), the maximum values for ∆H were 0.12, 0.10, and 0.09 with v = 0, v = 0.8, and v = 0.95, respectively.

The same consequences were observed under SMM, although the effect of age structure on heterozygosity excess was less visible because the analytical estimation of ∆H was slightly biased (see discussion, ∆H ≠ 0 even when gene diversity and the number of alleles are at mutation-drift equilibrium in our simulations, Fig. 6B and D, and S4B,D).

These simulations also showed that the maximum signal of bottleneck was rather low (between 0.045 and 0.12 depending on simulation scenario) and took time to be reached (80–1040 years after the bottleneck). Using the software Bottleneck to analyze simulated datasets, we found that a significant signal of bottleneck was correctly detected 20 years after the decline in 0–9 replicates out of 10, depending on the simulation scenario, i.e., the combined effect of mutation, survival, and strength of bottleneck (Fig. S4 and Table S5). In particular, the probability of detecting a significant genetic bottleneck with Bottleneck 20 years after the simulated decline was generally higher under IAM than under SMM, and was also higher for a strong bottleneck, i.e., a 75% decline (1000 → 250), than for a milder bottleneck, i.e., a 50% decline (1000 → 500). These analyses also allowed us to compare our analytical estimates of ∆H against the values estimated by Bottleneck. The two methods produced identical values under IAM, but the analytical estimates appeared slightly overestimated under SMM (Fig. S6).

Effective population size

With microsatellite markers, the effective size of the whole population was estimated at NE = 1138.6 individuals (95%CI [754.0–2204.3]) with maf set to 0.02. When considering all alleles but singletons, NE was estimated at 1487.9 individuals (95%CI [990.5–2871.4]). With SNPs we obtained NE = 729.5 (95%CI [689.4–774.3]) and 861 (95%CI [828.9–895.6]) individuals, considering a minimum allele frequency of 0.05 and no singleton, respectively.

At the regional scale, estimates ranged from 250.6 individuals in Canada to 3288.5 individuals in Norway (microsatellites, Table 2A) with maf set to 0.02, and similar results were obtained when removing only the singletons from the calculations (Table 2A). However, at that scale, precise confidence intervals could not be obtained with the microsatellites. The SNPs again returned lower NE estimates (from 191.0 in Greenland to 368.9 in Norway with maf = 0.05) with narrower confidence intervals (Table 2B). At the population scale, estimates of NE are imprecise, for both microsatellites and SNPs, with large confidence intervals that span to infinity.

Table 2 Estimates of effective size NE and 95% confidence interval for regional and global populations using the linkage disequilibrium method, based on (A) 15 microsatellites and (B) 3490 SNPs, considering a maf of 0.02 and 0.05 for microsatellites and SNPs, respectively, and no singleton (No S*).


No genetic signature of demographic decline globally

We found no genetic signature of a recent population decline in the ivory gull population as a whole. This result was consistent across three independent methods using microsatellites and one method using SNPs. According to the heterozygosity-excess approach, we did not find a departure from mutation-drift equilibrium that would indicate a recent genetic bottleneck in the ivory gull population at the global scale. The heterozygosity excess was globally close to zero (Fig. 2) and only significant before correction for multiple tests under the two-phase mutational model TPM70 (p = 0.036; Table S4), which, as discussed further below, is not an appropriate model in this case. We therefore conclude that there was no evidence for biologically relevant heterozygosity excess that would indicate that the population has suffered a recent genetic bottleneck (but see below for some limitations associated with this conclusion). Results from the M-ratio approach led to the same conclusion: M was always much larger than the critical value of 0.68 (Fig. S2), indicating no loss of rare alleles in ivory gull populations.

The two methods that yield quantitative parameter estimates (Migraine and DILS) even found that the ivory gull population has experienced an ancient expansion rather than a recent decline. The NE/NE,past ratios estimated by these two methods were remarkably similar given that they used independent datasets (microsatellites with Migraine: NE/NE,past = 3.45; SNPs with DILS: NE/NE,past between 3.67 and 10.60 depending on the minimum allelic frequency retained for data filtering). Estimates of the timing of the past changes in population size are more elusive because they depend directly on the assumed mutation rates, but all results point toward an ancient expansion event even considering conservative mutation rates (e.g., T ≥ 757 generations if μmicrosat ≤ 10⁻³, T ≥ 927 gen. if μSNP ≤ 10⁻⁷). We cannot date that expansion more precisely because the number of generations since the expansion varies widely depending on the dataset (microsatellites or SNPs), choice of minimum allele frequency, mutation rate, and adult annual survival (ν = 0.86 ± 0.04; Stenhouse et al. 2004).

The heterozygosity-excess method is known to be highly sensitive to the choice of mutation model. The TPM is thought to provide a better approximation of microsatellite mutation than either a strict IAM or strict SMM (Di Rienzo et al. 1994), but the expected number of alleles at mutation-drift equilibrium still depends directly on two unknown parameters that define the proportion and distribution of multi-step mutations within the TPM. The Migraine method uses a slightly simpler model (GSM), where a geometric distribution alone defines the number of repeats that each mutation removes from or adds to the ancestral state of the allele. Importantly, Migraine infers this distribution from the data by estimating the geometric distribution parameter pGSM. Here we found that Migraine could not correctly infer pGSM when using the mixture of di- and tetra-nucleotides composing our original dataset (flat likelihood profiles even when we allowed di- and tetra-nucleotides to follow different mutation models, data not shown). Using only tetranucleotides, pGSM was estimated at 0.0319 (95%CI [0.00–0.129]), indicating that the mutation process of the microsatellites is very close to TPM95 and SMM (probability of a single-step mutation g(1) ≈ 0.968). This means that the results obtained from the software Bottleneck with TPM95 (no bottleneck signal) are more relevant than those obtained with TPM70 (contrary to Ellegren 2004; Engler et al. 2016; Peery et al. 2012; Wogan et al. 2020). We can also parameterize a TPM in Bottleneck that follows exactly the GSM inferred by Migraine (0% SMM and variance = \(p_{\mathrm{GSM}}/\left(1 - p_{\mathrm{GSM}}^2\right) = 0.0319\)). Using these parameters with the tetranucleotides, we found no heterozygosity excess, and even a significant heterozygosity deficit (one-tailed Wilcoxon test, p = 0.034, indicating a possible expansion).
All things considered, our analyses using alternative approaches based on microsatellites (Bottleneck, Migraine and M-ratio) and SNPs (DILS) converged toward an absence of recent genetic bottleneck in the global ivory gull population.

Local signature of population declines in Canada?

To search for genetic evidence of population decline, we focused primarily on global analyses, considering all ivory gulls as belonging to a single Arctic-wide population. We did so because the large-scale genetic homogeneity reported in Yannic et al. (2016) was essentially confirmed by the new analyses presented here either with a subset of microsatellites (n = 15, FST = 0.0044, p < 0.05) or a new SNP dataset (n = 3490, FST = 0.0043; p < 0.001). Using a general model for coalescence times where age structure is entirely defined by a constant rate of adult survival and dispersal rates may differ in juveniles and adults, we suggested in Yannic et al. (2016) that the large-scale genetic homogeneity found in ivory gulls most likely implies massive movements of individuals among colonies, occurring possibly both at the juvenile and adult stages. However, although we did find genetic differentiation to be low, it was not strictly as low as reported previously: FST = 0.001, 95%CI [−0.002–0.005] (Yannic et al. 2016). The difference may come from the selection of a slightly different marker set (see methods, but 11 markers were common to the two studies), and the previous inclusion of markers that presented complex mutation patterns (i.e., involving insertions or deletions of some base pair numbers that are not multiples of the repeated motif length). Some slight differentiation was also detected using SNPs, leading to the question of sensitivity of our bottleneck detection results to sampling scale. Moreover, the disparity in demographic trends reported across regions (declines were reported especially in Canada and in Norway) prompted us to examine genetic signals at different spatial scales. Looking at the scale of regions or populations, we found no signature of demographic decline except in one case: in the population of Alert (site 15 in Fig. 1), the hypothesis of a local decline cannot be excluded, in line with field observations in Canada (Gilchrist and Mallory 2005). Bottleneck analyses indicated a significant or nearly significant excess of heterozygosity for this population considering the most relevant (and conservative) mutation model tested (TPM95; p = 0.032, fdr-p = 0.132, see Fig. S1). This result was also supported when reanalyzing the data with TPM parameters designed to follow the GSM inferred by Migraine (with tetranucleotides only, noted TPM00 in Table S4) as explained above (p = 0.006, and still marginally significant after correction for multiple tests, fdr-p = 0.059).

However, this Canadian population showed no clear sign of disconnection from any other population (all pairwise FST ≤ 0.007 with microsatellites or SNPs). It is therefore not trivial to understand why a single sample could show a different genetic signal of population decline. One possibility is the presence in this sample of recent immigrants, but again this hypothesis seems at odds with the low genetic differentiation among populations. Another possibility is that it could be an artifact due to the small size of the sample used (n = 12 individuals). To test this idea, we investigated the sensitivity of Bottleneck to sampling size. A resampling analysis showed that small sample sizes did not result in overestimated heterozygosity excess (Fig. 4A) using either model (although we note a strong difference in behavior between TPM70 and TPM95). Moreover, as expected, small sample size did not produce any false positive result (Fig. 4B), although here again with a notable difference between models. We conclude that the result observed for the Canadian population at Alert is unlikely to be an artifact and thus indicates a significant heterozygosity excess, signature of genetic bottleneck and thus demographic decline. We note, however, that the ∆H value for that population is low (∆H = 0.015) and that M-ratio analysis did not detect any signature of bottleneck. We suggest that it would be interesting to have more samples from Alert population and perhaps, if at all feasible, from other Canadian sites to test the hypothesis of a local bottleneck (but see below for methodological limits linked with the lifespan of the species). Increasing sample size can improve Bottleneck’s analytical power and make accessible the genetic dataset to other methods of genetic decline inference like Migraine or DILS (which were not employed in this case because we had only 12 individuals from this population). 
It would also be useful to investigate what local bottleneck signatures are theoretically expected in strongly connected metapopulations.

Life-history and signature of bottleneck

One important aspect of our work was to evaluate the impact of life-history on the genetic inference of demographic declines. This is not an easy task, as it requires running complex simulations and analyzing multiple datasets with resource-demanding inference methods. To tackle this problem, we focused here on the heterozygosity-excess method, which we could investigate using analytical approximations on many microsatellite datasets simulated with two extreme mutation models. Although we view this analysis as a first step toward understanding the applicability of genetic bottleneck inference in long-lived species (in particular, pending the development and testing of methods specifically designed to study recent demography using genome-wide data), the simulations provided useful observations. Most importantly, we found that increasing adult survival (and thus introducing longer generation time and generation overlap) resulted in a decreased and delayed signal of genetic bottleneck (Fig. 6 and S4). The ∆H estimates under SMM appeared to be overestimated in the conditions of our simulations (Fig. S6), meaning that the effect of age structure on the intensity of the bottleneck signature was less easily examined under SMM than under IAM (Fig. 6). Under IAM, the peak ∆H in a population where adults breed once a year and survive to the next year with a probability v = 0.8 was ca. 17% lower than in a population with a semelparous and annual life-history subject to the same demographic decline (1000 → 250). The decrease was 11% in the less severe bottleneck scenario (1000 → 500, Fig. S4A, C).

The temporal effect was clear regardless of mutation models and corresponded relatively well to what would be expected from the consequences of adult survival on generation time and NE (Figs. 6 and S4). According to our stochastic simulations, a population that went from 1000 to 250 individuals with adult survival ν = 0.8 would show a maximum ∆H only ca. 50 generations (220 years) after the event, as compared to 80 generations when ν = 0. This effect on the timing of the bottleneck signal roughly corresponded to the ratio of post-bottleneck NE estimated with vs without generation overlap.

The question of practical importance is thus, how many generations of drift are required before the ∆H signal starts being detectable? Of course, we cannot answer that question precisely for the ivory gull case (which would require the simulation of larger metapopulations and less caricatured life-cycle and demographic scenarios), but applying the Bottleneck software on simulated datasets, we found that genetic bottlenecks were almost never detected in any SMM model regardless of the strength of the bottleneck, even when the peak of ∆H has been reached (1000 → 250 or 1000 → 500; Table S5 and Fig. S4). The results obtained in IAM are more contrasted. Genetic bottlenecks were significantly detected by Bottleneck in several simulation scenarios, but this is highly dependent on the strength of the decline and the adult survival rate (Table S5 and Fig. S5).

If ivory gulls begin breeding at two years old and adult survival is ca. 0.86 (Stenhouse et al. 2004), generation time for the species can be approximated by \(T_G = a + \frac{\nu}{1 - \nu} = 8.1\) years (see supplementary material), meaning that a 20-year-old decline corresponds to only 2–3 generations. Although our simulations were simplistic, they suggest that it is potentially difficult to detect such a recent event. For example, in the strong bottleneck scenario under SMM and ν = 0.8 (Fig. 6B and D, light blue curve), the bottleneck signal after 20 years (4.4 generations in this scenario) was half its maximum value (which is seen after generation 55). The difficulty could be even greater if ivory gulls start breeding at a later age than currently thought based on plumage color (because TG would then be higher).
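The generation-time approximation above is easy to evaluate for other life-history values. The sketch below applies TG = a + ν/(1 − ν) and converts a decline of a given age in years into generations; the function names are hypothetical, and the later breeding age of 3 years is only an illustrative assumption.

```python
def generation_time(age_first_breeding, adult_survival):
    """TG = a + v / (1 - v), with constant adult survival v < 1."""
    return age_first_breeding + adult_survival / (1.0 - adult_survival)


def generations_elapsed(years, age_first_breeding, adult_survival):
    """Number of generations corresponding to a time span in years."""
    return years / generation_time(age_first_breeding, adult_survival)


tg = generation_time(2, 0.86)
print(f"TG = {tg:.1f} years")
print(f"20 years = {generations_elapsed(20, 2, 0.86):.1f} generations")
# Illustrative assumption: first breeding at 3 years instead of 2.
print(f"20 years = {generations_elapsed(20, 3, 0.86):.1f} generations (a = 3)")
```

Any increase in the age at first breeding lengthens TG and so reduces the number of generations available for a bottleneck signal to build up.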

The agreement between the different independent methods used here compels us to conclude that there is no genetic bottleneck signal in ivory gulls, but this conclusion has some limitations. First, according to the simulations, the heterozygosity-excess method may fail to detect a recent genetic bottleneck in long-lived species with overlapping generations, for which the signal is reduced and delayed, and it has little power with microsatellite loci evolving under SMM (or nearly so). Second, there is also some uncertainty about the capacity of the other methods used here to detect bottleneck events that recent (2–3 generations).

Genetic diversity and effective population size

Estimates of effective size of the ivory gull meta-population were around 1000 individuals, according to the results obtained on the two sets of genetic markers: NE ranged from 729.5 (95%CI [689.4–774.3], with SNPs, maf = 0.05) to 1487.9 (95%CI [990.5–2871.4], with microsatellites, no singleton).

When we consider a finer scale (e.g., at the population level), confidence intervals increase dramatically, spanning orders of magnitude, up to infinity. Such uncertainties probably stem from the low sample size used to infer local NE (England et al. 2006; Nunziata and Weisrock 2018). However, because we found an extremely low genetic differentiation among sampling sites (i.e., implying a homogenization of allele frequencies), we would expect similar estimates of NE at regional and global scales. On the contrary, we observed, at least for SNP data, that the global estimate of NE was the sum of the regional estimates (Table 2). In line with the previous discussion on the “local signature of population decline in Canada”, the observed discrepancies in NE estimated at global and regional scales suggest that high connectivity among sampling sites does not necessarily imply a common demographic trajectory as inferred with genetic information.

An effective worldwide population size of 1000 individuals remains extremely low in comparison to the number of individuals estimated for the species. This result suggests that genetic drift can occur, decreasing the evolvability of the species, which can lead to an increased risk of extinction.

The low NE estimated for ivory gull may result in part from the effects of the species’ life cycle. As explained in the introduction, with constant adult survival, generation overlap reduces NE following NE = N/(1 + v). Yet this effect will be partially compensated in ivory gulls because juveniles do not reproduce until they are 2 years old. Following Nunney (1993, equation 22), we find that delayed maturity will increase NE by a factor equal to 1 + (a − 1)(1 − v). Taking a = 2 and v = 0.86, we find NE/N = (2 − v)/(1 + v) ≈ 0.6.
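The two life-cycle effects described above can be combined numerically. The sketch below multiplies the generation-overlap term 1/(1 + v) by the delayed-maturity factor 1 + (a − 1)(1 − v); the function name is hypothetical, and the calculation assumes an even sex ratio and constant fecundity across age classes.

```python
def ne_over_n(age_first_breeding, adult_survival):
    """Expected NE/N combining generation overlap and delayed maturity.

    Assumptions (not specific to ivory gulls): even sex ratio, constant
    adult survival, and no variation in fecundity across age classes.
    """
    overlap_term = 1.0 / (1.0 + adult_survival)
    maturity_factor = 1.0 + (age_first_breeding - 1) * (1.0 - adult_survival)
    return overlap_term * maturity_factor


print(f"a = 1, v = 0.86: NE/N = {ne_over_n(1, 0.86):.2f}")
print(f"a = 2, v = 0.86: NE/N = {ne_over_n(2, 0.86):.2f}")
```

With a = 2 the product reduces to (2 − v)/(1 + v), the expression used in the text.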

Moreover, this estimate assumes an even sex ratio, while the ivory gull population is known to be strongly male-biased (67.8% males, Yannic et al. 2016). The origin of this bias is unknown, but its effect on NE will be different if it stems from uneven primary sex ratio or reduced male survival at the juvenile or adult stage (Nunney 1993). Details of sex-specific life history traits are required to precisely estimate the effect of this bias of sex ratio, but it will reduce further the effective population size of the ivory gull population. Finally, all the above calculations disregard variations in fecundity across age classes. Yet intermittent breeding (i.e., the non-reproduction of individuals that have already reproduced) is regularly observed in ivory gulls. In seabirds, age appears to be the key to skipped breeding, as this behavior is usually observed in the youngest and in the oldest adult birds (Cubaynes et al. 2011; Goutte et al. 2011) as an adaptive response of birds to the trade-off between survival and future reproduction considering environmental constraints. Skipped reproduction as a response to harsh conditions of breeding in the Arctic might in turn markedly reduce NE in ivory gull and further decrease the NE/N ratio.

From the estimates of effective population size, we deduced a mean NE/N ratio of ~0.022 [0.019–0.026] considering a worldwide population of 38,000–52,000 mature individuals (BirdLife International 2018) and NE of 1000. Estimates of census population size for the ivory gull remain relatively uncertain, however, and the NE/N ratio would be around 0.12 [0.087–0.158] considering 6325–11,500 breeding pairs (Gilchrist et al. 2008) and NE of 1000. For the ivory gull, this ratio is particularly low in comparison with the NE/N ratios generally observed in birds (Frankham 1995). The 100/1000 rule, formerly 50/500, postulates that at least NE = 100 is required to avoid inbreeding depression, and NE = 1000 is necessary to maintain evolutionary potential in a population (Frankham et al. 2014). This rule is often used in conservation biology to evaluate the risk of extinction of the concerned species (e.g., Jamieson and Allendorf 2012). Based on these estimates, the global ivory gull population does not appear to be at imminent risk of inbreeding depression, but an NE close to 1000 should be of concern for the long-term adaptability of the species. This is particularly problematic for species living in a rapidly changing environment. However, the estimates of NE, N and the NE/N ratio would need to be corrected (e.g., for uncertainty in N or for life-cycle characteristics). Some assumptions underlying the LD method we used to estimate NE were not fulfilled here. First, while the method assumes discrete generations, it can still be applied to age-structured species provided that the number of cohorts represented in the sampling is close to the generation time (Waples et al. 2014). Here, we randomly sampled adult birds on colonies, without any indication of their specific age or cohort, and for some colonies, sampled over a few consecutive breeding seasons. A random sample of adults consistently underestimates the true NE (Waples et al. 2014). Thus, our estimates of NE close to 1000 may be a slight underestimation of the true NE (depending on species life-history traits; Waples et al. 2014). Second, the LD method assumes the population is at mutation-drift equilibrium (Waples and Do 2010). This is, however, precisely the hypothesis we want to test in this study, when inferring the signature of genetic bottlenecks based on an excess of heterozygosity.
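The NE/N ranges quoted earlier in this section follow from dividing the approximate NE of 1000 by the bounds of each census estimate. A minimal sketch, with a hypothetical function name, reproduces both ranges; note that the second census figure is expressed in breeding pairs and is used as given in the text.

```python
def ne_n_range(ne, n_low, n_high):
    """Return (lowest, highest) NE/N ratio for a census range."""
    return ne / n_high, ne / n_low


ne = 1000  # approximate global effective size from the text
census = {
    "mature individuals (BirdLife International 2018)": (38_000, 52_000),
    "breeding pairs (Gilchrist et al. 2008)": (6_325, 11_500),
}
for label, (n_low, n_high) in census.items():
    low, high = ne_n_range(ne, n_low, n_high)
    print(f"{label}: NE/N between {low:.3f} and {high:.3f}")
```

The sixfold gap between the two ranges comes entirely from the census figure chosen, which illustrates how strongly the ratio depends on uncertainty in N.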


The ivory gull presents no signature of recent genetic bottleneck across its circumpolar breeding range, while the hypothesis of a local decline cannot be definitively ruled out. Our results thus contradict field observations showing that the number of ivory gulls has been severely declining at least in Canada (70% decline between 1970-early 1980 and 2004–2006; Gilchrist et al. 2008) and Svalbard (40% decline between 2009 and 2019; Strøm et al. 2020). Two factors may contribute to the discrepancy between bottleneck genetic inferences and regional demographic trends obtained from field surveys: (1) a lowered and delayed response of the genetic signal due to life-history features (generation time and overlapping generations), and (2) differences among regions in demographic trends, with a potentially strong role of larger colonies (e.g., in Russia) in maintaining worldwide genetic diversity. Uncertainties about local demography are related to the difficulty of obtaining, from such remote areas, the larger sample sizes that would allow more accurate demographic reconstruction. These difficulties could be circumvented by the use of complete genomes, now accessible for several bird species (i.e., the Bird 10,000 Genomes (B10K) Project), and individual resequencing to reconstruct past demographic history (e.g., PSMC; Li and Durbin 2011; see notable examples on pinnipeds (Peart et al. 2020) and birds (Nadachowska-Brzyska et al. 2015)). In addition, comparative genomic analyses studying several Arctic seabird species simultaneously, coupled with species distribution modelling, could be highly informative. Finally, we found a strong effect of age structure on the detection of genetic bottlenecks, with the signal of decline reduced and delayed by several years, even for a strong decline (i.e., 75% decrease). This effect should not be overlooked in the search for signals of population decline, particularly for long-lived species, because a species could experience even a strong demographic decline without it being genetically detectable.