## Abstract

Reliable information on population size is fundamental to the management of threatened species. For wild species, mark-recapture methods are a cornerstone of abundance estimation. Here, we show the first application of the close-kin mark-recapture (CKMR) method to a terrestrial species of high conservation value; the Christmas Island flying-fox (CIFF). The CIFF is the island's last remaining native terrestrial mammal and was recently listed as critically endangered. CKMR is a powerful tool for estimating the demographic parameters central to CIFF management and circumvents the complications arising from the species’ cryptic nature, mobility, and difficult-to-survey habitat. To this end, we used genetic data from 450 CIFFs captured between 2015 and 2019 to detect kin pairs. We implemented a novel CKMR model that estimates sex-specific abundance, trend, and mortality and accommodates observations from the kin-pair distribution of male reproductive skew and mate persistence. CKMR estimated CIFF total adult female abundance to be approximately 2050 individuals (95% CI (950, 4300)). We showed that on average only 23% of the adult male population contributed to annual reproduction and strong evidence for between-year mate fidelity, an observation not previously quantified for a *Pteropus* species in the wild. Critically, our population estimates provide the most robust understanding of the status of this critically endangered population, informing immediate and future conservation initiatives.

### Similar content being viewed by others

## Introduction

Reliable demographic information, and in particular knowledge of population size and trajectory, is fundamental to the effective management of species, whether they be exploited, controlled, or threatened^{1,2}. Demographic data are essential to answering important questions in ecology and conservation concerning, for example, extinction prevention, harvest management, competition, predator–prey interactions, and emigration/immigration^{3,4,5,6}. Estimating the population size of many wild species remains challenging because populations are typically large and dispersed, and individuals are often highly mobile and difficult to observe. Typically, mark-recapture experiments and methods are used to estimate population size and demographic parameters, with the foundations of the method going back over half a century^{7,8}. With recent rapid development in genetic marker technology, genetic population size estimates have become feasible using analogues of mark-recapture studies in which individuals are sampled repeatedly and identified genetically^{9}. More recent advances do not require repeated sampling but use genetic kinship as a replacement for recapture^{10,11}. Genetic mark-recapture is yet to be deployed widely for threatened terrestrial species but advantages such as smaller required sample sizes, expected lower ongoing cost and high reliability make it a powerful alternative to traditional methods^{12}.

The Christmas Island flying-fox (CIFF) (*Pteropus natalis*) is endemic to Christmas Island and is the island’s last remaining native terrestrial mammal^{13,14}. Recent evidence suggests that the CIFF population is in decline^{13,15,16} with hypothesised drivers of this decline including chronic cadmium exposure, lack of food availability, disease, and interactions with yellow crazy ants, giant centipedes, and feral cats^{13,17,18,19}. Surveys using distance sampling theory estimated the CIFF abundance to be approximately 2,100 individuals (standard error (SE) = 563)^{17}, a substantial decline from the 1984 estimate of 6,000 individuals^{20}. These and other factors prompted the listing of the CIFF under Australia’s federal Environment Protection and Biodiversity Conservation Act (1999) as critically endangered^{21}. A subsequent capture-mark-recapture (abbreviated MR to facilitate differentiation from CKMR) study (Todd et al. 2020) estimated the population size to be 3846 (standard error (SE) interval 3643–4071) individuals in 2018, again well below Tidemann’s (1985) estimate. Population trends suggest that failure to address the causes of decline will likely lead to the species’ extinction, with Geyle et al.^{22} estimating that the CIFF has a 41% chance of extinction in the next 20 years. The decline of the CIFF, which is paralleled by that of other fauna on the island (including the extinction of the island’s only other bat species, *Pipistrellus murrayi* in 2009), has serious implications for the health of Christmas Island’s ecosystem. The CIFF is a keystone species that plays an important role in seed dispersal and pollination processes on the island^{15,23,24}. Definitive demographic estimates for the CIFF population are essential to the management of this species but are difficult to obtain because of the CIFF’s cryptic nature, high mobility, the difficulty in accessing parts of the island, and incomplete knowledge of the species’ habitat use^{16,23,24}.

The close-kin mark-recapture (CKMR) method^{10,25} can provide estimates of total adult abundance, rate of change in abundance, and survival, and effectively circumvents many of the traditional challenges of adequately surveying CIFFs. CKMR uses genotyping technology to recover a genetic *tag*, in the form of inherited genes, of an individual’s closely related kin, increasing the effective sample sizes relative to MR, which can only recover tagged individuals. CKMR generalises the application space of MR and offers great potential for populations that are difficult to survey and have low recapture rates, such as the CIFF population. CKMR is less prone to trap-happy/shy bias because individuals need only be captured once^{26,27,28}. Furthermore, assumptions of sample mixing, mating system and reproductive biology can be readily checked via assessing the observed distribution of kin against their expectation under the assumed model. CKMR allows for inference on adults from the sampling of only juveniles^{29}, and age-specific fecundities can be included^{30}. Injuries associated with tagging and the potential changes to survival that may result from tags are also circumvented. Importantly, for harvested or hunted species, samples can be collected opportunistically from dead animals. To date, CKMR has been successfully applied to aquatic species only, including southern bluefin tuna^{31}, salmon^{32}, white, grey nurse, and northern river shark^{11,33,34}, and validated against classical MR for brook trout^{35}. There is considerable potential for the application of CKMR to terrestrial species with most species having the necessary reproductive biology for the method to be applicable (see Sect. 6. of Bravington et al.^{10} for examples of species where CKMR will not work).

In this study, we extended the CKMR principles to estimate the population ecology of the CIFF. We used genetic data from 450 CIFF individuals to investigate the distribution of kin pairs in the CIFF sample through pair-wise comparison of genetic data generated from the DArTseq genotyping technology^{36,37,38}. Given the distribution of observed kin pairs, and their associated covariate information, we proposed a set of novel CKMR models, which included a female-focussed model and extended sex-specific models. We develop the CKMR theory to allow for the quantification of male reproductive skew i.e., the proportion of adult males that breed on average over breeding seasons, and the probability that male and female partners recouple year-on-year (referred to as mate persistence). We further applied mitochondrial DNA (mtDNA) sequencing techniques to inform the sex of the shared parent for the sex-specific analyses and incorporate the uncertainty in mtDNA haplotype sharing in the kinship probabilities and estimation procedure. We explored the adequacy of the CKMR models for providing demographic estimates for CIFF and quantified components of the CIFF mating system, which are essential for effective management. In this paper, we discuss the implications of the first application of CKMR to a terrestrial species and comment on its utility for monitoring CIFF and threatened populations more broadly.

## Results

### Sampling and tissue collection

Christmas Island is a territory of Australia located 1500 km west of the Australian mainland. Samples and measurements from individual CIFF were collected across the island during multiple sampling exercises for several different studies^{19,24,39,40} (see “Methods” for a detailed description). Wing-membrane biopsies for each individual were taken and stored in ethanol for genetic analyses. Across all studies, a total of 728 CIFF samples (≈ 38% female see Supplementary Table S1) with both tissue and phenotypic measurements were collected.

### Genetic detection of kin pairs

For CKMR, we are primarily interested in the genetic detection of parent–offspring pairs (POPs) and half-sibling pairs (HSPs). Wing-membrane biopsies were assessed with 693 of the 728 sampled individuals having available and substantial tissue for genetic analyses. DNA extraction and quality control were performed by Diversity Arrays Technology (DArT P/L, Canberra), with 197 individuals removed due to poor extraction and complexity reduction preparation, which was hypothesised to be a result of either tissue quality, microbial contamination, or both. To genetically infer kin pairs, we used the Kinference R package^{41} as the primary quality control and analytical tool. Following further quality control on genetic parameters across individuals and single-nucleotide polymorphisms (SNPs), 409 individuals and 2085 SNPs were retained for kinship analysis (see Supplementary Note A for detailed tissue preparation, genotyping and quality control procedures).

### First-degree relatives

To estimate kinship, the probability of the two observed genotypes at each locus for a pair of individuals was computed under the HSP versus unrelated pair kinship types. The likelihood ratio was then computed from these two probabilities and the sum of log-ratios taken as the overall statistic for the pair and is referred to as the pseudo-log-odds-ratio (PLOD).

Summaries of the PLOD distribution for all pairwise comparisons between the 409 individuals present after quality control showed visible histogram peaks corresponding to theoretical expectations for unrelated, second-degree and first-degree relatives (Supplementary Fig. S1). A peak in the far-right tail of the PLOD statistic distribution indicates the presence of first-degree relatives, i.e., potentially full-sibling pairs (FSPs) and POPs, for a PLOD score greater than 100 (Fig. 1 and Supplementary Fig. S1). Fifteen pairs were detected with a PLOD statistic greater than 100 (Supplementary Table S2 summarises these pairs along with their associated covariate information. Note that Supplementary Table S2 also contains summaries for siblings described below). Discrimination of POPs and FSPs was investigated using a PLOD statistic specific for differentiating these kinship types (see “Methods”). Clear separation was observed between POPs and FSPs (Supplementary Fig. S2). The genetic discrimination of first-degree relatives was corroborated using age, individual sampling date, and mtDNA haplotype sharing information. Of the fifteen first-degree relative pairs, five showed strong evidence to be FSPs, two were mother–offspring pairs (MOPs) and eight father-offspring pairs (FOPs) (Supplementary Table S2). Under random mating we expect a priori FSPs to be rare as it requires the same mates to breed in subsequent mating seasons. If FSPs are not rare, then it implies a level of mate persistence, or potentially, fidelity to mating roost, or both.

Adult ages are unreliable for CIFFs older than 36 months (see ‘CIFF sample characteristics’ in “Methods”) and thus placing an older individual as the parent of a pair can be ambiguous. To avoid biasing the results by ‘informative censoring’, i.e., excluding data points a posteriori based on their outcomes rather than on their covariates, we excluded from the log-likelihood all POP comparisons (whether the outcome of the comparison was POP or not) that met the following criteria. Adult-adult comparisons where the age of the potential offspring was greater than 36 months because adult ages were unreliable above 36 months and the placement of the offspring relative to the parent ambiguous. We further excluded comparisons between individuals where the older individual of the pair (earlier birth year) was caught as a juvenile and could not have been old enough to be an adult at the birth of the juvenile. Given we made the modelling assumption that age at sexual maturity is 24 months (based on female value reported in Todd et al.^{40}), we further excluded POP comparisons where the cohort gap was less than 24 months. Four POPs had a birth cohort age difference of 24 months or less (Supplementary Table S2). Possible explanations for this observation were age classification uncertainty for individuals older than 36 months and variation in the age at sexual maturity across individuals. Age at maturity and sexual maturity are different for CIFF and furthermore are different between the sexes (age at maturity is 27 months for males^{40}).

For FSPs, two pairs were removed as recorded age was greater than 36 months for at least one individual in the pair. To include FSPs in the CKMR analysis the birth year of both individuals in the pair must be accurate, which is why these exclusions were made. We note that for modelling, comparisons for potential siblings were only made between individuals that have age measurements of 36 months or less.

Overall, there were six pairs that were very likely POPs as their birth cohort gap is greater than or equal to three years. Of these six, five had a male as the parent and one had a female. Two parent individuals, 6- and 4-year-old males (ID—65725, ID—67296), appeared twice implying that the offspring are siblings. These potential siblings were validated as a HSP and FSP. Therefore, for first degree relatives, we had one mother–offspring pair (MOP), five father-offspring pairs (FOPs) and three FSPs available for analysis.

### Half-sibling pairs

The set of second-degree relative pairs and potential half-sibling pairs were distributed around a centre-right peak of the PLOD distribution (Fig. 1 and Supplementary Fig. S1). The distribution of second-degree relatives contained two sources of unwanted relative pairs that have different methods for exclusion: (1) pairs of second-order relatives with the same expected PLOD score as HSPs including grandparent-grandchild (GGP), and aunt-uncle/niece-nephew pairs; (2) contamination of the lower tail of the distribution with less related pairs e.g., cousins and half aunt-uncle/niece-nephew pairs. A detailed description of the procedure used to remove these unwanted kin pairs is presented in Supplementary Note B.

Following these procedures 65 HSPs were detected. Aunt/uncle-niece/nephew pairs can be arbitrarily close in cohort gap but require FSPs to frequently persist in the population into adulthood for this type of relationship to contaminate the HSP distribution. We performed a simulation study to investigate the rate of aunt/uncle-niece/nephew pairs across different scenarios of mate persistence and found only a small proportion, close to the number of FSPs, in scenarios similar to CIFF observations (see Supplementary Note C). If present for CIFF, the retention of these aunt-uncle/niece-nephew pairs as HSPs would lead to a downward bias of reported abundance estimates approximately proportional to the fraction of HSPs, therefore, approximately 3/65 \(\approx\) 0.045.

The age gaps between the pair of individuals in each HSP ranged between zero and six years with 20 having the same birth cohort (Table 1). This observation could not be driven by females as they can only have at most one pup per year and thus any one female cannot be the shared parent of any two juveniles born in the same birthing season. Assuming age classification is accurate, if two juveniles born in the same year are a HSP then they must share a father. This large number of same birth cohort HSPs coupled with the information that two POPs are from the same father and the presence of FSPs, strongly suggested male reproductive skew and mate persistence for CIFF.

The ratio of males to females (Supplementary Table S1) in the total sample in the juvenile age class, which should be less prone to selective sampling, was approximately 1.6:1 male to female individuals. This implied that it is unlikely that these HSP/FSP observations are due to a male population that is smaller than the female population and with some males having to breed multiple times for all females to have a partner.

### Mitochondrial DNA sequencing

Given the different reproductive dynamics of the sexes, a sex specific CKMR model was required. To infer the sex of the shared parent for the HSP component of the model, we used mitochondrial DNA (mtDNA) sequencing. This was performed for the set of individuals in the pairs that showed evidence for first or second-degree relative status (see Supplementary Note D for details). It was expected that the CIFF has enough mtDNA diversity in the mtDNA control region sequenced based on prior studies^{39}.

Across the quality-controlled mtDNA sequences, 65 unique haplotypes were observed. The mode of the haplotype frequencies was 1.1% with one haplotype showing a frequency of ≈ 20% (Supplementary Table S2 and Fig. S3). For each pair with a PLOD score greater than 32, we determined if individuals in the pair shared the same haplotype and the frequency of the haplotype assigned to each individual in the pair (Supplementary Table S2). Individuals involved in FOPs did not share mtDNA haplotypes, except one pair that was excluded prior on ageing criteria. Depending on the frequency of haplotypes it is possible that FOPs share a mtDNA haplotype by chance. Across the first-degree relative set, the single MOP and viable FSPs showed identical mtDNA haplotypes.

For the 65 HSPs, 17 of the pairs shared mtDNA haplotypes, 28 of the pairs had different (or could not be differentiated) mtDNA haplotypes and a birth cohort gap of at least one year, and 20 pairs had different mtDNA haplotypes and the same birth cohort (intra cohort). Three of the HSPs (one pair intra and two inter) could not be differentiated based on mtDNA as one of the individuals of each pair did not have an mtDNA sequence available due to laboratory quality control failure.

### CIFF CKMR modelling

The genetic kinship analyses strongly indicated higher reproductive success for a subset of males across the breeding seasons. Furthermore, the presence of FSPs suggested there was the potential for mate persistence. These initial observations implied that learning quantitatively from CKMR about CIFF male demography was more challenging because the estimates of male abundance and survival were potentially confounded with the rate of dominant male turnover.

Given female CIFF have at most one pup per year, we initially grounded our results using a female-focussed CKMR model because it was simpler than the model including both sexes, and we were more confident with the female-related biological assumptions (see “Methods” and Supplementary Note G for a detailed description of the female-focussed model components). Briefly, the female-focussed model describes the number of female adults \({N}_{{\female},t}{ p}\)resent at an arbitrary unit of discrete-time *t*, taken to be years from here on, by the function

where \({N}_{{\female},{t}_{0}}\) is the adult female abundance at the initial year of modelling *t*_{0}, and \({\rho }_{{\female}}\) is the female population rate of increase. The kinship probabilities for both POPs and HSPs depend on adult female abundance at time *t* and the survival function that specifies the survival rate of an individual between time \({t}_{1}\) and \({t}_{2}\) (\({t}_{1}\)< \({t}_{2}\)) and is governed by the mortality rate *δ* via \(\varphi ({t}_{1},{t}_{2}) =\mathrm{ exp}[-\delta ({t}_{2}- {t}_{1})] (\mathrm{see\;Methods})\). Therefore, for CIFF, the female-focussed CKMR demographic model contains three parameters \({\varvec{\theta}}\) \(={\left({N}_{{\female},{t}_{0}}, {\rho }_{{\female}}, {\delta }_{{\female}}\right)}{\prime}\). Initial investigations indicated that mortality may be estimated with poor accuracy given the available data. We, therefore, fitted the female-focussed model with the mortality parameter constrained through a prior distribution that was informed from estimates of survival from a previous MR study^{42} (see “Methods” and Supplementary Note H for details on the prior distribution).

The CKMR model was then extended to include male population dynamics, male reproductive skew and mate persistence. The male population estimates were of high ecological interest and the model is more complete relative to the female-focussed model, which required some simplifying assumptions to fit without the male dynamics (see Supplementary Note G). In outline, the population dynamics for total adult abundance was modelled as

\(\mathrm{where\;} {N}_{{\female},{t}_{0}}\) and \({N}_{{\male},{t}_{0}}\) are the adult female and adult male abundance at the initial year of modelling *t*_{0}, and \({\rho }_{{\female},{\male}}\) is now a common trend parameter for adult females and males. In the extended models a sibling probability was computed by summing over three demographic probabilities FSP, MHSP, and PHSP, where the FSP probability incorporates a parameter *α* that models mate persistence (see “Methods” for details). To incorporate male reproductive skew, we make the distinction between the total set of adult males and the breeding set of males by incorporating the breeding male population size in the computation via \({N}_{{\male},{t}_{0}}\times \pi\), where \(\pi\) lies in (0*,* 1). The parameter set estimated for the base extended model was \({\varvec{\theta}}\) \(={\left({N}_{{\female},{t}_{0}}, { N}_{{\male},{t}_{0}}, {\rho }_{{\female},{\male}}, { \delta }_{{\female},{\male}, } \alpha , \pi \right)}{^\prime}\) (referred to as the 'base extended' model).

We explored three further variants of the base model. The second extended model incorporates a prior on the shared mortality rate (\({\delta }_{{\female},{\male}, })\) as detailed above for the female-focussed model (referred to as the 'base extended + prior' model). The third extended model include an extra mortality parameter \({(\delta }_{{\male}* })\) that is included in the paternal sibling calculations ('base extended + mortality'). The final includes the extra mortality parameter and a prior on the shared mortality parameter ('base extended + prior + mortality').

All available kin and mtDNA information were included in the proposed CKMR models and parameter estimates obtained via maximising the penalised log pseudo-likelihood (implemented as a covariate-class Binomial log pseudo-likelihood (see Supplementary Eq. S12).

In the main text, we present the estimates from the base extended model. The results of the female-focussed model and all four extended models were summarised in Supplementary Notes G and H. For each of the extended models, the elements of the gradient vector at the MLE estimates was close to zero indicating adequate convergence of the optimisation algorithm. Fit was similar across models based on the observed versus expected kin i.e., all observed kin were within the 95% interval of the Poisson distribution generated by the expected number of kin for each of the cells in Table 2 and Supplementary Table S7 (see “Methods” for explanation of Poisson approximation). Goodness of fit measures were similar across extended models with the base extended model showing marginally the lowest value (Supplementary Table S7).

The adult female abundance estimates in the base extended model were reasonably precise with the highest precision estimates being in years 2016 and 2017 (Supplementary Table S8 and Supplementary Figs. S10 and S11). For 2017, adult female abundance was estimated as 2049, with a 95% confidence interval (CI) of 967–4340 and coefficient of variation (CV) of approximately 40% (Table 3 and Fig. 2). Profile likelihood 95% confidence intervals for adult abundance were similar to those estimated using the Hessian standard errors (Supplementary Table S8 and Supplementary Fig. S12). Similar results from the female-focussed model were observed with 1948 adult females estimated in 2017 and a similar CV (Supplementary Table S5). Although similar, estimates of trend and mortality are difficult to compare between the female-focussed and extended models because they are tracking both sexes in the extended models. In the base extended model, total adult abundance had a point estimate of 5883 (SE = 2468) in 2016 with an 95% CI interval of 2585–13,385. Across models the lower bound on total adult abundance was relatively consistent at \(\mathrm{approximately}\) 2200–2500 individuals (Supplementary Table S8).

Graphs of kin with the same mt-DNA haplotype (including MOPs which are not ambiguous based on mt-DNA) showed very few clusters of kin, which implies that when performing any one comparison the knowledge of all prior same-haplotype comparisons provides little information on whether the comparison will be a same-haplotype kin pair (Supplementary Fig. S13). This implies that the assumption of pairwise independence of comparisons is reasonable for female parameters and variance estimates should be reliable. Clusters were more apparent for kin with different mt-DNA haplotypes (Supplementary Fig. S14). Therefore, the uncertainty estimates for male-associated parameters may be underestimated based on the Hessian and profile likelihood methods and should be interpreted with more caution.

The estimated time series of total adult, adult female, and adult male abundance shows an expected depreciation in precision at the edges of the time horizon modelled, with the largest variability observed in 2013 (first year modelled) (Fig. 2). The point estimate for the trend in total adult abundance was mildly decreasing but highly variable, with the data being consistent with a trend in adult abundance potentially increasing or decreasing (Table 3 and Fig. 2). The sampling importance resampling (SIR) trends (see “Methods”) showed this well, with many of the 10 sampled trends increasing (Fig. 2). The estimate of the average annual adult mortality rate was 0.198 (SE = 0.159), which was not substantially different from the value estimated in the base extended + prior model (0.291 (SE = 0.129)), which suggests that sex-averaged survival may be higher than previous estimates (Supplementary Table S8). Precision for the mortality rate estimates was poor and should be interpreted cautiously. Hessian and profile likelihood standard errors were discrepant for the mortality rate parameter with substantial asymmetry on the lower bound across the extended models (Supplementary Table S8 and Supplementary Fig. S12). The use of a prior on mortality rate assisted with constraining uncertainty in the lower bound, which was observed in results from the profile likelihood interval method (Supplementary Table S8). For the base extended + extra mortality model, the point estimate for \({\delta }_{{\female},{\male}}\) decreased mildly to 0.149 and the estimate for the breeding males was 0.244 (SE = 0.218). For the base extended + prior + mortality model, a similar point estimate for \({\delta }_{{\male}*}\) was observed when a prior was used for \({\delta }_{{\female},{\male}}\). For both base extended + mortality and base extended + prior + mortality models, uncertainty in the mortality parameters was high and there was little evidence for difference in these parameters. The inclusion of these extra parameters led to a marginally poorer fit based on the goodness of fit measures when compared to the base model (Supplementary Table S7).

The estimate of the breeding male proportion (of total adult male abundance) was \(\approx\) 20% across all models with an estimate of 0.230 (SE = 0.115) in the base extended model. The point estimates for *α* were consistently \(\approx\) 5.4 (SE = 0.563 in base extended) across models, which suggests that there is strong evidence of mate persistence, leading to an inflated rate of full sibling pairs in the CIFF population. We took ratios of the FSP probability (see Eq. (6) of “Methods”) across covariate classes computed at the reported estimates for the base extended model to the probability under a scenario of a random selection of mates (i.e., *α* = 0). An *α* = 5.435 corresponded to an approximate 200 times increase in the FSP probability on average across covariate classes.

## Discussion

The critically endangered Christmas Island flying-fox is a priority for conservation management, and its apparent decline raises serious concerns about both its own survival and the keystone ecological services it provides as the sole remaining native mammal on Christmas Island. To direct the CIFF effectively away from extinction, management requires accurate estimates of population demographics and dynamics. To this end, we extended the CKMR method to estimate sex-specific abundance, survival, and trend for this species. This work is the first application of CKMR to a terrestrial species and shows its capacity as a tool for monitoring threatened species.

The genetic kin-finding analysis of 409 individuals showed that first and second-order kin could be reliably detected for CIFF based on SNP markers genotyped using DArTseq. Overall, six parent-offspring, three full-sibling and 65 half-sibling pairs were valid for CKMR modelling. The CKMR method revealed vital information on survival, trend, and the CIFF mating system, including strong evidence of skewed male reproductive success. This skew makes quantification of the adult male census population via CKMR challenging because abundance is linked to the breeding set of males and survival is confounded with the rate at which dominant males exit the breeding population. We overcame these challenges through extending the CKMR theory and estimated total adult abundance to be ≈ 5900 (95% CI 2600–13,500) individuals for the highest-precision year. Estimates of total male abundance were ≈ 3,800 (95% CI 1350–10,900). However, the average proportion of this estimate that were breeding was 23% suggesting a point estimate of ≈ 750 males breeding in 2017 with a 95% CI of 450–1730. Kin clusters were more apparent for kin with different mt-DNA haplotypes, which implies that the pairwise independence assumption of the pseudo-likelihood is unreliable for male-derived parameters. This does not bias the point estimates but does affect the variance estimates of the male parameters^{10,43}. Variances may be underestimated based on the Hessian and profile likelihood methods and should be interpreted with more caution. Quantifying the extent of the underestimation is an interesting and challenging topic, which we leave as future work. This is anticipated to arise frequently in studying endangered wild populations. We expect variances of female parameters to be well estimated. Adult female abundance was estimated to be ≈ 2,050 individuals in the high precision 2017 year with a 95% CI of 960–4350. Adult female abundance speaks directly to the size of the effective breeding population and hence capacity for population growth, and so is arguably the most important component of the population to estimate.

Our estimates indicate that the present CIFF population is larger than the 1000–2000 individuals estimated in the early 2000s^{15,44}. CIFF nocturnal distance sampling surveys conducted between 2006 and 2012 reported a population estimate of 2100 (S.E. 563)^{17} and a decrease in CIFF incidence over the occupancy survey period^{16} relative to the population estimate of 6000 from Tidemann^{20}. However, results from a recent MR study indicated an increasing trend in the estimated population size from 2430 (95% CI 1665–3868) in 2016 to 3846 (95% CI 3451–4309) individuals in 2018^{42}. The point estimate of total adult abundance from CKMR was larger, but CKMR intervals included the MR point estimates. The CKMR estimates may be downwardly biased by contamination of the HSP set with aunt-uncle/niece-nephew kin. We expect the bias, if present, to be approximately proportional to the fraction of HSPs. The simulation indicated that the number of aunt-uncle/niece-nephew relationships to be close to the number of FSPs so approximately 3/65 \(\approx\) 0.045, which is well within one standard deviation of the presented uncertainty of the presented estimates. CKMR abundance estimates integrate uncertainty over an expanded parameter set and lost 40% of the samples through laboratory quality control, making comparisons of uncertainty per unit effort between these studies difficult. The CKMR point estimate of trend in adult female abundance indicated a mildly decreasing population, but the precision of the estimate was poor and could neither exclude nor corroborate the trend results of Todd^{42}. Caution should be exercised when interpreting the differences between recent population size estimates and those of the early 2000s because the underlying model assumptions and data used are difficult to reconcile between these studies. In either case, an estimate of adult abundance that is larger than previously reported is a positive sign for the population health of the species.

The CKMR method revealed additional information on the CIFF mating system, which was driven by the primary observation that 20 sibling pairs were intracohort. McCracken and Wilkinson^{45} remarked that there is no convincing evidence that mating in any bat species is random and that bats exhibit mating systems ranging from monogamy to classic lek mating systems. Across members of the *Pteropus* genus there is evidence of polygyny both in the wild and in captivity^{45}. *Pteropus rodricensis* forms apparent harem groups of one male with up to eight adult females^{46}. *Pteropus hypomelanus*, *P. vampyrus* and *P. pumilus* appear to mate in small harems in captivity^{45}. Todd et al.^{40} concluded that CIFF has a polygynous mating system with breeding being seasonal and annual. Tidemann^{20} also observed that a small proportion of the male population monopolises all adult females. Personal observations of harem structures were recorded at CIFF roost sites during recent sample collection, which further suggests a skew in mating opportunities. As female CIFF can have at most one pup per year, the 20 observed intracohort HSPs are likely from males that have mated with multiple females in a breeding season. Mitochondrial DNA sequencing indicated that the shared parent for these intra-cohort siblings was very likely male. This is also evidence for reliable age classification as female intra-cohort pairs would be present if ageing errors occurred although intra-cohort pairs can, with a small probability, share a mtDNA haplotype and still share a father. We further observed that two of the five paternal POPs were from the same father. Individual male reproductive success may vary around an average of one mate from year to year, which could produce substantial numbers of intra-cohort pairs without any one set of individuals dominating year-on-year. Quantitatively the base extended CKMR model estimated that ≈ 23% (95% CI of 0.077–0.515) of the total adult male population contributed to breeding on average over the years studied. The estimated reproductive skew provides useful knowledge to inform conservation management actions, including population viability models.

We further observed three FSPs, which under random mating we expect to be very rare as it requires the same individuals to meet in subsequent mating seasons. The mate persistence parameter (*α*) was estimated to be substantially non-zero indicating an inflated rate of mate persistence relative to individuals randomly choosing new partners in each new breeding season. While interseason mate fidelity has been documented in other bats^{47}, to date it has not been observed in *Pteropus* species in the wild. There is the potential for site fidelity to generate FSPs with females potentially mating with the same mate by association with the mating territory; however, few studies have examined this in *Pteropus* species and Welbergen^{48} showed that *P. poliocephalus* females join a male's mating territory but showed limited mate persistence within years and none between years. Combined, both prior and CKMR observations and estimates point strongly to, at least in part, a single-male, multi-female group mating system for CIFF with the potential for mate persistence, roost fidelity, or both.

Mitochondrial DNA was critical to the implementation of the female-focussed and extended CKMR models as it allowed the substantial set of sibling pairs detected to be partitioned by the sex of the shared parent. Information on mortality and trend in CKMR needs several juvenile cohorts to estimate these parameters, which are based on how fast the proportion of siblings declines as the number of cohorts separating them increases. The approximate point estimate for the averaged adult female and male survival was 0.82, which is larger than the adult female and male survival reported in Todd^{42} of 0.65. When the breeding male mortality was modelled separately the point estimate for the common mortality parameter decreased. The point estimate of breeding male mortality was greater suggesting lower survival, which may reflect a greater stress on breeding males. We note that across models the precision was too poor to make reliable inference about differences in point estimates of survival. However, if sampling of CIFF for CKMR were to continue for the CIFF we expect the precision of mortality/survival estimates to increase rapidly as the numbers of kin and differentiated cohorts increases.

The kinship analysis showed peaks in the PLOD kinship statistic that aligned well with theoretical expectations and variability. However, the PLOD distribution was not as well spread as for other species, e.g., Atlantic bluefin tuna^{43}. For CKMR analyses of other species, studies have used a variant of DArTSeq namely DArTCap, which gives much higher sequencing read depth per locus than DArTSeq, and lower genotyping error rates, which can improve the spread of the PLOD distribution and in turn discrimination of kinship classes. DArTCap requires knowledge of the variants intended to be sequenced and has an initialisation step to manufacture the capture probes that bind to the restriction fragments for sequencing^{49}. The SNP markers used in this study could be used for this initialisation step and once performed DArTCap offers a cheaper and more reliable genotyping platform than DArTSeq for future monitoring of CIFF.

An important research goal for CIFF and other threatened species would be to develop accurate ageing methods. This would facilitate adult-adult comparisons in the analysis and an increase in kin detected per sample. Although ageing by tooth extraction and examination has been conducted in surveys of other flying-fox species^{50}, the method can have implications for animal welfare and fitness. Thus, a more suitable method for age determination could include epigenetic age estimates^{51} from the biopsy used for CKMR, or the use of the universal mammalian epigenetic age predictor^{52}. The presence of a high-quality assembled CIFF genome is under consideration and could, in principle, make CKMR more efficient through the detection of more distant kin and assist the development of cheaper genotyping. A powerful feature of the CKMR method is that subsequent sampling improves the precision of the current analysis as new samples can be compared to all previous samples, offering a cost-effective and accurate continuous monitoring method. The limits to precision by continuous monitoring, which are influenced by generation length and sampling intensity (among other elements of the species' biology), can be explored through CKMR design principles. Further analyses of minimum sampling requirements for CKMR monitoring maintenance could be performed given the results from this larger study using the theory outlined in Bravington et al.^{10}. Crucially, whatever the particulars of future CIFF sampling, CKMR for CIFF only requires DNA samples and avoids the maintenance and ethical issues associated with a marked population. CKMR methods can be integrated with existing MR monitoring, providing additional detailed understanding at a low cost. The findings of CKMR analyses offer important insights particularly for threatened species.

This study is the first application of CKMR to a threatened terrestrial mammal. The study provided state-of-the-art estimates of key female and male demographic parameters, revealed further insights into the CIFF mating system, and, through a novel model, estimated the extent of male reproductive skew and mate persistence in this sample. This study provides pivotal abundance and trend estimates that together with the recent MR study form a sound baseline for ongoing management of CIFF.

CKMR has clear potential for understanding the demography of mammals and as a broad tool for the monitoring and management of threatened terrestrial species. Future applications of CKMR to terrestrial species will encounter similar challenges to those overcome in this study, for example, polygamy, reproductive skew, mate persistence and sex-specific demography. A substantial challenge, not encountered in this study, that will arise for many species is limited dispersal. Spatially explicit CKMR models have been explored^{30,53} to address this issue with the added model complexity needing consideration for many species. We anticipate that the techniques detailed in this work will serve as a methodological reference point for many future applications and extensions to terrestrial species, which is critical to the uptake and dissemination of the powerful CKMR method.

## Methods

### Sampling and tissue collection

Christmas Island is a territory of Australia located in the Indian Ocean approximately 380 km south of Java, Indonesia, and 1500 km west of the Australian mainland. It is a small 135 km^{2} island composed of tertiary limestone overlying volcanic andesite and basalt and its topography consists of a series of limestone terraces separated by cliffs^{20}. Samples and measurements from individual CIFF were collected across the island during multiple sampling exercises for several different studies^{19,24,39,40}. Those studies detail the capture, morphological measurements, and tissue sampling procedure of the samples included in this analysis. Briefly, CIFFs were caught using nets at both foraging and roost sites between August 2015 and February 2019. Capture locations were distributed across the island and the net location within each roost site was randomly selected. Foraging sites were chosen using night driving surveys from random starting locations and fruiting trees selected as a capture site were random on any given night^{40}.

Captured animals were processed at the capture site where they were anaesthetised and sex and age determined, and morphological measurements recorded; one of the capture sites was a field laboratory (The ‘Pink House’ building in Christmas Island National Park). Age class was determined using morphologic measurements and secondary sex characteristics, including tooth staining and wear and in females, teat elongation. For each individual, wing-membrane biopsies were taken and stored in ethanol for genetic analyses.

#### Approval for animal experiments

The research protocols used across the fieldwork adhered to the guidelines of the American Society of Mammalogists for research on live mammals^{54} with approval from the Animal Care and Ethics Committee of Western Sydney University (Project Protocol No. A11140). Permits to capture, handle, band and measure bats were issued by the Christmas Island National Park (Permit No. CINP-2015-6-1).

### Kin-pair detection

To genetically detect kin pairs, we performed single nucleotide polymorphism (SNP) genotyping for each available CIFF individual using DArTseq. The tissue preparation, genotyping and quality control procedures used are detailed in Supplementary Note A.

To estimate kinship, the probability of the two observed genotypes at each locus for a pair of individuals is computed as if the pair was truly a HSP versus truly an unrelated pair. The likelihood ratio under these two kinship categories was computed from these two probabilities using functions in the Kinference R package for all loci and the mean of the log-ratios is taken as the overall statistic for the pair and is referred to as the pseudo-log-odds-ratio (PLOD). Large positive values are evidence for HSP status and negative values are evidence against; full-sibling pairs (FSPs) and parent–offspring pairs also have very high positive PLOD scores and can be separated from half-siblings because their expected PLOD can be calculated a priori. Bravington et al.^{10} and Hillary et al.^{11} detail further the computation of the likelihood ratio for different kinship types. PLOD statistics were computed for each pair of individuals and the distribution was summarised to infer kinship.

Statistical genetics theory was used to predict the expectation and variance of the distributions of different kinship types, including parent–offspring pairs (POPs), full-sibling pairs (FSPs) and half-sibling pairs (HSPs) (Supplementary Fig. S1). These theoretical predictions were compared with the observed distributions to assess the adequacy of the genotype set used for kin finding and then subsequently to determine pairs that are very likely to be of a particular relatedness type, e.g., POPs, FSPs and HSPs.

First degree relatives including FSPs and POPs also have very high positive PLOD scores and are typically separated from second-degree and lower order kin by simple segmentation of the PLOD x-axis. Discrimination of POPs and FSPs was investigated using a PLOD statistic specific for differentiating these kinship types implemented in the split_FSPS_from_POPs function the Kinference R package and is robust to genotyping error. Again, the probability of the two observed genotypes at each locus for a pair of individuals is computed as if the pair was truly a POP versus truly an FSP and the summed log likelihood ratio computed. Discrimination of FSPs from POPs is based on inspection of the distributions of the PLOD for clear separation of the two kinship types and comparison against their expected values.

The set of second-degree relatives is determined by inspecting the pairs whose PLOD distribution is centred at the HSP expectation. This distribution contains HSPs and unwanted relative pairs that have (1) the same expected PLOD score (e.g., grandparent-grandchild) and (2) contamination from less related pairs (e.g., cousins). A detailed description of the procedure used to segment second-degree relatives and remove contamination in the lower tail from lower-degree relatives is presented in Supplementary Note B.

### CIFF sample characteristics

Phenotypic information for each CIFF sample included an individual identifier, collection date, capture location, sex, forearm and tibia length, developmental stage i.e., juvenile, sub-adult and adult and an estimate of age. For CKMR analysis, age was divided into three broad classes including juvenile (0–12 months old), sub-adult (> 12–24 months old), and adults (> 24 months old). Age determination for CIFFs older than 36 months is more difficult; once an individual becomes an adult, age can be estimated but a discrepancy between the age estimate and actual age of ± one year is likely especially for individuals past 36 months. Therefore, age is reported for each individual but only juveniles, sub-adults and adults less than 36 months (third year of life) old were considered to have a known age. Age at sexual maturity is assumed to be 24 months, which corresponds to the age of female sexual maturity reported in Todd et al.^{40}. Across the set of 409 individuals that passed genotyping and kinship analysis quality control, 176 were less than 12 months old, 104 were \(\ge\) 12–24 months old, 62 \(\ge\) 24–36 months old, and 66 were \(\ge\) 36 months old. In total, the years of sampling ranged from 2015 to 2019 with 46, 158, 58, 48, and 99 individuals sampled in each year respectively.

### CIFF biology

The biology of the CIFF was used to inform the computation of the CKMR kinship probabilities. The findings of Todd et al.^{40} indicate that most births occur between the months of January and April with the peak frequency of births occurring in February and March and conception occurring between June and August. Todd et al.^{40} noted that mating and birth were not limited to these peak periods and occurred with substantial frequency throughout much of the year. Therefore, we grouped individuals by their likely natural birth cohort rather than calendar year cohort—for example, if an individual was estimated to be one year old and caught in December 2017 then we considered it to be from the 2017 cohort rather than the calendar year cohort of 2016.

The small size of Christmas Island (135 km^{2} total area and 22 km maximum width) and the flying speed of the CIFF imply that individuals can potentially disperse to any location within a night. The very large distances to any other potential habitats outside of Christmas Island further imply that the CIFF population is spatially circumscribed with minimal emigration. We assume that for CIFF, fecundity is constant with females having one pup per year, the age of maturity is 24 months, and there is no stock structure. Kin detection can assist us in checking the validity of these assumptions, but we assume this to be the base model scenario.

The CIFF shows evidence for a polygamous mating system with breeding being seasonal and annual^{40}. Tidemann^{20} suggested that a small proportion of the male population monopolises all adult females, with James et al.^{15} concluding that this hypothesis required further investigation. Initial genetic kinship analyses strongly indicated higher reproductive success for a subset of males across the breeding seasons. Furthermore, there was the potential for mate persistence indicated by the presence of FSPs. Given these initial observations, learning quantitatively from CKMR about CIFF male demography is more challenging because the estimates of male abundance and survival are confounded with the rate of dominant male turnover. However, given female CIFF show evidence for having at most one pup per year, models focusing on females are anticipated to provide reliable information about female CIFF demography.

### CKMR method overview

For CKMR the focus is on pairwise comparisons of sampled individuals. Let \(i\) and \(j\) represent two individuals sampled from a population and the comparison of their genotypes generates a discrete kinship category \({K}_{ij}\) which, for example, could be unrelated, mother–offspring or half-sibling pair.

The probability of a particular kinship relationship, written as \({\mathbb{P}}({K}_{ij} = {k}_{ij }| {{\varvec{z}}}_{i},{{\varvec{z}}}_{j};{\varvec{\theta}}\)) can be calculated for each pair and kinship type. The notation emphasizes that the probability is conditional on covariate vectors \({{\varvec{z}}}_{i}\) and \({{\varvec{z}}}_{j}\) gathered for the two individuals at capture, noting that the covariates \({{\varvec{z}}}_{i}\) and \({{\varvec{z}}}_{j}\) do not include information on genotype or kinship to other individuals. Common covariates include year of sampling, sex, and year of birth \(({y}_{i},{y}_{j})\) if ages can be obtained. The kinship probabilities are the building blocks of the pseudo-likelihood defined below and link the population dynamics model, and the associated parameters \({\varvec{\theta}}\), with observed kinship for the pair of individuals. Typical base model parameters include adult abundance at time \(t\), trend, and mortality/survival. For brevity in the following, we drop the \({\varvec{\theta}}\) from the notation.

To formulate expressions for the kinship probabilities, we use the notion of expected relative reproductive output (ERRO)^{10}. For example, the probability that a sampled female \(i\) is the mother of \(j\) can be written as

where \(k\) indexes over kinship type (\(k\) =MOP in this instance), the reproductive output of an individual in year \(t\) is a random variable \(R(t)\), \({\mathbb{E}}[{R}_{i}({y}_{j})|{{\varvec{z}}}_{i}, {{\varvec{z}}}_{j}]\) is the expected reproductive output of female \(i\) in the year of \(j\)’s birth, and \({\mathbb{E}}[{R}_{+}({y}_{j})|{{\varvec{z}}}_{j}]\) is the total expected reproductive output of all adult females in year \({y}_{j}\). Formulations are similar for other kinship types but require the re-expression of this ERRO principle in terms of case-specific covariates and parameters (see Sect. 3 of Bravington et al.^{10} for examples). The difficulty of re-expression depends on the biology and sampling of the species, with integrating uncertain age a challenging example.

The goal is to make inferences about abundance and demographic parameters, which requires a population dynamics model that evolves deterministically or stochastically depending on survival and birth rate parameters. The population dynamics model determines the elements needed to compute the kinship probabilities and inference on the parameters governing this model proceeds by maximizing the pseudo-log-likelihood

where \({x}_{ijk}\) is an observed binary indicator for whether individuals \(i\) and \(j\) are a match for kinship type \(k\), which indexes over the different kinship types considered e.g., parent–offspring and half-siblings, and \(i\) and \(j\) index over the \(n\) individuals. Equation (1) is a pseudo-likelihood because the pairwise comparisons are not strictly statistically independent e.g., because no animal can have more than one mother. However, pseudo-maximum-likelihood estimates are asymptotically unbiased, and variance estimates are accurate when sample sizes are small relative to the size of the population (see Sect. 4. of Bravington et al.^{10} for a discussion and Supplement of McDowell et al.^{43}).

Inference can proceed by maximising Eq. (1) directly or be made more computationally efficient by factoring the likelihood into classes with the same covariate information, where the number of pairs in each class is a sum over Bernoulli random variables and thus follows a Binomial distribution.

### Female-focussed CKMR model

We initially investigate a female-focussed CKMR model because we were surer of the female-related biological assumptions, and it is simpler than the both-sex model. The following exposition serves as a template for the population dynamics model and kinship probabilities that are required for the extended model that includes both sexes.

The base population dynamics model describes the number of female adults \({N}_{{\female},t}\mathrm{ p}\) resent at an arbitrary unit of discrete-time *t*, taken to be years from here on, by the function

where \({N}_{{\female},{t}_{0}}\) is the adult female abundance at the initial year of modelling \({t}_{0}\), and \({\rho }_{{\female}}={\beta }_{{\female}}-{\delta }_{{\female}}\) is the female population rate of increase and balances the birth rate \(\beta\) with the mortality rate \(\delta\) assuming that the population is closed to immigration. In the CKMR model, \(\beta\) is not explicitly estimated as \(\rho\) is used in the population dynamics equations to compute \({N}_{{\female}, t}\) and \(\delta\) in the kinship probability computation (see below). Importantly, \({N}_{{\female}}\) refers to the number of breeding adult females and \({\delta }_{{\female}}\) includes a rate of reproductive senescence alongside death i.e., the model cannot distinguish between these two concepts with just \(\delta\). For females, we assume that reproductive senescence is not substantial, but this is an important consideration for males (see below). Therefore, for CIFF, the base case CKMR demographic model contains three parameters \({\varvec{\theta}}\) \(={\left({N}_{{\female},{t}_{0}}, {\rho }_{{\female}}, {\delta }_{{\female}}\right)}^{{{\prime}}}\).

Given the demographic model, we require the kinship probabilities for POPs and HSPs, which link the genetic relationship to the demographic model. In the sampling exercises from which genetic samples for this study were obtained, the individual was released after tissue samples and measurements were taken. Furthermore, adults ages \(\ge\) 36 months are considered unknown. Therefore, we require a form for the parent–offspring probabilities that can accommodate unknown adult age and encompass both the cases when the parent was sampled after the offspring and when the parent was sampled (potentially as a juvenile) before the offspring was sampled. POP probabilities can be split into MOPs and FOPs as sex is known.

Supplementary Note E details the derivation of the CIFF-specific CKMR probabilities. We state the probabilities for female parent comparisons only. For MOP comparisons, we wish to find the probability that individual \(i\), sampled at time \({t}_{i}\), is the mother of juvenile *j*, born at time \({y}_{j}\). We assume that once a female reaches maturity, she has a fixed fecundity that is independent of covariates such as age. The probability that individual \(i\) is \(j\)’s mother is

where \({{\varvec{z}}}_{i}\) and \({{\varvec{z}}}_{j}\) are the vector of observed covariates for individual \(i\) and \(j,\) which include sample date for mother and juvenile, age of the juvenile to compute birth date and sex of the parent is implicit, \(\varphi ({t}_{1},{t}_{2})\) is the survival function that specifies the survival rate of an individual between time \({t}_{1}\) and \({t}_{2}\) (\({t}_{1}\)< \({t}_{2}\)) and is governed by the mortality rate *δ* via \(\varphi ({t}_{1},{t}_{2}) =\mathrm{ exp}[-\delta ({t}_{2}- {t}_{1})].\) Equation (3) does not explicitly state that the mother was mature at the birth of the juvenile. In practice for CIFF, this constraint is enforced by removing, for the POP component of the model, adult-adult comparisons where the age of the potential offspring was greater than 36 months and comparisons where the older individual of the pair (earlier birth year) was caught as a juvenile and could not have been old enough to be an adult at the birth of the juvenile.

We note that the terms survival and mortality will both be used throughout and are to be treated as having a one-to-one correspondence via \(\varphi\). The case where \({t}_{i}={y}_{j}\) is excluded as adults sampled during breeding season may not get the full opportunity to contribute reproductively that year or the juvenile may be sampled with the mother. Importantly, these equations only rely on the birth year of the juvenile and the sampling year of the adult. The second case of Eq. (3) is specific to the mark-recapture sampling scenario where the mother can be caught before the juvenile is born.

Bravington et al.^{10} details the HSP probability calculation (see Sect. 3.2 of Bravington et al.^{10}). Briefly, we take some unknown but fixed individual *m* as the true mother of \(i\). Thus, we know *m* was alive and mature at \({y}_{i}\), which is analogous to *m* having been sampled at \({y}_{i}\). This implies that the probability that \(i\) and *j* form a maternal HSP (MHSP) is simply the probability that *m* is the mother of *j*, with \({t}_{m}={y}_{i}<{y}_{j}\). Therefore, the MHSP probability is given by

Equation (4) assumes that individual \(i\) must be born first i.e., \({y}_{i}<{y}_{j}\) such that same-cohort cases are excluded.

To implement the female-focussed model we need to differentiate the sex of the shared parent for putative siblings. To accomplish this, we used mtDNA sequencing. If haplotypes for the mtDNA region sequenced are shared, then this is evidence for the pair being maternally linked. However, if there are mtDNA haplotypes of substantial frequency (specifically less haplotypes than there are females) in the population, half siblings may share a father but maintain a shared haplotype because each individual’s mother by chance could have the same haplotype. Therefore, to adequately incorporate the potentially increased presence of FSPs due to roost and mate fidelity and dominant males along with uncertainty in mtDNA haplotype sharing due to chance haplotype sharing, we derived sibling kinship probabilities that describe the three sibling cases. That is, the pair shares both parents, the pair shares just a mother, or the pair shares just a father (see Supplementary Note F for details). The sibling probability calculation is described in Supplementary Eq. (S5).

Fitting the female-focussed model is not straightforward because the sibling probability and the inclusion of uncertainty in haplotype sharing require male dynamics and kinship probabilities to be calculated. Supplementary Note G details how this can be achieved without male components through a few added assumptions. The final paragraphs of the “Methods” section in Supplementary Note G provide the main components for reconstructing the estimation procedure for the female-focussed model and incorporate some of the concepts detailed in the extended models below.

### Extended CKMR models

The extended models include male populations dynamics and kin and attempt to model male reproductive skew and mate persistence. For CIFF, we observed substantial numbers of intra-cohort HSPs, which should be rare under random mating given the constraint that females can only have one pup per year. Their presence implies that these intra-cohort HSPs are paternally derived and are a result of male reproductive skew. Furthermore, if males repeatedly mate with the same set of females, dominate a breeding location year-on-year, or both, then full sibling pairs (FSPs) may be present in the kinship analyses, which was the case for CIFF. We present an initial base model that includes these concepts and explore extra parameters that are biologically plausible.

The population dynamics for total adult abundance is modelled as

\(\mathrm{where} {N}_{{\female},{t}_{0}}\) and \({N}_{{\male},{t}_{0}}\) are the adult female and adult male abundance at the initial year of modelling *t*_{0}, and \({\rho }_{{\female},{\male}}\) is now a common trend parameter for adult females and males. In the extended model, the MOP and FOP probabilities are computed as per Eq. (3) with the denominator replaced with the corresponding adult male abundance value for paternal kin probabilities. As for the female-focussed model, the MHSP and PHSP calculations are included in a broader 'sibling pair' probability calculation that include FSPs (detailed in Eq. (5) of Supplementary Note F).

Briefly, the sibling probability is computed by summing over three demographic probabilities FSP, MHSP, and PHSP. We let *M* be the event that pair (\(i\)*,* *j*) share a mother and *P* the event that pair (\(i\)*,* *j*) share a father. We propose the following model for the FSP probability that incorporates the potential for mate persistence across breeding seasons

where the parameter *α* models mate persistence and \({\mathbb{P}}(P|{{\varvec{z}}}_{i},{{\varvec{z}}}_{j})\) and \({\mathbb{P}}(M|{{\varvec{z}}}_{i},{{\varvec{z}}}_{j})\) are computed from Eq. (4) i.e., are the unadjusted maternal and paternal half-sibling probabilities, and Eq. (4) is computed such that \({y}_{i}<{y}_{j}\) i.e., same-cohort cases are excluded for sibling probabilities and comparisons. For Eq. (6) to be non-zero the sibling pair’s father and mother must survive and have a non-zero breeding probability, which is zero if the maternal probability is zero (e.g., intra-cohort FSP). This calculation requires the paternal half-sibling probability, and, in turn, a male abundance parameter.

To incorporate male reproductive skew, we make the important distinction between the total set of adult males and the breeding set of males in the population. We propose that the paternal POP component of the model estimates the total set of males while the paternal sibling component estimates the breeding set. The intuition for this is that the comparisons made for the POP component of the model include all sampled males as potential parents, which includes those that are not contributing to reproduction. The sibling component of the model tracks the unobserved shared parent of the two siblings and is therefore only providing information on the breeding males. Therefore, the difference between these two components of the model allows for an estimate of the proportional of total males that are breeding.

To make this distinction in the model, we incorporate breeding male population size in the computation via \({N}_{{\male},{t}_{0}}\times \pi\), where \(\pi\) lies in (0*,* 1) and models loosely the proportion of total males at time \({t}_{0}\) contributing to reproduction. The population rate of increase and mortality parameters are initially shared between the female and male components of the model. The parameter set to be estimated for the base model is now

Initial investigations indicated that mortality may be estimated with poor accuracy given the available data. A prior MR study^{42} for CIFF had estimated survival and we, therefore, explored model fit when the mortality parameter was constrained through a prior that has a Gaussian distribution with expectation \({\mu }_{{\delta }_{{\female},{\male }}}= 0.443\) and \({\sigma }_{{\delta }_{{\female},{\male }}}^{2}= 0.206\). These values corresponded to the transformed weighted (recapture sample size per year) average of the adult female survival values of \(\mu =0.643\) and SE = 0.134 reported in Todd^{42}.

We propose that the mortality rate of the breeding set of males includes not only death but an extra rate of dominant male turnover. We investigated model fit with an extra mortality parameter \({\delta }_{{\male}*}\) that is included in the paternal sibling calculations i.e., the paternal sibling components of Supplementary Eq. (S5) are computed with their own mortality parameter and all other kinship components (MOP, POP and MHSP) are computed with a common mortality parameter. Overall, we explore parameter estimates and model fit for four extended models. The base model is the simplest with no prior on mortality and no breeding male mortality parameter. The other three models complete the two-by-two table of common mortality parameter prior and breeding male mortality parameter.

### Parameter estimation and model validation

The pseudo-likelihood function proposed in Bravington et al.^{10} and summarised in Eq. (1) is used to approximate the full likelihood function and assumes independence between all kin-pairs and involves only the marginal probabilities detailed in Eqs. (3)–(4).

For CIFF, many pairwise comparisons in the computation of Eq. (1) have the same kinship covariates \({({\varvec{z}}}_{i},{ {\varvec{z}}}_{j})\)(e.g., birth cohort, sampling year). To improve computational efficiency, we aggregate comparisons into classes with equal kinship probabilities. We implemented a covariate-class Binomial log pseudo-likelihood version of Eq. (1) (see Supplementary Eq. S12). Estimates of \({\varvec{\theta}}\) are obtained via maximising Supplementary Eq. (S12) using the nlminb function in the R programming language^{55,56}. The gradient is supplied to the optimisation routine and is computed using automatic differentiation via the Template Model Builder (TMB) package^{57}, which integrates C++ with R. Automatic differentiation in TMB facilitates the estimation of parameter uncertainty, which is computed for all model parameters. Hessian-based parameter standard errors were obtained using TMB algorithmic differentiation functions and interval estimates were reported by appropriate transformation. Hessian-based standard errors are appropriate for CKMR provided approximate independence holds (see Sect. 4. of Bravington et al.^{10} and Supplement of McDowell et al.^{43}). One heuristic for assessing approximate independence outlined in Bravington et al.^{10} inspects the ratio of kin-triads (within which the pairwise comparisons are clearly not independent) to kin-pairs. If this rate is low, say on the order of the inverse of the root sample size, then independence is a reasonable assumption. To investigate this assumption, we inspect the ratio of kin clusters to kin found for CIFF. For further checks on symmetry of likelihood around MLE estimates, profile likelihood confidence intervals were also computed using the tmbprofile and cofint.tmbprofile functions in TMB, which implements the intervals based on Wilks' theorem (Sect. 13.8 of Wilks^{58}).

The score function of the log pseudo-likelihood at the estimates reported from TMB was evaluated to ensure values were near zero. Model fit was evaluated by comparing observed versus expected kin for each of the modelled kinship types (MOP, FOP, MHSP, PHSP, FSP). The expected number of kin is computed as the number of combinations multiplied by the kinship probabilities for each kinship type computed under the maximum likelihood estimates (MLE) of \({\varvec{\theta}}\). Inadequate model fit was evaluated by computing whether the observed kin count was within the 95% interval of the Poisson distribution with rate parameter equal to the expected number of kin. The Poisson was used for ease of reporting and reference, where the interval can simply be computed under the Poisson distribution using the reported expected number of kin, whereas the probability and number of comparisons for each cell would be required for the Binomial calculation. The Poisson distribution is a very good approximation to the Binomial when the number of comparisons is large and the probabilities of kinship small, which is the case for CKMR. Goodness of fit measures were also computed for the extended models by summing over the squared observed minus expected (normalised by expected value) kin differences for each kin category.

To visualise the variability in trend given the uncertainty in parameter estimates from each model we sampled ten potential trends using the sampling importance resampling (SIR) algorithm^{59}. Briefly, we drew 1000 replicates from the multivariate normal distribution (MVN) with expectation at the MLE estimates and the estimated covariance matrix of the parameter estimates reported from TMB. For each draw, we computed the CKMR likelihood value at that draw and divided it by the MVN density value at that draw, which becomes the weight for this draw. From these 1000 draws (and weights) we took a weighted random sample of 10 sets of parameter estimates. These parameter sets generate trend sets that are representative of the potential trends in abundance given the uncertainty in the distribution of parameter estimates.

### Ethics declarations

The authors have complied with ARRIVE guidelines.

## Data availability

Raw and processed genotype data, mitochondrial DNA data along with Template Model Builder C++ code, R code, and summary results generated are available at CSIRO's Data Access Portal permanent repository at https://doi.org/10.25919/2k26-t773. Results of the present study were generated from R version 4.1.2.

## References

Krebs, C. J.

*Ecology; the experimental analysis of distribution and abundance*(NY (USA) Harper and Row, 1985).MacKenzie, D. I.

*et al. Occupancy estimation and modeling: inferring patterns and dynamics of species occurrence*. (Elsevier, 2017).Anderson, R., Gordon, D. M., CraWley, M. & Hassell, M. P. Variability in the abundance of animal and plant species.

*Nature***296**, 245–248 (1982).Lande, R. Genetics and demography in biological conservation.

*Science***241**, 1455–1460 (1988).Begon, M. & Townsend, C. R.

*Ecology: from individuals to ecosystems*. (John Wiley & Sons, 2020).Hilborn, R.

*et al.*Effective fisheries management instrumental in improving fish stock status.*Proc. Natl. Acad. Sci.***117**, 2218–2224 (2020).Jolly, G. M. Explicit estimates from capture-recapture data with both death and immigration-stochastic model.

*Biometrika***52**, 225–247 (1965).Seber, G. A. A note on the multiple-recapture census.

*Biometrika***52**, 249–259 (1965).Luikart, G., Ryman, N., Tallmon, D. A., Schwartz, M. K. & Allendorf, F. W. Estimation of census and effective population sizes: the increasing usefulness of DNA-based approaches.

*Conserv. Genet.***11**, 355–373 (2010).Bravington, M. V., Skaug, H. J. & Anderson, E. C. Close-kin mark-recapture.

*Stat. Sci.***31**, 259–274 (2016).Hillary, R. M.

*et al.*Genetic relatedness reveals total population size of white sharks in eastern Australia and New Zealand.*Sci. Rep.***8**, 1–9 (2018).Waples, R. S. & Feutry, P. Close-kin methods to estimate census size and effective population size.

*Fish Fish.***23**, 273–293 (2022).Beeton, B.

*et al.*Final Report Christmas Island Expert Working Group to Minister for the Environment.*Heritage and the Arts*(2010).Zichy-Woinarski, J. C., Burbidge, A. & Harrison, P.

*The action plan for Australian mammals 2012*. (CSIRO publishing, 2014).James, D. J., Dale, G. J., Retallick, K. & Orchard, K. Christmas Island flying-fox

*Pteropus natalis*Thomas 1887: an assessment of conservation status and threats.*Christmas Island National Park, Commonwealth of Australia*55 (2007).Woinarski, J. C.

*et al.*An island-wide monitoring program demonstrates decline in reporting rate for the Christmas Island flying-fox*Pteropus melanotus natalis*.*Acta Chiropterol.***16**, 117–127 (2014).Director of National Parks.

*Status of the Christmas Island flying fox: results from the 2016 island-wide monitoring program.*(2016).Dorrestein, A., Todd, C. M., Westcott, D. A., Martin, J. M. & Welbergen, J. A. Impacts of an invasive ant species on roosting behavior of an island endemic flying-fox.

*Biotropica***51**, 75–83 (2019).Pulscher, L. A.

*et al.*Evidence of chronic cadmium exposure identified in the critically endangered Christmas Island flying-fox (*Pteropus natalis*).*Sci. Total Environ.***766**, 144374 (2021).Tidemann, C. R. A study of the status, habitat requirements and management of the two species of bat on Christmas Island (Indian Ocean): final report.

*Zoology Department, Australian National University, Canberra, Australian Capital Territory, Australia*(1985).Threatened Species Scientific Committee. Commonwealth conservation advice for Pteropus natalis (Christmas Island flying-fox).

*Department of the Environment*1–15 (2014).Geyle, H. M.

*et al.*Quantifying extinction risk and forecasting the number of impending Australian bird and mammal extinctions.*Pac. Conserv. Biol.***24**, 157–167 (2018).Director of National Parks.

*Status Report–September 2014 Monitoring, Research and Management of the Christmas Island flying-fox (Pteropus melanotus natalis)*. (2014).Todd, C. M.

*et al.*Body-size dependent foraging strategies in the Christmas Island flying-fox: Implications for seed and pollen dispersal within a threatened island ecosystem.*Mov. Ecol.***10**, 1–15 (2022).Skaug, H. J. Allele-sharing methods for estimation of population size.

*Biometrics***57**, 750–756 (2001).Nichols, J. D., Hines, J. E. & Pollock, K. H. Effects of permanent trap response in capture probability on Jolly-Seber capture-recapture model estimates.

*J. Wildl. Manag.***48**, 289–294 (1984).Wegge, P., Pokheral, C. P. & Jnawali, S. R. Effects of trapping effort and trap shyness on estimates of tiger abundance from camera trap studies. in

*Animal Conservation Forum*vol. 7, pp. 251–256 (Cambridge University Press, 2004).Gallagher, A. J., Staaterman, E. R., Cooke, S. J. & Hammerschlag, N. Behavioural responses to fisheries capture among sharks caught using experimental fishery gear.

*Can. J. Fish. Aquat. Sci.***74**, 1–7 (2017).Feutry, P.

*et al.*Inferring contemporary and historical genetic connectivity from juveniles.*Mol. Ecol.***26**, 444–456 (2017).Conn, P. B., Bravington, M. V., Baylis, S. & Ver Hoef, J. M. Robustness of close‐kin mark–recapture estimators to dispersal limitation and spatially varying sampling probabilities.

*Ecol. Evol.***10**, 5558–5569 (2020).Bravington, M. V., Grewe, P. M. & Davies, C. R. Absolute abundance of southern bluefin tuna estimated by close-kin mark-recapture.

*Nat. Commun.***7**, 1–8 (2016).Wacker, S.

*et al.*Considering sampling bias in close-kin mark–recapture abundance estimates of Atlantic salmon.*Ecol. Evol.***11**, 3917–3932 (2021).Bradford, R.

*et al. A close-kin mark-recapture estimate of the population size and trend of east coast grey nurse shark. Report to the National Environmental Science Program, Marine Biodiversity Hub*. (CSIRO Oceans & Atmosphere, Hobart, Tasmania, 2018).Bravington, M.

*et al.*Close-Kin Mark-Recapture population size estimate of*Glyphis garricki*in the Northern Territory.*Report to the National Environmental Science Program, Marine Biodiversity Hub. CSIRO Oceans & Atmosphere, Hobart*(2019).Ruzzante, D. E.

*et al.*Validation of close-kin mark–recapture (CKMR) methods for estimating population abundance.*Methods Ecol. Evol.***10**, 1445–1453 (2019).Sansaloni, C.

*et al.*Diversity Arrays Technology (DArT) and next-generation sequencing combined: genome-wide, high throughput, highly informative genotyping for molecular breeding of Eucalyptus. in*BMC proceedings*vol. 5 1–2 (BioMed Central, 2011).Kilian, A.

*et al.*Diversity arrays technology: a generic genome profiling technology on open platforms. in*Data production and analysis in population genomics*67–89 (Springer, 2012).Melville, J.

*et al.*Identifying hybridization and admixture using SNPs: Application of the DArTseq platform in phylogeographic research on vertebrates.*R. Soc. Open Sci.***4**, 161061 (2017).Phalen, D. N.

*et al.*Genetic diversity and phylogeny of the Christmas Island flying fox (*Pteropus melanotus natalis*).*J. Mammal.***98**, 428–437 (2017).Todd, C. M., Westcott, D. A., Rose, K., Martin, J. M. & Welbergen, J. A. Slow growth and delayed maturation in a critically endangered insular flying fox (

*Pteropus natalis*).*J. Mammal.***99**, 1510–1521 (2018).Bravington, M. V., Miller, D. L. & Baylis, S. M. Kinference: pair wise kin-finding for close-kin mark-recapture. R package version 0.0.80 (2021).

Todd, C. M. The ecology and conservation of the Christmas Island flying-fox (

*Pteropus natalis*). (Western Sydney University, 2020).McDowell, J. R.

*et al.*Low levels of sibship encourage use of larvae in western Atlantic bluefin tuna abundance estimation by close-kin mark-recapture.*Sci. Rep.***12**, 18606 (2022).Corbett, L., Crome, F. & Richards, G. Fauna survey of mine lease applications and national park reference areas, Christmas Island, August 2002. Appendix G.

*Phosphate Resources Limited (eds.)*(2003).McCracken, G. F. & Wilkinson, G. S. Bat mating systems. in

*Reproductive biology of bats*321–362 (Elsevier, 2000).Carroll, J. B. & Mace, G. M. Population management of the Rodrigues fruit bat.

*Int. Zoo Yearbook***27**, 70–78 (1988).Rossiter, S. J., Ransome, R. D., Faulkes, C. G., Le Comber, S. C. & Jones, G. Mate fidelity and intra-lineage polygyny in greater horseshoe bats.

*Nature***437**, 408–411 (2005).Welbergen, J. A. The social organisation of the grey-headed flying-fox, Pteropus poliocephalus. (University of Cambridge, 2005).

Feutry, P.

*et al.*One panel to rule them all: DArTcap genotyping for population structure, historical demography, and kinship analyses, and its application to a threatened shark.*Mol. Ecol. Resour.***20**, 1470–1485 (2020).Divljan, A., Parry-Jones, K. & Wardle, G. M. Age determination in the grey-headed flying fox.

*J. Wildl. Manag.***70**, 607–611 (2006).Mayne, B.

*et al.*Nonlethal age estimation of three threatened fish species using DNA methylation: Australian lungfish, Murray cod and Mary River cod.*Mol. Ecol. Resour.***21**, 2324–2332 (2021).Lu, A. T.

*et al.*Universal DNA methylation age across mammalian tissues.*BioRxiv*2021.01. 18.426733 (2021).Patterson, T. A.

*et al.*Rapid assessment of adult abundance and demographic connectivity from juvenile kin pairs in a critically endangered species.*Sci. Adv.***8**, eadd1679 (2022).Sikes, R. S. & Gannon, W. L. Guidelines of the American Society of Mammalogists for the use of wild mammals in research.

*J. Mammal.***92**, 235–253 (2011).Gay, D. M. Usage summary for selected optimization routines.

*Comput. Sci. Tech. Rep.***153**, 1–21 (1990).R. Core Team. R: A language and environment for statistical computing. (2021).

Kristensen, K., Nielsen, A., Berg, C. W., Skaug, H. & Bell, B. M. TMB: Automatic differentiation and laplace approximation.

*J. Stat. Soft.***70**, 1–21 (2016).Wilks, S. S.

*Mathematical statistics*(1964).Rubin, D. B. The calculation of posterior distributions by data augmentation: Comment: A noniterative sampling/importance resampling alternative to the data augmentation algorithm for creating a few imputations when fractions of missing information are modest: The SIR algorithm.

*J. Am. Stat. Assoc.***82**, 543–546 (1987).

## Acknowledgements

We gratefully acknowledge the staff of Christmas Island National Park who supported this work in many direct and indirect ways, in particular Samantha Flakus, Brendan Tiernan and Kerrie Bennison. We are thankful for the support provided for field data collection and genetic data generation by the Director of National Parks, Western Sydney University, Taronga Conservation Society, and CSIRO Land and Water. We are grateful for the help of Libby Timmiss, Richard Askew, Zoia Wowk, Georgie Custance, Varsha Raj, Bryony Hitchcock, Hannah Carson, Laura Pulscher, Ariana Ananda, Sophie Hall, Ari Drummond and Lauren Bassett, who assisted with capture and tissue collection. We are grateful to Hien Nguyen for advice and discussion of the work. Support for additional genetic data generation was provided by CSIRO Data61 and The University of Adelaide. The study was funded by CSIRO, the Director of National Parks, and the Department of Agriculture, Water and the Environment.

## Author information

### Authors and Affiliations

### Contributions

D.A.W, N.A.M., E.L. and M.V.B. designed the research. C.T., A.D., J.M.M., D.P., J.H., K.R., J.A.W., and D.A.W. processed and/or collected samples. L.L.J, M.V.B, K.N.A, E.L., P.F., I.P., S.B. and D.A.W performed the research. L.L.J, K.N.A, I.P., and S.B. analysed the data. L.L.J, M.V.B, K.N.A and E.L. wrote the initial draft and all other authors provided additions and feedback. D.A.W and N.A.M. secured funding for the study, D.P., K.R., J.M.M., D.A.W., and J.A.W. secured and provided funding for the initial DArTseq analysis, and K.R. and J.A.W. for field work.

### Corresponding author

## Ethics declarations

### Competing interests

The authors declare no competing interests.

## Additional information

### Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

## Supplementary Information

## Rights and permissions

**Open Access** This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

## About this article

### Cite this article

Lloyd-Jones, L.R., Bravington, M.V., Armstrong, K.N. *et al.* Close-kin mark-recapture informs critically endangered terrestrial mammal status.
*Sci Rep* **13**, 12512 (2023). https://doi.org/10.1038/s41598-023-38639-z

Received:

Accepted:

Published:

DOI: https://doi.org/10.1038/s41598-023-38639-z

## Comments

By submitting a comment you agree to abide by our Terms and Community Guidelines. If you find something abusive or that does not comply with our terms or guidelines please flag it as inappropriate.