Microsatellite analysis reveals low genetic diversity in managed populations of the critically endangered gharial (Gavialis gangeticus) in India

The gharial (Gavialis gangeticus) is a critically endangered crocodylian, endemic to the Indian subcontinent. The species has experienced severe population decline during the twentieth century owing to habitat loss, poaching, and mortalities in passive fishing. Its extant populations have largely recovered through translocation programmes initiated in 1975. Understanding the genetic status of these populations is crucial for evaluating the effectiveness of the ongoing conservation efforts. This study assessed the genetic diversity, population structure, and evidence of genetic bottlenecks of the two managed populations inhabiting the Chambal and Girwa Rivers, which hold nearly 80% of the global gharial populations. We used seven polymorphic nuclear microsatellite loci and a 520 bp partial fragment of the mitochondrial control region (CR). The overall mean allelic richness (Ar) was 2.80 ± 0.40, and the observed (Ho) and expected (He) heterozygosities were 0.40 ± 0.05 and 0.39 ± 0.05, respectively. We observed low levels of genetic differentiation between populations (FST = 0.039, P < 0.05; G’ST = 0.058, P < 0.05 Jost’s D = 0.016, P < 0.05). The bottleneck analysis using the M ratio (Chambal = 0.31 ± 0.06; Girwa = 0.41 ± 0.12) suggested the presence of a genetic bottleneck in both populations. The mitochondrial CR also showed a low level of variation, with two haplotypes observed in the Girwa population. This study highlights the low level of genetic diversity in the two largest managed gharial populations in the wild. Hence, it is recommended to assess the genetic status of extant wild and captive gharial populations for planning future translocation programmes to ensure long-term survival in the wild.

Genotyping quality and error rates. The polymorphic loci average amplification success rate was 96.09% for Chambal samples and 95.38% for Girwa samples. The quality index across the seven polymorphic loci was 0.86 ± 0.01 (mean ± SE) in Chambal and 0.89 ± 0.01 in Girwa (see Supplementary Figs. S1 and S2 online). The average allelic dropout (ADO), false allele (FA) and null allele frequency across polymorphic loci were below 5% in both populations (see Supplementary Table S2 online). We could not detect the occurrence of large allele dropout in our data.
Genetic variation. The cumulative probability of identity (PID biased) value of the panel of seven polymorphic markers was 1.74 × 10 -3 , and the probability of identity (PID sibs) was 4.84 × 10 -2 . We identified a total of 228 (Chambal = 162; Girwa = 66) distinct individuals using multilocus genotype data. The genetic diversity estimates for the Chambal and Girwa populations are summarised in Table 1. The number of alleles observed at each locus ranged from 2 to 7. The overall mean allelic richness was 2.80 ± 0.40, and the observed (Ho) and expected (He) heterozygosities were 0.40 ± 0.05 and 0.39 ± 0.05 across seven polymorphic loci, respectively. The mean allelic richness estimated using the rarefaction approach was 2.91 ± 0. 60  www.nature.com/scientificreports/ L(K) and delta K estimates (see Supplementary Fig. S3 online). All Girwa samples were assigned to cluster-I, with an average proportion of membership (q) of 0.97, and Chambal samples were assigned to cluster-II, with q = 0.82 (Fig. 2a). Genetic differentiation measures were low but significant as derived using both the fixation indices: F ST = 0.039 (P < 0.05); G' ST = 0.058 (P < 0.05) and allelic differentiation index Jost's D = 0.016 (P < 0.05). The discriminate analysis of principal component (DAPC) identified 20 genetic clusters associated with the lowest Bayesian information criterion (BIC) (see Supplementary Fig. S4 online). All the identified clusters (K = 20) showed an overlapping pattern with no clear structuring (Fig. 2b). Furthermore, the recent migration rate (m) was 0.12 (95% CI 0-0.26) from Chambal to Girwa River and 0.02 (95% CI 0-0.05) from Girwa to Chambal.  in Girwa, indicating a genetic bottleneck in both populations. The heterozygosity excess test (HET) using the stepwise mutation model (SMM) (Chambal P = 0.28; Girwa P = 0.08) and the two-phase model (TPM) (Chambal P = 0.18; Girwa P = 0.08) failed to detect the signature of a genetic bottleneck. We also obtained a normal L-shaped allelic distribution for both populations.
Mitochondrial DNA variation. We observed two mitochondrial CR haplotypes H1 in 58 individuals (92%) and H2 in five individuals (8%) with a single parsimony-informative site (at the 375-nucleotide position) in the Girwa population. Haplotype, H1 was shared by both the Girwa and Chambal populations 22 , whereas H2 was unique to the Girwa population. The haplotype diversity was (mean ± SD) 0.148 ± 0.057, and nucleotide diversity was 0.00029 ± 0.00011 in the Girwa population.

Discussion
This study is the first comprehensive genetic assessment of the critically endangered gharial and provides crucial baseline information essential for guiding the ongoing conservation efforts. The genetic diversity observed at seven polymorphic nuclear microsatellite loci was low in the gharial populations of Chambal and Girwa. Both populations had similar mean heterozygosities, and the allelic richness (Ar) was marginally higher in the Chambal population ( Table 1). The overall heterozygosities at nuclear loci in the wild (Ho = 0.40 ± 0.05; He = 0.39 ± 0.05) were lower than the previously reported estimates in captive (Ho = 0.92 ± 0.02; He 0.65 ± 0.02) gharial populations using the same set of microsatellite loci. The estimated mean allelic richness was also lower in the wild population (Chambal = 2.91 ± 0.60; Girwa = 2.48 ± 0.48) than that reported in captivity (5.5 ± 0.5) 23 . A low level of heterozygosity is not rare in wild crocodylians. Similar levels of heterozygosity have been reported in other crocodylians, including Alligator sinensis 24 , Alligator mississippiensis 25 , Caiman yacare 26 , Crocodylus siamensis 27 , Crocodylus mindorensis 28 , Crocodylus moreletii 29,30 , and Crocodylus palustris 31,32 , as well as in other vertebrate species that have experienced demographic bottlenecks 33 . Moreover, the Girwa population exhibits two and Chambal population exhibits a single mitochondrial CR haplotype 22 . The presence of extremely low variation at a hypervariable locus is surprising because, in other crocodylian species, the reported haplotypic diversity at mitochondrial CR is considerably higher [34][35][36][37][38] . The genetic diversity of a species is influenced by its geographic range, abundance, demography, and life-history traits. Narrow-ranging, less abundant and long-lived species with demographically challenged population history tend to have lower genetic diversity than widely distributed, abundant species with stable demographic history [39][40][41] . The low abundance, narrow distribution range, complex life-history traits, and unstable demography of the gharial may have resulted in the observed low level of genetic diversity. The coefficient of inbreeding (F), primarily measures the deviation from HWE (where F = 0 indicates that the population is at HWE) and is often only weakly correlated with inbreeding and fitness 42,43 . The estimate is likely to be affected by several factors including use of a limited number of loci, closely related individuals, genetic bottleneck, gene flow and admixture [44][45][46] . Additionally, populations that experience demographic bottlenecks do not necessarily become inbred. Our estimate of F ≈ 0, shows that the populations are at or near HWE.
Population genetic structure, differentiation, and migration. The presence of population genetic structure was supported by Structure (Fig. 2a) and genetic differentiation indices (F ST = 0.039, P < 0.05; G' ST = 0.058, P < 0.05 Jost's D = 0.016, P < 0.05) but not by multivariate analysis DAPC (Fig. 2b). The Bayesian approach implemented in Structure is a model-based approach that largely depends on the assumption of population genetics models. In contrast, DAPC is a model-free approach that does not rely on population genetics models. DAPC performs better than Structure in characterizing population clusters 47 . Therefore, we believe that the result of DAPC is more reliable and appropriate for our dataset. The observed low level of differentiation and an admixed population structure suggested by DAPC hint towards possible intermixing of gharial individuals during the translocation programme when eggs from both the Chambal and Girwa Rivers were reared in the Kukrail centre.
Furthermore, we observed a migratory relation between the two populations. This supports our hypothesis of possible mixing of individuals during the translocation programme. It is important to note that the observed migration rate from Chambal to Girwa was higher than that from Girwa to Chambal, which is possible due to large number of eggs were sourced from Chambal and later released in the Girwa River. We could not perform genetic structure, differentiation, and migration analyses using mitochondrial CR because of low variability in the analysed region.
Bottleneck. The estimates of the M ratio (Chambal = 0.31 ± 0.06; Girwa = 0.41 ± 0.12) were lower than the critical value (M C = 0.68), confirming the signature of genetic bottlenecks in both populations. In contrast, the HET failed to detect the signature of a genetic bottleneck. This discrepancy between the results could be due to the limited statistical power of the two analyses. The M ratio can effectively detect genetic bottlenecks for a greater amount of time (up to 50 generations), whereas HET can efficiently detect bottleneck that occurred within a short time, 0.2-4.0 Ne (effective population size) generations 48,49 . The gharial population has suffered two instances of range-wide population decline during the early-1970s and mid-1990s, when the population dwindled to less than 200 adult individuals in the wild 6,20 . The average generation time estimates for gharial individuals is 25 years 7 , suggested that the latter decline in population occurred very recently (approx. 1-1.5 generation ago). Hence, the undetectability of the bottleneck through HET could be due to the recent bottleneck event. Additionally, bottleneck detection can be masked due to various confounding factors including time of occurrence, duration, magnitude, gene flow, pre-bottleneck genetic variability, mating system, sample size and a limited number of loci [50][51][52] . Hence, there is a need to investigate both of these populations with extensive microsatellite loci to achieve high statistical power 49,53,54 .
In conclusion, low polymorphism was observed in species-specific and cross-species microsatellite loci; it remains crucial to develop novel microsatellite loci for population genetic studies. We acknowledge that our study might have been influenced by the inclusion of closely related individuals from each nesting site. However, the sampling was conducted from a large number of nests spread across 19 nesting sites. Therefore, we believe that our sampling strategy potentially represents the level of genetic diversity of the studied populations and minimises the bias arising from closely related individuals.

Conservation implication.
Gharial translocation is considered as one of the most successful species recovery programmes in the world 12 in terms of increasing the abundance of the species in the wild. However, despite a long history of conservation efforts, limited information on the genetic status of the gharial has hindered, if Scientific Reports | (2021) 11:5627 | https://doi.org/10.1038/s41598-021-85201-w www.nature.com/scientificreports/ not precluded, effective conservation planning. Gharial populations inhabit different biogeographic zones of India and are adapted to specific environmental conditions, yet translocation programmes have not considered the effects of genetic intermixing among populations. Such intermixing of populations without prior knowledge of the genetic structure can potentially threaten the genetic integrity and distinctiveness of the resident populations 14 . Our study augments the existing knowledge on the genetic status of the critically endangered gharial, which is vital for future translocations and research. This study highlighted low levels of genetic diversity and admixed structure in the two largest managed populations of gharial in the wild, which are considerably lower than the previously reported estimates of captive populations 23 . Moreover, the gharial is a highly conservation-dependent species, and translocations without assessing the genetic status of the source populations may further deteriorate the level of genetic diversity in the translocated populations. Hence, we recommend limiting the interpopulation release of individuals to prevent further intermixing of the gene pool until information on the genetic status of the extant wild and captive gharial populations using an extensive dataset of mitochondrial and microsatellite markers is available. Information on the genetic status will assist in the identification of potential source populations and maintain adequate levels of genetic diversity to secure the continuing persistence of gharials in the wild.

Sample collection and DNA extraction. Sampling of a large number of adult individuals required for
population genetic assessments has its own ethical and logistical constraints, especially when the species in question is critically endangered. Considering these restrictions, we collected tissue from the remains of dead hatchlings and chorioallantoic membranes from hatched eggshells shortly after hatching. Samples were collected from 16 nesting sites along the Chambal River and three nesting sites along the Girwa River during 2017 and 2018, respectively (Fig. 1). We collected 348 samples (Chambal = 232 samples and Girwa = 116 samples) for genetic assessment (see Supplementary Table S3 online). Out of 348 samples, 49 samples were unique (one sample per clutch), 138 samples were from a sibling (more than one sample per clutch), and 161 samples had no clutch information (see Supplementary Table S4 online). All samples were stored in absolute ethanol at room temperature and later at − 20 °C in the laboratory for long-term storage. Total genomic DNA was extracted from tissue (64 ± 19 mg) (mean ± SD) and chorioallantoic membrane using the phenol-chloroform method 55 . Cotton swabs were used when chorioallantoic membrane was unable to separate from eggshells.
Microsatellite selection, screening and genotyping. We initially screened 11 of the 18 gharial specific microsatellite loci described by Jogayya et al. 23 . However, only six of the eleven loci were found to be polymorphic. Hence, we conducted an exhaustive literature survey to identify the potential cross-species loci used across families Alligatoridae, Crocodylidae and Gavialidae. We found a total of 424 previously described microsatellite loci and listed the loci based on successful cross-species transferability, allele diversity, heterozygosity, polymorphic information content and allele range. Finally, we selected 16 cross-species loci developed in eight different crocodylian species for screening. Out of 16, a total of 12 loci were selected based on successful crossspecies transferability in other crocodilian species. The remaining four loci were selected based on a number of alleles ≥ 4, Ho and He ≥ 5, PIC ≥ 0.5 and allelic range < 300 bp. The list of microsatellite loci screened in the current study is provided in Supplementary Table S1 online.
Polymerase chain reactions (PCR) were performed in 10 μL reaction volumes containing 5 μL of 2 × QIAGEN Multiplex PCR master mix, 2 μL of 5 × Q-solution (QIAGEN Inc., Germany), and the labeled forward primer at 0.15 μM and the reverse primer at 0.15 μM and 2 μL of DNA template. The thermal profiles included an initial denaturation at 95 °C for 15 min, followed by 35 cycles at 94 °C for 40 s, Ta at 56-62 °C (see Supplementary  Table S1 online) for 60 s, 72 °C for 60 s and a final extension of 72 °C for 30 min. The amplified products were genotyped using the GeneScan 500 LIZ dye size standard (Applied Biosystems) in 3500XL Genetic Analyzer (Applied Biosystems). The alleles were scored using the program GeneMarker v2.7.4 (SoftGenetics, LLC) with combined automated allele scoring and validated through visual inspection. Three replicates of each sample were carried out to obtain reliable multilocus genotypes using noninvasive samples following a multiple-tube approach 56 .
Mitochondrial CR DNA sequencing. We selected a 520 bp partial fragment mitochondrial CR to assess the genetic variation in gharials. We used primers (L15637 5′-GCA TAA CAC TGA AAA TGT TAA YAT GG-3′ and H16258 5′-CTA AAA TTA CAG AAA AGC CGA CCC -3′) described by Oaks (2011) to amplify the selected fragment 57  Data analysis. Genotyping quality and error rates. We obtained a consensus genotype and estimated the quality index for each locus genotyped following Miquel et al. 56  www.nature.com/scientificreports/ was identical to the consensus and '0' if the genotype did not match the consensus due to any errors such as no amplification, allelic dropout, or false allele 56 . We estimated the average amplification success as a percent of positive PCR amplification. We estimated the frequency of genotyping errors due to allelic dropout (ADO) and false allele (FA) following Broquet et al. 58 . The frequency of null alleles was estimated using FreeNa 59 and occurrences of large allelic dropout using Micro-Checker v2.2.3 60 . We used samples with a quality index above 0.67 per locus (identical genotypes in two replicates out of three replicates) and a mean quality index across loci above 0.75 for analyses.
Genetic variation. We calculated the probability of identity PID (biased) and PID (sibs) using Gimlet v1.3.3 61 . We estimated summary statistics (number of alleles per locus, observed heterozygosity, expected heterozygosity, and inbreeding coefficient) using GenAlEx v6.0 62 and deviations from Hardy-Weinberg equilibrium (HWE) for each locus using Bonferroni correction in Cervus v3.0.7 63 . We also estimated allelic richness using a rarefaction approach implemented in HP-Rare v1.1 to account for the uneven sample size of the two populations 64 .
Population structure, differentiation, and migration. We have estimated the genetic differentiation using fixation indices F ST 65 , G' ST 66 and allelic differentiation index Jost's D 67 using the strataG package 68 implemented in R studio. The estimates were calculated using 10 3 bootstrap iterations. We inferred the population genetic structure using the Bayesian approach implemented in Structure v2.3.4 69 . Structure is a systematic model-based Bayesian clustering approach that uses allele frequencies at each locus to infer the population structure. The analysis was performed for 1-10 clusters (K). For each K, ten iterations were run under the admixture model with correlated allele frequencies and sampling location as a priori. The LOCPRIOR (sampling location as a priori) model performs well when no clear signal of a structure is detected or when there is low genetic differentiation, limited loci or a limited sample size 70 . The simulations were run for 10 5 burn-in and 10 6 Markov chain Monte Carlo iterations (MCMC). The optimum number of K was inferred using the likelihood distribution L(K) and the delta K 71 , which were estimated using the web version of Structure Harvester, v0.6.94 72 . The assignment plot was prepared using the program Distruct v1.1 73 . Additionally, we used the DAPC, a multivariate nonmodel-based approach, to identify and describe genetic clusters 47 . The optimal number of the clusters was estimated based on the lowest associated BIC. The analysis was performed using the adgenet in R studio.
We used BayesAss v3.0 to estimate the recent migration rate (m) between Chambal and Girwa populations 74 . The simulations (n = 3) were run with different seed numbers using 10 7 MCMC iterations and 10 6 burn-in periods.
Bottleneck detection. We estimated demographic changes using two qualitative approaches: (a) the Garza-Williamson index (or M ratio) implemented in Arlequin v3.1 75 and (b) the HET approach implemented in Bottleneck v1.2.02 76 . The M ratio estimates the ratio of the observed number of alleles to the size of allele range based on the assumption that in a recently reduced population, the ratio is expected to decrease due to random loss of alleles in a population. The calculated M ratio was then compared with the critical value (M C = 0.68). An M ratio below M C is considered as a signature of genetic bottleneck 77 . The HET approach assumes that in a recently reduced population, an excess of the gene diversity under Hardy-Weinberg equilibrium is expected relative to gene diversity under mutation-drift equilibrium. We used one-tailed Wilcoxon test to determine the presence of a significant number of loci with excess heterozygosity. The estimates were calculated under two mutation models: the SMM and TPM. The TPM tends to be the most appropriate mutation model for microsatellite loci 78 . The TPM was carried out at 95% SMM (variance at 12), and the simulations were run for 10 4 iterations 76 .
Mitochondrial DNA variation. We generated 63 mitochondrial CR sequences from the Girwa population and submitted it to GenBank (Accession No. MT500792-MT500854). We also obtained previously published mitochondrial CR sequences of the Chambal population (n = 103) from GenBank (Accession No. MT458816-MT458918) 22 . The sequences were aligned using the ClustalW algorithm 79 in BioEdit v7.2.6 80 . The summary statistics, including the number of haplotypes, haplotype and nucleotide diversity, were estimated using DnaSP v5.10.01 81 .

Data availability
The mitochondrial sequences used in this study have been submitted to GenBank (Accession Nos. MT500792-MT500854). The microsatellite genotyping data used in the study is available on request from the corresponding author.