Whole-genome resequencing of Chinese pangolins reveals a population structure and provides insights into their conservation

Wang, Qing; Lan, Tianming; Li, Haimeng; Sahu, Sunil Kumar; Shi, Minhui; Zhu, Yixin; Han, Lei; Yang, Shangchen; Li, Qian; Zhang, Le; Deng, Zhangwen; Liu, Huan; Hua, Yan

doi:10.1038/s42003-022-03757-3

Article
Open access
Published: 25 August 2022

Qing Wang ORCID: orcid.org/0000-0003-4744-8729^1,2^na1,
Tianming Lan ORCID: orcid.org/0000-0002-6934-0439^2,3,4^na1,
Haimeng Li^1,2^na1,
Sunil Kumar Sahu ORCID: orcid.org/0000-0002-4742-9870²,
Minhui Shi^1,2,
Yixin Zhu^1,2,
Lei Han⁵,
Shangchen Yang⁶,
Qian Li^1,2,
Le Zhang⁵,
Zhangwen Deng⁷,
Huan Liu ORCID: orcid.org/0000-0003-3909-0931^2,3,4 &
Yan Hua ORCID: orcid.org/0000-0001-8316-3937⁸

Communications Biology volume 5, Article number: 821 (2022)

Abstract

Poaching and trafficking have had a substantial negative impact on the population growth and range expansion of the Chinese pangolin (Manis pentadactyla). However, recently reported activity of Chinese pangolins at several sites in Guangdong province, China, is a promising sign for the recovery of this threatened species. Here, we re-sequence the genomes of 15 individuals and perform comprehensive population genomic analyses together with 22 previously published individuals. These Chinese pangolins are found to be divided into three distinct populations. Multiple lines of evidence indicate the existence of a newly discovered population (CPA) composed entirely of individuals from Guangdong province. The other two populations (CPB and CPC) have previously been documented. The genetic differentiation between CPA and CPC is extremely large (F_ST = 0.541), larger than many subspecies-level differentiations. Even for the more closely related CPA and CPB, the differentiation (F_ST = 0.101) is still comparable with the population-level differentiation of many endangered species. Further analysis reveals that the CPA and CPB populations separated 2.5–4.0 thousand years ago (kya), whereas CPA and CPC diverged around 25–40 kya. The CPA population harbors more runs of homozygosity (ROHs) than the CPB and CPC populations, indicating that inbreeding is more prevalent in the CPA population. Although the CPC population has a lower mutational load than the CPA and CPB populations, we predict that several Loss of Function (LoF) mutations would be introduced into the CPA or CPB populations if CPC were used as a donor population for genetic rescue. Our findings imply that the conservation of Chinese pangolins is challenging, and that genetic rescue among the three groups should be implemented with extreme caution.

Introduction

Small populations are particularly prone to collapse owing to the loss of genetic diversity, the accumulation of deleterious mutations, and changes in genetic make-up resulting from genetic drift and increased inbreeding^1,2. Establishing gene flow between populations, known as genetic rescue, is an effective way to improve the fitness of small isolated populations by increasing genetic diversity and decreasing inbreeding^3,4,5. For example, genetic rescue substantially improved the fitness of small inbred populations of African lions (Panthera leo)⁶, mountain pygmy possums (Burramys parvus)⁴ and Florida panthers (Puma concolor coryi)⁷. However, devising a suitable strategy for genetic rescue depends heavily on a comprehensive investigation of the genetic background of these small populations, including their genetic diversity, population differentiation, mutational load, gene flow, local adaptation, and the extent of inbreeding and outbreeding^2,8,9.

Molecular markers have long been widely used in conservation genetics to guide the protection of threatened species, with microsatellites and mitochondrial DNA being the most promising markers^10,11,12. However, these markers have obvious limitations, including the limited information they provide on local adaptation, inbreeding/outbreeding depression and adaptive variation¹³. The bias that limited genetic information introduces into parameter estimates must also be kept in mind when neutral markers are used^14,15; the evaluation of the genetic diversity of the giant panda (Ailuropoda melanoleuca) is a good example of this issue¹⁶. Fortunately, all of the above limitations can be greatly alleviated in the era of whole-genome sequencing. With the rapid development of sequencing technology and the plummeting cost of re-sequencing, conservation genetics is transitioning to conservation genomics¹⁷. Recently, several genome-based studies of endangered animals have addressed long-standing questions, such as the purging of deleterious mutations in Bengal tigers (Panthera tigris tigris)¹⁸, local adaptation in the giant panda¹⁹, and the details of inbreeding depression in the kākāpō (Strigops habroptilus)²⁰, which would have been unthinkable without genome-wide markers.

Pangolins belong to the placental mammal order Pholidota, one of the most unusual orders of mammals owing to their overlapping epidermal scales, myrmecophagous diet, lack of teeth, and extraordinarily elongated tongue^21,22. They play important roles in ecosystems as predators of social insects, creators of burrows, hosts of endo- and ectoparasites, and prey for other predators^23,24,25. Humans, on the other hand, have extensively killed and exploited pangolins as a luxury delicacy and used their scales in traditional medicines^26,27,28. Because of this overexploitation, all eight extant Pholidota species have undergone severe population declines²⁹. Of the eight pangolin species, the Chinese pangolin (Manis pentadactyla) is one of the most threatened, with a ~94% decline in its total population between the 1960s and 1990s, largely due to extensive illegal trade in Asia^30,31. It is classified as critically endangered on the IUCN Red List of Threatened Species and listed in ‘Appendix I’ of the Convention on International Trade in Endangered Species of Wild Fauna and Flora (CITES)³². This species was once distributed across vast areas south of the Yangtze River in China and some northern areas of Southeast Asia^30,33,34. However, new burrows have not been seen for more than 20 years in some areas of the Dawuling Natural Reserve and Luofushan Natural Reserve in Guangdong province^35,36. Moreover, the Chinese pangolin might have been extirpated from several areas, including Jiangsu, Shanghai and Henan^37,38.

Most research on pangolins has been restricted to ecological studies or to genetic studies based only on mitochondrial DNA and microsatellite markers^38,39,40. Recently, Hu et al.⁴¹ reported the population identities of illegally traded individuals and revealed population fluctuations as well as increases in inbreeding and mutational load in Chinese pangolin populations, providing valuable information for the conservation of this species. Further, great progress has been made towards the conservation of the Chinese pangolin: since 2020, several individuals have been photographed by infrared cameras in the wild in at least six cities of Guangdong province⁴², a promising sign for the rescue and recovery of this species. Although Hu et al.’s study included samples from Taiwan⁴³, Yunnan province and some unknown geographical locations, more samples from Guangdong province are needed for a joint analysis that reveals the genomic make-up of this critically endangered species more comprehensively.

Planning genetic rescue requires a thorough assessment of the genetic background from a variety of perspectives; providing such an assessment to aid the continued conservation of Chinese pangolins is the prime objective of our research. In this study, we intensively investigated the genomic characteristics of Chinese pangolins from Guangdong province. Together with previously published genomes, we revealed the population structure of Chinese pangolins and extensively explored the genetic diversity, population differentiation, mutational load, gene flow and local adaptation of each population. Our study provides a valuable resource and insights for the future conservation of this endangered species.

Results

Characteristics of sequencing data and variants

The sequencing depth of the 15 Chinese pangolin individuals ranged from 13.17× to 22.85×, with an average depth of 17.03 ± 2.43× (Supplementary Table 1). Together with the published genome sequencing data from 22 other individuals, we obtained 40,855,034 raw single-nucleotide polymorphisms (SNPs) for the whole pangolin population. After stringent filtering, 35,023,399 SNPs were retained for further analysis. As expected, most of the SNPs were located in intergenic regions. The distribution of SNPs by minor allele frequency (MAF) showed that 49.3% of SNPs (20,155,038 SNPs) had a MAF below 5%, indicating a large proportion of low-frequency variants in the pangolin population (Supplementary Fig. 1). In addition, we observed significant variability in the number of SNPs among populations (P = 7.5 × 10⁻¹¹ ~ 1.1 × 10⁻¹²) (Supplementary Fig. 2).

Genetic relationships among Chinese pangolin populations

We first constructed phylogenetic trees with nuclear genomes and mitochondrial gene sequences and confirmed that these Guangdong individuals are Chinese pangolins (Supplementary Figs. 3–4). We then extensively explored the genetic relationships of the 37 Chinese pangolins at the whole-genome level. In general, results from principal component analysis (PCA), ADMIXTURE analysis and the phylogenetic tree supported the assignment of individuals to three distinct clusters, the CPA, CPB and CPC populations (Fig. 1b–d). The fifteen individuals from Guangdong province fell into two clusters: one formed only by Guangdong individuals (CPA: n = 11) and another (CPB: n = 18) consisting of individuals from Guangdong (n = 4), Taiwan (n = 1) and Yunnan (n = 13). The CPC population (n = 8) was made up of individuals that might have originated from Myanmar⁴¹.

**Fig. 1: Distribution and population structure of Chinese pangolins in southern China.**

Genetic differentiation and gene flow among populations

Genome-wide population differentiation further revealed that CPA was distinct from CPC, with an extremely high fixation index (F_ST) of 0.541. The F_ST between the more closely related CPA and CPB populations was still larger than 0.1 (Supplementary Fig. 5). The F3 statistics were positive for every combination, supporting the conclusion that none of the three pangolin populations arose from admixture between the other two (Fig. 2a).
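
To make the fixation index concrete, the sketch below computes F_ST from per-SNP allele frequencies with the Hudson estimator in its ratio-of-averages form. It is an illustration only: this section does not state which estimator or software produced the reported values, and the function name and inputs are hypothetical.

```python
def hudson_fst(freqs_1, freqs_2, n_chrom_1, n_chrom_2):
    """Hudson F_ST (ratio of averages) from per-SNP allele frequencies.

    freqs_1, freqs_2: frequencies of the same allele at the same SNPs
    n_chrom_1, n_chrom_2: sampled chromosomes per population
        (2 x individuals for diploids, e.g. 22 for 11 individuals)
    NOTE: the choice of estimator is an assumption made for illustration.
    """
    numerator = 0.0
    denominator = 0.0
    for p1, p2 in zip(freqs_1, freqs_2):
        # Squared frequency difference, corrected for finite sample size.
        numerator += ((p1 - p2) ** 2
                      - p1 * (1 - p1) / (n_chrom_1 - 1)
                      - p2 * (1 - p2) / (n_chrom_2 - 1))
        # Probability that two alleles drawn from different populations differ.
        denominator += p1 * (1 - p2) + p2 * (1 - p1)
    if denominator == 0:
        return float("nan")  # no variable sites in either sample
    return numerator / denominator
```

Values near 0 indicate little differentiation and values near 1 indicate nearly fixed differences; small negative values can occur through the sample-size correction when two populations have almost identical frequencies.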

**Fig. 2: Genetic differentiation and gene flow analysis among Chinese pangolins.**

To explore possible genetic exchange among these Chinese pangolin populations, we performed the ABBA-BABA test to quantify the sharing of derived alleles between populations. We set the Malayan pangolin (Manis javanica) as Y (the outgroup); A, B and X were drawn from all possible combinations of the CPA, CPB and CPC populations. Interestingly, when X was set to the CPC population, we were unable to detect any significantly unbalanced sharing of derived alleles, supporting the large differentiation between CPC and the other two populations (Fig. 2b). We also performed a TreeMix analysis to further quantify gene flow among populations. However, no migration events were found among the three populations in tests from m = 1 to m = 10 (Fig. 2c).
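
As a rough illustration of what the ABBA-BABA test measures, the sketch below computes Patterson's D from derived-allele frequencies in A, B, X and the outgroup Y. The function name, the input layout and the sign convention (positive D meaning excess sharing between B and X) are assumptions for this example. Significance is normally assessed with a block jackknife, which is not shown here.

```python
def d_statistic(sites):
    """Patterson's D from per-site derived-allele frequencies.

    sites: iterable of (p_a, p_b, p_x, p_y) tuples, where p_y is the
    derived-allele frequency in the outgroup (usually 0).
    Sign convention (an assumption of this sketch): D > 0 means B shares
    more derived alleles with X than A does.
    """
    abba = 0.0
    baba = 0.0
    for p_a, p_b, p_x, p_y in sites:
        # ABBA: derived allele in B and X, ancestral in A and Y.
        abba += (1 - p_a) * p_b * p_x * (1 - p_y)
        # BABA: derived allele in A and X, ancestral in B and Y.
        baba += p_a * (1 - p_b) * p_x * (1 - p_y)
    total = abba + baba
    if total == 0:
        return float("nan")  # no informative sites
    return (abba - baba) / total
```

A D value close to zero, as described above for tests with CPC as X, corresponds to balanced ABBA and BABA counts.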

Further, identity-by-descent (IBD) analysis showed no large shared segments (>1 Mb) between populations. For medium-size segments (100 kb < IBD < 1 Mb), we identified 0.81 Mb shared between CPA and CPB, 0 Mb shared between CPA and CPC, and 4.44 Mb shared between CPB and CPC (Supplementary Fig. 6). However, many more and larger IBD segments were found within populations (Supplementary Fig. 7). The total lengths of IBD in CPA, CPB and CPC were 251.75 Mb (10.49%), 51.45 Mb (2.14%) and 48.38 Mb (2.02%), respectively.

Population separation among CPA, CPB and CPC populations

Considering multiple lines of evidence from population structure and gene flow, we inferred that the separation between CPA and CPB was more recent than that between CPC and CPA/CPB. As expected, the relative cross coalescence rate (RCCR) estimation showed that the separation between CPA and CPB occurred 2.5 to 4.0 thousand years ago (kya), much more recently than that between CPC and CPA (25–40 kya) and between CPC and CPB (25–40 kya) (Fig. 3a). Because phasing errors could influence the accuracy of the MSMC2 method, we further ran SMC++ to validate the MSMC2 results. Again, we found that the CPA and CPB populations separated at ~3.4 kya and that CPC separated from CPA and CPB at ~25 kya (Fig. 3b), supporting the results from MSMC2.

**Fig. 3: Population separation inferences in each pair of Chinese pangolin populations.**

Genetic diversity and inbreeding

The richness of genetic diversity in a population reflects its evolutionary potential to adapt to environmental changes⁴⁴. The average heterozygosity (He) across all individuals was 0.18% when calculated from all SNPs. This value decreased to 0.15% when low-frequency alleles (allele frequency (AF) < 5%) were excluded. To better compare our results with the previous report, we also calculated the average He (0.11%) using SNPs with AF > 20% (Supplementary Fig. 8a). Similar patterns were found for nucleotide diversity (π), with the CPB population having the highest level of genetic diversity (Supplementary Fig. 8b).

Using the same method as Hu et al.⁴¹, we found 77,251 runs of homozygosity (ROHs) ranging from 100.0 kb to 3155.8 kb in the whole Chinese pangolin population. Of these ROHs, 448.58 Mb (F_ROH = 18.69%) and 21.57 Mb (F_ROH = 0.90%) were assigned to medium-size (100 kb–1 Mb) and long (>1 Mb) ROHs, respectively (Fig. 4c, d; Supplementary Figs. 8c, d). Medium-size ROHs in the CPA, CPB and CPC populations totaled 469.38 Mb (F_ROH = 19.56%), 451.63 Mb (F_ROH = 18.82%) and 413.52 Mb (F_ROH = 17.23%), respectively. Long ROHs in the CPA, CPB and CPC populations totaled 29.54 Mb (F_ROH = 1.23%), 22.28 Mb (F_ROH = 0.93%) and 9.11 Mb (F_ROH = 0.38%), respectively.
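
The F_ROH values above are the summed ROH length divided by a genome length. The reported pairs imply a denominator of about 2,400 Mb (448.58 Mb / 0.1869 ≈ 2,400 Mb); that figure is back-calculated here and is not stated in the text. A minimal sketch of the calculation, with hypothetical function and parameter names and the size classes taken from the paragraph above:

```python
def f_roh(roh_lengths_bp, genome_length_bp,
          min_bp=100_000, long_threshold_bp=1_000_000):
    """Return (F_ROH for medium ROHs, F_ROH for long ROHs).

    roh_lengths_bp: ROH segment lengths in base pairs for one individual
    genome_length_bp: denominator; which length the authors used is not
        stated in this section, so the caller must supply it
    Size classes follow the text: medium 100 kb to 1 Mb, long > 1 Mb.
    """
    medium = sum(length for length in roh_lengths_bp
                 if min_bp <= length <= long_threshold_bp)
    long_total = sum(length for length in roh_lengths_bp
                     if length > long_threshold_bp)
    return medium / genome_length_bp, long_total / genome_length_bp
```

Segments shorter than the 100 kb minimum are ignored, matching the lower bound of the ROH lengths reported above.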

**Fig. 4: Genetic diversity and ROH distribution in the Chinese pangolin genome.**

Mutational load

On average, we found 266 ± 55 deleterious nonsynonymous SNPs (nsSNPs) across all three populations (Supplementary Fig. 9a). Individuals in the CPC population (169 ± 15) carried significantly fewer deleterious nsSNPs than those in the CPA (287 ± 15; P = 1.7 × 10⁻¹⁰) and CPB (298 ± 20; P = 8.5 × 10⁻¹²) populations. However, the ratio of deleterious nsSNPs to total nsSNPs did not differ significantly among populations (P = 0.8~0.9; Supplementary Fig. 9b). We also counted the Loss of Function (LoF) variants in each population (from 785 to 1250; Supplementary Fig. 9c). The number of LoF variants in the CPC population was significantly lower than in the CPA and CPB populations (P = 1.4 × 10⁻⁷ and P = 5.5 × 10⁻⁹, respectively). However, the CPC population had a significantly higher ratio of LoF variants to total nsSNPs than the CPA and CPB populations (P = 8.5 × 10⁻⁸ and P = 2.9 × 10⁻⁸, respectively; Supplementary Fig. 9d).

We further evaluated the level of mutational load by calculating the ratio of the number of homozygous sites to the total number of homozygous and heterozygous sites (hereafter, the ratio) for LoF mutations, missense mutations and deleterious missense mutations. In the 36 Chinese pangolins, this ratio was 0.505, 0.482 and 0.432 for LoF, missense and deleterious missense mutations, respectively (Fig. 5a). The ratios for LoF mutations (0.545, 0.522), missense mutations (0.575, 0.540) and deleterious missense mutations (0.533, 0.485) in the CPA and CPB populations were very similar to each other, but significantly higher than those in the CPC population (0.412 for LoF mutations, 0.232 for missense mutations and 0.178 for deleterious missense mutations) (Fig. 5a, Supplementary Table 4).
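
The ratio defined above can be sketched for a single individual as follows. The function name and the VCF-style genotype strings are illustrative, and the sketch assumes that "homozygous" means homozygous for the non-reference allele; the text does not spell this out.

```python
def homozygous_ratio(genotypes):
    """Homozygous / (homozygous + heterozygous) for one variant class.

    genotypes: VCF-style GT strings ("0/1", "1|1", "./.") for one
    individual at the sites of one class (e.g. LoF sites).
    ASSUMPTION: only homozygous non-reference calls count as homozygous;
    homozygous-reference and missing calls are skipped.
    """
    hom = 0
    het = 0
    for gt in genotypes:
        alleles = gt.replace("|", "/").split("/")
        if len(alleles) != 2 or "." in alleles:
            continue  # missing or malformed call
        if alleles[0] != alleles[1]:
            het += 1
        elif alleles[0] != "0":
            hom += 1
    total = hom + het
    return hom / total if total else float("nan")
```

Under this definition, a higher ratio means that a larger share of the variants an individual carries is present in the homozygous state.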

**Fig. 5: Mutation load estimation for the pangolin population.**

We further estimated the number of LoF mutations that a donor population could introduce into a receiving population. When CPA was regarded as the receiving population, CPB individuals would introduce 120–238 LoF mutations and CPC individuals would introduce 336–381. The fewest LoF mutations (103–135) would be introduced into the CPB population from the CPA population. However, this number rises to 393–555 when CPA and CPB individuals are introduced into the CPC population (Fig. 5b).
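
One plausible way to obtain such counts is sketched below. How the prediction was actually made is not described in this section, so the rule used here is an assumption: a LoF mutation counts as "introduced" when a donor individual carries it and no individual of the receiving population does. All names are hypothetical.

```python
def introduced_lof_counts(donors, recipients):
    """Count LoF variants each donor would add to a receiving population.

    donors, recipients: dicts mapping individual ID -> set of LoF variant
    IDs carried by that individual (for example "scaffold:position:alt").
    ASSUMPTION: a variant is new to the receiving population if none of
    its sampled individuals carries it.
    """
    recipient_pool = set()
    for variants in recipients.values():
        recipient_pool |= variants
    # Per-donor counts give a range, like the ranges reported in the text.
    return {name: len(variants - recipient_pool)
            for name, variants in donors.items()}
```

Taking the minimum and maximum of the returned counts would give a per-donor range for one donor and recipient pair.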

We then analyzed the 2229, 1044, 1574 and 962 genes affected by LoF mutations in all 36 individuals and in the CPA, CPB and CPC populations, respectively. In general, significantly enriched Gene Ontology (GO) categories and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways indicated that these genes were associated with male gamete generation (GO:0048232; P = 3.8 × 10⁻¹³), visual perception (GO:0007601; P = 8.8 × 10⁻¹⁰) and protein digestion and absorption (KEGG: hsa04974; P = 3.0 × 10⁻⁶) (Supplementary Tables 5–8). The KEGG pathway “protein digestion and absorption” was significantly enriched (P < 0.01) in all three populations. In addition, the insulin signaling pathway (KEGG: hsa04910; P = 6.4 × 10⁻⁵), cellular carbohydrate metabolic process (GO:0044262; P = 1.9 × 10⁻⁴) and response to fructose (GO:0009750; P = 2.0 × 10⁻⁴) were enriched in the CPB population. Notably, we did not find enriched gametogenesis-related pathways, such as male gamete generation, in the CPC population.

Recent positive selection in Chinese pangolin populations

A total of 2218 (CPA), 3154 (CPB) and 1878 (CPC) SNPs were identified as being under strong positive selection, with iHS scores in the top 0.1% (Supplementary Fig. 10). After filtering out SNPs in non-genic regions, 176, 299 and 277 genes were located in the positively selected regions of the CPA, CPB and CPC populations, respectively. Eleven genes were shared among all three populations, whereas 130, 226 and 219 genes were unique to the CPA, CPB and CPC populations, respectively. Enrichment analysis of these selected genes showed that 14, 13 and 16 GO categories or KEGG pathways were unique to the CPA, CPB and CPC populations, respectively. These categories were associated with various biological functions, including cellular processes, biological regulation and metabolic processes (Supplementary Fig. 11).
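
The top-0.1% cutoff can be sketched as a simple ranking step. The function below is a hypothetical helper, and ranking by the absolute value of iHS is an assumption; the text only says "iHS scores in the top 0.1%".

```python
def top_ihs_snps(scores, per_thousand=1):
    """Return the SNPs whose |iHS| falls in the top per_thousand/1000.

    scores: list of (snp_id, ihs_score) tuples for one population
    ASSUMPTION: SNPs are ranked by absolute iHS; the exact ranking rule
    is not given in this section.
    """
    ranked = sorted(scores, key=lambda item: abs(item[1]), reverse=True)
    # Integer arithmetic avoids floating-point surprises in the cutoff.
    keep = max(1, len(ranked) * per_thousand // 1000)
    return ranked[:keep]
```

The selected SNPs would then be intersected with gene annotations to obtain candidate genes, as described above.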

Discussion

Here we report what is, to the best of our knowledge, the first whole-genome-scale genetic survey of the Chinese pangolin from Guangdong province, China. We thoroughly examined the population structure of Chinese pangolins by combining the Guangdong samples with published sequences, which presumably cover the majority of the distribution area of this species. We systematically investigated the genomic make-up of Chinese pangolins, which have experienced long-term decline, with a particular focus on the Guangdong population; our assessment of genetic diversity, ROH and mutational load is expected to support the future conservation and genetic rescue of this species.

Although two pangolin individuals in Hu et al.’s study (MP09 and MP10) were reported to be possibly from Guangdong province, the population structure could not be described in further detail because of the ambiguous sampling sites and small sample size⁴¹. Benefiting from the relatively large sample collected from several sites in Guangdong province, we defined a new Chinese pangolin population, CPA, from a genomic perspective, supported by multiple lines of evidence (Fig. 1b–d). Negligible gene flow further supports the independent status of the CPA population (Fig. 2).

The very large genetic differentiation between the CPA and CPC populations (F_ST = 0.541) suggests a potential subspecies-level divergence when compared with other animals. For example, the Sichuan and Qinling giant panda subspecies were defined in 2005 on the basis of skull and molar size⁴⁵, and their genomic differentiation (F_ST = 0.140)¹⁶ is much lower than that between the CPA and CPC populations. The differentiation of tiger subspecies could date back to 110 kya⁴⁶; however, the genomic differentiation among tiger subspecies is still much lower than that between these two Chinese pangolin populations, with the highest F_ST value, between Amur and Sumatran tigers, being 0.318⁴⁷. This was not surprising, because the CPC population was genetically distinct from CPA and CPB (Supplementary Fig. 5), a pattern also found in Hu et al.’s study, with an F_ST of 0.495 between MPA and MPB⁴¹.

Interestingly, the genomic differentiation between CPA and CPB was moderate, with an F_ST value of 0.101. Although much lower than that between CPA and CPC, this level of differentiation is comparable with that between populations with extensive geographic separation, such as African leopard (Panthera pardus) populations (average F_ST = 0.104)⁴⁸ and the two largest isolated populations of giant pandas (the Minshan and Qionglai populations)¹⁶, and is even comparable to that between African and Asian humans (F_ST = 0.120)⁴⁹, which are known to have separated before ~55 kya⁵⁰. Taken together, we infer that the newly discovered CPA is a distinct Chinese pangolin population, and we suggest treating it as an additional conservation unit alongside the CPB and CPC populations. We believe that future large-scale population genomic analyses and corresponding ecological and morphological studies will further enrich and strengthen this claim.

We inferred that the separation between CPA and CPB was much more recent than that between CPC and CPA/CPB (Fig. 3a), which is consistent with the population structure analysis. The separation between CPC and CPA/CPB coincided closely with the last glacial period (LGP, ∼10–120 kya), especially the beginning of the last glacial maximum (LGM, ∼10–50 kya)^51,52. The extremely cold weather and its possibly sustained impact on the prey of Chinese pangolins during this period could be one of the reasons for population separation, considering that southern China was also severely influenced by the LGP⁵³. However, climate change may not be the main reason for population separation^19,54,55. Human activity is often considered an important factor shaping the evolution of animals^9,36,56. Human expansion in East Asia occurred as early as 40 kya⁵⁷, and frequent migration, admixture and replacement also occurred in the original distribution area of the Chinese pangolin⁴¹. Interestingly, the Han Chinese population began to expand and separated from the European human population at ∼30 kya^58,59. Moreover, the onset of the Holocene, with its more favorable climate conditions, was a key period in human history⁶⁰. We conclude that human activity was most likely responsible for the separation of the CPA and CPB populations.

Regarding the genetic status of the Chinese pangolin, we observed that genome-wide He was similar among the three populations (0.107% ± 0.086%; P = 0.058~0.755) but lower than in some other critically endangered species that, like the Chinese pangolin, have suffered long-term population decline, such as the northern white rhinoceros (Ceratotherium simum cottoni) (0.110%) and the western lowland gorilla (Gorilla gorilla gorilla) (0.144%)^2,61. Such a low level of genetic diversity suggests that Chinese pangolins likely have a relatively low adaptive potential⁶². Inbreeding often contributes substantially to the loss of genetic diversity in endangered species: for example, ROH segments in a highly inbred grey wolf (Canis lupus) population range from 2,695 bp to 95.8 Mb⁶³, and an estimated 30% of the genome of the modern Malay Peninsula population of Sumatran rhinoceros (Dicerorhinus sumatrensis) lies in long ROH segments (≥ 2 Mb)⁸. However, ROHs in Chinese pangolins were much shorter than in these endangered species. Given the absence of long ROHs⁶⁴, we infer that the low genetic diversity of the Chinese pangolin may not result from extensive recent inbreeding; long-term isolation without frequent gene flow between subpopulations could be an alternative explanation. We cannot exclude bias introduced by the reference genome, because the accurate evaluation of ROHs depends heavily on the contiguity of the reference genome⁶⁵, and no high-quality long-read reference assembly is currently available.

The decline in fitness of a small population can be accelerated by the accumulation of deleterious mutations^66,67. Focusing on the Guangdong populations, the ratio of deleterious mutations in the CPA and CPB populations was higher than in the CPC population. LoF mutations were enriched in pathways related to male gamete generation in the CPA and CPB populations but not in the CPC population, further indicating a greater genetic burden in the Guangdong populations, given that impaired gamete quality may reduce reproductive success and thereby affect fitness and survival⁶⁸. Together with the overall distribution of deleterious mutations, we speculate that the CPC population has higher fitness than both the CPA and CPB populations.

LoF mutations were also significantly enriched in the protein digestion and absorption pathway (P < 0.01) in all three populations, suggesting a potentially weakened ability to digest and absorb protein across the whole Chinese pangolin population. In the CPB population, we also found enriched KEGG pathways related to carbohydrate metabolism, suggesting impaired energy supply. Considering that pangolins feed on a high-protein, high-fat, high-calorie diet^69,70, a weakened ability to use protein and carbohydrates may further lower the fitness of Chinese pangolins.

If a species has several isolated populations living in different environments, local adaptation may proceed in different directions in different populations⁷¹. Genetic rescue between populations that have been strongly differentiated by natural selection is prone to outbreeding depression⁷². To better identify local adaptation in each population, we performed enrichment analysis of genes under recent natural selection in the three populations. A large proportion of the enriched pathways were unique to individual populations (Supplementary Fig. 11), indicating clear adaptive differences among the CPA, CPB and CPC populations. It is worth noting that much of Yunnan province lies within the subtropical highland zone, whereas Guangdong province faces the South China Sea to the south and has a humid subtropical climate⁷³. We infer that local adaptation in different directions might have occurred in these three populations. However, more research is needed to explore how these genes under natural selection affect the survival of this species.

In addition, we predicted that an average of 309 LoF mutations could be introduced into other populations as new deleterious alleles if translocations were conducted among the CPA, CPB and CPC populations (Fig. 5b). This number is roughly 30 times higher than that predicted among three Sumatran rhinoceros populations (an average of 10 new LoF variants)⁸. We also noticed that more LoF variants could be mutually introduced between CPA/CPB and the CPC population, which is consistent with the local adaptation results and is possibly due to long-term isolation and adaptation to their local habitats (Yunnan and Guangdong). We cannot conclude that genetic rescue among these Chinese pangolin populations is entirely unsuitable; however, genetic rescue between CPA and CPB should be safer from the perspective of outbreeding depression. A firmer conclusion will require future work to determine the exact effects of these LoF mutations on the affected genes.

We should keep in mind that genetic rescue is a complicated process^4,5,74; hence, more reliable evidence is still needed from further studies. Furthermore, as a distinct population separated from CPB and CPC, the newly discovered CPA population should be given high priority in future conservation work on the Chinese pangolin.

Methods

Samples and data collection, library preparation, and sequencing

All necessary research ethics approvals and permits were granted by the Institutional Review Board of BGI (BGI-IRB E21056). Fifteen Chinese pangolin samples were collected from the Guangzhou Wildlife Rescue Center, Guangdong province, China, for whole-genome sequencing. All samples were taken from rescued individuals that had died of natural causes or from individuals confiscated by the forestry police, and all individuals were identified as Chinese pangolins by distinct morphological characteristics. In addition, sequencing data for 22 Chinese pangolin individuals (BioProject: PRJNA529540 and PRJNA20331; Supplementary Table 2) and 72 Malayan pangolin individuals (BioProject: PRJNA529540; Supplementary Table 3) were downloaded from NCBI for downstream analysis in this study.

Genomic DNA was isolated using standard phenol/chloroform-isoamyl alcohol extraction⁷⁵, and the precipitate was dissolved in 20–100 μl of distilled water. Approximately 1 μg of genomic DNA was sheared into fragments of 200 to 800 base pairs (bp) for paired-end (PE) DNA library construction with an insert size of ~350 bp, following the manufacturer’s instructions for the BGISEQ platform (BGI, Shenzhen, China). DNA libraries were then sequenced on the DNBSEQ-T1 sequencer.

Genome mapping, variants calling and filtering

Sequencing reads were mapped to the M. pentadactyla reference genome (YNU_ManPten_2.0, GenBank: GCA_014570555.1) and the M. javanica reference genome (YNU_ManJav_2.0, GenBank: GCA_014570535.1)⁴¹ using the Burrows-Wheeler Aligner (BWA; version: 0.7.10–r789) with the mem algorithm⁷⁶ and default parameters. Sorting, reordering and read deduplication were performed with Picard tools (http://picard.sourceforge.net) (version: 2.1.1). HaplotypeCaller, implemented in the Genome Analysis Toolkit (GATK, version: 3.3-0-g37228af)⁷⁷, was used for raw variant calling. Hard filtering was performed on the raw SNP variant set with the parameters “QUAL < 30.0 || QD < 2.0 || FS > 60.0 || MQ < 40.0 || MQRankSum < −12.5 || ReadPosRankSum < −8.0”. Lastly, SNPs on scaffolds shorter than 100 kb were excluded. The resulting high-quality SNP set was used for further genetic analyses.
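
The hard-filter expression can be read as "discard a SNP if any one of these conditions holds". The sketch below restates that logic in plain Python so it can be checked by hand; it is not GATK code, and the function name and dictionary input are hypothetical. Treating a missing annotation as not failing is an assumption of this sketch.

```python
import operator

# (annotation, comparison, threshold) triples taken from the expression above.
HARD_FILTERS = [
    ("QUAL", operator.lt, 30.0),
    ("QD", operator.lt, 2.0),
    ("FS", operator.gt, 60.0),
    ("MQ", operator.lt, 40.0),
    ("MQRankSum", operator.lt, -12.5),
    ("ReadPosRankSum", operator.lt, -8.0),
]


def fails_hard_filter(annotations):
    """Return True if a SNP meets any of the filter conditions.

    annotations: dict of annotation name -> numeric value for one SNP.
    ASSUMPTION: an annotation that is absent does not trigger the filter.
    """
    for key, compare, threshold in HARD_FILTERS:
        value = annotations.get(key)
        if value is not None and compare(value, threshold):
            return True
    return False
```

For example, a SNP with QUAL 50, QD 1.5 and otherwise good annotations would be removed because QD is below 2.0.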

SNPs were annotated with ANNOVAR⁷⁸ (version: 2015-12-14) using the M. pentadactyla reference genome (YNU_ManPten_2.0). We compiled the basic information of the resequencing data for each population (CPA, CPB and CPC) and grouped SNPs into categories according to the ANNOVAR results, including exonic, nonsynonymous, synonymous, UTR, intronic, intergenic, splicing, and non-coding RNA (ncRNA). In addition, we plotted the distribution of MAF for all individuals. We prepared three SNP sets for downstream analysis by varying the allele frequency filter: 1) SNPs without any allele frequency filtering; 2) SNPs with an allele frequency larger than 5%; 3) SNPs with an allele frequency larger than 20%, following the method of Hu et al.⁴¹.
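
The three SNP sets can be illustrated with a small helper that assigns one SNP to every set it qualifies for. The function names and set labels are hypothetical, and the sketch assumes that "allele frequency" here refers to the minor allele frequency.

```python
def minor_allele_frequency(alt_count, called_alleles):
    """MAF from the alternate-allele count and the number of called alleles."""
    freq = alt_count / called_alleles
    return min(freq, 1 - freq)


def assign_snp_sets(alt_count, called_alleles):
    """Return the labels of the SNP sets that one SNP belongs to.

    ASSUMPTION: the 5% and 20% thresholds are applied to the minor
    allele frequency; set labels are invented for this example.
    """
    maf = minor_allele_frequency(alt_count, called_alleles)
    sets = ["all"]  # set 1: no allele frequency filtering
    if maf > 0.05:
        sets.append("af_gt_5pct")  # set 2
    if maf > 0.20:
        sets.append("af_gt_20pct")  # set 3
    return sets
```

With 37 diploid individuals and no missing calls, `called_alleles` would be 74, so a SNP seen on 3 chromosomes (MAF about 4%) stays only in the unfiltered set.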

Phylogenetic tree, PCA and admixture analysis

To avoid closely linked sites, PLINK⁷⁹ (version: 1.90b3.38) was used to prune SNPs by linkage disequilibrium (LD), resulting in a set of 171,405 SNPs. This SNP set was then converted into a PHYLIP-format file using an in-house Python script, and a maximum-likelihood (ML) phylogenetic tree was constructed with PhyML⁸⁰ (version: 20151018) using 1000 replications and all 37 samples. The Malayan pangolin (YNU_ManJav_2.0) was used as an outgroup.
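The in-house conversion script is not published with the article. The sketch below shows one plausible way to turn biallelic SNP genotypes into a relaxed PHYLIP alignment in Python; encoding heterozygotes with IUPAC ambiguity codes and missing calls as N is an assumption, and all function names are hypothetical.

```python
# IUPAC ambiguity codes for heterozygous biallelic genotypes.
IUPAC = {
    frozenset("AG"): "R", frozenset("CT"): "Y", frozenset("CG"): "S",
    frozenset("AT"): "W", frozenset("GT"): "K", frozenset("AC"): "M",
}


def genotype_to_base(ref, alt, gt):
    """Translate a VCF-style genotype string ('0/1', '1|1', './.') into one character."""
    alleles = gt.replace("|", "/").split("/")
    if "." in alleles:
        return "N"  # missing call
    bases = {ref if a == "0" else alt for a in alleles}
    if len(bases) == 1:
        return next(iter(bases))
    return IUPAC[frozenset(bases)]


def to_phylip(samples, sites):
    """Build a relaxed PHYLIP string.

    `sites` is a list of (ref, alt, genotypes) tuples, with one genotype
    string per sample in the same order as `samples`.
    """
    seqs = {s: [] for s in samples}
    for ref, alt, gts in sites:
        for sample, gt in zip(samples, gts):
            seqs[sample].append(genotype_to_base(ref, alt, gt))
    lines = [f"{len(samples)} {len(sites)}"]
    lines += [f"{s} {''.join(seqs[s])}" for s in samples]
    return "\n".join(lines)
```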

For species identification, the Chinese pangolin genome (YNU_ManPten_2.0) and the Malayan pangolin genome (YNU_ManJav_2.0) were each used as the reference genome for 109 pangolins (37 Chinese pangolin individuals and 72 Malayan pangolin individuals). We then constructed two ML phylogenetic trees according to the above methods. We also extracted the cytochrome b (Cyt b) and cytochrome c oxidase subunit 1 (CO1) sequences from Felis catus (NC_001700.1), 8 species of Manis (Manis tricuspis: NC_026780.1, Manis tetradactyla: MG196299.1, Manis gigantea: MG196303.1, Manis temminckii: KP306516.1, M. pentadactyla: MG196307.1, Manis crassicaudata: NC_036433.1, Manis culionensis: NC_036434.1, and M. javanica: KT445979.1), the 37 Chinese pangolin individuals and the 72 Malayan pangolin individuals. ML trees with 1000 replications based on the mitochondrial gene sequences were constructed using Molecular Evolutionary Genetics Analysis (MEGA, version: 10) with Kimura’s two-parameter model⁸¹.

All SNPs, without pruning or allele frequency (AF) filtering, were used as the initial input data for PCA and ADMIXTURE analysis. The Genome-wide Complex Trait Analysis⁸² (GCTA, version: 1.92.2) software was used for PCA inference. The population genetic structure was then inferred from the same dataset with the ADMIXTURE⁸³ (version: 1.3) program. We predefined the number of genetic clusters (K) from 1 to 10 and ran the cross-validation (CV) error procedure to identify the best K, using the default parameters and settings.
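Choosing the best K amounts to picking the K with the lowest cross-validation error. The Python sketch below assumes that the per-K ADMIXTURE logs have been concatenated into one text and that each contains a line of the form "CV error (K=3): 0.41"; the function names are hypothetical.

```python
import re

# Matches lines such as "CV error (K=3): 0.41234".
CV_LINE = re.compile(r"CV error \(K=(\d+)\):\s*([0-9.eE+-]+)")


def parse_cv_errors(log_text):
    """Return {K: cv_error} for every CV line found in the log text."""
    return {int(k): float(v) for k, v in CV_LINE.findall(log_text)}


def best_k(cv_errors):
    """Return the K with the smallest cross-validation error."""
    return min(cv_errors, key=cv_errors.get)
```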

Population differentiation and gene flow

We used Weir and Cockerham’s F_ST⁸⁴ to estimate the population differentiation. All bi-allelic SNPs were used for the calculation of genome-wide F_ST between each pair of the populations using the VCFtools (version: 0.1.13) software⁸⁵.

To explore whether one population was an admixture of the other two populations, we performed the F3 test with combinations of CPA, CPB and CPC using ‘qp3Pop’ in ADMIXTOOLS⁸⁶ (version: 5.1). The EIGENSTRAT-format input data was generated with the CONVERTF program in ADMIXTOOLS.

To examine the excess of shared derived alleles between different populations of Chinese pangolins, we applied the classic ABBA-BABA test (D statistics)⁸⁷ using the “-informative” option of POPSTATS⁸⁸. The four populations were arranged as ((A, B), (X, Y)), where A, B and X were each assigned one of CPA, CPB and CPC, and Y was the Malayan pangolin, which served as the outgroup. We identified significant D values using the Z-score (|Z| > 3) based on a block jackknife procedure.
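To make the D statistic and its Z-score concrete, the following Python sketch computes D from per-site allele frequencies in populations A, B, X and Y using the frequency-based form given by Patterson et al., and estimates a standard error with a delete-one block jackknife. It is not the POPSTATS implementation: the equal weighting of blocks is a simplifying assumption, and the function names are hypothetical.

```python
import math


def d_statistic(sites):
    """D from a list of (pA, pB, pX, pY) derived-allele frequencies."""
    num = sum((a - b) * (x - y) for a, b, x, y in sites)
    den = sum((a + b - 2 * a * b) * (x + y - 2 * x * y) for a, b, x, y in sites)
    return num / den if den else float("nan")


def block_jackknife_z(blocks):
    """Return (D, Z) using a delete-one jackknife over genomic blocks.

    `blocks` is a list of site lists. Assumption: blocks are weighted equally.
    """
    d_full = d_statistic([s for blk in blocks for s in blk])
    n = len(blocks)
    leave_one_out = [
        d_statistic([s for j, blk in enumerate(blocks) if j != i for s in blk])
        for i in range(n)
    ]
    mean = sum(leave_one_out) / n
    variance = (n - 1) / n * sum((v - mean) ** 2 for v in leave_one_out)
    se = math.sqrt(variance)
    z = d_full / se if se > 0 else float("inf")
    return d_full, z
```

Under this sketch, a comparison would be flagged as significant when the absolute value of Z exceeds 3, matching the threshold stated in the text.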

The sharing of identity by descent (IBD) between individuals was calculated with RefinedIBD⁸⁹ (version: 16May19.ad5) using the analysis parameters length = 0.1 and lod = 3.0. We compared IBD segments of different lengths (IBD > 1 Mb or 100 kb < IBD < 1 Mb) and the percentage of the genome they covered among populations to evaluate the degree of gene flow.

We constructed maximum-likelihood population trees using TreeMix⁹⁰ (version: 1.13) to investigate the phylogenetic relationships in the presence of admixture events among populations. TreeMix was run with the parameters -bootstrap 5000 -global, with the number of migration events (-m) set from 0 to 10.

Population separation among pangolin populations

We first used MSMC2⁵⁹ (version: 2.1.2) to infer the separation among the three populations. MSMC2 was run in four independent replicates, each with two samples randomly selected from each population. Genotypes were phased with Beagle⁹¹ (version: 5.0) using default parameters before the MSMC2 inference. The parameters for the MSMC2 calculations were as follows: -skipAmbiguous -I 0-4, 0-5, 0-6, 0-7, 1-4, 1-5, 1-6, 1-7, 2-4, 2-5, 2-6, 2-7, 3-4, 3-5, 3-6, 3-7 -i 20 -t 6 -p ‘10*1 + 15*2’. To further validate the results inferred by MSMC2, we used SMC++⁹² (version: 1.5.2) to infer the split time of the two populations, because SMC++ does not depend on phased data and therefore avoids bias introduced by switch errors during phasing. The mutation rate and generation interval used for M. pentadactyla were 1.47×10⁻⁸ per site per generation^41,43 and one year^41,93, respectively.
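The -I argument lists every cross-population haplotype pair: with two diploid samples per population, haplotypes 0 to 3 belong to one population and 4 to 7 to the other. The Python sketch below generates that list and converts mutation-scaled time to years. The conversion (scaled time divided by the mutation rate, times the generation interval) follows the usual convention for MSMC-style output and is an assumption about how the values were applied here; the function names are hypothetical.

```python
def cross_population_pairs(haps_pop1, haps_pop2):
    """Build the comma-separated haplotype-pair list for a cross-population run."""
    return ",".join(f"{i}-{j}" for i in haps_pop1 for j in haps_pop2)


def scaled_time_to_years(t_scaled, mu=1.47e-8, generation_years=1.0):
    """Convert mutation-scaled time to years (defaults are the values quoted in the text)."""
    return t_scaled / mu * generation_years
```

For example, `cross_population_pairs(range(0, 4), range(4, 8))` yields the 16 pairs from 0-4 through 3-7.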

Genetic diversity and ROH analysis

VCFtools was used to estimate whole-genome genetic diversity, including heterozygosity (He) and nucleotide diversity (π)⁹⁴. ROHs were identified with PLINK following the method described by Dobrynin et al.⁹⁵. Long ROHs are often indicators of recent inbreeding that occurred several decades ago. According to the formula reported by Kardos et al. (Generations = 100/(2 × ROH_length))^41,63, we only counted ROHs that were larger than 100 kb in this study. If ROHs were longer than 1 Mb, we assumed that they were generated by more recent inbreeding (< 50 years). The genetic diversity, ROH and mutational load of Taiwan individuals were already examined by Hu et al.⁴¹. Therefore, Taiwan individuals were excluded from this analysis and from the mutational load analysis.
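The ROH age estimate can be worked through numerically. The Python sketch below reads the formula as generations = 100 / (2 × ROH length in cM) and assumes a uniform rate of 1 cM per Mb, which reproduces the stated figure of 50 generations (50 years at one year per generation) for a 1 Mb ROH. The function names and the "short"/"long" labels are hypothetical.

```python
def roh_age_generations(length_bp, cm_per_mb=1.0):
    """Expected number of generations since the common ancestor for an ROH.

    Assumption: genetic length is physical length times a uniform cM/Mb rate.
    """
    length_cm = length_bp / 1_000_000 * cm_per_mb
    return 100.0 / (2.0 * length_cm)


def classify_roh(length_bp):
    """Apply the thresholds in the text: ignore ROHs up to 100 kb, flag those above 1 Mb."""
    if length_bp <= 100_000:
        return None  # not counted
    return "long" if length_bp > 1_000_000 else "short"
```

Under these assumptions a 100 kb ROH corresponds to about 500 generations and a 1 Mb ROH to 50 generations.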

Mutational load

We used the M. javanica genotypes at the same sites to represent the ancestral state before identifying derived mutational loads. Here, a deleterious mutation refers to an amino acid change predicted to be harmful to protein function; such mutations are the main genetic basis of inbreeding depression⁹⁶. The deleteriousness of derived mutations was assessed using the Grantham Score (GS)⁹⁷. Here, nsSNPs with GS ≥ 150 were defined as deleterious mutations^97,98. We used the “-aamatrixfile grantham matrix” parameter in ANNOVAR to print out the GS for nonsynonymous variants. We counted the number of deleterious nsSNPs and the ratio of deleterious nsSNPs to total nsSNPs. Moreover, we selected derived mutations in the coding regions of each pangolin individual for annotation with SnpEff⁹⁹ (version: 4.3). The LoF variants considered here included splice_donor_variant, splice_acceptor_variant and stop_gained. The numbers of LoF variants and the ratios of LoF variants to total nsSNPs in the Chinese pangolin populations were counted. Missense mutations were represented by missense_variant. The ratio of homozygous sites (counted as two per site) to the sum of homozygous sites (two per site) and heterozygous sites (one per site) was calculated for all LoF, missense and deleterious variants to estimate the level of mutational load⁴⁴.
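The homozygosity-weighted ratio described above can be written as 2 × hom / (2 × hom + het), where hom and het are the numbers of homozygous-derived and heterozygous sites in an individual. A minimal Python sketch, with a hypothetical function name and a NaN return for individuals carrying no variants of the class (an assumption):

```python
import math


def mutational_load_ratio(n_hom, n_het):
    """Ratio of derived alleles carried in the homozygous state.

    Homozygous sites count two alleles per site, heterozygous sites one.
    """
    total = 2 * n_hom + n_het
    if total == 0:
        return math.nan  # no variants of this class in the individual
    return 2 * n_hom / total
```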

GO and KEGG functional enrichment of genes affected by LoF mutations was performed using the Metascape website (last modified January 1, 2022)¹⁰⁰. The GO terms and KEGG pathways with an enrichment factor > 2 and a multi-test adjusted P-value < 0.05 were considered to be significantly enriched. P-values and multi-test adjusted P-values were transformed with log base 10.

Recent natural selection in populations

We computed the iHS¹⁰¹ (version: 1.3) to identify genomic signatures of positive selection in the CPA, CPB and CPC populations. The iHS calculations were performed independently in each population. As the genetic distance between adjacent SNPs was needed for the calculation, a chromosome segment of 1 Mb was directly converted to 1 centiMorgan (cM).

For the identification of candidate genes, SNPs within the top 0.1% of iHS scores were assigned as candidate sites. Based on the candidate sites, we then used three methods to screen candidate regions, as described by Voight et al.¹⁰¹: a) regions of 50 consecutive SNPs; b) regions of 100 kb with a 50 kb step size; c) 5 kb flanking regions around candidate sites. We then calculated the sum of the iHS scores (siHS score) of all candidate sites in each genomic region, and candidate regions were selected as those with the top 10% siHS scores. The intersection of the candidate genes obtained by the three region-selection methods was used for the next analysis¹⁰¹. For the candidate genes, GO and KEGG functional enrichment was performed using the Metascape website¹⁰⁰. The GO terms and KEGG pathways with an enrichment factor > 2 and a multi-test adjusted P-value < 0.05 were regarded as significantly enriched.
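Region-selection method (b) can be sketched as a sliding-window sum in Python. The example below assumes 0-based positions on a single scaffold, windows starting at multiples of the step size, and summation of absolute iHS values (the text does not state whether signed or absolute scores were summed); all names are hypothetical.

```python
def sliding_window_sihs(candidates, window=100_000, step=50_000):
    """Sum |iHS| of candidate sites per window.

    `candidates` is a list of (position, ihs) pairs. Returns a dict mapping
    each window start to its siHS score, for windows with at least one site.
    """
    scores = {}
    for pos, ihs in candidates:
        # First window start (a multiple of `step`) whose span [s, s + window) covers pos.
        start = max(0, ((pos - window) // step + 1) * step)
        while start <= pos:
            scores[start] = scores.get(start, 0.0) + abs(ihs)
            start += step
    return scores


def top_windows(scores, fraction=0.10):
    """Return the window starts with the highest siHS scores (at least one window)."""
    n_keep = max(1, int(len(scores) * fraction))
    ranked = sorted(scores, key=scores.get, reverse=True)
    return sorted(ranked[:n_keep])
```

Because the windows overlap by half their length, a candidate site usually contributes to two windows.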

Statistics and reproducibility

To test for significant differences in ROH and mutational load between populations, two-sided Welch two-sample t-tests were performed in R¹⁰² (version: 4.0.2). A P-value of less than 0.05 was considered significant. All statistical analyses were performed using available packages, and the results can be reproduced using the parameters given in the Methods.
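The tests themselves were run in R. For readers who want to see what a Welch two-sample t-test computes, here is a standard-library Python sketch of the test statistic and the Welch–Satterthwaite degrees of freedom. It stops short of the P-value, which requires a t-distribution CDF from a statistics library; the function name is hypothetical.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Return (t, df) for Welch's unequal-variance t-test on two samples."""
    va = variance(a) / len(a)  # squared standard error of mean(a)
    vb = variance(b) / len(b)  # squared standard error of mean(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df
```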

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.

Data availability

All data generated or analyzed during this study are included in this published article and its supplementary files. Source data underlying Figures 1b–d, 2, 3a,b, 4 and 5a,b are provided in Supplementary Data 1. The data that support the findings of this study have been deposited in the CNGB Sequence Archive (CNSA)¹⁰³ of the China National GeneBank DataBase (CNGBdb)¹⁰⁴ with accession number CNP0001723 (https://db.cngb.org/).

References

Purvis, A., Gittleman, J. L., Cowlishaw, G. & Mace, G. M. Predicting extinction risk in declining species. Proc. R. Soc. Lond. Ser. B: Biol. Sci. 267, 1947–1952 (2000).
Xue, Y. et al. Mountain gorilla genomes reveal the impact of long-term population decline and inbreeding. Science 348, 242–245 (2015).
Frankham, R. et al. A practical guide for genetic management of fragmented animal and plant populations. (Oxford University Press, 2019).
Weeks, A. R. et al. Genetic rescue increases fitness and aids rapid recovery of an endangered marsupial population. Nat. Commun. 8, 1–6 (2017).
Frankham, R. Genetic rescue of small inbred populations: Meta-analysis reveals large and consistent benefits of gene flow. Mol. Ecol. 24, 2610–2618 (2015).
Trinkel, M. et al. Translocating lions into an inbred lion population in the Hluhluwe-iMfolozi Park, South Africa. Anim. Conserv. 11, 138–143 (2008).
Pimm, S. L., Dollar, L. & Bass, O. L. Jr The genetic rescue of the Florida panther. Anim. Conserv. 9, 115–122 (2006).
Von Seth, J. et al. Genomic insights into the conservation status of the world’s last remaining Sumatran rhinoceros populations. Nat. Commun. 12, 1–11 (2021).
Wang, P. et al. Genomic Consequences of Long-Term Population Decline in Brown Eared Pheasant. Mol. Biol. Evol. 38, 263–273 (2021).
Hinten, G., Harriss, F., Rossetto, M. & Braverstock, P. Genetic variation and island biogreography: microsatellite and mitochondrial DNA variation in island populations of the Australian bush rat, Rattus fuscipes greyii. Conserv. Genet. 4, 759–778 (2003).
Sharma, R. et al. Genetic diversity and relationship of Indian cattle inferred from microsatellite and mitochondrial DNA markers. BMC Genet. 16, 1–12 (2015).
Jensen‐Seaman, M. & Kidd, K. Mitochondrial DNA variation and biogeography of eastern gorillas. Mol. Ecol. 10, 2241–2247 (2001).
Allendorf, F. W., Hohenlohe, P. A. & Luikart, G. Genomics and the future of conservation genetics. Nat. Rev. Genet. 11, 697–709 (2010).
Ryynänen, H. J., Tonteri, A., Vasemägi, A. & Primmer, C. R. A comparison of biallelic markers and microsatellites for the estimation of population and conservation genetic parameters in Atlantic salmon (Salmo salar). J. Heredity 98, 692–704 (2007).
Allendorf, F. & Seeb, L. Concordance of genetic divergence among sockeye salmon populations at allozyme, nuclear DNA, and mitochondrial DNA markers. Evolution 54, 640–651 (2000).
Guang, X. et al. Chromosome-scale genomes provide new insights into subspecies divergence and evolutionary characteristics of the giant panda. Sci. Bull. 66, 2002–2013 (2021).
Ouborg, N. J., Pertoldi, C., Loeschcke, V., Bijlsma, R. K. & Hedrick, P. W. Conservation genetics in transition to conservation genomics. Trends Genet. 26, 177–187 (2010).
Khan, A. et al. Genomic evidence for inbreeding depression and purging of deleterious genetic variation in Indian tigers. Proc. Natl Acad. Sci. 118, e2023018118 (2021).
Zhao, S. et al. Whole-genome sequencing of giant pandas provides insights into demographic history and local adaptation. Nat. Genet. 45, 67–71 (2013).
Dussex, N. et al. Population genomics of the critically endangered kākāpō. Cell Genomics 1, 100002 (2021).
Clark, L., Van Thai, N. & Phuong, T. Q. in Workshop on trade and conservation of pangolins native to south and southeast Asia. 111.
Gaudin, T. J., Emry, R. J. & Wible, J. R. The phylogeny of living and extinct pangolins (Mammalia, Pholidota) and associated taxa: a morphology based analysis. J. Mamm. evolution 16, 235–305 (2009).
Dorji, D. Distribution, habitat use, threats and conservation of the critically endangered Chinese pangolin (Manis pentadactyla) in Samtse District, Bhutan. Unpublished. Rufford Small Grants, UK (2017).
Del Toro, I., Ribbons, R. R. & Pelini, S. L. The little things that run the world revisited: a review of ant-mediated ecosystem services and disservices (Hymenoptera: Formicidae). Myrmecological N. 17, 133–146 (2012).
Li, H.-F., Lin, J.-S., Lan, Y.-C., Pei, K. J.-C. & Su, N.-Y. Survey of the termites (Isoptera: Kalotermitidae, Rhinotermitidae, Termitidae) in a Formosan pangolin habitat. Fla. Entomologist 94, 534–538 (2011).
Zhou, Z.-M., Zhou, Y., Newman, C. & Macdonald, D. W. Scaling up pangolin protection in China. Front. Ecol. Environ. 12, 97–98 (2014).
Zhang, H. et al. Molecular tracing of confiscated pangolin scales for conservation and illegal trade monitoring in Southeast Asia. Glob. Ecol. Conserv. 4, 414–422 (2015).
Luczon, A. U., Ong, P. S., Quilang, J. P. & Fontanilla, I. K. C. Determining species identity from confiscated pangolin remains using DNA barcoding. Mitochondrial DNA Part B 1, 763–766 (2016).
IUCN. The IUCN Red List of Threatened Species., https://www.iucnredlist.org/search?query=pangolin&searchType=species (2021).
Wu, S., Liu, N., Zhang, Y. & Ma, G. Assessment of threatened status of Chinese Pangolin (Manis pentadactyla). Chin. J. Appl. Environ. Biol. 10, 456–461 (2004).
Yue, Z. in Proceedings of the workshop on trade and conservation of pangolins native to South and Southeast Asia.
Heinrich, S. et al. Where did all the pangolins go? International CITES trade in pangolin species. Glob. Ecol. Conserv. 8, 241–253 (2016).
Dongliang, Z. Present Situation and Countermeasures of the Protection and Management of Manis pentadactyla in Fujian Province. J. Fujian Forestry Sci. Technol. 23, 85–88 (1996).
Jiang, Z. et al. Red list of China’s vertebrates. Biodivers. Sci. 24, 500 (2016).
Wu, S. et al. The population and density of pangolin in dawuling natural reserve and the number of pangolin resource in Guangdong province. Acta Theriol. Sin. 22, 270–276 (2002).
Yang, L. et al. Historical data for conservation: reconstructing range changes of Chinese pangolin (Manis pentadactyla) in eastern China (1970–2016). Proc. R. Soc. B 285, 20181084 (2018).
Wu, S., Ma, G., Liao, Q. & Lu, K. (China Forestry Publishing House, Beijing, 2005).
Zhang, F. et al. Observations of Chinese pangolins (Manis pentadactyla) in mainland China. Glob. Ecol. Conserv. 26, e01460 (2021).
Nash, H. C., Wong, M. H. & Turvey, S. T. Using local ecological knowledge to determine status and threats of the Critically Endangered Chinese pangolin (Manis pentadactyla) in Hainan, China. Biol. Conserv. 196, 189–195 (2016).
Hassanin, A., Hugot, J.-P. & van Vuuren, B. J. Comparison of mitochondrial genome sequences of pangolins (Mammalia, Pholidota). Comptes rendus biologies 338, 260–265 (2015).
Hu, J.-Y. et al. Genomic consequences of population decline in critically endangered pangolins and their demographic histories. Natl Sci. Rev. 7, 798–814 (2020).
Chinanews. For the first time in Guangdong during the day, the Chinese pangolin came out of the cave, https://www.tellerreport.com/life/2020-07-10-for-the-first-time-in-guangdong-during-the-day-the-chinese-pangolin-came-out-of-the-cave.HkzCOdjByv.html (2020).
Choo, S. W. et al. Pangolin genomes and the evolution of mammalian scales and immunity. Genome Res. 26, 1312–1322 (2016).
Robinson, J. A. et al. Genomic flatlining in the endangered island fox. Curr. Biol. 26, 1183–1189 (2016).
Wan, Q.-H., Wu, H. & Fang, S.-G. A new subspecies of giant panda (Ailuropoda melanoleuca) from Shaanxi, China. J. Mammal. 86, 397–402 (2005).
Liu, Y.-C. et al. Genome-wide evolutionary analysis of natural history and adaptation in the world’s tigers. Curr. Biol. 28, 3840–3849. e3846 (2018).
Armstrong, E. E. et al. Recent evolutionary history of tigers highlights contrasting roles of genetic drift and selection. Mol. Biol. evolution 38, 2366–2379 (2021).
Pečnerová, P. et al. High genetic diversity and low differentiation reflect the ecological versatility of the African leopard. Curr. Biol. 31, 1862–1871. e1865 (2021).
Altshuler, D., Donnelly, P. & Consortium, I. H. A haplotype map of the human genome. Nature 437, nature04226 (2005).
Nei, M. & Roychoudhury, A. K. Evolutionary relationships of human populations on a global scale. Mol. Biol. evolution 10, 927–943 (1993).
Hoffecker, J. F. Desolate landscapes: Ice-age settlement in Eastern Europe. (Rutgers University Press, 2002).
Clark, P. U. et al. The last glacial maximum. Science 325, 710–714 (2009).
He, K. & Jiang, X. Sky islands of southwest China. I: an overview of phylogeographic patterns. Chin. Sci. Bull. 59, 585–597 (2014).
Zhou, X. et al. Population genomics reveals low genetic diversity and adaptation to hypoxia in snub-nosed monkeys. Mol. Biol. Evolution 33, 2670–2681 (2016).
Kozma, R., Melsted, P., Magnússon, K. P. & Höglund, J. Looking into the past–the reaction of three grouse species to climate change over the last million years using whole genome sequences. Mol. Ecol. 25, 570–580 (2016).
Dong, F. et al. Population genomic, climatic and anthropogenic evidence suggest the role of human forces in endangerment of green peafowl (Pavo muticus). Proc. R. Soc. B 288, 20210073 (2021).
Wu, X. On the origin of modern humans in China. Quat. Int. 117, 131–140 (2004).
Bergström, A. et al. Insights into human genetic variation and population history from 929 diverse genomes. Science 367, eaay5012 (2020).
Schiffels, S. & Durbin, R. Inferring human population size and separation history from multiple genome sequences. Nat. Genet. 46, 919–925 (2014).
Schroeter, N. et al. Biomolecular evidence of early human occupation of a high-altitude site in Western Central Asia during the Holocene. Front. Earth Sci. 8, 20 (2020).
Tunstall, T. et al. Evaluating recovery potential of the northern white rhinoceros from cryopreserved somatic cells. Genome Res. 28, 780–788 (2018).
Spielman, D., Brook, B. W. & Frankham, R. Most species are not driven to extinction before genetic factors impact them. Proc. Natl Acad. Sci. 101, 15261–15264 (2004).
Kardos, M. et al. Genomic consequences of intensive inbreeding in an isolated wolf population. Nat. Ecol. evolution 2, 124–131 (2018).
van der Valk, T., Díez-del-Molino, D., Marques-Bonet, T., Guschanski, K. & Dalén, L. Historical genomes reveal the genomic consequences of recent population decline in eastern gorillas. Curr. Biol. 29, 165–170. e166 (2019).
Meyermans, R., Gorssen, W., Buys, N. & Janssens, S. How to study runs of homozygosity using PLINK? A guide for analyzing medium density SNP data in livestock and pet species. BMC genomics 21, 1–14 (2020).
DeRose, M. A. & Roff, D. A. A comparison of inbreeding depression in life‐history and morphological traits in animals. Evolution 53, 1288–1292 (1999).
Keller, L. F. & Waller, D. M. Inbreeding effects in wild populations. Trends Ecol. evolution 17, 230–241 (2002).
Gallo, A., Boni, R. & Tosti, E. Gamete quality in a multistressor environment. Environ. Int. 138, 105627 (2020).
Shibao, W., Qian, L., Ganxin, F. & Yayong, K. Prellminary study on food nutrient contents of Chinese pangolin (Manis pentadactyla). J. Zhanjiang Norm. Coll. 20, 74–76 (1999).
Ma, J.-E. et al. Transcriptomic analysis identifies genes and pathways related to myrmecophagy in the Malayan pangolin (Manis javanica). PeerJ 5, e4140 (2017).
Kawecki, T. J. & Ebert, D. Conceptual issues in local adaptation. Ecol. Lett. 7, 1225–1241 (2004).
Edmands, S. Between a rock and a hard place: evaluating the relative risks of inbreeding and outbreeding for conservation and management. Mol. Ecol. 16, 463–475 (2007).
WorldData.info. The climate in China, https://www.worlddata.info/asia/china/climate.php (2022).
Ralls, K., Sunnucks, P., Lacy, R. C. & Frankham, R. Genetic rescue: A critique of the evidence supports maximizing genetic diversity rather than minimizing the introduction of putatively harmful genetic variation. Biol. Conserv. 251, 108784 (2020).
Barker, K. Phenol-Chloroform Isoamyl Alcohol (PCI) DNA extraction. At the Bench (1998).
Li, H. & Durbin, R. Fast and accurate long-read alignment with Burrows-Wheeler transform. Bioinformatics 26, 589–595 (2010).
DePristo, M. A. et al. A framework for variation discovery and genotyping using next-generation DNA sequencing data. Nat. Genet. 43, 491 (2011).
Wang, K., Li, M. & Hakonarson, H. ANNOVAR: functional annotation of genetic variants from high-throughput sequencing data. Nucleic acids Res. 38, e164–e164 (2010).
Purcell, S. et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am. J. Hum. Genet. 81, 559–575 (2007).
Guindon, S. et al. New algorithms and methods to estimate maximum-likelihood phylogenies: assessing the performance of PhyML 3.0. Syst. Biol. 59, 307–321 (2010).
Kumar, S., Stecher, G., Li, M., Knyaz, C. & Tamura, K. MEGA X: molecular evolutionary genetics analysis across computing platforms. Mol. Biol. evolution 35, 1547 (2018).
Yang, J., Lee, S. H., Goddard, M. E. & Visscher, P. M. GCTA: a tool for genome-wide complex trait analysis. Am. J. Hum. Genet. 88, 76–82 (2011).
Alexander, D. H., Novembre, J. & Lange, K. Fast model-based estimation of ancestry in unrelated individuals. Genome Res. 19, 1655–1664 (2009).
Weir, B. S. & Cockerham, C. C. Estimating F-statistics for the analysis of population structure. Evolution 38, 1358–1370 (1984).
Danecek, P. et al. The variant call format and VCFtools. Bioinformatics 27, 2156–2158 (2011).
Patterson, N. et al. Ancient admixture in human history. Genetics 192, 1065–1093 (2012).
Durand, E. Y., Patterson, N., Reich, D. & Slatkin, M. Testing for ancient admixture between closely related populations. Mol. Biol. evolution 28, 2239–2252 (2011).
Skoglund, P. et al. Genetic evidence for two founding populations of the Americas. Nature 525, 104–108 (2015).
Browning, B. L. & Browning, S. R. Improving the accuracy and efficiency of identity-by-descent detection in population data. Genetics 194, 459–471 (2013).
Pickrell, J. & Pritchard, J. Inference of population splits and mixtures from genome-wide allele frequency data. Nature Precedings 8, 1–1 (2012).
Browning, S. R. & Browning, B. L. Rapid and accurate haplotype phasing and missing-data inference for whole-genome association studies by use of localized haplotype clustering. Am. J. Hum. Genet. 81, 1084–1097 (2007).
Terhorst, J., Kamm, J. A. & Song, Y. S. Robust and scalable inference of population history from hundreds of unphased whole genomes. Nat. Genet. 49, 303–309 (2017).
Zhang, F. et al. A note on captive breeding and reproductive parameters of the Chinese pangolin, Manis pentadactyla Linnaeus, 1758. ZooKeys 129, 129–144 (2016).
Nei, M. & Li, W.-H. Mathematical model for studying genetic variation in terms of restriction endonucleases. Proc. Natl Acad. Sci. 76, 5269–5273 (1979).
Dobrynin, P. et al. Genomic legacy of the African cheetah, Acinonyx jubatus. Genome Biol. 16, 1–20 (2015).
Kyriazis, C. C., Wayne, R. K. & Lohmueller, K. E. Strongly deleterious mutations are a primary determinant of extinction risk due to inbreeding depression. Evolution Lett. 5, 33–47 (2021).
Li, W.-H., Wu, C.-I. & Luo, C.-C. Nonrandomness of point mutation as reflected in nucleotide substitutions in pseudogenes and its evolutionary implications. J. Mol. Evolution 21, 58–71 (1984).
Feng, S. et al. The genomic footprints of the fall and recovery of the crested ibis. Curr. Biol. 29, 340–349. e347 (2019).
Cingolani, P. et al. A program for annotating and predicting the effects of single nucleotide polymorphisms, SnpEff: SNPs in the genome of Drosophila melanogaster strain w1118; iso-2; iso-3. Fly 6, 80–92 (2012).
Zhou, Y. et al. Metascape provides a biologist-oriented resource for the analysis of systems-level datasets. Nat. Commun. 10, 1–10 (2019).
Voight, B. F., Kudaravalli, S., Wen, X. & Pritchard, J. K. A map of recent positive selection in the human genome. PLoS Biol. 4, e72 (2006).
Team, R. D. C. The R Reference manual: base package. (Network Theory, 2004).
Guo, X. et al. CNSA: a data repository for archiving omics data. Database 2020, baaa055 (2020).
Chen, F. Z. et al. CNGBdb: china national genebank database. Yi Chuan= Hereditas 42, 799–809 (2020).


Acknowledgements

This project was financially supported by funding from the Guangdong Provincial Key Laboratory of Genome Read and Write (grant No. 2017B030301011). This work was also supported by China National GeneBank, Guangdong Academy of Forestry and Technology Innovation Project of Guangdong (No. 2022KJCX008). Finally, we are thankful to the China National GeneBank for producing the sequencing data and Guangdong Provincial Academician Workstation of BGI Synthetic Genomics (No. 2017B090904014).

Author information

These authors contributed equally: Qing Wang, Tianming Lan, Haimeng Li.

Authors and Affiliations

College of Life Sciences, University of Chinese Academy of Sciences, Beijing, China
Qing Wang, Haimeng Li, Minhui Shi, Yixin Zhu & Qian Li
State Key Laboratory of Agricultural Genomics, BGI-Shenzhen, Shenzhen, China
Qing Wang, Tianming Lan, Haimeng Li, Sunil Kumar Sahu, Minhui Shi, Yixin Zhu, Qian Li & Huan Liu
BGI Life Science Joint Research Center, Northeast Forestry University, Harbin, China
Tianming Lan & Huan Liu
Guangdong Provincial Key Laboratory of Genome Read and Write, BGI-Shenzhen, Shenzhen, China
Tianming Lan & Huan Liu
College of Wildlife and Protected Area, Northeast Forestry University, Harbin, China
Lei Han & Le Zhang
College of Life Sciences, Zhejiang University, Hangzhou, China
Shangchen Yang
Guangxi Forest Inventory and Planning Institute, Nanning, China
Zhangwen Deng
Guangdong Provincial Key Laboratory of Silviculture, Protection and Utilization, Guangdong Academy of Forestry, Guangzhou, China
Yan Hua

Authors

Qing Wang
Tianming Lan
Haimeng Li
Sunil Kumar Sahu
Minhui Shi
Yixin Zhu
Lei Han
Shangchen Yang
Qian Li
Le Zhang
Zhangwen Deng
Huan Liu
Yan Hua

Contributions

T.L., H.L. and Y.H. conceived and designed the project. T.L. and Y.Z. were responsible for collecting the data. Y.H. and HM.L. performed DNA extraction, library construction, and sequencing. Q.W., HM.L., Q.L., L.H., S.Y., and L.Z. coordinated the data analysis. Q.W., M.S., and HM.L. contributed to data analysis. Q.W. participated in data interpretation and visualization. T.L., S.K.S., D.Z., and Q.W. wrote the manuscript. T.L., H.L., and Y.H. provided supervision. All authors have read and approved the final manuscript.

Corresponding authors

Correspondence to Tianming Lan, Huan Liu or Yan Hua.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Communications Biology thanks the anonymous reviewers for their contribution to the peer review of this work. Primary Handling Editors: Caitlin Karniski and Zhijuan Qiu.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Description of Additional Supplementary Files

Supplementary Data 1

Reporting Summary

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Wang, Q., Lan, T., Li, H. et al. Whole-genome resequencing of Chinese pangolins reveals a population structure and provides insights into their conservation. Commun Biol 5, 821 (2022). https://doi.org/10.1038/s42003-022-03757-3

Received: 15 November 2021
Accepted: 22 July 2022
Published: 25 August 2022
DOI: https://doi.org/10.1038/s42003-022-03757-3
