Introduction

In 2009, the pig genome was being sequenced and annotated for public use1, which made it possible to analyze and apply genome-wide genetic variants in pigs. Shortly afterwards, Illumina developed the PorcineSNP60 Genotyping BeadChip, which included around 60,000 loci by genotyping 158 individuals across multiple pig breeds2. Because of its high density and low price, the Illumina PorcineSNP60 Genotyping BeadChip has been widely used in multiple research areas, for example, identifying candidate genes associated with economically important traits in different pig populations3,4,5, detecting inbreeding depression of reproductive traits in Iberian pigs6, mining genetic diversity and selection signatures in Chinese and Western pig breeds7, and characterizing the population structure of local pig breeds from Russia, Belorussia, Kazakhstan, and Ukraine8. These studies have greatly enhanced our understanding of the information contained in the pig genome.

Simultaneously, with the rapid development of next-generation sequencing technology, genome resequencing has also been introduced into pig genome research, e.g., identifying candidate loci related to economically important traits between breeds of Korean and European origin9, detecting signatures of selection in European or Asian domestic pig genomes10, 11, and measuring coancestry and fitness in Sus cebifrons and Pietrain pigs to design optimal breeding or conservation programs12. As a full-scanning approach for discovering all possible genome-wide variants, genome resequencing is considered a more comprehensive solution for characterizing the genome. So far, however, the high costs of sequencing and bioinformatics analysis have limited its application in population genetics of domestic animals, particularly for research groups with limited funds.

Alternatively, approaches based on reduced representation genome sequencing, which reduce the complexity of the genome before sequencing, can be applied in pig genome-wide studies13,14,15. For example, genotyping by genome reducing and sequencing (GGRS), which covers more loci at a similar or lower cost than the Illumina PorcineSNP60 Genotyping BeadChip, has recently been used to identify selection signatures in Yorkshire and Landrace pigs16 and to characterize the genetic diversity of Chinese indigenous pig breeds of the Taihu Lake region17. Meanwhile, another reduced representation sequencing method, named specific-locus amplified fragment sequencing (SLAF-seq), has also been developed. As an approach based on reduced representation libraries and high-throughput sequencing, SLAF-seq allows the experiment to be pre-designed through bioinformatics and fragments of a specific length to be selected during library construction. More importantly, SLAF-seq has been considered a highly cost-effective method for large-scale genotyping owing to its high genotyping accuracy, high efficiency of marker discovery, low cost, and high capacity for large populations18. Therefore, SLAF-seq has been widely applied to reveal loci or genes associated with characteristic traits in multiple species, e.g., fruit traits in cucumber19, tolerance to low-phosphorus stress in soybean20, and disease, growth, and carcass traits in chicken21,22,23. These data support the view that reduced representation genome sequencing is a convenient and effective strategy for genome-wide studies in plants and animals.

As representatives of Chinese pigs, Erhualian and Meishan are two native pig breeds with high reproductive performance and excellent meat quality, exemplified by large litter size and high intramuscular fat content. Most recently, many single nucleotide polymorphisms (SNPs) and candidate genes associated with reproductive or meat quality traits in Erhualian pigs have been identified using the Illumina PorcineSNP60 Genotyping BeadChip24, 25. In genome resequencing studies, however, high costs have limited the number of pigs used (usually 4 to 10 individuals), and most of these studies classified Erhualian and Meishan together as one group of domesticated pigs rather than as separate groups for further analysis10, 26, 27. Knowledge of the genetic difference between the Erhualian and Meishan pig breeds is therefore lacking. Furthermore, a genome-wide view of possible candidate loci or genes associated with the characteristics of Erhualian and Meishan pigs compared with distant breeds, such as pigs of European origin, has not been well investigated. In this study, a SLAF-seq approach for pigs was developed for the first time and was used to analyze the genetic difference among Landrace, Erhualian, and Meishan pigs, thus providing a cost-effective method for pig genome-wide studies and new guidance for understanding the genetic basis of the germplasm characteristics of these pig breeds.

Results

Development of SLAF markers in the pig genome

To develop SLAF markers, in silico restriction analysis was first performed on the current pig reference genome (Sscrofa 10.2). Based on this analysis, two restriction enzymes, RsaI and HaeIII, were chosen as the enzyme combination for developing around 500,000 SLAFs. In total, 516,733 SLAF tags were predicted, with similar average distances between SLAFs on different chromosomes (Supplementary Table S1).

Specific-locus amplified fragment sequencing (SLAF) basics

To obtain the actual SLAF markers used in this study, SLAF-seq was performed in 28 Landrace, 26 Erhualian, and 27 Meishan individuals using the same enzyme combination as in the in silico restriction analysis. As shown in Table 1 and Supplementary Table S2, a total of 453.75 million reads were obtained from all individuals, with an average Q30 of 87.67% and an average GC content of 44.55%. Close to the expected number of SLAFs, 461,261 unique SLAF tags were identified, with average SLAF numbers (and sequencing depths) of 398,034 (12.74×), 344,471 (6.10×), and 369,160 (6.13×) in the Landrace, Erhualian, and Meishan pig populations, respectively (Table 1 and Supplementary Table S3). Moreover, like the expected 516,733 SLAF tags, the 461,261 observed SLAF tags were uniformly distributed across chromosomes (Supplementary Figure S1). In addition, Oryza sativa indica was used as a control during sequencing; the percentages of normally digested and paired-end mapped reads in the control were 91.73% and 90.68%, respectively, indicating that the SLAF-seq process was normal and reliable.

Table 1 Characteristics of SLAF-seq among Landrace, Erhualian, and Meishan.

Identification of SNPs in Landrace, Erhualian, and Meishan pigs

After genomic mapping and SNP calling, a total of 5,373,997 SNPs were discovered across all individuals (Supplementary Table S4). On average, 2,176,131, 1,302,927, and 1,530,285 SNPs were discovered in the Landrace, Erhualian, and Meishan pig populations, respectively (Table 1 and Supplementary Table S4). After a series of quality control filters, 165,670 SNPs were retained for further analysis (Supplementary Table S5). Specifically, 161,377 autosomal SNPs were included in the linkage disequilibrium and differentially selected region (DSR) analyses, while the 4,293 SNPs on chromosome X were used for DSR analysis only (Supplementary Figure S2). To reduce the effect of ascertainment bias, a subset of 107,626 autosomal SNPs with minor allele frequency (MAF) ≥ 0.2 was produced and used in the genetic structure analysis (Supplementary Figure S2).
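
The MAF-based subsetting described above can be illustrated with a minimal Python sketch. It assumes biallelic genotypes coded as 0, 1, or 2 copies of the alternate allele, with None for missing calls. The function names, SNP identifiers, and toy data are hypothetical and are not taken from the pipeline used in this study, in which filtering was carried out with PLINK.

```python
def minor_allele_frequency(genotypes):
    """Return the MAF of one biallelic SNP, or None if no genotype was called."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return None
    alt_freq = sum(called) / (2 * len(called))  # two alleles per diploid animal
    return min(alt_freq, 1.0 - alt_freq)


def filter_by_maf(snps, threshold):
    """Keep SNPs whose MAF is at least `threshold` (e.g. 0.2 for the structure subset)."""
    kept = {}
    for snp_id, genotypes in snps.items():
        maf = minor_allele_frequency(genotypes)
        if maf is not None and maf >= threshold:
            kept[snp_id] = genotypes
    return kept


# Hypothetical toy data: three SNPs genotyped in four animals.
toy_snps = {
    "snp_a": [0, 1, 2, None],  # alternate allele frequency 3/6
    "snp_b": [0, 0, 0, 1],     # alternate allele frequency 1/8
    "snp_c": [2, 2, 1, 2],     # alternate allele frequency 7/8
}
structure_subset = filter_by_maf(toy_snps, threshold=0.2)
```

Only snp_a passes the 0.2 threshold in this toy example; the other two SNPs have a minor allele frequency of 0.125.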

Measurement of linkage disequilibrium by chromosome

Linkage disequilibrium (r²) for pairs of loci was measured by chromosome in each pig population. As shown in Supplementary Figure S3, the average r² values dropped quickly as physical distance increased to 40 kilobase pairs (Kb). With increasing physical distance, similar decreasing trends in the average r² values were observed in the three populations. In addition, the average r² values of Landrace and Meishan were higher than those of Erhualian.
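
For readers unfamiliar with the statistic, the sketch below computes a pairwise r² as the squared Pearson correlation of genotype dosages (0, 1, or 2 copies of one allele). This is a commonly used approximation and is an assumption of the example: Haploview, which was used in this study, derives r² from estimated haplotype frequencies, so the values will not match exactly. The function name and data layout are hypothetical.

```python
import math


def genotype_r2(x, y):
    """Squared Pearson correlation between two SNPs' genotype dosages.

    `x` and `y` are equal-length lists coded 0/1/2, with None for missing
    calls; animals missing at either SNP are skipped. Returns NaN when the
    statistic is undefined (fewer than two animals or a monomorphic SNP).
    """
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    n = len(pairs)
    if n < 2:
        return math.nan
    mean_x = sum(a for a, _ in pairs) / n
    mean_y = sum(b for _, b in pairs) / n
    sxx = sum((a - mean_x) ** 2 for a, _ in pairs)
    syy = sum((b - mean_y) ** 2 for _, b in pairs)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in pairs)
    if sxx == 0 or syy == 0:
        return math.nan
    return sxy * sxy / (sxx * syy)
```

Averaging such values over SNP pairs binned by physical distance gives a decay curve of the kind summarized in Supplementary Figure S3.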

Assessing genetic structure, heterozygosity and gene diversity

As expected, population structure analysis revealed that Landrace and Meishan formed the first two independent populations (K = 2), followed by Erhualian (K = 3) (Fig. 1A). Moreover, a multidimensional scaling plot clearly showed that Landrace was a distant population compared with Erhualian and Meishan (Fig. 1B). Based on pair-wise estimates of FST as described by Weir and Cockerham28, significant genetic differentiation was observed between Landrace and Erhualian (FST = 0.5480), between Landrace and Meishan (FST = 0.5800), and between Erhualian and Meishan (FST = 0.2335), demonstrating that Erhualian and Meishan are more closely related to each other than either is to Landrace. As shown in Supplementary Figure S4, the observed heterozygosity of Landrace, Erhualian, and Meishan was 0.1956, 0.2410, and 0.2432, respectively, while the gene diversity was 0.2276, 0.2952, and 0.2776, respectively. Erhualian and Meishan thus had higher observed heterozygosity and gene diversity than Landrace.
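
To make the pair-wise FST statistic concrete, the following Python sketch implements the Weir and Cockerham estimator for biallelic SNPs from first principles. It is an illustration under stated assumptions (genotypes coded 0/1/2, no missing data, hypothetical function names) and is not the HIERFSTAT code used to obtain the values reported above.

```python
def wc84_components(pops):
    """Variance components (a, b, c) of Weir and Cockerham for one biallelic SNP.

    `pops` is a list of populations; each population is a list of genotypes
    coded as 0, 1, or 2 copies of one allele (assumed complete, no missing data).
    """
    r = len(pops)
    sizes = [len(g) for g in pops]
    freqs = [sum(g) / (2 * len(g)) for g in pops]                # allele frequency
    hets = [sum(1 for x in g if x == 1) / len(g) for g in pops]  # observed heterozygosity
    n_total = sum(sizes)
    n_bar = n_total / r
    n_c = (n_total - sum(n * n for n in sizes) / n_total) / (r - 1)
    p_bar = sum(n * p for n, p in zip(sizes, freqs)) / n_total
    s2 = sum(n * (p - p_bar) ** 2 for n, p in zip(sizes, freqs)) / ((r - 1) * n_bar)
    h_bar = sum(n * h for n, h in zip(sizes, hets)) / n_total
    core = p_bar * (1 - p_bar) - (r - 1) / r * s2
    a = (n_bar / n_c) * (s2 - (core - h_bar / 4) / (n_bar - 1))
    b = (n_bar / (n_bar - 1)) * (core - (2 * n_bar - 1) / (4 * n_bar) * h_bar)
    c = h_bar / 2
    return a, b, c


def wc84_fst(loci):
    """Multi-locus FST: ratio of the summed components across SNPs."""
    comps = [wc84_components(pops) for pops in loci]
    numerator = sum(a for a, _, _ in comps)
    denominator = sum(a + b + c for a, b, c in comps)
    return numerator / denominator
```

With a SNP fixed for alternative alleles in two populations, `wc84_fst` returns 1; with identical genotype distributions the estimate is slightly negative, which is the expected behavior of this estimator. A SNP that is monomorphic across all populations contributes zero to both sums and would need to be excluded if it were the only locus.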

Figure 1

Population structure analysis of the genetic differentiation among Landrace, Erhualian, and Meishan. (A) Population structure among Landrace, Erhualian, and Meishan using STRUCTURE software. (B) Multidimensional scaling plots among Landrace, Erhualian, and Meishan using PLINK software. FST represents the pair-wise FST between any two pig populations.

Identification of DSRs between Landrace and the two Chinese indigenous pig breeds

To characterize the genomic differences among the pig breeds, SNP-based FST estimation was performed to identify DSRs, defined as regions with shared differential selection signals across breeds. Because the two Chinese indigenous pig breeds have a similar origin, Erhualian and Meishan were first merged into one population and compared with Landrace. As a result, a total of 268 DSRs were identified between Landrace and the two Chinese indigenous pig breeds, comprising 237 DSRs on autosomes and 31 DSRs on chromosome X (Supplementary Table S6). The DSR (named region 135) with both the most top SNPs and the highest W value was located on chromosome 7 (Table 2 and Fig. 2A–C). Moreover, 855 unique known genes, including 511 genes from autosomal DSRs and 344 genes from X chromosome DSRs (Table 2 and Supplementary Table S6), were identified in these regions, among them many genes associated with economically important traits, e.g., ESR1 and KIT. Furthermore, gene functional analysis showed that the genes in DSRs between Landrace and the two Chinese indigenous pig breeds were enriched in flavonoid-related metabolic process, pigment biosynthetic process, response to hormone, and regulation of developmental process (Supplementary Table S7).

Table 2 Top 20 percent of autosomal differentially selected regions (DSRs) among Landrace, Erhualian, and Meishan.
Figure 2

Global distribution of FST among Landrace, Erhualian, and Meishan. (A) Global distribution of FST between Landrace and the two Chinese indigenous pig breeds on autosomes 1–18. A representative DSR (region 135 of SSC7) with both the most top SNPs and the highest W value is indicated. (B) W-statistics for DSRs between Landrace and the two Chinese indigenous pig breeds. (C) Smoothed FST showing strong selection signals in region 135 between Landrace and the two Chinese indigenous pig breeds. L-EM represents Landrace versus the two Chinese indigenous pig breeds, while E-M means Erhualian versus Meishan. (D) Global distribution of FST between Erhualian and Meishan on autosomes 1–18.

DSRs associated with female characters between Landrace and the two Chinese indigenous pig breeds

To further investigate DSRs or genes related to female characters, a female-based genome-wide screen was performed between Landrace and the two Chinese indigenous pig breeds. A dataset with MAF ≥ 0.1 containing 175,952 SNPs from autosomes and 8,320 SNPs from chromosome X was used in this analysis. Based on the SNP FST estimates, 241 DSRs on autosomes and 60 DSRs on chromosome X were revealed. Further examination of these DSRs identified 443 unique known genes from autosomes and 327 genes from chromosome X (Supplementary Table S8). The GO terms of genes from these DSRs were involved in chromatin organization, odontogenesis, animal organ development, and reproductive system development (Supplementary Table S9).

Characterization of DSRs between Erhualian and Meishan

The genome-wide distribution of FST between Erhualian and Meishan on autosomes is shown in Fig. 2D. Between these two pig breeds, 256 DSRs were discovered, including 252 DSRs on autosomes and 4 DSRs on chromosome X. Moreover, 341 unique known genes from autosomal DSRs and 6 genes from X chromosome DSRs were identified (Supplementary Table S10). Gene enrichment analysis found that genes in DSRs between Erhualian and Meishan were enriched in sodium-independent organic anion transport, glycerophospholipid metabolic process, and phosphorus- or phospholipid-related metabolic process (Supplementary Table S11).

Discussion

In the present study, we used SLAF-seq to analyze the genetic structure and differentially selected regions among Landrace, Erhualian, and Meishan pigs. By using over 160,000 SNPs developed in this study, our results clearly indicated that there was a closer genetic relationship between Erhualian and Meishan compared with the genetic relationship between Landrace and Erhualian or Meishan. Moreover, 268 DSRs including 855 genes were discovered between Landrace and the two Chinese indigenous pig breeds, while 256 DSRs including 347 genes were found between Erhualian and Meishan. As such, our data provide a comprehensive view of the genome-wide basis for genetic difference among these pig breeds.

As a high-resolution method for genome-wide genotyping, SLAF-seq is considered an enhanced reduced representation sequencing approach and possesses several notable features18, namely accurate genotyping by deep paired-end sequencing, reduced sequencing costs, a pre-design strategy for efficient fragmentation, and a double barcode system for large populations. In this study, a total of 516,733 SLAFs were predicted from the pig reference genome (Sscrofa 10.2) using the RsaI and HaeIII enzyme combination. With a sequencing depth of more than 6× at a cost below $80 per individual, 5,373,997 SNPs were discovered in the Landrace, Erhualian, and Meishan pig populations by SLAF-seq. After filtering, 165,670 SNPs were available for analyzing the genetic difference among these pig populations, corresponding to an average inter-marker distance of approximately 17 Kb and several times the number of SNPs on the Illumina PorcineSNP60 Genotyping BeadChip. Compared with the published pig GGRS studies, which contained 34,789–105,550 usable SNPs depending on the analysis14, 17, our SLAF-seq method might provide more information on genomic variation. More importantly, our data explored the genetic difference between Landrace and the two Chinese pig breeds using a reduced representation sequencing strategy, a comparison that has not been well studied by GGRS. Considering the large difference in genome variation between European and Chinese pig breeds, our data demonstrate that SLAF-seq is a powerful approach with great potential for studies in more breeds. Therefore, the SLAF-seq approach could be considered a more competitive choice for pig genome studies.

It is well known that Landrace was originally developed in Denmark, while Erhualian and Meishan were domesticated in China, implying that Landrace is a geographically distant breed relative to Erhualian and Meishan. As expected, both population structure and cluster analysis showed that Landrace was genetically distant from the two Chinese indigenous pig breeds (Fig. 1). Moreover, pair-wise FST estimation supported a distant genetic relationship between Landrace and the two Chinese indigenous pig breeds (FST = 0.5480 for Erhualian and FST = 0.5800 for Meishan), demonstrating great genetic differentiation between them. These results are consistent with data from other genetic markers such as mitochondrial DNA and microsatellite markers29, 30, which showed long genetic distances between European and Chinese pig breeds. In particular, we found smaller but still significant genetic differentiation (FST = 0.2335) between Erhualian and Meishan. A significant genetic differentiation between Erhualian and Meishan had also been observed using the GGRS method17. Therefore, our data provide further evidence that Erhualian and Meishan should be regarded as two pig breeds, although they were historically classified as the same breed, named Taihu pigs, because both breeds lived in the Taihu Lake region of eastern China. Additionally, higher observed heterozygosity and gene diversity were found in these two Chinese indigenous pig breeds than in Landrace, implying that they have more abundant genetic diversity. This is not surprising because Chinese indigenous breeds have been reported to have more genetic diversity than European pig breeds30. Accordingly, this study provides a global view of the genetic structure and relationships among these three pig populations.

Erhualian and Meishan are two of the most prolific pig breeds known in the world24, 31, 32 and have mainly been used as maternal lines for synthetic strain selection. These two pig breeds have also been used to build specific populations for identifying candidate genes or quantitative trait loci (QTLs) related to reproductive traits, e.g., an Iberian × Meishan F2 population33, Large White × Meishan cross gilts34, and a Duroc × Erhualian resource population35. Compared to Landrace, a major maternal line among European pig breeds, Erhualian and Meishan pigs possess better farrowing capacity, which results in much larger litter sizes36, 37. To further identify candidates related to female reproductive traits, a female-based dataset was also analyzed. A total of 502 genes from the female-based DSRs overlapped with the candidate genes identified from all individuals (Supplementary Tables S6 and S8). Many important genes related to female reproductive traits, such as ESR1, RBBP4, and BRCA1, were discovered in both DSR datasets. In previous studies, the PvuII site polymorphism of ESR1, a key gene mediating estrogen action, had been shown to be associated with litter size in different pig populations, e.g., a Meishan synthetic line38, a Landrace population39, a population of Erhualian and Xiang pig breeds40, and Large White × Meishan F2 crossbred gilts41. However, other studies did not find significant associations of ESR1 with litter size in a synthetic line of Duroc and Large White origin or in a Meishan × Large White F2 population42, 43, indicating that the role of ESR1 in sow prolificacy across pig populations remains controversial. Therefore, our data might provide additional information for interpreting the difference in litter size between Landrace and the two Chinese indigenous pig breeds.
RBBP4, a biomarker of oocyte competence, showed different expression levels between in vivo- and in vitro-matured bovine oocytes44 and could affect bipolar spindle assembly by regulating histone deacetylation during oocyte maturation in mouse45. However, little is known about the function of RBBP4 in the reproductive process of pigs. Our data suggest that RBBP4 might have an important role in sow reproductive performance, but confirmation of this assumption warrants further study. Interestingly, BRCA1, which encodes a key DNA repair protein, was identified in DSRs. Previously, genetic variants of BRCA1 were mainly related to breast or ovarian diseases in human and cattle46, 47. In addition, reduced numbers of primordial follicles and simultaneously increasing DNA double-strand break repair with age were observed in BRCA1-deficient mice compared with wild-type mice, showing that BRCA1 deficiency led to ovarian aging48, 49. Women carrying BRCA1 mutations produced fewer eggs than controls50. These data indicate that BRCA1 might play a critical role in regulating ovarian development and the number of eggs. Hence, our results might provide a novel clue to its role in pig reproductive performance.

Furthermore, the DSR with both the most top SNPs and the highest W value on autosomes was found on Sus scrofa chromosome 7 (SSC7, 0.05 to 0.27 Mb) between Landrace and the two Chinese indigenous pig breeds (Fig. 2, Table 2 and Supplementary Table S6). Interestingly, EXOC2, which encodes a component of the exocyst complex, was found in this region. Previous studies indicated that EXOC2 played a role in filopodia formation by binding the GTPase RalA in Swiss 3T3 cells and mediated vesicle trafficking by interacting with the Rho guanine nucleotide exchange factor (GEF)-H1 in HeLa cells51, 52. Additionally, genetic variants in or adjacent to EXOC2 have been considered to contribute to pigmentary traits such as hair color, skin pigmentation, and tanning ability in Europeans, as well as to vitamin D status in the Caucasian population53,54,55,56,57. Landrace has white hair and skin, while Erhualian and Meishan have colored hair and skin37, indicating a great difference in pigmentary traits between Landrace and the two Chinese indigenous pig breeds. Our data suggest that EXOC2 might be a good candidate for pigmentation traits in pigs. However, the biological mechanism by which EXOC2 regulates human pigmentation has not been well characterized, and little is known about the role of EXOC2 in pigs. Therefore, confirming and elucidating the role of EXOC2 in pig pigmentation traits requires further investigation. Another well-documented pigmentary gene, KIT, was also discovered in DSRs (Supplementary Table S6). Over the past two decades, KIT mutations associated with the dominant white coat color in pigs have been investigated in depth58,59,60. Moreover, Meishan, as a colored breed, has been found to carry KIT alleles distinct from those of white breeds such as Landrace and Large White58, 60. Our data thus provide further evidence for the role of KIT in pigmentary traits.
More interestingly, we discovered some genes associated with growth or meat quality traits in DSRs, e.g., ACSL6, RAPGEF1, BMPER, and KDR (Table 2 and Supplementary Table S6). ACSL6, a member of the long-chain acyl-CoA synthetase gene family, which catalyzes the formation of acyl-CoA from fatty acids, ATP, and CoA, was shown to be associated with dry matter intake and metabolic mid-test body weight in cattle61. RAPGEF1, a guanine nucleotide releasing protein, was found to play an important role in skeletal muscle differentiation62 and was significantly associated with type 2 diabetes in the Korean and Finnish populations63, 64. These studies indicated that ACSL6 and RAPGEF1 might have important functions in body growth or in the development of skeletal muscle and fat. Hence, our results might provide new ideas about their function in pig growth and development. In addition, BMPER and KDR, two key genes in angiogenesis and vasculogenesis65, 66, were identified in DSRs. This is not surprising because our previous studies found that genetic variants in the promoters of BMPER and KDR affected intramuscular fat deposition in Erhualian pigs67, 68. Moreover, BMPER was found to be associated with rump length and body size in a Qinchuan cattle population69. Our data suggest that BMPER and KDR might contribute to the regulation of pig growth and meat quality, but additional studies are needed to confirm this speculation. In summary, our results provide comprehensive clues for discovering the genetic difference between Landrace and the two Chinese indigenous pig breeds.

As described above, Erhualian and Meishan are two famous pig breeds with the typical characteristics of Chinese indigenous breeds, such as high reproductive performance and good meat quality. However, as two different pig breeds, they differ markedly in some characteristics, e.g., head wrinkles and the colors of the limbs and nose37. In this study, genetic differentiation (FST = 0.2335) between Erhualian and Meishan was observed, further confirming that there is a significant genetic difference between these two pig breeds. Until now, however, only variants from single genes such as FUT1 have been used to compare gene frequencies between these two pig breeds70. At the genome-wide level, the genetic difference between these two pig breeds is still largely unknown. Our present study identified 256 DSRs containing 347 genes between Erhualian and Meishan. Importantly, GO terms associated with these genes were enriched in sodium-independent organic anion transport, glycerophospholipid metabolic process, and phospholipid-related metabolic process. Some important candidates were discovered from the enrichment analysis, e.g., SLCO3A1 in sodium-independent organic anion transport; CDS2, CRLS1, and MTMR7 in glycerophospholipid metabolic process; and SPTLC1, SGMS2, and CRLS1 in phospholipid-related metabolic process. The differences in biological processes between these two pig breeds might result from long-term artificial selection based on different breeding objectives, but further investigation of the cause of the breed difference is required.

In summary, a SLAF-seq approach for pigs was developed to reveal the genetic structure and relationship among Landrace, Erhualian, and Meishan pigs. Meanwhile, DSRs including hundreds of genes between Landrace and the two Chinese indigenous pig breeds and between Erhualian and Meishan were identified further. Consequently, our study not only provides a cost-effective approach for pig genome-wide screening, but also establishes the genetic basis for further investigating important gene functions from DSRs among these pig breeds in future research.

Materials and Methods

Animals

Blood or ear tissue samples were collected from 28 Landrace (8 male and 20 female), 26 Erhualian (8 male and 18 female), and 27 Meishan (Small Meishan, 8 male and 19 female) pigs at the Xiang Xin livestock and poultry Co., Ltd (Shanghai), Changzhou Erhualian production cooperation (Jiangsu), and Taicang pig breeding farm (Jiangsu), respectively. Total DNA was extracted by the phenol-chloroform method, and its concentration and purity were measured using a NanoDrop™ 2000 spectrophotometer (Thermo Scientific, Waltham, MA, USA) and electrophoresis. All animal care and handling procedures were approved by the Animal Ethics Committee at Nanjing Agricultural University, China. All methods were performed in accordance with the guidelines and regulations of the Animal Ethics Committee at Nanjing Agricultural University.

SLAF library construction and sequencing

A simulated restriction enzyme digestion was carried out on the current pig genome (Sscrofa 10.2) to estimate the expected SLAF yield, avoid repetitive SLAFs, and obtain a relatively uniform distribution of restriction fragments in the genome. Based on this simulation, genomic DNA was digested with the RsaI and HaeIII restriction enzyme combination. To assess the experimental procedure, Oryza sativa indica (http://rapdb.dna.affrc.go.jp/) was used as a control for evaluating the effectiveness of enzyme digestion and the proportion of paired-end mapped reads. In brief, SLAF library construction and sequencing for each individual were conducted as described previously18 with slight modifications: DNA fragments of 314–364 base pairs (bp) were selected as SLAFs and used for paired-end sequencing on an Illumina HiSeq 2500 system (Illumina, Inc., San Diego, CA, USA) at Beijing Biomarker Technologies Corporation.
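
The simulated double digestion can be sketched in a few lines of Python. The recognition sites used here (RsaI, GT^AC; HaeIII, GG^CC) are the standard ones for these enzymes; everything else, including the function names, the toy sequence, and the decision to apply the size window directly to the raw fragment length, is an assumption made for illustration and does not reproduce the software used in this study.

```python
# Recognition sites and cut offsets: both enzymes cut in the middle of a
# 4-bp palindromic site and leave blunt ends (RsaI GT^AC, HaeIII GG^CC).
ENZYMES = {"RsaI": ("GTAC", 2), "HaeIII": ("GGCC", 2)}


def cut_positions(seq, enzymes=ENZYMES):
    """Return the sorted cut coordinates of all enzymes on an upper-case sequence."""
    cuts = set()
    for site, offset in enzymes.values():
        start = seq.find(site)
        while start != -1:
            cuts.add(start + offset)
            start = seq.find(site, start + 1)  # step by 1 to catch overlapping sites
    return sorted(cuts)


def digest(seq, enzymes=ENZYMES):
    """Return fragments as half-open (start, end) intervals along the sequence."""
    seq = seq.upper()
    bounds = [0] + cut_positions(seq, enzymes) + [len(seq)]
    return [(a, b) for a, b in zip(bounds, bounds[1:]) if b > a]


def select_fragments(fragments, min_len=314, max_len=364):
    """Keep fragments whose length falls inside the size-selection window."""
    return [(a, b) for a, b in fragments if min_len <= b - a <= max_len]


# Hypothetical toy sequence with one RsaI site and one HaeIII site.
toy = "aaGTACttGGCCaa"
toy_fragments = digest(toy)                         # three fragments
toy_selected = select_fragments(toy_fragments, 5, 7)  # toy window, not 314-364 bp
```

In a real genome the first and last fragment of each chromosome have only one enzyme-cut end and would normally be discarded. Whether the 314–364 bp window refers to the genomic insert alone or also includes adapter sequence is not stated here, so the default window in `select_fragments` should be read as a placeholder.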

Genome mapping, SNP calling and filtering

Raw paired-end reads were mapped to the pig reference genome (Sscrofa 10.2) using BWA software71. In general, SLAF groups were formed from reads mapped to the same position. If an accession was only partly digested by the restriction enzymes, reads mapped to the reference genome could overlap two SLAF tags and were assigned to both SLAF tags in that accession. SNP calling was performed with both GATK and samtools72, 73, and a locus was defined as a SNP only if it was called by both packages. High-quality SNPs were filtered using PLINK (v1.07)74. The analysis of Hardy-Weinberg equilibrium (HWE) for SNPs from autosomes and chromosome X was performed with the Pedstats tool and the HWE package for R75, 76, respectively.
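
The two-caller consensus rule amounts to a set intersection. The sketch below assumes that each call has already been reduced to a (chromosome, position, reference allele, alternate allele) tuple; this representation, the function name, and the toy calls are hypothetical, and parsing of the actual GATK and samtools output is deliberately left out.

```python
def consensus_snps(gatk_calls, samtools_calls):
    """Return SNPs reported by both callers, sorted by chromosome (as text) and position."""
    return sorted(set(gatk_calls) & set(samtools_calls))


# Hypothetical toy call sets.
gatk = [("1", 1050, "A", "G"), ("1", 2200, "C", "T"), ("7", 50123, "G", "A")]
samtools = [("1", 1050, "A", "G"), ("7", 50123, "G", "A"), ("7", 60210, "T", "C")]
shared = consensus_snps(gatk, samtools)
```

Including the alleles in the key means that a site where the two callers disagree on the alternate allele is not counted as a consensus SNP; keying on chromosome and position alone would be a more permissive alternative.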

Basic population genetic analysis

Population structure analysis was performed using STRUCTURE with 10,000 iterations under the correlated allele frequency model77, and the results were plotted with DISTRUCT software78. Linkage disequilibrium was computed between each marker pair within each breed separately using Haploview (v4.1)79. Pair-wise FST and gene diversity were calculated using the HIERFSTAT package for R. Observed heterozygosity and the IBS distance matrix were estimated with PLINK (v1.07)74, and multidimensional scaling (MDS) analysis of autosomal SNPs was then carried out in R 3.2.4.
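
As an illustration of the two diversity statistics, the sketch below computes per-SNP observed heterozygosity and gene diversity for a single population. Gene diversity is written here as Nei's sample-size-corrected within-population estimator, which is assumed (not verified against the package source) to correspond to the quantity reported by HIERFSTAT. Genotypes are coded 0/1/2 with no missing data, and the function names are hypothetical.

```python
def observed_heterozygosity(genotypes):
    """Proportion of heterozygous animals at one biallelic SNP."""
    return sum(1 for g in genotypes if g == 1) / len(genotypes)


def gene_diversity(genotypes):
    """Nei's unbiased gene diversity at one biallelic SNP in one population.

    Hs = n / (n - 1) * (1 - sum(p_i^2) - Ho / (2 n)), with n animals,
    allele frequencies p_i, and observed heterozygosity Ho.
    """
    n = len(genotypes)
    p = sum(genotypes) / (2 * n)
    ho = observed_heterozygosity(genotypes)
    return n / (n - 1) * (1 - p * p - (1 - p) ** 2 - ho / (2 * n))


def mean_over_snps(statistic, snps):
    """Average a per-SNP statistic over a list of genotype vectors."""
    return sum(statistic(g) for g in snps) / len(snps)
```

Population-level values such as those in Supplementary Figure S4 would correspond to `mean_over_snps(observed_heterozygosity, snps)` and `mean_over_snps(gene_diversity, snps)` applied to the genotypes of one breed.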

Detection of differentially selected regions (DSRs)

The DSR algorithm was carried out as described previously3, 80 with slight modifications: 1) raw values were ranked; 2) Fisher’s exact test was executed in R 3.2.4 to compare the allele frequencies between Landrace and the two Chinese indigenous pig breeds and between Erhualian and Meishan, and SNPs with P values < 0.05 after Bonferroni correction were considered statistically significant; 3) the FST of each SNP was estimated using the model proposed by Nicholson et al.81 and Flori et al.82, and the significant SNPs with the 0.5% or 5% highest FST values were selected as the top significant SNPs; 4) with a top significant SNP (0.5%) placed at the center of a DSR, adjacent SNPs were added to determine the region boundaries until more than two consecutive SNPs fell outside the top significant 5% threshold. Overlapping DSRs were merged into a single DSR. A region that contained more than five top significant SNPs (5%) but no top significant SNPs (0.5%) was also considered a DSR in our study. In addition, the W-statistic was used to identify the relative importance of DSRs83, and smoothed FST values were obtained with a local variable bandwidth kernel estimator84.
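
The region-extension and merging part of step 4 can be sketched as follows. The example assumes that SNPs on one chromosome are given in physical order as two parallel lists of booleans (membership in the top 5% and in the top 0.5% sets), and that region boundaries are trimmed to the outermost top-5% SNP; both are assumptions of this illustration, as are the function names. The additional rule for regions with more than five top-5% SNPs but no 0.5% seed is omitted for brevity.

```python
def _extend(top5, start, step, max_gap):
    """Walk from a seed in one direction; return the last top-5% index reached."""
    edge, gap, j = start, 0, start + step
    while 0 <= j < len(top5):
        if top5[j]:
            edge, gap = j, 0
        else:
            gap += 1
            if gap > max_gap:  # more than `max_gap` consecutive SNPs outside the top 5%
                break
        j += step
    return edge


def find_dsrs(top5, top05, max_gap=2):
    """Return merged candidate regions as (first_index, last_index) pairs.

    Each top-0.5% SNP seeds a region that is extended in both directions
    until more than `max_gap` consecutive SNPs fall outside the top 5% set;
    overlapping regions are then combined.
    """
    regions = sorted(
        (_extend(top5, i, -1, max_gap), _extend(top5, i, +1, max_gap))
        for i, is_seed in enumerate(top05)
        if is_seed
    )
    merged = []
    for left, right in regions:
        if merged and left <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], right))
        else:
            merged.append((left, right))
    return merged
```

The returned indices refer to positions in the ordered SNP list; mapping them back to base-pair coordinates gives the physical boundaries of each candidate region.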

Gene functional enrichment analysis

Genes in these DSRs were identified using the Sscrofa 10.2 assembly (www.animalgenome.org/cgi-bin/gbrowse/pig) and NCBI database. The functional enrichment analysis of target genes was performed using Panther bioinformatics resources (www.pantherdb.org). Terms with P values less than 0.01 and the enrichment value more than 1 were selected as significant or enriched terms.

Data accessibility

The datasets have been submitted to the SRA database of NCBI (accession number SRP090907).