Introduction

Copy number variations (CNVs) are defined as deletions, insertions, and duplications ranging from 50 base pairs (bp) to 5 million base pairs (Mbp) of genomic sequence between two individuals of a species1,2,3,4. Previous studies suggest that CNVs have potentially larger effects than other types of variation such as single nucleotide polymorphisms (SNPs), including changing gene structure and dosage, altering gene regulation and exposing recessive alleles5,6,7. CNV discovery studies have been extensively reported in human8, primates9, mouse10,11,12, zebrafish13, dog14,15,16, and livestock, including chicken17,18, pig19,20, sheep21,22,23, goat24 and cattle25,26,27,28,29.

In recent years, many studies have revealed that genomic diversity can be generated by the differential selection of CNVs during adaptation to specific environments30,31,32,33. In humans, positive selection for a higher AMY1 copy number enables better digestion of starchy foods34. A change in CCL3L1 copy number is associated with markedly enhanced HIV/acquired immunodeficiency syndrome (AIDS) susceptibility35. The human UGT2B17 gene shows significant copy-number diversity, and displays region-specific differences for metabolism in multiple populations36. Also, olfactory receptor (OR) genes, whose copy numbers vary among most mammalian species, were found to be associated with population-specific differences in smell37. CNVs are specifically enriched among evolutionarily “young” ORs, implying that CNVs may play an essential role in the origin of a gene or the emergence of new OR gene clusters38.

Domesticated cattle are one of the most economically important farm animals. The exploration of genetic diversity, conservation, selection and evolution of genomic variants in cattle has attracted much attention in past decades39. Specifically, investigation of the population-genetic properties and selection patterns of CNVs has gradually become an emerging research topic for farm animals. For instance, Xu et al. investigated the population-genetic properties of differentiated CNVs using a high-density SNP array among European taurine, African taurine, and indicine groups, and provided a list of lineage-differentiated CNVs, which were involved in traits related to parasite resistance, immune response, body size, fertility, and milk production40. Bickhart et al. further explored the diversity and population genetics of both taurine and indicine cattle based on CNVs using next generation sequencing and showed that hundreds of copy number variable genes were breed-specific31. Although a few studies have been carried out to investigate CNVs in Chinese cattle41,42,43, genome-wide CNV landscapes and their population-genetic properties in Chinese cattle adapted to specific local environments remain largely unknown.

In this study, we performed a genome-wide CNV analysis using a high-density SNP array in diverse Chinese cattle populations. The objectives of this study were to (1) generate a comprehensive CNV landscape in Chinese cattle populations; (2) investigate and compare the diversity and population-genetic properties of CNVs; and (3) explore the diverse selection patterns involving CNV genes for local adaptation in Chinese native cattle.

Results

CNVs identification

We performed a genome-wide CNV analysis using the Illumina Bovine HD SNP array in 188 individuals from eight Chinese cattle populations (Fig. 1). After filtering by call rate and genetic relationship, a total of 167 individuals remained. Autosomal CNVs were identified following standard analysis procedures using PennCNV based on the taurine reference assembly (UMD3.1). After CNV detection, low-quality samples were filtered out, and a total of 157 individuals were finally used for downstream analyses. Our study detected a total of 13,325 CNVs with an average length of 61.2 kb across all individuals (Table 1). These CNVs were merged into 3,356 copy number variant regions (CNVRs), covering 148.0 Mb (average of 44.1 kb) of polymorphic sequence and corresponding to 5.81% of the autosomal genome (148/2,545.9) and 5.07% (148/2,918.0) of the whole genome (Fig. 2 and Supplementary File 1: Table S1). Among them, 2,124 were “unique” (present in only one individual). By type, 1,278 were “gain” (38.1%), 1,748 were “loss” (52.1%) and 330 were “both” (9.8%). Notably, loss CNVRs were approximately 1.4-fold more numerous than gain CNVRs and 5.3-fold more numerous than “both” CNVRs; however, the average length of “both” CNVRs (176.1 kb) was larger than that of “loss” (32.3 kb) and “gain” (26.2 kb) CNVRs.
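The proportions quoted in this paragraph can be re-derived from the reported counts. The following Python check is only an illustration: the variable names are hypothetical, and every input is a figure stated above.

```python
# Counts and lengths as reported in the text (CNVRs by type, lengths in Mb).
cnvr_counts = {"gain": 1278, "loss": 1748, "both": 330}
covered_mb = 148.0      # total length covered by CNVRs
autosome_mb = 2545.9    # autosomal genome length used in the text
genome_mb = 2918.0      # whole-genome length used in the text

total = sum(cnvr_counts.values())

# Share of each CNVR type, in percent, rounded to one decimal place.
shares = {k: round(100 * v / total, 1) for k, v in cnvr_counts.items()}

# Genome coverage, in percent.
autosome_pct = round(100 * covered_mb / autosome_mb, 2)
genome_pct = round(100 * covered_mb / genome_mb, 2)

# Mean CNVR length in kb and the fold differences between types.
mean_kb = round(1000 * covered_mb / total, 1)
loss_vs_gain = round(cnvr_counts["loss"] / cnvr_counts["gain"], 1)
loss_vs_both = round(cnvr_counts["loss"] / cnvr_counts["both"], 1)

print(total, shares, autosome_pct, genome_pct, mean_kb, loss_vs_gain, loss_vs_both)
```

The three type counts sum to the 3,356 CNVRs reported, and the derived percentages and fold differences agree with the values in the text.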

Figure 1

Geographic distribution of four groups from eight populations in China. YHC, MGC, CDM, PWC, LSC, ZTC, WSC and NDC are abbreviations for Yanhuang, Monggu, Caidamu, Pingwu, Liangshan, Zhaotong, Wenshan and Nandan, respectively. The number of individuals sampled for each group is shown beside the figure legend. The distribution of cattle was summarized and visualized with the R packages maps, mapproj, maptools and ggplot2.

Table 1 CNV events and CNVR detected in four groups (North, Northwest, Southwest and South groups).
Figure 2

Genomic distribution and status of detected CNVRs in 157 cattle. Red, green, and blue lines represent the predicted statuses of gain, loss, and both, respectively. A total of 13,325 CNVs were merged into 3,356 copy number variant regions (CNVRs), covering 148.0 Mb (average of 44.1 kb) of polymorphic sequence. Among them, 2,124 “unique” (only present in one individual), 1,278 “gain” (account for 38.1%), 1,748 “loss” (52.1%) and 330 “both” (9.8%) were identified in our analysis.

Next, we divided the 157 individuals from eight populations into 4 groups, North (n = 35), Northwest (n = 17), Southwest (n = 63) and South (n = 42), based on the results of a multidimensional scaling (MDS) analysis of SNPs (Figure S1). The CNV events among individuals within each group were merged into group-specific CNVRs. We observed 969, 544, 1,880 and 1,401 CNVRs in the North, Northwest, Southwest and South groups, with average lengths of 62.8, 51.1, 45.0 and 44.6 kb, respectively (Supplementary File 1: Table S1). To display the distribution of CNVRs across the genome, we removed CNVRs supported by only one CNV event, because such single-event regions are more likely to be false positives; 1,232 CNVRs remained. Based on these 1,232 CNVRs, we generated a Circos plot to visualize CNVR landscapes across groups (Fig. 3).

Figure 3

Circos plot illustrating CNV regions in 4 groups of cattle. The circles from outside to inside represent frequency of CNV event in each CNVR from North (A), Northwest (B), South (C) and Southwest (D) group respectively.

CNVR annotation and enrichment analysis

To explore the potential functions of CNV genes in Chinese cattle, we performed functional annotation and GO enrichment analysis of CNV genes using PANTHER. A total of 1,400 unique genes overlapped with the 3,356 CNVRs, and these genes were used for the PANTHER enrichment analysis. We found the identified CNV regions were mainly enriched in fatty acid beta-oxidation, catabolic process, transport, localization, system process, neurological system process and G-protein coupled receptor signaling pathway sensory perception (Supplementary File 2: Table S2). We also carried out a gene feature analysis using the detected CNV genes, and observed a total of 891 CNVRs and 679 CNVRs overlapping with 1,320 CDSs (coding sequences) and 1,189 exons, respectively.

Selection signatures of CNVs

To investigate group-specific selection involving CNV events, we proposed V i, a new statistic modified from d i , to characterize group differentiation of CNVs. V i is defined as a function of unbiased estimates of all pairwise V ST values between one group and the remaining groups, which makes it suitable for detecting selection on CNVs that is specific to a particular group. In this study, we performed a genome-wide scan for differentiation using V i in four groups (North, Northwest, Southwest and South). We then generated Manhattan plots of the V i distribution for the four groups, as shown in Fig. 4 (summary statistics are presented in Supplementary File 3: Table S3). After removing CNVRs with only one CNV event, 1,232 CNVRs remained. For each group, we defined candidate regions under selection using two thresholds: the top 1% or top 5% of CNVRs with the highest V i values in the empirical distribution. We finally obtained 12 and 62 candidate CNVRs at the top 1% and top 5% for each group. The V i thresholds for the top 1% were 6.70, 7.19, 6.07 and 9.25 in the North, Northwest, Southwest and South groups, while the thresholds for the top 5% were 3.80, 4.62, 3.32 and 4.95, respectively.
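The empirical-threshold step can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the authors' code: the function name `top_fraction` is hypothetical, and the rule of rounding the fraction of CNVRs to the nearest integer is an assumption (it yields 12 and 62 for 1,232 CNVRs, consistent with the counts quoted above).

```python
def top_fraction(scores, fraction):
    """Select the highest-scoring fraction of CNVRs for one group.

    scores: dict mapping CNVR id -> V_i value for that group.
    Returns (threshold, selected_ids), where threshold is the lowest
    V_i value among the selected CNVRs.
    """
    # Assumption: the number of candidates is the rounded fraction.
    n_keep = max(1, round(len(scores) * fraction))
    ranked = sorted(scores, key=scores.get, reverse=True)
    selected = ranked[:n_keep]
    return scores[selected[-1]], selected


# Toy input: 1,232 CNVRs with made-up scores 0.0, 1.0, ..., 1231.0.
toy_scores = {f"CNVR{k}": float(k) for k in range(1232)}
thr_1, top_1 = top_fraction(toy_scores, 0.01)
thr_5, top_5 = top_fraction(toy_scores, 0.05)
print(len(top_1), len(top_5))
```

Because the thresholds are taken from each group's own empirical distribution, the cut-off V i value differs between groups even though the number of candidates per group is the same.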

Figure 4

Genome-wide distribution of V i values among North, Northwest, Southwest and South groups. V i is defined as a function of unbiased estimates of all pairwise V ST values between one group and the remaining groups within a population, which makes it suitable for detecting selection on CNVs that is specific to a particular group. The distribution of V i for each CNVR across all autosomes is shown for each group. Alternating colors indicate V i values from adjacent chromosomes.

To identify shared and group-specific CNVRs under potential selection, we next generated a Venn diagram based on these CNVRs (Fig. 5). At the top 1% threshold, 5, 4, 2 and 6 group-specific CNVRs were found in the North, South, Northwest and Southwest groups, respectively, while no shared CNVR was found. When we relaxed the threshold from the top 1% to the top 5%, 9 CNVRs were shared by all 4 groups and 82 CNVRs were group-specific: 24 in the North, 24 in the South, 25 in the Northwest and 9 in the Southwest group. To further estimate the divergence of CNVs across groups for the identified candidate CNVRs, we extracted the LRR values for each shared and group-specific CNVR in the four groups. Notably, we observed clear differences in average LRR for each group-specific CNVR; box-plots of the distributions of average LRR in the identified CNVRs across the cattle groups are shown in Fig. 6.
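The Venn-style partition described here reduces to set operations on the per-group candidate lists. The Python sketch below illustrates the idea only; the function name and the toy CNVR identifiers are hypothetical and do not come from the study.

```python
def partition_candidates(candidates):
    """Split candidate CNVRs into those shared by all groups and those
    found in exactly one group.

    candidates: dict mapping group name -> set of candidate CNVR ids.
    Returns (shared_by_all, specific), where specific maps each group
    to the CNVR ids that appear in no other group.
    """
    groups = list(candidates)
    shared_by_all = set.intersection(*(candidates[g] for g in groups))
    specific = {}
    for g in groups:
        others = set().union(*(candidates[o] for o in groups if o != g))
        specific[g] = candidates[g] - others
    return shared_by_all, specific


# Toy example with made-up identifiers.
toy = {
    "North": {"c1", "c2", "c9"},
    "Northwest": {"c3", "c9"},
    "Southwest": {"c2", "c4", "c9"},
    "South": {"c5", "c9"},
}
shared, specific = partition_candidates(toy)
print(shared, specific)
```

CNVRs present in two or three groups (such as `c2` in the toy input) fall into neither category, which is why the shared and group-specific counts need not add up to the total number of candidates.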

Figure 5

Discovery of shared and group-specific CNVRs among 4 groups. The Venn diagram shows the number of overlapping CNVRs in the North, Northwest, Southwest and South groups. The top 1% (n = 12) and top 5% (n = 60) of CNVRs among the four groups are displayed on the left and right, respectively.

Figure 6

Box-plot of average LRR values for group-specific CNVRs across four groups. (A) North-specific group, (B) Northwest-specific group, (C) Southwest-specific group, (D) South-specific group.

Genes in shared and group-specific CNVRs

To investigate lineage-differentiated CNV genes in Chinese cattle, we also conducted a CNVR-based population differentiation analysis and identified a set of potential CNV genes under divergent selection (Table 2). In the shared CNVRs, we detected 4 genes within 3 CNVRs. CNVR208 on BTA1 (14.6 kb) and CNVR2872 on BTA22 (15.0 kb) overlapped the genes CBR1 and TKT, respectively, and CNVR2451 on BTA17 (71.4 kb) spanned two genes, LOC100297192 and IGLL1.

Table 2 Summary statistics of shared and group-specific CNVRs, including the position of CNVR, estimated V i value for each group and corresponding CNV genes.

In this study, a total of 30 group-specific CNVRs overlapped with 46 annotated genes across the 4 groups (Table 2). In the North group, we obtained 10 CNVRs covering 15 genes, which were mainly associated with metabolic process (GUCY1A2, RAVER1, EBD, DEFB1 and DEFB5) and response to stimulus (GUCY1A2, EBD, DEFB1, DEFB5 and ABCC5). Moreover, TMC6 in CNVR2614 (V i = 4.42), which is related to the immune system, COL27A1 within CNVR1360 (V i = 4.44) and BOC within CNVR114 (V i = 4.97) were identified as North-specific candidates. In addition, we detected several genes involved in calcium ion binding (EFHB), phosphatase activity (DUSP27), and GTP binding and phosphorus-oxygen lyase activity (GUCY1A2). In the Northwest group, a total of 15 genes were obtained in 12 CNVRs; most of these genes were related to metabolic process (KIN, RPL7A, SATB1, ABO and FMOD), cellular process (KIN, GLRB, SATB1 and FMOD), response to stimulus (KIN and GLRB) and developmental process (EPHA3 and FMOD). MRVI1 overlapped with both CNVR2175 and CNVR2176, with V i = 6.40 and V i = 7.25. For the Southwest group, we detected 8 genes in 4 group-specific CNVRs. The genes FAT2 and SLC36A1 overlapped with CNVR1171 (V i = 3.78), and CNVR3341 overlapped with four genes: LRFN4, MIR2408, PC and RCE1. We also obtained CNVR990 and CNVR2517, which overlapped with the genes KIT and ZNF331, respectively. KIT was identified in CNVR990 (V i = 3.40) in the Southwest group, and this CNVR also showed a high V i value (4.49) in the South group. We found 4 group-specific CNVRs with 8 genes in the South group; most of these genes were involved in cellular process, biological regulation and response to stimulus. CNVR2181, containing four genes, showed a relatively high value (V i = 7.71); CNVR190 (V i = 5.14) and CNVR2618 (V i = 6.63) overlapped with the genes KPNA6 and ATP5H, while CNVR507 overlapped with two genes (ESPNL and SCLY) (Table 2).

Discussion

In this study, we performed a genome-wide CNV scan using a high-density SNP array in Chinese cattle. Recently, many CNV discovery studies have been reported for various cattle populations using aCGH, SNP arrays and next generation sequencing25,28,31,39,44,45,46,47,48. For instance, some previous studies carried out CNV analysis in world-wide cattle, including taurine, indicine, and mixed cattle populations from Africa, using both the BovineSNP50 and BovineHD SNP arrays26,44. In addition, other studies have conducted CNV analyses for local cattle populations, including South African Nguni cattle49, Brazilian Nelore cattle50,51, and Hanwoo and Japanese Black cattle52,53. For Chinese native cattle, only a few studies have been carried out, using aCGH and low-density SNP arrays in limited populations41,42,43. Therefore, a comprehensive investigation of CNVs and their population-genetic properties in diverse Chinese cattle is needed. In the current study, CNVs in CDM, WSC, ZTC and NDC were explored for the first time, and the newly discovered CNVs in specific local populations could offer important molecular resources and may further help to elucidate the selection mechanisms of CNVs and their role in genomic adaptation.

In the current study, we detected 13,325 CNV events and 3,356 CNVRs across the cattle genome, of which 90.33 Mb were newly detected as compared to the cattle HapMap samples31,44. Compared with previous CNV studies in Chinese cattle, 11.38 Mb of CNVRs overlapped with previous reports and 136.72 Mb were newly discovered in the present study41,42,43. In addition, 54.76 Mb of CNVRs overlapped with those in a recent publication on Nelore cattle, and 93.34 Mb were specifically detected in the current study54. In total, we detected 4,534 gain and 8,791 loss events across cattle populations in China. These results show that deletions are more numerous than duplications, which is generally observed in human55, mouse11, dog15,56, and cattle25,44,45. This finding may indicate that deletions are more easily detected by PennCNV on the BovineHD SNP array44. More CNV events were detected in the South and Southwest groups (95.21 and 86.86 per sample) than in the North (74.29 per sample) and Northwest (73.76 per sample) groups, and a previous study also detected more CNV events in indicine than in taurine breeds26. This may be because the North group was more influenced by B. taurus and the South group was more influenced by B. indicus, while the cattle in the central and lower areas of the Yellow River and the Huaihe River appear to be a mixture of B. taurus and B. indicus types57,58. This finding may imply subspecies divergence of genome structures in Chinese cattle. In addition, some of these differences could be related to the fact that SNP markers on the BovineHD array were designed based on the UMD3.1 reference genome, which may cause bias in CNV calling in indicine-derived cattle populations45.

Diversity and selection aspects of CNVs have been extensively explored in world-wide cattle populations using aCGH, SNP arrays and next generation sequencing, demonstrating that CNVs display breed-specific differences and may be associated with adaptation, health, and production traits25,31,39,40,46. To our knowledge, our study is a first attempt to explore CNV properties using a high-density SNP array in Chinese cattle across a broad latitudinal range. We suspect that CNVs are important genomic variants under distinct selection pressures, and that CNVs can contribute to diverse morphology in cattle sampled from different geographic regions with various environmental factors (e.g., climate, temperature, altitude, rainfall, and food source). Our study explored lineage-differentiated CNVs in Chinese cattle and identified several potential CNV candidates under divergent selection for adaptation to local environmental conditions. CNV genes may have distinct functional roles and be subject to different evolutionary pressures. Interestingly, for shared CNVRs with high V i values across the four groups (North, Northwest, Southwest and South), we identified several functionally important genes related to immunity and metabolism, such as IGLL1, CBR1, and TKT. Copy number changes in IGLL1 have previously been detected in the cattle genome40,44; this gene has been shown to be differentially expressed in hepatic and mammary tissue in dairy cows59,60 and was associated with resistance to gastrointestinal nematodes in Angus cattle61. The expression of CBR1 was found to be associated with oxidative stress in bovine embryos62, and this gene has been reported to be associated with the growth trait weaning gain in Bos indicus 51. Previous studies revealed that TKT is involved in fatty acid synthesis and storage in muscle, and is also correlated with intramuscular fat in both cattle and sheep63. The shared CNVs across cattle groups could suggest parallel selection on CNVs during the evolution of cattle genomes, as was also reported in three-spined stickleback64. The existence of shared CNVs indicates that their potential function may contribute to parallel adaptive evolution within multiple natural populations in diverse species.

Moreover, our study revealed several group-specific CNV genes, which may play significant roles in diverse morphology and adaptation to local environmental conditions. In the North group, we identified several candidate genes related to calcification of cartilage and the immune system in MGC and YHC. TMC6, an immune-related gene in CNVR2614 (V i = 4.42), was associated with milk somatic cell score in dairy cattle, which may imply potential selection on this gene related to milk quality in cold environmental conditions65. COL27A1 within CNVR1360 (V i = 4.44) may play a crucial role in cartilage calcification and average daily gain66,67. Our results indicate these CNV genes could enhance development of bone and muscle, cold tolerance and disease resistance in North group cattle, which live in the relatively cold environment of northern China.

For the Northwest group, CDM cattle live on the Qinghai-Tibet plateau in northwestern China, an arid environment characterized by dryness, hypoxia and low air pressure. Specifically, we observed several genes, MRVI1, ABO, GLRB and EPHA3, related to the nervous system, platelet reactivity, parasite resistance and histoblood group antigens. MRVI1 overlapped with both CNVR2175 and CNVR2176, and several variants in MRVI1 have been reported to be associated with platelet count, mean platelet volume and platelet reactivity68. In the current study, ABO, embedded within CNVR1759 (V i = 4.90), is related to A and B histoblood group antigens69, and copy number changes have been identified within ABO in recent studies by Hou et al.26,61. ABO has also been associated with parasite resistance and susceptibility to gastrointestinal nematodes in Angus cattle61. Moreover, a previous study suggested that one CNVR containing ABO, SURF6 and RPL7A was significantly associated with milk somatic cell score in Holstein cattle70. GLRB, which was identified as a candidate in CDM cattle on BTA17, has been associated with flight speed in beef steers, and the GLRB protein is a ligand-gated ion-channel subunit found throughout the central nervous system71,72.

In the Southwest group (PWC, LSC and ZTC), we identified several genes, KIT, FAT2, SLC36A1, ZNF331, LRFN4 and RCE1, in four Southwest-specific CNVRs. Notably, KIT is a tyrosine kinase receptor, and normal KIT signaling is required for development and survival of neural crest-derived melanoblasts73. Genetic variation in the KIT gene has been shown to affect coat color patterns in a variety of mammals74,75,76,77. Rubin et al. suggested that the white and white-spotted phenotypes in pigs are caused by at least two of the four KIT duplications, and that the porcine KIT locus also illustrates the evolution of alleles under strong positive selection78. Our results suggest the identified genes may facilitate the adaptation of cattle living in the mountainous areas of southwestern China.

This study provides a comprehensive investigation of CNV properties in diverse Chinese cattle, and the newly identified CNVs add to the genetic resources available for the global cattle population. Our results further suggest that lineage-differentiated CNVs may be under divergent selection for adaptation to local environmental conditions. Hybridization array studies may generate both false positive and false negative results, regardless of how the data are analyzed for CNV discovery79. Many studies recommend using multiple CNV calling algorithms instead of just one80; however, although this strategy decreases the false negative rate, it also increases the false positive rate81. With advances in next generation sequencing projects, such as the 1000 Human Genomes project2 and the 1000 Bull Genomes project29, we should be able to better estimate the false positive and false negative rates with better CNV calling standards. Therefore, careful experimental design and rigorous data filtering are required to reveal the impacts of CNVs on both phenotypic variability and diverse selection. Future CNV studies utilizing next-generation sequencing (NGS) and complementary analysis programs will help precisely define CNV structure and elucidate its function.

Materials and Methods

Ethics Statement

All of the animal experiments were approved by the Chinese Academy of Agricultural Sciences (CAAS, Beijing, China). All of the animal procedures were performed in strict accordance with the guidelines proposed by the China Council on Animal Care and the Ministry of Agriculture of People’s Republic of China.

Sample selection

Blood samples were obtained from 188 individuals representing 8 different cattle breeds across a broad latitudinal range in China. These include Menggu cattle (MGC), Yanhuang cattle (YHC), Caidamu cattle (CDM), Pingwu cattle (PWC), Liangshan cattle (LSC), Zhaotong cattle (ZTC), Wenshan cattle (WSC), and Nandan cattle (NDC). Genomic DNA was extracted from blood samples using the TIANamp Blood DNA Kit (Tiangen Biotech Co. Ltd), and DNA with an A260/280 ratio between 1.8 and 2.0 was subject to further analysis. In this study, we divided the 8 breeds into 4 groups based on geographical location (North group, Northwest group, Southwest group and South group) (Fig. 1). The genotyping platform adopted in this study was Illumina’s Infinium II Multi-Sample Assay. SNP chips were scanned using iScan and analyzed using Illumina’s GenomeStudio 2011. After filtering by the call rate of each animal (threshold > 95%), the final data, including Log R Ratio (LRR) and B Allele Frequency (BAF), were exported from the GenomeStudio software. To avoid bias in population genetic estimation, genetic relationships between pairs of individuals were estimated using the PI_HAT value implemented in PLINK v1.07 (ref. 82), and unrelated individuals with pairwise PI_HAT < 0.25 were kept for subsequent analyses.
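The two sample filters described above (call rate > 95% and pairwise PI_HAT < 0.25) can be sketched in Python as follows. This is a hedged illustration rather than the pipeline used in the study: the function name and input layout are hypothetical, and the rule for deciding which animal of a related pair to drop (the one with the lower call rate) is an assumption, since the text does not specify it.

```python
def filter_samples(call_rate, pi_hat, min_call_rate=0.95, max_pi_hat=0.25):
    """Return the sorted ids of samples that pass both filters.

    call_rate: dict mapping sample id -> call rate in [0, 1].
    pi_hat: dict mapping (sample_a, sample_b) -> pairwise PI_HAT.
    """
    # Step 1: keep animals whose call rate exceeds the threshold.
    kept = {s for s, rate in call_rate.items() if rate > min_call_rate}

    # Step 2: visit related pairs in a fixed order for reproducibility.
    related = sorted(pair for pair, value in pi_hat.items() if value >= max_pi_hat)
    for a, b in related:
        if a in kept and b in kept:
            # Assumed tie-break: drop the animal with the lower call rate.
            kept.discard(a if call_rate[a] < call_rate[b] else b)
    return sorted(kept)


# Toy example with made-up values.
toy_call_rate = {"s1": 0.99, "s2": 0.97, "s3": 0.94, "s4": 0.98}
toy_pi_hat = {("s1", "s2"): 0.50, ("s1", "s4"): 0.05, ("s2", "s4"): 0.10}
retained = filter_samples(toy_call_rate, toy_pi_hat)
print(retained)
```

In the toy input, `s3` fails the call-rate filter and `s2` is removed as the lower-quality member of a related pair.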

CNVs detection

In this study, we utilized the PennCNV v1.0.3 software to detect CNVs across autosomes in Chinese cattle populations83. chrX and chrUn were not considered here due to mapping uncertainty, as described previously by Hou et al.26. The PennCNV algorithm incorporates LRR and BAF, which denote the normalized intensity ratio for each SNP and the frequency of allele B, respectively. The population frequency of B allele (PFB) file was calculated based on the BAF of each marker across populations. The gcmodel file was generated by calculating the GC content of the 1 Mbp genomic region surrounding each marker (500 kb on each side). The final CNVs were obtained after filtering out low-quality samples with the following thresholds: standard deviation (SD) of LRR of 0.35, BAF drift of 0.01 and waviness factor of 0.05.
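The sample-level quality filter can be expressed compactly. The Python sketch below is an illustration under assumptions: the dictionary keys are hypothetical names for the three per-sample metrics, the comparisons are assumed to be inclusive, and the waviness factor is compared by absolute value because it can be negative; none of these details are stated in the text.

```python
def passes_qc(sample, max_lrr_sd=0.35, max_baf_drift=0.01, max_wf=0.05):
    """Return True if a sample meets all three quality thresholds.

    sample: dict with keys "lrr_sd", "baf_drift" and "wf" (hypothetical
    names for the SD of LRR, BAF drift and waviness factor).
    """
    return (
        sample["lrr_sd"] <= max_lrr_sd
        and sample["baf_drift"] <= max_baf_drift
        and abs(sample["wf"]) <= max_wf  # assumed: sign of WF is ignored
    )


# Toy example with made-up metrics.
toy_samples = {
    "good": {"lrr_sd": 0.20, "baf_drift": 0.001, "wf": -0.02},
    "noisy": {"lrr_sd": 0.40, "baf_drift": 0.001, "wf": 0.01},
    "wavy": {"lrr_sd": 0.25, "baf_drift": 0.002, "wf": -0.08},
}
qc_pass = sorted(name for name, s in toy_samples.items() if passes_qc(s))
print(qc_pass)
```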

CNVR compilation

CNV regions (CNVRs) were produced by aggregating overlapping CNVs (by at least 1 bp) across samples using BEDTools v2.26.0 (ref. 84). The CNVRs were classified as “gain”, “loss”, or “both” events. Overlapping “loss” and “gain” CNVRs were merged into single regions to account for “both” events. To facilitate the comparison of CNV patterns among the diverse groups, CNVRs were also generated for each group. The frequency landscape of each group-specific CNVR was visualized using the Circos software85.
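The merging rule can be sketched as a sort-and-sweep over intervals. The pure-Python version below is a stand-in for illustration only and is not the BEDTools command used in the study; the function names, the tuple layout and the use of 1-based inclusive coordinates are all assumptions.

```python
from collections import defaultdict


def merge_cnvs(cnvs):
    """Merge CNV calls that share at least 1 bp into CNVRs.

    cnvs: iterable of (chrom, start, end, state), with 1-based inclusive
    coordinates and state either "gain" or "loss".
    Returns a list of (chrom, start, end, status, n_events).
    """
    by_chrom = defaultdict(list)
    for chrom, start, end, state in cnvs:
        by_chrom[chrom].append((start, end, state))

    cnvrs = []
    for chrom in sorted(by_chrom):
        current = None  # [start, end, set_of_states, n_events]
        for start, end, state in sorted(by_chrom[chrom]):
            if current is not None and start <= current[1]:
                # Overlaps the open region: extend it and record the state.
                current[1] = max(current[1], end)
                current[2].add(state)
                current[3] += 1
            else:
                if current is not None:
                    cnvrs.append(_close(chrom, current))
                current = [start, end, {state}, 1]
        if current is not None:
            cnvrs.append(_close(chrom, current))
    return cnvrs


def _close(chrom, region):
    """Label a finished region as gain, loss, or both."""
    start, end, states, n_events = region
    status = "both" if len(states) == 2 else next(iter(states))
    return (chrom, start, end, status, n_events)


# Toy example with made-up calls.
toy_calls = [
    ("chr1", 100, 200, "loss"),
    ("chr1", 150, 300, "gain"),
    ("chr1", 301, 400, "loss"),
    ("chr2", 10, 20, "gain"),
    ("chr2", 20, 30, "gain"),
]
merged = merge_cnvs(toy_calls)
print(merged)
```

The `n_events` field makes it straightforward to drop regions supported by a single CNV event, as was done before the Circos and V i analyses.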

Function annotation

To elucidate the functional aspects of the identified CNVs in the cattle genome, the gene content of cattle CNV regions was assessed using the RefGene track of the UCSC genome browser. Annotation of genes and gene feature analysis (identifying the coding sequences and exons of genes present within CNVs) were performed using the scan_region.pl script from the PennCNV v1.0.3 package. The PANTHER (Protein ANalysis THrough Evolutionary Relationships) classification system (http://www.pantherdb.org/) was used to explore functional ontology categories as described previously86. We tested the hypothesis that the PANTHER molecular function, biological process and pathway terms were under- or over-represented in CNV regions after Bonferroni correction. Only GO terms with P-value < 0.05 after the Bonferroni correction were considered.
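The Bonferroni step multiplies each raw P-value by the number of terms tested and caps the result at 1. The Python sketch below illustrates this with made-up term names and P-values; the function name is hypothetical, and the sketch does not reproduce PANTHER's own enrichment test.

```python
def bonferroni_significant(p_values, alpha=0.05):
    """Return terms whose Bonferroni-adjusted P-value is below alpha.

    p_values: dict mapping term name -> raw P-value.
    Returns a dict mapping each retained term -> adjusted P-value.
    """
    m = len(p_values)  # number of tests performed
    adjusted = {term: min(1.0, p * m) for term, p in p_values.items()}
    return {term: p for term, p in adjusted.items() if p < alpha}


# Toy example with made-up P-values.
toy_p = {"term_a": 0.001, "term_b": 0.02, "term_c": 0.5}
significant = bonferroni_significant(toy_p)
print(significant)
```

With three tests, a raw P-value of 0.02 becomes 0.06 after adjustment and is no longer retained at the 0.05 level.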

Signatures of selection

To detect the group-specific CNV events, we proposed a statistic named V i to estimate the region-specific divergence in CNVR for each group based on unbiased estimates of pairwise V ST , which estimates population differentiation based on average LRR values across all probes falling within a specific CNV region87.

For each CNVR, we calculated the statistic \({V}_{i}={\sum }_{j\ne i}\frac{{V}_{ST}^{ij}-E[{V}_{ST}^{ij}]}{sd[{V}_{ST}^{ij}]}\), where \(E[{V}_{ST}^{ij}]\) and \(sd[{V}_{ST}^{ij}]\) denote the expected value and standard deviation of V ST between groups i and j calculated from the average LRR values across all CNVRs87. V ST is calculated as \({V}_{ST}=({V}_{T}-{V}_{S})/{V}_{T}\), where V T is the variance in LRR among all unrelated individuals and V S is the average variance within each group, weighted for sample size. V i was inspired by the statistic d i proposed by Akey et al.88, which is based on differences in allele frequency across groups. d i uses SNP genotypes to measure the standardized locus-specific deviation in levels of population structure for a particular group relative to the genome-wide average. Here, we proposed V i to measure the changes in copy number variant regions (based on average LRR values) for a particular group. Large positive V i values indicate high levels of group differentiation of CNV regions relative to the genome at large, which are potentially involved with selection. Thus, V i is particularly well suited for detecting selection on CNV regions that is specific to a particular group.
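The two formulas above can be sketched in Python as follows. This is a simplified illustration, not the authors' implementation: the function names and the input layout are hypothetical, and population variances are used for simplicity, whereas the text refers to unbiased estimates of V ST.

```python
from statistics import mean, pstdev, pvariance


def vst(lrr_a, lrr_b):
    """Pairwise V_ST = (V_T - V_S) / V_T for one CNVR.

    lrr_a, lrr_b: lists of per-animal average LRR values in two groups.
    """
    v_t = pvariance(lrr_a + lrr_b)  # variance among all individuals
    if v_t == 0:
        return 0.0
    n_a, n_b = len(lrr_a), len(lrr_b)
    # Within-group variance, weighted by sample size.
    v_s = (n_a * pvariance(lrr_a) + n_b * pvariance(lrr_b)) / (n_a + n_b)
    return (v_t - v_s) / v_t


def vi(lrr, focal):
    """V_i for every CNVR in the focal group.

    lrr: dict mapping group -> list (one entry per CNVR, same order in
    every group) of lists of per-animal average LRR values.
    """
    n_cnvr = len(lrr[focal])
    scores = [0.0] * n_cnvr
    for other in lrr:
        if other == focal:
            continue
        pair = [vst(lrr[focal][k], lrr[other][k]) for k in range(n_cnvr)]
        # Standardize against the distribution across all CNVRs.
        mu, sd = mean(pair), pstdev(pair)
        for k in range(n_cnvr):
            scores[k] += (pair[k] - mu) / sd if sd > 0 else 0.0
    return scores


# Toy example: CNVR 0 is shifted downward in group A only.
toy_lrr = {
    "A": [[-0.5, -0.4, -0.6], [0.0, 0.1, -0.1], [0.02, 0.0, -0.02]],
    "B": [[0.0, 0.1, -0.1], [0.05, 0.0, -0.05], [0.0, 0.01, -0.01]],
    "C": [[0.1, 0.0, -0.1], [0.0, 0.1, -0.1], [0.01, 0.0, -0.01]],
}
toy_vi = vi(toy_lrr, "A")
print(toy_vi)
```

Because each pairwise term is standardized across all CNVRs, a CNVR receives a large positive V i only when the focal group differs from every other group more than is typical genome-wide.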

The Manhattan plot of V i values for each CNVR across groups was created with the R package “qqman”. To identify the shared and group-specific CNVRs, the Venn diagram was generated based on the overlap of CNVRs with the R package “VennDiagram”89. Box-plots were generated to display the distribution of average LRR in the identified CNVRs across the cattle groups. Unless otherwise specified, all statistical analyses were performed in R (https://www.R-project.org).