Characterization of heterozygosity-rich regions in Italian and worldwide goat breeds

Heterozygosity-rich regions (HRR) are genomic regions of high heterozygosity, which may harbor loci related to key functional traits such as immune response, survival rate, fertility, and other fitness traits. This study considered 30 Italian and 19 worldwide goat breeds genotyped with the Illumina GoatSNP50k BeadChip. The aim of the work was to study inter-breed relationships and HRR patterns using Sliding Window (SW) and Consecutive Runs (CR) detection methods. Genetic relationships highlighted a clear separation between non-European and European breeds, as well as the north–south geographic cline within the latter. The Pearson correlation coefficients between the descriptive HRR parameters obtained with the SW and CR methods were higher than 0.9. A total of 166 HRR islands were detected. CHI1, CHI11, CHI12 and CHI18 were the chromosomes harboring the highest number of HRR islands. The genes annotated in the islands were linked to various factors such as productive, reproductive, immune, and environmental adaptation mechanisms. Notably, the Montecristo feral goat showed the highest number of HRR islands despite the high level of inbreeding, underlining potential balancing selection events characterizing its evolutionary history. Identifying a species-specific HRR pattern could provide a clearer view of the mechanisms regulating the genome modelling following anthropogenic selection combined with environmental interaction.

Goats (Capra hircus) were domesticated around 10,500 years ago in the Fertile Crescent 16 .After dispersion from its center of domestication, this species has undergone an intense adaptation process and occupied different agroecological areas worldwide 17 .Different breeds and goat strains have been selected for milk, meat and fiber production, playing an important role in the livestock sector around the world.Given the wide range of genetic variability and its ability to populate very different geographical areas and climates and produce in conditions of low anthropogenic input, the goat species is the best for studying genetic diversity and adaptation 18,19 .Thanks to these features, this species provides one of the most suitable models for understanding the patterns and distribution of HRR.Furthermore, Italy is the European country with the highest number of goat breeds, and it provides a precious reservoir of genetic diversity shaped by its varied history, environment, climate, and farming traditions 20 .Therefore, in this study, we investigated the genomes of Italian and worldwide goat breeds, to detect and characterize their HRR patterns and reveal regions of heterozygosity (HRR islands) which contain candidate genes related to specific traits.

Genetic diversity indices
The genetic diversity indices were calculated per breed and then averaged per geographical group (see Table 1 for details on the dataset).The summary statistics showed a high variability all over the breeds (see Supplementary Table S1).On average, Asian breeds revealed the lowest values of expected (H E = 0.340) and observed (H O = 0.332) heterozygosity, and the highest average level of inbreeding (0.243).Conversely, the Turkish breeds (KIL and KLS) deviated from the Asian trend and showed values in contrast (H O and H E higher than 0.396 and inbreeding lower than 0.091).Similarly, African breeds showed a mean H O of 0.371 and a mean H E of 0.366, but a notable lower F IS equal to 0.152.All European breed groups (Europe, Alpine arch, and Italy) reported average heterozygosity indices higher than 0.386 and average inbreeding coefficient lower than 0.121.Values of the Italy group of breeds spanned from ARG (H O = 0.416, H E = 0.412 and F IS = 0.051) to MNT_I (H O = 0.271, H E = 0.263 and F IS = 0.381) that also represented the range of the whole dataset.ARG, MLG, and JON were the breeds with the highest H O (0.416, 0.415, and 0.415, respectively), while RCC, ARG, and M×S showed the highest H E (0.414, 0.412, and 0.412, respectively).The lowest F IS was estimated in ARG (0.051) and the highest, excluding the islandisolated MNT_I, was reported in GIR (0.174).The MAF averaged per geographical group followed the trend of Table 1.Dataset composition.CODE geographic group, Breed breed's acronym, Name breed's full name, n raw the raw breed's size, n final reduced size after data management and quality control, GEO geographic area of breeding.*Reference: Cortellari et al. 20 .**Reference: Stella et al. 18 .

Genetic relationships
A representation of genetic relationships within-and between-breeds comes from multidimensional scaling analysis.Two different graphs of the same MDS analysis were generated: the first (Fig. 1) shows the breeds grouped according to their geographic breeding area as reported in Table 1, and the second (see Supplementary Fig. S1) represents all breeds separately.In Fig. 1, the first component (C1 = 29.88%)clearly separated the European and Italian breeds from the rest of the dataset: African and Asian breeds clustered according to their different breeding location, while European, Alpine, and Italian breeds partially overlapped and showed a north-south gradient of variation.Focusing on this cluster, the spatial breeds grouping highlighted the effective geographic distribution within each country and between countries, with the Southern, Center, and Northern (Alpine cluster) Italian breeds extending along a line.The second component (C2 = 11.25%) reported a partial overlapping between Spanish and Egyptian breeds, as well as between Turkish, Bezoar and the European macro cluster (Alpine, Italian, and French breeds).Considering both C1 and C2 components, the Spanish MLG breed and PYR from France slightly detached from the European macro cluster (see Supplementary Fig. S1).The Neighbor-Net based on pairwise Reynolds' genetic distances also shows distinct clustering according to the different geographic distribution of breeds (see Supplementary Fig. S2).In particular, a phylogenetic divergence between European and non-European breeds was highlighted, with the MAL and MLG breeds at the basis of the separation.The Italian breeds highlighted a complex interweaving of nodes and positioned at the center of the Net.Alpine and French breeds clustered showing a certain degree of distinction from other European breeds.The Brazilian breed (CAN), MNT_I and breeds from Asia showed the highest degree of divergence with respect to the analyzed goat breeds.

Heterozygosity-rich regions
The HRR analysis was performed using two methods of investigation (CR and SW) and reported a total of 13,612 HRR for CR approach and 13,558 HRR for SW approach.Individuals reporting no HRR belonged to Asian breeds (3 animals for KAC, 1 animal for PAT, and 1 animal for TAP).Table 2 shows the descriptive statistics of HRR per breed and geographic cluster.Pearson correlation coefficients were calculated between average values (over the 49 breeds) of the four parameters (N HRR , L HRR, S HRR and D HRR ) obtained with the two methods (CR and SW), reporting an r > 0.999 for N HRR , S HRR and D HRR and an r = 0.947 for L HRR .Based on the high overlapping of the HRR results and high correlation coefficients, we here referred only to the results of the CR approach that reported the highest total number of HRR.

Heterozygosity islands and gene annotation
According to the filter applied for the identification of HRR islands, 1,665 SNPs were considered to form a total of 166 HRR islands (see Supplementary Table S2).The average number of islands on the overall dataset was 3.39 per breed, and the highest number of HRR and islands was found in MNT_I (103 and 16, respectively), while no HRR islands were detected for the BRK breed.S3).No results were found for CHI1-A shared island.

Discussion
In livestock, selective breeding tends to progressively decrease the diversity and resilience of target breeds 21 .
Locally reared breeds still seem to maintain their rustic characteristics thanks to the lower anthropic impact 22 .Goats are well known for being the most suitable livestock species for adapting to harsh environments 18,19 .In particular, local goat breeds represent a way to the sustainability of animal production in marginal areas in both developed and developing countries 18,23 .The domestication process has triggered different evolutionary trends, which over the centuries have led to the development of several breeds with different productive aptitudes.This study aimed to provide a general overview of the relationship among the studied goat breeds and to highlight their HRR patterns as repositories of advantageous gene diversity related to adaptation and resilience processes 6,7 .
Identifying a species-specific HRR pattern could provide a clearer view of the mechanisms regulating the genome modelling following anthropogenic selection combined with environmental interaction.

Genetic diversity indices
The high degree of variability in the dataset reflected the expected biodiversity 19,23,24 among different breeds.
The choice of the dataset, composed of goat breeds claiming a general dairy productive orientation, led to an  www.nature.com/scientificreports/evident imbalance between the European panorama and the rest of the geographical groups.Therefore, the average diversity values per group need to be considered a function of the notable numerical disparity between European goats and breeds from other parts of the world.Colli et al. 19 showed a pattern of observed heterozygosity in worldwide goat breeds that matched our results.They highlighted a clear association of this parameter with geography reflecting climatic conditions and breeding management.Therefore, the higher values are probably caused by admixture events between breeds reared in areas with extensive practices (i.e., transhumance in South Italy).The two Sicilian breeds-Argentata dell'Etna and Messinese-represent a known case of occasional admixture linked to the shared breeding area, which impacts on high values of both H O and H E 25 .Despite their status as cosmopolitan breeds and notoriously more subject to selection, the Saanen and the Camosciata delle Alpi showed relatively high heterozygosity and low molecular inbreeding, proving the proper management of their selective plan.Notably, the feral Montecristo goat resulted in the lowest heterozygosity of the whole dataset, likely reflecting a strong inbreeding due to its geographic isolation.Somenzi et al. 26 have retraced the evolutionary history of this feral population, suggesting repeated bottleneck events and founder effects that characterized the demographic history of the insular goat.

Genetic relationships
Genetic relationships analyses pointed out the differences between European and non-European goats, likely as a consequence of gene pool divergence between domesticated breeds and ancestral populations from the regions of the Fertile Crescent 19 .We can also assume that physical barriers, such as long distances and mountain ranges, have marked this divergence.Moreover, bias related to the design of the Illumina chip panel must be considered to explain the clear divergence of the two macro goat clusters 24,27 .The Multidimensional-scaling analysis highlighted the high cohesiveness of the European breeds in their totality (Mediterranean, Central-Europe and Alpine goats) when compared with the rest of the data set.The north-south geographic cline and the greater homogeneity of some breeds such as the Malaguena and the Pyrenean, were particularly evident.More significant variability among the different breeds of the other geographic groups was found (e.g., Pakistani vs Turkish goats, Ethiopian vs Egyptian goats vs Malian goats), probably due to several factors involving the diverse histories of local communities, the different levels of gene flow affecting breeds from different geographic areas (e.g.Pakistani vs Turkish goats) 19,28 , the eventual influence of the European genome 29 , as well as the original complexity of the African and Asian goat stock origin 30,31 .The phylogenetic nodes of the Neighbor-Net based on pairwise Reynolds' genetic distances better represented the genetic relationships among breeds, confirming the genetic closeness of the Europe, Alpine and Italy geographic groups and their divergence from the African and Asiatic strains.The Montecristo goat showed a close relationship to the other insular goat (Corse) and genetic proximity to the Italy group possibly because of the recent inputs of domestic stocks (twentieth century) as already reported in a previous study 26 .

Heterozygosity-rich regions
HRR are heterozygous genomic regions which are potentially associated with disease resistance, immunity, and adaptation processes.The higher the level of diversity, the better the biological response of species to environmental changes or new diseases.Consecutive Runs and Sliding Window are the two approaches currently used to detect stretches of consecutive homozygosity (ROH) and HRR in livestock species e.g. 3,7,8,32,33.Although only few studies have focused on identifying HRR islands with both methods of detection, it seems that the use of the Consecutive Runs approach is preferred 3,7,15 .In this study, we investigated both methods, highlighting the overlapping results for HRR identification.Strong correlation between CR and SW was estimated for all the parameters, indicating nearly identical results for all breeds.In a similar study on HRR in pigs, Bordonaro et al. 14 found a Pearson correlation coefficient higher than 0.96 for all four parameters (N ROH , L ROH , S ROH , F ROH ), confirming the obtaining of overlapping results between the two approaches.As mentioned by Biscarini et al. 6 , the choice of the parameter values is of fundamental importance in detecting ROH/HRR because of its effect on results.The information present in the scientific literature regarding the setting of the detection parameters is not very detailed.According to Mulim et al. 7 , the allowed number of opposite and missing SNP and the gap between two consecutive markers are the factors that mainly affect the detection of runs.In our preliminary screening, we evaluated the number of HRR found by testing several combinations of parameters (the minimum number of SNPs in a run, the minimum length, and the number of missing/opposite markers).The minimum number of SNPs in a HRR (minSNP) and the maximum number of SNP with opposite genotype in a window (maxOp-pWindow) were the parameters that most influenced the number of HRR found in SW method, while for the CR method it was the number of opposite genotypes allowed in a HRR (maxOppRun) 34 .Selli et al. 8 conducted a methodological study on several combinations of detection parameters in Sliding Window approach to investigate HRR.They found that a reduction in the minimum number of SNPs in a run (minSNP) and in a window (windowSize), as well as the number of homozygous and missing SNPs allowed (maxOppRun and maxMissRun), caused a decrease in HRR found, whereas a reduction in the minimum length of the run (minLengthBps) caused an increase in the number of HRR.Similarly, minSNP and windowSize positively affect the HRR length.In this scenario, it is unclear how different combinations might affect the results, and further analyses are needed 6 .In general, the within-breed occurrence of HRR is lower than that of ROH 2,3,6 .Indeed, we found an average number of HRR per breed lower than 18, while Cortellari et al. 23 found an average number of ROH of 31.5 in the same Italian breeds data.Other studies on HRR reported an average of 9.5 in semi-feral Chillingham cattle population known to be characterized by a strong inbreeding 4 , less than 9 in the italian Maremmana cattle breed 6 , 121.5 in Pinzgauer cattle breed 2 , 57.8 in commercial turkeys 35 , 52.2 in a local horse breed 3 , ~ 40 in Duroc pig 9 and 139.6 in sheep 8 .These notable differences may be attributed to the species of interest, to the breeds analysed, and to the different SNP arrays and parameters used for the HRR detection.All the cited studies also found that HRR were much shorter than ROH, confirming our general mean of ~ 500 kb in length and reflecting low coverage of the entire genome (maximum of 6.38 Mb for the Alpine breed group).These results might be related to the absence of missing and opposite SNPs in the detection approach, that usually reduces the size of detected ROH 6 and, similarly, of HRR.We highlighted a notable difference between the results obtained for European (Italy, Alpine, Europe groups) and non-European clusters (Africa, Asia, Brazil).We assume that this result, also reported by Li et al. 15 in chinese goats compared to worldwide breeds, could come from ascertainment bias related to the SNP array design based on European goat breeds 24,27 .Worth of note are the outcomes of Maltese × Sarda (M×S) goats.This population represents the only crossbreed sample included in the data set and the high values for all the indices were expected and probably related to the combination of relatively different genomes.Similarly, Mulim et al. 7 reported the borderline case of the Montana cattle that combines different genomes due to the crossbreeding between Bos taurus indicus and Bos taurus taurus and, consequently, showed the highest amount of HRR in their dataset.

Heterozygosity islands and gene annotation
The identification of recurrent HRR for each breed highlighted wide regions of shared heterozygosity in the chromosomes CHI1, CHI11, CHI12, and CHI18.Williams et al. 4 observed that heterozygosity-rich regions in the Chillingham cattle genome appear to be not randomly distributed but rather grouped in particular chromosomal locations.Moreover, some of the detected heterozygosity islands had already been recognized as homozygosity islands in previous studies 15,36 .Classifying the same chromosomal regions as both homozygosity-rich and heterozygosity-rich emphasizes the diversity among breeds within the goat species.The different breeding strategies and the different production orientations 15 rather than the adaptation to different environments have differentially shaped the genome of breeds 14 .CHI1-A (131.88-132.54Mb) was the most shared island within the whole dataset.STAG1 and PCCB genes, mapped within CHI1-A, are linked to reproduction-related traits and were reported to experience balancing selection in goat breeds, playing a significant role in the domestication of the species 15 .Notably, this HRR island was not present in either Bezoar or Montecristo, the two wild goat populations in the dataset.The second most common HRR island was CHI18-A (36.36-37.20 Mb).It was detected mainly in Italian and Alpines breeds, matched a ROH island found in goats from Europe, Africa and Asia by Bertolini et al. 36 , and it was reported as a possible signature of selection for fiber production with a low number of SNPs supporting this finding 37 .This region harbors genes involved in productive mechanisms in other species, such as the RIPOR1 gene for feeding intake control in the Nellore cattle breed 38 or the PSKH1 gene listed as a candidate gene for mean staple length, live weight, greasy fleece weight, and mean fiber diameter in Merino sheep 39 .The EDC4 gene was reported to mediate the post-transcriptional regulation of IL-6 cytokine playing an important role in immune response 40 .The nearby island CHI18-B (39.56-40.01Mb) was detected in 7 Alpine breeds out of 10 and in 5 breeds of the Italy group, allowing us to hypothesize a possible correlation to a well-defined geographic area.This short region harbors the ZFHX3 gene, also known as ATBF1, which has a well-known function in humans as a transcription factor that regulates myogenic and neuronal differentiation, with similar effects on Chinese native goat breeds where its polymorphisms influence animal growth rate 41 .As far as found in literature, a study on 3 Chinese local goat breeds confirmed the association of this gene with development traits 42 .In our results, we did not find this HRR island in any of the six Asian breeds.Bertolini et al. 36 found two ROH regions in goat chromosome 12 (from 43 to 44 Mb in European breeds and from 50 to 51 Mb in worldwide goats) which overlapped two of our HRR hotspots CHI12-B (43.91-44.47Mb) and CHI12-A (49.80-51.28Mb), harboring genes that are related to ectodermal, nervous system, and hearing function (GJB2 and GJB6) 43,44 as well as gonad development (SAP18) 45 .CHI12-A was also reported as a ROH hotspot by Li et al. 15 in Chinese and other worldwide goat breeds; the authors underlined this region in common with a ROH hotspot in Chinese sheep breeds 46 and possibly under parallel selection between goats and sheep species before the domestication due to the presence of a series of genes (GJA3, GJB2 and GJB6) associated with perception senses, such as sight and hearing essential for surviving in harsh environments (lack of food and presence of wild enemies).These genes were also identified under positive selection for growth (body size, skeletal and embryonic development) in Boar goat 47 and in Barki goat and sheep 48 .Signatures of selection localized in CHI12-A (49.80-51.28Mb) were reported in goats by Serranito et al. 22 which listed a series of annotated genes involved in the adaptation of small ruminants to arid environments (RNF17, ATP12A, GJB2) and high altitude (ATP12A, PARP4, ZMYM5, PSPC1, CENPJ, GJB2).Moreover, the interleukin17D (IL17D) gene, spanning from ~ 50.90 to ~ 50.93 Mb and belonging to the IL17 family of cytokines, was closely associated to host defense and immune response in humans 49,50 .This region perfectly matched the one found in 5 commercial and local goat breeds by Biscarini et al. 51 .CHI11-A (37.76-38.56Mb) was a shared island among 11 breeds, mainly from the Alpine and Italy groups.This region overlapped the ROH island found by Bertolini et al. 36 , which detected a signature of selection in a previous study 37 .In our dataset, the two cosmopolitan breeds (Saanen and Camosciata delle Alpi) reported this HRR island in which several mapped genes have a role in livestock production.In particular, the PPP4R3B gene was associated to thermotolerance in the African N'Dama cattle breed 52 ; Berihulay et al. 53 found this gene involved in gluconeogenesis and lipidic metabolism in Abergelle goats.The PNPT1 gene was reported implicated into RNA transport in pigs 53 and Bandur sheep 54 .Previous studies linked the EFEMP1 gene with traits of interest in livestock species.Zhang et al. 55 identified this gene as a significant influencer of the oleic acid content in the meat of the Wagyu × Angus cattle breed.Or else, EFEMP1 was previously identified as a differentiated expressed gene in muscle for residual feed intake in Nelore steers and seems to be regulated by a transcriptional factor (TCF4) 56 .The worth of notice are the islands in CHI8, even though they are present only in few breeds: CHI8-A (74.95-75.32Mb) and CHI8-C (39.05-39.29 Mb) seem to have a strong correlation with immune response.In particular, the AQP3 gene in the first island was found in a ROH hotspot in Ganxi and Guangfeng goats by Li et al. 15 who stated its involvement in immunoreactions.The CD274 and JAK2 genes in CHI8-C were reported in several biological processes for immune response in our gene enrichment results.

Heterozygosity islands in feral Montecristo goat
The evolutionary history of the feral Montecristo goat, geographically isolated in the homonymous island, subject to repeated bottleneck phenomena, and in the absence of anthropic management, albeit with a partial introduction of domestic germplasm 26 , represents an outlier case in the analyzed dataset.Moreover, notwithstanding the highest inbreeding coefficient (F IS ) and the lowest heterozygosities (H O and H E ) of the whole dataset, Montecristo goat reported the highest number of HRR.Events of balancing selection characterizing this feral goat population might be assumed during the shape of its genomic structure.Despite a high inbreeding, these events may have led to an increase in short segments of heterozygosity, that are strongly associated with fitness and survival traits 2,12 .Indeed, FAF1 gene in CHI3-A (24.83-25.57Mb) was highly associated with tolerance to Theileria infection in African cattle 57 .On chromosome 3, HRR islands (CHI3-C and CHI3-D) harbored two genes (SPATA1 and ASH1L) connected to reproduction traits in livestock species [58][59][60] .Moreover, SYT11 may play an important role in adaptation to different environmental conditions in Kenyan goat breeds 61 , while ARHGEF2 gene's function in livestock is still unclear, but may be involved in epithelial barrier permeability affecting host-microbial interactions in the rumen 62 .The HRR island in CHI6-A (23.81-24.43Mb) was also found as a HRR region in MNT_I by Somenzi et al. 26 , which attributed a reproduction function due to the presence of PPP3CA gene associated to fecundity traits and litter size in small ruminants.The HRR islands in CHI8-B (38.47-38.69Mb), in CHI15-A (65.98-66.54Mb), and CHI15-B (72.58-73.07Mb) showed other candidate genes related with immune response traits such as RANBP6 63 , GUCY1A2 64 , PIWIL4 65 , and CNTN5 66 .The IL31RA gene in CHI20-A (23.34-23.95Mb) is a cytokine receptors known to be involved in human inflammation and allergic diseases 67 .Moreover, SLC38A9 gene in the same HRR island was identified as an indicator of heat stress in bovine 68 .Finally, the last gene worth of interest is EML1 in CHI21-C (64.17-64.51),which plays a role in processes of exocytosis (the process of releasing vesicle content to the extracellular environment) in the molecular pathway of Ab production by B lymphocytes 69 .However, we cannot exclude that the high number of HRR islands found in the feral goat Montecristo derives from the bias linked to the chip design based on the domesticated goat genome.

Conclusions
This work studied the genetic relationships among goat breeds with dairy aptitude and investigated their continuous heterozygosity patterns.We confirmed a notable divergence between the European lineage and the Asian and African breeds.European goats have shown a clear north-south geographic cline and genetic interconnections probably due to gene flow between geographically close breeds.The investigation of heterozygosity-rich regions highlighted specific portions of the goat genome associated with different functional factors.The distribution of the HRR islands in the goat genome seems to be mainly related to the breeds' geography.A species-specific HRR pattern, possibly shaped by adaptation mechanisms, might provide a clearer view of the mechanisms modelling the genome as a function of anthropic selection.Interestingly, some heterozygosity hotspots showed overlap with ROH islands reported in the literature.This possibly indicates genomic areas targeted by selective breeding which has shaped the goat genome differently in different production contexts.The methodological part of the manuscript gave new insight into the standardization of HRR detection, highlighting the point of overlap between two different detection methods.Further investigation is needed to improve the parameter settings of HRR detection, possibly involving different species and breeds genotyped with different densities of SNP arrays.

Dataset and quality control
A dataset including worldwide goat breeds was generated by merging the Italian Goat Consortium2 (IGC2) and the ADAPTmap repositories described in Cortellari et al. 20 and Stella et al. 18 , respectively.The breeds involved in the analyses were selected for their milk-production aptitude, excluding those for meat or fiber, and grouped according to their geographic breeding area, resulting in a reduced dataset comprehensive of 1,289 individuals

Figure 2
graphically presented the results for each parameter per breed and geographical group.The Alpine group had the highest average values within the dataset (AN HRR = 12.66, AL HRR = 0.51, AS HRR = 6.38 and AD HRR = 0.0026), with ALP and SAA showing high values above the average both within the group and in the entire dataset.Similar averages were exhibited by the breeds of the Europe group, with MLG breed showing the highest values.The Italy group had mean values of the HRR parameters close to those of Alpine and Europe and showed MON and M×S with the highest values, while MNT_I and GIR were the ones with the lowest.The rest of the goat breeds analyzed showed a fair level of rich heterozygosity and values not far from the group means (AN HRR = 11.59,AL HRR = 0.50, AS HRR = 5.80 and AD HRR = 0.0024).Both African and Asian breeds reported low values for all parameters.Within the first group, breeds showed comparable average values, while in the second group, KIL and KLS breeds from Turkey differentiated and showed values above the average.

Figure 1 .
Figure 1.Multidimensional scaling (MDS) plot according to C1 (29.88%) and C2 (11.25%) components.The geographical clusters are represented by different shades of a color.See Table1for a full definition of the dataset.
To explain the workflow in HRR detection, we report the results of SAA.The 0.1 percentile threshold for cutting off the SNP-in-HRR p-values allowed the detection of 164 SNPs for both CR and SW.These markers identified 89 in-breed HRR, but only 21 exceeded the frequency (20% within-breed) and length (minimum 4 SNPs) thresholds, for a final detection of 3 HRR islands.Table3reports details of each HRR island including genomic coordinates, breeds, and annotated genes.CHI1, CHI11, CHI12 and CHI18 were the chromosomes harboring the highest number of HRR islands (sum of breeds' HRR islands), reporting a total of 33 (3 unique), 16 (4 unique), 25 (4 unique) and 36 (4 unique) islands, respectively.Several breeds had HRR islands in common.Notably, 31 (including 22 Italian breeds) out of 49 breeds shared an island in CHI1 (CHI1-A from 131.88 to 132.54 Mb).Similarly, a portion of the genome ranging from 36.36 to 40.01 Mb on CHI18, including two HRR islands, was largely shared by several European breeds: 19 breeds had the CHI18-A island in common, and 13 breeds the CHI18-B.Finally, 15 breeds with different geographic distributions reported a common HRR island in CHI12 (CHI12-A from 49.80 to 51.28 Mb).All HRR islands are plotted in Fig.3, highlighting the chromosome and breeds involved.The gene enrichment performed on genes annotated within HRR islands revealed a total of 74 different processes, mainly biological processes involved into regulation of metabolic processes (CHI12-A), sensor organ development (CHI11-C) and activation of immune response (CHI8-C, CHI16-D) (see Supplementary Table

Figure 2 .
Figure 2. Boxplots of the Heterozigosity-rich regions (HRR) indices aggregated by geographic area.(a) N HRR = average number of HRR per breed, (b) L HRR = average length of HRR per breed (in Mbp), (c) S HRR = mean genome length covered by HRR segments (in Mbp), (d) D HRR = average coefficient of diversity.The different colors indicate the two approaches, CR (blue) and SW (light blue).Each boxplot extends from the 25th to the 75th percentile and shows the average (red dashed line) and the median (black horizontal line) group value.The black dots represent the single-breed results.

Figure 3 .
Figure 3. Graphical distribution of Heterozigosity-Rich Regions (HRR) islands per chromosome.Detected HRR in the meta-population (a) and per breed (b).The gap between consecutive islands within a chromosome and between consecutive chromosomes do not correspond to the real distance in bps.

Table 3 .
Distribution of HRR islands per chromosome.For each HRR island, the table reports the island code, spanning range in base pairs (from-to), the list of breeds carrying the island (Breed) and the NCBI name of the annotated genes.