BoLA-DRB3 gene haplotypes show divergence in native Sudanese cattle from taurine and indicine breeds

Autochthonous Sudanese cattle breeds, namely Baggara for beef and Butana and Kenana for dairy, are characterized by their adaptive characteristics and high performance in hot and dry agro-ecosystems. They are thus used largely by nomadic and semi-nomadic pastoralists. We analyzed the diversity and genetic structure of the BoLA-DRB3 gene, a genetic locus linked to the immune response, for the indigenous cattle of Sudan and in the context of the global cattle repository. Blood samples (n = 225) were taken from three indigenous breeds (Baggara; n = 113, Butana; n = 60 and Kenana; n = 52) distributed across six regions of Sudan. Nucleotide sequences were genotyped using the sequence-based typing method. We describe 53 alleles, including seven novel alleles. Principal component analysis (PCA) of the protein pockets implicated in the antigen-binding function of the MHC complex revealed that pockets 4 and 9 (respectively) differentiate Kenana-Baggara and Kenana-Butana breeds from other breeds. Venn analysis of Sudanese, Southeast Asian, European and American cattle breeds with 115 alleles showed 14 were unique to Sudanese breeds. Gene frequency distributions of Baggara cattle showed an even distribution suggesting balancing selection, while the selection index (ω) revealed the presence of diversifying selection in several amino acid sites along the BoLA-DRB3 exon 2 of these native breeds. The results of several PCA were in agreement with clustering patterns observed on the neighbor joining (NJ) trees. These results provide insight into their high survival rate for different tropical diseases and their reproductive capacity in Sudan's harsh environment.


Results
Distribution of BoLA-DRB3 alleles in selected native Sudanese cattle breeds. PCR-SBT genotyping allowed us to identify 53 BoLA-DRB3 alleles (46 previously reported variants and seven new alleles; Table 1) in the native breeds selected for this study. The number of alleles (n a) was 46 in Baggara cattle (40 previously reported and six new), 33 in Kenana cattle (28 previously reported and five new), and 33 in Butana cattle (28 previously reported and five new) (Tables 1 and 2). The new BoLA-DRB3 variants were confirmed by their presence in at least three carrier animals and in two breeds, and were submitted to the DNA Data Bank of Japan (http://www.ddbj.nig.ac.jp) under accession numbers LC569724-LC569739. Nucleotide and predicted amino acid sequences of the seven new allele variants are shown in Fig. 1 and compared with the most similar BoLA-DRB3 alleles reported so far. All seven new BoLA-DRB3 allele variants shared about 89.7-92.6% nucleotide and 80.52-85.71% amino acid similarity with the BoLA-DRB3 cDNA clone NR1 (Aida, 1995).
A Venn diagram was constructed using data obtained in this study and from previous reports 18,19,21,27,29 . Data were grouped in terms of the breed's geographical origin as follows: native Sudanese; Southeast Asian; Zebu; European; and American Creole cattle breeds (Fig. 2). This analysis revealed that out of the 115 alleles identified in the five cattle groups, fourteen were unique to native Sudanese breeds (Fig. 2), four of which exhibited gene frequencies that were higher than 0.5%, representing about 26% of the 53 alleles detected in the native Sudanese cattle. In addition, two other variants were only present in native Sudanese and American Creole breeds, while six other alleles were only found in Sudanese cattle populations and American Creole or Southeast Asian native or Zebu breeds, or a combination of these groups. In addition, the BoLA-DRB3 NJ tree, including all the previously reported alleles and the seven new variants, showed that the variants detected in Sudanese cattle populations were interspersed among the various clusters (Fig. 3). A similar result was observed when the BoLA-DRB3 tree was inferred using amino-acid residues located in the antigen-binding site (ABS) (Fig. S1).
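The Venn comparison above reduces to set operations on the allele lists of each breed group. The following is a minimal sketch of that logic, not the R 'VennDiagram' workflow used in this study; the allele labels and group contents are hypothetical toy values, and the function names are introduced here for illustration only.

```python
from typing import Dict, Set

# Hypothetical allele labels per breed group (toy data, not the study's alleles).
toy_groups: Dict[str, Set[str]] = {
    "Sudanese": {"a1", "a2", "a3", "a4"},
    "Zebu": {"a2", "a5"},
    "European": {"a3", "a6"},
}


def private_alleles(groups: Dict[str, Set[str]], target: str) -> Set[str]:
    """Alleles present in `target` and absent from every other group."""
    others = set().union(
        *(alleles for name, alleles in groups.items() if name != target)
    )
    return groups[target] - others


def shared_by_all(groups: Dict[str, Set[str]]) -> Set[str]:
    """Alleles present in every group (the central region of the Venn diagram)."""
    return set.intersection(*groups.values())


if __name__ == "__main__":
    print(sorted(private_alleles(toy_groups, "Sudanese")))
    print(sorted(shared_by_all(toy_groups)))
```

With real data, each set would hold the allele names observed in one breed group, and the size of each returned set corresponds to one region of the diagram.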
As shown in Fig. S2, the native Sudanese cattle breeds have an even gene frequency distribution, with a high number of alleles with low frequency. Low allele frequency was particularly noticeable in the Baggara breed. Only two, five and seven alleles appeared with frequencies of > 5% in the Baggara, Kenana and Butana breeds, respectively. These common alleles accounted for a low proportion of the cumulative gene frequencies (12.83, 44.23 and 50.83% in the Baggara, Kenana and Butana breeds, respectively). Four of these alleles (BoLA-DRB3*003:02:01, *021:01, *022:01 and *024:01) were common in at least two of the three Sudanese breeds (Table 1).
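The summary above (number of alleles above 5% and their cumulative frequency) follows from allele frequencies obtained by direct counting. A minimal sketch is given below, assuming genotypes are available as pairs of allele names; the genotypes are hypothetical toy values and the function names are introduced here for illustration.

```python
from collections import Counter
from typing import Dict, List, Tuple

Genotype = Tuple[str, str]

# Hypothetical genotypes for ten animals (toy data).
toy_genotypes: List[Genotype] = [
    ("x", "x"), ("x", "x"), ("x", "x"), ("x", "y"), ("x", "y"),
    ("y", "y"), ("y", "z"), ("z", "w"), ("z", "v"), ("z", "u"),
]


def allele_frequencies(genotypes: List[Genotype]) -> Dict[str, float]:
    """Allele frequencies by direct counting over 2N gene copies."""
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = 2 * len(genotypes)
    return {allele: count / total for allele, count in counts.items()}


def common_alleles(freqs: Dict[str, float], threshold: float = 0.05):
    """Alleles with frequency strictly above `threshold`, and their summed frequency."""
    common = sorted(a for a, f in freqs.items() if f > threshold)
    cumulative = sum(freqs[a] for a in common)
    return common, cumulative


if __name__ == "__main__":
    freqs = allele_frequencies(toy_genotypes)
    print(common_alleles(freqs))
```

In the toy data, the three alleles seen only once sit exactly at 5% and are therefore not counted as common, which mirrors the strict "> 5%" criterion used in the text.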
Nucleotide and amino acid diversity in the BoLA-DRB3 alleles found in native Sudanese cattle breeds. Genetic diversity at the DNA and amino acid levels was evaluated using four methods that compare the average amino acid and nucleotide substitutions for every pair of alleles within the breeds. The nucleotide diversity (π) exceeded 0.074 and the mean number of pairwise differences (NPD) exceeded 17.99 within Sudanese native breeds (Table 3). Comparison with results previously reported for other cattle breeds genotyped by PCR-SBT showed that these values all fall within the reported range (π range = 0.068-0.090; NPD range = 16.31-20.96) 18,19,21,29,30 . Regarding amino acid diversity, 18,19,21,27,29,35 . Regarding the HWE test, the three Sudanese native populations were in equilibrium (Table 2). In addition, we estimated the selection index (ω) at each amino acid site to evaluate the presence of diversifying selection (ω > 1) along BoLA-DRB3 exon 2. These analyses showed high ω values at more than 30 sites in each breed, mainly located in the ABS (Fig. 4).
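The HWE comparison mentioned above rests on the observed and expected heterozygosity of the locus. The sketch below illustrates these quantities under stated assumptions: it uses Nei's unbiased estimator of expected heterozygosity and the simple ratio F IS = 1 - h o / h e, which is not necessarily the estimator or exact test applied by Arlequin or Genepop. The genotypes are hypothetical toy values.

```python
from collections import Counter
from typing import List, Tuple

Genotype = Tuple[str, str]

# Hypothetical genotypes for four animals (toy data).
toy_genotypes: List[Genotype] = [("a", "b"), ("a", "a"), ("b", "c"), ("a", "c")]


def observed_heterozygosity(genotypes: List[Genotype]) -> float:
    """Proportion of animals carrying two different alleles."""
    return sum(a != b for a, b in genotypes) / len(genotypes)


def expected_heterozygosity(genotypes: List[Genotype]) -> float:
    """Unbiased expected heterozygosity: 2N/(2N-1) * (1 - sum of squared frequencies)."""
    copies = 2 * len(genotypes)
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    homozygosity = sum((count / copies) ** 2 for count in counts.values())
    return copies / (copies - 1) * (1.0 - homozygosity)


def f_is(genotypes: List[Genotype]) -> float:
    """Simple inbreeding coefficient; negative values indicate heterozygote excess."""
    return 1.0 - observed_heterozygosity(genotypes) / expected_heterozygosity(genotypes)


if __name__ == "__main__":
    print(observed_heterozygosity(toy_genotypes))
    print(expected_heterozygosity(toy_genotypes))
    print(f_is(toy_genotypes))
```

Under over-dominant selection one would expect h o to exceed h e (negative F IS); the significance of such a deviation requires an exact test, which this sketch does not perform.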
BoLA-DRB3 genetic structure and levels of population differentiation in Sudanese cattle. The level of genetic differentiation among the three Sudanese breeds was studied through the F ST index. The average F ST was statistically significant, although this value accounts for less than one percent of the total genetic variance (F ST = 0.0076, ranging between 0.007 and 0.009; p < 0.001) (Table S2). This low but significant value can be explained by high within-population diversity and differences in the rare-allele profiles among the breeds 36 . The average F ST value observed in Sudanese cattle is higher than that estimated in Myanmar native breeds (F ST = 0.003), and slightly lower than that reported for Holstein populations from different countries (F ST = 0.009) 18,37 (Fig. 5 and Table S2). When breeds were grouped by geographical origin, as was done in the Venn diagram, the genetic variance among breed groups and among populations within groups accounted for 1.18% and 3.71% of the total genetic variance, respectively. When the five sampling sites of native Sudanese breeds were compared (two sampling locations of Kenana cattle were very close and were treated as one), the average F ST value was 0.0074 (p = 0.164), while the pairwise F ST ranged from 0.0002 (p = 0.450) between the two Baggara populations to 0.0118 (p < 0.0001) between Baggara Daiwani and Butana Qadarif. Significant differences were observed in nine out of the ten native population comparisons (p < 0.05; Table S3). Similar genetic distance values were observed among Holstein populations from different countries and between native breeds of Myanmar 18,37 .
Genetic differentiation of BoLA-DRB3 alleles in native Sudanese cattle breeds: comparison with Zebu and Taurine breeds. First, BoLA-DRB3 allele frequencies from Sudanese cattle populations and from each breed included in the dataset were used to generate Nei's D A and D S genetic distance matrices. 
Then, dendrograms were constructed from these distance matrices using NJ algorithm. All trees revealed congruent topologies, which were consistent with the historical and geographical origin of the breeds analyzed. As expected, these trees revealed two main clusters, which included the Taurine and Zebuine breeds (Fig. 6a). It is noteworthy that Sudanese breeds were located in a sub-cluster within the indicine cluster, with the two dairy breeds located in the east of the country, Butana and Kenana being more related to each other than the Baggara www.nature.com/scientificreports/ breed in the west. These results reveal that Sudanese cattle breeds have a particular diversity in the BoLA-DRB3 gene, as a consequence of its gene frequency profile and the presence of a high number of private alleles. The results of the PCA showed that the first three components accounted for 47.30% of the data variability. The first principal component (PC) accounted for 24.31% of the total variance and, as shown in a previous study 64 , clearly exhibited a differentiation pattern between the Zebu (negative values) and Taurine (positive values) breeds, while native breeds from Southeast Asia and Sudan were located in an intermediate position near the axis origin of the plot (Fig. 6b). This PC was primarily determined by differences in the frequency of the same alleles, such as BoLA-DRB3*022:01, *028:01, *036:01, *031:01, *030:01, and *057:02 with the higher negative axis 1 values, whereas the alleles BoLA-DRB3*001:01, *002:01, *007:01, *008:01, *010:01, *011:01, *012:01, *015:01 *016:01, *018:01 had the higher positive values for this axis. The second PC explained 11.98% of the total variation and showed a gradient among Taurine breeds, with Chilean Hereford (positive values) and Japanese Jersey (negative values) located at opposite ends. Furthermore, this component discriminated between native Sudanese and native Southeast Asian cattle breeds. 
Finally, the third PC accounted for 11.01% of the variance and allowed for the differentiation of Chilean Hereford, and Japanese Jersey and Japanese Holstein cattle from other Taurine breeds. In summary, the native Sudanese cattle breeds were located within a narrow cloud in an intermediate position between the Zebu and Taurine breeds and close to other Southeast Asian breeds, in agreement with the composite origin of these native breeds. This is also supported by the presence of African and Zebu unique BoLA-DRB3 alleles within these populations. These PCA results agree with the overall clustering observed after NJ tree construction.
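The proportion of variance attributed to each principal component above can be illustrated with a small PCA on a breed-by-allele frequency matrix. This is a sketch only: it uses power iteration with deflation on the covariance matrix in pure Python, which is an assumption made for self-containment and not the procedure implemented in the Past software; the frequency matrix is hypothetical toy data.

```python
import math
import random
from typing import List, Tuple

# Hypothetical allele frequencies: rows are breeds, columns are alleles (toy data).
toy_freqs: List[List[float]] = [
    [0.6, 0.3, 0.1],
    [0.5, 0.4, 0.1],
    [0.2, 0.3, 0.5],
    [0.1, 0.3, 0.6],
]


def pca(rows: List[List[float]], n_components: int = 2,
        iters: int = 1000) -> List[Tuple[float, List[float]]]:
    """Return (fraction of variance explained, loading vector) for leading PCs."""
    n, m = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(m)]
    centered = [[r[j] - means[j] for j in range(m)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(m)]
           for i in range(m)]
    total = sum(cov[i][i] for i in range(m))
    if total <= 0.0:
        raise ValueError("data have no variance")
    rng = random.Random(0)  # seeded random start; a uniform start lies in the null space
    components = []
    for _component in range(n_components):
        v = [rng.gauss(0.0, 1.0) for _ in range(m)]
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
        for _step in range(iters):
            w = [sum(cov[i][j] * v[j] for j in range(m)) for i in range(m)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12 * total:  # no variance left in this direction
                break
            v = [x / norm for x in w]
        # Rayleigh quotient gives the eigenvalue of the current unit vector.
        eigenvalue = sum(v[i] * cov[i][j] * v[j] for i in range(m) for j in range(m))
        eigenvalue = max(eigenvalue, 0.0)
        components.append((eigenvalue / total, v))
        # Deflation: remove the component just found before searching for the next.
        cov = [[cov[i][j] - eigenvalue * v[i] * v[j] for j in range(m)]
               for i in range(m)]
    return components


if __name__ == "__main__":
    for fraction, loadings in pca(toy_freqs):
        print(round(fraction, 4), [round(x, 3) for x in loadings])
```

Because allele frequencies within a breed sum to one, the centered matrix loses one dimension, so the two components of the three-allele toy example together explain all of the variance. For datasets of realistic size, an established linear algebra library would be the practical choice.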
The BoLA class II molecule binds peptides derived from antigens via five antigen binding pockets named pocket 1, pocket 4, pocket 6, pocket 7 and pocket 9 24 . To assess whether observed differences in allelic frequency are reflected within amino acid motifs in each pocket, we analyzed frequency of the protein pockets implicated in the antigen-binding function of the MHC complex by PCA. As shown in Fig. S3a-e, the three native breeds of Sudan are located in a closed cloud in the five PCAs made based on the frequency of the pockets, although varying their relative position with other breeds and breed groups, and in some cases the spatial distribution did not exhibit a clear relationship with the geographical or historical origin of the breeds. However, pockets 4 and 9 are the ones that best differentiate these native breeds from the rest. Regarding pocket 4, Baggara and Kenana breeds of Sudan are located in a narrow cloud located at the end of axis 2, and their position is mainly explained by the GFDEREY, RFDERFV and GLDRKEV motifs. The position of the Butana and Kenana Sudanese breeds in pocket 9 was the result of positive PC1 and PC2 values for the presence of amino acid motifs EYD and EFA.
Finally, PCA was performed at the Sudanese population level to evaluate the degree of genetic structure among the sampling sites (Baggara Daiwani, Baggara Nyakawi, Kenana, Butana Bu Atbara and Butana Bu Qadarif). This analysis showed that the first three components accounted for 90.95% of the data variability. The first PC accounted for 30.65% of the total variance and clearly exhibited a differentiation pattern between the Baggara populations (negative values) and the Butana Bu Qadarif population (positive values), while Kenana and Butana Bu Atbara were located in intermediate positions (Fig. 7). These results agree with the geographical distribution of the studied populations. The second and third PCs explained 30.66% and 25.24% of the total variation and allowed for the differentiation of the Butana Qadarif and Kenana populations, respectively.

Discussion
Since the first pioneering studies based on serotype analysis, a number of striking differences between the BoLA profiles of African and European cattle have been reported, owing to differences in the frequency of occurrence of antigens and the presence of unique antigens in African cattle 38 . Over the following decades, several private alleles were identified in taurine, zebu and taurindicus native African breeds, such as N´Dama, Boran, and Sanga 32,34 (https://www.ebi.ac.uk/ipd/mhc/group/BoLA/). In the present study, we carried out the first genetic characterization of the BoLA-DRB3 gene at the population level in native Sudanese breeds using PCR-SBT. This analysis allowed us to detect 53 alleles, including seven new variants. The high number of private alleles agrees with data obtained by 16 , who analyzed the BoLA region in depth using a genome-wide sequencing approach, identifying six major African BoLA haplotype blocks.
Wild cattle or 'aurochs' (Bos primigenius), the ancestor of domestic cattle, inhabited a large geographical area throughout Eurasia and North Africa. According to the trans-species theory of MHC alleles 39 , it is expected that the extremely high genetic variability present in the BoLA-DRB3 gene (365 alleles have been reported in the IPD-MHC database; https://www.ebi.ac.uk/ipd/mhc/group/BoLA; 33 ; access date 16/04/21) was present across the wide geographical distribution of the aurochs. On the basis of archeological and genetic studies, it has been proposed that modern bovines were domesticated in two geographical sites, one located in West Asia (Near East) and the other in the Indian subcontinent (India and Pakistan) 40-46 . Each of these domestication centers would have retained only a fraction of the total diversity as a result of bottleneck and genetic drift effects 47 . This is clearly seen in the distribution of mitochondrial haplogroups among cattle breeds 5,40-44 . In Africa, taurine cattle originated from the Near East domestication center and entered through the northern part of the continent, from where they would have dispersed east, west and south. Then, indicine cattle were introduced to Africa and Bos indicus genes were introgressed into native populations through absorbent crosses 48 . Currently, an east-west gradient of Zebu influence in African native genes is observed. The dispersal and crossbreeding processes described above (founder group, migration and gene introgression), together with natural and artificial selection, would have shaped the BoLA-DRB3 diversity in the current bovine populations. Accordingly, the BoLA-DRB3 alleles detected in the Sudanese cattle were interspersed along the allele NJ tree instead of being grouped in specific clusters of the dendrogram, which is consistent with the ancient origin of the BoLA-DRB3 alleles. Similar results have been reported in other native cattle breeds from different geographical regions 21,22 .
Our Venn diagram illustrates the distribution of allelic diversity among different bovine groups, demonstrating that 14 BoLA-DRB3 alleles were only detected in the Sudanese cattle breeds. Seven of these alleles corresponded to new variants described in this study (Table 1). Furthermore, a review of the IPD-MHC database showed that this group of Sudanese private alleles included seven other variants previously detected only in African breeds (Table S4).
Two BoLA-DRB3 alleles that were previously reported only in Creole cattle breeds 21,37 were identified in native Sudanese breeds. Studies based on mitochondrial DNA and Y chromosome haplotypes have revealed an African component in the germplasm of the American Creole bovine breeds. Two origins have been proposed for this African component: through the native Iberian cattle that are the ancestors of Creole cattle and/or a direct introgression from mainland Africa following the slave trade routes 49 . The Iberian theory is unlikely, as the BoLA-DRB3*011:02 and BoLA-DRB3*029:02 alleles have not been detected in the Spanish Morucha breed, which is the only autochthonous Iberian breed in which the genetic diversity of the BoLA-DRB3 gene has been studied so far 20 . In summary, 16 putative African alleles were identified in the native bovine populations of Sudan, totaling 20.22% of the gene frequency. The presence of private BoLA-DRB3 alleles (not detected in zebu breeds so far) in native African breeds with a humped phenotype suggests that current global diversity of this gene could have been retained in the founder group that originated African taurine native breeds 45 .
On the other hand, a group of alleles is shared between the Sudanese breeds and the Zebu, Southeast Asian and/or Creole American breed groups (Table S5), but is absent in the European breeds. It is worth noting that these alleles were first identified in cattle breeds such as Boran, Ethiopian Arsi, N´Dama and Brahman 32,34,50 (https://www.ebi.ac.uk/ipd/mhc/group/BoLA/; Table S5). The introgression of these variants could have been a consequence of the successive waves of introduction of Zebu cattle into the African continent 48 . These alleles account for an additional 15.33% of the gene frequencies. The remaining alleles have a worldwide geographical distribution; thus, 20 variants have been detected in all the breed groups included in the Venn diagram. Further studies on the genetic diversity of the BoLA-DRB3 gene in other African bovine populations will surely reveal a greater allelic repertoire.
The current repertoire of alleles of the BoLA-DRB3 gene in the native cattle of Sudan would have been molded not only by stochastic forces, such as the formation of the founder group, genetic drift and recent or historical gene introgression as described above, but also by processes of natural and artificial selection. In Sudan, as in other African regions, cattle are subjected to strong environmental pressures, such as tropical diseases, heat stress, drought, and nutritional and forage deficits. Furthermore, animals are affected by diverse infectious diseases, including parasites (e.g., ticks, theileriosis, babesiosis, anaplasmosis, trypanosomosis; 51-57 ), bacteria (e.g., hemorrhagic septicemia, anthrax, tuberculosis, brucellosis, thrombotic meningoencephalitis; 58-62 ) and viruses (e.g., foot and mouth disease, lumpy skin disease, pox virus, bovine viral diarrhea disease complex; 53,63,65 ). For this reason, it is to be expected that native Sudanese cattle will be under strong selection pressure, which would contribute to maintaining and shaping the genetic diversity of the BoLA-DRB3 gene. In this sense, a wide repertoire of alleles allows the population to identify and respond to a greater range of antigens. Furthermore, heterozygous animals trigger an immune response to a greater variety of antigens. For these reasons, it has been proposed that this allelic diversity is maintained by balancing or over-dominant selection 30,65,66 . Different indices at the population, nucleotide and amino acid levels showed high levels of genetic diversity in the bovine breeds of Sudan for the BoLA-DRB3 gene. This is clearly reflected in the presence of a homogeneous distribution of gene frequencies (a high number of alleles with low frequencies). 
This is particularly extreme in the Baggara breed, in which Slatkin's neutrality test indicated that the BoLA-DRB3 gene frequency profile had an even distribution consistent with the theoretical proportion expected under balancing selection. Similar results have been reported for other cattle breeds, including Japanese Black, Yacumeño Creole, Bolivian Gir, Pyer Sein and Shwe Ni 21,22,30 . Furthermore, the selection index (ω) revealed the presence of diversifying selection in several amino acid sites (mainly in the ABS) in BoLA-DRB3 exon 2 of the Sudan native breeds. In contrast, the HWE test did not detect the effect of over-dominant selection 67 . As discussed previously 21 , this effect has been observed only in some of the breeds studied so far, and the most common explanation for the absence of heterozygote excess in the studied bovine breeds is the magnitude of the overdominance selection coefficient at MHC loci (probably lower than 0.02; 68 ). Such selection would only be enough to increase the number of heterozygotes in large populations and in the absence of high rates of stochastic forces (population bottlenecks, genetic drift, and inbreeding). For this reason, and because the HWE method may suffer from low resolving power, such effects were not observed. The repertoire of alleles of the BoLA-DRB3 gene present in the native cattle of Sudan allows these breeds to be clearly differentiated from the rest, forming a cluster in the NJ trees and a narrow cloud in the PCA. This pattern is confirmed when PCAs are performed based on the pocket 4 and pocket 9 gene frequencies. It has previously been proposed that pocket 4 plays an important role in the binding of peptides because this pocket is located in the center of the PBC 64,69,70 . In addition, it has been reported in cattle that vaccine responses and disease resistance are significantly related to differences in the pocket 4 motif 49,50 . 
A particular amino acid (e.g., amino acid R in position 70) or amino acid motifs (e.g., ER at 70 and 71 sites; EIAY motif at positions 66-67-74-78, and the deletion of the amino acid 65), in sites that affect the conformation of pocket 4, have been associated with immune response or resistance to infectious diseases, such as mastitis, persistent lymphocytosis, dermatophilosis, and tick-borne diseases 25,50,69,[71][72][73] . Many of these diseases, as well as others mentioned above, are present in Sudan and could have contributed to shaping the current repertoire of BoLA-DRB3 alleles present in native Sudanese cattle. However, these results were obtained in breeds that have different genetic backgrounds and that are raised in different environments and production systems, so further association studies are necessary to determine the effect (resistance or susceptibility) of the alleles present in the native cattle breeds of Sudan against different infectious diseases.

Conclusions and future prospects
To the best of our knowledge, this is the first study to document in detail the genetic diversity (taurine vs indicine) of BoLA-DRB3 alleles in cattle not only in Sudan but in the entire African continent. In addition to the clear genetic clustering of cattle based on ancestral origin and phylogeography, we identified seven novel alleles in the three native Sudanese cattle breeds. Two evolutionary forces appear to contribute to the preservation and shaping of the genetic diversity of the BoLA-DRB3 gene in native Sudanese cattle: diversifying selection, which mainly affects the ABS, and balancing selection. The results indicate that the background variation between the two cattle groups, taurine and indicine, is primarily due to events of origin, selection, and adaptation, which explains the variations found in the diversity of the BoLA-DRB3 gene, not only between the two major groups but also within the indicine cattle group. This variation may contribute to the resistance of Sudanese cattle to various diseases. We presume that this genetic information provides a basis for better design of suitable breeding schemes.

Materials and methods
Sampled populations and genomic DNA extraction. The ODK (Open Data Kit) system was used to record the sampling information: breed name, sex, estimated age, sampling location GPS coordinates, photo of the animal and owner's information. All methods were carried out in accordance with relevant guidelines and regulations of the Faculty of Veterinary Medicine, University of Khartoum (Vet. Med. U of K), and all experimental protocols were approved by the Vet. Med. U of K research board committee. Before animals were sampled, written informed consents were obtained from all animal owners. Three cattle breeds were examined: (1) Butana breed: collected from the Atbara Butana Station and surrounding villages and from El-Gadarif city and Butana plain; (2) Kenana breed: samples were collected from Rabak city and surrounding villages and from UmBanein Kenana Station; (3) Baggara breed populations (i) Nyalawi population, which is a western Baggara breed sampled from calves from Nyala city, South Darfur; (ii) Daeinawi population, from Ed daein city. Whereas Nyalawi are large white cattle, some with black splashes, the Daeinawi are smaller and red with black along the neck and lateral sides of the head, hind quarters and shoulder sides (Fig. S4).
A total of 225 native breed cattle were sampled: Baggara N = 113, Butana N = 60 and Kenana N = 52 (Table S1 and Fig. S4). Seven milliliters of venous blood were collected in EDTA-containing vacutainer tubes. Genomic
PCR amplification and sequencing. Exon 2 of the BoLA-DRB3 was amplified by PCR as described by 26 . Using DRB3FRW
Sequence data analysis. Prior to analysis, all the chromatograms were visualized and sequence fragments were edited manually using ATGC software version 9.1 (GENETYX Corporation, Tokyo, Japan) correcting base calling errors. Multiple sequence alignments were performed using the MUSCLE algorithm implemented in MEGA X 74 , and were subsequently joined to reconstruct a fragment of 280 bp spanning the entire exon 2.
BoLA-DRB3 allele genotyping. For typing BoLA-DRB3 genotypes, we used the method implemented by 26 . First, we downloaded the MHC_nuc.txt file from the IPD-MHC in order to update the allele database. This file contains all reported BoLA-DRB3 alleles. Then, DNA sequences from the cattle for both strands (forward and reverse ab1 files) were imported together into the Assign 400ATF ver. 1.0.2.45 software (Conexio Genomics, Fremantle, Australia), which automatically aligned the sampled cattle sequences with previously reported BoLA-DRB3 sequences, building a consensus. The most likely genotype is shown in the same window as the chromatograms so that they can be crosschecked. When we found a clear mismatch in several samples, we designated these samples as containing new alleles and revised the BoLA-DRB3 database to include the new allele sequences. The accuracy of the in silico genotyping method was demonstrated in Takeshima et al. (2001, 2011), where the newly detected alleles were confirmed by cloning and sequencing; the method was developed and validated only for the BoLA-DRB3 gene. If a sample could not be genotyped using these criteria, we discarded its result from this analysis.
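The principle behind sequence-based typing can be illustrated as follows: a directly sequenced heterozygous animal yields one sequence with ambiguity codes at the positions where its two alleles differ, and the genotype is the pair of known alleles whose combined sequence matches it. The sketch below is an assumption-laden illustration of that idea, not the algorithm of the Assign software; the allele names and six-base sequences are hypothetical.

```python
from itertools import combinations_with_replacement
from typing import Dict, List, Tuple

# IUPAC codes for single bases and two-base ambiguities (keys are sorted bases).
_IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "AG": "R", "CT": "Y", "CG": "S", "AT": "W", "GT": "K", "AC": "M",
}

# Hypothetical allele database with very short sequences (toy data).
toy_alleles: Dict[str, str] = {
    "allele_1": "ACGTAC",
    "allele_2": "ACGTAT",
    "allele_3": "GCGTAC",
}


def diploid_consensus(seq1: str, seq2: str) -> str:
    """Combine two allele sequences into one sequence with IUPAC ambiguity codes."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    return "".join(_IUPAC["".join(sorted({a, b}))] for a, b in zip(seq1, seq2))


def call_genotype(observed: str, alleles: Dict[str, str]) -> List[Tuple[str, str]]:
    """All pairs of known alleles whose combined sequence equals `observed`."""
    names = sorted(alleles)
    return [
        (x, y)
        for x, y in combinations_with_replacement(names, 2)
        if diploid_consensus(alleles[x], alleles[y]) == observed
    ]


if __name__ == "__main__":
    print(call_genotype("ACGTAY", toy_alleles))
    print(call_genotype("ACGTAG", toy_alleles))
```

An empty result corresponds to the situation described above, in which no pair of known alleles explains the chromatogram and the sample is either examined as a candidate new allele or discarded.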

Statistical analyses.
Genetic diversity at allele level. Allele frequencies and the number of alleles (n a ) were obtained by direct counting. The distribution of alleles across breeds was analyzed by a Venn plot created using the R package 'VennDiagram' (http://cran.r-project.org/). The observed (h o ) and unbiased expected (h e ) heterozygosity of the BoLA-DRB3 locus were estimated according to 73 using the Arlequin 3.5 software for population genetic analyses 76 (Schneider, 2000). F IS statistics 77 for each breed were calculated using the Exact Test included in Genepop 4.7 software 78 to evaluate deviation from Hardy-Weinberg equilibrium (HWE). The Ewens-Watterson-Slatkin Exact Test of neutrality was carried out using the method described by 79 and implemented in the Arlequin 3.5 program.
Breed genetic structure. Genetic structure and genetic differentiation within Sudanese cattle breeds and among bovine breeds were assessed using Wright's F ST statistics 77 . This parameter was estimated using Arlequin 3.5 and Genepop 4.7 software. The F ST values were represented graphically using the pairFstMatrix.r function implemented in the R statistical environment.
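To make the meaning of F ST concrete, the sketch below computes a simple single-locus fixation index from allele frequencies as (H T - H S) / H T, where H S is the mean within-population gene diversity and H T is the gene diversity of the pooled frequencies. This is a simplified G ST-type estimator with equal population weights, assumed here for illustration; it is not the AMOVA-based or Weir and Cockerham estimators implemented in Arlequin and Genepop, and the frequencies are hypothetical.

```python
from typing import Dict, List

# Hypothetical allele frequencies in two populations (toy data).
toy_pops: List[Dict[str, float]] = [
    {"a": 0.8, "b": 0.2},
    {"a": 0.6, "b": 0.4},
]


def gene_diversity(freqs: Dict[str, float]) -> float:
    """Gene diversity: probability that two random gene copies differ."""
    return 1.0 - sum(p * p for p in freqs.values())


def fixation_index(pops: List[Dict[str, float]]) -> float:
    """(H_T - H_S) / H_T with equal weight for every population."""
    h_s = sum(gene_diversity(freqs) for freqs in pops) / len(pops)
    alleles = set().union(*(freqs.keys() for freqs in pops))
    pooled = {
        allele: sum(freqs.get(allele, 0.0) for freqs in pops) / len(pops)
        for allele in alleles
    }
    h_t = gene_diversity(pooled)
    if h_t == 0.0:
        return 0.0  # a single fixed allele everywhere: no differentiation to measure
    return (h_t - h_s) / h_t


if __name__ == "__main__":
    print(round(fixation_index(toy_pops), 4))
```

Values near zero, such as those reported for the Sudanese breeds, indicate that almost all of the gene diversity is found within populations rather than between them.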
Genetic relationship between breeds. To condense the genetic variation at the BoLA-DRB3 locus, allele frequencies were used to perform a PCA according to the 80 method, implemented in Past software 81 . Nei's standard genetic distances Ds 82,83 were calculated from allele frequencies and were used to perform cluster analysis using the Neighbor-Joining (NJ) algorithm 84 . Confidence intervals for the groupings were estimated by bootstrap resampling of the data using 1000 replicates. Genetic distances and trees were computed using the Populations 1.2.28 software 84 . The trees were then visualized using TreeView 85 .
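For a single locus, the two distances named in this work can be written directly from allele frequencies: Nei's standard distance is D S = -ln(J xy / sqrt(J x J y)), where J x and J y are the sums of squared frequencies in each breed and J xy is the sum of frequency products, and the D A distance is 1 minus the sum of sqrt(x i y i). The sketch below implements these formulas for one locus without sample-size corrections, which is an assumption and may differ from the exact options used in the Populations software; the frequencies are hypothetical.

```python
import math
from typing import Dict

# Hypothetical allele frequencies in two breeds (toy data).
toy_x: Dict[str, float] = {"a": 0.8, "b": 0.2}
toy_y: Dict[str, float] = {"a": 0.6, "b": 0.4}


def nei_standard_distance(x: Dict[str, float], y: Dict[str, float]) -> float:
    """D_S = -ln(J_xy / sqrt(J_x * J_y)) for a single locus."""
    j_x = sum(p * p for p in x.values())
    j_y = sum(p * p for p in y.values())
    j_xy = sum(p * y.get(allele, 0.0) for allele, p in x.items())
    if j_xy == 0.0:
        return math.inf  # no shared alleles: the distance is unbounded
    return -math.log(j_xy / math.sqrt(j_x * j_y))


def nei_da_distance(x: Dict[str, float], y: Dict[str, float]) -> float:
    """D_A = 1 - sum over alleles of sqrt(x_i * y_i) for a single locus."""
    return 1.0 - sum(math.sqrt(p * y.get(allele, 0.0)) for allele, p in x.items())


if __name__ == "__main__":
    print(round(nei_standard_distance(toy_x, toy_y), 4))
    print(round(nei_da_distance(toy_x, toy_y), 4))
```

A matrix of such pairwise values between all breeds is the input from which a neighbor-joining dendrogram is built.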
Genetic diversity at sequence level. Nucleotide diversity (π) and pairwise differences in nucleotide substitutions between alleles within each breed were calculated using Arlequin 3.5. The mean number of nonsynonymous (d N ), and synonymous (d S ) nucleotide substitutions per site from averaging over all sequence pairs were estimated within each group using the modified Nei-Gojobori model 83 and Jukes-Cantor's formula implemented in the software MEGA X 72 . The possibility that certain codon sites are under diversifying selection within each native Sudan breed was investigated using the Bayesian method implemented using OmegaMap 86 . This method incorporates intragenic recombination and does not assume a known fixed genealogy, so that recombination does not inflate the false detection rate of positive sites 87 . The BoLA-DRB3 allele tree was constructed from a distance matrix that was based on the NJ method using the MEGA X software. Furthermore, a tree based only on ABS amino acid motifs was inferred using Maximum Parsimony method implemented in MEGA X. To test the significance of the branches of both trees, 1000 bootstrap replicate calculations were performed.
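Nucleotide diversity and the mean number of pairwise differences are linked by the sequence length: π is the mean number of differences between two randomly chosen gene copies divided by the number of sites compared. The sketch below follows Nei's estimator, n/(n-1) times the sum of x i x j d ij over allele pairs; whether this matches the exact computation in Arlequin 3.5 is an assumption, and the sequences and frequencies are hypothetical.

```python
from typing import Dict

# Hypothetical aligned allele sequences and their frequencies (toy data).
toy_seqs: Dict[str, str] = {"a1": "ACGT", "a2": "ACGA"}
toy_freqs: Dict[str, float] = {"a1": 0.5, "a2": 0.5}


def hamming(seq1: str, seq2: str) -> int:
    """Number of sites at which two aligned sequences differ."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a != b for a, b in zip(seq1, seq2))


def mean_pairwise_differences(seqs: Dict[str, str], freqs: Dict[str, float],
                              n_copies: int) -> float:
    """n/(n-1) * sum_ij x_i * x_j * d_ij, with n the number of gene copies sampled."""
    total = sum(
        freqs[i] * freqs[j] * hamming(seqs[i], seqs[j])
        for i in freqs
        for j in freqs
    )
    return n_copies / (n_copies - 1) * total


def nucleotide_diversity(seqs: Dict[str, str], freqs: Dict[str, float],
                         n_copies: int) -> float:
    """Mean pairwise differences per site."""
    length = len(next(iter(seqs.values())))
    return mean_pairwise_differences(seqs, freqs, n_copies) / length


if __name__ == "__main__":
    print(mean_pairwise_differences(toy_seqs, toy_freqs, n_copies=4))
    print(nucleotide_diversity(toy_seqs, toy_freqs, n_copies=4))
```

In the toy example, four gene copies (two of each allele) form six pairs, four of which differ at one site, so the mean is 4/6 differences, or 1/6 per site over the four sites.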

Data availability
The Supplementary Material contains Tables S1-S5 and Figures S1-S3, including detailed descriptions of all supplemental files.