HLA diversity in the Argentinian Umbilical Cord Blood Bank: frequencies according to donor’s reported ancestry and geographical distribution

Umbilical cord blood (UCB) is a suitable source for hematopoietic stem cell transplantation. HLA typing by next generation sequencing is commonly used in the transplant setting. Donor/patient HLA matching is often higher within groups of common ancestry; however, “Hispanic” is a broad category that fails to represent Argentina’s complex genetic admixture. Our aim is to describe the HLA diversity of banked UCB units collected across the country, taking into consideration donors’ reported ancestral origins as well as geographic distribution. Our results showed an even distribution of units, mainly between 2 groups: of European and of Native American descent, each associated with a defined geographic location pattern (Central vs. North regions). We observed differences in allele frequency distributions for some alleles previously described in Amerindian populations: for Class I (A*68:17, A*02:11:01G, A*02:22:01G, B*39:05:01, B*35:21, B*40:04, B*15:04:01G, B*35:04:01, B*51:13:01) and Class II (DRB1*04:11:01, DRB1*04:07:01G/03, DRB1*08:02:01, DRB1*08:07, DRB1*09:01:02G, DRB1*14:02:01, DRB1*16:02:01G). Our database expands the current knowledge of HLA diversity in the Argentinian population. Although further studies are necessary to fully comprehend HLA heterogeneity, this report should prove useful to increase the possibility of finding compatible donors for successful allogeneic transplant and to improve recruitment strategies for UCB donors across the country.

HLA allele frequencies. Detailed information on individual allele frequencies for all loci, in the general population and in the SAE and SAA groups, can be found in Supplementary Table 1.

Common, intermediate and well documented alleles.
To further study the alleles found in our samples, we classified them according to the CIWD catalogue 10. The majority of the alleles detected at each locus are common (> 80% for all loci) (Table 4).
Well documented (WD) alleles were observed at all loci. Notably, A*01:104, catalogued in CIWD for the Asian Pacific Islander (API) population, was observed in the SAE group, and A*03:08, catalogued for European (EURO) and African (AFA) populations, was found in the SAA group. Also, A*68:23 and A*24:175, catalogued for Hispanic (HIS) and EURO populations, were seen in our SAE and SAA groups, respectively.
Regarding HLA-B WD alleles, B*08:33 and B*15:70, both catalogued for HIS and EURO populations, were seen in the OU group (other mixed or unknown origin). Also, B*49:18:02, catalogued for the HIS population, was found in our SAE group, whereas B*51:04, catalogued for several populations (HIS, EURO, AFA, and Middle East/North Coast of Africa, MENA), was seen in our SAA group.
Most of the HLA-C well documented alleles found in our cohort are catalogued for EURO and HIS populations; notably, they were found in the SAA group (C*07:206, C*15:03), in the OU group (C*15:08) and in the SAE and SAA Mixed (M) group (C*06:30).
Within Class II, only one well documented allele was found for each of HLA-DRB1 and HLA-DQB1: DRB1*14:02:02, catalogued for EURO and HIS ancestry, was seen in the OU group, and DQB1*04:02:03, catalogued for EURO and API ancestry, was found in the SAA group.
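The classification step described at the start of this subsection can be illustrated with a minimal Python sketch. It assumes that the reported percentage refers to the share of distinct alleles per locus that fall in each CIWD category; the lookup table `CIWD_CATEGORY`, the function `category_shares` and the list `observed` are hypothetical placeholders for illustration, not the catalogue itself nor the study data.

```python
from collections import Counter

# Hypothetical excerpt of a CIWD lookup (allele -> category) for one locus.
# The two "WD" entries follow the text; the "C" entries are illustrative.
CIWD_CATEGORY = {
    "A*02:01": "C",
    "A*01:01": "C",
    "A*03:01": "C",
    "A*24:02": "C",
    "A*01:104": "WD",
    "A*03:08": "WD",
}


def category_shares(observed_alleles, lookup):
    """Return the share of distinct observed alleles in each category."""
    distinct = set(observed_alleles)
    # Alleles missing from the lookup are counted under their own label.
    counts = Counter(lookup.get(a, "not in catalogue") for a in distinct)
    total = len(distinct)
    return {category: n / total for category, n in counts.items()}


# Made-up typing results for one locus (one entry per allele copy).
observed = ["A*02:01", "A*02:01", "A*01:01", "A*03:01", "A*24:02", "A*01:104"]
shares = category_shares(observed, CIWD_CATEGORY)
print(shares)
```

Counting allele copies instead of distinct alleles would only require dropping the `set` call; which of the two definitions underlies Table 4 is not stated here, so the choice above is an assumption.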

Haplotype frequencies.
A complete list of predicted five-locus haplotypes is given in Supplementary Tables 4-6. A total of 10872 HLA-A~C~B~DRB1~DQB1 haplotypes were estimated in our general population (Suppl. Table 4), with 655 haplotypes accounting for almost 100% cumulative frequency. The list of the 15 most frequent allelic combinations (> 5%) is available in Table 5.
For the SAE group, a total of 4337 haplotypes were estimated, 277 of which accounted for a cumulative frequency of almost 100% (Suppl. Table 5).
For the SAA group, a total of 4522 haplotypes were estimated, 299 of which accounted for a cumulative frequency of almost 100% (Suppl. Table 6).
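As an illustration of how such cumulative-frequency counts can be obtained from a list of estimated haplotype frequencies, the following Python sketch ranks haplotypes from most to least frequent and counts how many are needed to reach a given cumulative threshold. The function name `haplotypes_to_reach` and the example frequencies are hypothetical and are not taken from the Supplementary Tables.

```python
def haplotypes_to_reach(freqs, threshold):
    """Count the haplotypes, ranked from most to least frequent,
    needed for the cumulative frequency to reach `threshold`."""
    running = 0.0
    for n, f in enumerate(sorted(freqs, reverse=True), start=1):
        running += f
        if running >= threshold:
            return n
    # Threshold never reached: every haplotype is needed.
    return len(freqs)


# Made-up estimated frequencies (they sum to 1).
example_freqs = [0.0625, 0.5, 0.125, 0.25, 0.0625]
n_needed = haplotypes_to_reach(example_freqs, 0.85)
print(n_needed)
```

With real EM output, many haplotypes carry frequencies very close to zero, which is consistent with a small subset of the estimated haplotypes accounting for almost all of the cumulative frequency.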

Discussion
This is the first report on high-resolution HLA diversity in Argentinian cord blood units. It is also the first report on Argentina's population that analyzes allele and haplotype frequencies taking into consideration donors' reported ancestral origins.
As previously described, different ancestral origins coexist within Argentina's borders, mainly European (especially from Italy and Spain) and Native American (Mapuches, Kollas, etc.), but also African and Asian. This diversity also follows a marked geographic pattern: groups of Amerindian descent tend to live in the Northern and Southern regions of the country. Therefore, our analysis included the general population, but also a sub-study to account for these differences. Our results showed that, out of 451 UCB units analyzed, the SAE group (of European descent) and the SAA group (of Native American descent) were similar in size (SAE = 171 vs. SAA = 184, Fig. 1). As expected, a geographic pattern was also evident: the SAE group was more frequent in the Central region and the SAA group was more frequent in the North region.
To a great extent, our results regarding allele frequencies for all loci and haplotype estimations are in accordance with previous reports on Argentinian bone marrow donors by the National Registry 34,35. Nonetheless, numerous differences were found between the SAE and SAA groups.
Finally, when we analyzed the haplotype estimations for the general population and for the SAE and SAA groups, most of the most frequent estimated haplotypes had already been described, at the same level of resolution, in bone marrow donors by the Argentinian Registry (Table 5) 34.
In conclusion, our results showed clear differences in allele frequencies between the two groups (SAE vs. SAA). We believe it is important to represent, in our Public Umbilical Cord Blood Bank, all of Argentina's HLA diversity, including our ethnic minorities (people born in South America belonging to Amerindian populations), who are poorly represented in other registries. This lack of representation in worldwide registries, among other challenges, is reflected in the failed searches for unrelated donors for certain patients, not only in our population but also for individuals in neighboring countries in South America 36. Thus, we believe this report will not only provide a better comprehension of HLA heterogeneity in the Argentinian population but also help improve our recruitment strategies for UCB donors across the country.

Materials and methods
Samples. The results were reported back to INCUCAI with a "G" level resolution. Het.) and thus a deviation from HWE. Linkage disequilibrium analysis was performed between all pairs of loci with unknown gametic phase (significance level < 0.05). We generated estimated five-locus haplotype frequencies (A~C~B~DRB1~DQB1) for the general population and for the SAE and SAA groups using the iterative expectation-maximization (EM) algorithm (ε = 1e-7).
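For readers unfamiliar with EM-based haplotype estimation from genotypes of unknown gametic phase, the following Python sketch shows the general technique on a toy two-locus example. It is not the software used in this study: the function names, the toy genotypes, the uniform starting frequencies and the choice to apply ε to the largest change in any haplotype frequency between iterations are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import product


def compatible_pairs(genotype):
    """List the distinct unordered haplotype pairs consistent with an
    unphased genotype, given as one allele pair per locus."""
    first, rest = genotype[0], genotype[1:]
    pairs = set()
    # Fix the first locus and try both orientations of every other locus.
    for flips in product((False, True), repeat=len(rest)):
        h1, h2 = [first[0]], [first[1]]
        for (x, y), flip in zip(rest, flips):
            if flip:
                x, y = y, x
            h1.append(x)
            h2.append(y)
        pairs.add(tuple(sorted((tuple(h1), tuple(h2)))))
    return sorted(pairs)


def em_haplotype_frequencies(genotypes, eps=1e-7, max_iter=1000):
    """Estimate haplotype frequencies by EM, assuming random mating."""
    phasings = [compatible_pairs(g) for g in genotypes]
    haplotypes = sorted({h for ps in phasings for pair in ps for h in pair})
    freq = {h: 1.0 / len(haplotypes) for h in haplotypes}  # uniform start
    n_chromosomes = 2 * len(genotypes)
    for _ in range(max_iter):
        counts = defaultdict(float)
        for ps in phasings:
            # E-step: posterior weight of each phasing of this individual.
            weights = [(1 if h1 == h2 else 2) * freq[h1] * freq[h2]
                       for h1, h2 in ps]
            total = sum(weights)
            for (h1, h2), w in zip(ps, weights):
                counts[h1] += w / total
                counts[h2] += w / total
        # M-step: expected haplotype counts become the new frequencies.
        new_freq = {h: counts[h] / n_chromosomes for h in haplotypes}
        delta = max(abs(new_freq[h] - freq[h]) for h in haplotypes)
        freq = new_freq
        if delta < eps:
            break
    return freq


# Toy data: two double homozygotes and one double heterozygote.
toy_genotypes = [
    (("A1", "A1"), ("B1", "B1")),
    (("A2", "A2"), ("B2", "B2")),
    (("A1", "A2"), ("B1", "B2")),
]
toy_freqs = em_haplotype_frequencies(toy_genotypes)
for haplotype, f in toy_freqs.items():
    print("~".join(haplotype), round(f, 4))
```

In this toy example the double heterozygote is phase-ambiguous, and the EM iterations shift its weight toward the A1~B1 / A2~B2 phasing because those haplotypes are supported by the homozygous individuals. The same enumeration extends to five loci, at the cost of up to 16 candidate phasings per individual.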