HLA class II genotyping of admixed Brazilian patients with type 1 diabetes according to self-reported color/race in a nationwide study

The HLA region is responsible for almost 50% of the genetic risk of type 1 diabetes (T1D). However, haplotypes and their effects on risk or protection vary among ethnic groups, particularly in admixed populations. We aimed to evaluate the HLA class II genetic profile of Brazilian individuals with T1D and its relationship with self-reported color/race. This was a nationwide multicenter study conducted in 10 Brazilian cities. We included 1,019 T1D individuals and 5,116 controls matched for the region of birth and self-reported color/race. Control participants belonged to the bone marrow transplant donor registry of Brazil (REDOME). HLA class II alleles (DRB1, DQA1, and DQB1) were genotyped using the SSO and NGS methods. The most frequent risk and protection haplotypes were HLA~DRB1*03:01~DQA1*05:01g~DQB1*02:01 (OR 5.8, p < 0.00001) and HLA~DRB1*07:01~DQA1*02:01~DQB1*02:02 (OR 0.54, p < 0.0001), respectively, regardless of self-reported color/race. Haplotypes HLA~DRB1*03:01~DQA1*05:01g~DQB1*02:01 and HLA~DRB1*04:02~DQA1*03:01g~DQB1*03:02 were more prevalent in the self-reported White group than in the Black group (p = 0.04 and p = 0.02, respectively). The frequency of haplotype HLA~DRB1*09:01~DQA1*03:01g~DQB1*02:02 was higher in individuals self-reported as Black than in those self-reported as White (p < 0.00001). No difference among the Brazilian geographical regions was found. Individuals with T1D presented differences in haplotype frequencies across self-reported color/race groups, but the most prevalent haplotypes, regardless of self-reported color/race, were those described previously in Europeans. We hypothesize that, in the T1D population of Brazil, although highly admixed, the disease risk alleles come mostly from Europeans as a result of centuries of colonization and migration.

The highest prevalence of T1D is observed in the European population 13, and most studies have concentrated on homogeneous populations 6,[14][15][16]. However, previous data have shown that the frequency of HLA haplotypes, as well as their effects on T1D risk or protection, can vary among populations 17. With the advancement of genetic risk scores for the diagnosis and prediction of T1D, it is critical to account for ethnic differences in the genetics of T1D that may impact clinical outcomes such as chronic complications. Haplotypes that denote risk in one population might be protective in another. For instance, the haplotype DRB1*07:01~DQA1*03:01~DQB1*02:02 appears to be protective in the European population and denotes susceptibility in African Americans 18.
Brazil has a large multiethnic population as a result of centuries of miscegenation since Portuguese colonization in 1500. The Brazilian population has three principal ancestral roots: European (EUR), sub-Saharan African (AFR), and Native American (NAM). The country was originally populated by NAM. With colonization, and later the slave trade, the EUR (particularly Portuguese) and AFR ancestries began to mix with the native population, spreading gradually to the interior of the country, which explains the substantial Brazilian genetic variability [19][20][21].
Data on the genetics of T1D in the highly admixed Brazilian population are scarce. In this study, the primary objective was to evaluate the HLA class II genetic profile of Brazilian individuals with T1D and its relationship with self-reported color/race (CRsr) in comparison to a sample of individuals without T1D from the bone marrow transplant donor registry of Brazil (REDOME), matched by region of birth and self-reported color/race. Second, we aimed to analyze regional geographic differences in the HLA class II risk distribution of individuals with T1D in Brazil, a country of continental proportions.

Research design and methods
Study design and population. This analysis derives from a nationwide multicenter cross-sectional study conducted between August 2011 and August 2014 in 14 public clinics located in 10 Brazilian cities. The methods have been described previously 22. Briefly, subjects received health care from the National Brazilian Health Care System (SUS) and were included in the study if they had been diagnosed based on a typical clinical presentation of T1D, including variable degrees of hyperglycemia, weight loss, polyuria, polydipsia, and polyphagia, and had required continuous insulin use since diagnosis, with at least six months of follow-up evaluations in each center. From the initial cohort of 1,760, we randomly selected 1,019 individuals by region of birth and CRsr. The comparison between the selected group and the initial group showed no differences regarding the principal clinical and demographic variables (data not shown). The institutional ethics committee of Pedro Ernesto University Hospital (State University of Rio de Janeiro) and each center's local ethics committee approved the study. All participants or their representatives signed written informed consent for the study. A standardized questionnaire was also used during a clinical visit to collect clinical and demographic data such as gender, current age, birthplace, self-reported color/race, age at diagnosis and duration of diabetes.
We also included information on HLA typing, region of birth and CRsr from 5,116 REDOME entries matched for the region of birth and CRsr at a 5:1 ratio. The inclusion criteria for donors at REDOME are 18 to 55 years of age, good health status, and no infection or hematological or immunological disease. Individuals who have a diagnosis of cancer or of diabetes treated with insulin or other injectable medication are also excluded from REDOME 23. We provide a supplemental figure with a flow chart of the selection process of individuals with and without T1D (Supplemental Fig. S1).
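The 5:1 matching on region of birth and CRsr can be pictured as stratified random sampling from the registry. The following is a minimal sketch under stated assumptions, not the procedure actually run against REDOME: the record fields (`region`, `crsr`), the function name, and the fixed seed are hypothetical and introduced only for illustration.

```python
import random
from collections import Counter, defaultdict


def sample_matched_controls(cases, registry, ratio=5, seed=0):
    """Draw up to `ratio` controls per case from the same (region, CRsr) stratum.

    `cases` and `registry` are lists of dicts with hypothetical keys
    'region' and 'crsr'. If a stratum holds fewer registry entries than
    requested, all of its entries are taken.
    """
    rng = random.Random(seed)
    # Number of cases in each (region, CRsr) stratum.
    needed = Counter((c["region"], c["crsr"]) for c in cases)
    # Registry entries grouped by the same stratum key.
    pool = defaultdict(list)
    for entry in registry:
        pool[(entry["region"], entry["crsr"])].append(entry)
    selected = []
    for stratum, n_cases in sorted(needed.items()):
        candidates = pool[stratum]
        k = min(ratio * n_cases, len(candidates))
        selected.extend(rng.sample(candidates, k))  # sampling without replacement
    return selected
```

Sampling within strata keeps the joint distribution of region and CRsr in the control group proportional to that of the cases, which is the property the matched design relies on.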
DNA extraction. Genomic DNA was extracted from peripheral blood with the commercial SP QIA Symphony Kit by automation with QIA Symphony equipment, following the manufacturer's instructions (Qiagen, USA).
HLA genotyping. HLA class II alleles (DRB1, DQA1, and DQB1) from 1,019 individuals with T1D were genotyped. Genotyping was performed using PCR-RSSO (LabType SSO2B1 High resolution, One Lambda Inc., West Hills, USA) in 543 (53.3%) participants with T1D; the remaining 476 (46.7%) had their DNA typed by next-generation sequencing (NGS). Of those, 352 were amplified at loci HLA-DRB1 and HLA-DQB1 by long-range PCR using primers from the NGSgo.v2 (GenDx, Utrecht, the Netherlands) Library Preparation Kit and 124 with the Holotype HLA Assay (Omixon Inc., Budapest, Hungary) for HLA-DRB1, HLA-DQB1 and HLA-DQA1, according to the manufacturer's instructions. These primers cover exons 2, 3, and 4. HLA-DQA1 alleles were imputed in 31.5% of the samples from the group of T1D individuals (n = 321) using linkage disequilibrium criteria, based on the results found by NGS.
The HLA genotyping results of the group of participants without T1D were obtained at high resolution at the DRB1 and DQB1 loci in 2,201 REDOME entries. The class II alleles assigned with NMDP codes at any locus were defined based on the Common and Well Documented (CWD) alleles, version 2.0 (n = 2,915). HLA-DQA1 alleles were typed in all control samples with PCR-RSSO (LabType SSO2B1 High resolution, One Lambda Inc., West Hills, USA).
Three-locus haplotype frequencies (DRB1~DQA1~DQB1) were estimated for each self-reported color/race group and region in both groups (individuals with and without T1D), resolving phase and allelic ambiguity using the expectation-maximization (EM) algorithm 25,26. Deviations from Hardy-Weinberg equilibrium (HWE) were assessed at the allele-family level (first nomenclature field) using a modified version of the Guo and Thompson algorithm 27 as implemented in Arlequin software v.3.5 28.
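To illustrate how an EM algorithm estimates haplotype frequencies from unphased genotypes, the sketch below implements the textbook version of the procedure. It is not the software used in this study: it resolves phase only (allelic ambiguity and missing data are ignored), the function names are hypothetical, and the toy genotypes are invented for illustration.

```python
from collections import defaultdict
from itertools import product


def compatible_pairs(genotype):
    """List the distinct unordered haplotype pairs consistent with an
    unphased genotype given as one (allele, allele) tuple per locus."""
    first, rest = genotype[0], genotype[1:]
    pairs = set()
    # Fix the orientation of the first locus and try both orientations
    # of every other locus; duplicates are removed by the set.
    for flips in product((False, True), repeat=len(rest)):
        h1, h2 = [first[0]], [first[1]]
        for (a, b), flip in zip(rest, flips):
            if flip:
                a, b = b, a
            h1.append(a)
            h2.append(b)
        pairs.add(tuple(sorted((tuple(h1), tuple(h2)))))
    return sorted(pairs)


def em_haplotype_frequencies(genotypes, n_iter=200, tol=1e-10):
    """Estimate haplotype frequencies by EM, assuming random mating."""
    expansions = [compatible_pairs(g) for g in genotypes]
    haplotypes = sorted({h for pairs in expansions for pair in pairs for h in pair})
    freq = {h: 1.0 / len(haplotypes) for h in haplotypes}  # uniform start
    total = 2.0 * len(genotypes)  # number of chromosomes
    for _ in range(n_iter):
        counts = defaultdict(float)
        # E-step: share each individual between its compatible pairs in
        # proportion to the product of the current haplotype frequencies.
        for pairs in expansions:
            weights = [freq[h1] * freq[h2] for h1, h2 in pairs]
            norm = sum(weights)
            for (h1, h2), w in zip(pairs, weights):
                counts[h1] += w / norm
                counts[h2] += w / norm
        # M-step: new frequencies are the expected haplotype counts.
        new_freq = {h: counts[h] / total for h in haplotypes}
        delta = max(abs(new_freq[h] - freq[h]) for h in haplotypes)
        freq = new_freq
        if delta < tol:
            break
    return freq


if __name__ == "__main__":
    # Invented toy data: loci in the order DRB1, DQA1, DQB1.
    toy = [
        (("03:01", "03:01"), ("05:01", "05:01"), ("02:01", "02:01")),
        (("03:01", "07:01"), ("02:01", "05:01"), ("02:01", "02:02")),
        (("07:01", "07:01"), ("02:01", "02:01"), ("02:02", "02:02")),
    ]
    for hap, f in em_haplotype_frequencies(toy).items():
        print("~".join(hap), round(f, 4))
```

In the toy data the two homozygous individuals fix the phase unambiguously, and the EM iterations use that information to resolve the doubly heterozygous individual toward the same two haplotypes.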
The most frequent haplotypes associated with risk for T1D (OR > 3.0) were compared among Brazilian regions.

Statistical analysis.
Categorical variables such as self-reported color/race, geographical region of birth and gender were presented as frequencies (percentages). All normally distributed continuous variables, such as age, duration of diabetes, and HbA1c values, were given as the mean ± standard deviation (SD). We used chi-square and Fisher's tests to compare categorical data; Student's t-test and analysis of variance (ANOVA) were used for comparisons of numeric variables between groups when indicated. Samples were divided into two groups (individuals with and without T1D) for population comparison testing. Arlequin software was used to calculate FST genetic distances and to perform the exact test of population differentiation based on allele frequencies 28. Tests were then repeated after dividing the two populations into smaller groups according to self-declared ethnicity (White, Black, and Brown, when n > 30) and region to detect potential ancestry-related or regional biases.
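The case-control comparisons in this section reduce to 2×2 tables (haplotype present or absent, in cases or controls). The sketch below shows one common way to obtain a Pearson chi-square p-value, an odds ratio with a 95% CI, and a Bonferroni-adjusted threshold. The Woolf (log-scale) interval is an assumption, since the CI method is not stated in the text, and the counts in the example are invented. Dividing 0.05 by the 68 haplotypes mentioned in the Table 6 legend gives about 0.00074, which matches the threshold reported there.

```python
import math


def pearson_chi2_2x2(a, b, c, d):
    """Pearson chi-square (1 df, no continuity correction) for the table
    [[a, b], [c, d]]; returns (statistic, p-value)."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper-tail probability of a chi-square variable with 1 df.
    p = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio with a Woolf (log-scale) confidence interval.
    Requires all four cells to be non-zero."""
    or_ = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lo = math.exp(math.log(or_) - z * se)
    hi = math.exp(math.log(or_) + z * se)
    return or_, lo, hi


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level after Bonferroni correction."""
    return alpha / n_tests


if __name__ == "__main__":
    # Invented counts: haplotype present/absent in cases (a, b) and controls (c, d).
    a, b, c, d = 30, 70, 10, 90
    print(pearson_chi2_2x2(a, b, c, d))
    print(odds_ratio_ci(a, b, c, d))
    print(bonferroni_threshold(0.05, 68))
```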
Bonferroni correction was applied for multiple tests. We used the Statistical Package for the Social Sciences version 17.0 (SPSS, Inc., Chicago, Illinois). A two-sided p-value of less than 0.05 was considered significant. Haplotype frequencies between cases and controls were compared using a Pearson chi-square test. Odds ratios (ORs) and 95% CIs were calculated.

Results
Population characteristics. Population characteristics are shown in Table 1. Half of the participants declared themselves as White. Individuals with T1D were older than healthy participants (p = 0.02). The group of individuals without diabetes had more male individuals than the group of individuals with T1D (p < 0.001).
Overview of the risk and protective alleles and/or haplotypes of the HLA system in individuals with and without T1D. Table 6 shows haplotypes that were seen at least 18 times in total in the T1D participants and the healthy control group. Other less frequent haplotypes were grouped as others.
HLA class II distribution by self-reported color/race. Figures 1 and 2 present bar plots with the distribution of the most prevalent risk and protection alleles in the T1D group, respectively, by self-reported color/race. Tables with the haplotype frequencies in both groups stratified by CRsr (White, Black, Brown, Asian and Indigenous) appear in the supplemental material. HLA-DRB1*03:01~DQA1*05:01~DQB1*02:01 was the most frequent risk haplotype in all self-reported color/race groups, and haplotype HLA-DRB1*07:01~DQA1*02:01~DQB1*02:02 was the most frequent protective haplotype in all groups but did not reach statistical significance in the Black, Asian and Indigenous groups. Haplotypes HLA-DRB1*03:01~DQA1*05:01g~DQB1*02:01 and -DRB1*04:02~DQA1*03:01g~DQB1*03:02 were significantly more prevalent in the self-declared White group than in the Black group (p = 0.04 and p = 0.02, respectively). Individuals self-reported as Black had a statistically higher prevalence of the haplotype HLA-DRB1*09:01~DQA1*03:01g~DQB1*02:02 compared to the White and Brown groups (p < 0.00001 and p = 0.008, respectively). This haplotype presented a higher frequency in the Brown group than in the White group (p = 0.001). Figure 3 shows the distribution of self-reported color/race for the most prevalent risk and protection alleles for all participants.
Frequent haplotypes associated with T1D risk grouped by Brazilian regions are shown in Supplemental Table S6. No statistical difference was observed.

Discussion
In general, our results are in accordance with previous studies in the European population as well as with the most recent regional studies in Brazil. The most frequent haplotype in all CRsr groups and geographical regions was HLA-DRB1*03:01~DQA1*05:01g~DQB1*02:01, which is also the most prevalent risk haplotype described in European populations. This suggests that, although highly admixed, the Brazilian population has a greater genetic influence from European populations. The miscegenation process in Brazil is relatively recent, beginning only 500 years ago with the arrival of the Portuguese colonizers (European ancestry). The native Brazilian population was originally formed by indigenous populations, who share some similar HLA alleles and haplotypes with Native Americans 29. Almost two centuries later, with the beginning of the slave trade, African ancestry began to contribute to the miscegenation of the Brazilian population. The roots of these three ancestries (European, Native Amerindian, and African) are the basis of our admixed population. The colonization history described above might explain the higher degrees of European ancestry in our population, as demonstrated in previous studies 19. Although our study design cannot confirm the hypothesis that, in the highly admixed T1D Brazilian population, the disease risk alleles come mostly from Europeans as a result of centuries of colonization and migration, data from two previous Brazilian studies showed that the incidence of T1D was greater in patients self-reported as White 30,31.
DRB1*03 and DRB1*04 alleles are known to be the most prevalent high-risk alleles among individuals with T1D, with individual allele frequencies varying between 20 and 30% 32. The highest frequencies are found in European populations, but they have also been described in African Americans 18. In Brazil, the frequencies of those alleles are as high as 28% 7, similar to those found in our sample (28.9%). Up to 63.3% of our T1D participants carry DRB1*04 and/or DRB1*03, and 39.2% carry both (either in homozygosis or heterozygosis) compared to 5.1% of the control group. It is important to note that the most frequent haplotypes in our analysis were not always the ones with the largest effect. For instance, although the DRB1*03:01~DQA1*05:01g~DQB1*02:01 haplotype was the most frequent in individuals with T1D (28.9%), the haplotype with the largest effect was DRB1*04:01~DQA1*03:01g~DQB1*03:02 (OR 6.6, CI 4.91-8.87, p < 0.000001).
The commonly described protection alleles are DRB1*03:02, DRB1*07, DRB1*10, DRB1*11, DRB1*13, DRB1*14, and DRB1*15. Frequencies of those alleles vary among populations 1. Haplotype HLA-DRB1*07:01~DQA1*02:01~DQB1*02:02/03:03 was more prevalent in our control group, with a frequency of up to 12.5% compared to 6.3% in the T1D group. This haplotype has been described as protective in previous studies in Brazil 32 as well as in European populations 33, but it was shown to be associated with risk in the African population 18. The same situation occurred with DRB1*13. Although the Brazilian population originates mainly from three ancestral roots, with African being one of them, it has lower degrees of sub-Saharan African genomic ancestry than populations 19. It is important to highlight that up to 51% of our T1D population declared themselves White as opposed to only 9% reported as Black.
Table 6. Distribution of the HLA-DRB1~DQA1~DQB1 haplotypes in individuals with and without type 1 diabetes mellitus. T1D = type 1 diabetes mellitus; n = number of individuals; OR = odds ratio; CI = confidence interval. Sixty-eight haplotypes with a total number in patients plus controls greater than 18 (0.3%) were included. P required for statistical significance after Bonferroni correction for multiple tests: <0.00074. Rare alleles were included in others.
Genotype DRB1*03/DRB1*04 presented the highest risk in our study, with an OR of 12.1 (CI 9.64-15.20, p < 0.000001), followed by DRB1*03/DRB1*03 (OR 10.6, CI 7.52-14.92, p < 0.00001). The DRB1*09 allele only presented risk when accompanied by one of the high-risk alleles (DRB1*03 or DRB1*04), and this combination was present in 4% of the individuals with T1D. This result is similar to previous studies in Brazil 7. A study in the African American population showed DRB1*09 to be a risk allele even when not associated with DRB1*03 or DRB1*04 18. This might be explained by the very low rates of Asian or African ancestry in our population, as discussed above and demonstrated in previous studies 19. One possible conclusion is that in admixed populations, such as that in Brazil, the disease was brought over by populations of European ancestry, with a stronger presence of DRB1*03 and DRB1*04 among those self-declared as White and the presence of DRB1*09 in those self-reported as Black. Nonetheless, although the frequency of the haplotype DRB1*09:01~DQA1*03:01g~DQB1*02:02 varied among the Brazilian regions within the T1D group (1.4% South vs. 7.8% Northeast), the difference was not statistically significant.
It is also important to highlight the rates of homozygosity found in our T1D population, where 9.81% of the individuals with T1D were homozygous for DRB1*03, 5.89% for DRB1*04 and only 0.2% for DRB1*09, similar to a previous study in Brazil 7. Noble et al.'s study in African Americans found similar rates for the DRB1*03 genotype but higher rates of DRB1*09 homozygosity 18, probably because, as explained above, that population has higher degrees of African ancestry.
Although we found differences in gender proportions between groups, HLA risk assessment usually does not differ between males and females. One study in children at risk of T1D found an association between gender and HLA risk alleles DRB1*03/DRB1*04 and islet autoimmunity 34 . This is probably not relevant in our population, as we included only individuals with T1D older than 13 years.
In our study, we analyzed only Class II HLA alleles. Although Class I alleles and non-HLA genes also contribute to T1D risk, Class II alleles such as DR and DQ demonstrate the strongest associations with the disease 1 . Recently, several risk scores for diagnosis and risk assessment of T1D have been proposed, and the vast majority of them are based on the presence of high-risk class II HLA alleles [10][11][12] .
The present study is the first multicenter study in T1D including all five geographical regions of the country with a large multiethnic sample. Additionally, we had a large number of controls matched by region of birth and CRsr, adding strength to our results. Another strength is that we used a uniform, standardized recruitment protocol in all participating centers and genotyped three loci (HLA-DRB1, -DQA1, and -DQB1). REDOME comprises HLA types from all regions with representative entries of the distinct CRsr groups, and the allele frequency distributions vary by both region and CRsr 21. To minimize these differences, a randomized selection drew on the information available in the REDOME database on a pair-matched CRsr and region basis.
Our study has some limitations. First, autoantibodies and C-peptide levels were not measured. The diagnosis of diabetes was made based on typical clinical presentation and the need for insulin since diagnosis. Although some individuals with other types of diabetes might have been included, it is important to emphasize that 96.5% of participants were diagnosed before 30 years of age, which makes it highly probable that they have T1D. Second, although T1D participants were from urban areas, patients who receive primary care and live in rural areas represent a minority of patients with T1D under treatment in Brazil.