Associations between human leukocyte antigens and renal function

Human leukocyte antigens (HLA) have been associated with renal function, but previous studies report contradictory findings with little consensus on the exact nature or impact of this observation. This study included 401,307 white British subjects aged 39–73 when they were recruited by UK Biobank. Subjects’ HLA types were imputed using HLA*IMP:02 software. Regression analysis was used to compare 362 imputed HLA types with estimated glomerular filtration rate (eGFR) as a primary outcome and clinical indications as secondary outcome measures. 22 imputed HLA types were associated with increased eGFR (and therefore increased renal function). Decreased eGFR (decreased renal function) was associated with 11 imputed HLA types, seven of which were also associated with increased risk of end-stage renal disease and/or chronic kidney disease. Many of these HLA types are commonly inherited together in established haplotypes, for example: HLA-A*01:01, B*08:01, C*07:01, DRB1*03:01, DQB1*02:01. This haplotype has a population frequency of 9.5% in England and each allele was associated with decreased renal function. 33 imputed HLA types were associated with kidney function in white British subjects. Linkage disequilibrium in HLA inheritance suggests that this is not random and particularly affects carriers of established haplotypes. This could have important applications for the diagnosis and treatment of renal disease and global population health.


Methods
Study population and quality control. This is a UK Biobank (UKB) retrospective cohort study using data from 502,616 subjects aged 39-73 years at the time of recruitment between 2006 and 2010 14 . 88% of the cohort self-identifies as "white British", and principal component analysis conducted by UKB concluded that 82% of the UKB cohort is white British 15 . Analysis was restricted to this group to reduce population stratification; 92,858 subjects who were not white British were analysed separately 16 .
Individuals within the cohort whom UKB deemed to be related 17 (kinship coefficient ≥ 0.044) were also excluded (n = 7318) to avoid HLA frequency bias 18 . Where subjects were related, the individual with the most complete set of genetic data, based on a set of "high-quality markers" 17 , was included. Genetic sex influences kidney function 19 , so only individuals whose sex could be clearly assigned were included. Subjects identified by UKB to have sex chromosome karyotypes other than XX or XY 20 and those whose genetic sex, as calculated by UKB, did not match their self-reported sex 21 were removed (n = 786 in total). Finally, 347 subjects were excluded at UKB's recommendation due to missing genetic data 22 . A total of 101,309 subjects were excluded during quality control, leaving 401,307 subjects for analysis. All quality control was performed using Stata/SE 13.0 (StataCorp).
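The exclusion counts reported above can be checked against the stated totals with a few lines of arithmetic. The sketch below uses only the figures given in the text; the dictionary labels are shorthand introduced here for illustration, not UKB field names.

```python
# Cohort counts as reported in the quality control description.
total_recruited = 502_616

# Labels are illustrative shorthand, not UKB terminology.
exclusions = {
    "not white British": 92_858,
    "related (kinship coefficient >= 0.044)": 7_318,
    "sex karyotype or reported/genetic sex mismatch": 786,
    "missing genetic data": 347,
}

total_excluded = sum(exclusions.values())
remaining = total_recruited - total_excluded

print(total_excluded)  # 101309
print(remaining)       # 401307
```

Both figures agree with the totals stated in the text (101,309 excluded and 401,307 retained).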
HLA typing. Imputation estimates a person's most likely HLA type based on the presence of particular single nucleotide polymorphisms 23 . HLA types were imputed for each subject by UKB using HLA*IMP:02 software 24 at the following loci: HLA-A, B, C (Class I) and DPA1, DPB1, DQA1, DQB1, DRB1, DRB3, DRB4, DRB5 (Class II) 25 at a level equivalent to high resolution typing using eight reference datasets 26 . 362 HLA types were imputed. Two of these (HLA-DQB1*02:02 and DPB1*03:01) were not in Hardy-Weinberg equilibrium (HWE, P < 0.00014) so were excluded from this study; the remaining 360 alleles were included. Table 1 shows the 100 HLA types with frequency > 1% in the cohort.
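The text does not state which test was used to assess Hardy-Weinberg equilibrium. As a minimal sketch, assuming a one-degree-of-freedom chi-square test in which each HLA allele is treated as biallelic (the allele of interest versus all other alleles at the locus), the check could look like the following. The function name and the genotype counts in the usage lines are hypothetical.

```python
import math


def hwe_chi_square(n_hom_allele, n_het, n_hom_other):
    """Chi-square HWE test (1 d.f.) for one allele versus all others.

    Arguments are observed genotype counts: two copies of the allele,
    one copy, and zero copies. Returns (chi2, p_value).
    """
    n = n_hom_allele + n_het + n_hom_other
    if n == 0:
        raise ValueError("at least one genotyped subject is required")

    # Allele frequency estimated from the observed genotypes.
    p = (2 * n_hom_allele + n_het) / (2 * n)
    q = 1.0 - p

    # Genotype counts expected under Hardy-Weinberg proportions.
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_hom_allele, n_het, n_hom_other)

    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)

    # Survival function of a chi-square variable with 1 d.f.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts: the first set is in exact HWE, the second has no heterozygotes.
chi2_ok, p_ok = hwe_chi_square(25, 50, 25)
chi2_bad, p_bad = hwe_chi_square(50, 0, 50)

HWE_THRESHOLD = 0.00014  # threshold quoted in the text
print(p_ok >= HWE_THRESHOLD)   # True: retained
print(p_bad >= HWE_THRESHOLD)  # False: excluded
```

An exact test is often preferred for rare alleles; the chi-square form is used here only because it is compact.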
Measures of renal function. Renal function was determined using estimated glomerular filtration rate (eGFR), a measure of toxin filtration calculated using serum biomarkers such as creatinine and cystatin 27 . High levels of these biomarkers are indicative of poor renal function and manifest in a lower eGFR. This study used the Chronic Kidney Disease Epidemiology Collaboration (CKD-EPI) eGFR calculation which adjusted for age and sex 28 . Three eGFR values were calculated for each subject, using measures of: creatinine; cystatin; and both creatinine and cystatin 29 . Pairwise correlation confirmed that the three eGFR values were similar (Pearson's correlation coefficients > 0.6; P < 0.0001). The cystatin-based eGFR value provided the most complete dataset; only this value was used for analysis to avoid repetition of testing using closely correlated variables.
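The formula itself is not reproduced in the text. The sketch below assumes that the cystatin-based calculation is the 2012 CKD-EPI cystatin C equation as commonly published, with serum cystatin C in mg/L and eGFR in ml/min/1.73 m²; the function and parameter names are illustrative.

```python
def egfr_ckd_epi_cystatin(scys_mg_per_l, age_years, female):
    """eGFR (ml/min/1.73 m^2) from serum cystatin C.

    Assumed form (2012 CKD-EPI cystatin C equation):
    133 * min(Scys/0.8, 1)^-0.499 * max(Scys/0.8, 1)^-1.328
        * 0.996^age * 0.932 (if female)
    """
    ratio = scys_mg_per_l / 0.8
    egfr = (
        133.0
        * min(ratio, 1.0) ** -0.499
        * max(ratio, 1.0) ** -1.328
        * 0.996 ** age_years
    )
    if female:
        egfr *= 0.932
    return egfr


# Higher cystatin C gives a lower eGFR, matching the description in the text.
low_marker = egfr_ckd_epi_cystatin(0.8, 55, female=False)
high_marker = egfr_ckd_epi_cystatin(1.6, 55, female=False)
print(low_marker > high_marker)  # True
```

The age and sex terms in this form correspond to the adjustment for age and sex mentioned above.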
Clinical histories for each subject were used as secondary outcomes. Subjects with kidney dysfunction were identified by examining self-reported questionnaires in addition to data relating to clinical diagnosis and procedures undertaken. These were deduced using a combination of the International Classification of Diseases 30 (ICD)-9 and -10, Office of Population Censuses and Surveys (OPCS) 31 -3 and -4, and UKB's own coding systems 32,33 . Subjects were categorised as: ESRD patients (yes or no); kidney transplant recipients (yes or no); dependent on renal replacement therapy (RRT) including transplantation at any point (yes or no); and CKD patients of any stage (yes or no) (see Table 6).
Statistical analysis. Linear regression analysis was performed to test for associations between HLA alleles and eGFR as a continuous variable. All 360 HLA alleles which were in HWE were included, with a Bonferroni threshold of P < 0.00014 considered significant 34 (0.05/360). Subjects who had ever received RRT were excluded as their eGFR values may have suggested healthy renal function even though their native function was poor.
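The Bonferroni threshold quoted above is the family-wise significance level divided by the number of tests. The short sketch below reproduces the primary threshold from the stated 0.05/360. The secondary threshold of P < 0.001 used for the clinical outcomes is consistent with 0.05 divided by the 50 alleles tested, although the text does not spell out that division, so it is shown here as an assumption.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold for a family-wise error rate alpha."""
    return alpha / n_tests


primary = bonferroni_threshold(0.05, 360)   # eGFR analysis, 360 alleles in HWE
secondary = bonferroni_threshold(0.05, 50)  # assumed basis of the P < 0.001 threshold

print(f"{primary:.5f}")    # 0.00014
print(f"{secondary:.3f}")  # 0.001
```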
Logistic regression was used to test for associations between HLA types and adverse clinical outcomes (ESRD, RRT, CKD, and kidney transplantation; binary variables). Age at recruitment and sex were included as covariates, and only alleles in HWE with minor allele frequency > 5% were considered (n = 50) in order to increase statistical power. P < 0.001 was considered significant after Bonferroni correction. All regression analysis was performed using Plink software 35 .
Ethical approval. All methods were carried out in accordance with relevant guidelines and regulations. All experimental protocols were approved by UKB's Research Ethics Committee. Informed consent was obtained for all subjects. UKB has obtained Research Tissue Bank approval from its ethics committee that covers the majority of proposed uses of the resource, so researchers do not typically need to obtain separate ethics approval.

Results
Variation in renal function. Variation in renal function within the UKB cohort is outlined in Table 2.
eGFR values could not be calculated for around 18,000 subjects (< 5%) due to missing creatinine and/or cystatin measurements. Subjects dependent on RRT were excluded from the analysis, although their eGFR values are listed in Table 2, which shows calculated eGFR values and the corresponding CKD stages 36 as well as the number of subjects in the final analysis.
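Table 2 is not reproduced here, so the exact stage boundaries it uses are not known. As a sketch, assuming the widely used KDIGO GFR categories, an eGFR value could be mapped to a category as follows. The function name is illustrative, and categories G1 and G2 indicate CKD only when other evidence of kidney damage is present.

```python
def gfr_category(egfr):
    """Map eGFR (ml/min/1.73 m^2) to an assumed KDIGO GFR category."""
    if egfr >= 90:
        return "G1"   # normal or high
    if egfr >= 60:
        return "G2"   # mildly decreased
    if egfr >= 45:
        return "G3a"  # mildly to moderately decreased
    if egfr >= 30:
        return "G3b"  # moderately to severely decreased
    if egfr >= 15:
        return "G4"   # severely decreased
    return "G5"       # kidney failure


print(gfr_category(72.5))  # G2
```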
The calculated eGFR values were compared to average values to check that they were plausible. Average eGFR values for different age categories were taken from the National Kidney Foundation (NKF) 37 . The values calculated by this study were in line with NKF's estimates, as shown in Table 3.
www.nature.com/scientificreports/
Regression analysis. 33 HLA types were significantly associated with renal function after correction for multiple testing. Table 4 lists the 11 HLA alleles linked with decreased renal function (defined by either decreased eGFR or the presence of CKD or ESRD). Table 5 shows the 22 HLA alleles associated with increased renal function. No HLA associations were identified with kidney transplant status or RRT status. Tables 4 and 5 also show the population frequency of the alleles, the beta value or odds ratio (OR) of each effect, and the level of significance of the associations.
Associations with increased renal function. The HLA associations with increased eGFR values do not appear to belong to full length haplotypes, but can be separated into groups of two or three HLA alleles which are often co-inherited.

Discussion
We identified significant HLA associations with renal function in the largest reported study to date. 22 HLA alleles were associated with increased renal function and 11 with decreased function. The HLA associations with increased renal function did not suggest a protective effect against CKD or ESRD, but the 11 associations with decreased renal function (seven of which were also linked to ESRD and/or CKD) were of particular interest. HLA genes are inherited through maternal and paternal haplotypes, which suggests a high probability that these alleles are not independently associated with renal function, but rather that this observation is non-random within the population. Specifically, individuals who carry the haplotypes listed are at increased risk of developing renal dysfunction, and may carry sub-clinical levels of impairment even in the absence of identifiable disease. This clustering of the HLA genes within well-documented haplotypes adds validity, which is reinforced as the primary and secondary outcome measures were calculated using the independent phenotypes of biomarkers and clinical outcomes. It should be noted that some significant alleles appear to be alone in significance (that is, the alleles that they are in LD with were not significant). Examples include HLA-A*32:01 and B*14:01, among others. In these cases, it is possible that the allele itself is linked to kidney function, independent of its haplotype, or it is possible that the other alleles in LD with this allele are also significant, and this study failed to detect this.
The CKD-EPI calculation of eGFR was selected rather than MDRD 41 or Cockcroft-Gault 42 due to its increased accuracy when assessing subjects with normal renal function (eGFR > 60) 43 . Using only one eGFR value avoided multiple testing of closely related variables; the formula based on cystatin was selected as it had the fewest missing values.
For comparison, the two other CKD-EPI eGFR formulae (one based on creatinine, and another based on both creatinine and cystatin) were used and the data re-analysed. In addition to the associations already described, three additional associations were identified as significant (assuming the same Bonferroni threshold of P < 0.00014): HLA-A*23:01 and DRB3*02:02 were linked to decreased renal function, and B*27:05 was linked to increased function.
Comparison with previous research. Previous literature has reported conflicting HLA associations with renal function in populations of different ethnic origin. Potentially, these contradictory findings may include false positives arising from inadequate statistical power, multiple testing, publication bias or methodological differences. Alternatively, it is possible that HLA associations with kidney function differ between populations due to varied heritage. Limiting this study to white British subjects reduced the likelihood of bias due to population stratification. Almost 100 HLA associations with ESRD have been described, only 11 of which have been confirmed by two or more independent studies. Our study replicated one of these 11 observations but refuted two. HLA-DRB1*03 was previously associated with renal dysfunction by four groups with a combined total of 1261 ESRD subjects and over 3000 controls 5-7,44 . We found not only HLA-DRB1*03:01 but an entire haplotype to be associated with decreased eGFR and increased risk of poor clinical outcome. However, HLA-B*07 was reported to be protective against ESRD in 1620 ESRD patients and 1211 controls by Doxiadis et al. 10 , and in Karahan's study of 587 patients and 2643 controls 7 . In this population, HLA-B*07:02 was associated with decreased renal function. Furthermore, HLA-DRB1*04 was associated with adverse renal outcomes in three previous studies with over 4000 ESRD subjects 12,45,46 , but here, DRB1*04:01 was linked to increased renal function. The remaining eight previously replicated HLA associations were not significant in this study. Overall, 14 of our associations confirmed previous observations 5-7,12,44,45,47-49 , while 12 of our findings refuted previous results 7,10,12,45,46,49,50 .
It is worth noting, however, that this study is much larger than any previous study. Most previous studies used case-control methodology (see "Strengths and weaknesses" below) and many failed to adjust for multiple testing. Therefore, the findings reported here, which have undergone more stringent statistical testing, may be less prone to type I or II error.
Implications. This study is unique in that some of the HLA alleles associated with decreased renal function form a well-characterised haplotype. Both this and individual component HLA alleles have been associated with multiple diseases which result in CKD or ESRD, including systemic lupus erythematosus and IgA deficiency 51 . Our study indicates that even within a healthy population, renal function may be sub-clinically impaired in subjects with these alleles. These findings have the potential to impact upon clinical practice. HLA typing is already used as a diagnostic tool for disorders with strong HLA associations such as coeliac disease 52 , ankylosing spondylitis 53 , and actinic prurigo 54 . It may be advisable for clinicians to use HLA disease association typing to aid the diagnosis of renal failure, which could ensure timely therapeutic intervention. However, HLA associations with these diseases are much stronger than those reported here: the association between B*27 and ankylosing spondylitis has an odds ratio of 171 55 , while HLA associations with coeliac disease have OR > 10 56 , compared to ORs < 1.13 in this study. Clinicians and national kidney transplantation programmes may also use the HLA types associated with increased renal function to help identify suitable kidney donors.

Strengths and weaknesses.
A key advantage of this study is the cohort size, which is larger than any previously published research. 382,204 subjects were included in the analysis of the primary outcome measure (eGFR), and the secondary analysis consisted of 11,379 cases of ESRD (and 389,928 controls). This study uses a variety of measures of renal function, most of which are calculated independently and are therefore unlikely to be subject to systematic bias. eGFR is a useful outcome measure because it provides a continuous scale, giving an accurate and precise estimate of renal function. Many previous studies used case-control methodology, reducing kidney function from a spectrum to binary categorisations such as "ESRD or healthy". Measuring renal function on a spectrum may strengthen the statistical and clinical significance of this study.
A limitation of this investigation is that the HLA typing was performed by imputation rather than by direct genotyping, which is the more accurate method 57 . Imputation was used because HLA typing a cohort of this size using traditional methods would be prohibitively expensive. The imputation program used for the UKB population was HLA*IMP:02, though Karnes' review 57 of competing programs suggests that SNP2HLA is more accurate. Nevertheless, the review stated that HLA*IMP:02 is 94% accurate when imputing white subjects which, given the size of our cohort, is acceptable within the scope of this study. Furthermore, 360 of the 362 imputed alleles were in HWE (P > 0.00014), suggesting that the majority of imputed allele frequencies were consistent with frequencies that might be expected in a stable population. The two alleles which were not in HWE (HLA-DQB1*02:02 and DPB1*03:01) were excluded from the analysis. Some HLA associations found in this study do not appear to be part of a haplotype. These alleles may be independently associated with renal function, or they may be false positives caused by inaccurate imputation.
It is possible that the strategy employed to identify subjects with adverse kidney-related clinical outcomes was insufficiently comprehensive to capture all cases. If data held by UKB were incomplete, or if relevant codes were not included (see Table 6), subjects with poor renal outcomes would be mischaracterised as healthy. This could be averted by obtaining a peer-reviewed validation of the coding systems that documents exactly which codes are representative of adverse renal outcomes, but to the best of our knowledge no such validation exists. Clinical outcomes were secondary outcome measures in this study; the primary outcome of eGFR is not affected by this limitation.
A final limitation of this study is that the sizes of the associations with eGFR were smaller than previously published HLA disease associations 55,56 and possibly too small to be considered clinically relevant. 25 out of 33 (76%) significant associations with eGFR had a beta value between −0.5 and 0.5, suggesting that the presence of the allele has only a minor effect on kidney function. However, in seven cases, these apparently small effects were corroborated by associations with adverse clinical outcomes, implying that small beta values do not preclude clinical relevance.

Conclusions
This study has identified 22 HLA types which are associated with increased kidney function, and 11 which are linked to decreased kidney function in a large UK population. Many of these are commonly inherited together in haplotypes. Importantly, seven alleles, each of which is seen in between 14% and 34% of the cohort, were linked to both decreased eGFR and increased incidence of adverse clinical outcomes. Due to the constitution of the cohort, the results of this study can only be applied to white British people aged 39-73. Repeating the analyses with alternative cohorts may add considerably to our current knowledge and allow a better assessment of the implications for population health.