Targeted analysis of genomic regions enriched in African ancestry reveals novel classical HLA alleles associated with asthma in Southwestern Europeans

Although asthma has a considerable genetic component, an important proportion of the genetic risk remains unknown, especially in non-European populations. Canary Islanders have the largest African genetic ancestry observed among Southwestern Europeans and the highest asthma prevalence in Spain. Here we examined broad chromosomal regions previously associated with an excess of African genetic ancestry in Canary Islanders, with the aim of identifying novel risk variants associated with asthma susceptibility. In a two-stage case-control study, we identified a variant within HLA-DQB1 significantly associated with asthma risk (rs1049213, meta-analysis p = 1.30 × 10⁻⁷, OR [95% CI] = 1.74 [1.41–2.13]) that had previously been associated with asthma and with a broad allergic phenotype. Subsequent fine-mapping analyses of classical HLA alleles revealed a novel allele significantly associated with asthma protection (HLA-DQA1*01:02, meta-analysis p = 3.98 × 10⁻⁴, OR [95% CI] = 0.64 [0.50–0.82]) that had previously been linked to infectious and autoimmune diseases and to peanut allergy. HLA haplotype analyses revealed a novel haplotype, DQA1*01:02-DQB1*06:04, conferring asthma protection (meta-analysis p = 4.71 × 10⁻⁴, OR [95% CI] = 0.47 [0.29–0.73]).
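As a reading aid, the relationship between a reported odds ratio, its 95% confidence interval, and the p-value can be sketched under a simple Wald (normal) approximation on the log-odds scale. This is an illustrative assumption rather than the meta-analysis model used in the study, and the function name `wald_from_ci` is hypothetical. Because the published estimates are rounded, the recovered p-value is only expected to match the reported one approximately.

```python
import math

def wald_from_ci(or_point, ci_low, ci_high, z_crit=1.959964):
    """Recover the log-OR standard error, z statistic, and two-sided
    p-value from an odds ratio and its 95% CI (Wald approximation)."""
    # The CI spans 2 * z_crit standard errors on the log scale.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = math.log(or_point) / se
    # Two-sided tail probability of a standard normal variable.
    p = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p

# Values reported above for rs1049213 (HLA-DQB1).
se, z, p = wald_from_ci(1.74, 1.41, 2.13)
print(f"se(log OR) = {se:.3f}, z = {z:.2f}, approximate p = {p:.2e}")
```

The same function can be applied to the allele and haplotype estimates; differences from the reported p-values can arise from rounding and from the actual meta-analysis weighting scheme.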


Functional annotation of variants and gene expression
We explored the potential biological consequences of the single nucleotide polymorphism (SNP) predicting the altered codon in the amino acid in the classical HLA allele and of its best proxies (i.e., those in strong linkage disequilibrium [LD] in Europeans, r² > 0.7) using different in silico tools, with the aim of providing additional information on the role of HLA class II gene dysregulation in asthma. Variant prioritisation was based on functional scores obtained with DSNetwork [1] and RegulomeDB [2], which allowed us to identify the most probable functional variants. We used the Ensembl Variant Effect Predictor (VEP) to determine the consequences of the top variants on genes, protein sequence, and regulatory regions [3]. Additionally, we assessed the potential regulatory role of the variants using HaploReg v4.1 [4] and RegulomeDB, including the evaluation of chromatin states by identifying histone marks, DNase I hypersensitive sites, and altered regulatory motifs in that region. We used Capture Hi-C Plotter [5] to analyse the existence of long-distance physical chromatin interactions with regulatory elements and gene promoters in different tissues, using the default score of 5 as the threshold for identifying promoter capture Hi-C (PCHi-C) interactions. We also accessed GTEx [6], ExSNP [7], and SNPdelScore [8] to evaluate tissue-specific local expression quantitative trait loci (eQTLs) and splicing quantitative trait loci (sQTLs). We reported QTL associations at p ≤ 0.05.
In parallel, we accessed the results of two public gene expression studies of asthma catalogued in the Gene Expression Omnibus (GEO). First, we accessed transcriptomic data of samples from 27 healthy controls and 128 individuals with asthma (GSE63142) [9]. These samples were obtained from the Severe Asthma Research Program (SARP) [10], and the RNA was extracted from bronchial epithelial cells. Patients were classified according to their phenotype into 72 with non-severe and 56 with severe asthma. Then, we accessed gene expression results of bronchoalveolar lavage (BAL) samples from 12 healthy controls and 74 individuals with asthma (28 non-severe asthma, 46 severe asthma) (GSE74986) [11]. Samples from healthy controls and patients with moderate asthma are from the Study of the Mechanisms of Asthma (MAST, ClinicalTrials.gov: NCT00595153), while samples from patients with severe asthma are from the BOBCAT study [12]. For both transcriptomic studies (GSE63142 and GSE74986), expression arrays were used to obtain the gene expression profiles. Differential gene expression between cases with asthma and healthy controls was examined using shinyGEO [13] and R [14] for the genes in the vicinity of the SNPs and classical HLA alleles significantly associated with asthma susceptibility in our study (i.e., HLA-DQA1 and HLA-DQB1).
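The group comparisons used in this section (two-sample t-tests and one-way ANOVA on probe intensities) can be illustrated with a minimal standard-library sketch. It is not the shinyGEO or R code used in the study. The function names are hypothetical, the intensity values are invented placeholders rather than GEO data, and the Welch (unequal-variance) form of the t-test is an assumption, since the variant of the test is not specified here. Only the test statistics and degrees of freedom are computed; p-values would require the t and F distributions from a statistics package.

```python
from statistics import mean, variance

def welch_t(a, b):
    """Welch's two-sample t statistic and Welch-Satterthwaite df."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / (va + vb) ** 0.5
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

def anova_f(*groups):
    """One-way ANOVA F statistic with between- and within-group df."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum(sum((x - mean(g)) ** 2 for x in g) for g in groups)
    df_between, df_within = len(groups) - 1, n_total - len(groups)
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within

# Hypothetical log2 probe intensities (placeholders, not study data).
controls = [8.1, 7.9, 8.4, 8.0]
non_severe = [8.6, 8.9, 8.5, 8.8]
severe = [9.2, 9.0, 9.5, 9.1]

print("ANOVA (F, df_between, df_within):", anova_f(controls, non_severe, severe))
print("Welch t (t, df), severe vs controls:", welch_t(severe, controls))
```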
We also included HLA-DRB1 in the analyses since the best-ranked proxy (rs9271588) for rs10093 is an intergenic variant located between HLA-DQA1 and HLA-DRB1. We assessed the average intensity differences of the matrix probes targeting HLA-DQA1, HLA-DQB1, and HLA-DRB1 using two-sample t-tests and one-way analyses of variance (ANOVA) (Figures S3-4).

Figure S3. [...] healthy controls, 72 individuals with non-severe asthma, and 56 with severe asthma obtained from transcriptomic data from bronchial brushing samples. Expression differences were assessed using ANOVA followed by t-tests. The probes used for each of the genes are indicated in parentheses in the title. Data obtained from the GEO accession GSE63142.

Figure S4. Gene expression of (a) HLA-DQB1 and (b) HLA-DRB1 in 12 healthy controls, 28 individuals with non-severe asthma, and 46 with severe asthma obtained from transcriptomic data from bronchoalveolar lavage (BAL) samples. Expression differences were assessed using ANOVA followed by t-tests. The probes used for each of the genes are indicated in parentheses in the title. Data obtained from the GEO accession GSE74986.

Tables

Table S1. Total number of filtered variants tested in the targeted association analysis for each of the chromosomal regions with an excess of African local ancestry.

Chromosome
Chromosomal