Genome-Wide Association Study of Polymorphisms Predisposing to Bronchiolitis

Bronchiolitis is a major cause of hospitalization among infants. Severe bronchiolitis is associated with later asthma, suggesting a common genetic predisposition. Genetic background of bronchiolitis is not well characterized. To identify polymorphisms associated with bronchiolitis, we conducted a genome-wide association study (GWAS) in which 5,300,000 single nucleotide polymorphisms (SNPs) were tested for association in a Finnish–Swedish population of 217 children hospitalized for bronchiolitis and 778 controls. The most promising SNPs (n = 77) were genotyped in a Dutch replication population of 416 cases and 432 controls. Finally, we used a set of 202 Finnish bronchiolitis cases to further investigate candidate SNPs. We did not detect genome-wide significant associations, but several suggestive association signals (p < 10−5) were observed in the GWAS. In the replication population, three SNPs were nominally associated (p < 0.05). Of them, rs269094 was an expression quantitative trait locus (eQTL) for KCND3, previously shown to be associated with occupational asthma. In the additional set of Finnish cases, the association for another SNP (rs9591920) within a noncoding RNA locus was further strengthened. Our results provide a first genome-wide examination of the genetics underlying bronchiolitis. These preliminary findings require further validation in a larger sample size.

severity of viral bronchiolitis. Some evidence of genetic susceptibility was provided by a twin study that showed a higher concordance in hospitalization rates between identical twins than in fraternal twins 9 . That study estimated the heritability of bronchiolitis to be 22%.
Genetic studies of susceptibility to bronchiolitis have mostly focused on variants in selected candidate genes, often related to immunity. A few studies have evaluated the relevance of larger groups of genes and have linked variations in the severity of RSV disease with innate immunity-related genes and the IL13/IL4 locus in the 5q31 cytokine cluster 10,11 . Moreover, genetic studies have demonstrated RSV bronchiolitis-associated loci in genes encoding proteins such as toll-like receptors (TLRs), surfactant protein D (SFTPD), Vitamin D receptor (VDR), and various cytokines, of which some have been suggested in more than one study 3,10,12,13 . However, most genetic associations have not been replicated in subsequent studies.
Severe viral bronchiolitis during childhood, especially after RSV bronchiolitis or rhinovirus bronchiolitis, is associated with the development of asthma later in life 5,14 . It is unknown whether the risk of asthma increases because of the bronchiolitis or whether a common genetic background underlies both diseases 15 . Polymorphisms in genes including IL-10, IL-13, TLR4, VDR, CCR5, and ADAM33 have been associated with both RSV bronchiolitis and a risk of asthma 16,17 . This may support a genetic predisposition to both infant viral bronchiolitis and subsequent asthma. An increased risk of childhood-onset asthma after rhinovirus infection has been associated with 17q21 locus 18 .
Regarding bronchiolitis, genetic associations suggested in different studies need to be validated in further studies. A more comprehensive characterization of genetic determinants will lead to improved recognition of risk groups and will likely contribute to the development of new and more-specific treatments for respiratory infections. Therefore, the aim of the present study was to conduct a hypothesis-free genome-wide association study (GWAS) to identify genetic polymorphisms that predispose infants to viral bronchiolitis.

Results
Suggestive associations in GWAS of bronchiolitis and RSV bronchiolitis. We detected several suggestive association signals (p < 1 × 10 −5 ) in the GWAS performed on 217 infants hospitalized for viral bronchiolitis and 778 controls. (Supplementary Figure S1). The stringent genome-wide significance level (p < 5 × 10 −8 ) was not reached. Table 1 contains the results for the SNPs in the strongest suggestive GWA loci (p < 5 × 10 −6 ). Cases were stratified according to presence of RSV, and RSV-positive bronchiolitis cases (n = 121) were tested against the same GWAS controls ("RSV GWAS"). Again, the signals did not reach the level of genome-wide significance (p < 5 × 10 −8 ) (Supplementary Figure S2). Table 2 lists the strongest suggestive GWA loci (p < 5 × 10 −6 ) in the RSV GWAS. Three of the best loci were shared between the bronchiolitis GWAS and the RSV GWAS. The shared loci were situated in the regions near the genes VSTM4, C10orf71, and DRGX in chromosome 10 and in the region near LOC105375265 and LOC105375266 in chromosome 7. Based on the results obtained from the bronchiolitis GWAS and the RSV GWAS, we selected one representative SNP for the replication phase from each of the best suggestive GWA region. All SNPs selected for replication genotyping and the respective results are presented in Supplementary Table S1.
SNPs associated with RSV bronchiolitis in previous studies. We screened for SNPs that were reported to be associated with RSV bronchiolitis in previous studies and SNPs in the vicinity (within 20 kb) of such variants in our GWAS data. The exact SNPs reported in previous studies did not show associations or were not present in our GWAS data. The results for the variants near the reported bronchiolitis-associated SNPs are presented in Supplementary Table S2 (variants with p < 0.01 in the bronchiolitis GWAS or RSV GWAS are shown). The best SNP was rs56039226 in the CX3CR1 intron (p = 5.7 × 10 −5 ). This SNP is located 11 kb upstream of the nonsynonymous CX3CR1 variant, rs3732378 (previously T280M). The A-allele-containing genotypes of rs3732378 were associated with RSV bronchiolitis severity in a previous study by Amanatidou et al. 19 . The exact SNP was present in our GWAS data, but it did not show an association (p = 0.095). However, the A-allele was also slightly more frequent in the GWAS cases compared to the controls (allele frequencies of 0.16 and 0.14 in cases and controls, respectively).
SNPs associated with asthma in previous studies. When we screened SNPs that have been reported to be associated with asthma in the NHGRI-EBI GWAS Catalog (with p ≤ 10 −6 ), no associations were seen in our data. The best SNP was rs12436663 (p = 5.8 × 10 −4 , OR = 1.96 for T allele in RSV GWAS) within MRPP3 intron. The regions (20 kb) around the asthma-associated SNPs showed moderate signals in our data. The results are shown for variants with p ≤ 0.0001 in GWAS, RSV GWAS or GWAS with cases < 1year (Supplementary Table S3). Many of the variants are known eQTLs. The variants showing strongest suggestive signals were in the area of VCAN, HLA-DQA1, and NIN. Those genes have various roles including cell adhesion, cytokine production, antigen binding and GTP binding 20 .
Associations of rs269094, rs9591920, and rs1537091 with bronchiolitis in the replication populations. In the replication phase, we analyzed 77 of the most promising SNPs based on the results of the GWA analysis in an independent population collected in the Netherlands (416 cases and 432 controls). The suggestive associations of three of the SNPs (rs269094, rs9591920, and rs1537091) with bronchiolitis were nominally replicated (p < 0.05), and meta-analysis for these SNPs yielded lower p values compared to the original GWAS ( Table 3). The results were similar for the RSV GWAS and GWAS of infants aged < 1 year. The results for the SNPs with a p value < 0.1 in the Dutch population and ORs in the same direction in the GWAS subsets, and meta-analyses, are shown in Supplementary Table S4. To identify genes for which expression is potentially affected by our candidate SNPs (p value < 0.05 in the Dutch replicate), we queried genotype-tissue expression (GTEx) data 21 . Rs269094 was previously identified as an eQTL for KCND3 in spleen (p = 7.1 × 10 −10 ); the C-allele is associated with higher expression. Another correlated SNP in the same region (rs269101, r 2 = 0.42 in 1000 genomes phase3 Finnish individuals) showed a trend toward association in our GWAS (p = 4.2 × 10 −5 ) and in the replicate (p = 0.054), further supporting the result. By exploring the NHGRI-EBI GWAS Catalog, we also discovered that variants in or near KCND3 have been associated with diisocyanite-induced asthma and the cytokine response to smallpox 22 . The two other candidate SNPs were not identified as known eQTLs.
We further screened the three SNPs (with p < 0.05 in the Dutch population) in the Finnish case data collected in Turku, using the NordicDB population controls as a reference (Supplementary Table S5). Rs9591920 showed an association (p = 0.042, OR = 1.5). The allele frequencies of the two other SNPs, rs269094 and rs1537091, did not differ between cases and controls in this analysis. Each suggestive GWA locus comprises several SNPs in LD (r 2 > 0.5, p < 0.01). Loci were formed with default options of LD-based result clumping (PLINK 1.9). We selected one variant for replication genotyping from each locus. Signals in sporadic imputed variants (not shown) were considered unreliable and not chosen for replication genotyping. Variants with a minor allele frequency of < 0.02 within either cases or controls were not chosen for replication genotyping. Ж Range and location refer to human genome build 37 (GRCh37/hg19) coordinates. **Corresponding locus shown for variants within genes; two nearest loci shown for intergenic SNPs. ‡ OR for A1 allele under additive model in logistic regression, with sex and three principal components as covariates. † Suggestive GWA loci shared between GWAS and RSV GWAS (of loci with p < 5 × 10 −6 ). GWA, genome-wide association; SNP, single nucleotide polymorphism; Chr, chromosome; MAF, minor allele frequency; OR, odds ratio; CI, confidence interval; LD, linkage disequilibrium.

Discussion
In the present study, we identified several suggestive GWA loci associated with viral bronchiolitis. Three of the SNPs selected for replication genotyping showed nominal associations (p < 0.05) also in an external replication population. Of those variants, rs269094 is a known eQTL for KCND3, which has been linked to diverse functions including regulation of neurotransmitter release and heart rate 20 . More recently, the gene has also been associated with diisocyanite-induced occupational asthma and cytokine response in smallpox vaccine recipients 23,24 . At the protein level, KCND3 interacts with another membrane protein, DPP10, of which genetic variants have been associated with conditions including asthma 20, 25 . The association of another SNP with p < 0.05 in the Dutch population, rs9591920, was nominally replicated in the additional Finnish case set. The variant resides within an uncharacterized noncoding RNA (ncRNA) locus (LOC105370220) 20 . Thus, it is difficult to assign potential functions to this region. In general, intergenic or noncoding SNPs have been associated with many complex phenotypes in previous GWASs. The SNPs within ncRNA loci could affect either the expression or function of the RNA molecule, which may further regulate target gene expression 26 . The third variant showing nominal association in the Dutch cohort, rs1537091, is located in an intergenic region, nearest loci being long intergenic non-protein coding RNAs without known function. Further study is required to validate the associations of our candidate variants and study their potential mechanism in bronchiolitis susceptibility.
We detected only modest signals for the regions around SNPs reported by previous studies to be associated with bronchiolitis. This might be due to allele frequency or LD structure differences among study populations, false discoveries by chance due to limited population size or gene-environment interactions. The confounding effect of population stratification is difficult to take into account in candidate gene studies; in the present study, we could account for the slight structure caused by differing genetic origins by exploiting the genome-wide data and principal components and thus reduce the risk of false-positive findings. Further, we could exclude population outliers and close relatives based on information provided by genome-wide sampling. In our data, the best variant near the SNP that was previously reported to be associated with bronchiolitis was in CX3CR1 intron (p = 5.7 × 10 −5 ).  Each suggestive GWA locus comprises several SNPs in LD (r 2 > 0.5, p < 0.01). Loci were formed with default options of LD-based result clumping (PLINK 1.9). We selected one variant for replication genotyping from each locus. Signals in sporadic imputed variants (not shown) were considered unreliable and not chosen for replication genotyping. Ж Range and location refer to human genome build 37 (GRCh37/hg19) base pair positions. **Corresponding locus shown for variants within genes; two nearest genes shown for intergenic SNPs. ‡ OR for A1 allele under additive model in logistic regression, with sex and three principal components as covariates. † Suggestive GWA loci shared between GWAS and RSV GWAS (of loci with p < 5 × 10 −6 ). GWA, genome-wide association; RSV, respiratory syncytial virus; SNP, single nucleotide polymorphism; Chr, chromosome; MAF, minor allele frequency; OR, odds ratio; CI, confidence interval; LD, linkage disequilibrium. Due to suggested shared genetic contribution among bronchiolitis and asthma, variants that have been associated with asthma in previous studies, and regions (20 kb) around the variants, were screened in our data. Moderate signals (p < 0.0001) were detected in variants within genes including VCAN and HLA-DQA1. HLA-DQ region has been associated with different types of asthma in several studies 27 , and an association of VCAN with asthma was discovered more recently 28 . Asthma clusters in families and likely has a strong genetic component, with heritability estimates as high as 60% 29 . Hundreds of genetic polymorphisms have been associated with different asthma phenotypes in past genetic studies. However, asthma is a complex and heterogeneous disease which is likely contributed by a combination of many genes in interaction with environmental factors 30 . Thus, the potential roles of the suggested genes in LRI susceptibility, and an overlap with biological pathways related to asthma, warrant further study.
The present study had several limitations. The sample size was relatively small, and thus especially the stratified GWAS subset analyses might have been underpowered. In the discovery GWAS, the best suggestive association signals, that were almost genome-wide significant, were seen in the area of LOC105375265 on chromosome 7 and in the area of VSTM4 and DRGX on chromosome 10. The signals in these regions were identified both in the GWAS and the RSV GWAS but not in the Dutch replication population. Despite the lack of replication, these signals might well reflect true associations in the Finnish population.
There is no worldwide consensus on treatment and diagnosis of bronchiolitis. The American Academy of Pediatrics defines 24 months as an upper age limit of bronchiolitis 2,31 . In Finland and Sweden, clinicians have begun to consider bronchiolitis to be mainly a disease of infants under 12 months of age 32 . In our GWAS, most of the cases were < 12 months of age at the time of the diagnosis. Moreover, for the most part the results for the GWAS < 1 year were similar to those of the main GWAS. However, the differing clinical definitions of included bronchiolitis cases were one limitation of the current study. Further limitation was the differing proportion of RSV positive samples between the GWAS and the replication populations. In studies to come, selecting bronchiolitis study cases according to more specified criteria will likely help detecting greater effects and susceptibility genes associated with specific phenotypes.
The present GWAS identified several variants that were suggestively associated with bronchiolitis. Although no significant associations were detected, our study provides the first genome-wide examination of bronchiolitis and a large body of data that may benefit future studies on complex genetic determinants and biological pathways that cause variation in bronchiolitis susceptibility. Further replication and functional validation is needed to verify and uncover the potential roles of our candidate SNPs. The relationship between infant viral bronchiolitis and asthma, and a plausible genetic overlap of the two diseases, remain an important area of study.

Materials and Methods
Study design and populations. The study populations comprised a total of 2045 study subjects consisting of 835 bronchiolitis cases and 1210 controls (974 females and 1071 males). The study was approved by the ethics committees of Tampere University Hospital and Kuopio University Hospital, Finland, and the regional ethics committee of Gothenburg, Sweden. The study was carried out in accordance with the approved guidelines of the WMA Declaration of Helsinki. Informed consents were obtained from the study subjects or provided by their parents. The main elements of the study were: (1) a GWAS in a Finnish-Swedish discovery population of 217 cases and 778 controls, (2) an analysis of the most promising variants in a Dutch replication population of 416 cases and 432 controls, and (3) a further examination of the most promising variants in an additional Finnish set of 202 cases. Moreover, GWAS cases were stratified according to age or presence of RSV and the subgroups of RSV positive cases ("RSV GWAS") and those of < 12 months of age were studied separately (Fig. 1).
The inclusion criteria for all the affected children were (1) clinically diagnosed bronchiolitis, (2) an age < 24 months at the time of diagnosis and (3) native Finnish, Swedish or Dutch ethnicity. Bronchiolitis was defined as the presence of LRI and respiratory symptoms typical of bronchiolitis, including wheezing and labored breathing 2 . RSV infection was tested by antigen immunofluorescent or radioimmuno assays, viral isolation, PCR of nasopharyngeal cells or by serum antibody assays.
Overall, RSV was detected in 72% of the cases, and > 80% were under 12 months of age. Characteristics of bronchiolitis cases in GWAS discovery cohorts and validation cohorts are presented in Tables 4 and 5. The   [33][34][35][36] . A subset of the GWAS population was collected in Gothenburg, Sweden (29 cases, 30 controls) 37 . Replication cohorts originated from The Netherlands (n = 848) and Finland (n = 202) 10,38,39 . DNA sample preparation and genotyping. Genomic DNA was obtained from whole blood samples, buccal swabs, or buffy coats. DNA from blood samples collected from the GWAS subjects was extracted with the UltraClean ® Blood DNA Isolation Kit (MO BIO Laboratories, Inc., Carlsbad, CA, USA). DNA from infants in the Dutch replication cohort was isolated from blood samples or buccal swabs with the QIAamp DNA Blood Kit (Qiagen, Hilden, Germany). DNA extraction from buffy coats collected from the Dutch population controls was done by digestion with proteinase K followed by salting out with potassium acetate and chloroform-isoamyl alcohol extraction. Genome-wide genotyping was performed with the Human Genotyping Array Kits (Illumina, San Diego, CA, USA. (Supplementary Table S6). The Technology Centre at the Institute for Molecular Medicine Finland (FIMM), University of Helsinki, did the genotyping. FIMM also performed the targeted SNP genotyping in the replication phase with the Sequenom iPLEX Gold assay (San Diego, CA, USA). NordicDB controls are described in detail in Bilguvar et al. 40 .
Quality control and imputation of the genome-wide data. After quality control (QC), 995 study subjects were available for the analyses: 464 male subjects and 531 female subjects/217 cases and 778 controls. There were 79,563 autosomal SNPs with a MAF of > 0.01. The data was prephased with SHAPEIT2, followed by genotype imputation (i.e., statistical prediction of missing genotypes) with IMPUTE2 41,42 . More detailed description of the QC and imputation can be found in the online Supplementary Material. After imputation, the amount of SNPs (MAF > 0.01) was 5,304,323.

Selection of SNPs for replication analysis.
Seventy-seven SNPs were genotyped in the Dutch replication population. Because the stringent genome-wide significance level (p < 5 × 10 −8 ) was not reached in the discovery GWAS, less-strict criteria were applied for the selection of SNPs. p values of < 5 × 10 −4 and < 1 × 10 −4 for genotyped and imputed SNPs, respectively, were used as threshold values for suggestive associations. For imputed SNPs, linkage disequilibrium (LD) was taken into account by selecting relatively independent (r 2 < 0.5) SNPs from each suggestive GWA locus. Sporadic imputed SNPs were excluded. Provided that a suggestive GWA locus had several variants with similar p values, the best SNP was chosen based on predicted or empirical regulatory functions (including eQTLs, promoter activity, and protein binding sites) by exploring annotation databases, including HaploReg v. 4.1, GTEx, and dbSNP 21,43,44 . We did not choose low-frequency SNPs (MAF < 0.02 in cases or controls) in the main GWAS population to avoid genotyping monomorphic SNPs in the Dutch population. Primary endpoint studied in GWAS was viral bronchiolitis. Cases were further stratified into subsets according to presence of RSV and age. Stratified phenotypes studied were bronchiolitis caused by respiratory syncytial virus ("RSV GWAS") and bronchiolitis in infants < 12 months of age.  We tested the homogeneity of odds ratios among the GWAS subpopulations according to city of origin with the Breslow-Day test and excluded variants with p values of < 0.01.

Statistical analysis. Principal component analysis (PCA)
. PCA with PLINK1.9 was performed on nonimputed LD-pruned GWAS data to delineate components explaining possible remaining population structure. The adequate number of components was evaluated based on genome-wide inflation factor lambda (λ ), eigenvalues, and inspection of covariate significances in logistic regression. Three leading components were incorporated into subsequent association analyses.
SNP association analysis. Allele frequency differences were assessed with logistic regression under an additive model, with gender and three principal components as covariates. Analyses were performed with PLINK 1.9 45 . Lambda values were close to 1, indicating an adequately controlled population structure (λ = 1.004 for the genotyped population, λ = 1.06 for the imputed population). In the RSV GWAS, we studied a subset of cases positive for RSV infection (n = 121) against the GWAS controls (n = 778). Similarly, cases that were < 12 months of age (173 cases, 778 controls) were tested as a group. Replication populations collected in the Netherlands and Turku, Finland, were studied with logistic regression analysis under an additive model, with gender as a covariate. Measures of basic QC, which entailed removing variants with MAF < 0.01, genotyping rate < 0.9, or deviation from HWE (p < 0.001), were performed. The replication case set of Turku was studied against the NordicDB population controls (n = 684). The discovery and the Dutch replication population were studied by meta-analysis under a fixed-effects model to estimate their combined effect. Confidence intervals for odds ratios of meta-analysis were calculated with a meta package in R 46,47 . Finally, SNPs that were reported in the literature to be associated with RSV bronchiolitis or asthma, or variants within 20 kb of such SNPs, were screened in our GWAS data.  Table 5. Characteristics of bronchiolitis cases in replication populations. * Only RSV-positive bronchiolitis cases originally selected 10 . ** Cases were randomly acquired from a population of > 95% of the cases < 12 months of age 10 .