A genome-wide association study of quantitative computed tomographic emphysema in Korean populations

Emphysema is an important feature of chronic obstructive pulmonary disease (COPD). Genetic factors likely affect emphysema pathogenesis, but this question has predominantly been studied in individuals of European ancestry. In this study, we sought to determine genetic components of emphysema severity and characterize the potential function of the associated loci in a Korean population. We performed a genome-wide association study (GWAS) on quantitative emphysema in subjects with or without COPD from two Korean COPD cohorts. We investigated the functional consequences of the loci using epigenetic annotation and gene expression data. We also compared our GWAS results with an epigenome-wide association study (EWAS) and a previous differential gene expression analysis. In total, 548 subjects (476 [86.9%] male), including 514 COPD patients, were evaluated. We identified one genome-wide significant SNP (P < 5.0 × 10⁻⁸), rs117084279, near PIBF1. We identified an additional 57 SNPs (P < 5.0 × 10⁻⁶) associated with emphysema in all subjects, and 106 SNPs (P < 5.0 × 10⁻⁶) in COPD patients. Of these candidate SNPs, 2 (rs12459249, rs11667314) near CYP2A6 were expression quantitative trait loci in lung tissue and one SNP (rs11214944) near NNMT was an expression quantitative trait locus in whole blood. Of note, rs11214944 was in linkage disequilibrium with variants in enhancer histone marks in lung tissue. Several genes near additional SNPs were identified in our previous EWAS at a nominal level of significance. We identified a novel SNP associated with quantitative emphysema on CT. Together with this novel SNP, several candidate SNPs in our study may provide clues to the genetic etiology of emphysema in Asian populations. Further research and validation of the loci will help determine the genetic factors for the development of emphysema.

www.nature.com/scientificreports/
Smoking is a major risk factor for emphysema, but emphysema is also identified in non-smokers, and the severity of emphysema among smokers varies greatly [4][5][6]. Emphysema has a genetic component with significant heritability, and a previous study estimated its heritability to be approximately 25% 7,8. Previous reports have described genetic determinants of emphysema and airway phenotypes in smokers with or without COPD 9, emphysema in the general population 10, distinct local histogram emphysema patterns 11, and emphysema distribution 12.
However, most emphysema genome-wide association studies (GWAS) were performed either exclusively or predominantly in European ancestry individuals. The Korean Obstructive Lung Disease (KOLD) and COPD in dusty areas (CODA) cohorts were constructed in South Korea collecting CT imaging and blood samples that enabled assessments of genetic associations with CT features. We investigated genetic determinants of emphysema severity in Korean cohorts and sought to relate these to gene expression and DNA methylation.

Methods
Study sample. Blood samples of 1056 subjects from the KOLD and CODA cohorts were genotyped. KOLD is a prospective cohort that recruited COPD patients from 16 university hospitals in South Korea. CODA is a prospective cohort of subjects with airflow limitation and healthy volunteers living in dusty areas near cement plants in the Kangwon and Chungbuk provinces of South Korea. Details of the cohorts were described in previous papers 13,14. Written informed consent was given by each participant. This study received ethical approval from the Kangwon National University Hospital IRB (KNUH 2012-06-007) and the Asan Medical Center IRB (Approval No. 2005-0345). This study conformed to the tenets of the Declaration of Helsinki.
Computed tomographic measurements. In the KOLD cohort, CT measurements were obtained using a 16-channel multidetector row CT scanner (SOMATOM Sensation; Siemens Medical Systems, Erlangen, Germany). In the CODA cohort, CT measurements were obtained using a dual-source CT scanner (SOMATOM Definition; Siemens Healthcare, Forchheim, Germany). In both studies, all subjects were scanned at full inspiration in a supine position. Emphysema was calculated as the percent of lung area at or below the −950 HU threshold and was log-transformed.
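The emphysema measure described above can be illustrated with a minimal sketch. It assumes the lung has already been segmented and that attenuation values are available as a flat list of Hounsfield units; the function names, the toy voxel values, and the handling of a zero index are illustrative assumptions, not the cohorts' actual imaging pipeline.

```python
import math

def emphysema_index(hu_values, threshold=-950.0):
    """Percent of lung voxels with attenuation at or below the threshold (HU)."""
    if not hu_values:
        raise ValueError("no lung voxels supplied")
    low = sum(1 for hu in hu_values if hu <= threshold)
    return 100.0 * low / len(hu_values)

def log_emphysema(index_percent):
    """Natural log of the emphysema index.

    Assumption: an index of exactly 0 is rejected here; the source does not
    say how zero values were handled before log transformation.
    """
    if index_percent <= 0:
        raise ValueError("log transform needs a positive index")
    return math.log(index_percent)

# Toy, made-up attenuation values for eight lung voxels.
voxels = [-980, -955, -950, -900, -870, -820, -990, -700]
ei = emphysema_index(voxels)      # four of eight voxels are <= -950 HU
log_ei = log_emphysema(ei)
```

The log-transformed value is the quantity that would enter the association model as the outcome.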
Genotyping and quality control. We genotyped all subjects on the Axiom KoreanChip 1.0 platform 15. We applied quality control filters for low variant call rate, excessive heterozygosity, singletons, sex discrepancy, Hardy-Weinberg equilibrium P < 0.001, and minor allele frequency < 0.01 using Affymetrix Power Tools and PLINK. After quality control, 586,966 SNPs with a minor allele frequency of 1% or more remained (Fig. 1). Genotype imputation was performed on the Michigan Imputation Server using the HRC r1.1 reference panel. After imputation, SNPs with low imputation quality (R² < 0.8) or minor allele frequency < 0.01 were excluded.
Genome-wide association analysis. We performed linear regression on the natural log-transformed emphysema index, adjusted for age, sex, smoking status (never smoker, ex-smoker, or current smoker), pack-years of smoking, and study center, using PLINK (version 1.9) 16. We also performed the analyses in COPD patients only. We defined genome-wide significance as P < 5 × 10⁻⁸ and defined 'candidate' markers of interest at P < 5 × 10⁻⁶. Local association plots were generated for a window of approximately 600 kb in either direction of the lead SNPs using LocusZoom with the Asian reference (hg19/1000 Genomes Nov 2014 ASN) 17. Recombination rates were obtained from HapMap Phase II data 18.
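The association model can be sketched as an ordinary least-squares fit of the log-transformed emphysema index on allele dosage plus covariates. The sketch below is a toy illustration, not PLINK's implementation: it uses a single covariate (age) for brevity, the data are invented so that the true coefficients are known, and the helper names are hypothetical. Only the two significance thresholds come from the text.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular design matrix")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        rest = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - rest) / m[i][i]
    return x

def fit_ols(design, y):
    """OLS coefficients via the normal equations (design includes the intercept)."""
    p = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(p)]
    return solve_linear_system(xtx, xty)

def classify_p(p_value):
    """Apply the significance thresholds stated in the Methods."""
    if p_value < 5e-8:
        return "genome-wide"
    if p_value < 5e-6:
        return "candidate"
    return "not significant"

# Invented data: outcome built as 1.0 + 0.5 * dosage + 0.02 * age, no noise.
dosage = [0, 1, 2, 0, 1, 2, 1, 0]
age = [60, 65, 70, 72, 58, 66, 63, 75]
log_ei = [1.0 + 0.5 * d + 0.02 * a for d, a in zip(dosage, age)]
design = [[1.0, float(d), float(a)] for d, a in zip(dosage, age)]
intercept, beta_snp, beta_age = fit_ols(design, log_ei)
```

In a real analysis the SNP coefficient would be accompanied by a standard error and P value, and the full covariate set (sex, smoking status, pack-years, study center) would be added as further columns of the design matrix.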
Epigenetic marks and DNase I hypersensitivity regions in top SNPs. Potential functional information on SNPs in high linkage disequilibrium (LD) (r² > 0.8, 1000 Genomes Phase 1, Asian population as reference) with the top associated SNPs was obtained using the HaploReg database (version 4.1) and the single-tissue expression quantitative trait loci (eQTL) data from the Genotype-Tissue Expression (GTEx) consortium (lung and whole blood) 19,20. The P-value threshold for significant eQTL was set at 5 × 10⁻⁴ in the GTEx database.
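To make the LD criterion concrete, the following sketch computes r² between two SNPs as the squared Pearson correlation of genotype dosages and compares it with the 0.8 cutoff. This is an assumption-laden illustration: HaploReg reports precomputed LD from 1000 Genomes reference data, whereas the genotype-correlation estimator, the function name, and the toy genotypes here are hypothetical.

```python
def genotype_r2(g1, g2):
    """Squared Pearson correlation between two genotype dosage vectors (0/1/2)."""
    n = len(g1)
    if n != len(g2) or n < 2:
        raise ValueError("need two equal-length vectors with at least 2 samples")
    m1 = sum(g1) / n
    m2 = sum(g2) / n
    cov = sum((a - m1) * (b - m2) for a, b in zip(g1, g2))
    v1 = sum((a - m1) ** 2 for a in g1)
    v2 = sum((b - m2) ** 2 for b in g2)
    if v1 == 0 or v2 == 0:
        raise ValueError("monomorphic SNP: r2 is undefined")
    return (cov * cov) / (v1 * v2)

LD_THRESHOLD = 0.8  # cutoff stated in the text

# Invented genotypes for six individuals.
lead_snp = [0, 0, 1, 1, 2, 2]
proxy_snp = [0, 1, 0, 1, 2, 2]
r2 = genotype_r2(lead_snp, proxy_snp)
in_high_ld = r2 > LD_THRESHOLD
```

With these toy genotypes the pair falls below the cutoff, so the second SNP would not be carried forward as a proxy for annotation.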
Epigenome-wide association study (EWAS) related to emphysema. We performed EWAS of emphysema in blood DNA from 100 CODA subjects using the Infinium HumanMethylation450 platform. EWAS methods have been previously described in a study of emphysema index 21 .
Comparison with previously published results. We performed look-ups of top SNPs in previously reported quantitative emphysema GWAS results that combined four study populations (COPDGene, ECLIPSE, National Emphysema Treatment Trial/Normative Aging Study, and GenKOLS) 9. We also investigated whether the nearest genes of top SNPs were reported in a previous differentially expressed gene (DEG) analysis related to emphysema 22.

Results
Baseline characteristics of study population. Baseline characteristics of study subjects are shown in Table 1. The mean age of all subjects was 71 years and 83.9% of all subjects were ever-smokers. A total of 548 subjects had quantitative CT emphysema data: 514 COPD patients, defined by a post-bronchodilator FEV1/FVC of less than 0.7, and 34 subjects with normal spirometry (Fig. 1).

GWAS in all subjects and COPD patients only. The Q-Q plot did not show systematic inflation in GWAS test statistics (lambda 0.99 in all subjects and 0.99 in COPD patients; Supplementary Fig. S1). rs117084279 (MAF, 0.021), near the PIBF1 gene, was genome-wide significant in all subjects (Table 2, Fig. 2). Fifty-seven SNPs in 19 loci in all subjects and 106 SNPs in 16 loci in COPD patients reached the pre-defined suggestive significance level (P < 5.0 × 10⁻⁶) and formed the candidate SNPs (Table 2). Among the candidate SNPs, 24 overlapped between the two groups. The top 20 SNPs of each group are shown in Tables 2 and 3.
eQTL results. We sought to determine whether candidate SNPs are located in regions that affect the regulation of gene expression in lung and whole blood using the GTEx database. Three candidate SNPs in all subjects and 8 candidate SNPs in COPD patients were identified as eQTL in lung or whole blood in GTEx (Tables 4 and 5).
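The lambda value reported with the Q-Q plot in the GWAS results is a genomic inflation factor. The source does not state how it was computed; the sketch below assumes the conventional median-based definition, in which the median of the observed 1-df chi-square statistics is divided by the median of the null chi-square distribution (about 0.4549). The function name and the toy statistics are hypothetical.

```python
import statistics

# Median of the chi-square distribution with 1 degree of freedom.
NULL_MEDIAN_CHI2_1DF = 0.4549364

def genomic_inflation(chi2_stats):
    """Median-based genomic inflation factor from 1-df chi-square statistics."""
    if not chi2_stats:
        raise ValueError("no test statistics supplied")
    return statistics.median(chi2_stats) / NULL_MEDIAN_CHI2_1DF

# Invented statistics whose median equals the null median.
toy_stats = [0.1, 0.4549364, 2.0]
lam = genomic_inflation(toy_stats)
```

A value close to 1, such as the 0.99 reported here, is usually read as little evidence of systematic inflation from population stratification or other confounding.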
Comparing enhancer regions with GWAS results. To further explore the functional role of candidate SNPs, we examined our candidate SNPs and SNPs in LD with our candidate SNPs using the HaploReg database.

Look-ups in DEG and EWAS data.
In a comparison with DEGs identified in a preceding emphysema RNA-seq analysis, MACROD2, which was upregulated in emphysema, overlapped with a region identified in the GWAS in all subjects. Similarly, DOCK1, LARGE and ERAL1 overlapped with loci containing SNPs identified in COPD patients (Table 6). In addition, several annotated genes of our candidate SNPs were also identified in our EWAS of emphysema at a nominal level of significance (Supplementary Table S2). We plotted the results of the GWAS and functional studies together in Fig. 3.

Look-ups in COPDGene data.
None of our SNPs were replicated in the COPDGene emphysema results (Supplementary Table S3).

Discussion
In this study, we identified one genome-wide significant SNP near a novel candidate gene (PIBF1) and candidate SNPs (P < 5.0 × 10⁻⁶) for quantitative emphysema on CT in all subjects and in COPD patients from Korean cohorts. We further found that candidate SNPs (at a more liberal threshold of significance) were often eQTL in lung tissue, or in linkage disequilibrium with variants in promoter or enhancer histone marks or DNase I hypersensitivity sites in lung tissue or whole blood. Moreover, several SNPs were located near genes reported in a preceding emphysema RNA-seq analysis or in our preceding EWAS.
In all subjects, we identified 2 candidate SNPs near CYP2A6 that were also eQTL in lung tissue. CYP2A6 has been associated with COPD and emphysema, and also with smoking habits 23,24. Genetic polymorphisms of CYP2A6 result in altered activity of the CYP2A6 protein, affecting nicotine metabolism and smoking behavior 9,25,26. This effect was also evaluated in the Asian population 27,28. However, a relationship between this region and COPD and emphysema has not yet been described. Smoking is an important risk factor for the development of emphysema; therefore, it is possible that this variant contributes to the development of emphysema through smoking. Further research is needed to elucidate the causal relationship.
Table 6. Differentially expressed genes in previous COPD transcriptome analysis.
One of our candidate SNPs (rs11214944) near NNMT was identified as an eQTL and was in LD with SNPs lying in enhancer histone marks in lung tissue. NNMT has also been identified as a gene differentially expressed according to the severity of COPD 29,30. Moreover, NNMT expression has been reported to differ by more than sixfold between moderate and mild emphysema 31. Intriguingly, in a previous study, NNMT was one of the differentially expressed genes in IL-6 signaling related to airway inflammation and remodeling 32. Although that study focused on airway inflammation rather than emphysema, IL-6 and its signaling play a major role in emphysema pathogenesis, so an association between NNMT and emphysema through IL-6 might be expected 33.
Although its MAF was very low, we identified one SNP, rs117084279, near PIBF1, that was genome-wide significant. Interestingly, PIBF1 has been identified as a component of centriolar satellite proteins and has an essential role in primary cilia formation and ciliary protein recruitment 34,35. In a whole-genome siRNA-based functional genomics screen, mutation in PIBF1 was found to induce hereditary ciliopathy 36. It is well known that cigarette smoking causes structural and functional abnormalities of cilia in bronchial epithelial cells [37][38][39][40]. Cigarette smoke has also been suggested to affect genes associated with altered ciliary growth 41. The relationship between ciliary function in airway epithelial cells and emphysema has not been well elucidated, but our novel SNP could provide a potential link between them.
Likewise, KLF12 and KCNJ3 are genes near our candidate SNPs that were also identified in our EWAS results. Of these, KCNJ3 was reported to be associated with lung function and airway obstruction in previous studies, though there are no data on emphysema 24,42. There are insufficient data to establish whether KLF12 is related to emphysema in the lung. However, previous studies suggested that the KLF12 gene, also known as the AP-2rep gene, functions as a transcriptional repressor of the AP-2α gene through a set of overlapping cis-regulatory promoter elements and a reciprocal regulation of both genes 43,44. AP-2α is known to be involved in ras oncogene-mediated transformation and myc-mediated programmed apoptotic cell death 45,46. Of interest, one study indicated that AP-2α protein was increased in the lungs of a cigarette smoke exposure-induced COPD rat model and that this was associated with increased cell apoptosis 47. Several mechanisms likely contribute to the pathogenesis of emphysema [48][49][50][51][52], and apoptosis is one of them. Both animal COPD models and human lung models suggest that apoptosis might be involved in the development of emphysema [53][54][55]. Although it remains unclear whether there is a direct relationship between AP-2α and apoptosis of cells in the lung, the results of our study offer new insight into the development and progression of emphysema, and further experimental study is warranted.
In addition, we identified SNPs at a suggestive level of significance near MACROD2, which is a DEG identified in a preceding emphysema RNA-seq analysis. MACROD2 was associated with COPD and lung function in previous studies, but data on a direct association with emphysema are lacking [56][57][58]. Although the reference study identified DEGs according to the presence of emphysema rather than a quantitative value, integration with our GWAS results helps to identify meaningful genes among numerous genes 22.
Although none of our SNPs were replicated in the look-up of COPDGene results, differential expression of genes and DNA methylation at sites in high linkage disequilibrium with the SNPs could be suggested as potential functional mechanisms.
The inability to replicate our SNPs in a previous study is one of our limitations. This might be owing to the small sample size, another limitation of our study, which can increase the false positive rate and decrease the statistical power. Even though a genome-wide significant SNP was identified, in view of both the relatively low MAF of the SNP and the small sample size of our study, further replication studies with larger populations are needed. Ethnic differences could also contribute to the lack of replication, and further Asian studies on emphysema are needed. Third, we were not able to perform a meta-analysis with other GWASs on emphysema or COPD in Asian populations due to the lack of data. Comparing the results of such a meta-analysis with GWASs in different ethnicities would help elucidate ethnic specificity. Fourth, the functional and biological impacts of the SNPs on emphysema were not identified in our study. Functional and integrated analyses may lead to a better understanding of the pathophysiology of emphysema in Asian populations. Also, we could not determine the causal effects on emphysema of the SNPs identified in our study. Further studies or meta-analyses including our study are needed to explore the causal effects of the SNPs using Mendelian randomization 59-63. Finally, we were interested in exploring whether genetic causes could be a determining factor in emphysema regardless of smoking. Therefore, we focused on the results of the population that includes more never-smokers. However, owing to the small number of non-COPD subjects included in the total population, the characteristics of the total and COPD populations may not be significantly different.

Conclusions
In a genome-wide association study of emphysema in Korean COPD cohorts, we identified a new genome-wide significant association and several associations at suggestive significance. Ours is the first GWAS of quantitative emphysema in a Korean population. Further analyses, including replication in other independent cohorts and functional studies, would yield insights into the development of emphysema. In particular, this work may be a starting point for investigating the aspects of the pathobiology of emphysema that are shared or unique across differing ancestries.