Introduction

A pulmonary function test provides an accurate and objective assessment of lung impairment and yields adequate information for the diagnosis of chronic obstructive pulmonary disease (COPD).1 COPD, a disease characterized by obstructive ventilatory impairment, is one of the most common respiratory diseases and one of the leading causes of mortality worldwide.2, 3, 4 Forced expiratory volume in the first second (FEV1), forced vital capacity (FVC) and the ratio of FEV1 to FVC (FEV1/FVC) are the major spirometric parameters.1 An obstructive ventilatory defect reflects a disruption in the balance between the maximal airflow of the lung and its capacity. It indicates airway narrowing during exhalation and is defined by a reduced FEV1/FVC.5

COPD is a complex disease that is influenced by multiple risk factors, such as old age, infection, presence of asthma and cigarette smoking.6, 7 Although smoking is the main risk factor for COPD, not all smokers develop the disease. Thus, it has been suggested that genetic factors may explain why some individuals are more susceptible than others. Several family studies have affirmed genetic influences on respiratory impairment, but little is known about which particular genes are involved.8, 9, 10

Recently, multiple loci associated with pulmonary function and/or risk of COPD were identified using genome-wide association (GWA) analyses.11, 12, 13, 14, 15, 16, 17, 18, 19 Meta-analyses of GWA studies (GWASs) among four populations of the Cohorts for Heart and Aging Research in Genomic Epidemiology consortium reported that FEV1/FVC is associated with chromosome 4. In particular, the chromosome 4q31 region near HHIP and the 4q22.1 region in FAM13A were related to FEV1/FVC.11, 14, 15, 16, 17, 19 Some case–control studies replicated this finding and further revealed that variants in FAM13A are associated with a higher risk of COPD, especially in smokers.16, 20, 21 However, the biological mechanism by which the interaction of single-nucleotide polymorphisms (SNPs) and environmental factors (for example, smoking status) increases the prevalence of reduced pulmonary function and/or COPD is still unclear.

In this study, we performed a GWAS of pulmonary function in two Korean population-based cohorts as part of the Korean Genome Epidemiology Study (KoGES). We investigated which genetic factors might contribute to pulmonary function as measured by FEV1/FVC and examined how they relate to lung impairment according to smoking behavior.

Materials and methods

Study design and population cohorts

This study analyzed GWAS data from two independent cohorts, Ansan and Ansung, that are included in the KoGES. Both cohorts enrolled Korean men and women aged 39–70 years in 2001–2002 to conduct a prospective investigation. Detailed information on participant recruitment is available elsewhere.22, 23, 24 A total of 5020 participants from Ansan and 5018 participants from Ansung took part in the baseline study from 2001 to 2003. Cohort members underwent a comprehensive health examination and a questionnaire-based interview, and biospecimens for assays were collected by health professionals at each study site. The questionnaire covered demographic characteristics, lifestyle choices and medical history. Pulmonary function tests were performed by a skilled technician using a portable spirometer (Vmax-2130, Sensor Medics, Yorba Linda, CA, USA) according to the standardized protocols of the American Thoracic Society.25 All participants performed pre-bronchodilator spirometry until at least three repeated measurements were completed; a measurement was considered acceptable when the differences between the largest and the next largest FVC and FEV1 values were within 0.15 l. Calibration and quality control of spirometric examinations were also performed regularly based on American Thoracic Society guidelines.25
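The repeatability rule described above can be sketched in a few lines. This is only an illustration: the function name, the input layout (lists of FVC and FEV1 values in litres from one session) and the reading of "within 0.15 l" as "less than or equal to 0.15" are assumptions, not the software used in the study.

```python
def is_repeatable(fvc_l, fev1_l, tolerance_l=0.15, min_trials=3):
    """Check one spirometry session against the repeatability rule.

    fvc_l, fev1_l: FVC and FEV1 values in litres, one entry per manoeuvre
    (hypothetical input layout). The session passes when at least
    `min_trials` manoeuvres exist and, for both FVC and FEV1, the largest
    and next-largest values differ by no more than `tolerance_l`.
    """
    if len(fvc_l) < min_trials or len(fev1_l) < min_trials:
        return False

    def top_two_gap(values):
        # Difference between the largest and the next-largest value.
        largest, second = sorted(values, reverse=True)[:2]
        return largest - second

    return (top_two_gap(fvc_l) <= tolerance_l
            and top_two_gap(fev1_l) <= tolerance_l)


# Example session: the FVC gap is 0.10 l and the FEV1 gap is 0.10 l.
print(is_repeatable([3.80, 3.90, 3.60], [3.00, 3.10, 2.95]))
```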

The Korea Association Resource Project, a multidisciplinary research consortium, began in 2007 to conduct a large-scale GWAS of the Ansan and Ansung cohorts in the KoGES.26 From the 10 038 participants in both cohorts, we obtained genetic information on 10 004 individuals from microarrays, but we excluded data from 2011 individuals. Exclusions based on genotype quality or medical history comprised a genotype call rate of <96% (n=401), heterozygosity <70% (n=11), gender inconsistencies (n=41), average pair-wise identity-by-state values higher than 0.8, a threshold estimated from first-degree relatives among Korean sib-pair samples (n=608), or a medical history of any kind of cancer (n=101). In addition, individuals who did not complete a pulmonary function test or anthropometric measurement (n=214), did not report smoking status (n=301) or reported a diagnosis of asthma (n=334) were omitted. The final GWAS consisted of 4319 individuals from Ansan and 3674 individuals from Ansung (see Supplementary Table S1).
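As a quick consistency check, the exclusion counts reported above can be summed and compared with the final sample size. The dictionary labels are shorthand chosen for this sketch; only the counts come from the text.

```python
# Counts as reported in the text; labels are illustrative shorthand.
genotyped = 10_004

genotype_or_history_exclusions = {
    "low genotype call rate": 401,
    "heterozygosity": 11,
    "gender inconsistency": 41,
    "high identity-by-state": 608,
    "history of cancer": 101,
}
phenotype_exclusions = {
    "no pulmonary function test or anthropometry": 214,
    "smoking status not reported": 301,
    "asthma diagnosis": 334,
}

total_excluded = (sum(genotype_or_history_exclusions.values())
                  + sum(phenotype_exclusions.values()))
final_sample = genotyped - total_excluded

# The two figures should match the reported 2011 exclusions and the
# reported cohort sizes (4319 from Ansan plus 3674 from Ansung).
print(total_excluded, final_sample, 4319 + 3674)
```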

Ethics statement

Written informed consent was obtained from all participants. The study protocol was approved by the ethics committee of the Korean Center for Disease Control and the institutional review boards of the Korea University Ansan Hospital and the Ajou University School of Medicine.

Genotyping and quality control

For each subject, genomic DNA was isolated from venous blood, and 500 ng of the sample was genotyped on the Affymetrix Genome-wide Human SNP array 5.0 (Affymetrix, Santa Clara, CA, USA). Bayesian robust linear modeling using the Mahalanobis distance genotyping algorithm was used for genotype calling of 500 568 SNPs.27 Detailed information on the quality control methods is available in a previous report.26 In Ansan, 312 381 SNP markers were used for the GWAS after excluding those with missing genotype call rates ≥5% (n=45 343), deviations from Hardy–Weinberg equilibrium with statistical significance set at 0.0001 (n=35 410) or a minor allele frequency <5% (n=147 570). With the same criteria, a final set of 313 984 SNP markers was included from Ansung.
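The three marker-level filters can be illustrated for a single biallelic SNP as follows. This is a minimal sketch, not the study's pipeline: the function name and inputs are hypothetical, the missing-rate cut-off is treated as inclusive (an assumption), and a 1-df chi-square goodness-of-fit test is used as a simple stand-in for whichever Hardy–Weinberg test the genotyping software applied (it may have been an exact test).

```python
import math


def snp_passes_qc(n_aa, n_ab, n_bb, n_missing,
                  max_missing=0.05, hwe_alpha=1e-4, min_maf=0.05):
    """Apply missing-rate, Hardy-Weinberg and allele-frequency filters.

    n_aa, n_ab, n_bb: genotype counts among called samples.
    n_missing: number of samples without a genotype call.
    Returns a dict with the computed statistics and a 'keep' flag.
    """
    n_called = n_aa + n_ab + n_bb
    if n_called == 0:
        return {"missing_rate": 1.0, "maf": 0.0, "hwe_p": None, "keep": False}

    missing_rate = n_missing / (n_called + n_missing)

    # Allele frequency of the 'a' allele and the minor allele frequency.
    p = (2 * n_aa + n_ab) / (2 * n_called)
    q = 1.0 - p
    maf = min(p, q)

    # Chi-square goodness-of-fit against Hardy-Weinberg proportions (1 df).
    observed = (n_aa, n_ab, n_bb)
    expected = (n_called * p * p, 2 * n_called * p * q, n_called * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a 1-df chi-square via the complementary error function.
    hwe_p = math.erfc(math.sqrt(chi2 / 2.0))

    keep = (missing_rate < max_missing      # assumed inclusive exclusion at 5%
            and hwe_p >= hwe_alpha
            and maf >= min_maf)
    return {"missing_rate": missing_rate, "maf": maf, "hwe_p": hwe_p, "keep": keep}


# A marker in perfect Hardy-Weinberg proportions with a common minor allele.
print(snp_passes_qc(360, 480, 160, 0)["keep"])
```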

GWA and haplotype analyses

To explore loci associated with pre-bronchodilator pulmonary function test values as a quantitative trait, we used the PLINK program, version 1.07 (Free Software Foundation, Boston, MA, USA) and fitted multivariate linear regression models, including age, gender, site and height as covariates. Additive genetic models were applied, and false discovery rate adjustments were used to correct for multiple testing.
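The text does not state which false discovery rate procedure was applied. As an illustration only, the sketch below assumes the widely used Benjamini-Hochberg step-up adjustment; the function name is hypothetical and the input P-values are made up for the example.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P-values in the input order.

    Each raw P-value is scaled by m / rank (m tests, rank 1 = smallest),
    then a running minimum from the largest rank downwards enforces
    monotonicity of the adjusted values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Made-up P-values for four hypothetical SNPs.
raw = [0.01, 0.04, 0.03, 0.005]
print([round(p, 3) for p in benjamini_hochberg(raw)])
```

A SNP would then be declared significant when its adjusted value falls below the chosen false discovery rate (0.05 in this study).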

We constructed the linkage disequilibrium (LD) blocks using the Haploview program, version 4.2 (Broad Institute of Harvard and MIT, Cambridge, MA, USA). Lewontin’s |D′| and R² between all pairs of biallelic SNP loci were used to examine LD using the default algorithm from Gabriel et al.28 We also constructed LD blocks for HapMap populations (available from the ver.3 HapMap project) using Haploview to cross-compare their LD patterns with those in our population. Haplotypes were inferred for polymorphic SNPs with a minor allele frequency >0.10, with deviation from Hardy–Weinberg equilibrium tested at a significance level of 0.05. Haplotype inference was also performed using the expectation-maximization algorithm implemented in the PLINK program.
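The two pairwise LD measures named above can be computed directly once haplotype counts for a pair of biallelic SNPs are available. The sketch below assumes phased haplotype counts as input, which sidesteps the expectation-maximization step needed for unphased genotypes; the function name and argument layout are hypothetical.

```python
def pairwise_ld(n_AB, n_Ab, n_aB, n_ab):
    """Compute Lewontin's |D'| and r-squared from phased haplotype counts.

    n_AB, n_Ab, n_aB, n_ab: counts of the four haplotypes formed by
    alleles A/a at the first locus and B/b at the second locus.
    """
    n = n_AB + n_Ab + n_aB + n_ab
    p_A = (n_AB + n_Ab) / n
    p_B = (n_AB + n_aB) / n
    if min(p_A, 1 - p_A, p_B, 1 - p_B) == 0:
        raise ValueError("both loci must be polymorphic")

    # Raw disequilibrium coefficient.
    d = n_AB / n - p_A * p_B

    # Largest |D| attainable given the allele frequencies.
    if d >= 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))

    d_prime = abs(d) / d_max
    r_squared = d * d / (p_A * (1 - p_A) * p_B * (1 - p_B))
    return d_prime, r_squared


# Three of four haplotypes observed: |D'| is 1 while r-squared is below 1.
print(pairwise_ld(40, 10, 0, 50))
```

The example shows why both measures are reported: |D′| reaches 1 whenever one of the four haplotypes is absent, whereas the squared correlation also depends on how closely the allele frequencies match.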

Association of haplotypes with FEV1/FVC was analyzed using multivariate linear regression models including the same covariates, and similar analyses were conducted after stratification by smoking status (pack-years of smoking (PY)=0, 0<PY≤15, 15<PY≤30 and 30<PY). Quantile–quantile plots were used to compare the observed P-values for given SNPs with the theoretical distribution.

Results

Population characteristics

The average ages of the Ansan and Ansung cohorts were 48.99±7.82 and 55.60±8.74 years, respectively. In the Ansan cohort, 42.23% (n=1824) of participants were ‘ever-smokers’, compared with 37.75% in the Ansung cohort. Among ‘ever-smokers’, however, the average pack-years of smoking were 21.05±16.05 in the Ansan cohort and 26.53±18.22 in the Ansung cohort. The pre-bronchodilator FEV1, FVC and FEV1/FVC values were significantly different between the two cohorts (Table 1).

Table 1 Demographic characteristics of the Ansan and Ansung cohorts

Genome-wide association results with pulmonary function

To identify loci associated with FEV1/FVC, we conducted a multivariate linear regression analysis. The genomic inflation factor (λ) was estimated at 1.03 for FEV1/FVC in the combined set, suggesting no evidence of systematic bias. A Manhattan plot of the GWAS from the combined set shows that the SNPs associated with FEV1/FVC at the GWA significance level are located on chromosome 4q22.1 in the FAM13A gene (Figure 1). In the quantile–quantile plot in Supplementary Figure S1, the P-values for FEV1/FVC on the −log10 scale follow the diagonal line, except that the dots representing the lowest P-values at the right end of the plot lie above the line. For FEV1/FVC, significant GWA was seen for four SNPs in FAM13A on chromosome 4q22.1 at a false discovery rate-adjusted P<0.05; P-values ranged from 1.76 × 10⁻⁷ to 5.47 × 10⁻⁷ in the combined set (8.38 × 10⁻⁶ to 2.61 × 10⁻⁵ in Ansan and 8.99 × 10⁻⁴ to 2.84 × 10⁻³ in Ansung). The most strongly associated SNP in this region, rs2609264 (P=1.76 × 10⁻⁷ in the combined set), is located in an intron of the FAM13A gene (Table 2 and Figure 2a).
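The text reports λ=1.03 without describing how it was computed. A common definition, assumed here for illustration, divides the median of the observed 1-df chi-square statistics by the median of the null distribution (about 0.455). The sketch recovers each statistic from its two-sided P-value using only the standard library; the function name is hypothetical.

```python
from statistics import NormalDist, median


def genomic_inflation(p_values):
    """Median-based genomic inflation factor from two-sided P-values.

    Each P-value (0 < p <= 1) is converted back to a 1-df chi-square
    statistic as the square of the matching standard normal quantile.
    """
    norm = NormalDist()
    # Lower-tail quantile of p/2; squaring removes the sign.
    chi2_stats = [norm.inv_cdf(p / 2.0) ** 2 for p in p_values]
    # Median of a 1-df chi-square distribution under the null.
    null_median = norm.inv_cdf(0.75) ** 2
    return median(chi2_stats) / null_median


# Evenly spaced P-values mimic a null scan, so the factor is close to 1.
null_like = [(i - 0.5) / 1000 for i in range(1, 1001)]
print(round(genomic_inflation(null_like), 3))
```

Values near 1, such as the 1.03 reported here, indicate that the bulk of the test statistics behave as expected under the null hypothesis.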

Figure 1

Manhattan plot displaying results from the genome-wide association analysis of the pooled set. The pink dots shown in the FAM13A gene on 4q22.1 are associated with FEV1/FVC at a genome-wide significant level for the pooled set. The horizontal line indicates the genome-wide significance threshold P=1.00 × 10⁻⁵. FEV1, forced expiratory volume in the first second; FVC, forced vital capacity. A full color version of this figure is available at the Journal of Human Genetics online.

Table 2 Single-nucleotide polymorphisms (SNPs) in FAM13A most strongly associated with FEV1/FVC with P-value adjusted for false discovery rate <0.05 (in a combined set)

Figure 2

Regional association signals between single-nucleotide polymorphisms (SNPs) and FEV1/FVC and linkage disequilibrium (LD) within FAM13A. (a) Regional association plot between SNPs and FEV1/FVC within FAM13A. The association was derived from multiple linear regression analysis adjusted for age, gender and pack-years of smoking on the basis of the additive model. (b) The LD among SNPs in FAM13A (from 89 868 514 to 90 193 745 bp), based on imputed genome-wide association (GWA) data. Within the block including the top four SNPs, the LD signal is very strong; the squared correlation coefficient between two loci (R²) ranged from 0.83 to 0.97. FEV1, forced expiratory volume in the first second; FVC, forced vital capacity. A full color version of this figure is available at the Journal of Human Genetics online.

For percent predicted FEV1 and percent predicted FVC, GWA was seen for the top 30 SNPs in multiple regions, with P-values up to 6.60 × 10⁻⁵ and 2.80 × 10⁻⁵, respectively (Supplementary Tables S3 and S4). The SNP with the smallest P-value for % predicted FEV1, rs825388 (P=3.85 × 10⁻⁶ in the combined set), is located on chromosome 5q14.3. Four SNPs on chromosome 5q14.3 near GPR98 and two SNPs on chromosome 6p22.1 were significantly associated with % predicted FEV1. In addition, the most strongly associated SNP for % predicted FVC, rs3131847 (P=1.65 × 10⁻⁶ in the combined set), is located on chromosome 6p22.1 near ZFP57. Five SNPs on chromosome 6p22.1 near ZFP57 and two SNPs on chromosome 4 in FAT1 were also significantly associated with % predicted FVC.

SNPs and haplotypes related to FEV1/FVC in the FAM13A locus

Focusing on FAM13A, we examined LD among the SNPs associated with FEV1/FVC in the separate cohorts and in their combined set. Using the additive model, four SNPs, rs2609264, rs1458551, rs2609261 and rs2609260, were found to be in strong LD. The squared correlation coefficient between two loci (R²) ranged from 0.83 to 0.97 in all sets. In addition, several intronic SNPs in FAM13A, such as rs2869966, rs2869967 and rs2045517, were observed to be in high LD (R²=0.84) with our top SNP (rs2609264; Figure 2b). Previous large GWA studies demonstrated that these three FAM13A SNPs are significantly associated with FEV1/FVC.11, 12

For the four SNPs in FAM13A, TCAG (43.60%) and CTGA (52.06%) were the main haplotypes. In haplotype analysis, the CTGA haplotype was significantly associated with lower FEV1/FVC relative to the TCAG haplotype (Table 3). The effect size for CTGA was −0.57% (s.e.=0.11, P=2.10 × 10⁻⁷) in the combined set. Compared with the TCAG haplotype, the CTGA haplotype was associated with lower FEV1/FVC by 0.75% (s.e.=0.17, P=9.83 × 10⁻⁶) in men and by 0.43% (s.e.=0.14, P=1.91 × 10⁻³) in women.

Table 3 Association of haplotypes in FAM13A gene with FEV1/FVC on multiple linear regression analysis

Associations of FAM13A haplotypes with FEV1/FVC by smoking behavior

Table 4 shows that the TCAG and CTGA haplotypes differ significantly in FEV1/FVC in all groups except the mild smoking group (0<PY≤15). Among nonsmokers (PY=0), the CTGA haplotype was associated with lower FEV1/FVC than the TCAG haplotype (P=9.02 × 10⁻⁵); the cross-sectional estimate was −0.51% (s.e.=0.13). Consistently, we found that the CTGA haplotype had a risk effect in heavy smokers (30<PY) (β=−1.40%, s.e.=0.42, P=9.37 × 10⁻⁴).

Table 4 Association of main haplotypes in FAM13A gene with FEV1/FVC by pack-years of smoking on multiple linear regression analysis in combined set

In addition, we found a statistically significant interaction between the CTGA haplotype and heavy smoking. In the combined set, heavy smokers with the CTGA haplotype had a 4.92% lower FEV1/FVC (s.e.=0.29, P=3.10 × 10⁻⁶⁴), a larger reduction than would be expected if the two factors were independent: the effect size for the CTGA haplotype compared with the TCAG haplotype on FEV1/FVC was −0.49% (s.e.=0.14, P=4.34 × 10⁻⁴) in nonsmokers, and the effect size for heavy smoking compared with nonsmoking on FEV1/FVC was −3.62% (s.e.=0.31, P=1.20 × 10⁻³¹) in the TCAG haplotype group (P for interaction=0.028) (Figure 3 and Supplementary Table S2).
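The comparison with independence can be made explicit with the point estimates quoted above. The sketch assumes that independence means additivity of the two separate effects on the FEV1/FVC scale and that the 4.92% figure is measured against nonsmokers with the TCAG haplotype; it reproduces only the arithmetic, not the regression-based interaction test behind the reported P-value.

```python
# Point estimates quoted in the text (percentage points of FEV1/FVC).
beta_ctga_in_nonsmokers = -0.49       # CTGA vs TCAG among nonsmokers
beta_heavy_smoking_in_tcag = -3.62    # heavy smoking vs nonsmoking, TCAG group
observed_joint = -4.92                # heavy smokers with CTGA (assumed vs nonsmoker TCAG)

# Under additivity, the joint effect is the sum of the separate effects.
expected_joint = beta_ctga_in_nonsmokers + beta_heavy_smoking_in_tcag

# Departure from additivity: negative values mean a larger reduction
# than the two separate effects would predict.
excess = observed_joint - expected_joint

print(round(expected_joint, 2), round(excess, 2))
```

On these numbers the additive expectation is −4.11%, so the observed joint reduction exceeds it by about 0.81 percentage points.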

Figure 3

Joint effects of smoking behavior and haplotypes of FAM13A on reduced pulmonary function. Values on bars indicate the reduction in FEV1/FVC. Heavy smokers with the CTGA haplotype had lower FEV1/FVC than nonsmokers with the TCAG haplotype, and this effect depended on the dose of pack-years of smoking (PY) (after stratification by smoking status). The P-value, from a contrast statement in a general linear model, indicates the significance of the difference in haplotype distribution by smoking behavior. A full color version of this figure is available at the Journal of Human Genetics online.

Discussion

In this GWAS of 7993 Korean participants, we identified genome-wide associations of FEV1/FVC with SNPs in FAM13A on 4q22.1: rs2609264, rs1458551, rs2609261 and rs2609260. For those four SNPs, the CTGA and the TCAG haplotypes were confirmed as the main haplotypes. We found that the CTGA haplotype (‘risk haplotype’) was associated with lower FEV1/FVC than the TCAG haplotype (‘reference haplotype’). In addition, an interaction between the CTGA haplotype and heavy smoking was statistically significant.

Cigarettes contain many oxidants, pro-oxidants and free radicals, all of which contribute to disturbed elasticity and extracellular matrix (ECM) of the lung parenchyma.29 Existing literature also suggests that ECM-degrading enzymes, reactive oxygen species and airway inflammation correlate with smoke-related COPD.30, 31 In addition, the Framingham Heart Study reported that cigarette smoking was strongly associated with a more rapid decline in pulmonary function: a 0.2–0.3% decline in FEV1/FVC per year for each pack per day of smoking. That report also showed evidence of heritability in the rate of pulmonary impairment among family members; the heritability of decline in lung function ranged from 0.11 to 0.29 in nonsmokers, but from 0.14 to 0.39 in smokers.8 In summary, that study showed evidence of a genetic contribution to pulmonary function decline, as well as a contribution of cigarette smoking. In our study, the CTGA haplotype was associated with lower FEV1/FVC by 0.51% in nonsmokers and by 1.40% in heavy smokers, as compared with the TCAG haplotype. Our findings identified genetic associations with reduced pulmonary function in both nonsmokers and heavy smokers.

As stated above, current smoking status is the major environmental risk factor for the development of COPD, but some ex-smokers show a continuing decrease in pulmonary function after they stop smoking.32 That is, the degree of structural change in the airways varies among people, and genetic differences may be one of the causes. One study has reported that cigarette smoking not only affects the lung components, but its effects may also be modulated by the genes of exposed individuals.33 We confirmed the joint effect of the FAM13A gene and smoking status. Heavy smokers with the CTGA haplotype had significantly lower FEV1/FVC than would be expected if the CTGA haplotype and heavy smoking were independent. In other words, our finding suggests that the risk of reduced pulmonary function is particularly high in FAM13A CTGA haplotype carriers who are heavy smokers. Although several studies have examined the joint effect of smoking and some candidate genes for COPD, such as interleukin-6, interferon-γ and α1-antitrypsin,34, 35, 36 no test of interaction between smoking and the FAM13A gene has been reported in published GWAS of pulmonary function or COPD. We found a significant gene-by-smoking interaction between FAM13A haplotypes and FEV1/FVC in heavy smokers.

Candidate genes that have been implicated in the pathogenesis of COPD may be involved in proteolysis/antiproteolysis (for example, SERPINA1, MMPs and SERPINA3), oxidants/antioxidants (for example, GST and SOD) and inflammation (for example, TNF-α and TGFβ).37, 38, 39, 40, 41, 42 Although the pathophysiological mechanisms behind the association between the candidate genes and pulmonary function are still unclear, we identified FAM13A as one of the candidate genes for the related outcomes. A previous report suggested that FAM13A (or FAM13A1) may be associated with the ECM cluster that includes SPP1, MEPE, IBSP, DMP1 and DSPP. The ECM cluster genes reside within the same open chromatin domain, and expression of FAM13A and other ECM cluster genes may be coregulated.43 We observed that rs1585616 in the MEPE gene (an ECM gene) is weakly associated with FEV1/FVC, suggesting that potential biological mechanisms may bridge FAM13A and reduced lung function (data not shown). In addition, genes in multiple regions have been reported in large-scale GWASs (for example, DDX1, RARB, HTR4, ADAM19, GPR126, CDC123, LRP1 and THSD4).11, 15, 18, 44, 45 In our study, several SNPs of those candidate genes showed association with FEV1/FVC (Supplementary Table S5).

Herein, we describe a large-scale GWAS in an Asian population that identified susceptibility genes related to decreased pulmonary function and compared their effects with environmental effects. In addition, this study is the first to report a significant interaction between the FAM13A gene and smoking behavior for a pulmonary function trait in Asians. Our study has several limitations, however. First, some recent population-based GWASs reported that pulmonary function and/or COPD were associated with rs2045517, rs2869966, rs2869967, rs6830970 and rs7671167 in FAM13A, but we could not replicate the same results because these SNPs were not contained in our genetic data. Second, the biological function and pathophysiological mechanisms of FAM13A in humans are yet to be clarified. Therefore, fine-mapping within the FAM13A gene is required to confirm the association between statistically significant noncoding SNPs and expression of the FAM13A gene. Furthermore, biological pathway and network analysis using these GWAS results is needed to define a functional role for FAM13A in reduced pulmonary function or COPD.

In conclusion, our study confirmed that the FAM13A locus is related to pulmonary function in Asians. We also suggest that FAM13A variants may modify the adverse effects of cigarette smoking on pulmonary function, especially in heavy smokers. To further clarify the interaction of the FAM13A gene or other genes with environmental exposures in reduced lung function, functional studies may be necessary in the future.