Introduction

Panic disorder (PD) is an anxiety disorder characterized by recurrent and unexpected panic attacks, subsequent worry and phobic avoidance. Lifetime prevalence of PD is 1–3%, and twice as many women as men suffer from the disorder.1 The disorder frequently takes a chronic course, with many remissions and relapses. Genetic epidemiological studies have shown that genetic factors, as well as environmental factors, have important roles in the pathogenesis of PD. Family studies2, 3 indicate that first-degree relatives of probands with PD have an approximately fivefold increased risk of PD, and twin studies4, 5 estimate that genetic factors account for about 40% of the liability to PD. A large number of genetic studies of PD, including linkage and association analyses, have been conducted.6, 7, 8 Suggestive evidence has been reported for linkage in several regions and for association of candidate genes such as monoamine- and neuropeptide-related genes. However, these studies have failed to identify conclusive susceptibility genes for PD.

We previously conducted the first genome-wide association study (GWAS) of PD (200 Japanese cases and 200 controls) and found several suggestively associated genes with PD.9 However, a follow-up study in a larger Japanese sample (558 cases and 566 controls) failed to show any significant association of these genes with PD.10 Erhardt et al.11 conducted a three-stage GWAS of PD in German samples (216 cases and 222 controls in the GWAS stage, and 909 cases and 915 controls in total). They found two single-nucleotide polymorphisms (SNPs) (rs7309727 and rs11060369), located in TMEM132D on 12q24.3, to be associated with PD. Gregersen et al.12 conducted a genome-wide scan using microsatellite markers in the isolated population of the Faroe Islands (13 distantly related PD cases and 43 controls). They found an association of ACCN1 with PD in an extended Faroese sample (31 cases and 162 controls), whereas they could not replicate the association in an outbred Danish sample (243 cases and 645 controls). Kawamura et al.13 recently performed a genome-wide copy number variation association study of PD in 2055 Japanese subjects (535 cases and 1520 controls) and reported that common duplications in 16p11.2 were associated with PD. This region includes several genes such as IGH, HERC2P4, TP53TG3, SLC6A8 and SLC6A10P and small RNAs. However, none of them overlapped with genetic markers associated in earlier candidate gene studies. Due to the lack of power in previous association studies, genes that are truly associated with PD might not be detected.

To overcome the limitations of previous association studies targeting candidate genes with small sample sizes, we conducted a new GWAS of PD in a larger sample and combined our first and second GWAS (GWAS I and II: 718 cases and 1717 controls in the meta-analysis stage). For a follow-up study, 12 suggestively significant SNPs were tested in an independent sample set (329 cases and 861 controls). Considering evidence for a substantial polygenic contribution to psychiatric disorders, we also undertook a polygenic score analysis to test whether large sets of common variants of small effects accounted for risk of PD.

Materials and methods

Subjects

The two samples included in the meta-analysis consisted of unrelated Japanese subjects living in Tokyo, Nagoya and Niigata, all of which are located in the mainland of Japan. Sample I consisted of 200 PD cases and 200 controls described in our previous GWAS.9 In the present study, a new GWAS (II) of PD was conducted using sample II (579 PD cases and 1568 controls), in which part of the sample (545 cases and 414 controls) overlapped with those reported in our previous paper.10 The diagnosis was confirmed according to the Diagnostic and Statistical Manual of Mental Disorders Fourth Edition (DSM-IV) criteria,14 using the Mini International Neuropsychiatric Interview (MINI)15 and reviewing clinical records. Control subjects were interviewed by one of the authors and screened for history of major psychiatric disorders by using either a short questionnaire or the structured interview (MINI).

Replication sample comprised 329 cases and 863 controls, all recruited in Japan through multiple institutions. Cases were diagnosed according to the DSM-IV criteria by at least two experienced psychiatrists on the basis of all available sources of information, including unstructured interviews, clinical observations and medical records. Controls were screened for history of psychiatric disorders including PD based upon self report. After complete description of the study to the subjects, written informed consent was obtained. This study was approved by the ethical committees of relevant institutions (University of Tokyo, Chiba University, Niigata University, Mie University, Oita University Faculty of Medicine and RIKEN). Details of sample recruitment are provided in Supplementary Materials and Methods.

Genotyping and quality control (QC)

Samples I and II were genotyped on the GeneChip Human Mapping 500K Array Set and the Genome-Wide Human SNP Array 6.0 (Affymetrix, Santa Clara, CA, USA), respectively. For replication, genotyping was performed by the Taqman assay with the ABI PRISM 7900HT Sequence Detection System (Applied Biosystems, Foster City, CA, USA). Genotyping for GWAS and replication was conducted at Human SNP Typing Center and Department of Psychiatry at the University of Tokyo, respectively.

For the GWAS stage, stringent QC procedures were applied to individual subjects and SNP data (for example, sample-wise call rate ≥0.95, SNP call rate ≥0.95, Hardy–Weinberg equilibrium P-value ≥0.001 and minor allele frequency ≥0.05). Subject-level QC left 177 cases and 178 controls for sample I, and 541 cases and 1539 controls for sample II (Table 1). The final GWAS consisted of 282 173 SNPs for sample I and 633 001 SNPs for sample II on autosomal chromosomes. In addition, 6043 SNPs for sample I and 23 131 SNPs for sample II on the X chromosome were analyzed in females.

Table 1 Demographic characteristics of samples totaling 1047 cases and 2578 controls

For replication, we investigated independent autosomal SNPs (P<5 × 10−5 in the meta-analysis) genotyped on both of the two genotyping platforms. The QC (sample-wise and SNP call rate ≥0.95 and Hardy–Weinberg equilibrium P-value ≥0.001) left 12 SNPs for 329 cases and 861 controls in the final replication data set. Details of the QC procedure are provided in Supplementary Materials and Methods.
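The SNP-level filters described above can be illustrated with a minimal Python sketch. It assumes that the stated thresholds act as lower bounds, and it substitutes a simple 1-df chi-square test of Hardy–Weinberg equilibrium for whatever test the QC software applied. The function names and arguments are hypothetical and are not taken from the study's pipeline.

```python
import math


def hwe_chi2_pvalue(n_aa, n_ab, n_bb):
    """P-value of a 1-df chi-square test for departure from Hardy-Weinberg equilibrium."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square distribution with 1 df.
    return math.erfc(math.sqrt(chi2 / 2.0))


def snp_passes_qc(n_aa, n_ab, n_bb, n_missing,
                  min_call_rate=0.95, min_hwe_p=0.001, min_maf=0.05):
    """Apply call-rate, minor-allele-frequency and HWE filters to one SNP."""
    called = n_aa + n_ab + n_bb
    if called == 0:
        return False
    call_rate = called / (called + n_missing)
    freq_a = (2 * n_aa + n_ab) / (2 * called)
    maf = min(freq_a, 1.0 - freq_a)
    return (call_rate >= min_call_rate
            and maf >= min_maf
            and hwe_chi2_pvalue(n_aa, n_ab, n_bb) >= min_hwe_p)
```

In practice Hardy–Weinberg filters are often applied to controls only and with an exact test; the chi-square version here is only meant to make the logic of the thresholds concrete.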

Demographic variables of each sample set after QC filtering are summarized in Table 1.

SNP-based association analysis

GWAS in each study was conducted independently using logistic regression under an additive model using PLINK ver. 1.07.16 As it is known that there is a population structure in the Japanese population,17 multidimensional scaling (MDS) analysis was performed using PLINK, and 10 MDS components were added as covariates in each GWAS to control for population stratification. SNPs with P-value <5 × 10−8 were considered to be genome-wide significant in individual GWAS and meta-analysis. A P<0.05 was taken as significant evidence of replication.

A power calculation was performed using the method described by Ohashi et al.18 Power was evaluated for the meta-analysis sample (718 cases and 1717 controls) at a significance level of α=5 × 10−8, assuming a disease prevalence of 0.03. For a power of 80%, the minimum detectable relative risks were 1.75, 1.52 and 1.47 for risk allele frequencies of 0.1, 0.25 and 0.4, respectively.

The quantile–quantile plot was used to evaluate overall significance of the GWA results and the potential impact of population stratification. The inflation factor λ was calculated on the basis of the median χ2. Haploview 4.219 was used to create Manhattan plots of P-values from the GWAS and meta-analysis. SNAP 2.020 was used to plot association results in each gene region in the meta-analysis stage of the GWAS data.
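As an illustration of the median-based inflation factor, the following Python sketch converts two-sided P-values to 1-df χ2 statistics and divides their median by the median of the null χ2 distribution (about 0.455). The function names are hypothetical; the study's own calculation may differ in detail.

```python
from statistics import NormalDist, median

_STD_NORMAL = NormalDist()
# Median of a chi-square distribution with 1 df (approximately 0.455).
_CHI2_1DF_MEDIAN = _STD_NORMAL.inv_cdf(0.75) ** 2


def p_to_chi2(p):
    """Convert a two-sided P-value in (0, 1] to a 1-df chi-square statistic."""
    z = _STD_NORMAL.inv_cdf(p / 2.0)
    return z * z


def genomic_inflation(p_values):
    """Genomic inflation factor: median observed chi-square over its null median."""
    chi2 = [p_to_chi2(p) for p in p_values]
    return median(chi2) / _CHI2_1DF_MEDIAN
```

A value close to 1 indicates that the bulk of the test statistics follows the null distribution, whereas values well above 1 suggest stratification or other systematic inflation.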

Meta-analysis

A meta-analysis of GWAS I and II was conducted using PLINK ver. 1.07. The combined sample comprised 718 cases and 1717 controls. To detect strongly PD-associated SNPs that were not genotyped on the platforms, imputation was performed with MACH ver. 1.0 software,21 using the HapMap CHB+JPT phase II data as the reference sample. Association analyses in each GWAS were conducted by logistic regression with 10 MDS components as covariates using ProbABEL.22 The locations of the SNPs reported are taken from the NCBI build 36 (UCSC hg18). Poorly imputed SNPs (r2<0.30) and SNPs with low minor allele frequency (<0.01) were excluded, resulting in a final meta-analysis data set of 1.9M SNPs. For each SNP, we combined the odds ratios (ORs) for a given reference allele weighted by the confidence intervals using a fixed-effects or random-effects model. To assess the homogeneity of ORs, Cochran’s Q statistic was used.23 If Cochran’s Q statistic showed P-value <0.01, a random-effects model (DerSimonian and Laird method) was employed.24 If Cochran’s Q statistic showed P-value ≥0.01, a fixed-effects model (Mantel–Haenszel method) was employed.25 We also combined GWAS and replication data using the same approaches (fixed-effects or random-effects model).
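The model-selection rule described above can be sketched in Python for the two-study case. This sketch uses inverse-variance weighting of log ORs for the fixed-effects estimate, which is a substitution for the Mantel–Haenszel method named in the text. The function name, arguments and returned keys are hypothetical.

```python
import math


def combine_two_studies(log_or1, se1, log_or2, se2, q_alpha=0.01):
    """Pool two log odds ratios, choosing the model from Cochran's Q."""
    w1, w2 = 1.0 / se1 ** 2, 1.0 / se2 ** 2
    fixed = (w1 * log_or1 + w2 * log_or2) / (w1 + w2)
    # Cochran's Q: weighted squared deviations from the fixed-effects estimate.
    q = w1 * (log_or1 - fixed) ** 2 + w2 * (log_or2 - fixed) ** 2
    # With two studies Q has 1 df; erfc gives the chi-square survival function.
    q_p = math.erfc(math.sqrt(q / 2.0))
    if q_p >= q_alpha:
        model = "fixed"
        pooled, se = fixed, math.sqrt(1.0 / (w1 + w2))
    else:
        model = "random"
        # DerSimonian-Laird estimate of the between-study variance.
        c = (w1 + w2) - (w1 ** 2 + w2 ** 2) / (w1 + w2)
        tau2 = max(0.0, (q - 1.0) / c)
        r1, r2 = 1.0 / (se1 ** 2 + tau2), 1.0 / (se2 ** 2 + tau2)
        pooled = (r1 * log_or1 + r2 * log_or2) / (r1 + r2)
        se = math.sqrt(1.0 / (r1 + r2))
    z = pooled / se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal P-value
    return {"model": model, "or": math.exp(pooled), "se": se,
            "p": p_value, "q": q, "q_p": q_p}
```

For example, `combine_two_studies(math.log(1.3), 0.1, math.log(1.3), 0.1)` selects the fixed-effects model because the two estimates are identical and Q is zero.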

Gene ontology (GO) term enrichment analysis

For enrichment analyses, we used the public domain tool provided by the Database for Annotation, Visualization and Integrated Discovery (DAVID) bioinformatics platform.26 We examined GO because it is used widely in functional annotation and enrichment analysis. As DAVID accepts only a gene list, markers were converted to representative genes using the ProxyGeneLD software27 (see details in Supplementary Materials and Methods). We used 2.0M and 1.9M imputed markers as input for the individual GWAS and the meta-analysis, respectively. After conversion, the top 1% of genes (172 genes in each GWAS and 166 in the meta-analysis) were tested by enrichment analysis using DAVID with the total genes (17 180 in GWAS I, 17 173 in GWAS II and 16 645 in the meta-analysis) as the base set. Considering the redundant nature of annotations, similar annotations were grouped together using ‘Functional Annotation Clustering’ (kappa value >0.5). We selected the most significantly enriched term of each group. The enrichment P-value was calculated by comparing the number of genes in the list that hit a given biology class with the number expected by pure random chance. To control the family-wise false-positive rate in the result list, enrichment P-values must be corrected for the multiple functional annotation categories tested at the same time. To correct for this multiple testing, we calculated the false discovery rate, which is the expected proportion of false discoveries among the discoveries (%).28 A false discovery rate of less than 25% was taken as significant. We attempted to replicate the significant GO terms found in one GWAS using the other GWAS, and also analyzed enriched GO terms based on the meta-analysis of the two GWAS data sets.
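The idea of an enrichment P-value can be made concrete with a plain upper-tail hypergeometric test, sketched below in Python. This is an assumption for illustration only: DAVID reports a modified, more conservative variant of this test, and the function and argument names here are hypothetical.

```python
from math import comb


def enrichment_p(list_hits, list_size, background_hits, background_size):
    """Probability of drawing at least `list_hits` annotated genes by chance.

    list_hits:        genes in the tested list annotated with the term
    list_size:        number of genes in the tested list
    background_hits:  genes in the base set annotated with the term
    background_size:  number of genes in the base set
    """
    total = comb(background_size, list_size)
    upper = min(list_size, background_hits)
    tail = sum(
        comb(background_hits, k)
        * comb(background_size - background_hits, list_size - k)
        for k in range(list_hits, upper + 1)
    )
    return tail / total
```

With the sizes used in this study, a call would take the form `enrichment_p(hits, 172, term_size, 17180)`, where `hits` and `term_size` stand for term-specific counts that are not given in the text.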

Association analysis of prior candidate genes

To find supportive evidence for associations of previously implicated candidate genes, we additionally tested for the association with a set of 356 prior candidate genes summarized by Maron et al.8 We used ProxyGeneLD software to convert SNPs from the meta-analysis of two GWAS data to specific genes as described above. After conversion, 38 341 SNPs were assigned to 327 candidate genes.

Polygenic score analysis

Polygenic scores were calculated and tested for their effect following the method described by International Schizophrenia Consortium.29 Five sets of LD-pruned SNPs (r2 threshold at 0.25 and window size 200 SNPs) were selected from the discovery data. This selection of SNPs was based on their nominal P-value (significance threshold PT<0.1, 0.2, 0.3, 0.4 and 0.5). Polygenic score was calculated by weighting scores of risk allele count by the logOR observed in the discovery data. We calculated individual scores for each set of SNPs using PLINK ver. 1.07. Nagelkerke’s pseudo R2 was calculated by logistic regression analysis using R software (http://www.r-project.org), with the number of non-missing SNPs and 10 MDS components as covariates in the target sample.
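The scoring step and the pseudo R2 can be sketched in Python as follows. The sketch assumes one list of risk-allele counts per individual, with `None` marking a missing genotype, and it computes Nagelkerke’s R2 from log-likelihoods that are assumed to come from separately fitted logistic regression models. All function names are hypothetical; the PLINK and R procedures used in the study may differ in detail, for example in how scores are averaged over non-missing SNPs.

```python
import math


def polygenic_score(risk_allele_counts, odds_ratios):
    """Return (score, n_non_missing) for one individual.

    risk_allele_counts: 0, 1, 2 or None (missing) for each selected SNP
    odds_ratios:        discovery-sample OR of the counted allele for each SNP
    """
    score = 0.0
    n_non_missing = 0
    for count, odds_ratio in zip(risk_allele_counts, odds_ratios):
        if count is None:
            continue  # missing genotypes do not contribute to the score
        score += count * math.log(odds_ratio)
        n_non_missing += 1
    return score, n_non_missing


def nagelkerke_r2(ll_intercept_only, ll_model, n):
    """Nagelkerke's pseudo R2 from two log-likelihoods and the sample size."""
    cox_snell = 1.0 - math.exp(2.0 * (ll_intercept_only - ll_model) / n)
    max_cox_snell = 1.0 - math.exp(2.0 * ll_intercept_only / n)
    return cox_snell / max_cox_snell
```

The count of non-missing SNPs is returned alongside the score because the text uses it as a covariate in the target-sample regression.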

Data sharing

Portions of our GWAS results for PD are publicly available in the Genome-Wide Association Database (GWAS DB; http://gwas.lifesciencedb.jp/).30

Results

GWAS results

In this study, we conducted a new GWAS with a larger sample (sample II) than that in our previous report (sample I). The quantile–quantile and Manhattan plots of GWAS I and II are shown in Supplementary Figures S1 and S2, respectively. The genomic inflation factors (λ) of GWAS I and II were 1.06 and 1.05, respectively, suggesting no substantial effects of population structure. The Manhattan plots showed several genomic regions as potential risk loci, although none was genome-wide significant. The results for genotyped SNPs with P-value <10−4 in GWAS I and II are provided in Supplementary Tables S1 and S2, respectively.

Meta-analysis of the GWAS data

The quantile–quantile and Manhattan plots for the meta-analysis of the two GWAS are shown in Figure 1. The genomic inflation factor was 1.04. Among the 1.9M imputed SNPs in the meta-analysis, none reached genome-wide significance. Four SNPs showed association at the level of P<10−5 (Supplementary Table S3). Three of the top four SNPs (rs10144552, rs944805 and rs4129976) were genotyped on both of the platforms. The strongest association was observed at rs10144552, located 36 kb upstream of BDKRB2 on 14q32.2 (P=4.4 × 10−6) (Figure 2). SNPs showing P-value <5 × 10−5 in the meta-analysis are listed in Supplementary Table S3. The tests of heterogeneity of effects between the samples were non-significant (P≥0.01), showing that estimates were similar between the two samples and justifying the fixed-effects model in the meta-analysis (data not shown).

Figure 1

Results of the meta-analysis of the two genome-wide association studies (GWAS I and II). (a) Quantile–quantile plots of the association results. Observed association results (−log10 P) are plotted against the expected distribution under the null hypothesis of no association. (b) Manhattan plots present the P-values across the genome. The association results (−log10 P) are plotted in chromosomal order. Single-nucleotide polymorphisms in X chromosome were analyzed only in females.

Figure 2

Plots of association results (−log10 P) in BDKRB2 region in the meta-analysis of the genome-wide association studies (GWAS). Chromosome position is plotted according to its physical position with the reference to the NCBI build 36. Recombination rate as estimated from the JPT and CHB HapMap data is plotted in light blue. Large red diamond (genotyped on both the platforms): single-nucleotide polymorphism (SNP) with strongest evidence for association (rs10144552). Small diamond and square represent imputed SNPs and SNPs genotyped on both the GWAS platforms, respectively. Strengths of linkage disequilibrium (LD) (r2) with SNP rs10144552 in the plots are shown (darker red indicates stronger LD).

Follow-up study

We investigated the associations of 12 SNPs selected from the top meta-analysis findings in an independent Japanese sample (329 cases and 861 controls). Of the 12 SNPs, 9 showed the same direction of effect in the meta-analysis and replication samples, a number that could be expected by chance (P=0.64). None of these SNPs was significant (Table 2). When these SNPs were analyzed by sex, no significant association was identified (Supplementary Tables S4 and S5).

Table 2 Top findings based on the meta-analysis of two GWAS data sets and their follow-up study

In the combined analysis across the two GWAS and replication samples, no genome-wide significant association was found (Table 2). The ORs ranged between 1.15 and 1.49. Rs10144552 near BDKRB2 showed the lowest P-value (P=1.3 × 10−5, OR=1.31). The second strongest association was found at rs2911968 in MCPH1 on 8p23.1 (P=2.6 × 10−5, OR=1.30). The third strongest association was found at rs4129976, located in CNTN4 on 3p26.3 (P=5.0 × 10−5, OR=1.29).

GO term enrichment analysis

To further explore the GWAS data, we took a GO-based approach, which provides complementary information to single-marker analysis. We found that five terms in GWAS I and four terms in GWAS II were associated with PD at a nominal P-value <0.05 (Supplementary Tables S6 and S7). Among these nominally enriched terms, three in GWAS I (phosphoinositide binding, axon guidance and transcription factor activity) and one in GWAS II (negative regulation of secretion) had a false discovery rate <25%. However, the terms significantly enriched in one GWAS were not replicated in the other GWAS. When the analysis was conducted based on the meta-analysis of the two GWAS data sets, none of the GO-enriched terms had a false discovery rate <25% (Supplementary Table S8).

Candidate gene association analysis

We tested the 327 prior candidate genes using a gene list from the results of the meta-analysis of the GWAS data. Although none of these genes achieved significance after experiment-wise correction for multiple testing (P<1.53 × 10−4=0.05/327), gene-wise evaluation produced 20 genes yielding significance at the level of P<0.05 (Supplementary Table S9). Among them, the strongest evidence for association was found at rs12501691, located in NPY5R on 4q31.3–q32 (gene-wise corrected P=6.4 × 10−4). In addition to the prior candidate gene list, we examined the associations of TMEM132D11 and ACCN112 with PD in our results. Neither rs7309727 nor rs11060369 in TMEM132D was significant (nominal P=0.38 and 0.55, respectively). The most significantly associated SNP in TMEM132D in our results (rs1397504) did not reach gene-wise significance (gene-wise corrected P=0.37). On the other hand, although the most strongly associated SNP within ACCN1 was non-significant (rs880305, gene-wise corrected P=0.13), rs280039, located 350 kb upstream of ACCN1, showed a suggestive association with PD (nominal P=4.0 × 10−5). The regional association plots of TMEM132D and ACCN1 are presented in Supplementary Figures S3 and S4. Furthermore, when we investigated the 16p11.2 region, where common duplications were associated with PD,13 no significant SNP was found.

Polygenic score analysis

Figure 3 and Supplementary Table S10 present the P-values and Nagelkerke’s R2 statistics for the logistic regression analysis based on the two Japanese samples. Significant associations were found between case–control status in the target sample (sample I) and polygenic scores based on the sets of SNPs with PT<0.3 and <0.4 in the discovery sample (sample II) (P=0.031 and 0.036, R2=3.2% and 3.4%, respectively; Figure 3). When sample I was used as the discovery sample, the polygenic scores based on the sets of SNPs with PT<0.3, 0.4 and 0.5 also showed significant associations with PD status in the target sample (sample II) (P=0.040, 0.014 and 0.020, respectively). However, the pseudo R2 was lower than that obtained with the larger discovery sample (sample II) (maximum R2 <0.5%).

Figure 3

Pseudo R2 explained by the PD polygenic scores in the genome-wide association study samples. The x axis represents the sets of SNPs defined by P-value threshold. The y axis represents Nagelkerke’s R2. The P-value for each set of single-nucleotide polymorphisms represents the significance of the association between PD status in the target sample and polygenic scores based on the discovery sample. (a) Sample I (target) predicted by sample II (discovery). (b) Sample II (target) predicted by sample I (discovery).

Discussion

To date, most candidate gene association studies and GWAS have been underpowered to detect the small effect sizes of susceptibility loci for PD. In this study, we performed a new GWAS and a meta-analysis of our two GWAS data sets, followed by a replication study in Japanese subjects (1047 PD cases and 2578 controls in total). To our knowledge, the sample size of the present study was the largest among GWAS of PD. However, the meta-analysis of the GWAS data produced no genome-wide significant findings, and none of the top SNPs selected from the meta-analysis was associated with PD in the replication sample. This may have been attributable to chance findings in the meta-analysis and inadequate replication sample size. Recent large-scale GWAS of schizophrenia31 and bipolar disorder32 suggest that the effect sizes of common risk alleles are small (OR<1.2). The power of our meta-analysis sample (718 cases and 1717 controls) to detect an allele with a frequency of 0.25 and an OR of 1.2 at α=5 × 10−8 was 0.3%. Therefore, it was very unlikely that any locus would be detected at genome-wide significance with this power. Moreover, the power of the replication sample (329 cases and 861 controls) to detect such an allele at α=0.05 was 45%. Besides the possibility that the meta-analysis findings were false positives, the lack of replication may be due to the limited power of the replication sample.

Despite the limited power of the samples, the regions showing the strongest associations might have some relevance to PD. Among these loci, the most strongly associated SNP was rs10144552, upstream of BDKRB2 on 14q32 (all combined P=1.3 × 10−5). The direction of the effect was the same across the three (two GWAS and replication) samples. Gratacòs et al.33 reported that the SNP rs945032, located in the promoter region of BDKRB2, was associated with PD, substance abuse and bipolar disorder. Our imputation results did not include this SNP. As rs945032 is in the middle of a recombination hotspot, even the nearest imputed SNPs in the present study showed low LD with this SNP (r2<0.01). BDKRB2 encodes a bradykinin receptor that activates second messengers regulating blood pressure, pain perception and neuronal differentiation.33 Several studies have suggested that mortality due to cardiovascular diseases is high in PD patients.34 These reports highlight BDKRB2 as a promising candidate for PD that is worthy of additional follow-up.

In the analysis of the prior PD candidate genes, the strongest evidence for association was found at rs12501691 in NPY5R on 4q31.3–q32 (gene-wise corrected P=6.4 × 10−4). NPY5R was reported to be associated with PD, especially in females.35 In the present study, sex-specific analyses showed that rs12501691 in NPY5R was associated with PD only in females (female, P=2.7 × 10−3; male, P=0.085). Neuropeptide Y (NPY) is a particularly plausible candidate for modulating effects of environmental stress exposure on susceptibility to anxiety disorders, as seen in animal models.36 Changes in NPY expression due to genetic variations in the NPY-related genes may therefore affect stress response and emotion,37 and may contribute to anxiety disorders such as PD. We also investigated the PD candidate genes TMEM132D11 and ACCN112 reported in previous GWAS. Erhardt et al.11 reported that two SNPs, rs7309727 and rs11060369, in TMEM132D were strongly associated with PD in the German population. The risk allele of rs11060369 was associated with higher TMEM132D mRNA expression levels in the frontal cortex. In our study, however, neither of the SNPs showed significant association with PD. Gregersen et al.12 conducted a genome-wide scan using the isolated population of the Faroe Islands, and suggested ACCN1 as a candidate gene for PD. Our meta-analysis of the GWAS provided supportive evidence for association of rs280039, upstream of ACCN1, with PD in the Japanese population (P=4.0 × 10−5), whereas no association of SNPs within the gene was found at the gene-wise level of significance. The inconsistency in the results between Caucasian and Japanese samples may be due to clinical and genetic heterogeneity.

The lack of replication in GWAS- and GO-based enrichment analyses across samples is not surprising if the effect of each locus is too small to surpass the threshold specified for significance. Therefore, we performed the polygenic score analysis using our GWAS data. The power of the discovery samples (sample II, 541 cases and 1539 controls; sample I, 177 cases and 178 controls) to detect an allele with a frequency of 0.25 and an OR of 1.1 at α=0.4 was 65% and 39%, respectively. Despite the low power in the discovery sample, the sets of SNPs with PT<0.3 and 0.4 based on the discovery sample (also PT<0.5 when sample I was the discovery sample) were significantly associated with PD status in the target sample. The largest variance was explained by the set of SNPs at the threshold of PT<0.4 in the discovery sample (sample II, P=0.036 and R2=3.4%; sample I, P=0.014 and R2=0.4%). These results are compatible with previous studies in which around 1–2% of the variance for anxiety and depression could be explained by the polygenic score approach.38 Our findings suggest that ‘true’ risk variants of PD may be included in the sets defined by the more liberal thresholds.

The present study has limitations. First, although controls were screened to exclude a history of major psychiatric disorders, some subjects in the control group were still at risk of developing PD. This may have decreased sensitivity. However, given the relatively low prevalence of PD (<2–3%), inclusion of controls with potential PD was unlikely to have had a major effect on statistical power. Second, as our study was designed to identify common variants conferring risk for PD, we did not assess the impact of rare variants. Recently, multiple rare variants have been reported in genes shown to harbor common variants associated with common diseases.39, 40 Therefore, resequencing genomic regions suggested by GWAS might be a powerful approach to identify rare coding (that is, missense or nonsense) variants associated with PD. Third, we did not investigate gene–gene and gene–environment interactions. It has been recognized that anxiety disorders including PD are the results of multiple, complex interactions between genes and environmental influences.41 Further research is needed on interactions between the candidate genes suggested in the present study and environmental factors such as childhood experience.

In conclusion, the present study suggests that BDKRB2 and several other genes are worthy of follow-up as candidates for PD. Our findings also suggest that a large number of common alleles of small effect collectively account for risk of PD. GWAS with multiple large samples and their meta/mega-analyses are warranted to confirm our findings.