Genetic factors underlying trait neuroticism, reflecting a tendency towards negative affective states, may overlap with genetic susceptibility for anxiety disorders and help explain the extensive comorbidity amongst internalizing disorders. Genome-wide linkage (GWL) data from several studies of neuroticism and anxiety disorders have been published, providing an opportunity to test such hypotheses and identify genomic regions that harbor genes common to these phenotypes. In all, 11 independent GWL studies of either neuroticism (n=8) or anxiety disorders (n=3) were collected, comprising 5341 families with 15 529 individuals. The rank-based genome scan meta-analysis (GSMA) approach was used to analyze each trait separately and combined, and global correlations between results were examined. False discovery rate (FDR) analysis was performed to test for enrichment of significant effects. Using 10 cM intervals, bins nominally significant for both GSMA statistics, PSR and POR, were found on chromosomes 9, 11, 12, and 14 for neuroticism and on chromosomes 1, 5, 15, and 16 for anxiety disorders. Genome-wide, the results for the two phenotypes were significantly correlated, and a combined analysis identified additional nominally significant bins. Although none reached genome-wide significance, an excess of significant PSR P-values was observed, with 12 bins falling under an FDR threshold of 0.50. As demonstrated by our identification of multiple, consistent signals across the genome, meta-analytically combining existing GWL data is a valuable approach to narrowing down regions relevant for anxiety-related phenotypes. This may prove useful for prioritizing emerging genome-wide association data for anxiety disorders.
Anxiety disorders (ANX), such as generalized anxiety disorder, panic disorder (PD), and phobias, are common, disabling conditions with significant lifetime prevalence.1 They tend to persist throughout life and show substantial comorbidity with each other and other internalizing disorders such as major depressive disorder (MDD). Twin and family studies implicate genetic factors in their etiology, with moderate levels of familial aggregation and heritability.2
Compared with disorders such as schizophrenia and bipolar disorder, relatively few molecular genetic studies of ANX exist. A recent review showed there are fewer than ten published linkage scans among all ANX (excluding obsessive compulsive disorders (OCD)), most focusing on PD.3 Similarly, most existing genetic association studies of ANX are limited to PD, and few specific genes have been reliably identified.4
Neuroticism (NEU), a personality trait reflecting a tendency towards states of negative affect, has long been associated with anxiety and depressive disorders and their comorbidity.5, 6, 7 NEU has a heritability of ∼40%,8 and twin studies support a common genetic factor underlying NEU and susceptibility to anxiety and depressive symptoms and disorders.9, 10, 11 As such, identifying genes that influence variation in NEU may be an entry point for understanding the molecular genetic basis of these psychiatric conditions.
Seven genome-wide linkage (GWL) studies of NEU are in the literature representing eight independent samples. To date, no studies have attempted to combine the large amount of linkage information available for NEU. In addition, there is some overlap between regions identified by GWL studies for ANX and NEU.12
The aims of this study are to synthesize information across linkage studies of NEU and ANX using meta-analytic methods to (1) confirm and refine the linked regions observed in those studies, (2) identify novel linked regions that may not have achieved significance in any individual study and, (3) compare the findings across these two phenotypes to identify regions containing common susceptibility loci.
Materials and methods
Data sources, samples, and measures
To identify all potential primary linkage studies of NEU or non-OCD ANX, we searched MEDLINE as well as the reference sections of identified studies and reviews. Studies included were required to be genome-wide and performed in Caucasian subjects. We attempted to contact the investigators of each study, requesting that they provide their linkage analysis results in the form of marker ID and corresponding P-values or linkage statistic.
For NEU, we identified eight independent samples from seven published studies (Table 1). One publication analyzed data from two different samples: the Australian Twin Registry (ATR) and the Netherlands Twin Registry (NTR).13 All studies used adult subjects except for the Brisbane Adolescent Twin Study (BATS).14 Sampling designs differed considerably across the studies including unselected epidemiological twin samples (ATR, NTR, BATS), extreme discordant and concordant sibling pairs (UKSP15 and GENESiS16), and sibling pairs selected for other phenotypes (IASPSAD, NZ-ND). One of these, the Depression Network (DeNt) sample, reported linkage scans for MDD using an initial collection17 and severe MDD for an expanded collection of families.18 For the current analysis, the DeNt researchers provided unpublished linkage results for NEU. We excluded data using another anxiety-related trait assessed in the NTR19 from the current analysis because that sample substantially overlapped that of the NTR NEU scan.
As can be seen from Table 1, all of these studies used some version of Eysenck’s NEU, either from the full Eysenck Personality Questionnaire (EPQ) or the shortened version (SF). The Amsterdamse Biografische Vragenlijst is the Dutch equivalent of the EPQ. Most transformed the total NEU score before analysis to minimize its deviation from normality. The original linkage analysis published by Neale et al20 from the New Zealand nicotine dependence (NZ-ND) sample used raw scores. For consistency with the other studies, we re-analyzed NZ-ND using angular transformed scores.
We identified four independent linkage studies that primarily analyzed non-OCD ANX families ascertained via PD probands. For the current study, we used the largest and most recent PD data set from the Columbia group, taken from a nonparametric, multipoint analysis of their ‘Intermediate’ panic phenotype.21 The Yale group has published four analyses of various ANX phenotypes. We used the results from Kaabi et al,22 which analyzed linkage to a multivariate ‘fuzzy clustering’ phenotype that combined information from PD and social, specific, and agoraphobias. Finally, we included data from the primary PD linkage analysis published by the Iowa group.23 Characteristics of these three samples are also listed in Table 1. We were unable to obtain data from one other primary ANX linkage scan.24
Genome scan meta-analysis
Meta-analyses of primary studies can (1) identify consistency of evidence across studies and (2) increase power to detect novel linked regions not reaching significance in any individual scan. We used the genome scan meta-analysis (GSMA) approach25 to combine GWL results across studies within phenotype. GSMA is a nonparametric rank-ordering method that can combine results from GWL methods across studies, which used different markers and applied different statistical tests. In simulation studies, GSMA detected linkage with power comparable to or greater than that obtained by performing a combined linkage analysis of all data.26 Further details on GSMA and interpretation of results are provided in the Supplementary Material.
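The rank-ordering idea can be illustrated with a minimal sketch. It assumes each study has already been reduced to one P-value per bin, uses an unweighted sum of ranks, and estimates a pointwise summed-rank probability (PSR) by random draws of ranks. The function names (`rank_bins`, `summed_ranks`, `psr`) are hypothetical and are not taken from any GSMA software; the POR statistic and study weighting are omitted.

```python
import random


def rank_bins(pvalues):
    """Rank bins within one study: the smallest P-value gets the highest rank.

    Tied P-values share the average of the ranks they span.
    """
    n = len(pvalues)
    # Largest P-value first, so it receives rank 1.
    order = sorted(range(n), key=lambda i: pvalues[i], reverse=True)
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and pvalues[order[j + 1]] == pvalues[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


def summed_ranks(studies):
    """Sum the within-study ranks of each bin across studies (unweighted)."""
    per_study = [rank_bins(p) for p in studies]
    return [sum(column) for column in zip(*per_study)]


def psr(studies, n_perm=2000, seed=0):
    """Estimate, per bin, the chance of a summed rank at least as large as observed.

    Assumption: under the null, a bin's rank in each study is a random draw
    from that study's ranks, so one null summed rank is one draw per study.
    """
    rng = random.Random(seed)
    per_study = [rank_bins(p) for p in studies]
    observed = [sum(column) for column in zip(*per_study)]
    null = [sum(rng.choice(r) for r in per_study) for _ in range(n_perm)]
    return [
        (sum(1 for value in null if value >= obs) + 1) / (n_perm + 1)
        for obs in observed
    ]
```

Because only ranks enter the calculation, studies that used different markers or test statistics can be combined once each has been mapped to the same set of bins.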
Linkage statistic transformation and maximum P-value adjustment
Several preliminary manipulations were applied to the primary data before meta-analysis. Where only linkage statistics (LOD or NPL scores) were provided, they were converted to P-values. Nonparametric linkage software packages commonly report the minimum LOD as zero, with a corresponding P-value of 1 or 0.5. Neither approximation is correct, and either can bias a meta-analysis. To correct this bias, maximum NPL-based P-values were set to 0.72, which is a more appropriate estimate.27
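As an illustration only, the sketch below converts a LOD score to a P-value and applies the 0.72 adjustment. The conversion formula is an assumption: the text does not state which one was used, so the example takes the common one-sided approximation in which 2 ln(10) × LOD is treated as a chi-square statistic with one degree of freedom. The function name `lod_to_p` is hypothetical.

```python
import math

MAX_P = 0.72  # adjusted maximum P-value quoted in the text


def lod_to_p(lod):
    """Convert a LOD score to an approximate one-sided P-value.

    Assumption: 2 * ln(10) * LOD is treated as chi-square with 1 degree of
    freedom and the tail probability is halved. Scores at the floor
    (LOD <= 0), which software would report as P = 1 or 0.5, get MAX_P.
    """
    if lod <= 0:
        return MAX_P
    chi2 = 2 * math.log(10) * lod
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    return 0.5 * math.erfc(math.sqrt(chi2 / 2))
```

Under this approximation a LOD of 3 corresponds to a P-value of roughly 0.0001, the conventional correspondence for linkage analysis.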
Fine scale (1 Mb) physical integration
The genetic markers from each study were independently mapped to the human genome reference sequence, allowing all the results to be combined onto a common physical map. The physical positions reported here are based on NCBI Build 36.1.28, 29 For each study, the significance of each 1 megabase (Mb) physical bin across the genome was calculated. Most bins did not contain a genotyped marker or reported linkage result. The significance of intervening bins without genotyped markers was estimated using the nearest flanking bins with genetic markers and assuming a linear slope.
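The interpolation step can be sketched as follows. This is a minimal example under stated assumptions: bins are indexed by integer megabase, the interpolated quantity is whatever per-bin value is passed in (the text does not specify the scale), and bins beyond the outermost markers simply take the nearest marker's value. The name `interpolate_bins` is hypothetical.

```python
def interpolate_bins(marker_bins, n_bins):
    """Fill 1 Mb bins lacking a marker by linear interpolation.

    marker_bins maps a bin index (Mb) to the value observed there.
    Bins between two marker bins lie on the straight line joining them;
    bins outside the outermost markers copy the nearest marker value
    (an assumption made for this sketch).
    """
    if not marker_bins:
        raise ValueError("at least one bin with a marker is required")
    known = sorted(marker_bins)
    filled = []
    for b in range(n_bins):
        if b in marker_bins:
            filled.append(marker_bins[b])
            continue
        left = max((k for k in known if k < b), default=None)
        right = min((k for k in known if k > b), default=None)
        if left is None:
            filled.append(marker_bins[right])
        elif right is None:
            filled.append(marker_bins[left])
        else:
            fraction = (b - left) / (right - left)
            delta = marker_bins[right] - marker_bins[left]
            filled.append(marker_bins[left] + fraction * delta)
    return filled
```

For example, with markers in bins 0 and 4 carrying values 1.0 and 3.0, the three bins between them receive 1.5, 2.0, and 2.5.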
Common genetic map
The Rutgers Combined Linkage-Physical Map30 was used to calculate the genetic position of each 1 Mb physical bin. Earlier applications of GSMA have used 30 cM bins. Because most of our data sets provide data for intermarker distances of 10 cM or less, we calculated maximum log P-values primarily for 10 cM bins. We repeated this for a range of bin widths, allowing us to examine the effects of bin width and placement on our results. Data manipulations were performed using R.31
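A minimal sketch of the binning step, written in Python for illustration (the authors report using R). It assumes each 1 Mb bin has already been assigned a genetic position in cM and a P-value, and it summarizes each cM bin by its largest −log10 P, that is, its most significant result. The function name and input layout are hypothetical.

```python
import math
from collections import defaultdict


def max_neglog_p_by_cm_bin(points, bin_width_cm=10.0):
    """Summarize one chromosome into fixed-width genetic bins.

    points is an iterable of (position_cm, p_value) pairs. Each bin is
    keyed by its starting position in cM and holds the largest -log10(P)
    among the points that fall inside it.
    """
    best = defaultdict(float)
    for position_cm, p_value in points:
        start = int(position_cm // bin_width_cm) * bin_width_cm
        best[start] = max(best[start], -math.log10(p_value))
    return dict(best)
```

Calling the function again with `bin_width_cm` set to 5, 15, or 20 gives the alternative bin definitions, which is one way to check that a signal does not depend on where bin boundaries happen to fall.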
False discovery rate analysis
To assess enrichment of significant bins not satisfying genome-wide significance, a false discovery rate (FDR) approach was used.32 In addition to estimating the number of bins below various FDR thresholds, we calculated a ‘q-value’ for each GSMA bin using a conservative assumption (P0=1). Further details on FDR are provided in the Supplementary Material.
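With the proportion of true nulls fixed at 1, a q-value calculation reduces to the Benjamini-Hochberg adjusted P-value. The sketch below shows that calculation as an illustration; it is an assumption that this matches the exact procedure of the cited method, and the function name `bh_qvalues` is hypothetical.

```python
def bh_qvalues(pvalues):
    """Return one q-value per P-value, assuming all bins could be null (P0 = 1).

    Each sorted P-value p_(k) is scaled by m / k, and a running minimum
    taken from the largest P-value downwards keeps q monotone in p.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, pvalues[index] * m / rank)
        qvalues[index] = running_min
    return qvalues
```

Counting the bins with a q-value at or below a chosen threshold, such as 0.50, gives the number of bins falling under that FDR.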
GSMA was performed using a range of binning criteria including 20, 15, 10, and 5 cM bins, resulting in 191, 246, 368, and 722 bins, respectively. Smaller bins are potentially more useful in weighted FDR analyses of genome-wide association data and candidate gene testing in linked regions. Broad consistency was observed for the results across binning strategies. Results for the 10 cM analysis are presented here; complete results using all bin definitions are available in Supplementary Tables S1–S4.
Because of the variety of sample sizes and study designs, sensitivity analysis was performed to compare the sample size-weighted versus unweighted approaches. The results of the unweighted sensitivity analysis showed that no single study was overly influential. The range of Spearman rank correlations (rsr) between the full and drop-one-out GSMA was narrow across the studies dropped, with mean correlations of 0.93 and 0.84 for the NEU and ANX analyses, respectively. In contrast, the weighted GSMA sensitivity analysis for NEU showed the ATR study disproportionately contributed to the results with a rsr of 0.76 while the minimum rsr for the remaining studies was 0.94. This was not unexpected, as ATR is the largest study with 6522 subjects. For ANX, the weighted results were sensitive to dropping the Columbia PD study (rsr=0.55), whereas dropping the other two studies was not very influential (rsr=0.93 and 0.92), similarly attributable to the Columbia sample being largest with 120 pedigrees and 517 affecteds. Further details are contained in Supplementary Tables S5 and S6. Because of increased sensitivity of the weighted results, only the unweighted results are presented here. Weighted results are available in Supplementary Tables S1–S4.
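The drop-one-out comparison can be sketched as below. This is a simplified illustration, not the authors' pipeline: it assumes each study contributes one P-value per bin, uses the unweighted summed rank as the per-bin GSMA summary, and correlates the full result with each reduced result. All function names are hypothetical.

```python
import math


def average_ranks(values):
    """Ascending ranks starting at 1, with ties sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return 0.0  # correlation is undefined for constant input
    return cov / math.sqrt(var_x * var_y)


def summed_rank(study_pvalues):
    """Unweighted summed rank per bin; smaller P-values earn higher ranks."""
    per_study = [average_ranks([-p for p in pv]) for pv in study_pvalues]
    return [sum(column) for column in zip(*per_study)]


def drop_one_out(studies):
    """Correlate the full summed ranks with those obtained without each study.

    studies maps a study name to its list of per-bin P-values.
    """
    full = summed_rank(list(studies.values()))
    result = {}
    for name in studies:
        reduced = summed_rank(
            [pv for other, pv in studies.items() if other != name]
        )
        result[name] = spearman(full, reduced)
    return result
```

A study whose removal yields a markedly lower correlation than the others is the one driving the combined result, which is the pattern described for the weighted analyses.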
Unweighted GSMA of NEU linkage scans resulted in 25 of 368 bins (10 cM) reaching nominal significance (P<0.05) using the PSR test. Some of these bins were adjacent to each other and represented 11 separate regions on chromosomes (chr) 1, 6, 10, 11, 12, and 14. Details are displayed in Table 2 and Supplementary Figure S1. Six of 25 bins were also significant (<0.05) using the second GSMA statistic, POR, which is used to further evaluate significance of a bin. Although simulations have shown that truly linked bins do not necessarily have significant POR results, bins significant using both statistics are more reliable.
Although each bin was the same genetic size, the physical sizes of the significant bins were variable and ranged from 2.4 Mb on chr 17 to 54.9 Mb for consecutive adjacent significant bins on chr 11. The two top results for NEU by PSR alone were in nearby bins on chr 1 at 50–60 and 70–80 cM, but neither had a significant POR. Bins significant using both PSR and POR were observed on chr 9, 11, 12, and 14 including three contiguous bins on chr 12 (90, 100, 110 cM).
Unweighted analysis of ANX resulted in 24 bins reaching nominal significance (PSR<0.05). As with the NEU results, multiple significant bins were adjacent to each other, which collapsed into nine separate regions. Six bins were also significant using the second GSMA statistic, POR. However, the most significant (<0.01) bins by PSR were not significant for POR. Adjacent significant bins were observed on chrs 1, 2, 5, 11, 15, 16, and 22. Notably, there were five contiguous significant bins on chr 15, which covered 34.9 Mb. Detailed results are shown in Table 3 and Supplementary Figure S2.
Correlation across phenotypes
We observed that two bins had nominally significant PSR for both NEU and ANX (chr 1, bins 60 and 70, 31.3–53.3 Mb). As twin studies have shown that common genetic factors influence both NEU and ANX, we sought to test if GSMA results were correlated across the genome. Spearman correlations were calculated using all bins (unfiltered) and again using bins with PSR<0.5 (filtered). The filtered test was performed, as most of the GSMA results are not significant and could mask a true correlation between scans. Significant correlations between NEU and ANX were observed using multiple binning and P-value thresholds, with the filtered set of bins showing higher correlation (r∼0.2). At a P-value threshold of 0.5 for bin sizes of 5, 10, 15, and 20 cM, we found Spearman rank correlations of 0.165, 0.256, 0.183, and 0.182, respectively. To determine empirical significance, we randomly permuted the per bin size results and recalculated correlations 10 000 times using the same approach. These distributions were used to calculate empirical P-values. At the P-value threshold of 0.5 for bin sizes of 5, 10, 15, and 20 cM, the empirical P-values are 0.0008, 0.0002, 0.023, and 0.042, respectively.
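The permutation procedure can be sketched as follows. Several details are assumptions, because the text does not spell them out: the filter keeps a bin when either phenotype has a PSR below the threshold, the second phenotype's bins are shuffled across the genome, the filter is re-applied after each shuffle, and the empirical P-value is one-sided. The function names are hypothetical.

```python
import math
import random


def _ranks(values):
    """Ascending ranks starting at 1, with ties sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation; returns 0.0 when it is undefined."""
    if len(x) < 2:
        return 0.0
    rx, ry = _ranks(x), _ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def filtered_spearman(x, y, threshold):
    """Correlate only bins where at least one series is below the threshold."""
    kept = [(a, b) for a, b in zip(x, y) if a < threshold or b < threshold]
    if len(kept) < 2:
        return 0.0
    xs, ys = zip(*kept)
    return spearman(list(xs), list(ys))


def permutation_p(x, y, threshold=0.5, n_perm=1000, seed=0):
    """Return (observed correlation, one-sided empirical P-value).

    y is shuffled across bins n_perm times; the empirical P-value is the
    share of shuffles whose correlation is at least the observed one.
    """
    rng = random.Random(seed)
    observed = filtered_spearman(x, y, threshold)
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if filtered_spearman(x, shuffled, threshold) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)
```

Setting `threshold` above 1 keeps every bin and so reproduces the unfiltered comparison; `n_perm=10000` matches the number of permutations reported in the text.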
Because of the significant correlation in results between NEU and ANX, a combined GSMA was performed post hoc. The unweighted 10 cM bin combined analysis results showed 19 bins nominally significant for both GSMA statistics, and an additional 8 bins where both the PSR and an adjacent POR were significant. Although none of the PSR in these 27 bins across 14 regions reached genome-wide significance (0.05/n bins), simulation studies have shown the co-occurrence of significant PSR and POR statistics in the same or adjacent bin is unlikely in null data. Additionally, an excess of both significant PSR and POR P-values was observed. Using an FDR threshold of 0.50 for the PSR, the combined analysis yielded 12 bins, whereas separate analyses did not have any bins meeting this threshold. These bins were on chrs 2, 6, 9, 10, 11, and 15, and each showed nominal significance in one of the separate analyses. Clustering of more than two adjacent bins was observed on chr 1, 11, and 15, which comprised regions of 30, 60, and 40 cM, respectively. The regions on chr 1 and 11 also contain more than one bin with PSR below an FDR of 0.50. Detailed results and figures are in Table 4, Figure 1, and Supplementary Material.
Two linkage meta-analyses of anxiety-related phenotypes, NEU and ANX, were performed using rank-based GSMA followed by a combined analysis after observing significant correlations between the results. Four bin sizes were used in order to check for artefacts related to either bin size or location, and the findings were generally consistent across these. We present the 10 cM bin results, which are appropriate for the level of resolution from our primary data sets. Both sample-weighted and unweighted analyses were undertaken. Although there was considerable heterogeneity in sample size and design, the unweighted results were not sensitive to dropping out any particular study, so the discussion focuses on these results.
We combined GWL results for Eysenck’s NEU assessed in eight independent samples containing 14 811 subjects in 5179 families. Among the 368 separate 10 cM bins, 25 bins representing 11 separate regions on chr 1, 2, 6, 7, 9, 10, 11, 12, and 14 attained nominal significance (PSR<0.05). Further support using the criteria POR<0.05 was found for bins on chr 9 (150 cM), 11 (90 cM), 12 (90–110 cM), and 14 (110 cM); the last three are flanked by other bins with significant PSR. The chr 9 region does not show up as being particularly significant in any of the primary studies and represents a linkage signal that would not be detected without combining data across studies. To our knowledge, this region has not been reported as linked to other anxiety or related phenotypes. The signal on chr 11 primarily derives from the highly-powered UKSP sample with additional evidence from IASPSAD and NZ-ND and is within a reported linkage region for MDD.33 The UKSP sample also provides much of the signal in the chr 12 region, with additional support from four other studies. The region has been strongly linked (LOD=6.0) to MDD34 and includes TMEM16D, which was associated with NEU in a genome-wide association study (GWAS) of personality in a Sardinian sample.35 Observing overlap between regions linked to NEU and MDD is not surprising, as NEU is genetically correlated with MDD,36, 37 and genetic variation in NEU may account for a portion of the genetic overlap between ANX and MDD.10, 11 Finally, the chr 14 signal derives primarily from the NTR, which previously showed linkage to this area using the Spielberger State-Trait Anxiety Inventory (STAI).19 The region also overlaps a suggestive linkage from the Kaabi et al22 PD study included herein as well as a locus identified in a targeted linkage scan performed in a single, large pedigree segregating PD and risk for early-onset ANX.38 Surprisingly, there was no evidence of linkage at 8p22, a region with replicated linkage to harm avoidance, another anxiety-related trait.39
As a comparison to the NEU analysis, we sought to maximize linkage information across clinically-defined ANX via meta-analysis of three independent studies (two for PD and one for a multivariate ANX phenotype). The combined sample contained 718 subjects meeting criteria for ANX in 162 families. The analysis identified 24 bins (10 cM) reaching nominal significance representing nine separate regions on chrs 1 (two regions), 2, 5, 9, 11, 15, 16, and 22, respectively. These results support earlier linkage on chrs 2, 9, and 15 reported by the Columbia group using their largest sample.21 Regions with further support (POR<0.05) include bins on chrs 1 (170 cM), 5 (200 cM), 15 (10 cM), 16 (40–50 cM), and 22 (10 cM). Although most significant bins had support from multiple samples, they were not always the most prominent regions previously reported for any individual study. The bins on chr 15 and 16 overlap regions where Camp et al40 reported nominal linkage (LOD scores 1.51 and 2.49, respectively) using a combined MDD-ANX phenotype.
Shared linkage between NEU and ANX
Whereas most nominally significant bins for NEU and ANX did not overlap, there was modest correlation (∼0.2) of linkage signals across the genome, supporting the evidence from twin studies that NEU and ANX share genetic factors. Additionally, we observed two adjacent bins on chr 1 (60–80 cM, 33–54 Mb) that were nominally significant across both phenotypes. For this region, the study-specific minimum linkage P-values ranged from 0.0018 to 0.72. Therefore, this result is supported by multiple studies of both phenotypes, suggesting that this region harbors genes broadly underlying anxiety susceptibility.
Although genetic epidemiology supports a combined analysis, this was not carried out a priori to minimize heterogeneity. The correlation of two independent GSMA results supported conducting a combined analysis, which showed more bins significant for both GSMA statistics and a greater enrichment for significant results than either phenotype alone. Although there are probably loci specific to NEU and ANX, the combined analysis yielded stronger results. Twelve bins were found below an FDR of 0.50, which means that, on average, six of these bins are expected to represent true positives. These bins were clustered into seven distinct regions of the genome representing approximately 111.6 Mb of sequence.
Relevance to GWAS
Although GWAS and sequencing studies are the current norm for searching for loci contributing to complex traits, linkage results are arguably important given the emerging architecture of complex traits. First, the shift to case–control from family-based samples may arguably be somewhat premature, as rare or low frequency variants may make an important contribution to many complex disorders. Further, it is apparent that many loci of small effect are contributing to complex disorders. Although achieving genome-wide significance is often the focus, most studies show a profound enrichment of significant tests as evidenced by Q-Q plots. However, separating false from true positives below the genome-wide significance threshold remains a challenge. Even in traits such as BMI with GWAS meta-analyses of samples with tens to hundreds of thousands of individuals, only dozens of SNPs are robustly associated41 and then in aggregate only explain a limited amount of the variance.42 It is reasonable to hypothesize that some regions of the genome will be enriched for these variants of small effect. Linkage has the advantage of identifying regions enriched for common variants of tiny effect and rare variants of larger effect. Therefore, identifying areas of robust linkage is important for approaches such as weighted FDR approaches applied to GWAS,43 or directed re-sequencing and rare variant analysis.
There are several limitations to the current study. First, 10 cM bins are large, covering many megabases and genes. This is common to linkage studies and the GSMA method that uses their data. The reliability of results obtained at finer resolutions is uncertain, given the inherent error in localizing regions of maximal linkage.44 Nonetheless, smaller bins may be helpful for directed follow-up, so we provide our 5 cM results as Supplementary Table S4. A notable limitation for the ANX meta-analysis is the small number of studies, their sample sizes, and phenotypic heterogeneity. We note that P-values from individual ANX studies were generally less significant than from individual NEU studies. The GSMA method is blind to this: because it ranks bins, it will always find the ‘best’ among those available. Therefore, results from the ANX meta-analysis are likely not as robust as those from NEU. Also, several of the original ANX linkage analyses tried multiple genetic transmission models, with differing findings; we included the non-parametric results only. Finally, it is unclear whether the weighted or unweighted GSMA approach is optimal. We chose to focus on the unweighted results owing to the outcome of the sensitivity analysis. However, this gives results from less powerful studies (due to smaller sample size or study design) equivalent weight to results from more powerful ones. Intermediate weighting schemes may well provide a better balance between the two approaches used here, but little empirical evidence is available.
In conclusion, we performed meta-analyses of linkage data from all available studies of NEU and three of the four independent published non-OCD ANX studies. Several regions were identified or confirmed for each phenotype and were supported by multiple samples. We also found modest support for common genetic factors shared by NEU and ANX across the genome, including a specific region on chr 1 (60 cM) that may harbor genetic variation broadly contributing to anxiety susceptibility. These results may be useful to refine or supplement on-going genetic association studies of anxiety-related phenotypes.
This work was supported by NIH grant R21MH79192 to Dr Hettema. Preliminary results from this study were presented at the XVIth World Congress on Psychiatric Genetics, October 11-15, 2008, Osaka, Japan.
Supplementary Information accompanies the paper on European Journal of Human Genetics website (http://www.nature.com/ejhg)