Introduction

Human cognitive performance is both highly variable and under strong genetic control. Virtually all cognitive tests that have been studied show appreciable heritabilities.1, 2, 3 For example, all of the tests in our study have estimated heritabilities of at least 15%, with some as high as 50% (Table 1). Although a number of candidate gene studies have tested the effects of specific polymorphisms on cognitive tasks, few definitive associations have emerged that meet contemporary standards of evidence.4 Surprisingly, there have been few genome-wide studies, apart from some genome-wide linkage scans that have not clearly indicated which genomic regions are most important (reviewed in ref. 5). One genome-wide association study did report significant associations between delayed recall and variants in the KIBRA and CAMTA1 genes,6, 7 but these associations have not been replicated.8 Two other genome-wide association studies on cognition did not show conclusive associations.9, 10

Table 1 Description of cognitive tests in our battery

Although it seems clear that genetic variation strongly influences cognitive performance in human beings, identification of important genetic differences remains elusive. With the aim of discovering these variants, we constructed a test battery using objective, standardized tests that are of the highest relevance in studies involving psychiatric and neurological populations and performed genome-wide association in a large sample of healthy individuals. The diverse aspects of cognition measured in this study are not only interesting in their own right, but also have documented relevance to psychiatric traits and are considered important endophenotypes in psychiatric conditions and illnesses (Table 1).

Materials and methods

Subjects and cognitive battery

We considered two tests of attention and executive function, the Digit Symbol Substitution subtest of the Wechsler Adult Intelligence Scale-III and the Stroop Color-Word test, administered to a total of 1688 individuals. These two tests were chosen as our main phenotypes because they have relatively high heritabilities, assess executive function (Table 1), have shown sensitivity in patient samples, and have favorable psychometric properties. Subjects were recruited into this study between 2006 and 2009 by advertisements posted around the Duke and North Carolina State Universities and local retirement homes. This was primarily a cohort of young university students, with 81% under the age of 30 years, 58% undergraduate students, 18% graduate students, 58% of European ethnicity, and 54% female (Supplementary Table 1). In all, 84% of students were at Duke University and 13% at North Carolina State University, both in the Raleigh-Durham region of North Carolina, and 3% were from other universities. A subset of 832 of these individuals also took a 30 min expanded battery of tests, details of which are given in Table 1. All participants were compensated for their time with a small monetary reward.

Single-nucleotide polymorphism genotyping

Each subject donated 20 ml of blood or 5 ml of saliva for DNA extraction. DNA was extracted using the QIAGEN (Venlo, The Netherlands) Autopure LS. The DNA for 1458 subjects was genotyped using Illumina (San Diego, CA, USA) genotyping chips, and single-nucleotide polymorphisms (SNPs) from the HumanHap610 were used for analysis (Supplementary methods).

Test score distributions

The distribution of scores on each test of the expanded battery was assessed for normality using a Shapiro–Wilk test, and those showing a substantial deviation from normality (P<0.001 in those of European ethnicity) were transformed using a Box-Cox transformation in STATA;11 this included Trails A and B. Although Digit Span Forward and Backward did not pass this Shapiro–Wilk cutoff because of the limited number of scores possible (integers from 1 to 9 for Forward and 1 to 8 for Backward), visually they followed a normal distribution and thus were left untransformed. The phenotype distributions were shifted toward better performance for university students, compared with those not currently in school; however, even within this subset, test scores still covered a wide range and did not substantially deviate from a normal distribution (Supplementary Figure 1). For test score distributions and correlations between tests, see Supplementary Tables 3 and 4.

Exclusions

Each subject was asked to complete a questionnaire before the test, which included questions with regard to age, native language, education, ethnicity, medications, and psychiatric disorders. The 832 subjects who took the expanded battery also had their level of depression measured by the Beck Depression Inventory (BDI)-II,12 and 680 of these subjects filled out a more extensive questionnaire that asked about lifestyle, family background, substance abuse, familiarity with the testing battery, and strategies used during testing (these last two were given directly after the test). Subjects recruited from retirement homes were also administered the MOCA13 to test for possible dementia. On the basis of these data, 157 subjects were excluded because of factors likely to influence their performance (Supplementary methods).

Covariates

Standard covariates of ethnicity (EIGENSTRAT axes), age, sex, education (baseline of current undergraduate students with dummy variables for those without college education, those with a bachelor's degree, and those with a graduate degree or who were currently in graduate school), BDI score, handedness, who tested the subject, whether they took the full battery or just Digit Symbol and Stroop, which university they attended, and testing location were added one at a time to a multivariate regression model of all subjects in STATA.11 Except for sex and ethnicity, which were always included, these standard covariates were only removed from the model if doing so increased the adjusted r2. Unlike r2, which always increases as covariates are added, the adjusted r2 takes into account both sample size and the number of covariates in the model and is calculated as 1 − (1 − r2) × (n − 1)/(n − k − 1), where n is the sample size and k is the number of covariates in the model. With the aim of accounting for as much environmental variation as possible, we also considered novel covariates collected with our full questionnaire, such as whether the subject had seen that particular test before, and added them to the regression model if they contributed with a P-value below 0.005 (Supplementary methods).
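The adjusted r2 criterion used for covariate selection can be illustrated with a minimal sketch, written with the denominator n − k − 1. The function name and the example numbers are illustrative assumptions, not values from the study, and the analysis itself was run in STATA rather than Python.

```python
def adjusted_r2(r2: float, n: int, k: int) -> float:
    """Adjusted r2 for a linear model with n observations and k covariates."""
    if n - k - 1 <= 0:
        raise ValueError("need more observations than covariates plus one")
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


# Hypothetical example: a sixth covariate raises r2 only slightly,
# so the adjusted r2 falls and the covariate would be dropped.
base = adjusted_r2(r2=0.300, n=100, k=5)
extra = adjusted_r2(r2=0.301, n=100, k=6)
print(f"adjusted r2 with 5 covariates: {base:.4f}")
print(f"adjusted r2 with 6 covariates: {extra:.4f}")
```

The penalty shrinks as n grows, so in a sample of this study's size only covariates with almost no explanatory value would be removed by this rule.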

Association analysis

EIGENSTRAT analysis14 was performed on our subjects to determine ethnicity, and outliers were pruned to remove as many of the initial 185 significant axes as possible while retaining a large sample size. This pruning resulted in 10 significant EIGENSTRAT axes and the inclusion of 813 self-identified Europeans, 167 East Asians, 74 South Asians, and 32 of other ethnicities. Our primary association analyses were carried out in these individuals: 1086 for our two main tests and 514 for the expanded battery. All 10 significant EIGENSTRAT axes were used as covariates in all analyses.

Each SNP was analyzed in plink,15 using an additive linear model with the selected covariates for each phenotype. As target phenotypes, we considered key variables from each of the tests (Table 1). In addition to these 11 tests, we considered the first principal component (PC1) for the tests in the expanded battery, which explained 39% of the total variation in test scores. When an individual was missing a single test score because of examiner error (three missing Delayed Story Recall, one missing Symbol Search, one missing Trails B), PC1 was calculated using imputed scores for the missing tests. Scores were imputed using the missing value analysis function in SPSS, using expectation maximization algorithms.16 The minor allele frequency cutoff for analysis was set to 5/2n, thus 0.002 for the two main tests and 0.005 for the expanded battery. Twelve phenotypes analyzed against approximately 560 000 SNPs each (the number varies slightly depending on the phenotype, see Supplementary Table 3) require a Bonferroni-corrected P-value of 7.4 × 10−9. The results for SNP association analysis were visualized using WGAViewer,17 and QQ plots for each phenotype are available in Supplementary Figure 2.
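The minor allele frequency cutoff of 5/2n can be reproduced with a short sketch. Reading the cutoff as a minimum of five minor-allele copies among the 2n chromosomes of n subjects is our interpretation, and the function name is hypothetical.

```python
def maf_cutoff(n_subjects: int, min_minor_alleles: int = 5) -> float:
    """Frequency at which min_minor_alleles copies occur among 2n chromosomes."""
    return min_minor_alleles / (2 * n_subjects)


main_tests = maf_cutoff(1086)  # subjects analyzed for the two main tests
expanded = maf_cutoff(514)     # subjects analyzed for the expanded battery
print(f"main tests: {main_tests:.4f}, expanded battery: {expanded:.4f}")
```

Rounded to three decimal places, these values correspond to the 0.002 and 0.005 cutoffs quoted in the text.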

Select polymorphisms from earlier studies of cognition were analyzed if present on the Illumina HumanHap610 or tagged with an r2 above 0.8 on the chip (Table 2).

Table 2 Association with candidate polymorphisms for human cognitive function

We also tested for the effect of common copy number variants (CNVs) by using a set of SNPs known to tag CNVs.18 Of 285 SNPs identified as tagging a CNV,19 187 were on the HumanHap610 and were used in our genome-wide analysis. These SNPs were examined for association with each phenotype in a separate regression analysis (Supplementary results), and the correction for multiple testing required a P-value below 2 × 10−5.

To assess association in a more homogeneous group, we additionally performed analysis on only those individuals <30 years of age who were students of European ethnicity. EIGENSTRAT axes were re-built for just these samples and analysis was performed for 561 subjects (191 for extended battery) using all four EIGENSTRAT significant axes and the covariates listed in Supplementary Table 5, which were kept the same as in the initial analysis unless they no longer contributed to the model with P<0.2. With an average of 550 000 SNPs (Supplementary Table 6) tested for each of the 12 phenotypes, a P-value of 7.6 × 10−9 is required to declare genome-wide significance after Bonferroni's correction. Follow-up analyses for rs1983761 were also performed in 133 subjects of European ethnicity above the age of 29 years with four EIGENSTRAT axes, and in 47 subjects of European ethnicity below the age of 30 years who were not students with the same four axes (Supplementary Table 7). For each analysis, the covariates were kept the same as in the original analysis unless they no longer contributed to the model with P<0.2. This SNP was not analyzed in other ethnicities, as it was only seen once in African Americans, twice in South Asian, and never in East Asian in our study.
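The Bonferroni thresholds quoted in this section follow from dividing a family-wise error rate by the number of SNP-by-phenotype tests. The sketch below assumes a family-wise rate of 0.05, which the text does not state explicitly, and uses the approximate SNP counts given above; the function name is hypothetical.

```python
def bonferroni_threshold(n_snps: int, n_phenotypes: int, alpha: float = 0.05) -> float:
    """Per-test P-value threshold; alpha = 0.05 is an assumed family-wise rate."""
    return alpha / (n_snps * n_phenotypes)


full_sample = bonferroni_threshold(560_000, 12)  # all subjects, ~560 000 SNPs
cnv_tags = bonferroni_threshold(187, 12)         # 187 CNV-tagging SNPs
homogeneous = bonferroni_threshold(550_000, 12)  # young European students
print(f"{full_sample:.2e} {cnv_tags:.2e} {homogeneous:.2e}")
```

Because the exact SNP count varies slightly by phenotype, the computed values agree with the quoted thresholds only to the precision reported in the text.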

Power calculations for association analysis were performed using PowerCalc software (http://www.genome.duke.edu/labs/goldstein/software/)20 (Supplementary methods).

Results

Genome-wide SNP association study

The two main phenotypes examined in this study, Digit Symbol and Stroop Color-Word, were analyzed in 1086 subjects using genotype data from the Illumina HumanHap610 and an additive linear regression model including covariates to minimize the effects of environmental influences (Supplementary Tables 2 and 3). Phenotypes for an additional nine cognitive tests and PC1 were available for a total of 514 subjects and were analyzed in the same manner. No polymorphism achieved a significant association after accounting for the full number of tests carried out in this study, nor did any polymorphism achieve the now typical threshold of 5 × 10−8 for genome-wide significance for any single test.4 The strongest association was between Digit Span Backward and rs1876040, with a P-value of 6.3 × 10−8. This SNP is 33 kb away from the nearest gene, AC092594.1. The top 100 hits for each phenotype can be found in the Supplementary results, and the results for power calculations are in Supplementary Table 3.

Association testing in SNPs of special interest

CNVs that are known to be tagged by SNPs on the HumanHap610 were also analyzed through their proxy SNPs18, 19 (Supplementary results). No CNV-representing SNPs were found to be significantly associated with any of the tests in our battery, despite 80% power to detect variants explaining at least 2% (for Digit Symbol and Stroop Color-Word) to 5% (for the remaining tests) of the variation in test score. The best association was between COWA and rs7604792, which tags a 5.8 kb CNV (chromosome 2:123192648-123198471, P=8 × 10−5). This CNV is not in the vicinity of any genes.

Select polymorphisms earlier found to be associated with cognitive tasks in genome-wide association or candidate gene studies were also analyzed for association with each of the 12 phenotypes (Table 2). These polymorphisms were either present on the Illumina HumanHap610 or tagged on the chip. No candidate polymorphisms were significantly associated after using a Bonferroni's correction for multiple testing.

Genome-wide SNP association study in the homogeneous group

We also performed genome-wide association analysis of these 12 phenotypes by restricting our data set to students <30 years of age who were of European ethnicity (Supplementary Tables 5 and 6). Again, no SNP was found to be significantly associated after correcting for all tests in this set of 561 subjects (191 for extended battery) (top 100 associations are in Supplementary results). However, one SNP, rs1983761, with a P-value of 9.9 × 10−9 for association with Trails B, did pass the commonly used threshold of 5 × 10−8. We additionally tested for association of this SNP with Trails B in 133 subjects of European ethnicity above the age of 29 years and in 47 subjects of European ethnicity below the age of 30 years who were not students. It was not associated with P<0.1 in either of these groups, and the trend of effect in both cases was the reverse of that in the original finding (Supplementary Table 7).

Discussion

Genome-wide association studies have largely failed to find common genetic variants that explain large portions of the variation in complex traits,21 especially psychiatric diseases.22, 23, 24 It has been proposed that this failure stems from the heterogeneous nature of such conditions, and that endophenotype measurements will be much cleaner and more amenable to genetic analysis.25 Here, we have studied a number of tests that assess underlying endophenotypes of a number of important psychiatric conditions. We evaluated the performance of healthy volunteers on these tests and found that, consistent with what is observed for the end clinical conditions themselves, no single common SNP makes a major contribution to variation in the population. This is consistent with the findings from model organisms that the distribution of effect size for causal common variants is the same regardless of the type of phenotype under study.26 Although our moderate sample size does limit our ability to detect variants of small effect, for Digit Symbol and Stroop Color-Word, both tests of executive function, we can conclude that it is unlikely that any variant explains >4% of the variation, whereas for the other tests in our battery, it is unlikely that any variant explains >8%. Our effective power is even greater than these 4–8% figures suggest, as environmental covariates built into the regression model explained 15–49% of the variation in each phenotype. For example, Symbol Search, with covariates explaining 41% of the total variation, has an effective power of 80% to detect variants explaining at least 5% of the remaining variation in the trait (Supplementary Table 3). Finding that no single common variant has a large effect on these phenotypes is consistent with a separate study we performed on phenotypes related to very specific aspects of memory27 and with the emerging body of evidence that many key human phenotypes under long-term selection are not heavily dependent on common variation in individually detectable polymorphisms.

Although none of the SNPs or common CNVs analyzed in this study met our threshold for declaring genome-wide significance, it is worth pointing out that many thousands of the SNPs analyzed had P-values much lower than those reported in typical candidate gene studies of comparable size, and many of the genes represented could be argued to be strong candidate genes on the basis of influencing transmitter systems relevant to given traits or related criteria. This indicates that candidate gene studies may have been too liberal in their cutoffs for declaring an association to be significant. Furthermore, we observed a single variant, rs1983761, which associated with Trails B with a P-value of 9.9 × 10−9 in students of European ethnicity who were <30 years of age. Although this association did pass the commonly used threshold of 5 × 10−8, it did not pass our Bonferroni-corrected threshold of 7.6 × 10−9. This variant explained 14% of the variation in the trait in the initial association, yet when we followed it up in a group of 133 subjects of European ethnicity above the age of 29 years and in a group of 47 subjects of European ethnicity below the age of 30 years who were not students, we saw no association despite 99 and 86% power, respectively, to detect a variant with an effect this large. Even if the winner's curse meant that this variant had a smaller effect size than 14%, we had 80% power to detect a variant explaining 4.5 or 12% of the variation in this trait in older Europeans and younger Europeans not in school, respectively. Unless one believes that this variant has an effect only in young students, it must be regarded as a false positive, which further emphasizes the importance of using strict P-value cutoffs in association analyses.
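The follow-up power figures can be approximated with a normal-approximation sketch for a two-sided, one-degree-of-freedom test. This is not the PowerCalc method: the noncentrality formula, the function name, and the significance level of 0.1 (chosen to mirror the P<0.1 criterion reported for the follow-up groups) are all assumptions, and covariates are ignored.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(n: int, var_explained: float, alpha: float) -> float:
    """Approximate power to detect a variant explaining var_explained of the
    trait variance in n subjects, using a two-sided normal approximation."""
    std_normal = NormalDist()
    # Assumed noncentrality of the 1-df test statistic: n * R2 / (1 - R2).
    ncp = n * var_explained / (1.0 - var_explained)
    shift = sqrt(ncp)
    z_crit = std_normal.inv_cdf(1.0 - alpha / 2.0)
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)


# alpha = 0.1 is an assumption; the source does not state the level used.
older = approx_power(n=133, var_explained=0.14, alpha=0.1)
non_students = approx_power(n=47, var_explained=0.14, alpha=0.1)
print(f"older Europeans: {older:.2f}, young non-students: {non_students:.2f}")
```

Under these assumptions the sketch gives values in the neighborhood of the 99 and 86% quoted above; exact agreement is not expected, because the published calculation may use a different test statistic and accounts for covariates.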

Although the sample size in this study precluded the discovery of variants explaining <4 to 8% of the variation in these traits, the results are consistent with the hypothesis that common diseases may be more influenced by deleterious variants held at low frequency by natural selection than by common variants of large effect, as postulated by the common disease–common variant hypothesis. It is of course also possible that a great number of common variants of small effect influence such traits, as was recently reported in a study on schizophrenia,28 or that many common variants interact with each other to influence risk. However, as there are already multiple examples of rare variants affecting risk for psychiatric diseases,29, 30, 31 we believe that characterization of such variants is the most promising strategy for advancing the study of the genetics of psychiatric and other complex disease conditions. Furthermore, our results are consistent with the idea that identifying rare causes of disease will be more fruitful than efforts directed at genetically homogeneous endophenotypes.

Conflict of interest

The authors declare no conflict of interest.