Introduction

The developmental period between childhood and early adulthood is a time of substantial cognitive change, with significant gains in working memory, complex reasoning and social abilities [1, 2]. Successful cognitive development during this period has important consequences for future mental and physical wellbeing, as well as occupational and financial success [3]. This period is also thought to be a critical window of risk for many psychiatric illnesses, such as psychotic disorders [4, 5]. While both biological and environmental factors clearly influence cognitive development, most current work in neurodevelopment focuses on very early stages (e.g., prenatal/perinatal; first 12 months) [6, 7]. Moreover, a rapidly growing literature has documented changes in gene expression over the course of brain development [8,9,10], but the relationship between temporal variation in gene expression and cognitive development is unclear. Delineating the genetic influences on cognitive development between childhood and adulthood should provide important insights into the biological mechanisms governing both typical and atypical maturation.

Cognitive abilities are substantially influenced by genes, with approximately half of the variance in general cognition attributed to genetic factors [11]. Specific abilities, including attention [12,13,14,15], working memory [15,16,17,18,19], and declarative memory [20,21,22], are also heritable. Furthermore, the heritability of cognition is moderated by age, with the heritability of IQ increasing from around 40% in early childhood to over 80% in adulthood [23]. More recently, the heritability of general cognition has been shown to increase between childhood/adolescence and early adulthood [11, 24]. Between early adulthood and old age, on the other hand, heritability of processing speed and memory decreases [25]. Thus, genetic influences on cognition may vary, not as a linear function of age, but depending on the developmental period and the specific cognitive abilities under investigation [26]. While the moderation of heritability by age is well replicated [3], previous studies have been unable to fully disentangle this effect because age has been categorized into broad developmental periods, rather than investigated as a continuous factor. Moreover, heritability at different ages has typically been estimated in different samples.

Early efforts to explain the increasing heritability of cognition across development mostly focused on twin and adoption designs [27,28,29]. More recently, studies have examined the genetic influence of common variants in unrelated individuals, known as SNP-based heritability [30, 31]. For example, Trzaskowski and colleagues [28] examined changes in heritability of general intelligence, or g, between ages 7 and 12 in 2875 unrelated children, as well as 6702 twin pairs. SNP-based heritabilities were 0.26 at age 7 and 0.45 at age 12, similar to the twin-based heritabilities of 0.36 and 0.49 at these ages. While these studies provide initial insight into the genetic mechanisms underlying cognitive development, they have a number of limitations. First, most have focused on childhood, with only one study investigating the period between childhood and adolescence [32] and none spanning from childhood to adulthood. Second, using different cognitive tests at different ages has introduced variability, making it difficult to establish whether cognitive changes stem from genetic or methodological factors. Finally, most studies have focused on general cognition, and have not applied a consistent genetic approach to specific functions in a single sample [15].

Because aging is not itself a direct consequence of gene action, increasing age, or maturation, can be treated as an environmental effect and modeled as a Gene × Environment interaction. Thus, a Gene × Age (G × A) interaction on cognitive development can be tested using a cross-sectional design that models differences in cognitive performance as a function of both relatedness (empirically defined) and similarity in age between individuals [25, 33]. As well as providing an estimate of genetic influence on cognitive development, G × A interaction analysis indicates whether this effect is due to fluctuations in the action of the same genetic factors or to variation in the genetic factors influencing the trait at different ages. Glahn and colleagues [25] used this approach to identify neurocognitive processes with significant G × A interactions, highlighting potential phenotypes for gene discovery in age-related cognitive decline. Similarly, Kent and colleagues [34] identified more than 600 lymphocyte-based RNA transcripts with significant G × A interactions, defining candidate genes for biological aging. Despite the high heritability of cognitive functions, identifying specific genes that influence cognition has proved challenging. Considering G × A interactions in the search for cognition genes could help, particularly when the goal is to determine whether a gene influences cognitive development.

In this study, we modeled change in cognitive functions between childhood and early adulthood in the Philadelphia Neurodevelopmental Cohort [35], a large population-based sample of individuals aged 8–21 years old. The aims of the study were to use an empirical relatedness matrix to (1) establish the heritability of general and specific cognitive functions, and (2) determine if G × A interactions influence these functions between childhood and early adulthood.

Methods

Participants

The Philadelphia Neurodevelopmental Cohort (PNC) is a population-based sample from the greater Philadelphia area, comprising 9421 individuals aged 8–21 years who received medical care within the Children’s Hospital of Philadelphia network. Study procedures have been described in detail elsewhere [35]. Briefly, participants presented for a range of medical needs, including general checkups and chronic condition management [28]. Participants provided written assent/consent for genomic studies upon providing blood samples during the clinical visit. Inclusion criteria were: (1) ability to provide signed informed consent (parental consent was required for participants under age 18), (2) English language proficiency, and (3) physical and cognitive ability to participate in computerized cognitive testing. Data deposited in dbGaP [36] were used in the present analyses. The present analyses were limited to participants who identified as either white non-Hispanic (European American) or black non-Hispanic (African American). A total of 6634 subjects with available cognitive and genetic data were included in the analyses, of whom 4694 (70.8%) were European American (EA) and 1940 (29.2%) were African American (AA). Given known differences in minor allele frequencies between individuals of African and European ancestry, all genetic analyses were conducted separately for EAs and AAs. Age ranged from 8 to 21 years, with a mean of 13.9 (SD = 3.65) (Figure S1); 49.1% of subjects were male (n = 3254).

Neurocognitive assessment

All PNC participants completed the 1-hour Computerized Neurocognitive Battery (CNB) [37, 38]. The CNB consists of 14 tests designed to capture functioning in five domains of cognitive ability: (1) executive function (abstraction and mental flexibility, attention, working memory), (2) episodic memory (verbal, facial, spatial), (3) complex cognition (verbal reasoning, nonverbal reasoning, spatial processing), (4) social cognition (emotion identification, emotion differentiation, age differentiation), and (5) sensorimotor speed (motor, sensorimotor). The CNB has been described elsewhere [37, 38] and a summary of the measures is included in Table S1. The battery also included the reading component of the Wide Range Achievement Test (WRAT), a measure of general cognitive ability.

In addition to the measures directly indexed by the CNB, we derived a general composite score (g) as the first component of a principal component analysis (PCA) using all tests except the WRAT. We also derived a general composite score for speed (gs) as the first component of a PCA using reaction times for all cognitive measures. To minimize the impact of missing data on these composite scores, the Multivariate Imputation by Chained Equations (MICE) method [39,40,41] was used to impute missing values using the mice package in R [42]. The imputation model was based on age, sex, and ethnicity (AA or EA). Test scores were imputed for subjects with less than 50% missing neurocognitive data, and five datasets were imputed (see Figure S2 for patterns of missingness and Figure S3 for plots of observed and imputed data). All subsequent analyses were conducted on the imputed neurocognitive data. Correlations among all test scores can be seen in Figure S4.
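The derivation of a composite score as the first principal component can be illustrated with a minimal sketch. It is not the code used in the study: it assumes complete (already imputed) data, standardizes each test, and extracts the leading eigenvector of the correlation matrix by power iteration. The function names and the simulated toy data (simulate_scores) are hypothetical and serve only to make the example self-contained.

```python
import math
import random


def standardize(columns):
    """Z-score each test (one list per test) so all tests share a common scale."""
    z_columns = []
    for col in columns:
        n = len(col)
        mean = sum(col) / n
        sd = math.sqrt(sum((x - mean) ** 2 for x in col) / (n - 1))
        z_columns.append([(x - mean) / sd for x in col])
    return z_columns


def correlation_matrix(z_columns):
    """Pearson correlations between standardized tests."""
    n = len(z_columns[0])
    k = len(z_columns)
    return [[sum(a * b for a, b in zip(z_columns[i], z_columns[j])) / (n - 1)
             for j in range(k)] for i in range(k)]


def first_component(corr, iterations=500):
    """Leading eigenvector of the correlation matrix, found by power iteration."""
    k = len(corr)
    v = [1.0 / math.sqrt(k)] * k
    for _ in range(iterations):
        w = [sum(corr[i][j] * v[j] for j in range(k)) for i in range(k)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


def composite_scores(columns):
    """Return (loadings, per-subject scores) for the first principal component."""
    z = standardize(columns)
    loadings = first_component(correlation_matrix(z))
    n = len(z[0])
    scores = [sum(loadings[t] * z[t][s] for t in range(len(z))) for s in range(n)]
    return loadings, scores


def simulate_scores(n_subjects, n_tests, seed=0):
    """Hypothetical toy data: each test = shared latent ability + independent noise."""
    rng = random.Random(seed)
    latent = [rng.gauss(0.0, 1.0) for _ in range(n_subjects)]
    columns = [[g + 0.5 * rng.gauss(0.0, 1.0) for g in latent]
               for _ in range(n_tests)]
    return columns, latent
```

With multiply imputed data, a sketch like this would be applied to each imputed dataset; how the resulting scores are combined is not specified here and is left out of the example.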

Genotyping

Samples were genotyped on one of four Illumina arrays: HumanHap550, HumanHap610, OmniExpress, or Human1M. Genotyped data were imputed in a separate phase of the study at the Broad Institute [43]. Unobserved genotypes from each chip set were imputed using the IMPUTE2 package and the reference haplotypes in Phase I of the 1000 Genomes data (June 2011 release), which included ~37,138,905 variants from 1094 individuals from Africa, Asia, Europe, and the Americas. The imputed genotype data were used in subsequent analyses. All analyses were conducted separately for EA and AA populations.

Estimation of the empirical relatedness matrix

Empirical relatedness quantifies the proportion of alleles that are identical by descent between individuals and was calculated for all pairs of individuals using the genotype data. A set of 50 k common autosomal SNPs in approximate linkage equilibrium was selected from all available SNP variants after LD pruning (r² > 0.1) using PLINK [44]. Relatedness was estimated from the selected SNPs using the IBDLD software package [45] (up to 50 SNPs within a 2 cM span), and a whitening transformation was applied to the resulting empirical relatedness matrix. The matrix was inspected to ensure correct properties (trace equal to number of genotyped subjects, symmetry, positive semi-definiteness, range of diagonal, and off-diagonal elements). The distribution of estimated relatedness values can be seen in Figure S5.
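The matrix checks listed above can be sketched as follows. This is an illustrative example, not the procedure used in the study: the function names are hypothetical, the matrix is assumed to be a small dense list of lists, and positive semi-definiteness is approximated by attempting a Cholesky factorization of the matrix plus a small ridge (eps), which succeeds when no eigenvalue is meaningfully below zero.

```python
import math


def is_positive_semidefinite(K, eps=1e-8):
    """Approximate PSD check: Cholesky factorization of K + eps * I must succeed."""
    n = len(K)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                d = K[i][i] + eps - s
                if d <= 0.0:
                    return False  # a non-positive pivot means K is not PSD
                L[i][j] = math.sqrt(d)
            else:
                L[i][j] = (K[i][j] - s) / L[j][j]
    return True


def check_relatedness_matrix(K, tol=1e-8):
    """Summarize the properties inspected for an empirical relatedness matrix."""
    n = len(K)
    diagonal = [K[i][i] for i in range(n)]
    off_diagonal = [K[i][j] for i in range(n) for j in range(i)]
    return {
        "n_subjects": n,
        "symmetric": all(abs(K[i][j] - K[j][i]) <= tol
                         for i in range(n) for j in range(i)),
        "trace": sum(diagonal),
        "positive_semidefinite": is_positive_semidefinite(K),
        "diagonal_range": (min(diagonal), max(diagonal)),
        "off_diagonal_range": ((min(off_diagonal), max(off_diagonal))
                               if off_diagonal else None),
    }
```

For example, check_relatedness_matrix([[1.0, 0.5, 0.0], [0.5, 1.0, 0.0], [0.0, 0.0, 1.0]]) describes three subjects, two of whom are related at 0.5, with a trace equal to the number of subjects. For cohort-sized matrices a numerical linear algebra library would be the practical choice.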

Statistical and quantitative genetic analyses

The statistical programming language R [46] was used for descriptive statistics and graphics. All genetic analyses were conducted using the SOLAR software package [47]. Briefly, SOLAR implements linear mixed-effects models, which decompose the overall variance of a quantitative trait. Traditionally, these analyses have been performed on family data using matrices calculated from pedigree information, but can also be applied to cohorts of related and unrelated individuals using relatedness estimated from genotype information [48]. Under a simple polygenic model, the phenotypic variance (σ²p) is assumed to be composed of an additive genetic component (σ²g) and an environmental component (σ²e). Maximum-likelihood estimates (MLEs) of σ²g and σ²e (along with regression coefficients for any variables included as fixed-effect covariates in the model) are found using an iterative procedure. Narrow-sense heritability (h²) is the proportion of the phenotypic variance accounted for by additive genetic variance (h² = σ²g/σ²p).
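For reference, the polygenic model described above can be written compactly. The first two lines restate the definitions in the text. The third line is the standard variance-components form of the phenotypic covariance between individuals i and j; it is an assumption added here for illustration, with K_ij denoting the empirical relatedness between i and j (scaled so that self-relatedness is approximately 1) and δ_ij the Kronecker delta.

```latex
\begin{align}
\sigma^2_p &= \sigma^2_g + \sigma^2_e, \\
h^2 &= \frac{\sigma^2_g}{\sigma^2_p}, \\
\operatorname{Cov}(y_i, y_j) &= K_{ij}\,\sigma^2_g + \delta_{ij}\,\sigma^2_e .
\end{align}
```

Under this decomposition, a heritability of 0.72 corresponds to σ²g = 0.72 σ²p and σ²e = 0.28 σ²p.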

As detailed previously [25, 34, 49], this polygenic model can be extended to examine Gene × Environment (G × E) interactions. One potential consequence of a G × E interaction is that the overall additive genetic variance is greater under certain environmental conditions than others. To test for this effect with a quantitatively measured environment, the polygenic model is modified to include a linear function on the logarithm of σ²g. This linear function contains a free parameter, γ, reflecting the change in σ²g per unit of the environmental variable, age in this case. A non-zero value of γ implies a heritable response to the environment, and therefore, a G × E interaction. A second potential consequence of a G × E interaction is that the trait exhibits imperfect pleiotropy with itself at different ages, i.e., the relative contributions of genetic factors to σ²g change with age [34]. In this case, the genetic correlation (ρg) between the trait measured at one age and the same trait measured at another age is less than 1, suggesting changes in the genetic factors contributing to σ²g. This phenomenon can be examined in cross-sectional studies where individuals are only tested under a single environmental condition or at a single time point, provided the degree of relatedness between individuals is known [33]. To uncover this effect, the ρg for a given pair of individuals is modeled as a function of the difference in their ages and another free parameter, λ, reflecting the rate of decay in ρg as the difference in ages increases. The genetic correlation (ρg) equals 1 if either the difference in ages between individuals is 0 or λ is 0. Thus, a non-zero value of λ implies decreasing ρg and imperfect pleiotropy across ages, and therefore, a G × E interaction. See Genotype × Age Interaction Model in the supplement for more information.
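The two interaction parameters can be made concrete with a short sketch. The functional forms below are assumptions chosen to match the verbal description (a log-linear function of age for the genetic variance, and a correlation that decays with the age difference and equals 1 when either the difference or λ is 0); the exact parameterization used in the analyses is given in the supplement and may differ. The names alpha (log genetic variance at the reference age) and mean_age (centering value) are hypothetical.

```python
import math


def genetic_variance(age, alpha, gamma, mean_age):
    """Assumed log-linear form: ln(sigma2_g) = alpha + gamma * (age - mean_age)."""
    return math.exp(alpha + gamma * (age - mean_age))


def genetic_correlation(age_i, age_j, lam):
    """Assumed exponential decay: rho_g = exp(-lam * |age_i - age_j|)."""
    return math.exp(-lam * abs(age_i - age_j))


def genetic_covariance(age_i, age_j, relatedness, alpha, gamma, lam, mean_age):
    """Additive genetic covariance between two individuals measured at different ages."""
    sd_i = math.sqrt(genetic_variance(age_i, alpha, gamma, mean_age))
    sd_j = math.sqrt(genetic_variance(age_j, alpha, gamma, mean_age))
    return relatedness * genetic_correlation(age_i, age_j, lam) * sd_i * sd_j
```

With γ = 0 and λ = 0 the covariance reduces to the relatedness multiplied by a constant genetic variance, i.e., the simple polygenic model. A positive γ makes the genetic variance grow with age, and a positive λ reduces the genetic covariance between individuals as their age difference widens.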

Polygenic models with modifications to test for both consequences of a potential G × E interaction, i.e., γ and λ, were fitted to all neurocognitive traits. Age in years was fitted as the continuous environmental variable. All models included age, age², sex, and their interactions as fixed-effect covariates. Statistical significance for each of the parameters of interest was determined by comparing the likelihood of the full polygenic model to the likelihood of a null model, i.e., one in which the parameter of interest was constrained to 0. To control for multiple testing, the false discovery rate (FDR) was set at 5% [50]. A rank-based inverse normal transformation was applied to scores on each test to ensure normal distributions. Separate polygenic models were fitted to data from EAs and AAs.
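The three statistical steps named above (rank-based inverse normal transformation, likelihood comparison, and FDR control) can be sketched as follows. Several details are assumptions rather than statements about the analyses: the transformation uses the offset (rank − 0.5)/n with average ranks for ties, the likelihood-ratio statistic is referred to a chi-square distribution with 1 degree of freedom without any boundary correction, and FDR control is implemented as the Benjamini-Hochberg step-up procedure. All function names are hypothetical.

```python
import math
from statistics import NormalDist


def rank_inverse_normal(values, offset=0.5):
    """Map ranks to standard normal quantiles; tied values share their average rank."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    pos = 0
    while pos < n:
        end = pos
        while end + 1 < n and values[order[end + 1]] == values[order[pos]]:
            end += 1
        average_rank = (pos + end) / 2 + 1  # ranks are 1-based
        for k in range(pos, end + 1):
            ranks[order[k]] = average_rank
        pos = end + 1
    normal = NormalDist()
    return [normal.inv_cdf((r - offset) / n) for r in ranks]


def lrt_p_value(loglik_full, loglik_null):
    """Likelihood-ratio test p-value, assuming a chi-square with 1 degree of freedom."""
    statistic = max(0.0, 2.0 * (loglik_full - loglik_null))
    return math.erfc(math.sqrt(statistic / 2.0))


def benjamini_hochberg(p_values, q=0.05):
    """Return one boolean per test: True if rejected at false discovery rate q."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= q * rank / m:
            cutoff = rank  # largest rank whose p-value passes its threshold
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff:
            rejected[idx] = True
    return rejected
```

In this sketch, each trait's scores would be transformed with rank_inverse_normal before model fitting, each parameter of interest would yield one p-value from lrt_p_value, and the p-values would then be passed together to benjamini_hochberg with q = 0.05.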

Results

Cognitive scores increase between childhood and early adulthood

Figure 1 shows neurocognitive test scores plotted by age. As previously reported [1], increasing age was significantly associated with increasing test scores across all neurocognitive measures. Verbal reasoning, age differentiation, sensorimotor speed, WRAT, and g showed particularly substantial age-related changes, with increases of 1.69, 1.56, 2.08, 2.08, and 2.08 SD between ages 8 and 21, respectively (Fig. 1).

Fig. 1. Neurocognitive scores by age for all participants

Cognitive abilities are heritable

Heritability estimates for all neurocognitive measures are presented in Fig. 2 and Table 1. For EAs, all heritability estimates, except for abstraction and age differentiation, were significant after adjustment for multiple testing (Table 1). For AAs, all heritability estimates, except for abstraction, spatial memory, verbal memory, spatial reasoning, age differentiation, and sensorimotor speed were significant after adjustment for multiple testing (Table 1). The AA sample is smaller than the EA sample, which may account for fewer statistically significant results. Neurocognitive measures with non-significant heritability estimates were excluded from subsequent G × A analyses. Figure S6 shows heritability estimates for the imputed and unimputed neurocognitive data.

Fig. 2. Heritability estimates for all neurocognitive measures

Table 1 Heritability estimates and G × A interactions

G × A interaction I: genetic variance increases with age

In EAs, general cognitive ability, or g, exhibited a significant increase in genetic variance with increasing age (γ = 0.047, p = 0.024) after adjustment for multiple testing. This effect is denoted by γ and suggests that specific genetic factors influence change in performance on this measure, but also that the magnitude of effect of these genetic factors varies as a function of age (Table 1). Figure 3 shows genetic variance, environmental variance, and heritability between ages 8 and 21 for g.

Fig. 3. Predicted changes in genetic variance, environmental variance, and heritability with age on general cognitive ability, or g, in the European American group

In AAs, increase in genetic variance with increasing age on g showed a trend towards statistical significance after adjustment for multiple testing (γ = 0.081, p = 0.065) (Table 1). Figure S7 shows genetic variance, environmental variance, and heritability between ages 8 and 21 for g in the AA sample. The smaller AA sample may account for fewer statistically significant results.

G × A interaction II: genetic factors influencing cognitive abilities at different ages overlap

In both EAs and AAs, none of the neurocognitive measures exhibited statistically significant changes in genetic correlation with increasing age, suggesting that the genetic factors influencing changes in neurocognition do not change between childhood and adulthood (Table 1).

Discussion

Using a large population-based developmental cohort of individuals aged 8–21 years old, we established that both general and specific neurocognitive measures are heritable. Heritability estimates for measures of general cognition, executive function, memory, complex reasoning, social cognition, and sensorimotor speed were moderate to large, consistent with previous findings that neurocognition across the first two decades of life is under considerable genetic influence [11, 24, 32]. Using G × A interaction analyses, we found that specific genetic factors influenced changes in general cognitive ability, or g, between childhood and adulthood, but that the scale, or strength, of action of these genetic factors varied with age, particularly in the EA sample. Finally, we did not find evidence for decay in genetic correlation on any neurocognitive measure throughout childhood and early adulthood, suggesting that the same genetic factors influence changes in neurocognition during this developmental period.

Our findings advance knowledge regarding the genetic architecture of cognitive development between childhood and adulthood in several ways. First, while the importance of genetic factors in determining individual differences in general cognitive ability is well established [3], specific neurocognitive functions have received less attention. We found that measures of general cognition, executive function, memory, complex reasoning, social cognition, and sensorimotor speed were significantly heritable. The largest heritability estimates were for general cognitive measures, 67% and 72% for the WRAT and g, respectively, but estimates were also substantial for memory (36–56%) and complex reasoning (35–46%) measures. This pattern of results is consistent with those of genome-wide complex trait analysis (GCTA) in the same sample [15]. Specifically, the largest heritability estimates in both our study and that of Robinson and colleagues [15] are in the complex reasoning domain and the WRAT, and the smallest are in the executive and social domains. The main reason for the somewhat higher heritability estimates in our study is that our analyses included all individuals, whereas individuals with relatedness > 0.05 were excluded prior to GCTA [15]. Thus, the higher heritability estimates using related individuals are likely due to rare variants, CNVs, and structural variants that are not well captured by current common SNPs. Our findings are also in line with reports that heritability increases with increasing task complexity [51,52,53]. Differences in heritability estimates between cognitive measures may be due to differences in genetic architecture, such that different genetic factors and/or the same genetic factors, but to differing degrees, underlie variation in different neurocognitive measures [54]. Differential effects of environmental factors, such as education [55] and socioeconomic factors [56, 57], may also play a role. Measurement error is unlikely to underlie the observed differences in heritability estimates since reliability of the CNB measures is high (> 0.7 for most measures) [15, 38, 58].

Second, G × A analyses revealed a significant age-related increase in genetic variance on general cognitive ability, or g, in line with previous studies [11, 28]. Specifically, we found an increase in genetic variance from 0.32 at age 8 to 0.58 at age 21, which closely resembles the increase from 0.26 to 0.45 between ages 7 and 12 reported by Trzaskowski and colleagues [28]. Similar increases have also been reported in the twin literature. For example, Haworth et al. reported an increase from 0.41 to 0.66 between ages 9 and 17 [11], and Bergen et al. reported an increase from ~0.45 to 0.80 between ages 5 and 35 [24]. Previous studies have generally discussed increasing genetic variance in terms of three phenomena: (1) innovation, i.e., novel genetic factors coming into play over time, (2) decay, i.e., existing genetic factors becoming decreasingly important over time, and (3) amplification, i.e., existing genetic factors becoming increasingly important over time [27, 29]. Our findings provide evidence for genetic influences on g becoming amplified over childhood and early adulthood. This amplification of genetic factors may be due to gene-environment correlations, whereby individuals increasingly select experiences based on their underlying genetic predispositions, thus accentuating genetic differences [59, 60]. Moreover, genes may become increasingly expressed over the course of cognitive development due to maturational processes [61], and/or environmental factors may moderate gene expression as individuals increasingly select and evoke these factors [11, 62]. Our findings also highlight the importance of g, or general cognitive ability, and lend support to the notion of generalist genes, i.e., that a set of generalist genes influences multiple cognitive domains [63]. The generalist gene hypothesis has important practical implications since g can readily be calculated in any study that includes five or more cognitive tests, allowing data from multiple studies to be combined and thus increasing power to detect genetic variants underlying cognition [26].

Third, our G × A analyses suggest overlap between the genetic factors influencing changes in neurocognition in childhood and adulthood, since none of the neurocognitive measures exhibited significant changes in genetic correlation with age. This finding is directly in line with that of a previous meta-analytic study, which found that genetic innovation, whereby novel genetic factors come into play over time, ceased after age 8 [29]. Our findings are also in line with an expansive literature documenting stable genetic effects on cognition throughout the lifespan [3]. For example, in a longitudinal twin study, the genetic correlations of full-scale, verbal and performance IQ between childhood and adolescence were estimated at 0.96, 0.78, and 0.90, respectively [64]. Similarly, a study examining changes in heritability of g between ages 7 and 12 in 6702 twin pairs, as well as 2875 unrelated children [28], reported genetic correlations between ages of 0.73 and 0.75 for the SNP-based and twin approaches, respectively. Our results extend these findings by showing overlap between the genetic factors influencing neurocognition beyond adolescence to early adulthood.

Our findings have several implications and generate testable hypotheses for future work. First, our findings suggest the importance of considering G × A interactions when conducting gene discovery studies since genetic variance increases with age. Thus, gene discovery efforts during this developmental period may be most fruitful when examining young adults or by using statistical models that account for changes in genetic variance with age. Second, this study suggests the feasibility of utilizing different populations of individuals together in genetic studies. Although our AA sample was small, we found significant heritability estimates. Conducting genetic studies across populations is critical in assessing the accuracy and broader relevance of a finding [65, 66]. Finally, our findings provide evidence for specific, but also dynamic, genetic influences on cognitive development between childhood and adulthood, in line with a growing literature on changes in gene expression over the course of brain development [8, 53]. Future studies integrating neurocognitive and neurobiological measures will aid understanding of the complex interplay between genetic influences. For example, cortical thickness and white-matter tract integrity may act as intermediary pathways between genetic factors and neurocognition over development [27].

This study has some limitations. First, our data were cross-sectional; longitudinal studies with repeated assessments of the same individuals over time using identical cognitive tests are needed to fully establish developmental cognitive trajectories. Second, while we included individuals of both European and African ancestry, the AA sample was small. Future studies should include larger samples of individuals of African ancestry, as well as other underrepresented populations [65, 66]. Third, while this is the first time, to our knowledge, that G × A analyses have been used to examine genetic influences on changes in both general and specific cognitive measures between childhood and adulthood, different tests have different psychometric properties, and strong assertions about specific cognitive functions require replication in independent samples.

In conclusion, we found that neurocognitive measures across childhood and early adulthood are under substantial genetic influence. Moreover, specific genetic factors influence changes in general cognitive ability, or g, between childhood and adulthood, but the magnitude of effect of these genetic factors varies as a function of age. Finally, the genetic factors influencing neurocognition throughout this developmental period overlap at different ages. Establishing the nature of G × A interactions on changes in neurocognition across childhood and early adulthood is a necessary first step in identifying genes that influence cognitive development.