Although common sense suggests that environmental influences increasingly account for individual differences in behavior as experiences accumulate during the course of life, this hypothesis has not previously been tested, in part because of the large sample sizes needed for an adequately powered analysis. Here we show for general cognitive ability that, to the contrary, genetic influence increases with age. The heritability of general cognitive ability increases significantly and linearly from 41% in childhood (9 years) to 55% in adolescence (12 years) and to 66% in young adulthood (17 years) in a sample of 11 000 pairs of twins from four countries, a larger sample than all previous studies combined. In addition to its far-reaching implications for neuroscience and molecular genetics, this finding suggests new ways of thinking about the interface between nature and nurture during the school years. Why, despite life's ‘slings and arrows of outrageous fortune’, do genetically driven differences increasingly account for differences in general cognitive ability? We suggest that the answer lies with genotype–environment correlation: as children grow up, they increasingly select, modify and even create their own experiences in part based on their genetic propensities.
The phenomenon of general cognitive ability was discovered more than a century ago and was called g to distinguish it from the many connotations of the word intelligence.1 Individual differences in diverse cognitive abilities such as verbal, spatial, memory and processing speed correlate about 0.30 on average, and a general factor (an unrotated first principal component) accounts for about 40% of the total variance, as indicated in a meta-analysis of more than 300 studies.2, 3 Despite its contentious history,4 g is one of the most reliable, valid and stable behavioral traits,3 and it predicts important social outcomes such as educational and occupational levels far better than any other trait.4, 5, 6, 7
The substantial heritability of g has been documented in dozens of family, twin and adoption studies, with estimates varying from 40 to 80%.8, 9, 10 Twin studies, which compare identical (monozygotic, MZ) and fraternal (dizygotic, DZ) twins, provide the majority of these estimates. In 34 twin studies with a total of 4672 pairs of MZ twins, the average MZ correlation is 0.86,8 indicating that identical twins are nearly as similar as the same person tested twice (test–retest correlation for g is about 0.90).3 In contrast, the average DZ correlation is 0.60 in 41 studies, with a total of 5546 pairs of DZ twins. Heritability, the genetic effect-size indicator, can be estimated by doubling the difference between the MZ and DZ correlations because MZ twins are twice as similar genetically as DZ twins.11 The resulting heritability estimate of 52%, that is, 2 × (0.86 − 0.60), is similar to estimates from family and adoption studies. Moreover, meta-analyses of all of the studies yield heritability estimates of about 50%, indicating that about half of the total variance in g can be accounted for by genetic differences between individuals.12, 13, 14
However, this simple conclusion masks possible developmental differences. The dozens of twin studies of g vary widely in the age of their samples, and several reviews have noted a tendency for heritability to increase with age, especially during the transition from infancy to early childhood.9, 15, 16 The possibility that the heritability of g increases with age is interesting because it goes against the reasonable assumption that environmental influences become more important as experiences accumulate during the course of life. However, the hypothesis has not been rigorously tested, in part because an adequately powered test of differences in heritability requires a large sample of twins spanning a wide age range and assessed on measures of g.
To test the hypothesis that the heritability of g increases from childhood to young adulthood, we created a consortium of six twin studies from four countries that yielded a total of 11 000 pairs of twins with g data, a larger sample than all previous studies combined. This large sample size makes it possible to conduct the first adequately powered test of age differences in the genetic and environmental etiology of g.
Materials and methods
Samples and measures
Data on general cognitive ability were available from six twin studies from four different countries. These samples are all part of the Genetics of High Cognitive Abilities consortium. Three samples came from the United States—from Ohio, Colorado and Minnesota—and one each from Australia, The Netherlands and the United Kingdom. Individuals ranged from 6 to 71 years of age and the samples are organized here in the order of the average age of the sample. (For our age analyses, we limited the sample to individuals below 34 years of age because just 127 pairs were older than 34 years, a sample too small to provide adequate power in our twin analyses.) Although these studies included different measures of cognitive ability, diverse cognitive tests can be used to create a g score that correlates highly with g scores derived from other tests,17 which Spearman18 referred to as the indifference of the indicator. Thus, we created g scores standardized within each study.
The Western Reserve Reading Project,19 a longitudinal twin study, provides data for 121 MZ pairs and 171 same-sex DZ pairs. Recruiting was conducted through school nominations, Ohio State Birth Records and media advertisements. Schools were asked to send a packet of information to parents in their school system with twins who have been enrolled for kindergarten but have not finished first grade. Cooperation was secured from 273 schools throughout the state of Ohio. Media advertisements in the Greater Cleveland Metropolitan Area have also been used for the effective recruitment of additional twins. A social worker with long-standing ties to the community was also hired to assist in the recruitment of under-represented groups through face-to-face meetings with churches, community centers and other service organizations. General cognitive ability was assessed using a short form of the Stanford–Binet Intelligence Scale,20 including Vocabulary, Pattern Analysis, Memory for Sentences, Memory for Digits and Quantitative subtests. These subtests were summed and standardized for age and sex to form a general cognitive ability (g) score. Zygosity was assessed using DNA analysis by a buccal swab procedure. The average age of the sample was 6.07 years (range=4.33–7.92).
The Twins Early Development Study is a sample of twins born in the United Kingdom between 1994 and 1996.21 The Twins Early Development Study sample has been shown to be reasonably representative of the general population in terms of parental education, ethnicity and employment status.22 Zygosity was assessed through a parent questionnaire of physical similarity, which has been shown to be over 95% accurate when compared with DNA testing.23 For cases where zygosity was unclear from this questionnaire, DNA testing was conducted. At 12 years of age, the twins participated in web-based testing.24 The twins were tested on two verbal tests, Wechsler Intelligence Scale for Children (WISC)-III-PI Multiple Choice Information (general knowledge) and Vocabulary Multiple Choice subtests,25 and two non-verbal reasoning tests, the WISC-III-UK Picture Completion25 and Raven's Standard and Advanced Progressive Matrices.26, 27 We created a g score with equal weights for the four tests by summing their standardized scores. Further information about g as measured in the Twins Early Development Study can be found elsewhere.24, 28 The Twins Early Development Study provides data for 1518 MZ pairs and 2500 DZ pairs (1293 same-sex and 1207 opposite-sex pairs). The average age of the sample was 11.57 years (range=10.08–13.74).
The Minnesota Center for Twin and Family Research (MCTFR)29 provides data for 1177 MZ pairs and 679 same-sex DZ pairs. Twins were ascertained from Minnesota state birth records spanning the years 1972 through 1994 and recruited to participate in a broad-ranging longitudinal study of psychological development. At their intake into the study, twins were either 11 or 17 years of age. Twins with known mental retardation or a developmental disability that would have precluded their completing the intensive in-person MCTFR assessments as well as twins living more than a day's drive from the laboratories in Minneapolis were excluded from participation. Otherwise, the MCTFR sample is broadly representative of twin pairs born in Minnesota for the birth years sampled, with little evidence of participation bias in terms of parental education, socioeconomic status or mental health.30
The intelligence quotients (IQs) used in this study were determined from the twins’ intake assessment, at which time they completed an abbreviated version of the Wechsler Adult Intelligence Scale-Revised (WAIS-R) if they were from the older cohort or the WISC-Revised (WISC-R) if they were from the younger cohort. In both cases, the abbreviated Wechsler assessment consisted of two verbal subtests (Information and Vocabulary) and two performance subtests (Block Design and Picture Arrangement), selected because performance on these four subtests correlates with a value greater than 0.90 with IQ determined by all Wechsler subtests. Performance on the four subtests was prorated and norms for the Wechsler tests used to compute IQs.
Zygosity was initially assessed using the consensus of three indicators: a standard zygosity questionnaire completed by the twins’ parents before the intake assessment; a diagnosis of zygosity based on trained project staff perception of physical similarity at the time of intake assessment; and an algorithm based on ponderal index, cephalic index and fingerprint ridge count. If there was any discrepancy among these three methods, zygosity was determined by evaluating 12 blood group antigens from blood samples. In an analysis of 50 twin pairs where the questionnaire, project staff assessment and physical similarity algorithm all agreed, the resulting zygosity determination was always confirmed in the serological analyses. The average age of the sample was 13 years (range=11–17).
The data are provided by the Institute for Behavior Genetics from 390 twin pairs participating in the Longitudinal Twin Study (LTS), 696 pairs from the Colorado Twin Study (CTS) and 1779 pairs from the Colorado Learning Disabilities Research Center. The LTS and CTS are maintained in a single database, with no overlap in subjects. The Colorado Learning Disabilities Research Center subjects were independently ascertained and could include overlapping subjects. For the purposes of this analysis, a search was made for all doubly ascertained families, and all known duplicates have been removed from the original LTS and CTS samples; all data for these analyses are for unique individuals, with one test per individual. The study samples are 90% White, with approximately equal representation of male (49%) and female (51%) individuals.
The LTS sample was collected from 1984 onwards, with repeated testing from about 1 year of age through, currently, their early twenties. Ascertainment was through a search of birth records made available by the Colorado Department of Health. A total of 483 pairs have participated at some time in the study, with 412 currently active. IQ testing at approximately 16 years of age used the WAIS-III.31 The data from this test were used if available. If not, the next latest test was used: WISC-III32 at 12 years of age or WISC-R at 7 years of age.33 Thus, age of testing ranged from 6 to 19 years, with a mean age of 15.4 years. Zygosity was determined initially using a modified version of the Nichols and Bilbro34 questionnaire. Subsequently, these assignments were confirmed using 11 highly polymorphic short tandem repeat markers (the Institute for Behavior Genetics zygosity panel) in 92% of the sample for whom DNA has been collected. Further details of the ascertainment and history of the study are provided in Rhea et al.35
The CTS sample was recruited as adolescents through a combination of historical birth records and the use of school records. In all, 170 of 176 school districts participated at some level. IQ testing used the Vocabulary and Block Design subtests of the age-appropriate WISC-III or WAIS-III. Age of testing ranged from 12 to 25 years, with a mean age of 17.1 years. In almost all cases, zygosity is determined by genotyping the Institute for Behavior Genetics zygosity panel. Further details of the ascertainment and history of the study are provided in Rhea et al.35 To estimate full-scale IQ scores from the two subtests administered, a regression equation of full-scale IQ on the subtests was computed in the LTS sample and applied to the CTS sample.
The Colorado Learning Disabilities Research Center sample participated in either the Colorado Reading Project36, 37 or the Colorado Learning Disabilities Research Center.38 Twin pairs were ascertained through 27 cooperating school districts in the state of Colorado. Twin pairs included those in which at least one member had a school history of reading problems and twin pairs in which neither member had a school history of reading problems. Although this means that the sample is not strictly unselected, the IQ distribution shows no signs of departure from normality, with mean=105.6, s.d.=13.2, skewness=0.00 and kurtosis=0.11. IQ tests used either the WISC-R or the WAIS-R. The twins were reared in primarily English-speaking, middle-class homes, and were between 8 and 20 years of age at the time of testing, with a mean age of 11 years. The average age of the combined Colorado sample was 13.12 years (range=6–25).
The Twin Cognition Study39 provides data for 338 MZ pairs and 513 DZ pairs (265 same-sex and 248 opposite-sex pairs), recruited through primary and secondary schools in the greater Brisbane area.40 Zygosity for DZ same-sex twin pairs was established by typing nine independent DNA microsatellite markers (AmpF1STR Profiler Plus Amplification kit; Applied Biosystems, Foster City, CA, USA; polymorphism information content>0.7) and cross-checked with blood group results (ABO, MNS and Rh; blood typing provided by Australian Red Cross Blood Service, Brisbane, Australia) and phenotypic data (hair, skin and eye color). The overall probability of correct zygosity assignment was greater than 99.9%.41 Parental report indicated no significant head injury, neurological or psychiatric conditions, history of substance abuse/dependence or taking of medications with significant central nervous system effects. Informed written consent was obtained from the twins and their parent or guardian, and ethical approval was obtained from the Human Research Ethics Committee, Queensland Institute of Medical Research. Twins were tested as close as possible to their sixteenth birthday on three verbal (Information, Arithmetic and Vocabulary) and two performance (Spatial and Object assembly) subtests from the Multidimensional Aptitude Battery,42 in addition to other measures of cognitive ability. The Multidimensional Aptitude Battery is a computerized test, based on the WAIS-R,43 that generates scores for full-scale IQ based on Canadian normative data. For a full description of the test battery as measured in the Twin Cognition Study, see Luciano et al.44 The average age of the sample was 16.00 years (range=15–22).
The Netherlands Twin Register45 provides data for 434 MZ pairs and 517 DZ pairs (337 same-sex and 180 opposite-sex pairs). IQ data were available in twins who had taken part in studies on cognition at 6, 12 and 18 years of age46 or as adults. At the age of 6 years, twins were tested as part of studies on the development of cognition executive function and neuropsychological development.47 IQ data at 12 years of age were collected in twins who took part in developmental studies of cognition and brain development.48 At the age of 18 years, the twins took part in studies of brain development and cognition.49 The adult twins had also taken part in a study of brain function and IQ.50 The large majority of same-sex twins’ zygosity was based on typing of DNA or blood group polymorphisms. For the other pairs, zygosity was based on a series of physical similarity questions, answered by the mother of twins repeatedly over time.51 IQ testing was carried out with standard, age-appropriate IQ tests (see Boomsma et al.46 and Posthuma et al.50). The average age of the sample was 17.99 years (range=5.67–71.03).
The twin method
The twin method uses MZ (identical) and DZ (fraternal) twin intraclass correlations to dissect phenotypic variance into genetic and environmental sources.11 MZ twins are 100% genetically similar, whereas DZ twins are on average only 50% similar for segregating genes. Environmental variance can be dissected into shared environmental effects (that is, environmental effects that make members of the same family more similar) and non-shared environmental effects (that is, environmental effects that do not make members of the same family similar). These genetic and environmental effects are commonly represented as A, C and E. ‘A’ is the additive genetic effect size, also known as narrow heritability. Heritability can be estimated by doubling the difference between MZ and DZ twin correlations. Shared environment (C, for effects common to family members) refers to variance that makes MZ and DZ twins similar beyond twin similarity explained by additive genetic effects. C can be estimated by subtracting the estimate of heritability from the MZ correlation. In addition, non-shared environmental influences (E) can be estimated from the total variance not shared by MZ twins; non-shared environmental influences are the only influences deemed to make MZ twins different. E also includes measurement error. Twin intraclass correlations were calculated that index the proportion of total variance due to between-pair variance.52 Rough estimates of genetic (A) and environmental influences (C and E) can be calculated from these twin correlations.
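The rough estimates described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the analysis code used in the study; the function name `falconer_ace` is hypothetical, and the input correlations are the average MZ and DZ values quoted in the introduction (0.86 and 0.60).

```python
def falconer_ace(r_mz: float, r_dz: float) -> dict:
    """Rough ACE estimates from MZ and DZ twin intraclass correlations."""
    a = 2.0 * (r_mz - r_dz)  # A: double the MZ-DZ difference
    c = r_mz - a             # C: MZ similarity not explained by A
    e = 1.0 - r_mz           # E: variance not shared by MZ twins (incl. error)
    return {"A": a, "C": c, "E": e}


# Average correlations from the earlier twin literature cited in the text.
estimates = falconer_ace(r_mz=0.86, r_dz=0.60)
print({name: round(value, 2) for name, value in estimates.items()})
# prints {'A': 0.52, 'C': 0.34, 'E': 0.14}
```

Because the three components are defined so that A + C + E = 1, the estimates are proportions of the standardized total variance.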
A more comprehensive and precise way of estimating the ACE parameters is maximum-likelihood model-fitting analysis,53 which provides estimates of genetic and environmental effect sizes that make assumptions explicit, tests the fit of the entire model to the data, tests the relative fit of alternative models and provides confidence intervals for the parameter estimates. Discussion of the use of maximum-likelihood model-fitting analyses can be found elsewhere.11, 53, 54, 55 Mx software for structural equation modeling was used to perform standard model-fitting analyses with raw data.55
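To make the idea of raw-data maximum-likelihood fitting concrete, the following Python sketch shows the quantity that a univariate ACE model evaluates: the model-implied covariance matrix for MZ and DZ pairs, and the resulting −2 log-likelihood of standardized scores under a bivariate normal distribution. It is a simplified illustration under stated assumptions (mean-zero scores, no missing data, no optimizer), not a reproduction of the Mx scripts; all function names and the toy data are hypothetical.

```python
import math


def expected_cov(a2, c2, e2, zygosity):
    """Model-implied 2x2 covariance matrix of a twin pair under ACE."""
    genetic_sharing = 1.0 if zygosity == "MZ" else 0.5
    variance = a2 + c2 + e2
    covariance = genetic_sharing * a2 + c2
    return [[variance, covariance], [covariance, variance]]


def pair_loglik(x1, x2, cov):
    """Bivariate-normal log-density of one pair's mean-zero scores."""
    v, k = cov[0][0], cov[0][1]
    det = v * v - k * k
    quad = (v * x1 * x1 - 2.0 * k * x1 * x2 + v * x2 * x2) / det
    return -math.log(2.0 * math.pi) - 0.5 * math.log(det) - 0.5 * quad


def neg2_loglik(params, mz_pairs, dz_pairs):
    """-2 log-likelihood of all pairs for given (a2, c2, e2) components."""
    a2, c2, e2 = params
    total = 0.0
    for zygosity, pairs in (("MZ", mz_pairs), ("DZ", dz_pairs)):
        cov = expected_cov(a2, c2, e2, zygosity)
        total += sum(pair_loglik(x1, x2, cov) for x1, x2 in pairs)
    return -2.0 * total


# Made-up standardized scores, for illustration only.
toy_mz = [(0.5, 0.4), (-1.0, -0.8), (1.2, 1.0)]
toy_dz = [(0.5, -0.1), (-1.0, -0.2), (1.2, 0.3)]
fit_ace = neg2_loglik((0.5, 0.2, 0.3), toy_mz, toy_dz)
fit_e_only = neg2_loglik((0.0, 0.0, 1.0), toy_mz, toy_dz)
print(fit_ace, fit_e_only)
```

In practice an optimizer searches for the parameter values that minimize this function, and differences in −2 log-likelihood between nested models are used to compare their fit.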
All measures were standardized to a mean of zero and an s.d. of 1 separately for each sample. Analysis of variance was used to assess differences in means by sex and zygosity. All measures were residualized for age and sex effects using a regression procedure. Standardized residuals were used because the age and sex of twins are perfectly correlated across pairs, and variation within age at the time of testing and variation within sex could contribute to the correlation between twins, and thus be misrepresented as environmental influences shared by the twins.56 Four of the samples (Australia, US Colorado, United Kingdom and the Netherlands) included opposite-sex as well as same-sex DZ twin pairs. We therefore performed preliminary analyses based on sex-limitation models to investigate possible sex differences in etiology. These analyses indicated no significant qualitative differences and therefore we report results here from analyses including opposite-sex as well as same-sex twins. There was a significant quantitative sex difference only in the UK sample, but the difference was small, and the UK sample had the greatest power to detect significant differences. To create the largest possible sample to power the analyses, we combined data from male and female individuals.
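The residualization step can be illustrated with a short Python sketch. It assumes an ordinary least-squares regression of the score on age and sex (coded 0/1) followed by standardization of the residuals; the exact regression specification used in the study is not given in the text, so this is one plausible reading. The helper names and the toy data are hypothetical.

```python
from statistics import mean, pstdev


def solve(matrix, rhs):
    """Solve a small linear system by Gauss-Jordan elimination with pivoting."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def residualize(scores, ages, sexes):
    """Standardized residuals of scores after regressing out age and sex."""
    design = [[1.0, age, sex] for age, sex in zip(ages, sexes)]
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(design, scores)) for i in range(3)]
    beta = solve(xtx, xty)  # intercept, age slope, sex effect
    resid = [y - sum(b * x for b, x in zip(beta, r)) for r, y in zip(design, scores)]
    centre, spread = mean(resid), pstdev(resid)
    return [(r - centre) / spread for r in resid]


# Made-up scores, ages (years) and sex codes, for illustration only.
scores = [98.0, 105.0, 110.0, 101.0, 120.0, 93.0]
ages = [9.0, 10.0, 12.0, 11.0, 17.0, 8.0]
sexes = [0.0, 1.0, 0.0, 1.0, 1.0, 0.0]
z = residualize(scores, ages, sexes)
print([round(value, 2) for value in z])
```

The resulting scores have mean zero and unit s.d. and are uncorrelated with age and sex, so similarity between co-twins that arises only from their shared age or sex does not enter the twin correlations.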
We conducted two sets of analyses on the data from the Genetics of High Cognitive Abilities consortium. First, we analyzed g in each sample, and used a standard heterogeneity model to test for differences between the six samples (referred to as analysis by site). Next, we combined the twins from all of the studies and split them into three age categories, representing childhood, adolescence and young adulthood. The same heterogeneity twin model was used to assess whether there are significant changes in the etiology of g from childhood to adolescence to young adulthood (referred to as analysis by age). Twin intraclass correlations were calculated and standard univariate twin analyses using raw data were conducted in Mx.55 The heterogeneity model provides estimates for each grouping separately and then assesses the significance of heterogeneity across the groups by equating the ACE estimates and measuring the worsening of fit of the reduced model. For the analyses by site, results from this model indicate whether there are significant differences in the etiology of g in different twin studies. For the analyses by age, results from this model indicate whether there are significant increases in heritability of g from childhood to young adulthood.
For the analyses by age we used three age groups that were chosen on the basis of the age distributions of the samples involved and to yield samples of adequate size to provide reasonable power for the analyses. The age groups, referred to as childhood, adolescence and young adulthood, include data from the six different twin studies. The childhood group has a mean age of 9 years (range=4–10 years); the adolescence group has a mean age of 12 years (range=11–13); and the young adulthood group has a mean age of 17 years (range=14–34). For these age analyses, it was not possible to include a fourth group of individuals above the age of 34 years because the Genetics of High Cognitive Abilities consortium includes just 127 pairs older than 34 years, a sample too small to provide adequate power in the twin analyses.
Assortative mating (that is, phenotypic similarity between mates) is likely to occur for g.57, 58 Assortative mating has the effect of inflating the fraternal (DZ) twin correlation. This results in lowered estimates of heritability (A) and increased estimates of the shared environment (C). We were unable to assess the effect of assortative mating on our ACE estimates. However, if there is an effect of assortative mating, it is unlikely to affect the developmental changes in ACE estimates.
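The direction of this bias can be shown with a small numerical sketch in Python. The variance components (A=0.60, C=0.20) and the inflated DZ genetic correlation of 0.60 are hypothetical values chosen only for illustration; they are not estimates from this study.

```python
def twin_correlations(a2, c2, dz_genetic_r):
    """Expected MZ and DZ correlations for given variance components."""
    r_mz = a2 + c2
    r_dz = dz_genetic_r * a2 + c2
    return r_mz, r_dz


def rough_estimates(r_mz, r_dz):
    """Rough A and C estimates that assume a DZ genetic correlation of 0.5."""
    a = 2.0 * (r_mz - r_dz)
    c = r_mz - a
    return a, c


true_a2, true_c2 = 0.60, 0.20  # hypothetical "true" components

# Random mating: DZ twins share 50% of segregating genes on average.
random_a, random_c = rough_estimates(*twin_correlations(true_a2, true_c2, 0.50))

# Assortative mating: DZ genetic similarity exceeds 0.5 (0.60 is made up).
assort_a, assort_c = rough_estimates(*twin_correlations(true_a2, true_c2, 0.60))

print(round(random_a, 2), round(random_c, 2))  # prints 0.6 0.2
print(round(assort_a, 2), round(assort_c, 2))  # prints 0.48 0.32
```

With the inflated DZ correlation, the same underlying components yield a lower estimate of A and a higher estimate of C, which is the pattern described in the text.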
Results
Analyses by site
The means and s.d. for g in the six sites yielded negligible sex and zygosity differences (Supplementary Table 1). Intraclass twin correlations (Table 1) show that for all six sites, MZ correlations are significantly greater than DZ correlations, suggesting genetic influence.
Model-fitting analyses on each sample indicate that genetic, shared and non-shared environmental influences were significant in all samples, apart from the Australian sample where an AE (‘No C’) model provided the best fit (Supplementary Table 2). A standard heterogeneity model was then applied to the data to assess significance of heterogeneity across the six samples. There was significant overall heterogeneity between the six samples (difference in χ²=212.275, difference in d.f.=15, P=6.73 × 10⁻³⁷, Akaike's information criterion=−182.275). The A estimates are correlated with the average age of the samples with the exception of the Netherlands sample, which included twins of a wide age range. Estimates from the equated model were A=0.55 (0.51–0.59); C=0.21 (0.17–0.25); E=0.24 (0.23–0.25).
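The heterogeneity test reported above is a comparison of nested models, in which the worsening of fit is referred to a χ² distribution. The Python sketch below computes the upper-tail probability for the reported difference in χ² using only the standard library. The function `chi2_sf` is a hypothetical helper built on the closed-form series for integer degrees of freedom, and the reading of the 15 degrees of freedom as three parameters equated across five of the six sites is an assumption that is consistent with, but not stated in, the text.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variate with integer df."""
    y = x / 2.0
    if df % 2 == 0:
        # Even df: finite Poisson-type sum.
        terms = (y ** j / math.factorial(j) for j in range(df // 2))
        return math.exp(-y) * sum(terms)
    # Odd df: complementary error function plus a half-integer series.
    steps = (df - 1) // 2
    series = sum(y ** (j + 0.5) / math.gamma(j + 1.5) for j in range(steps))
    return math.erfc(math.sqrt(y)) + math.exp(-y) * series


n_sites, n_params = 6, 3                  # six samples; A, C and E
delta_df = (n_sites - 1) * n_params       # assumed source of the 15 d.f.
p_value = chi2_sf(212.275, delta_df)
print(delta_df, f"{p_value:.2e}")
```

Under these assumptions the computed probability should come out close to the reported P value; the same function can be applied to the single-degree-of-freedom comparisons used when only one parameter is equated across two groups.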
Analyses by age
The means and s.d. for g in the three age groups and in the combined sample yielded negligible sex and zygosity differences, although the differences were statistically significant because of the large sample sizes (Supplementary Table 3). Intraclass twin correlations (Table 2) show that for all three age groups, MZ correlations are significantly greater than DZ correlations, suggesting genetic influence. A rough estimate of heritability from these correlations (doubling the difference between the MZ and DZ twin correlations) suggests a linear increase in the heritability of g from 42% in childhood to 54% in adolescence to 68% in young adulthood.
Model-fitting analysis was used to obtain maximum-likelihood estimates of parameters to test the fit of the model and the relative fit of alternative models using raw data. There were significant genetic and environmental influences for all the three age groups (Table 3). These model-fitting estimates are illustrated in Figure 1 for the three age groups. Similar to the heritability estimates from the twin correlations, the model-fitting heritability estimates (A) indicate a linear increase from 41% in childhood to 55% in adolescence to 66% in young adulthood.
The heterogeneity model was applied to the data to assess significance of differences across the three age groups. Fit statistics from this heterogeneity model are shown in Table 4. There was significant overall heterogeneity between age groups (‘Equate all ACE’ in Table 4) and there was significant heterogeneity specifically for heritability estimates between age groups (‘Equate all A’). Finally, we tested the hypothesis of a linear and significant increase in heritability by testing for differences in A estimates between just two age groups at a time. Heritability was significantly lower in childhood than in adolescence (‘Equate A groups 1 and 2’) and heritability in adolescence was significantly lower than in young adulthood (‘Equate A groups 2 and 3’).
Although our focus is on testing the hypothesis that the heritability of g increases linearly from childhood to young adulthood, an interesting developmental trend also emerged from the environmental results. At all three ages, the best-fitting model was one that included shared environment (C) as well as non-shared environment (E) (Table 3). However, shared environment decreases from childhood (33%) to adolescence (18%) and then remains at that modest level in young adulthood (16%) (Tables 3 and 4). Non-shared environment does not change from childhood (26%) to adolescence (27%) but shows a modest yet significant decline in young adulthood (19%).
Discussion
We have for the first time shown that genetic influence on general cognitive ability increases significantly and linearly from childhood to adolescence to young adulthood. In this section, we consider the implications of this finding.
Finding such a dramatic increase in genetic influence on g during the school years—which by early adulthood accounts for two-thirds of the total variance of g—has far-reaching implications for fields as diverse as molecular genetics, neuroscience and education. For molecular genetics, the developmental increase in heritability implies that it should be easier to identify genes responsible for genetic influence on g by studying adults rather than children. The first genome-wide association study of g has been reported but it yielded inconclusive results, perhaps because the age of its sample was only 7 years.59 More generally, multivariate genetic research indicates that g accounts for nearly all of the genetic variance on diverse cognitive abilities,60 which, taken together with evidence for its substantial heritability and its societal importance, suggests that g is a good target for molecular genetic research.
In relation to neuroscience, the increasing genetic impact of g during development needs to be understood in relation to the development of brain processes that mediate genetic effects on g. These brain pathways between genes and g are not necessarily due to a single general physical (for example, dendritic density), physiological (for example, synaptic plasticity) or psychological (for example, executive function) structure or function. To the contrary, it has been suggested that each gene associated with g affects many such processes (pleiotropy) and many genes affect each process (polygenicity).61 Although tracing the development of such diffuse brain pathways between genes and g is daunting, g genes could boost a systems approach to the brain by opening tiny windows through which we can view diverse brain networks that are integrated functionally in their effect on our ability to reason, to solve problems and to learn.
The educational implications of this finding come from the answer to the question posed by this research: Why, despite life's ‘slings and arrows of outrageous fortune’, do genetically driven differences increasingly account for differences in general cognitive ability during the school years? It is possible that heritability increases as more genes come into play as the brain undergoes its major transitions from infancy to childhood and again during adolescence.62 However, longitudinal genetic research indicates that genes largely contribute to continuity rather than change in g during the school years.14, 22, 63, 64, 65 We suggest that the explanation for the developmental increase in the heritability of g lies with genotype–environment correlation: as children grow up, they increasingly select, modify and even create their own experiences in part on the basis of their genetic propensities.66, 67, 68, 69 This leads to an active view of experiences relevant to cognitive development, including educational experiences, in which children make their own environments that not only reflect but also accentuate their genetic differences.
Acknowledgements
The Genetics of High Cognitive Abilities (GHCA) consortium is supported by a grant from the John Templeton Foundation (no. 13575). The opinions expressed in this report are those of the authors and do not necessarily reflect the views of the John Templeton Foundation. We thank Andrew McMillan for his support with data management. Support obtained for the GHCA consortium members’ twin studies is as follows. Western Reserve Reading Project (Ohio): US National Institute of Child Health and Human Development (HD038075 and HD046167). Twins Early Development Study (United Kingdom): UK Medical Research Council (G0500079) and the US National Institute of Child Health and Human Development (HD044454 and HD046167). Minnesota Twin Family Study (USA): USPHS grants AA009367, R01 DA005147 and R01 DA013240. Colorado Twin Studies (USA), LTS: HD19802, HD010333, HD18426, MH043899 and the MacArthur Foundation; CTS: VA1296.07.1629B and DA011015; CLDRC: HD11681 and HD027802. Twin Cognition Study (Australia): Australian Research Council (A7960034, A79906588, A79801419, DP0212016 and DP0343921) and The Human Frontier Science Program (RG0154.1998-B). Netherlands Twin Register: Dutch Organization for Scientific Research (NWO 051.02.060, NWO 480-04-004, NWO 575-25-012 and NWO/SPI 56-464-14192) and Human Frontiers of Science Program (RG0154/1998-B). D Posthuma is supported by NWO/MaGW VIDI-016-065-318.
Supplementary Information accompanies the paper on the Molecular Psychiatry website (http://www.nature.com/mp)