We are on the brink of exciting discoveries into the molecular genetic underpinnings of autism spectrum disorder. Overwhelming evidence of genetic involvement coupled with increased societal attention to the disorder has drawn in more researchers and more research funding. Autism is a strongly genetic yet strikingly complex disorder, in which evidence from different cases supports chromosomal disorders, rare single gene mutations, and multiplicative effects of common gene variants. With more and more interesting yet sometimes divergent findings emerging every year, it is tempting to view these initial molecular studies as so much noise, but the data have also started to coalesce in certain areas. In particular, recent studies in families with autism spectrum disorder have identified uncommon occurrences of a novel genetic syndrome caused by disruptions of the NLGN4 gene on chromosome Xp22. Previous work had identified another uncommon syndrome that is caused by maternal duplications of the chromosome 15q11–13 region. We highlight other converging findings, point toward those areas most likely to yield results, and emphasize the contributions of multiple approaches to identifying the genes of interest.
Autistic disorder (OMIM 209850) is a developmental disorder characterized by three areas of abnormality: impairment in social interaction, impairment in communication, and restricted and repetitive patterns of interest or behavior.1 Autistic disorder is one of a few diagnoses within the category of pervasive developmental disorders (PDD), which also includes Rett syndrome (RTT), PDD not otherwise specified (NOS), and Asperger syndrome, which is characterized by relatively spared communication despite abnormalities in the other two areas. With the exception of RTT, the PDDs are not distinct or categorical disorders but instead represent positions on a spectrum of social and communication impairment and behavioral restriction and repetition.2 Recent research has examined autistic traits in a population of twins and found that social impairment actually follows a unimodal distribution without a clear demarcation to separate cases of the disorder.3 For this reason, future discussion of autism may more appropriately refer to autism spectrum disorder (ASD).
ASD has a large genetic component with complex inheritance. The prevalence is 0.1–0.2%4 for narrow diagnosis of autistic disorder and 0.6% for ASD. The prevalence of autistic disorder is approximately four times higher in males than in females, with an even higher ratio in milder forms of ASD. Some concern has been raised about a possible increase in prevalence, but changes in diagnostic methodology and ascertainment strategy complicate comparisons across time.5 Twin studies show a 60–91% concordance rate in monozygotic twins, depending on whether a narrow or broad phenotype is considered, in contrast to no observations of concordance in dizygotic twins under narrow phenotypic definition and 10% concordance under broader phenotypic definition.6 Sibling recurrence rate has been estimated to be 4.5%.7 This pattern of sharply increasing risk for first-degree relatives and monozygotic twins relative to the population prevalence does not fit a simple dominant or recessive model, but indicates the involvement of multiple genes interacting with one another to lead to disease susceptibility.
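The argument in the preceding paragraph can be made concrete with relative risk ratios, where the ratio for a class of relatives is their risk divided by the population prevalence. The following is a minimal sketch using only the figures quoted above (sibling recurrence of 4.5%, the lower monozygotic concordance of 60%, and a narrow-diagnosis prevalence of 0.1–0.2%). The bound of 4 on the excess-risk ratio for a single locus is an assumption drawn from standard variance-components theory, not a result stated in this review, and the function names are illustrative. Treating twin concordance as a recurrence risk is also a simplification.

```python
# Relative risk ratio: risk to a class of relatives divided by population prevalence.
# All input figures come from the paragraph above; SINGLE_LOCUS_MAX is an
# assumption from standard single-locus variance-components theory.

SIB_RECURRENCE = 0.045    # sibling recurrence rate (4.5%)
MZ_CONCORDANCE = 0.60     # lower end of the 60-91% monozygotic concordance range
SINGLE_LOCUS_MAX = 4.0    # assumed maximum of (lambda_MZ - 1) / (lambda_sib - 1)


def risk_ratio(risk_in_relatives: float, prevalence: float) -> float:
    """Return the relative risk ratio (lambda) for a class of relatives."""
    return risk_in_relatives / prevalence


def excess_ratio(prevalence: float) -> float:
    """Ratio of excess risk in monozygotic twins to excess risk in siblings."""
    lam_sib = risk_ratio(SIB_RECURRENCE, prevalence)
    lam_mz = risk_ratio(MZ_CONCORDANCE, prevalence)
    return (lam_mz - 1.0) / (lam_sib - 1.0)


if __name__ == "__main__":
    for prevalence in (0.001, 0.002):  # 0.1% and 0.2% narrow-diagnosis prevalence
        lam_sib = risk_ratio(SIB_RECURRENCE, prevalence)
        lam_mz = risk_ratio(MZ_CONCORDANCE, prevalence)
        ratio = excess_ratio(prevalence)
        print(f"prevalence={prevalence:.3f} lambda_sib={lam_sib:.1f} "
              f"lambda_MZ={lam_mz:.0f} excess ratio={ratio:.1f} "
              f"exceeds single-locus bound: {ratio > SINGLE_LOCUS_MAX}")
```

Under these assumptions the excess risk in monozygotic twins is far more than four times the excess risk in siblings at either prevalence figure, which is the quantitative form of the statement that the pattern does not fit a simple single-gene model.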
Direct approaches to identifying ASD susceptibility genes include three overlapping methodologies: chromosomal methods, including karyotyping and fluorescent in situ hybridization (FISH); linkage studies, including genome screens in affected sibling pairs; and gene association studies, including candidate gene studies. Indirect approaches may also be helpful. Less complex genetic diseases with some shared symptoms, such as RTT or Fragile X syndrome, may result from disruption of gene or protein systems that may also be disrupted in other cases of ASD. Likewise, animal models may provide clues to genes and protein systems important in relevant behaviors, but it is unlikely that animal behaviors will fully parallel the human disorder. Postmortem and neuroimaging studies may inform genetic approaches, and, conversely, genetic studies will ideally be followed up with postmortem and neuroimaging studies, particularly using neuroimaging methods capable of studying neurochemical function in vivo.
Within these approaches to ASD genetics, converging clinical and molecular methodology will be needed to produce insight into the underlying neurobiology. Clinical researchers are playing a critical role in sequentially parsing ASD into subgroups and symptom clusters using approaches such as factor and cluster analysis. Heterogeneity may be particularly important in considering divergent findings between studies using singleplex families and those using multiplex families, since the latter are uncommon and may reflect a different grouping of gene variants despite a convergent clinical picture.8 Finally, since gene variants in complex genetic disorders are likely to be both subtle and diverse,9 it is particularly important to fully characterize gene variation and to use haplotype analysis to clarify and narrow regions of interest.
To emphasize synergy and convergence in molecular genetic methodologies, the review begins with a description of ongoing research in two example regions of the genome, chromosomes 7q and 15q11–13. Key concepts and findings from each of the various individual methodologies are then reviewed in turn, including molecular findings in relatively less complex and chromosomal disorders, important concepts and findings in complex genetics, and finally, parallel methodologies from animal, epigenetic, gene expression, and proteomic work that may yield important insights for ASD.
Converging methodologies: examples
Following the first few genome screens in ASD, most excitement focused on a region of chromosome 7q that approached genome-wide significance and was identified in multiple studies.10, 11, 12, 13, 14, 15 The area implicated is broad, with some suggestion of two distinct peaks, 7q21–22 and 7q32–36. Limited evidence supports a role for paternal imprinting in this region.13 Research in this region demonstrates the intersections of linkage with chromosomal and association methods. Furthermore, simple genetic disorders and animal models have informed the selection of regional candidate genes. Finally, for one candidate gene, RELN, follow-up studies of protein and mRNA are being pursued in ASD brains.
Some of the most promising leads in the region have been a few chromosomal anomalies, including both inversions and translocations, which were reported near 7q31.13, 16, 17, 18 Mapping of one of these translocation breakpoints identified the disrupted gene as RAY1/ST7 (FAM4A1). RAY1/ST7 is a putative tumor-suppressor gene that shows a very complex genomic organization with multiple splice-variant isoforms.19 Thus far, mutation screening of RAY1/ST7 in individuals with ASD has revealed no clear evidence of involvement in a larger group of patients, but further study is warranted, including mapping of the other breakpoints in this region.
Investigators have also tested association at a number of candidate genes in this region, as has happened in most chromosomal regions with any evidence for linkage. While they may be positioned near the region of interest, most of these genes have also been selected on the basis of findings in simple genetic disorders or animal models. While little evidence currently supports any candidate gene in this region, it is instructive to consider how three of these regional candidate genes were chosen and studied. A number of other genes have also been studied without evidence of involvement,18, 20, 21, 22, 23 and investigators must be careful to statistically correct for multiple candidate genes considered in any given ASD sample.
The most compelling candidate gene, FOXP2 on 7q31, is disrupted in one family with an autosomal dominant form of specific language impairment (SLI).24 Although initial studies did not implicate the coding region of FOXP2 in ASD,25, 26 further study is warranted after association to potential gene regulatory regions was reported in a larger sample of patients with SLI.27
Another candidate gene, WNT2 on 7q31, encodes a member of the same signaling pathway as DVL1. Mice lacking DVL1 show abnormal sensory gating and reduced social interaction.28 Mutations leading to amino-acid changes were detected in WNT2 in two families with autism, with one affected parent transmitting the mutation to two affected children.29 A modest association was also reported for a 3′ polymorphism in a sample of ASD sibling pairs,29 but this has not yet been replicated.30, 31
A third candidate gene, RELN on 7q22, was originally identified in the mutant reeler mouse line32 and subsequently found in human autosomal recessive lissencephaly, a disorder of failed neuronal migration.33 An initial study in autistic disorder reported preferential transmission of ‘long’ alleles of a 5′ repeat polymorphism.34 A replication study found that overall transmission distortion was not significant but that ‘long’ alleles were preferentially transmitted.35 Four additional studies in large samples found no evidence for association with RELN.31, 36, 37, 38 Abnormalities in expression of the corresponding protein, reelin, are being investigated in postmortem brain studies of autistic disorder.32
Data on multiple fronts lead to the conclusion that chromosome 7q21–36 contains one or more gene variants that cause ASD susceptibility. Study of this region has largely followed the traditional progression of broad linkage findings being refined and then followed up by association and mutation screening, but a notable role has also been played by translocations and information from animal models and simple genetics. We can expect that continued efforts on these multiple fronts will lead to the responsible variant(s) within the next few years. Subsequent studies of gene expression and protein function in humans and animal models will lead us to a greater understanding of ASD pathophysiology and hopefully therapeutics.
Chromosome 15q11–13, shown in Figure 1,39 offers a parallel example of efforts to find susceptibility genes, with most information coming from chromosomal and epigenetic findings with additional efforts on the linkage and association fronts. Maternal interstitial duplication or supernumerary inverted duplication of 15q11–13 is the most frequent chromosomal anomaly in ASD, accounting for 1–3% of patients.40, 41 Initial studies to characterize the phenotype have found variation among affected people including mental retardation, motor coordination problems, seizure disorder, and impairments in attention, communication, and social function (some but not all with ASD or attention deficit hyperactivity disorder (ADHD)).42, 43
This duplication is the converse of deletions of the same region in Angelman syndrome (AS). Four mechanisms underlie AS: mutation in UBE3A,44 a gene important in protein degradation; deletion of the maternal chromosome 15q11–13 region; complete lack of the maternal 15th chromosome (paternal uniparental disomy), or imprinting mutations (in which the switch in methylation pattern of a mother's chromosome inherited from her father does not take place upon transmission to her child). AS is characterized by moderate to profound mental retardation, ataxia, hypotonia, characteristic facies, epilepsy, absence of speech, and predominant smiling and laughter. A population-based study showed a high rate of ASD in AS.45 AS is thought to result largely from lack of expression of the maternally expressed UBE3A gene in the brain; however, patients with deletions have more severe phenotypes, suggesting that absence of other genes, perhaps including a GABAA receptor gene cluster, may contribute to the more severe phenotype in deletion cases.46
Two questions are central to understanding the importance of maternally inherited 15q11–13 duplications in ASD. First, how do these duplications lead to an increased risk for ASD? Second, are gene variants in this region implicated in the larger group of ASD patients who do not have duplications? Efforts to answer the first question have centered on imprinted genes that may show a shift in expression when duplicated. As might be expected, expression of maternal UBE3A transcripts is increased in patients with duplications.47 Epigenetic studies in this region recently identified ATP10C as another imprinted gene that may contribute to the duplication phenotype.48, 49 Additional imprinted transcripts may also be important in the duplications, including antisense transcripts that could regulate gene expression.50
Numerous studies have sought to connect the duplication phenotype to the larger group of patients with ASD. Paralleling AS, at least two possibilities exist for an additional role for chromosome 15q11–13 variation in ASD. First, imprinting may be disrupted, requiring investigators to directly study methylation patterns in a sample of patients with ASD, likely including postmortem brain studies. Second, a gene that is expressed abnormally in the duplication syndrome may also show altered expression or function in ASD. Thus far, this second possibility has eluded the simplest explanations, with no ASD coding region mutations detected in the two maternally expressed genes, UBE3A or ATP10C.51, 52 Association studies in the maternal expression domain have been largely negative, with a couple of positive studies. These studies have been limited in power, particularly power to detect the parent-of-origin effects expected within an imprinted domain. One sibling pair study found association at D15S122, a microsatellite marker located in the 5′ end of UBE3A.53 A follow-up study found association at multiple SNPs within ATP10C54 with greater evidence within trios than within sibling pairs.
The complexity of chromosome 15q11–13 gene regulation extends the area of interest outside of the maternal expression domain. An initial study identified significant association at GABRB3 155CA-2, a polymorphism within the γ-aminobutyric acid (GABA) A receptor β3 subunit (GABRB3) gene.55 This finding has been replicated in one study,56 but not in others. A microsatellite marker, GABRB3, which is 3′ to the GABRB3 gene, has also been implicated in one association study.57 Linkage results in this region have also been quite inconsistent. The most promising finding has been modest evidence for linkage at D15S511, 5′ of GABRB3, which increased in a subgroup of sibling pairs that had relative sparing of one area of cognitive function relative to other areas.39 Limited evidence for linkage has also been detected in this region without subject subgrouping.58, 59
In conclusion, the chromosome 15q11–13 region is the source of a duplication syndrome that cannot be clinically differentiated within ASD, and whose features overlap with those of a potential subgroup of patients with ASD who have more mood instability, more anxiety related to change, more pronounced unevenness in development, and a higher risk for seizures in adolescence. Research into the pathophysiology of the syndrome proceeds in parallel to study of this region in the larger ASD population. Initial association and linkage efforts, although lacking sufficient power, suggest that we are likely to find additional important variants. While research proceeds into the complex epigenetic phenomena regulating gene expression, current evidence supports intense scrutiny of this region using a broader, denser net of polymorphisms including haplotype analysis and an increased sample size to generate adequate power. As a parallel approach, the duplication syndrome could be used to develop a transgenic mouse model that would generate significant insight into pathophysiology.
Relatively less complex genetic and chromosomal disorders
A few relatively less complex genetic or chromosomal disorders include prominent neuropsychiatric symptoms relevant to ASD. Since the etiology of these disorders is partially known, investigators can hope to gain insight into the more complex genetics of ASD by studying the molecular and brain system abnormalities both in humans with these diseases and in corresponding lower mammalian disease models. A number of other simple genetic diseases primarily affect other systems but also commonly share some features of autism spectrum disorder. For example, Smith–Lemli–Opitz syndrome (SLOS) results from a defect in cholesterol synthesis and involves structural deficits throughout the body, but many SLOS patients also have ASD symptoms that may relate to abnormal serotonergic development.60 More single gene or chromosomal disorders will likely be identified from within the heterogeneous syndrome of ASD.
Maternal chromosome 15q11–13 duplications and triplications are the most common relatively less complex genetic syndrome identified to date in samples of patients with autism spectrum disorders. This syndrome is discussed above as part of example II.
Mutations of the NLGN4 gene define a new, relatively less complex genetic syndrome identified within families with ASD.61 The NLGN4 gene is located on chromosome Xp22.33, in an X-specific region near the junction of the pseudoautosomal region. It lies within a region where deletions have been identified in several patients with autism.62 The NLGN4 gene product has not been studied extensively, but it shares about 70% homology with the rest of the neuroligin family that complexes with neurexin to facilitate synaptic formation and function.63, 64 It has a counterpart NLGN4Y on the Y chromosome that is expressed in a similar pattern within the male brain.61
The first truncation mutation identified in NLGN4 was found in two brothers with ASD as well as their unaffected mother,61 but no other family members had the mutation. A different NLGN4 truncation mutation was detected in a large family with a pattern of X-linked mild mental retardation and a minority of affected members with ASD.65 Thus far, every male identified as having a mutation disrupting NLGN4 has manifested a phenotype with variable expression ranging from mental retardation to ASD without mental retardation. Complete penetrance for autism and/or mental retardation identifies this as a relatively less complex genetic syndrome related to autism; however other genetic factors may contribute to the variability in phenotypic expression.
Coding region mutations in NLGN4 appear to be an uncommon cause of autism and X-linked mental retardation, since hundreds of other subjects have been screened without identifying additional mutations.61, 65 Association studies and larger-scale resequencing efforts will be crucial to evaluate whether gene regulatory region or other variants in NLGN4 might be involved in a larger group of patients with ASD. Even if only a few families are affected by perturbations of this gene, further study of the neuroligin protein system and its impact on the developing nervous system may reveal important information about pathophysiology leading to ASD and/or mental retardation.
Rett syndrome (RTT), one of the DSM-IV PDDs, is a rare X-linked disorder that is almost always sporadic. Girls with classic RTT lose speech and purposeful hand movements at 6–18 months and subsequently progress to develop microcephaly, seizures, ataxia, stereotypic hand movements, abnormal breathing patterns, and autistic behavior. Mutations in MECP2, the gene encoding methyl-CpG binding protein 2 (MeCP2), were recently identified as the cause of RTT.66 The MeCP2 protein normally binds methylated CpG dinucleotides, forming part of a complex that represses gene expression. Therefore, RTT must somehow be a result of inappropriate gene overexpression at some time and location during development. Heterozygous female mecp2 knockout mice develop symptoms after reaching maturity, which raises the possibility that the genetic defect could affect brain stability rather than development67, 68 and is consistent with the finding that MeCP2 protein is expressed after neurons reach a certain level of maturity.69
Fragile X Syndrome (FRAXA) is the prototypical genetic syndrome caused by an expanding trinucleotide repeat sequence. The typical clinical picture in FRAXA includes mental retardation, macro-orchidism, large ears, prominent jaw, and high-pitched speech. Greatly increased rates of ASD and ADHD are also observed in FRAXA. In contrast, the rate of FRAXA in the population of patients with autism is very low. Within neurons, the FMR protein (FMRP) interacts with mRNA and ribosomes, suggesting a role in regulating protein synthesis.70 FMRP is heavily synthesized in dendritic spines in response to synaptic activity, and abnormal dendritic spine size and shape have been noted in FRAXA patients and fmr1 knockout mice.71 These abnormalities may correspond to an abnormal postsynaptic response that weakens synaptic connections.72
Turner syndrome affects females who possess only one copy of the X chromosome (45,X). The typical clinical features include short stature, webbing of the posterior neck, increased carrying angle of the arms, sternal deformities, and streak ovaries, as well as variable social and cognitive impairments and an elevated risk of ASD and ADHD. Analysis of parental origin of the X chromosome revealed that girls who received an X chromosome from their mother (45, Xm) (and therefore did not receive one from their father) had significantly worse social cognition, behavioral inhibition and verbal and visuospatial memory than those who received their X chromosome from their father (45, Xp).73 The parent-of-origin effect suggests an imprinted locus (a gene on the X chromosome expressed only when inherited from the father) that could lead to differences between male and female psychosocial development, since males do not receive an X chromosome from their father. Such an effect could be partially responsible for male predominance of ASD.
Concepts and contributions in complex genetics
Locus heterogeneity means that defects in different loci, or genes, may cause the same phenotype. The concept arose in relatively simple genetics, where, for example, defects in one of a few different genes can cause hereditary nonpolyposis colorectal cancer (HNPCC).74 This idea becomes much more complicated in complex genetic disorders, where different (likely overlapping) clusters of genetic variants may cause susceptibility to disease. Various methods can be devised to reduce locus heterogeneity within a genetic sample, including subgrouping or quantitative trait analysis as described below. Population isolates may also contain fewer different susceptibility variants and have been helpful in other complex genetic diseases.75
Allelic heterogeneity means that different variants in the same gene may lead to different patterns of genetic disease. This concept also arose in simple genetic disorders, where different mutations in the same gene can lead to different phenotype patterns. For example, many different mutations in a single gene (CFTR) have been described in cystic fibrosis, some of which are associated with particularly mild forms of the disorder or may lead to more pancreatic than lung findings.76 Similar patterns are emerging at the MECP2 gene in RTT and X-linked mental retardation.77 As rare and common variants in relevant genes (eg the serotonin transporter (SLC6A4)) are better understood, allelic heterogeneity may be better delineated in complex genetic disorders. In addition to being important in contributing to divergent phenotypes, multiple alleles within a single gene may make different quantitative and qualitative contributions to polygenic susceptibility to a single phenotype.
Clinical heterogeneity, potential endophenotypes and quantitative trait analysis
ASD itself also exhibits significant clinical heterogeneity. Recent advances in diagnosis, such as the Autism Diagnostic Interview-Revised (ADI-R)78 and the Autism Diagnostic Observation Schedule (ADOS),79 may reduce uncertainty in diagnosis, but considerable variability remains. Symptoms and signs, rather than etiologies, define psychiatric syndromes such as autism, as well as other medical syndromes, including diabetes and inflammatory bowel disease. Each gene may make a somewhat different contribution to the disorder, with gene X important in social cognition and gene Y important in language acquisition. When clustering of risk alleles reaches a certain threshold, an individual is at increased risk of developing the disorder. A subthreshold number of risk alleles may result in the broader autism phenotype identified in family members of patients with autism.80, 81
Endophenotypes are measurable components related to a syndrome, including anatomical, biochemical, neuropsychological or other data.82 A quantifiable trait offers a number of potential advantages over the diagnostic category itself. The trait might be measurable with more reliability and validity than is possible for the disorder. Inheritance of a trait may correspond to fewer genes. A trait may capture the true nature of a disorder as a spectrum and could even expand the number of subjects by tracing it through family members not affected with the full disorder. Finally, statistical methods may be more powerful for quantitative than for dichotomous variables. One example of the endophenotype approach is in schizophrenia, where abnormalities in the P50 auditory evoked potential response have been associated with polymorphisms in CHRNA7.83
Initial analyses in ASD have primarily considered the endophenotype of general language impairment. A number of other potential quantitative traits would be of particular interest in autism, including most prominently head circumference and whole blood serotonin, but also several others: level of intellectual functioning, degree of social or communication impairment, presence of seizure disorder, dysmorphology, savant abilities, and restrictive and repetitive behaviors.
An increased rate of macrocephaly in autistic disorder was mentioned in the initial description.84 More recent observers note that most children with autism are born with normal head circumference and show an increased rate of growth during early childhood;85 however only about 20% of children with autistic disorder meet criteria for macrocephaly.86 The increased rate of growth in head circumference appears to be most dramatic in the first year of life and corresponds to increased growth of the cerebral cortex as measured by MRI.87 A number of questions follow logically from these observations, including whether there is a corresponding behavioral subgroup. Since head circumference is monitored in most countries during pediatrician visits during the first year of life, this trait could quickly be incorporated into genetic analyses in ASD.
Since the first description of hyperserotonemia in autistic disorder,88 numerous studies have identified elevated whole blood or platelet serotonin (5-HT) in about 25% of patients.89 Initial functional imaging studies in autistic disorder suggest that brain serotonin synthesis capacity may be increased.90, 91 More research to characterize the serotonin system, particularly in the periphery, could yield important information about the abnormalities found in ASD. An association mapping study of whole blood serotonin independent of psychiatric disorder revealed suggestive evidence of linkage on chromosome 16p that converges with linkage in ASD, as well as evidence for association at an integrin gene, ITGB3, on 17q.92
Several chromosomal anomalies have increased rates in autism, including maternally inherited duplications of chromosome 15q11–13, AS, Prader–Willi syndrome, Down syndrome, Turner syndrome and deletions of 2q37. None of these chromosomal disorders is found in more than 4% of ASD samples. Phenotypes for some of these anomalies have been characterized outside of ASD diagnoses, and these are detailed in the section on genetic and chromosomal syndromes.
Translocation breakpoint identification studies have identified a number of genes disrupted in single patients with autism, some in regions on 2q and 7q (see above) implicated by linkage studies, but abnormalities in the coding regions of these genes have not been found in a larger population of patients.16, 93, 94, 95, 96, 97 Quite a number of other deletions or rearrangements have also been reported scattered throughout the genome.98 While some of these chromosomal abnormalities may not be causal, mapping these breakpoints has the advantage of clearly identifying a disrupted gene or genes. Translocations or deletions are likely to provide the key for gene identification in some of the regions already implicated by other methods, but chromosomal anomalies remain understudied in ASD.
Genome-wide approaches using linkage statistics identify large chromosomal regions of interest within families with a given phenotype. Investigators generally favor non-parametric approaches that make no assumptions about disease transmission and typically use sibling pairs to identify increased sharing of alleles among affected family members at particular points in the genome.99 Initial genome-wide linkage studies in autistic disorder have been underpowered to detect genes of small effect, and, therefore, findings have been scattered and inconsistent across samples. Susceptibility genes are likely lurking under some of the small linkage peaks that have been reported in one study or another, but until we have the power to reliably detect linkage, it will be difficult to confirm these initial findings. Furthermore, without sufficient power (possibly even in large samples due to heterogeneity), we are certain to miss some regions of the genome that contain susceptibility genes. Analysis within one of the larger samples suggests the involvement of at least 15 and likely many more genes.100 While there is no clear relationship between heritability and gene effect size in complex genetics,101 samples of at least 1000 sibling pairs will be necessary to reliably detect linkage to regions with modest effect sizes in the presence of locus heterogeneity.102 Furthermore, the logical next steps for ASD linkage studies, including analyses of parent-of-origin effects12 or clinical subgroups, will require even larger sample sizes. Pooling the current linkage data and/or samples would offer a running start toward this goal.
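The allele-sharing idea behind affected sibling pair analysis can be sketched with a simple mean test. Under no linkage, a sibling pair shares 0, 1, or 2 alleles identical by descent with probabilities 1/4, 1/2, and 1/4, so the expected number shared is 1 with variance 1/2. The sketch below assumes that sharing is observed without ambiguity at a single marker, which real studies cannot take for granted. The function names, the pair counts, and the per-pair excess sharing value are all hypothetical, and the expected-statistic formula is a rough approximation that reuses the null variance.

```python
import math


def mean_sharing_z(n_share0: int, n_share1: int, n_share2: int) -> float:
    """Z statistic for excess allele sharing among affected sibling pairs.

    Arguments are counts of pairs sharing 0, 1, or 2 alleles identical by
    descent at one marker. Under no linkage the total number of shared
    alleles has mean N and variance N/2 for N pairs.
    """
    n_pairs = n_share0 + n_share1 + n_share2
    if n_pairs == 0:
        raise ValueError("at least one sibling pair is required")
    shared_alleles = n_share1 + 2 * n_share2
    return (shared_alleles - n_pairs) / math.sqrt(n_pairs / 2)


def approx_expected_z(n_pairs: int, excess_per_pair: float) -> float:
    """Rough expected Z if each pair shares (1 + excess_per_pair) alleles on average."""
    return excess_per_pair * math.sqrt(2 * n_pairs)


if __name__ == "__main__":
    # Hypothetical counts, chosen only to illustrate the calculation.
    print(f"Z = {mean_sharing_z(20, 50, 30):.2f}")
    # Hypothetical small excess sharing, to show how the statistic scales with sample size.
    for n in (100, 300, 1000):
        print(f"pairs={n} approximate expected Z={approx_expected_z(n, 0.05):.2f}")
```

Because the expected statistic grows only with the square root of the number of pairs, a locus that produces a small excess in sharing needs a large sample to stand out, which is consistent with the sample sizes discussed above.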
A few regions of interest are emerging across the reported linkage studies. Significant linkage findings have been reported on chromosomes 2q14 and 3q.103 Several other regions of interest with suggestive linkage evidence have emerged in more than one study, including regions on chromosomes 7q (see above), 13q, 16p, and 17q.11, 14, 58, 59, 100, 103, 104, 105
Now that first-pass genome screen data are available in multiple samples, investigators are trying to identify subgroups or quantitative traits that may refine linkage findings in particular regions. Again, these analyses are somewhat limited by low power to detect genes of small effect, but they may be able to clarify existing linkage peaks. The primary subgrouping considered to date has been phrase speech delay (PSD) past 36 months, which occurs in about 50% of autistic disorder sibling pairs. Linkage to chromosome 2q increased in two samples when restricted to sibling pairs with PSD.104, 106 One PSD-restricted sample also showed increased linkage to chromosome 7q,107 which was also implicated in a quantitative trait analysis of age at first word and repetitive behavior within ASD sibling pairs.108
Association and candidate gene studies
Association methods are considerably more powerful than linkage methods at a given locus, allowing genes of weaker effect to be detected, but the polymorphisms tested must be much closer to the susceptibility variant. For this reason, tests of genetic association have generally been used to study candidate genes in disease. Family-based association testing has become the gold standard in avoiding population stratification bias. More recently, methods that properly control for potential population differences (Genomic Control and STRAT) have been developed for case–control studies.109, 110 As genotyping throughput increases and costs decrease, genome-wide association studies of large samples will soon offer the potential to detect genes of small effect without requiring hypotheses about pathophysiology or chromosomal location. Coupling this approach with layered analysis of multiple samples may allow investigators to efficiently avoid both type I and type II errors. Population isolates with extensive linkage disequilibrium may also be very helpful for genome-wide association mapping.111
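One widely used family-based association test is the transmission disequilibrium test (TDT), which compares how often heterozygous parents transmit a given allele to an affected child with how often they do not. The review does not say which statistic each cited study used, so the sketch below is offered only as an illustration of the general approach. The function name and the transmission counts are hypothetical.

```python
import math


def tdt(transmitted: int, untransmitted: int) -> tuple[float, float]:
    """Transmission disequilibrium test for one allele.

    `transmitted` and `untransmitted` count how many heterozygous parents did
    or did not pass the allele of interest to an affected child. Returns the
    McNemar-type chi-square statistic (1 degree of freedom) and its p-value.
    """
    total = transmitted + untransmitted
    if total == 0:
        raise ValueError("no informative (heterozygous) parents")
    chi2 = (transmitted - untransmitted) ** 2 / total
    # Upper tail of a chi-square with 1 df equals the two-sided normal tail.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


if __name__ == "__main__":
    # Hypothetical counts: 60 transmissions versus 40 non-transmissions.
    chi2, p = tdt(60, 40)
    print(f"chi-square = {chi2:.2f}, p = {p:.4f}")
```

Because each parent serves as their own control, the comparison is made within families, which is why this kind of design is robust to population stratification.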
Family structure and ascertainment differ between association and linkage studies. Most association studies are conducted in samples of autism trios, with an affected proband and two parents. Linkage studies are primarily conducted in families with multiple affected members, which are uncommon in ASD because the sibling recurrence risk is only about 4.5%. These multiplex families may be biased toward families with less complex genetics than seen in the overall disorder despite a clinically indistinguishable phenotype.8 Some hypothesize that such enriched linkage samples are more likely to contain multiple rare polymorphisms within a single gene of interest, complicating identification of genetic variation important in the overall population.9 In contrast, families ascertained in association studies likely reflect more closely the typical disorder. One would predict that many genes would be common to both groups of families, but some may be detected in only one or the other. Except where specifically noted, association studies discussed in detail below were conducted in samples of trios with probands with ASD.
Numerous candidate gene studies have been conducted in ASD on the basis of relatively limited knowledge of pharmacology, developmental neuropathological abnormalities, or chromosomal anomalies. Since we have very little knowledge of pathophysiology in ASD, a limited number of genes represent primary candidates for the disorder and deserve more scrutiny. Although admittedly likely to miss several of the susceptibility genes, our laboratory defined the serotonin transporter (SLC6A4) and 5-HT2A serotonin receptor genes (HTR2A) as the two primary candidate genes. GABRB3 was one of the eight secondary candidates. After studying primary candidate genes, some justification could be concocted for studying almost any gene that is expressed in the brain, which most likely will include at least half of the genes if every neuronal and glial cell throughout development is considered. Further, investigators and reviewers struggle to identify the appropriate level of statistical correction for association tests, with the correct approach lying somewhere between a Bonferroni correction for each gene tested in a given sample and a correction for every polymorphism in the human genome.112
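To make the range between these two extremes concrete, the sketch below computes the per-test significance threshold under a Bonferroni correction for two test counts. Both counts (10 genes and 1,000,000 polymorphisms) are hypothetical round numbers chosen only for illustration, as is the function name.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold that keeps the family-wise
    type I error rate at or below alpha across n_tests tests."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


# Hypothetical test counts, for illustration only.
scenarios = [
    ("genes tested in one sample", 10),
    ("polymorphisms across the genome", 1_000_000),
]
for label, n_tests in scenarios:
    threshold = bonferroni_threshold(0.05, n_tests)
    print(f"{label}: n = {n_tests}, per-test threshold = {threshold:.1e}")
```

The several orders of magnitude separating the two thresholds show why the choice of correction largely determines whether a nominally significant finding is regarded as meaningful.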
Most association studies have considered segregation of disease with alleles at individual single-nucleotide polymorphisms (SNPs); however, unless a single known functional variant accounts for disease susceptibility, studying multiple polymorphisms in a given gene or region may offer significant advantages. Without using multiple polymorphisms, any association study is likely to be vulnerable to type II error by failing to adequately cover the gene of interest.113 Furthermore, haplotype analysis may capture information from flanking markers to generate more informative transmission data. In the quite likely situation of multiple susceptibility variants within a single gene or gene region, haplotype analysis provides greater power to detect susceptibility variants that segregate with multiple distinct haplotypes that may include both alleles at a single SNP.114 For a more detailed discussion of haplotypes and association studies, see Wall and Pritchard.115
The serotonin transporter gene (SLC6A4) is a primary candidate gene in ASD based on increased platelet serotonin uptake in hyperserotonemic first-degree relatives of probands with ASD and responsiveness of OCD-related symptoms of autism to potent serotonin transporter inhibitors.116 A variable number tandem repeat (VNTR) polymorphism in the SLC6A4 promoter, 5-HTTLPR, has been shown to affect transcription and has been associated with neuroticism or anxiety,117 as well as liability to depression in the face of adverse life events.118 Another VNTR in SLC6A4 intron 2 may also affect transcription.119, 120
In autistic disorder, most but not all studies have had nominally significant evidence of transmission disequilibrium for SLC6A4 polymorphisms. The initial study found overtransmission of the short 5-HTTLPR allele, but stronger overtransmission of a haplotype marked by the short allele of the 5-HTTLPR and the 12-repeat allele of the intron 2 polymorphism.116 Subsequent studies have focused primarily on the 5-HTTLPR and have been inconsistent. Some have replicated preferential transmission of the short allele of this polymorphism or haplotypes including the short allele,121, 122, 123 while others have found overtransmission of the long allele.124, 125, 126 Some family-based studies have found no significant association at this polymorphism.127, 128, 129 One study found a nominally positive haplotype association in the trio portion of the sample in contrast to significant findings when sibling pairs were included.130 Some inquiries have expanded to include other polymorphisms within or near SLC6A4, finding association with a number of 5′ single-nucleotide polymorphisms.121, 122
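For readers unfamiliar with how transmission disequilibrium is assessed in trios, a minimal sketch of the basic biallelic test follows. It compares how often heterozygous parents transmit versus do not transmit a given allele to affected offspring, using a McNemar-type chi-square statistic with one degree of freedom. The function name and the counts are hypothetical, and the studies cited above used a variety of related family-based statistics, including haplotype-based extensions not shown here.

```python
import math


def tdt(transmitted, untransmitted):
    """Basic transmission disequilibrium test for one allele.

    transmitted / untransmitted: number of heterozygous parents who did /
    did not transmit the allele of interest to an affected child.
    Returns (chi-square statistic with 1 df, p-value).
    """
    total = transmitted + untransmitted
    if total == 0:
        raise ValueError("no informative transmissions")
    chi2 = (transmitted - untransmitted) ** 2 / total
    # Upper tail of a 1-df chi-square: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts, for illustration only.
chi2, p = tdt(transmitted=60, untransmitted=40)
print(f"chi-square = {chi2:.2f}, p = {p:.4f}")
```

With these made-up counts the statistic is 4.0 and the p-value falls just under 0.05, the kind of nominally significant result that may not survive correction for multiple testing.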
Linkage studies also find evidence in this region. One study found a single-point LOD score of 3.6 at the intron 2 VNTR marker within SLC6A4.14 The largest sample analyzed to date showed the strongest signal (MLS 2.83) overlapping the serotonin transporter gene.105 Another study reported suggestive linkage in a region of 17q overlapping with SLC6A4 and found more evidence for linkage in a subgroup of families with elevated scores on a rigid-compulsive factor derived from the ADI-R.123
Investigators have also sought to understand the relationships between these polymorphisms and whole blood serotonin levels with varying levels of success. Coutinho et al129 reported significant association between elevated whole blood serotonin and haplotypes of the 5-HTTLPR and intron 2 polymorphisms. Three studies in ASD found similar nonsignificant trends for association between elevated whole blood serotonin and homozygosity for the long form of the 5-HTTLPR polymorphism.131, 132, 133
We are left with an incomplete understanding of the role of the serotonin transporter gene in autism. Enough evidence has accumulated to suggest that this gene is very likely involved in ASD. Studies of trios appear more likely to detect association within the 5′ region of the gene, but sibling pair studies have identified linkage in this region as well. The linkage samples may contain some families with rare mutations that may not correspond to the association pattern observed in the overall population.9 For example, a rare missense mutation has been reported in the serotonin transporter gene in families with individuals affected with obsessive–compulsive disorder, social anxiety disorder, and Asperger disorder.134 Heterogeneity could also be at work on a larger level, with different variants assuming an important role in different populations. On the other hand, increased significance with haplotype analyses suggests that the responsible variant or variants have not yet been identified but may be in partial linkage disequilibrium with polymorphisms currently being studied. It is interesting that the only nonsynonymous variant found in a population-based screening of SLC6A4 (ref. 135) was associated with a phenotype including Asperger's disorder and obsessive–compulsive disorder.134 Moreover, the gene regulatory region remains incompletely characterized and may contain important polymorphisms, including variation within the 5-HTTLPR itself.136 This region merits more extensive study using haplotype methodology in a larger sample of ASD trios and with further exploration of the relationship between multiple haplotypes and several phenotypes related to autism.
Other candidate genes have been associated with ASD in single studies that await replication. The mitochondrial aspartate/glutamate carrier (AGC1) gene, SLC25A12, is located on chromosome 2q31 under the well-replicated linkage peak. Ramoz et al137 recently reported a strong family-based association between autistic disorder and two SNPs within SLC25A12 that showed increased significance when considered as a haplotype. The NLGN3 gene, located on chromosome Xq13.1, is a candidate gene in ASD as a member of the neuroligin family. Based on the findings at NLGN4, investigators sequenced NLGN3, revealing an amino-acid change within a highly conserved region in two brothers with ASD.61 No other family members were available for sequencing, so it is difficult to assess cosegregation between the variant and ASD. Mice lacking the glutamate receptor subunit GluR6 (GRIK2) gene show abnormalities in long-term potentiation within the hippocampus,138 and abnormal response at a sister receptor has been identified in mouse models of FRAXA.72 Jamain et al139 reported an association between autistic disorder and maternal transmission of alleles at three polymorphisms within GRIK2, which lies under a suggestive linkage peak on chromosome 6q21. The arginine–vasopressin receptor 1A gene (AVPR1A) is important in regulation of basic social functions in animal models, as discussed below. A preliminary study of AVPR1A in ASD found a nominal association.140 HOXA1 belongs to a gene family that determines pattern formation within the central nervous system. While one family-based association has been reported at HOXA1,141 attempts to replicate this finding in larger samples have shown no supporting evidence.142, 143, 144, 145
Parallel molecular approaches and concepts
Existing animal models
The behavioral phenotype of ASD is not easily modeled in lower mammals. Many transgenic mice demonstrate abnormalities in social behavior, but it is difficult to equate these simple behaviors to the pervasive abnormalities in ASD. The most appropriate models to date have been knockout mouse models of RTT and FRAXA, both discussed in the corresponding subsections above. Without the knowledge of human disorders caused by disruptions of MECP2 or FMR1, however, anyone studying these mice would find it difficult to identify the relevant behavioral deficits and to connect them to ASD. Until we have constructed mouse models that parallel known molecular defects leading to ASD in humans, such as maternally inherited chromosome 15q11–13 duplications, we will not be able to correlate accurately the behavioral profile of interest.
A few other transgenic mouse lines do have particularly interesting social phenotypes. Mice lacking the oxytocin gene (Oxt) show an apparent lack of social memory.146 Mice that have had their arginine–vasopressin receptor 1A gene (avpr1a) promoter region replaced by prairie vole avpr1a promoter show increased affiliative behavior.147 Each animal model is likely to reveal one particular protein or neural system underlying the complex capacity to coordinate social behavior.
Epigenetics and alternative modes of inheritance
The definition and application of epigenetics are somewhat controversial. The term is used here to refer to those nuclear mechanisms other than the primary DNA sequence that determine an individual's phenotype and may also be transmissible to offspring; however, some extend the definition to include extranuclear mechanisms (as studied in proteomics) or, broader still, any processes by which genotype leads to phenotype.148 Epigenetic and DNA-based genetic approaches are so intertwined as to be nearly inseparable, and they may eventually fold back together, as we come to understand genes in a broader sense than simply primary DNA sequence. The notion of an individual's primary DNA sequence will eventually be supplanted by a fuller description that includes the primary DNA sequence, its modifications (eg methylation), and histone composition and modification (eg phosphorylation and methylation of specific residues). Epigenetics is crucial in allowing genes to be specifically expressed or suppressed as cells differentiate into multiple different tissues during development, but it can also influence transmission of traits or disorders from parent to child.
Two concepts are worth mentioning in epigenetics. First, exogenous factors can modify epigenetic control of gene expression. For example, supplementation with folic acid in patients on hemodialysis reduces homocysteine levels, favors DNA methylation, and changes the pattern of imprinted gene expression.149 Second, induced changes in epigenetic mechanisms can cause lifelong alterations in phenotype that can be transmitted to offspring. For example, supplementation with folic acid and other vitamins early in pregnancy makes female agouti mice more likely to bear offspring with brown coats who are in turn more likely to bear a third generation with brown coats.150, 151
Epigenetic mechanisms are important in a few relatively simple genetic disorders relevant to ASD. The RTT gene, MECP2, encodes a protein that in its normal state promotes chromatin condensation at particular points in the genome. Hypermethylation of the trinucleotide repeat region in FMR1 causes FRAXA syndrome. Parental imprinting depends on DNA methylation and histone modification and underlies the maternal chromosome 15q11–13 duplication syndrome. Imprinting also appears to determine susceptibility to behavioral abnormalities in Turner syndrome. In more complex genetic disorders such as ASD, we can expect to find more subtle variation in genes that may encode proteins or RNA molecules that act as regulators of gene expression and thereby perturb multiple systems. Furthermore, Yu et al98 hypothesized that chromatin structure or function may be abnormal in families of patients with ASD, thereby leading to a wide variety of chromosomal abnormalities that reflect rather than cause the primary defect. It is possible that intervention shifting epigenetic effects in one direction or another at a particular gene or set of genes may eventually provide an alternative treatment route for some patients with ASD.
Protein and mRNA approaches
Molecular methodology is now being applied in postmortem studies of autism. These studies have tremendous potential, but initial findings may implicate a different distribution of neuronal types, numbers, and size, rather than implicating particular genes or protein systems in the pathophysiology of disease. Crosstalk between molecular pathology studies and ongoing molecular genetic investigation is crucial. For example, the genetic association at GABRB3 sparked interest in the GABA system in postmortem studies, including evidence that GABAA receptor binding is reduced.152, 153 As protein and mRNA studies using microarrays and other novel approaches continue, postmortem work will generate new candidate genes for study and clarify the role of reported gene associations.
Investigators have followed three approaches in examining areas of the brain with structural abnormalities in autism. The first approach is to consider important proteins within neurotransmitter systems, including the acetylcholine system,154, 155 the glutamate system,156 and the GABA system.152, 153 The second approach is to consider proteins that control neuronal death, due to the finding of macrocephaly in some patients.32, 157, 158 The last approach is to consider peptides thought to be important in neuronal migration and synaptic development.32, 154, 159 Initial findings are promising, but additional work including immunohistochemistry may be necessary to accurately compare protein binding in relation to neuronal location and morphology.
Conclusions and future directions
Initial genetic studies in ASD have been promising, but the detection of genetic variants responsible for disease has thus far been elusive. This is not unexpected in a complex genetic disease. Quantitative traits or subgroups of patients may be identified with a relatively less complex genetic etiology. Subgrouping was possible in RTT on the basis of clinical presentation and then on the basis of gene mutations in MECP2. Maternal duplications of chromosome 15q11–13 allow genetic subgrouping that is beginning to extend to the clinical realm. Epigenetic studies in this region are necessary to clarify its involvement in the general population of patients with ASD. The truncation and missense mutations in NLGN4 in a few patients with ASD, as well as in patients with X-linked mental retardation, identify another subgroup with relatively less complex inheritance. Association and linkage studies in the region are needed to clarify possible involvement of NLGN4 in a larger group of patients. Regardless of how many patients are affected, disruption of this gene in some patients provides investigators with an opportunity to study a relevant protein using gene knockout and other techniques, potentially increasing our understanding of the underlying developmental neurobiological pathophysiology.
Most patients are likely to have variants of multiple genes acting in concert to cause susceptibility to ASD. This may even apply to relatively less complex cases such as 15q11–q13 duplications, FRAXA, and NLGN4 mutations, in which variable phenotypic expression is driven by other variants contributing to protection from or risk for ASD. Different groups of variants, both within and across genes, are likely to contribute to susceptibility in different individuals. Several initial linkage studies have yielded interesting results in various chromosomal regions, and some of these regions have now been implicated in more than one study. A number of family-based candidate gene studies have identified significant evidence for association, but replication is quite inconsistent across sites. To clarify these findings, we will need to focus on full characterization of variation within gene or chromosomal regions while employing haplotype analysis. No gene variant has been identified yet as contributing to autism susceptibility in the majority of patients with ASD. Although identification and confirmation of variants contributing to ASD susceptibility are delayed by genetic complexity, it is likely that multiple targets will be identified for amelioration, treatment, and possible prevention of autistic disorder in the future.