Big data approaches to decomposing heterogeneity across the autism spectrum


Autism is a diagnostic label based on behavior. While the diagnostic criteria attempt to maximize clinical consensus, they also mask a wide degree of heterogeneity between and within individuals at multiple levels of analysis. Understanding this multi-level heterogeneity is of high clinical and translational importance. Here we present organizing principles to frame research examining multi-level heterogeneity in autism. Theoretical concepts such as ‘spectrum’ or ‘autisms’ reflect non-mutually exclusive explanations regarding continuous/dimensional or categorical/qualitative variation between and within individuals. However, common practices of small sample size studies and case–control models are suboptimal for tackling heterogeneity. Big data are an important ingredient for furthering our understanding of heterogeneity in autism. In addition to being ‘feature-rich’, big data should be both ‘broad’ (i.e., large sample size) and ‘deep’ (i.e., multiple levels of data collected on the same individuals). These characteristics increase the likelihood that the study results are more generalizable and facilitate evaluation of the utility of different models of heterogeneity. A model’s utility can be measured by its ability to explain clinically or mechanistically important phenomena, and also by its ability to explain how variability manifests across different levels of analysis. The directionality for explaining variability across levels can be bottom-up or top-down, and should include the importance of development for characterizing changes within individuals. While progress can be made with ‘supervised’ models built upon a priori or theoretically predicted distinctions or dimensions of importance, it will become increasingly important to complement such work with unsupervised data-driven discoveries that leverage unknown and multivariate distinctions within big data.
A better understanding of how to model heterogeneity between autistic people will facilitate progress towards precision medicine for symptoms that cause suffering, and person-centered support.

Autism occurs in approximately 1–2% of the population [1] and autistic individuals’ mental health difficulties are a major public health issue. In economic terms, the lifetime individual cost of autism is estimated at $2.4 (£1.5) million in the United States and United Kingdom and annual population costs are around $268 billion in the United States [2, 3]. While interest in autism and the science investigating it have been growing rapidly, progress towards translating scientific knowledge into high-impact clinical practice has been limited and slow. We are still far from delivering more effective intervention for unwanted symptoms, more precise and earlier diagnosis, better understanding and prediction of prognosis and development, and personalization of support and intervention. All of these goals are within the scope of stratified psychiatry [4] and precision medicine [5]. To reach them, our contention is that we will first need to grapple with an important issue holding back progress: heterogeneity within the autistic population.

The field is currently addressing this issue. Some have argued that we are at a crossroads and must acknowledge that the concept of autism as a single entity lacks validity at a biological level [6, 7] and that autism must be taken apart [8]. This idea relates to what others have discussed regarding autism as an umbrella label referring to many different kinds of ‘autisms’ [9] and how the scientific community should abandon attempts to continue characterizing all of autism under a single theory [10]. Research has begun along these new directions but is highly fractionated because heterogeneity is discussed across multiple levels of analysis, from genetics [11], neural systems [12,13,14], cognition [15], and behavior and development [16,17,18] to clinical topics (e.g., response to treatment or outcome [19, 20]). Approaches differ in how heterogeneity should be decomposed, from utilizing theoretical a priori known stratifiers [21,22,23] or dimensions to data-driven approaches [12, 24,25,26]. Models for understanding heterogeneity also differ, with some conceptualizing distinctions as categorical/qualitative, some as continuous/dimensional, and some as distinctions or similarities that may cut across diagnostic boundaries [26,27,28]. Work also differs in its aims: some studies seek to understand heterogeneity within one level of analysis [29, 30], while others attempt to explain heterogeneity across levels [31,32,33,34,35,36].

The purpose of this paper is not to provide an in-depth review of the literature on these areas. Rather, we see a need to provide organizing principles for framing these diverse areas of research, so that future synthesis and theoretical development about heterogeneity can be facilitated. Specifically, we first discuss how commonly used terminology such as ‘spectrum’ or the ‘autisms’ can be used to imply different types of models for understanding heterogeneity in autism. Next, we discuss how heterogeneity arises within the context of the historical change in diagnostic criteria. Third, we provide arguments behind why understanding heterogeneity is critical for furthering progress towards precision medicine. Fourth, we discuss some of the problems with the dominant paradigm in the field—the case–control paradigm. In discussing these issues, we point towards problems with small sample studies and the need for bigger data. This leads into a discussion regarding characteristics of big data that are important for studying heterogeneity in autism. We follow this with organizing principles behind how one attempts to understand multi-level heterogeneity. We then discuss the role of transdiagnostic viewpoints which go beyond understanding heterogeneity just within autism. Finally, we conclude with discussions about realistic challenges, mitigating strategies, and clinical implications of big data approaches.

Terminology behind ‘heterogeneity’ and impact on building and evaluating models

The concept of heterogeneity in autism dates back to the original conceptions of an ‘autistic spectrum’ by Wing [37]. Since then, the concept has been applied beyond just clinical, behavioral, and/or cognitive levels. A hallmark of heterogeneity in autism is its multi-level presentation (Fig. 1c), applicable from genotype through phenotype [9, 10], throughout development [16, 38], and manifesting as important clinical differentiation (e.g., outcome [20], response to treatment [19], etc.). Thus, the concept of heterogeneity applies not only to how individuals differ at one level of analysis, but also to when and at which levels those differences arise, and potentially to how heterogeneity across levels is coordinated. While the idea of heterogeneity itself has a longstanding history, better explanations are needed for why heterogeneity manifests across different levels and how these manifestations are connected across levels and within or between individuals. Here, concepts from developmental psychopathology such as equifinality and multifinality [39] may be helpful. For example, a diversity of different developmental starting points or causal mechanisms in the genome may reach similar endpoints (equifinality) at levels more proximate to clinical outcomes or behavior [40]. However, very similar mechanisms at one level could also result in a diversity of endpoints (multifinality) [41]. Currently, the mapping of multi-level heterogeneity in autism is unclear, but it is imperative that we understand these mappings, which are likely to yield useful explanations towards precision medicine goals.

Fig. 1

Approaches to decomposing heterogeneity in autism. a A population of interest is shown, and autism cases are colored in green, pink, and blue. The different colors are meant to represent different autism subtypes. In b we show the impact of ignoring heterogeneity on effect size. With a typical case–control model, we ignore these possible subtype distinctions and compare autism to controls on some dependent variable. In this example scenario there is no clear case–control difference but the autism group shows higher variability (indicated by the larger error bars). An approach towards decomposing heterogeneity might be to construct a stratified model whereby we model the subtype labels instead of one autism label, and then re-examine differences on the hypothetical dependent variable of interest. In this example, the autism subtypes show contradictory effects. These effects are masked in the case–control model as the averaging cancels out the interesting different effects across the subgroups. c Heterogeneity in autism is shown as a multi-level phenomenon. This panel also visualizes the difference between broad versus deep big data characteristics and labels the top-down versus bottom-up approaches to understanding heterogeneity in this multi-level context. Finally, this panel also shows how development is another important dimension of heterogeneity to consider at each level of analysis (i.e., ‘chronogeneity’). In this example, chronogeneity is represented by different trajectories for different types of autistic individuals

There are many ways to talk about how autistic individuals are similar to or different from each other [42]. On the one hand, we can understand phrases like the ‘spectrum’ as referring to heterogeneity as graded continuous change between individuals. ‘Spectrum’ can also apply to both the clinically diagnosed autism population and the whole population, including those with the ‘broader autism phenotype’ [43,44,45,46]. The idea of a spectrum can be applied as a model for understanding heterogeneity between autistic individuals—a model we would refer to as a ‘dimensional model’. Dimensional models can also cut across traditional diagnostic boundaries, with the most prominent example of this being the National Institute of Mental Health (NIMH) Research Domain Criteria (RDoC) model [47]. However, we also use heterogeneity as a way of conceptualizing categorical or qualitative differences between autistic individuals. The term ‘spectrum’ could also imply a qualitative rather than a quantitative difference between individuals. However, terms that pluralize autism as ‘autisms’ may be more applicable here, as the idea of multiple kinds of autisms lends itself to categorical ways of thinking about individuals as ‘subgroups’ or ‘subtypes’. A subtype model for explaining heterogeneity in autism can also be called a ‘stratified model’.

Since we have different ways of talking about heterogeneity, the question will naturally arise as to which way of conceptualizing heterogeneity is best. Are categorical ‘subtype’ models better than continuous ‘dimensional’ models, or vice versa? This could be an ill-posed question, since these concepts and models need not be mutually exclusive. First, theoretically we could imagine an important blending between the two types of models for understanding heterogeneity and this can be tested statistically (e.g., factor mixture models [48]). For instance, one could first subtype the autistic population, and then further characterize between-individual variability through continuous models within each subtype. Second, the answer to such a question may differ depending on the aim of the model. For example, a subtype model might be better at predicting treatment responses, whereas a dimensional model might be better at predicting basic biological mechanisms, or vice versa. As we build a literature on understanding heterogeneity in autism, it will be important to be clear about how different models conceptualize heterogeneity, as well as understanding that different models may be important for different types of aims. The aphorism by George Box that ‘all models are wrong, but some are useful’ is applicable here [49]. Models are simplified explanations that typically account only for a portion of variability in a phenomenon. Even if models are quite different in their explanation and predictive power, they can still be useful for a variety of different aims. Therefore, a pragmatic approach for evaluating heterogeneity models will be important for moving forward, since it is unlikely that we will converge on single explanations (models) that can explain the wide array of multi-level heterogeneity in autism.

Heterogeneity, evolution of the diagnostic concept

The evolution of the nosology and diagnostic concept of autism changes the definition of autism: who counts as being ‘on the spectrum’ and who gets a clinical diagnosis [50]. This evolution also contributes to the discussion about heterogeneity in autism. When ‘autism’ was first defined as ‘autistic disturbances of affective contact’, the core features were considered to be ‘extreme self-isolation’ and ‘obsessive insistence on the preservation of sameness’ [51, 52]. At the cognitive level, language impairments or peculiarities were seen as secondary to ‘basic disturbances in human relatedness’ [52]. Moreover, both Kanner [51] and Asperger [53] recognized good cognitive potential in their child patients and therefore autism was not necessarily tied to intellectual disability. However, at the next stage of nosological evolution, language and cognitive impairments began to be considered ‘core’ [54] and this conceptualization directly impacted the first operationalization of autism in the Diagnostic and Statistical Manual of Mental Disorders (DSM)-III [55], in which language deficits were core to diagnosis. Individuals identified as having autism in the 1970s and 1980s were therefore mostly those with marked difficulties in verbal communication, and many were considered to have intellectual disability. In the 1980s, Wing [56] and colleagues not only introduced the work of Hans Asperger into the English-speaking world, but also conducted epidemiological studies that demonstrated the heterogeneity in social, language, motor, and cognitive abilities in the autistic and developmentally delayed population [57, 58].
Wing’s ideas of the ‘triad of social, communication and imagination impairments and repetitive behavior’, the lack of clear division between Kanner’s autism and less extreme forms, and the shift of core social impairment from ‘extreme autistic aloneness’ to ‘deficits in the use and understanding of unwritten rules of social behavior’ clearly broadened what autism encompassed. All these ideas were subsequently adopted into versions of diagnostic systems including DSM-III-R, DSM-IV and ICD-10 (International Statistical Classification of Diseases and Related Health Problems-10th Revision). Phenotypic heterogeneity therefore increased, allowing an autistic individual to be verbal or minimally verbal, ‘active but odd’, ‘passive’, ‘aloof’ or ‘loners’ [59], and with various combinations of repetitive and stereotyped behaviors. The DSM-5’s exclusion of language impairments from, and inclusion of atypical sensory responses into core symptoms, reflects how the concept of autism nowadays is much broader than how it had initially been conceptualized. The most recent revision of ICD criteria (ICD-11) further emphasizes specific diagnostic subgroups that qualify whether an individual with autism has impairments with functional language and/or intellectual development. With the changing and broadening diagnostic concept comes increased heterogeneity, inevitably at the behavioral phenotypic level, and possibly also at other levels of analysis.

This history behind the evolving diagnostic concept is an important yet often not fully acknowledged caveat for interpreting research on autism. Research spanning several decades may have been isolating phenomena in altogether different types of individuals than does more recent research. Since the spectrum of diagnosed individuals is wider today than in the past, interpretations behind lack of replication or inconsistencies across studies should take this into account, rather than assuming the population under study has not changed over time. As the diagnostic concept continues to change we must be mindful of this issue when interpreting how current research matches up to work that may be several decades old.

Shifting from the ‘one-size-fits-all’ paradigm towards understanding heterogeneity

Perhaps the most prominent justification behind why understanding heterogeneity is important is because individuals with autism widely differ in response to treatment. While most treatment approaches are early intensive behavioral intervention and naturalistic developmental behavioral intervention, the existing literature suggests that they have variable levels of effectiveness and in some cases may not significantly affect core autism features such as social-communication difficulties [60,61,62,63,64]. Currently, there are also no medical treatments that significantly affect the core characteristics of autism [65, 66]. Rather than advocating a ‘one-size-fits-all’ approach to treatment, most recent best practice recommendations specifically highlight the critical need for future research to identify factors that explain heterogeneity in response to treatment, in order to better individualize treatment and intervention approaches and to better target changes in core or functionally impairing symptomatology [60, 64]. A separate ethical issue raised by the neurodiversity movement is the idea that autism itself should not be a target for treatment, since it may be part of the individual’s genetic make-up and identity. Rather, treatment should target specific co-occurring symptoms and difficulties in adaptive functioning that cause suffering and disability. Such co-occurring symptoms and maladaptation (in many cases the contributing reasons are not solely within the autistic person but also arising from the environmental contexts) comprise a critical, yet under-developed, angle to stratification of the autism spectrum which will guide ethical and personalized intervention.

Heterogeneity also limits basic scientific progress towards understanding autism. To understand why, it is important to first make salient the problems with the dominant paradigm, the case–control paradigm, which is ill-equipped to reveal heterogeneity. The case–control paradigm exemplifies the ‘one-size-fits-all’ approach, since all cases are treated identically due to the same diagnostic label. Studies that attempt to identify ‘biomarkers’ via case–control designs have implicitly assumed that if a strong biomarker did exist, it would completely differentiate cases from all controls. We have yet to isolate any biomarkers for autism that can reliably and consistently reach this high bar [7, 67]. One reason why case–control research has fallen short on identifying high-impact biomarkers could be that we are looking at the wrong features. However, an alternative explanation is that high-impact biomarkers are likely exclusive to specific subsets of autistic individuals. That is, a high-impact biomarker may be informative for one subtype of autism, but not others (Fig. 1b). In order to identify such stratification or dimensional biomarkers [68], one will have to change the approach from the case–control model to a stratified and/or dimensional model. This is not to say that case–control studies are not useful. Isolation of consistent and reliable case–control differences is useful for identifying on-average differences, but these typically come with a substantial degree of overlap in the distributions. However, if we are searching for biomarkers that could help us move towards precision medicine, we will need to pivot our approach away from case–control studies as the dominant paradigm and towards stratified and/or dimensional models that could yield larger, higher-impact effects.

As an illustrative example, we take our own recent work on mentalizing ability in adults with autism. From a case–control perspective, autistic adults perform on average lower on the ‘Reading the Mind in the Eyes’ Test (RMET) compared to matched typically developing controls [69]. However, taking a stratified approach, we find that the autistic adult population can be reliably split into subtypes who are completely unimpaired on the RMET versus those who are highly impaired [25] (Fig. 2). Thus, in this example, while replicable on average case–control effects appear, a stratified approach that takes into account heterogeneity can isolate higher impact and more precise considerations about mentalizing as measured by the RMET in the adult autistic population.
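The contrast between an averaged case–control effect and subtype-specific effects can be sketched with synthetic data. The following is a minimal illustration only: the three subtypes, their shifts of −1, 0, and +1 control standard deviations, and the group sizes are hypothetical values chosen for the example, not estimates from the RMET datasets or any other study.

```python
import math
import random


def mean(xs):
    return sum(xs) / len(xs)


def variance(xs):
    """Unbiased sample variance."""
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)


def cohens_d(group, controls):
    """Standardized mean difference (group minus controls), pooled SD."""
    n1, n2 = len(group), len(controls)
    pooled = ((n1 - 1) * variance(group)
              + (n2 - 1) * variance(controls)) / (n1 + n2 - 2)
    return (mean(group) - mean(controls)) / math.sqrt(pooled)


rng = random.Random(1)

# Hypothetical subtype shifts in control-SD units (illustrative assumption).
subtype_shift = {"A": -1.0, "B": 0.0, "C": 1.0}
n_per_subtype = 1000

controls = [rng.gauss(0.0, 1.0) for _ in range(3 * n_per_subtype)]
subtypes = {
    name: [rng.gauss(shift, 1.0) for _ in range(n_per_subtype)]
    for name, shift in subtype_shift.items()
}
all_cases = [score for scores in subtypes.values() for score in scores]

# Case-control model: one label for every case.
d_case_control = cohens_d(all_cases, controls)
# Ratio of case SD to control SD; values above 1 mean more spread in cases.
sd_ratio = math.sqrt(variance(all_cases) / variance(controls))
# Stratified model: one comparison per subtype.
d_by_subtype = {name: cohens_d(scores, controls)
                for name, scores in subtypes.items()}

print(f"case-control d = {d_case_control:.2f}, case/control SD ratio = {sd_ratio:.2f}")
for name, d in d_by_subtype.items():
    print(f"subtype {name}: d = {d:.2f}")
```

Under these assumptions the pooled comparison yields an effect near zero with inflated variability in the case group, while the stratified comparisons recover large effects of opposite sign, which is the pattern depicted schematically in Fig. 1b.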

Fig. 2

Case–control vs stratified model example with adult autism and mentalizing ability. This figure reports data from Lombardo et al. [25] on two independent datasets of adults with autism and performance on an advanced mentalizing test, the Reading the Mind in the Eyes Test (RMET). a (Discovery), b (Replication) Case–control differentiation and the standardized effect size for each dataset are shown. c–f RMET scores and standardized effect sizes from the same two datasets after unsupervised data-driven stratification into five distinct autism subgroups and four distinct TD subgroups. Autism subgroups 1–2 are highly impaired on the RMET, while autism subgroups 3–5 are completely overlapping in RMET scores with the TD population

Imprecise effect size estimates and lack of power in small sample size studies

Compounding the problem of utilizing ‘one-size-fits-all’ models like the case–control paradigm is the issue of small sample size studies. Over the last several decades, it has been common practice to conduct and publish small sample size studies. Small sample studies are problematic because statistical power is low for all but the largest effects. Small sample size also means that estimated sample statistics vary considerably relative to their population parameters due to more pronounced sampling variability. In Fig. 3, we show simulations that illustrate the issues of low power and imprecise estimates of effect size so that they are clear and salient to readers. A common case–control study with n = 20 per group results in an effect size that varies considerably relative to the true population effect. This variability in estimated effect size at small samples is consistent irrespective of what the true population effect is. Only with very large sample sizes (e.g., n > 1000) can we see that the sample effect size homes in with some precision on the true population effect size. The histograms shaded in red in Fig. 3 also show the limited statistical power one has at smaller effect sizes and small sample size.
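A scaled-down sketch of this kind of simulation is given below. It is not the authors' code: it draws samples directly from normal distributions instead of from large simulated populations, uses 1000 simulated experiments per sample size, and fixes the true effect at d = 0.5. All of these settings, and every function name, are assumptions made for illustration.

```python
import math
import random


def cohens_d(cases, controls):
    """Standardized mean difference using the pooled standard deviation."""
    n1, n2 = len(cases), len(controls)
    m1, m2 = sum(cases) / n1, sum(controls) / n2
    ss1 = sum((x - m1) ** 2 for x in cases)
    ss2 = sum((x - m2) ** 2 for x in controls)
    pooled_sd = math.sqrt((ss1 + ss2) / (n1 + n2 - 2))
    return (m1 - m2) / pooled_sd


def simulate_sample_d(true_d, n_per_group, n_experiments, rng):
    """Run repeated case-control experiments; return each sample d."""
    estimates = []
    for _ in range(n_experiments):
        cases = [rng.gauss(true_d, 1.0) for _ in range(n_per_group)]
        controls = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        estimates.append(cohens_d(cases, controls))
    return estimates


def central_interval(values, coverage=0.95):
    """Approximate empirical interval containing the central `coverage` mass."""
    ordered = sorted(values)
    tail = (1.0 - coverage) / 2.0
    lower = ordered[int(tail * len(ordered))]
    upper = ordered[int((1.0 - tail) * len(ordered)) - 1]
    return lower, upper


rng = random.Random(0)
TRUE_D = 0.5  # assumed population effect for this illustration
interval_width = {}
for n in (20, 200):
    estimates = simulate_sample_d(TRUE_D, n, 1000, rng)
    lower, upper = central_interval(estimates)
    interval_width[n] = upper - lower
    print(f"n = {n:>3} per group: 95% of sample d in [{lower:.2f}, {upper:.2f}]")
```

Because the standard error of d shrinks roughly with the square root of the sample size, a tenfold increase in n should narrow the interval by about a factor of three, which is why precise estimates require samples far larger than n = 20 per group.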

Fig. 3

Simulation of sample effect size estimates at different sample sizes and across a range of true population effects for a hypothetical case–control study. In this simulation we set the population effect size to a range of different values, from very small (e.g., d = 0.1) to very large (e.g., d > 1.0) (panels a–e show simulation results when effect size ranges from d = 0.1 to d = 0.9 in steps of 0.2). We then simulated data from two populations (cases and controls), each with n = 10,000,000, that had a case–control difference at these population effect sizes. Next, we simulated 10,000 experiments where we randomly sampled from these populations different sample sizes (n = 20, n = 50, n = 100, n = 200, n = 1000, n = 2000) and computed the sample effect size estimate (standardized effect size, Cohen’s d) for the case–control difference. These histograms (gray) show how variable the sample effect size estimates are (black lines show 95% confidence intervals) relative to the true population effect size (green line). Visually, it is quite apparent how small sample sizes (e.g., n = 20) have wildly varying sample effect size estimates and that this variability is consistent irrespective of what the true population effect size is. Overlaid on each gray histogram are red histograms that show the distribution of sample effect size estimates where the hypothesis test (e.g., independent samples t-test) passes statistical significance at p < 0.05. The rightward shift in this red distribution relative to the true population effect size (green line) illustrates the phenomenon of effect size inflation. The problem is much more pronounced at small sample sizes and when true population effects are smaller. We then computed the average effect size inflation for this red distribution and plotted it as a percentage increase relative to the true population effect in (f). Each line in panel f refers to simulations with different sample sizes.
This plot directly quantifies the degree of effect size inflation across a range of true population effects and across a range of sample sizes. The code for implementing and reproducing these simulations is available at

Effect size inflation in small sample studies

Our simulations also make salient another common characteristic of small sample size studies—the possibility for vast effect size inflation when statistically significant effects are identified [70]. Inflated effects occur because effect sizes that are deemed statistically significant in small studies benefit from noise in the direction of the effect. Such inflated effects present an over-optimistic view on the identified effects and are prone to the winner’s curse [70]. Inflated effects look attractive and may be easier to publish due to their apparent indication of large effects. However, in subsequent replication attempts, investigators likely will fail to identify effects as large as the original small sample study because the effect size in the original study was inflated by some degree [71]. We can see effect size inflation and its interaction with true population effect size in Fig. 3. At very small true population effect sizes, sample effect size estimates that are deemed statistically significant (the red histograms in Fig. 3a–e) are wildly inflated, and this problem is most pronounced for small sample size studies. For example, tiny population effect sizes of 0.1 standard deviations of difference show on average greater than 300 to 350% effect size inflation when a study observes a statistically significant effect at p < 0.05 with an n = 50 or n = 20, respectively (Fig. 3f). If the true population effect size is much larger (e.g., d > 0.5), inflation in effect size is attenuated, and at relatively large sample sizes (n > 100 per group), there is very little effect size inflation on average for such effects. Of course, these simulations here are simplistic examples of studies with only one statistical comparison. The reality is that studies typically make multiple comparisons and sometimes on a massive scale (e.g., neuroimaging, genetics). In these situations, inflated effect sizes become an even bigger problem [72].
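Effect size inflation can be reproduced in miniature by keeping only those simulated experiments that reach significance and averaging their effect sizes. The sketch below is a hedged illustration and not the authors' implementation: it assumes a true effect of d = 0.2, a two-tailed t-test at p < 0.05 with critical values taken from standard t tables, averaging of signed estimates, and 2000 experiments per sample size. The percentages it prints are therefore not expected to match Fig. 3f.

```python
import math
import random

# Two-tailed critical t values at alpha = 0.05 for df = 2n - 2
# (df = 38 and df = 198), taken from standard t tables.
T_CRIT = {20: 2.024, 100: 1.972}


def sample_d(true_d, n, rng):
    """Cohen's d from one simulated experiment with n per group."""
    cases = [rng.gauss(true_d, 1.0) for _ in range(n)]
    controls = [rng.gauss(0.0, 1.0) for _ in range(n)]
    m1, m2 = sum(cases) / n, sum(controls) / n
    ss = (sum((x - m1) ** 2 for x in cases)
          + sum((x - m2) ** 2 for x in controls))
    return (m1 - m2) / math.sqrt(ss / (2 * n - 2))


def mean_significant_d(true_d, n, n_experiments, rng):
    """Mean d among significant experiments, plus the fraction significant."""
    # With equal group sizes t = d * sqrt(n / 2), so |t| > t_crit is
    # equivalent to |d| > t_crit * sqrt(2 / n).
    d_threshold = T_CRIT[n] * math.sqrt(2.0 / n)
    significant = []
    for _ in range(n_experiments):
        d = sample_d(true_d, n, rng)
        if abs(d) > d_threshold:
            significant.append(d)
    return sum(significant) / len(significant), len(significant) / n_experiments


rng = random.Random(0)
TRUE_D = 0.2  # assumed small population effect for this illustration
inflation_pct = {}
for n in (20, 100):
    mean_d, power = mean_significant_d(TRUE_D, n, 2000, rng)
    inflation_pct[n] = 100.0 * (mean_d - TRUE_D) / TRUE_D
    print(f"n = {n:>3}: power = {power:.2f}, "
          f"mean significant d = {mean_d:.2f}, "
          f"inflation = {inflation_pct[n]:.0f}%")
```

The significance threshold on d is itself large when n is small, so any small-sample result that passes it must overestimate a small true effect; this filtering, not any flaw in the data, is what produces the inflation.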

Why is such a characteristic important in discussions of case–control paradigms versus paradigms that acknowledge heterogeneity? The pervasiveness of small sample sizes and effect size inflation in case–control studies tends to give over-optimistic views on the utility of case–control studies. Over the course of time, replication attempts typically decrease the enthusiasm for many such effects, because the reality is likely that most case–control effect sizes are much smaller than published small sample size studies would suggest. By portraying initial novel case–control studies as showing large effects, we may be less inclined to ask the question of whether heterogeneity is involved. Furthermore, small case–control effects may be due to complicated heterogeneity in the autism population that hides potentially large effects restricted to specific subtypes. By focusing on heterogeneity, we are likely to better identify true population effects of much larger magnitude. Assuming that such research identifies true large effects in relatively large samples, effect size inflation may be much less of an issue (as the simulations here demonstrate). However, any model where statistical power is low can show inflated effect sizes. Therefore, models that try to explain heterogeneity can be prone to effect size inflation as well, hence the need for very large samples and high statistical power in stratified or dimensional models.

Sampling bias across strata nested in the autism population

Small sample size case–control studies that do not acknowledge heterogeneity in the autism population are also particularly problematic because increased sampling variability can substantially bias a sample by enriching specific strata of the population over others. Ideally, to get a generalizable sample of the population in a case–control paradigm, one hopes that if there are unknown strata nested in the population, the sample prevalence of each stratum reflects the true prevalence of that stratum in the population. If such a criterion is not achieved, it means that samples can be biased by the enrichment of certain strata of the population over others. If different strata of the population are enriched across multiple studies, those studies may paint a confusing and potentially contradictory picture of the phenomenon. A primary example of this is the systematic over-enrichment of males over females in most case–control studies, particularly intervention and biological studies [73,74,75], which may lead to male-biased inferences about autism [76]. Another simulation shown in Fig. 4 illustrates that small samples are much more prone to this bias due to enrichment of specific strata over others. In this simulation, there are five subtypes in the autism population, and each has different effects relative to the control population. Therefore, enrichment of different subtypes can have dramatic effects on the results of the study. Our simulation had equal population prevalence for each subtype (i.e., 20% of the autism population), which meant that from study to study, which specific strata are enriched is random. Obviously, in the likely scenario where population prevalence rates are asymmetrical across subtypes, the enrichment of specific strata could favor those subtypes with higher population prevalence.
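The enrichment problem can be sketched by repeatedly drawing subtype labels and tracking how far sample prevalence drifts from the 20% population value. The code below is a simplified illustration under stated assumptions and is not the authors' simulation: it samples labels directly instead of from a simulated population of one million, runs 500 experiments per sample size, and summarizes each sample by the mean shift implied by its subtype mix. The subtype effects of −1 to 1 in steps of 0.5 follow the description of Fig. 4; all names are hypothetical.

```python
import random
import statistics

# Subtype effects relative to controls, in SD units (as described for Fig. 4).
SUBTYPE_EFFECTS = [-1.0, -0.5, 0.0, 0.5, 1.0]


def sample_prevalences(n, rng):
    """Draw n subtype labels with equal (20%) prevalence; return proportions."""
    k = len(SUBTYPE_EFFECTS)
    labels = rng.choices(range(k), k=n)
    return [labels.count(subtype) / n for subtype in range(k)]


def simulate(n, n_experiments, rng):
    """Spread (SD across experiments) of subtype 1 prevalence and implied shift."""
    subtype1_prevalence = []
    implied_shift = []
    for _ in range(n_experiments):
        prevalence = sample_prevalences(n, rng)
        subtype1_prevalence.append(prevalence[0])
        # Expected case-control mean difference given this sample's subtype mix.
        implied_shift.append(
            sum(p * effect for p, effect in zip(prevalence, SUBTYPE_EFFECTS))
        )
    return statistics.pstdev(subtype1_prevalence), statistics.pstdev(implied_shift)


rng = random.Random(0)
spread = {}
for n in (20, 200, 2000):
    spread[n] = simulate(n, 500, rng)
    prevalence_sd, shift_sd = spread[n]
    print(f"n = {n:>4}: SD of subtype 1 prevalence = {prevalence_sd:.3f}, "
          f"SD of implied mean shift = {shift_sd:.3f}")
```

Even though the population-level shift is zero by construction, the subtype mix of a small sample can by chance imply a sizeable positive or negative case–control difference, whereas at n = 2000 the implied shift stays close to zero.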

Fig. 4

Simulation showing sampling variability and bias of enrichment of specific strata in small sample size studies. In this simulation we generated a control population (n = 1,000,000) with a mean of 0 and a standard deviation of 1 on a hypothetical dependent variable (DV). We then generated an autism population (n = 1,000,000) with 5 different autism subtypes each with a prevalence of 20% (e.g., n = 200,000 for each subtype). These subtypes vary from the control population in effect size in units of 0.5 standard deviations, ranging from −1 to 1. This was done to simulate heterogeneity in the autism population that is reflective of very different types of effects. For example, the autism subtype 5 shows a pronounced increased response on the DV, whereas autism subtype 1 shows a pronounced decreased response on the DV. Across 10,000 simulated experiments, we then randomly sampled from the autism population sample sizes of n = 20, n = 200, and n = 2000, and computed the sample prevalence of each autism subtype. The ideal result without any bias would be sample prevalence rates of around 20% for each subtype. This 20% sample prevalence is approached at n = 2000, and to some extent at n = 200. However, small sample sizes such as n = 20 show large variability in sample prevalence rates of the subtypes and this can markedly bias the results of a case–control comparison. The code for implementing and reproducing these simulations is available at

Such biases due to sampling variability across subtypes have considerable importance for replicability. To illustrate, we give a simple example indicative of many cases in the current literature. For example, Study 1 may unknowingly possess a sample enriched with specific autism subtypes that show a decreased response on some dependent variable. Study 2 unknowingly has a different autism sample enriched with subtypes that show a contradictory increased response on the same dependent variable. Both studies are published and the authors of each may get into a heated debate, each claiming that the other is wrong. Yet a third study comes out with perhaps a more unbiased (and possibly larger) sample, and given that the overall population effect could be near zero for a case–control comparison (as in the simulations in Fig. 4), this third study finds no difference and claims that both studies 1 and 2 are false positives. While the third study may be the clearest indication of what occurs as an overall case–control effect, this study too may be missing the point completely—the population under investigation is not homogeneous and is stratified. Therefore, each study could have merit, if better contextualized and with some attempt to grapple with issues of heterogeneity. Thus, it is clear from these examples that practices of running case–control studies, utilizing small sample sizes, and not fully confronting the issue of heterogeneity in autism, may compound problems and lead to a conflicted literature and delay scientific progress. Given these considerations, our recommendation is to move away from small-sample case–control models and towards stratified and/or dimensional models that take into account important heterogeneity in autism.

Essential big data characteristics for studying heterogeneity

While the idea of heterogeneity in autism has been around for some time, it is understandable why autism research as a field has made only limited progress. Research on heterogeneity is difficult because datasets large enough to answer such questions are scarce. As the previous discussion of small sample sizes suggests, we would argue that one key ingredient for studying heterogeneity in autism successfully is ‘big data’. When we use the phrase ‘big data’, we are not necessarily referring to the ‘feature’ dimension of the data, that is, massively multivariate ‘feature rich’ data (e.g., neuroimaging or genomics data). Feature-rich aspects of big data are indeed important in their own right and for the purposes of understanding heterogeneity. Rather, the dimensions we would emphasize about big data are the participant dimension (i.e., large sample size) and the depth of the measured features embedded in the participant dimension. Put another way, we need big data that have characteristics of being both ‘broad’ and ‘deep’ [77] (Fig. 1c).

Broad data refer directly to the participant or sample dimension of the dataset (as opposed to the feature dimension) and are characterized by massive sample size. Such a broad spread over individuals should ideally provide good coverage over the population of interest and allow for sufficient sampling of each stratum of interest. Broad data are an essential ingredient for decomposing heterogeneity in autism since we can run into many problems with data that are not sufficiently large or do not allow for such broad coverage over the population. Sufficiently broad data can also open up opportunities for replicating findings, since experimental designs can be planned ahead of time to set aside a sufficiently large validation set to replicate findings from an initial broad discovery set. As data sharing and open data initiatives become more common, we should see more investigations on heterogeneity that meet this big data requirement. There are some current resources that are immediately available to meet such needs (e.g., the ABIDE datasets [78], the National Database for Autism Research (NDAR) [79], the Simons Simplex Collection [80], SPARK [81], the Healthy Brain Network [82], and see refs. [83, 84]) and we would expect many more in the coming years. As we get better at detecting the relevant dimensions and/or subtypes explaining important heterogeneity in autism, we may be better able to design high-powered targeted studies where the requirements for massive sample size may be reduced substantially. However, for most topics, we are not yet at this stage, and thus broad data (i.e., massive sample size) are necessary.

Developing models to explain aspects of heterogeneity at one level is only the first step. Once we have built good models that explain heterogeneity at one level, we will need to ask the next translational question: ‘What else are these models good for?’ Put differently, stratified or dimensional models can be good at predicting phenomena at one level of analysis, but because autism is heterogeneous at multiple levels, could such models help us make sense of heterogeneity outside the domain that the model was originally built upon? Answering this question can have considerable relevance for precision medicine goals. For instance, a geneticist may have identified a unique biological subtype of autism based around a certain genetic mechanism. Such a genetic stratifier would already be useful for pinpointing a specific discrete cause for some proportion of the autism population. However, working towards precision medicine, we would next want to know whether such a genetic subtype is different from other autistic individuals on clinically relevant aspects such as prognosis, response to treatment, symptomatology, cognition, etc. Thus, when we ask this type of question, we need big data that are not only broad, but also ‘deep’ [77]. Deep data are data collected on the same individuals that penetrate through multiple levels of analysis (Fig. 1c). Deep data allow for stratifications or dimensional models to be built at one level, but the important tests of such stratifications can be done at other levels. An example of this can be seen in recent work on the Simons Simplex Collection. Here the authors made stratifications on the phenotype and then asked the question of whether such stratifications increased power for detecting genome-wide association study-type effects at the genetic level [31]. 
Thus, to best answer questions by utilizing stratified or dimensional models, we will require big data that are both broad and deep, as the combination of both types of data can allow for discovery of explanations of autism heterogeneity and can immediately point towards the utility of such models for explaining the multi-level complexity inherent in autism. New multi-site studies such as EU-AIMS Longitudinal European Autism Project (LEAP) are targeted to directly address both issues of broad and deep data [85,86,87] and we need other efforts along these lines.

Approaches to decomposing heterogeneity in autism: top-down, bottom-up, and chronogeneity

Since the approach to decomposing heterogeneity in autism towards precision medicine goals is one of identifying clinically and mechanistically useful models, it is helpful to make salient some different approaches towards these goals. A common circumstance might be where a researcher makes a stratification at a level higher up in the hierarchy presented in Fig. 1c. The translational next step may be to work down towards understanding how a stratified and/or dimensional model at this higher level of analysis can explain some phenomenon at a lower level. We refer to this as a top-down approach. For example, a clinically important stratification can be made in the early development of autism regarding language outcome at 4–5 years of age. Some children keep up with age-appropriate norms in the areas of expressive and receptive language development, whereas others fall far behind in their language abilities across these domains. The empirical question after making such a stratification could be whether such autism language outcome subtypes differentiate at the level of neural systems organization, particularly neural systems that are developing specialization of function for speech and language processes [22]. More recent work has also shown that variation at the level of gene expression in blood leukocytes is associated with large-scale speech-related functional neural responses. These gene expression-neuroimaging associations are different across autism language outcome subtypes [23]. In this example, it is clear that the stratifications were made at a level of analysis above the level that was later interrogated for mechanistic understanding. Thus, while early language outcome is itself a clinically important stratifier, this top-down work indicates that the stratifier may also be mechanistically useful for pointing towards different underlying biology.
Other examples of a top-down approach may be based on cognitive characteristics [88], sex/gender [76], and co-occurring medical and psychiatric conditions (e.g., epilepsy [89], attention-deficit/hyperactivity disorder (ADHD) [27]). This type of top-down approach may ultimately motivate future work that identifies previously unknown biology behind a subset of the autism population.

In contrast to top-down approaches, an approach that works from the bottom-up could be highly complementary. As the phrase implies, a bottom-up approach starts with identifying and building useful models from a lower level in the hierarchy, and then asks questions about how such low-level models can explain phenomena higher up in the hierarchy. For example, in the ‘genetics first’ approach, an investigator may be interested in identifying how different high-impact genetic causes of autism may be similar or different at a phenotypic or cognitive level of analysis [90,91,92,93]. In another example, an investigator may compare autism subtypes at the level of neural systems or structural brain features (e.g., with or without early brain enlargement), and then ask the question of whether such a stratification provides a meaningful indicator of differentiation at a clinical level [14]. Both top-down and bottom-up approaches can be useful, depending on the particular research question, and each can highlight different aspects of important heterogeneity in autism. In order to link up such multi-level complexity into explanations behind heterogeneity in autism, it will be imperative to have work from both approaches.

A final approach to decomposing heterogeneity deals with the lifespan developmental dimension across any level of analysis, or ‘chronogeneity’ [38]. Several large longitudinal studies consistently indicate that there are several autism subtypes with different developmental trajectories [16,17,18, 38, 94]. Regression, a developmental feature seen in autistic individuals, is another key stratifier that is surprisingly under-studied but with plausible unique biological bases [95, 96]. Within the developmental dimension, heterogeneity can be assessed as both inter- and intra-individual variability, but can also cover individualized deviance from group trajectories over time- [38] or age-specific norms [97, 98]. Chronogeneity thus offers a unique vantage point on multi-level heterogeneity not covered by understanding heterogeneity at static time points.

Approaches to decomposing heterogeneity in autism: supervised versus unsupervised

In addition to conceptualizing stratified and/or dimensional models by top-down, bottom-up, or developmental approaches, it is also important to clarify how we build on the process of understanding heterogeneity. Ultimately, the scientific process of better understanding heterogeneity in autism is a learning problem. Taking ideas from statistical or machine learning, we can broadly divide learning processes into supervised and unsupervised learning [99]. Supervised learning deals with a priori knowledge about a topic (i.e., known labels), and then seeks to derive a model to best predict that known information. With regard to the process of understanding heterogeneity in autism, the analogy of supervised learning can apply to all instances where the experimenter uses their own knowledge and justifications to dictate where the stratifications are made (e.g., top-down, bottom-up, or developmental). In other words, knowledge from a supervised source (e.g., an investigator, a theory) informs the stratification or dimension to be modeled. This type of approach has the advantage of being theory driven and/or builds on expert knowledge of the investigator (e.g., clinical intuition or experience), who may already have highlighted a distinction that is meaningful and justified in a variety of ways.

The disadvantage of solely relying on a ‘supervised’ approach is that the investigator and/or a theory may be missing other important distinctions about how to model heterogeneity for the question of interest. In this case, the learning process can be helped by some type of ‘unsupervised’ statistical learning process that uncovers distinctions that may not be readily apparent from a priori knowledge. Because big data are a key ingredient for building models to explain heterogeneity, we can utilize the feature-rich aspects of big data to embark on data-driven discovery of potentially complex multivariate patterns that distinguish different types of individuals. We refer to this data-driven approach as an ‘unsupervised’ approach since computationally, the learning occurs without any expert a priori knowledge and justifications and solely relies on statistical distinctions embedded in the data itself. With this approach we likely rely on advanced computational techniques from machine learning that are tailored to best identify complex multivariate distinctions. For example, we utilized clustering methods taken from systems biology and applied them to item-level patterning of behavioral responses on the RMET. This unsupervised approach yielded discovery of five different autism subtypes that could be replicably identified in an independent replication set (Fig. 2) [25]. In other work, Ellegood et al. [100] applied clustering to neuroanatomical phenotypes across a range of different mouse models for autism. This work illustrated that heterogeneous starting points (e.g., different genetic mutations highly associated with autism) can converge and diverge at the level of neuroanatomical phenotypes [100]. Using structural magnetic resonance imaging measures of cortical morphometry, Hong et al. [12] used clustering to identify three autism subtypes with different anatomical profiles. 
These anatomically defined subtypes were then found to be useful for increasing the performance of supervised learning models to predict symptom severity on measures such as the Autism Diagnostic Observation Schedule (ADOS) [12].
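To make the idea of unsupervised stratification concrete, the sketch below clusters synthetic two-feature data and then assigns a new individual to the nearest discovered subgroup. It is illustrative only: plain k-means is swapped in for simplicity and is not the method used in the studies cited above, the data are synthetic, and all function names (`make_synthetic`, `kmeans`, `best_of`, `assign`) are hypothetical.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def make_synthetic(n_per_group=50, seed=0):
    """Synthetic individuals from two hypothetical subgroups with two features each."""
    rng = random.Random(seed)
    centers = [(-2.0, -2.0), (2.0, 2.0)]
    return [
        (rng.gauss(cx, 0.5), rng.gauss(cy, 0.5))
        for cx, cy in centers
        for _ in range(n_per_group)
    ]


def kmeans(points, k, n_iter=100, seed=0):
    """Plain k-means; returns (centroids, labels). No labels are used for fitting."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(n_iter):
        # Assignment step: each individual joins the nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: sq_dist(p, centroids[j])) for p in points
        ]
        if new_labels == labels:
            break
        labels = new_labels
        # Update step: each centroid moves to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return centroids, labels


def within_ss(points, centroids, labels):
    """Total within-cluster sum of squares for a given solution."""
    return sum(sq_dist(p, centroids[lab]) for p, lab in zip(points, labels))


def best_of(points, k, n_restarts=10):
    """Run k-means from several starts and keep the tightest solution,
    since a single start can settle in a poor local optimum."""
    runs = [kmeans(points, k, seed=s) for s in range(n_restarts)]
    return min(runs, key=lambda run: within_ss(points, *run))


def assign(point, centroids):
    """Assign a new (e.g., replication-set) individual to the nearest centroid."""
    return min(range(len(centroids)), key=lambda j: sq_dist(point, centroids[j]))
```

In a discovery/replication design, one plausible use is to fit the centroids on the discovery set only and then call `assign` for each individual in the independent replication set, so that the subgroup structure is tested on data that played no part in defining it.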

It should be noted that both supervised and unsupervised approaches have their advantages and disadvantages, and can be complementary. An example of this complementarity can be seen in a hybrid supervised–unsupervised approach from Feczko et al. [15]. In this study, the authors utilized a supervised ensemble learning model called a Functional Random Forest (FRF) model to classify autism versus typically developing children based on cognitive features from a neuropsychological test battery. In addition to classifying autism versus typically developing children, the FRF model produces a proximity matrix that indicates similarity between individuals. The authors then utilized this proximity matrix to identify subgroups in an unsupervised manner utilizing a community detection algorithm, typically used in network science to discover ‘modules’. This hybrid approach to cognitive subtyping proved useful for identifying different patterns of resting state functional connectivity across the subtypes. Thus, through the scientific process of building knowledge about important stratified or dimensional models, both unsupervised and supervised approaches can inform each other, and in some cases may be utilized together in a hybrid fashion.

Decomposing heterogeneity in relation to transdiagnostic constructs

Although we have so far treated autism as an entity and focused on heterogeneity within it, this diagnostic construct is human-made, cumulative, and evolving [101, 102]. Phenotypically, autism frequently co-occurs with other neurodevelopmental (e.g., ADHD, tic disorders) and psychiatric (e.g., anxiety, depression, obsessive-compulsive disorder, psychotic disorders) conditions [1, 103], and heightened autistic traits often cut across other categorical diagnoses as well [104]. Underlying this may be multi-level processes cutting across sets of frequently co-occurring diagnoses [105], which can potentially be delineated by transdiagnostic approaches such as the RDoC framework [47]. In this respect, we should acknowledge that heterogeneity in autism is part of the broader heterogeneity existing across neurodevelopmental and (physical and mental) health conditions. In the same vein, the reasons, principles, and approaches described above to tackle heterogeneity in autism can be similarly applied when autism is studied within a transdiagnostic framework cutting across multiple diagnoses. Against the background of high co-occurrence, a transdiagnostic framework is necessary for deepening our understanding of the heterogeneity within and beyond autism.

Challenges for big data approaches in autism science and clinical practice

With all these advantages of big data in mind, we acknowledge that this is easier said than done. There are key practical challenges to be overcome. First, conducting studies with very large sample sizes is challenging, and perhaps only the most well-funded laboratories and/or consortiums can regularly conduct such work. In a situation where we are investigating phenomena with stratified models, this problem is magnified since one now needs large sample sizes within each stratum being investigated. The practical issues are further compounded by the need to replicate, which is essential for building confidence in identified effects. Second, broad and deep data are both desirable, but there are inevitable tradeoffs when considering feasibility. Current initiatives to collate existing data from smaller-scale studies (e.g., ABIDE [78] and NDAR [79]) have stimulated the field of autism research to move towards broad data, and there are increasing consortium efforts taking a prospective, coordinated data acquisition protocol to synchronize the acquisition of broad and deep data (e.g., EU-AIMS LEAP [85,86,87], the Healthy Brain Network [82], the POND Network [106]). Continuous exchange across research teams to establish shared methodologies and measurements is critical. Yet for the field to move forward, it is also important to sustain flexibility and openness in incorporating new findings, methodologies, and measures. This matters especially because the samples enrolled in future research must be more representative of the autistic population at large, that is, truly diverse and inclusive (e.g., in terms of age and life stage, ethnic background, genetic make-up, socioeconomic status, cultural context, sex, and gender), in order to represent the full spectrum of individuals around the globe.
Fundamental to these large-scale and long-term efforts is advocacy for funding support that encourages coordinated study designs and data merging efforts to achieve broad and deep data.

In the meantime, we believe there is still room for ‘smaller science’ in the era of big data. Contributions towards delineating heterogeneity could still be made by studies with moderately sized but adequately powered samples. We do not define ‘moderately sized samples’ in absolute terms (e.g., some rule-of-thumb sample size that can be applied irrespective of the context). Rather, what counts as a moderately sized sample will need to be defined for each research context. However, at the very least we do intend ‘moderately sized’ to mean sample sizes that are sufficiently statistically powered (e.g., >80%) for reasonably sized effects of interest (e.g., medium or large effects). Such sample sizes are different from what we consider ‘large sample sizes’, which would be situations where the sample size offers more than enough statistical power (e.g., 90–100%) even for very small effect sizes and likely homes in on point estimates of the population effect size with high precision. These moderately sized studies can make substantial progress in autism research in several ways. First, such studies could focus on examining well-defined subgroups in the autism spectrum, derived either from hypothesis-driven strata (e.g., individuals with a specific behavioral profile, neurobiological status, developmental characteristic, or etiological factor) or strata discovered via prior big (broad) data studies. In this scenario of moderately sized studies, case–control models could be meaningful with evidence of independent replication. However, such studies will likely yield more information if they are also stratified and/or use dimensional models to capture aspects of important heterogeneity within autism. Such studies could help isolate effects specific to subsets of autism where the effects are larger than the small effects typically found in case–control studies.
Studies like these could help canalize research in specific directions towards better understanding such medium or large effects. Second, moderately sized studies could focus on well-defined mechanisms in a hypothesis-driven manner, or conduct clinical trials that target specific mechanisms (instead of treating autism overall as a single category driven by a ubiquitous cause). In these scenarios, moderately sized studies are not broad, but they could dive into deep data as a way to reveal more mechanistic insight and connect multiple levels of analysis. In sum, practical limitations likely require the field to alternate between investigations that are large-N and broad and more moderately sized studies that feature deep characteristics of the data. This strategy may facilitate future work until opportunities arise that truly allow for big data that are both broad and deep.
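The power thresholds mentioned above can be made concrete with a standard textbook approximation. The sketch below is one possible illustration, not a procedure taken from this article. It assumes a two-sided, two-group comparison of means with equal group sizes and uses the normal approximation n ≈ 2(z₁₋α/₂ + z₁₋β)² / d² per group, where d is the standardized effect size. The function name `n_per_group` is hypothetical.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(d, power=0.80, alpha=0.05):
    """Approximate per-group sample size for a two-sided, two-sample comparison.

    Assumption: normal approximation with equal group sizes; an exact
    t-test calculation gives slightly larger values.
    """
    z = NormalDist()  # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the two-sided test
    z_beta = z.inv_cdf(power)           # quantile matching the target power
    return ceil(2 * (z_alpha + z_beta) ** 2 / d ** 2)


if __name__ == "__main__":
    for label, d in (("large", 0.8), ("medium", 0.5), ("small", 0.2)):
        print(f"{label} effect (d={d}): about {n_per_group(d)} per group")
```

Under this approximation, 80% power requires about 25 participants per group for a large effect (d = 0.8), 63 for a medium effect (d = 0.5), and 393 for a small effect (d = 0.2). In a stratified design these figures apply to each stratum being compared, which is why stratification multiplies the total sample size required.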

Although big data approaches can move our research closer towards precision medicine goals, it is an even bigger challenge to translate the work into real-world individualized care and support. As in other fields of health care, person-level information that parses heterogeneity and achieves individual-level accuracy as a biomarker or predictor (e.g., BRCA gene mutations, the utility of which comes from big data science in oncology) is only part of the whole decision-making process in health care. Optimal care and support for autistic individuals has to be embedded in a person-centered, lifespan perspective that incorporates shared decision making and collaborative action planning [107]. Big data bring clarity to our understanding of individual differences on the autism spectrum and beyond the spectrum, yet in daily clinical practice, care and support can be improved only when such clarity is integrated with a perspective that respects the individuality of the autistic person and their personal contexts.


Understanding how heterogeneity manifests in autism is among the biggest challenges in our field. As we continue to develop models for explaining this heterogeneity, the organizing concepts laid out here could be useful in synthesizing very diverse areas of research. Heterogeneity must be interpreted relative to the zeitgeist, particularly as it pertains to how diagnostic concepts evolve. Models for explaining heterogeneity manifest in many ways, depending on whether the researcher conceptualizes the differences between individuals as quantitative and dimensional, or qualitative and categorical. There is also room for models that fuse dimensional and categorical distinctions. In general, we need to move beyond one-size-fits-all models such as case–control models, and we need to be stringent with respect to methodology, since practices such as small sample size research cannot live up to the challenges that heterogeneity creates. Small samples cannot adequately cover heterogeneity in the autism population in a highly generalizable fashion, and hence there is a need for ‘big data’ when studying heterogeneity. Big data should be both broad and deep, not only to sample adequately across different strata from the population but also to examine how strata defined at one level may be relevant for explaining variability at other levels. Heterogeneity can be parsed through multiple approaches that capitalize on information from levels of analysis either most proximate or most distal from the clinical phenotype and which work their way down or up through the hierarchy, or via an examination of change across development. Also important for conceptually organizing work on this topic is whether we utilize a priori knowledge to build heterogeneity models or whether we allow computational methods to inform us about data-driven distinctions that may be hidden and not readily apparent to most researchers.
Models to understand heterogeneity can move beyond just those with clinical diagnoses of autism and, in the future, transdiagnostic approaches utilizing similar organizing concepts may provide complementary information. Overall, the push to understand heterogeneity is critical as we attempt to move towards precision medicine, which will need to be embedded in a person-centered, lifespan-informed, shared decision-making and collaborative planning of care to provide holistic support for each unique autistic individual.


  1. Lai MC, Lombardo MV, Baron-Cohen S. Autism. Lancet. 2014;383:896–910.
  2. Buescher AV, Cidav Z, Knapp M, Mandell DS. Costs of autism spectrum disorders in the United Kingdom and the United States. JAMA Pediatr. 2014;168:721–8.
  3. Leigh JP, Du J. Brief Report: Forecasting the economic burden of autism in 2015 and 2025 in the United States. J Autism Dev Disord. 2015;45:4135–9.
  4. Kapur S, Phillips AG, Insel TR. Why has it taken so long for biological psychiatry to develop clinical tests and what to do about it? Mol Psychiatry. 2012;17:1174–9.
  5. Collins FS, Varmus H. A new initiative on precision medicine. N Engl J Med. 2015;372:793–5.
  6. London E. The role of the neurobiologist in redefining the diagnosis of autism. Brain Pathol. 2007;17:408–11.
  7. Waterhouse L, London E, Gillberg C. ASD validity. Rev J Autism Dev Disord. 2016;3:302–29.
  8. Waterhouse L, Gillberg C. Why autism must be taken apart. J Autism Dev Disord. 2014;44:1788–92.
  9. Geschwind DH, Levitt P. Autism spectrum disorders: developmental disconnection syndromes. Curr Opin Neurobiol. 2007;17:103–11.
  10. Happe F, Ronald A, Plomin R. Time to give up on a single explanation for autism. Nat Neurosci. 2006;9:1218–20.
  11. Geschwind DH, State MW. Gene hunting in autism spectrum disorder: on the path to precision medicine. Lancet Neurol. 2015;14:1109–20.
  12. Hong SJ, Valk SL, Di Martino A, Milham MP, Bernhardt BC. Multidimensional neuroanatomical subtyping of autism spectrum disorder. Cereb Cortex. 2018;28:3578–88.
  13. Hahamy A, Behrmann M, Malach R. The idiosyncratic brain: distortion of spontaneous connectivity patterns in autism spectrum disorder. Nat Neurosci. 2015;18:302–9.
  14. Amaral DG, Li D, Libero L, Solomon M, Van de Water J, Mastergeorge A, et al. In pursuit of neurophenotypes: the consequences of having autism and a big brain. Autism Res. 2017;10:711–22.
  15. Feczko E, Balba NM, Miranda-Dominguez O, Cordova M, Karalunas SL, Irwin L, et al. Subtyping cognitive profiles in autism spectrum disorder using a functional random forest algorithm. Neuroimage. 2018;172:674–88.
  16. Lord C, Bishop S, Anderson D. Developmental trajectories as autism phenotypes. Am J Med Genet C Semin Med Genet. 2015;169:198–208.
  17. Pickles A, Anderson DK, Lord C. Heterogeneity and plasticity in the development of language: a 17-year follow-up of children referred early for possible autism. J Child Psychol Psychiatry. 2014;55:1354–62.
  18. Fountain C, Winter AS, Bearman PS. Six developmental trajectories characterize children with autism. Pediatrics. 2012;129:e1112–20.
  19. Bacon EC, Dufek S, Schreibman L, Stahmer AC, Pierce K, Courchesne E. Measuring outcome in an early intervention program for toddlers with autism spectrum disorder: use of a curriculum-based assessment. Autism Res Treat. 2014;2014:964704.
  20. Fein D, Barton M, Eigsti IM, Kelley E, Naigles L, Schultz RT, et al. Optimal outcome in individuals with a history of autism. J Child Psychol Psychiatry. 2013;54:195–205.
  21. Lai MC, Lombardo MV, Suckling J, Ruigrok AN, Chakrabarti B, Ecker C, et al. Biological sex affects the neurobiology of autism. Brain. 2013;136(Pt 9):2799–815.
  22. Lombardo MV, Pierce K, Eyler LT, Carter Barnes C, Ahrens-Barbeau C, Solso S, et al. Different functional neural substrates for good and poor language outcome in autism. Neuron. 2015;86:567–77.
  23. Lombardo MV, Pramparo T, Gazestani V, Warrier V, Bethlehem RA, Carter Barnes et al. Large-scale associations between the leukocyte transcriptome and BOLD responses to speech differ in autism early language outcome subtypes. Nat Neurosci. 2018;21:1680–88.
  24. Kim SH, Macari S, Koller J, Chawarska K. Examining the phenotypic heterogeneity of early Autism Spectrum Disorder: subtypes and short-term outcomes. J Child Psychol Psychiatry. 2016;57:93–102.
  25. Lombardo MV, Lai MC, Auyeung B, Holt RJ, Allison C, Smith P, et al. Unsupervised data-driven stratification of mentalizing heterogeneity in autism. Scientific Reports. 2016.
  26. Stefanik L, Erdman L, Ameis SH, Foussias G, Mulsant BH, Behdinan T, et al. Brain-behavior participant similarity networks among youth and emerging adults with schizophrenia spectrum, autism spectrum, or bipolar disorder and matched controls. Neuropsychopharmacology. 2017.
  27. Aoki Y, Yoncheva YN, Chen B, Nath T, Sharp D, Lazar M, et al. Association of white matter structure with autism spectrum disorder and attention-deficit/hyperactivity disorder. JAMA Psychiatry. 2017;74:1120–8.
  28. Elton A, Di Martino A, Hazlett HC, Gao W. Neural connectivity evidence for a categorical-dimensional hybrid model of autism spectrum disorder. Biol Psychiatry. 2016;80:120–8.
  29. Cholemkery H, Medda J, Lempp T, Freitag CM. Classifying autism spectrum disorders by ADI-R: Subtypes or severity gradient? J Autism Dev Disord. 2016;46:2327–39.
  30. Hu VW, Steinberg ME. Novel clustering of items from the Autism Diagnostic Interview-Revised to define phenotypes within autism spectrum disorders. Autism Res. 2009;2:67–77.
  31. Chaste P, Klei L, Sanders SJ, Hus V, Murtha MT, Lowe JK, et al. A genome-wide association study of autism using the Simons Simplex Collection: does reducing phenotypic heterogeneity in autism increase genetic homogeneity? Biol Psychiatry. 2015;77:775–84.
  32. Hu VW, Sarachana T, Kim KS, Nguyen A, Kulkarni S, Steinberg ME, et al. Gene expression profiling differentiates autism case-controls and phenotypic variants of autism spectrum disorders: evidence for circadian rhythm dysfunction in severe autism. Autism Res. 2009;2:78–97.
  33. Talebizadeh Z, Arking DE, Hu VW. A novel stratification method in linkage studies to address inter- and intra-family heterogeneity in autism. PLoS ONE. 2013;8:e67569.
  34. Hu VW, Addington A, Hyman A. Novel autism subtype-dependent genetic variants are revealed by quantitative trait and subphenotype association analyses of published GWAS data. PLoS ONE. 2011;6:e19067.
  35. Pierce K, Marinero S, Hazin R, McKenna B, Barnes CC, Malige A. Eye tracking reveals abnormal visual preference for geometric images as an early biomarker of an autism spectrum disorder subtype associated with increased symptom severity. Biol Psychiatry. 2016;79:657–66.
  36. Yang D, Pelphrey KA, Sukhodolsky DG, Crowley MJ, Dayan E, Dvornek NC, et al. Brain responses to biological motion predict treatment outcome in young children with autism. Transl Psychiatry. 2016;6:e948.
  37. Wing L. The autistic spectrum: a guide for parents and professionals. London: Constable & Robinson Ltd; 1975.
  38. Georgiades S, Bishop SL, Frazier T. Editorial Perspective: Longitudinal research in autism - introducing the concept of ‘chronogeneity’. J Child Psychol Psychiatry. 2017;58:634–6.
  39. Cicchetti D, Rogosch FA. Equifinality and multifinality in developmental psychopathology. Dev Psychopathol. 1996;8:597–600.
  40. Sanders SJ. First glimpses of the neurobiology of autism spectrum disorder. Curr Opin Genet Dev. 2015;33:80–92.
  41. Vorstman JAS, Parr JR, Moreno-De-Luca D, Anney RJL, Nurnberger JI Jr, Hallmayer JF. Autism genetics: opportunities and challenges for clinical translation. Nat Rev Genet. 2017;18:362–76.
  42. Lai MC, Lombardo MV, Chakrabarti B, Baron-Cohen S. Subgrouping the autism “spectrum”: reflections on DSM-5. PLoS Biol. 2013;11:e1001544.
  43. Constantino JN. The quantitative nature of autistic social impairment. Pediatr Res. 2011;69(5 Pt 2):55R–62R.
  44. Baron-Cohen S, Wheelwright S, Skinner R, Martin J, Clubley E. The autism-spectrum quotient (AQ): evidence from Asperger syndrome/high-functioning autism, males and females, scientists and mathematicians. J Autism Dev Disord. 2001;31:5–17.
  45. Robinson EB, St Pourcain B, Anttila V, Kosmicki JA, Bulik-Sullivan B, Grove, et al. Genetic risk for autism spectrum disorders and neuropsychiatric variation in the general population. Nat Genet. 2016;48:552–5.

    CAS  PubMed  PubMed Central  Google Scholar 

  46. 46.

    Sucksmith E, Roth I, Hoekstra RA. Autistic traits below the clinical threshold: re-examining the broader autism phenotype in the 21st century. Neuropsychol Rev. 2011;21:360–89.

    CAS  PubMed  Google Scholar 

  47. 47.

    Insel T, Cuthbert B, Garvey M, Heinssen R, Pine DS, Quinn K, et al. Research domain criteria (RDoC): toward a new classification framework for research on mental disorders. Am J Psychiatry. 2010;167:748–51.

    PubMed  Google Scholar 

  48. 48.

    Lubke GH, Muthen B. Investigating population heterogeneity with factor mixture models. Psychol Methods. 2005;10:21–39.

    PubMed  Google Scholar 

  49. 49.

    Box GEP. Science and statistics. J Am Stat Assoc. 1976;71:791–9.

    Google Scholar 

  50. 50.

    Verhoeff B. Autism in flux: a history of the concept from Leo Kanner to DSM-5. Hist Psychiatry. 2013;24:442–58.

    PubMed  Google Scholar 

  51. 51.

    Kanner L. Autistic disturbance of affective contact. Nerv Child. 1943;2:217–50.

    Google Scholar 

  52. 52.

    Eisenberg L, Kanner L. Childhood schizophrenia; symposium, 1955. VI. Early infantile autism, 1943-55. Am J Orthopsychiatry. 1956;26:556–66.

    CAS  PubMed  Google Scholar 

  53. 53.

    Asperger H. Die autistischen Psychopathen im Kindesalter. Arch Psychiatr Nervenkr. 1944;117:76–136.

    Google Scholar 

  54. 54.

    Rutter M, Bartak L. Causes of infantile autism: some considerations from recent research. J Autism Child Schizophr. 1971;1:20–32.

    CAS  PubMed  Google Scholar 

  55. 55.

    Rutter M. Diagnosis and definition of childhood autism. J Autism Child Schizophr. 1978;8:139–61.

    CAS  PubMed  Google Scholar 

  56. 56.

    Wing L. Asperger’s syndrome: a clinical account. Psychol Med. 1981;11:115–29.

    CAS  PubMed  Google Scholar 

  57. 57.

    Wing L, Gould J. Severe impairments of social interaction and associated abnormalities in children: epidemiology and classification. J Autism Dev Disord. 1979;9:11–29.

    CAS  PubMed  Google Scholar 

  58. 58.

    Wing L. Language, social, and cognitive impairments in autism and severe mental retardation. J Autism Dev Disord. 1981;11:31–44.

    CAS  PubMed  Google Scholar 

  59. 59.

    Wing L. The autistic spectrum. Lancet. 1997;350:1761–6.

    CAS  PubMed  Google Scholar 

  60. 60.

    Zwaigenbaum L, Bauman ML, Choueiri R, Kasari C, Carter A, Granpeesheh D, et al. Early intervention for children with autism spectrum disorder under 3 years of age: recommendations for practice and research. Pediatrics. 2015;136 Suppl 1:S60–81.

    Google Scholar 

  61. 61.

    Wong C, Odom SL, Hume KA, Cox AW, Fettig A, Kucharczyk S, et al. Evidence-based practices for children, youth, and young adults with autism spectrum disorder: a comprehensive review. J Autism Dev Disord. 2015;45:1951–66.

    Google Scholar 

  62. 62.

    French L, Kennedy EM. Annual Research Review: Early intervention for infants and young children with, or at-risk of, autism spectrum disorder: a systematic review. J Child Psychol Psychiatry. 2018;59:444–56.

    PubMed  Google Scholar 

  63. 63.

    Reichow B, Barton EE, Boyd BA, Hume K. Early intensive behavioral intervention (EIBI) for young children with autism spectrum disorders (ASD). Cochrane Database Syst Rev. 2012;10:CD009260.

    Google Scholar 

  64. 64.

    Schreibman L, Dawson G, Stahmer AC, Landa R, Rogers SJ, McGee GG, et al. Naturalistic developmental behavioral interventions: empirically validated treatments for autism spectrum disorder. J Autism Dev Disord. 2015;45:2411–28.

    PubMed  PubMed Central  Google Scholar 

  65. 65.

    McPheeters ML, Warren Z, Sathe N, Bruzek JL, Krishnaswami S, Jerome RN, et al. A systematic review of medical treatments for children with autism spectrum disorders. Pediatrics. 2011;127:e1312–21.

    Google Scholar 

  66. 66.

    Howes OD, Rogdaki M, Findon JL, Wichers RH, Charman T, King BH, et al. Autism spectrum disorder: consensus guidelines on assessment, treatment and research from the British Association for Psychopharmacology. J Psychopharmacol. 2018;32:3–29.

    PubMed  Google Scholar 

  67. 67.

    Muller RA, Amaral DG. Editorial: Time to give up on Autism Spectrum Disorder? Autism Res. 2017;10:10–14.

    PubMed  Google Scholar 

  68. 68.

    Loth E, Murphy DG, Spooren W. Defining precision medicine approaches to autism spectrum disorders: concepts and challenges. Front Psychiatry. 2016;7:188.

    PubMed  PubMed Central  Google Scholar 

  69. 69.

    Baron-Cohen S, Bowen DC, Holt RJ, Allison C, Auyeung B, Lombardo MV, et al. The “Reading the Mind in the Eyes” test: complete absence of typical sex difference in ~400 men and women with autism. PLoS ONE 2015;10:e0136521.

    PubMed  PubMed Central  Google Scholar 

  70. 70.

    Button KS, Ioannidis JP, Mokrysz C, Nosek BA, Flint J, Robinson ES, et al. Power failure: why small sample size undermines the reliability of neuroscience. Nat Rev Neurosci. 2013;14:365–76.

    CAS  Google Scholar 

  71. 71.

    Camerer C, Dreber A, Holzmeister F, Ho TH, Huber J, Johannesson M, et al. Evaluating the replicability of social science experiments in Nature and Science between 2010 and 2015. Nat Human Behav. 2018;2:637–44.

    Google Scholar 

  72. 72.

    Reddan MC, Lindquist MA, Wager TD. Effect size estimation in neuroimaging. JAMA Psychiatry. 2017;74:207–8.

    PubMed  Google Scholar 

  73. 73.

    Watkins EE, Zimmermann ZJ, Poling A. The gender of participants in published research involving people with autism spectrum disorders. Res Autism Spectr Disord. 2014;8:143–6.

    Google Scholar 

  74. 74.

    Werling DM. The role of sex-differential biology in risk for autism spectrum disorder. Biol Sex Differ. 2016;7:58.

    PubMed  PubMed Central  Google Scholar 

  75. 75.

    Lai MC, Lerch JP, Floris DL, Ruigrok AN, Pohl A, Lombardo MV, et al. Imaging sex/gender and autism in the brain: Etiological implications. J Neurosci Res. 2017;95:380–97.

    CAS  PubMed  Google Scholar 

  76. 76.

    Lai MC, Lombardo MV, Auyeung B, Chakrabarti B, Baron-Cohen S. Sex/gender differences and autism: setting the scene for future research. J Am Acad Child Adolesc Psychiatry. 2015;54:11–24.

    PubMed  PubMed Central  Google Scholar 

  77. 77.

    Zhao Y, Castellanos FX. Annual Research Review: Discovery science strategies in studies of the pathophysiology of child and adolescent psychiatric disorders--promises and limitations. J Child Psychol Psychiatry. 2016;57:421–39.

    PubMed  PubMed Central  Google Scholar 

  78. 78.

    Di Martino A, O’Connor D, Chen B, Alaerts K, Anderson JS, Assaf M, et al. Enhancing studies of the connectome in autism using the autism brain imaging data exchange II. Sci Data. 2017;4:170010.

    PubMed  PubMed Central  Google Scholar 

  79. 79.

    Hall D, Huerta MF, McAuliffe MJ, Farber GK. Sharing heterogeneous data: the national database for autism research. Neuroinformatics. 2012;10:331–9.

    PubMed  PubMed Central  Google Scholar 

  80. 80.

    Fischbach GD, Lord C. The Simons Simplex Collection: a resource for identification of autism genetic risk factors. Neuron. 2010;68:192–5.

    CAS  Google Scholar 

  81. 81.

    Consortium S. SPARK: A US Cohort of 50,000 Families to Accelerate Autism Research. Neuron. 2018;97:488–93.

    Google Scholar 

  82. 82.

    Alexander LM, Escalera J, Ai L, Andreotti C, Febre K, Mangone A, et al. An open resource for transdiagnostic research in pediatric mental health and learning disorders. Sci Data. 2017;4:170181.

    PubMed  PubMed Central  Google Scholar 

  83. 83.

    Al-Jawahiri R, Milne E. Resources available for autism research in the big data era: a systematic review. PeerJ. 2017;5:e2880.

    PubMed  PubMed Central  Google Scholar 

  84. 84.

    Reilly J, Gallagher L, Chen JL, Leader G, Shen S. Bio-collections in autism research. Mol Autism. 2017;8:34.

    PubMed  PubMed Central  Google Scholar 

  85. 85.

    Charman T, Loth E, Tillmann J, Crawley D, Wooldridge C, Goyard D, et al. The EU-AIMS Longitudinal European Autism Project (LEAP): clinical characterisation. Mol Autism. 2017;8:27.

    PubMed  PubMed Central  Google Scholar 

  86. 86.

    Loth E, Charman T, Mason L, Tillmann J, Jones EJH, Wooldridge C, et al. The EU-AIMS Longitudinal European Autism Project (LEAP): design and methodologies to identify and validate stratification biomarkers for autism spectrum disorders. Mol Autism. 2017;8:24.

    PubMed  PubMed Central  Google Scholar 

  87. 87.

    Loth E, Spooren W, Ham LM, Isaac MB, Auriche-Benichou C, Banaschewski T, et al. Identification and validation of biomarkers for autism spectrum disorders. Nat Rev Drug Discov. 2016;15:70–3.

    CAS  PubMed  Google Scholar 

  88. 88.

    Brunsdon VE, Happe F. Exploring the ‘fractionation’ of autism at the cognitive level. Autism. 2014;18:17–30.

    PubMed  Google Scholar 

  89. 89.

    Amiet C, Gourfinkel-An I, Laurent C, Bodeau N, Genin B, Leguern E, et al. Does epilepsy in multiplex autism pedigrees define a different subgroup in terms of clinical characteristics and genetic risk? Mol Autism. 2013;4:47.

    PubMed  PubMed Central  Google Scholar 

  90. 90.

    Bernier R, Golzio C, Xiong B, Stessman HA, Coe BP, Penn O, et al. Disruptive CHD8 mutations define a subtype of autism early in development. Cell. 2014;158:263–76.

    CAS  PubMed  PubMed Central  Google Scholar 

  91. 91.

    Bernier R, Hudac CM, Chen Q, Zeng C, Wallace AS, Gerdts J, et al. Developmental trajectories for young children with 16p11.2 copy number variation. Am J Med Genet B Neuropsychiatr Genet. 2017;174:367–80.

    CAS  PubMed  Google Scholar 

  92. 92.

    Earl RK, Turner TN, Mefford HC, Hudac CM, Gerdts J, Eichler EE, et al. Clinical phenotype of ASD-associated DYRK1A haploinsufficiency. Mol Autism. 2017;8:54.

    PubMed  PubMed Central  Google Scholar 

  93. 93.

    Hudac CM, Stessman HAF, DesChamps TD, Kresse A, Faja S, Neuhaus E, et al. Exploring the heterogeneity of neural social indices for genetically distinct etiologies of autism. J Neurodev Disord. 2017;9:24.

    PubMed  PubMed Central  Google Scholar 

  94. 94.

    Szatmari P, Georgiades S, Duku E, Bennett TA, Bryson S, Fombonne E, et al. Developmental trajectories of symptom severity and adaptive functioning in an inception cohort of preschool children with autism spectrum disorder. JAMA Psychiatry. 2015;72:276–83.

    PubMed  Google Scholar 

  95. 95.

    Parr JR. Does developmental regression in autism spectrum disorder have biological origins? Dev Med Child Neurol. 2017;59:889.

    PubMed  Google Scholar 

  96. 96.

    Gupta AR, Westphal A, Yang DYJ, Sullivan CAW, Eilbott J, Zaidi S, et al. Neurogenetic analysis of childhood disintegrative disorder. Mol Autism. 2017;8:19.

    PubMed  PubMed Central  Google Scholar 

  97. 97.

    Bethlehem RAI, Seidlitz J, Romero-Garcia R, Dumas G, Lombardo MV. Normative age modeling of cortical thickness in autistic males. bioRxiv. 2018.

  98. 98.

    Loth E, Ahmad J, Mason L, Crawley DV, Hayward HL, San Jose Caceres A, et al. Identifying cross-domain cognitive subtypes among children, adolescents, and adults with autism spectrum disorders. San Francisco: International Society for Autism Research; 2017.

    Google Scholar 

  99. 99.

    Hastie TJ, Tibshirani RJ, Friedman JH. The elements of statistical learning: data mining, inference, and prediction. New York: Springer; 2011.

    Google Scholar 

  100. 100.

    Ellegood J, Anagnostou E, Babineau BA, Crawley JN, Lin L, Genestine M, et al. Clustering autism: using neuroanatomical differences in 26 mouse models to gain insight into the heterogeneity. Mol Psychiatry. 2015;20:118–25.

    CAS  PubMed  Google Scholar 

  101. 101.

    Kendler KS. An historical framework for psychiatric nosology. Psychol Med. 2009;39:1935–41.

    PubMed  PubMed Central  Google Scholar 

  102. 102.

    Kendler KS, Zachar P, Craver C. What kinds of things are psychiatric disorders? Psychol Med. 2011;41:1143–50.

    CAS  PubMed  Google Scholar 

  103. 103.

    Gillberg C. The ESSENCE in child psychiatry: Early Symptomatic Syndromes Eliciting Neurodevelopmental Clinical Examinations. Res Dev Disabil. 2010;31:1543–51.

    PubMed  Google Scholar 

  104. 104.

    Levy Y, Ebstein RP. Research review: crossing syndrome boundaries in the search for brain endophenotypes. J Child Psychol Psychiatry. 2009;50:657–68.

    PubMed  Google Scholar 

  105. 105.

    Sonuga-Barke EJ, Cortese S, Fairchild G, Stringaris A. Annual Research Review: Transdiagnostic neuroscience of child and adolescent mental disorders--differentiating decision making in attention-deficit/hyperactivity disorder, conduct disorder, depression, and anxiety. J Child Psychol Psychiatry. 2016;57:321–49.

    PubMed  Google Scholar 

  106. 106.

    Ameis SH, Lerch JP, Taylor MJ, Lee W, Viviano JD, Pipitone J, et al. A diffusion tensor imaging study in children With ADHD, autism spectrum disorder, OCD, and matched controls: distinct and non-distinct white matter disruption and dimensional brain-behavior relationships. Am J Psychiatry. 2016;173:1213–22.

    Google Scholar 

  107. 107.

    Lai MC, Anagnostou E, Wiznitzer M, Allison C, Baron-Cohen S. Evidence-based support for autistic people across the lifespan: maximizing potential, minimizing barriers, and optimizing the person-environment fit. OSF Preprints 2018.

Download references


MVL was supported by a European Research Council (ERC) Starting Grant (ERC-2017-STG 755816). M-CL was supported by the O’Brien Scholars Program within the Child and Youth Mental Health Collaborative at the Centre for Addiction and Mental Health (CAMH) and The Hospital for Sick Children, Toronto; the Academic Scholar Award from the Department of Psychiatry, University of Toronto; the Slaight Family Child and Youth Mental Health Innovation Fund, CAMH Foundation; and the Ontario Brain Institute via the Province of Ontario Neurodevelopmental Disorders (POND) Network. SB-C was supported by the Autism Research Trust and the Medical Research Council during the period of this work. The research was supported by the National Institute for Health Research (NIHR) Collaboration for Leadership in Applied Health Research and Care East of England at Cambridgeshire and Peterborough NHS Foundation Trust. The research was also supported by funding from the Innovative Medicines Initiative 2 Joint Undertaking (JU) under grant agreement No. 777394. The JU receives support from the European Union’s Horizon 2020 research and innovation program, EFPIA, Autism Speaks, Autistica, and SFARI. The views expressed are those of the authors and not necessarily those of the NHS, the NIHR, the Department of Health, or the JU partners.

Author information



Corresponding author

Correspondence to Michael V. Lombardo.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.


Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.


About this article


Cite this article

Lombardo, M.V., Lai, M.-C. & Baron-Cohen, S. Big data approaches to decomposing heterogeneity across the autism spectrum. Mol Psychiatry 24, 1435–1450 (2019).


Further reading