Early detection and intervention in autism spectrum disorder (ASD) are topics of paramount importance because of the enormous potential to capitalize on the brain’s enhanced plasticity during early development as a mechanism to positively impact outcomes [1]. While it is becoming increasingly clear that the biology of autism starts in early prenatal development [2, 3] and that early behavioral signs begin to manifest before 18 months [4,5,6], the mean age of diagnosis is still lagging far behind at 3–4 years of age [7, 8]. In contrast to this reality, we have recently shown that diagnostic stability at much earlier ages is indeed high [9], and thus the ability to detect and start treatment earlier is feasible [10]. Some have suggested that detection and intervention before 24 months are key in order to capitalize on early neuroplasticity to facilitate optimal outcomes [10,11,12]. The impact of starting intervention earlier would likely be more total positive gains for the child (indexed by the absolute level of improvement). However, a less obvious, but perhaps equally important effect of earlier intervention could be a decrease in the variability of treatment responses at a group level. If this were the case, the reduction in treatment response variability might allow for more precise predictions about treatment outcomes.

Understanding the ingredients that moderate and predict early intervention treatment response is of the utmost importance, especially given the current state of the field, where there is notably large heterogeneity in how children may respond to treatment [13]. While the field has noted that some early interventions have an impact at a group level [14, 15], what is less clear is how to predict an individual’s specific response to the treatment and how to make that prediction before treatment begins. Understanding individual-level predictors of treatment response, particularly pre-treatment individual characteristics, is a key objective for precision medicine [16, 17] applied to autism. Ideally, we would like to know what child-specific characteristics are present before an intervention starts, in order to help us optimally predict how that specific intervention may affect the child. There are indications that some pre-treatment characteristics such as level of play, language, social cognitive abilities, IQ, ASD symptom severity, and adaptive behavior may be important for moderating treatment response [11, 18,19,20,21,22,23]. In contrast to the many clinical studies that have been carried out on these phenotypic characteristics, biological predictors of treatment responses remain largely unknown, leaving open the possibility that individual intrinsic biological characteristics of a child may also moderate their response to treatment. If we better understood such treatment-relevant and individualized biological characteristics, this might yield unique insights into how and why some treatments work better for some children, but not others.

In this work, we examine the effect of relatively early (<24 months) versus later (≥24 months) treatment start and how this may affect total gains and variability in treatment response. We also investigate whether pre-treatment standardized clinical behavioral measures and blood leukocyte gene expression patterns moderate how quickly a child will respond to early intervention. We operationalize treatment response here as the rate at which children respond over time and will refer to this concept from here on as “treatment slopes.” Blood leukocyte gene expression offers up a powerful in-vivo alternative to clinical behavioral measures, as it helps to map out biological mechanisms of brain relevance but in a peripheral non-neural tissue. While the brain is largely an inaccessible tissue to assay mechanisms like gene expression in living patients, blood leukocyte gene expression has revealed a number of interesting brain-relevant characteristics that can be related to different phenotypes in living patients. Leukocyte expression patterns can be used in a classifier to predict diagnostic status [24], correlate with total brain size [25], and are related to large-scale functional neural system response to speech [26], the patterning of thickness and surface area in the cerebral cortex [27], and social symptom severity [28]. Differentially expressed genes in blood leukocytes are part of extended gene networks that are linked to highly penetrant ASD-related mutations [28]. Another revelation is that blood leukocyte genes associated with autism tend to be within a class of broadly expressed genes that are highly expressed in the brain and many other tissues [26]. Broadly expressed genes are one class of important ASD-associated genes that primarily have peak levels of expression during prenatal development [3, 29]. Given the sensitivity of blood leukocyte gene expression activity as a tool for assessing the living biology behind ASD toddlers [2], we reasoned that there may be pre-treatment gene expression patterns in ASD toddlers that may be predictive of treatment slopes.

In this study, we used the least absolute shrinkage and selection operator (LASSO) regression [30] to model how clinical behavioral measures or gene expression patterns may be predictive of treatment slopes. LASSO is an important modeling strategy here for its use of L1 regularization, which acts to penalize largely uninformative features and results in a sparse solution that allows the user to isolate the specific subset of features that are highly predictive. To better understand treatment-relevant genes, we ran further analyses to test if these genes highly interact at the protein level, whether they overlap with known ASD-related genomic and epigenomic mechanisms, and how they are expressed spatially throughout the brain.

Materials and methods


This study was approved by the Institutional Review Board at the University of California, San Diego. Participants and families in this study were recruited as part of a larger multidisciplinary research project examining early neurobiological features and development of ASD at the University of California, San Diego. Toddlers with a high likelihood for an ASD diagnosis were identified from one of two sources: general community referral (e.g., website or outside agency) and a population-based screening method called Get SET Early [6, 10]. Using this population-based screening approach, toddlers with a high likelihood for an ASD diagnosis as young as 12 months were identified in pediatric offices with a broadband screening instrument—the Communication and Symbolic Behavior Scales-Developmental Profile Infant Toddler Checklist [31]. Toddlers were evaluated and tracked every 6 months until their third birthday when a final diagnosis was given. Licensed clinicians with expertise evaluating and diagnosing ASD in toddlers made final diagnoses based on clinical judgment and by incorporating criteria for ASD on the Autism Diagnostic Observation Schedule (ADOS) [9, 32]. Toddlers who were determined to be high likelihood for ASD were offered intervention through our UCSD treatment program. Seventy-two families were referred for intervention, and 49 families chose to receive treatment in our program. Of the 49 children who received treatment from our program, 41 children (33 males, 8 females, mean age at the start of treatment  = 22.77 months, SD age = 4.08, range = 13–27 months) also had a blood sample taken before the start of treatment and were therefore included in analyses for this work. Additional participant pre-treatment clinical information can be found in Table 1. Data from this study have been previously reported in Bacon et al. [33], although this prior paper only focused on treatment and clinical behavioral data and did not examine gene expression.

Table 1 Pre-treatment clinical characteristics.

Early intervention program

In order to reduce confounds that could be associated with differences associated with treatment type and administration, all toddlers received the same in-home treatment program using the Strategies for Teaching Based on Autism Research (STAR) curriculum [34]. The STAR program is a comprehensive behavioral intervention program with a curriculum designed specifically for children with ASD and includes instructional strategies of Discrete Trial Training [35,36,37], Pivotal Response Training [38, 39], and teaching in Functional Routines [40, 41]. In an effort to improve the developmental appropriateness of the curriculum for these very young children, the STAR curriculum was augmented with developmental approaches applied through Project ImPACT. Project ImPACT is a manualized curriculum developed by Ingersoll and Dvortcsak [42] used to target social-communication goals in young children with ASD. Project ImPACT focuses on the relationship between adult responsivity and children’s social-communicative development. In the Project ImPACT curriculum, an early childhood interventionist (ECI) combines naturalistic behavioral strategies and developmental strategies. For example, the interventionist would respond to all communicative attempts by the child as if they were purposeful and recast expanded communication to facilitate communicative growth.

Treatment delivery

Each child received 6–12 hours per week (mean = 9.01, SD = 1.53) of direct one-on-one intervention with a trained ECI at home until 36 months of age. ECIs were bachelor’s degree or undergraduate-level research assistants with previous experience with young children with ASD. Each ECI received extensive didactic and hands-on training in behavioral principles and the STAR and Project ImPACT programs discussed above. Fidelity of implementation was reached for each intervention strategy as determined by using all components of the intervention correctly at least 80% of the time across two different children and monitored for maintenance. Programs were developed and supervised by master’s degree-level clinicians (i.e., in-home coordinators) experienced in ASD, with oversight from two doctorate-level clinical psychologists with extensive experience in early behavioral intervention for this population. In addition, parent coaching was provided throughout the course of participation.

Treatment outcome measure

The Adapted Student Learning Profile (aSLP) is a curriculum-based assessment for determining student learning goals and was adapted from the STAR curriculum to include additional goals from the Project ImPACT curriculum (see [34, 42]). The aSLP provides an extensive list of skills targeted in the STAR and Project ImPACT curricula and allows for the assessor to indicate the child’s performance level on each skill across six domains: receptive language, expressive language, spontaneous language, functional routines, pre-academic concepts, and play and social interaction concepts. Data were analyzed using total aSLP scores across all domains rather than separate domain aSLP scores. The aSLP is administered by presenting each item up to five times to the child and observing the child’s response. This is conducted in a structured format, and no teaching occurs during the assessment. The assessor then rates the child’s response, indicating if the child did not demonstrate the skill or showed a partial demonstration of the skill or mastery of the skill. The entire aSLP takes ~30–45 minutes to complete. Each child’s in-home coordinator completed an aSLP at intake and every 3 months thereafter to determine performance and progress. A child’s performance was estimated by the subject-specific slope estimated in a linear mixed-effect model for modeling on the longitudinal aSLP scores (see section on “Developmental trajectory analyses”).

Pre-treatment clinical behavioral measures

Pre-treatment clinical behavioral measures were collected to characterize the sample and utilized for analyzing how predictive such pre-treatment clinical measures (measured at treatment start) were of subsequent treatment slopes. The clinical measures analyzed were the Mullen Scales of Early Learning (MSEL), the Vineland Adaptive Behavior Scales (2nd edition; VABS), and the ADOS. The MSEL assesses the developmental functioning of children between birth and 68 months [43]. An examiner measures child functioning level through a series of play-like tasks over five domains: gross motor, fine motor, receptive language, expressive language, and visual reception skills. For each scale, the assessment derives a T-score with a mean of 50 and standard deviation of 10, a percentile score, and an age equivalent score indicating at what developmental age the child is performing. An early learning composite (ELC) score is calculated from the total of scores on all scales (with the exception of the gross motor scale) with a mean of 100 and standard deviation of 15. The VABS provides a measure of adaptive skills used to cope with challenges of daily living [44]. A caregiver completes a questionnaire regarding the individual’s current level of functioning across five domains: communication, daily living skills, socialization, motor skills, and maladaptive behavior. All scales use standard scores with a mean of 100 and a standard deviation of 15, a percentile score, and an age equivalent score indicating at what developmental age the individual is performing. Scores on all scales are combined to obtain an overall adaptive behavior composite (ABC) with a mean of 100 and a standard deviation of 15.

Developmental trajectory analyses

To estimate aSLP trajectories for each toddler, we used a linear mixed-effect model to estimate longitudinal subject-specific intercepts and slopes as random effects. The subject-specific slopes (from here on called “treatment slopes”) estimated from this model were extracted and used as the primary treatment outcome measure to be predicted by pre-treatment gene expression or clinical measures. These analyses were computed using the lme function from the nlme library in R.

To better understand the effects of age at treatment start, we used 24 months as the cutoff point for distinguishing very early versus later treatment start. This very early (<24 months) versus later distinction at 24 months (≥24 months) was made given that it is considered that the first 24 months of life are the critical early window for when early intervention could have the most impact [11, 12]. Early (<24 months) versus later (≥24 months) start groups were not different with regard to treatment intensity (i.e., average number of hours per week in treatment; F(1,39) = 0.004, p = 0.94; <24 months mean = 8.91, SD = 1.65; ≥24 months mean = 9.14, SD = 1.43) or with regard to general pretreatment developmental ability (i.e., pretreatment Mullen Early Learning Composite; Welch’s t(36.21) = −1.19, p = 0.23; <24 months mean = 76.50, SD = 14.29; ≥24 months mean = 70.74, SD = 16.25). Linear mixed-effect models were used to examine differences in the aSLP as a function of very early (<24 months) versus later (≥24 months) start group. The linear mixed-effect model included treatment start group (Early, <24 months of age at the start of treatment; Later, ≥24 months of age at the start of treatment), age, the interaction between age and treatment start group, number of days in treatment, treatment intensity (average number of hours per week), and pre-treatment Mullen Early Learning Composite as fixed effects and subject-specific age slopes and intercepts as random effects. We also investigated how variability in treatment slopes may differ between very early versus later start groups by computing the standard deviation of treatment slopes within each group and then quantifying the difference in standard deviation, computed as the difference score between later versus very early start groups. To test the standard deviation difference between groups against the null hypothesis of no difference in standard deviation difference score, we computed standard deviation difference scores over 10,000 random permutations of the very early or later start group labels, to derive a null distribution of standard deviation difference scores. A p value was then computed as the percentage of times under the null distribution that a standard deviation difference score was greater than or equal to the actual standard deviation difference score.

Blood sample collection, RNA extraction, quality control, and sample preparation

Four to six milliliters of blood was collected into EDTA-coated tubes from toddlers on visits when they had no fever, cold, flu, infections or other illnesses, or use of medications for illnesses 72 hours prior to blood draw. Temperature was also taken at the time of blood draw. Blood samples were passed over a LeukoLOCK filter (Ambion, Austin, TX, USA) to capture and stabilize leukocytes and immediately placed in a −20 °C freezer. Total RNA was extracted following standard procedures and manufacturer’s instructions (Ambion, Austin, TX, USA). LeukoLOCK disks (Ambion, Cat #1933) were freed from RNA-later and Tri-reagent (Ambion, Cat #9738) was used to flush out the captured lymphocyte and lyse the cells. RNA was subsequently precipitated with ethanol and purified through washing and cartridge-based steps. The quality of messenger RNA samples was quantified by the RNA integrity number (RIN), with values of 7.0 or greater considered acceptable [45], and all processed RNA samples passed RIN quality control. Quantification of RNA was performed using Nanodrop (Thermo Scientific, Wilmington, DE, USA). Samples were prepped in 96-well plates at the concentration of 25 ng/µl.

Gene expression and data processing

RNA was assayed at Scripps Genomic Medicine (La Jolla, CA, USA) for labeling, hybridization, and scanning using the Illumina BeadChips pipeline (Illumina, San Diego, CA, USA) per the manufacturer’s instruction. All arrays were scanned with the Illumina BeadArray Reader and read into Illumina GenomeStudio software (version 1.1.1). Raw data were exported from Illumina GenomeStudio, and data preprocessing was performed using the lumi package [46] for R (http://www.R-project.org) and Bioconductor (https://www.bioconductor.org) [47]. Raw and normalized data are part of larger sets deposited in the Gene Expression Omnibus database (GSE42133; GSE111175).

Patient gene expression dataset

A larger primary dataset of blood leukocyte gene expression was available from 383 samples from 314 toddlers within the UC San Diego cohort, with the age range of 1–4 years old. The samples were assayed using the Illumina microarray platform in three batches. The datasets were combined by matching the Illumina Probe ID and probe nucleotide sequences. The final set included a total of 20,194 gene probes. Quality control analysis was performed to identify and remove 23 outlier samples from the dataset. Samples were marked as outliers if they showed low signal intensity (average signal two standard deviations lower than the overall mean), deviant pairwise correlations, deviant cumulative distributions, deviant multidimensional scaling plots, or poor hierarchical clustering, as described elsewhere [25]. This resulted in a final high-quality dataset that included 360 samples from 299 toddlers. High reproducibility was observed across technical replicates (mean Spearman’s correlation of 0.97 and median of 0.98). Thus, we randomly removed one of each of the two technical replicates from the primary dataset. From the subjects in the larger primary dataset, a total of n = 41 also had treatment data; n = 36 from the Illumina HT12 platform along with n = 5 from the Illumina WG6 platform were used in this study. The 20,194 probes were quantile normalized and then variance filtered to leave the top 50% of highly varying probes (i.e., 10,097 probes). Treatment slopes were slightly different as a function of batch (F(2,34) = 3.44, p = 0.047), but were not associated with age at blood sampling (F(1,34) = 0.0009, p = 0.97), sex (F(1,34) = 2.03, p = 0.16), RIN (F(1,34) = 0.22, p = 0.64), or treatment intensity (average number of hours per week in treatment) (F(1,34) = 0.005, p = 0.93). Removal of variance associated with batch, sex, and RIN was achieved by using a linear model to estimate these effects in the training set of each cross-validation (CV) fold. This model computed on the training set was then applied to the test set for removing variance in such covariates.

Predictive modeling of treatment slopes

To predict individual differences in treatment slopes, we used a LASSO regression model [30], which was used as predictors of either multivariate pre-treatment gene expression or clinical measures. LASSO uses L1 regularization (controlled by the lambda (λ) parameter) to shrink beta coefficients of uninformative features and thus reduce or effectively remove the influence of such features on the model. This feature is important for our purposes as we seek to compute a model that predicts treatment slopes but also informs us as to which features (e.g., genes or clinical measures) are most important for the model. For all LASSO modeling to assess the model’s predictive utility, we used leave-one-out CV to partition the data into training and test sets. Within the training set, a 10-fold CV loop is used to estimate the optimal lambda parameter for the model. Cross-validated mean-squared error (MSE) and R2 were computed to evaluate the predictive value of the model. We also used permutation tests (1000 permutations) to randomly shuffle treatment slopes and construct a null distribution of MSE values under the null hypothesis. This null MSE distribution was used to compute a p value, defined as the proportion of times under the null distribution where an MSE value was as low or lower than the observed MSE value with unpermuted treatment slopes.

Protein–protein interaction analysis

The resulting gene list from the LASSO model predicting treatment slopes was then tested for evidence of protein–protein interactions (PPI). This analysis was achieved using the STRING database (https://string-db.org), with all parameters set to the STRING defaults (using all interaction sources and confidence interaction scores of 0.4 or higher). STRING also outputs enrichment results for Gene Ontology, Reactome, KEGG, and UniProt databases.

Autism-associated gene set enrichment analyses

To better link the set of treatment-relevant genes prioritized by the LASSO model, we tested this gene set for enrichment with other lists of genes known from the literature to be associated with autism. For autism-associated genetic mutations, we used genes from SFARI Gene (https://gene.sfari.org) [48] in categories S, 1, 2, and 3 (October 2020 release). For genes with evidence of dysregulated expression in postmortem cortical tissue, we used differentially expressed gene lists from Gandal et al. [49]. At the epigenetic level, we also analyzed genes with evidence for differential histone acetylation in autism in postmortem prefrontal and temporal cortex tissue [50].

Spatial gene expression analyses

To get a better idea of the brain regions that are likely to be maximally affected by treatment-relevant genes prioritized by the LASSO model, we examined how these genes were spatially expressed across the brain using the Allen Institute Human Brain Atlas [51]. Whole-brain gene expression maps for treatment-relevant genes were downloaded in the Montreal Neurological Institute space from https://neurosynth.org/genes/. These gene maps were then input into a whole-brain one-sample t test computed in SPM12 (https://www.fil.ion.ucl.ac.uk/spm/software/spm12/). Thresholding for multiple comparisons was achieved with voxel-wise false discovery rate (FDR) correction set to q < 0.05.


Differences between early versus late treatment start groups

A total of n = 41 toddlers were considered in all further analyses given that they had both gene expression and treatment data. In a first analysis, we examined whether individuals with a very early start to treatment (i.e., <24 months) would result in better outcomes than those who started treatment later (i.e., ≥24 months). As noted above, this early versus late distinction at 24 months was made given that it is considered that the first 24 months of life are the critical early window for when early intervention could have the most impact [11, 12]. For this analysis, we used a linear mixed-effect model that modeled treatment start group (Early, <24 months of age at the start of treatment; Later, ≥24 months of age at the start of treatment), age, the interaction between age and treatment start group, number of days in treatment, treatment intensity (average number of hours per week), and pre-treatment Mullen Early Learning Composite as fixed effects and subject-specific age slopes and intercepts as random effects. Main effects were observed for age (F = 134.09, p = 2.22e − 16) and treatment start group (F = 20.92, p = 5.47e − 5), but there was no interaction between age and treatment start group (F = 1.14, p = 0.28). As shown in Fig. 1A, the treatment start group effect is driven by the early group (<24 months) showing larger total treatment gains than those who start treatment relatively later (≥24 months) and that these effects cannot be explained by factors such as the duration of time in treatment, treatment intensity, or general pre-treatment developmental ability. However, the lack of an age-by-group interaction in predicting treatment slopes indicates that there are no differences in the steepness of the trajectories between early versus late start groups.

Fig. 1: Treatment slopes and relationship with age at treatment start.
figure 1

Panels A and B show trajectories of skill acquisition on the Adapted Student Learning Profile (aSLP) treatment outcome measure. Panel A shows these trajectories for treatment start groups defined by age at treatment start as either early (<24 months, pink) or relatively later (≥24 months, turquoise). Panel B shows the trajectories but with each individual’s data now colored by treatment slopes (colored from blue to red) estimated from a linear mixed-effect model. Higher slopes indicate steeper trajectories and thus a faster rate of skill growth over time, whereas relatively lower slopes indicate less steep trajectories that can be interpreted as relatively slower rates of skill growth over time. Panel C shows treatment slopes for each individual as a function of age at treatment start (color indicates treatment slopes, as shown in panel B). Variability in treatment slopes becomes markedly larger when the age of treatment start occurs after 24 months of age. Panel D shows a null distribution of the difference in standard deviations over 10,000 permutations of random labelings of later (≥24 months) versus early (<24 months) groups. The actual difference in standard deviation between later versus very early start groups is shown by the vertical red line.

While the steepness of treatment slopes does not heavily differ on average between early versus late start groups, it is noteworthy that where the two groups do differ is on the variability in treatment slopes. Figure 1B, C shows a clear distinction between the late start group showing markedly more variable treatment slopes than the early start group. A permutation test further verified that the actual difference in standard deviations between late versus early start groups is highly significant relative to what this standard deviation difference would be under random group labeling (p = 0.004) (Fig. 1D). This result indicates that while treatment slopes remain relatively consistent in their variability before 24 months, after 24 months the treatment slopes become much more variable.

Prediction of treatment slopes with pre-treatment clinical measures

We next examined if pre-treatment clinical behavioral measures could be predictive of treatment slopes. A LASSO model that included all pre-treatment ADOS, Mullen, and VABS subscales was able to significantly predict treatment slopes (mean MSE = 19.87, p = 9.99e − 4, R2 = 0.21) (Fig. 2A). Describing the correlations between treatment slopes and individual pre-treatment clinical measures, we find that all Vineland and Mullen subscales are significantly positively correlated, whereas total ADOS score and ADOS RRB were negatively correlated with treatment slopes (Fig. 2B). These results are largely consistent with the idea from past work that pre-treatment clinical measures can be predictive of later treatment outcomes [11, 18,19,20,21,22,23]. However, as a new perspective on this effect, with longitudinal trajectories measured over more than just two time-points (e.g., pre-treatment and post-treatment), we find that pre-treatment clinical measures can predict how steep an individual’s treatment slope trajectory will be over the course of the treatment.

Fig. 2: Predicting treatment slopes with pre-treatment clinical measures.
figure 2

Panel A shows actual treatment slopes (y-axis) versus predicted treatment slopes from a LASSO model (x-axis) when using pre-treatment clinical measures as features. The color from blue to red indicates actual treatment slope values. Panel B shows the correlation (Pearson’s r) between treatment slopes and each of the pre-treatment clinical measures. The coloring of the bars indicates the −log 10(p value) and bars that pass the vertical dotted line are measures that pass FDR q < 0.05. ADOS Autism Diagnostic Observation Schedule, SC social communication, RRB restricted repetitive behaviors, MSEL Mullen Scales of Early Learning, VR visual reception, FM fine motor, RL receptive language, EL expressive language, VABS Vineland Adaptive Behavior Scales, Comm communication, DL daily living skills, Soc socialization, ABC adaptive behavior composite.

Prediction of treatment slopes with pre-treatment blood leukocyte gene expression data

We next asked if pre-treatment biological characteristics such as multivariate pre-treatment gene expression in blood leukocytes could also predict treatment slopes. Using a similar LASSO regression approach, we find that pre-treatment gene expression can also significantly predict treatment slopes (MSE = 21.67, p = 0.001, R2 = 0.13) (Fig. 3A), albeit to a lesser extent than pre-treatment clinical behavioral variables (e.g., 13% variance predicted with gene expression versus 21% variance predicted with clinical behavioral measures).

Fig. 3: Predicting treatment slopes with pre-treatment blood leukocyte gene expression.
figure 3

Panel A shows actual treatment slopes (y-axis) versus predicted treatment slopes from a LASSO model (x-axis) using pre-treatment blood leukocyte gene expression as features. The color from blue to red indicates actual treatment slope values. Panel B shows the −log 10 p value for the enrichment test (enrichment odds ratio (OR) colored in red) between treatment-relevant genes and ASD-associated gene lists. SFARI ASD refers to genes listed on SFARI Gene (https://gene.sfari.org), where mutations are known to be associated with ASD. DE Upreg or Downreg lists are genes that are (DE in postmortem cortical tissue [49]. ASD DA lists are genes whose histone proteins are DA in postmortem cortical tissue [50]. Bars passing the dotted line indicate gene lists that pass FDR q < 0.05. Panel C shows a graph of the protein–protein interaction (PPI) network of treatment-relevant genes from the LASSO model. Red nodes are genes enriched in UniProt for “acetylation.” Green circles indicate genes whose histone proteins are DA in autism postmortem cortical tissue. Blue circles indicate genes that have high-confidence or syndromic ASD genes in SFARI Gene. DE differentially expressed, DA differentially acetylated, PFC prefrontal cortex, TC temporal cortex.

Next, we investigated which genes were most important in helping the LASSO model make such treatment slope predictions. Because LASSO uses L1 regularization to shrink coefficients of features that are less informative to 0, this allowed us to identify the subset of key genes that contribute to the model’s predictive accuracy. Here we find that LASSO prioritizes 295 genes that help predict treatment slopes. Rather than being a random array of genes, these treatment-relevant genes show evidence of interactions at the protein level, as evinced with a PPI analysis (observed edges = 353, expected edges 306, p = 0.004) (Fig. 3C). Further annotation of this treatment-relevant gene set was done with gene set enrichment analysis. This analysis discovered enriched biological processes such as regulation of protein localization and vesicle-mediated transport. Cellular compartments such as cytosol, intracellular organelle lumen, and cytoplasm were also enriched. With UniProt, we also discovered acetylation as a keyword enrichment (Fig. 3C) (Table 2). Thus, treatment-relevant genes discovered by LASSO likely interact at the protein level and may be involved in processes such as protein localization, vesicle-mediated transport, and acetylation.

Table 2 Gene enrichment analysis.

We next asked if this list of treatment-relevant genes might be associated with genetic mutations associated with autism or with genes that show dysregulated expression or histone acetylation in postmortem cortical tissue. Using gene lists from SFARI Gene [48] as well as a list of differentially expressed genes from Gandal et al. [49], we find no evidence of enrichment in either of these lists. However, we did find the presence of four genes that are either high-confidence and/or syndromic ASD genes in SFARI Gene—KMT2C, CORO1A, FBXO11, and PPP2R5D. KMT2C is noted as a rare de novo loss-of-function variant associated with autism [52,53,54,55,56,57]. CORO1A is a rare de novo loss-of-function variant associated with autism [56] and is located within the well-known ASD-associated CNV region of 16p11.2 [58]. FBXO11 is another rare de novo loss-of-function and missense variant in autism [54, 57] and appears in the autism-associated CNV region of 2p16.3 [59, 60]. PPP2R5D is a known syndromic cause of ASD and rare de novo loss-of-function variant associated with autism [61, 62]. Each of these genes is a member of the PPI network shown in Fig. 3C. Related to the UniProt enrichment in acetylation, we also found significant enrichment with genes that are differentially acetylated in autism postmortem cortical tissue (Fig. 3B and Table 3). Specifically, treatment-relevant genes were enriched for upregulated histone acetylated genes in the prefrontal cortex tissue, but downregulated histone acetylated genes in the temporal cortex. This difference in spatial regions and directionality of the histone acetylation effect could suggest that these treatment-relevant genes may asymmetrically impact differing brain regions. Thus, while these treatment-relevant genes map onto a few genes with known evidence for high-confidence mutations or dysregulated gene expression, they are more strongly linked to genes that show evidence of differential histone acetylation in ASD cortical tissue. This potentially indicates that treatment-relevant biology may be linked to epigenetic changes such as histone acetylation in cortical tissue. Given that early intervention intends to change behavior through reshaping the underlying biology, these links to histone acetylation could potentially provide key novel evidence as to how treatment effects may be moderated by individual molecular characteristics intrinsic to each individual.

Table 3 Enrichments with ASD-relevant gene lists.

Finally, we examined how treatment-relevant genes may be preferentially expressed in specific regions of the human brain. Leveraging spatial gene expression information from the Allen Institute Human Brain Gene Expression Atlas, we looked for which regions showed high levels of expression of these treatment-relevant genes. To do this, we downloaded spatial gene expression maps for all 295 treatment-relevant genes from https://neurosynth.org/genes/. With a one-sample t test in SPM12, we ran a whole-brain analysis to identify brain areas where expression levels were significantly different from 0, correcting for multiple comparisons at voxel-wise FDR q < 0.05 (Fig. 4). Here we find that subcortical areas are highly prominent—particularly the thalamus, striatum, and claustrum. Amongst cortical areas, the most prominent regions are the anterior, middle, and posterior cingulate cortex (ACC, MCC, PC), dorsal and ventral medial prefrontal cortex (dMPFC, vMPFC), dorsolateral prefrontal cortex, ventral premotor cortex (vPMC), somatomotor cortex (SMC), temporoparietal junction (TPJ), planum temporale (PT), inferior parietal lobule (IPL), intraparietal sulcus (IPS), posterior superior temporal sulcus (pSTS), anterior temporal lobe (ATL), middle temporal gyrus (MTG), lateral occipital cortex, and insular cortex (Ins).

Fig. 4: Regional gene expression in the brain for treatment-relevant genes.
figure 4

This figure shows whole-brain analysis results (thresholded at q < 0.05 FDR correction for multiple comparisons) indicating which brain regions show high levels of expression for the treatment-relevant genes. Spatial gene expression was profiled here with the Allen Institute Human Brain Atlas. DLPFC dorsolateral prefrontal cortex, vMPFC ventromedial prefrontal cortex, dMPFC dorsomedial prefrontal cortex, ACC anterior cingulate cortex, MCC middle cingulate cortex, PCC posterior cingulate cortex, vPMC ventral premotor cortex, PT planum temporale, TPJ temporoparietal junction, SMC somatomotor cortex, IPL inferior parietal lobule, pSTS posterior superior temporal sulcus, ATL anterior temporal lobe, MTG middle temporal gyrus, LOC lateral occipital cortex.


In this work, we examined whether pre-treatment clinical behavioral and blood leukocyte gene expression patterns could predict the rate of skill growth in response to early intervention in young toddlers with autism. Congruent with prior studies, pre-treatment clinical behavioral characteristics such as language and communication and nonverbal cognitive ability are indeed helpful for predicting later treatment response [11, 18,19,20,21,22,23], predicting ~21% of the variance in treatment slopes. A novel finding from this work is that pre-treatment gene expression patterns from blood leukocytes are also informative for predicting treatment slopes, predicting ~13% of the variance in treatment slopes. The effect of behavioral variables predicting more variance may not be surprising since such variables are conceptually and theoretically closer to what is being measured as the treatment outcome (e.g., behavioral change on the aSLP). However, the effect that pre-treatment blood leukocyte gene expression can predict treatment slopes at some level is a revelation, given that prior to this work it was unknown whether pre-treatment biological factors such as blood leukocyte gene expression could predict treatment slopes at all.

Examining the gene expression signal that is predictive of treatment slopes more closely, our LASSO modeling approach prioritizes a subset of 295 genes that highly interact at the protein level and which are enriched for biological processes such as acetylation. Expanding on the idea of acetylation as a treatment-relevant biological process, we also discovered that these treatment-relevant genes are enriched for genes that are differentially histone acetylated in postmortem cortical tissue of ASD patients [50]. Because our signature was revealed in blood and not in brain tissue and that the UniProt enrichment of acetylation is not necessarily brain-specific, the evidence that these genes also have a differential impact on histone acetylation in autism cortical tissue is an important cross-tissue correspondence. Given that the central dogma behind the early intervention is to capitalize on an individual’s heightened propensity for neurobiological plasticity and change in early development, these findings suggest that one key to predicting an individual’s propensity for such change may be hidden within individualized and intrinsic biology related to histone acetylation. In other words, predicting early intervention treatment response may hinge critically on how susceptible an individual’s intrinsic biology is to experience- or context-dependent control over the regulation of gene expression. This idea bodes well with general ideas regarding histone acetylation as one of the primary molecular influences over activity-dependent gene expression, which would then subsequently alter experience-dependent learning and memory processes [63] that are critical ingredients of early intervention.

Because our signature was revealed in blood and not in the brain, we additionally tested how these genes are expressed spatially throughout human brain cortical tissue. If the signature was brain-irrelevant, we would not see high levels of expression within specific brain regions. Contrary to this null hypothesis, we discovered that treatment-relevant genes highly express throughout a range of subcortical and cortical areas. Subcortical areas such as the thalamus, striatum, and claustrum all have extensive connections to the various cortical areas implicated [64,65,66,67]. The cortical areas fall within well-known large-scale circuits like the default mode, salience, and somatomotor network. The default mode network is noted for its overlap with regions considered integral for social brain circuitry (e.g., dMPFC, vMPFC, PCC, TPJ, ATL, pSTS) and social-communicative functions linked to the domains affected in ASD [68,69,70,71,72,73,74,75,76]. Other regions relevant to the mirror system are also apparent (e.g., vPMC, Ins, MCC, IPL, IPS, SMC) [77,78,79]. Language-relevant regions are also notable, such as (e.g., PT, MTG, vPMC, ATL) [80, 81]. While speculative, this evidence could be suggestive of the possible impact of treatment-relevant genes on circuitry that plays important roles in cognitive and behavioral domains targeted by early intervention and which are key domains of importance in the early development of autism. Overall, this result corresponds well with the earlier discussed brain-relevant enrichment in differential histone acetylation in autism cortical tissue, for indicating that these genes, although identified in blood, have an important brain-relevant impact.

In addition, we also found that starting treatment before versus after 24 months is a meaningful distinction [11, 12]. Toddlers who started treatment before 24 months showed larger overall gains than those starting treatment after 24 months, even when controlling for the amount of time in treatment, treatment intensity, and general pre-treatment developmental ability. This result is compatible with the main ideas behind why early intervention is crucial before 24 months [1, 10, 12]. Also compatible with the idea that treatment start before versus after 24 months is important, we also discovered that treatment slopes are much more variable once past 24 months of age. The enhanced variability of treatment slopes after 24 months is important, as it underscores how heterogeneity can be magnified with a later treatment start. One implication of this result is that prediction of treatment outcome is a much more difficult task when the child begins treatment after 24 months of age. This is another consideration for why early detection and intervention is key—treatment outcomes tend to be more consistent if treatment begins before 24 months.

There are some caveats and limitations that are necessary to address to interpret the present findings. First, this study is correlational in nature and does not contain a contrast group to compare the STAR program to. As such, interpretation of the effects here as related specifically to treatment-related learning cannot be disentangled from possible maturational effects. The effects here should be interpreted as a prediction of developmental change in the context of an early intervention, but should not be interpreted as predicting change that is specifically driven by elements of the intervention itself per se. Second, the results reported here are associated with a specific evidence-based early intervention program that contains a mixture of elements from various programs (e.g., applied behavioral analysis, pivotal response training) and administered by highly trained providers with systematic probes of fidelity of implementation. The use of the same standardized treatment approach for all participants is a strength of the current study. However, given the variety of different types of early intervention programs available today (e.g., Early Start Denver Model, parent-mediated communication-focused treatment), caution must be taken in generalizing these findings to other intervention programs. A question for future work would be to examine whether these findings extend to other widely used early intervention programs. Third, the aSLP was not administered by blind assessors. In addition, as is true for most curriculum-based assessments, the aSLP is not a standardized psychometric instrument. However, treatment slopes as indexed by the aSLP were highly correlated with other psychometrically well-validated instruments such as the Mullen and Vineland that were administered by blind assessors. This indicates that aSLP has some construct validity for measuring developmental abilities despite these limitations. Fourth, the outcome measure operationalized as the rate of improvement over time is not a commonly used metric to evaluate early intervention response. In most designs, there are time-locked pre- and post-testing measures to evaluate treatment response. However, the rate of response to treatment from multiple longitudinal measurements may be a more sensitive measure of treatment response than a change score sampled at just two points in time. Fifth, while the sample size of this study is moderate to above average for what is typical in most treatment and gene expression studies [82, 83], future work replicating these findings with larger samples is needed. Notably, given recommendations such as the use of a higher alpha threshold for statistical significance (e.g., α = 0.005) [84] for the discovery of novel effects, all primary effects of interest reported here would still survive this more conservative alpha threshold. Another caveat here regarding sample size in gene expression studies is that studies of brain tissue are typically much smaller and deal with RNA quality that is much lower than what is typical in studies using blood samples. In addition, brain tissue studies typically have much larger age ranges spanning toddlerhood to adulthood. Thus, there is much larger age-related heterogeneity that is apparent in brain tissue compared to blood. Combining these caveats regarding higher sample size, higher RNA quality, and less age-related heterogeneity suggests that the current study is typically ahead of the norms within the context of gene expression studies in ASD patients. Finally, future work could examine whether different approaches to merge multiple data modalities such as pre-treatment gene expression and clinical measures might help to better predict treatment slopes. In the current work, we did not investigate this possibility as it is beyond the scope of the current investigation and requires much more sophisticated approaches tailored specifically for multiple modality data fusion, especially in situations where different modalities are high dimensional and/or differ substantially in dimensionality [85, 86].

In conclusion, this work shows the importance of early treatment start ideally before 24 months and also shows for the first time that blood gene expression characteristics can predict how fast toddlers with ASD respond to early treatment. While clinical behavioral variables outperformed gene expression measures, the signal within gene expression is important because it potentially indicates that a key biological ingredient for determining an individual’s treatment outcome is susceptibility to epigenetic change via mechanisms such as acetylation. Understanding how this treatment-relevant biology affects neuroplasticity and experience-dependent learning is a key next step towards how such molecular mechanisms are linked to heterogeneous outcomes in ASD.