Alcohol use effects on adolescent brain development revealed by simultaneously removing confounding factors, identifying morphometric patterns, and classifying individuals

Park, Sang Hyun; Zhang, Yong; Kwon, Dongjin; Zhao, Qingyu; Zahr, Natalie M.; Pfefferbaum, Adolf; Sullivan, Edith V.; Pohl, Kilian M.

doi:10.1038/s41598-018-26627-7

Article
Open access
Published: 29 May 2018

Sang Hyun Park¹^na1, Yong Zhang²^na1, Dongjin Kwon^3,4 (ORCID: orcid.org/0000-0003-4942-9214), Qingyu Zhao³, Natalie M. Zahr^3,4, Adolf Pfefferbaum^3,4, Edith V. Sullivan³ & Kilian M. Pohl⁴

Scientific Reports volume 8, Article number: 8297 (2018)

Abstract

Group analysis of brain magnetic resonance imaging (MRI) metrics frequently employs generalized additive models (GAM) to remove contributions of confounding factors before identifying cohort specific characteristics. For example, the National Consortium on Alcohol and NeuroDevelopment in Adolescence (NCANDA) used such an approach to identify effects of alcohol misuse on the developing brain. Here, we hypothesized that considering confounding factors before group analysis removes information relevant for distinguishing adolescents with drinking history from those without. To test this hypothesis, we introduce a machine-learning model that identifies cohort-specific, neuromorphometric patterns by simultaneously training a GAM and generic classifier on macrostructural MRI and microstructural diffusion tensor imaging (DTI) metrics and compare it to more traditional group analysis and machine-learning approaches. Using a baseline NCANDA MR dataset (N = 705), the proposed machine learning approach identified a pattern of eight brain regions unique to adolescents who misuse alcohol. Classifying high-drinking adolescents was more accurate with that pattern than using regions identified with alternative approaches. The findings of the joint model approach thus were (1) impartial to confounding factors; (2) relevant to drinking behaviors; and (3) in concurrence with the alcohol literature.

Introduction

After birth, the human brain undergoes profound change that continues throughout adolescence and into young adulthood¹. A consensus of cross-sectional and longitudinal magnetic resonance imaging (MRI) studies suggests that cortical gray matter volume declines and the cortical mantle thins^2,3, but white matter volume, microstructural organization, and myelination of fiber tracts increase^4,5, during healthy adolescent development. In this developmentally critical second decade of life, young people commonly engage in risky behaviors, including consumption of alcohol. A recent U.S. survey estimates that 66% of 18-year-olds have drunk alcohol and about 25% report getting drunk⁶. A rising incidence of binge drinking may put developing youth at particularly high risk for deviations from the normal trajectory of brain development⁷. Longitudinal studies of heavy relative to minimal drinking during adolescence report acceleration of gray matter volume shrinkage, attenuation of white matter growth⁸, and decreased fiber integrity⁹. Similar but subtler developmental changes have been detected in youth who drink regularly, if not heavily¹⁰. Despite such reports of quantifiable effects of drinking on normal neurodevelopmental trajectories, weak effects may be difficult to extricate using traditional, hypothesis-driven methods¹¹ and may be enhanced by the use of machine-learning approaches to determine group separating characteristics¹².

In neuroimaging studies, identifying group differences using classification approaches can be straightforward if the groups are of equal sample size and matched with respect to demographic factors such as age, sex, and ethnicity^13,14,15. However, a challenge of many neuroimaging studies is statistical power, particularly given the number of potentially confounding factors¹⁶. For example, the National Consortium on Alcohol and Neurodevelopment in Adolescence (NCANDA)¹⁷, a landmark longitudinal study supported by the National Institute on Alcohol Abuse and Alcoholism and the National Institutes of Health Big Data to Knowledge initiative, has been collecting MRI and neuropsychological data in adolescents and young adults to (1) expand knowledge about normal brain maturation; (2) document changes following initiation of moderate-to-heavy alcohol consumption; and (3) identify imaging markers that predict early-onset alcohol use disorder (AUD). The number of youth with a notable history of alcohol consumption at baseline was small¹⁷. To power this investigation adequately, however, the study also recruited youth with minimal alcohol exposure at baseline who had a high risk for transitioning to the AUD phenotype during the course of the 10-year study.

One popular approach for analyzing unbalanced data sets is to include only subsets of the collected sample matched with respect to basic demographic variables. For example (in support of the first aim of the NCANDA study), age-matched samples selected from another large cohort study, the ‘Pediatric Imaging, Neurocognition, and Genetics’ data set confirmed the longitudinal brain developmental patterns identified in the minimal drinking adolescents of the NCANDA cohort³. Specific to the NCANDA cohort and its second aim, the study also reported smaller and thinner frontal and temporal cortices for the group initiating moderate-to-heavy alcohol consumption relative to the minimally-drinking group. Matching cohorts, however, is not always successful in revealing significant group differences. For example, analyses of diffusion tensor image (DTI) data from demographically-matched subsets of the NCANDA study did not reveal effects of moderate-to-high drinking on DTI metrics (i.e., regional fractional anisotropy, mean diffusivity, axial diffusivity, and radial diffusivity)⁴. This was surprising given evidence that excessive alcohol consumption in adults disrupts white matter microstructure of select fiber systems^18,19,20,21.

An alternative approach to analyzing unbalanced data is to include the entire sample, but to remove the effects of confounding variables before performing group analysis^{12,22,23,24,25,26,27}. Regression approaches, such as the ‘ordinary’ generalized additive model (GAM), remove the effects of confounding factors by first modeling the relationship between the dependent variable (e.g., volume of the corpus callosum) and confounding factors (e.g., age) on a subset of the sample (e.g., minimal alcohol-consuming healthy controls)³, then using that model to remove the effect of confounding factors from each dependent variable so that residuals of the raw metric are used in group analyses. However, GAM often suffers from sensitivity to noise, as demonstrated, for example, by the variance in age associated with peak white matter microstructural maturation⁴. Robust regression claims to address the sensitivity issue by separately modeling the effects of confound and noise in MRI metrics²⁸. While robust regression has often been used in large neuroimaging studies²⁹, the noise model requires a-priori specification, which can reduce the power of the analysis. For example, a cautious threshold for accounting for noise generally results in a robust GAM but the effects of confounding variables are then estimated on a notably reduced sample size. A small sample generally fails to capture comprehensive effects of confounding factors and the resulting GAM is thus likely inaccurate. We hypothesized that typical sequential use of the GAM to isolate the effects of confounding variables on MR metrics would minimize information relevant for distinguishing groups (e.g., adolescents with a drinking history relative to those without a significant drinking history).

To test the hypothesis, we apply a machine learning approach to the baseline NCANDA neuroimaging data set. Our proposed approach, referred to as Joi-GAM-Class (for joint GAM classification), simultaneously determines optimal parameters of (1) a GAM (for removing the influence of confounding factors) and (2) a logistic classifier (for cohort classification). The classifier identifies a subset of variables (i.e., residual scores of imaging metrics) that is most informative for differentiating minimal from regular drinking youth. We refer to this subset of brain measurements as a pattern. To identify a pattern, the classifier’s search for informative brain metrics is constrained to subsets of a certain size, enforced by embedding ‘sparsity constraints’ into the classification model¹². To help with an initial understanding of the method used herein, Fig. 1 presents the output of three approaches analyzing a synthetic data set. Figure 1(a) plots an arbitrary image metric (y-axis) relative to age (x-axis). The green dots represent the imaging metrics of the minimal drinkers and the black ones of the regular drinkers. For both cohorts, the metric is clearly affected by age, a confounding factor also in the NCANDA data. The effects of age outweigh the effects of group when the classifier is applied directly to raw imaging metrics (i.e., not residuals) as the two cohorts are not separated accurately (Fig. 1(b)). Figure 1(c) shows a few samples that are mislabeled by classification based on residual scores of raw imaging metrics, i.e., after the confounding effects of age are removed via robust GAM. The GAM was parameterized based on the imaging metrics of the control group, i.e., the minimal drinkers. As is true with real data, however, the noise associated with raw imaging metrics made it highly unlikely to recover the ‘true’ age effect. Instead, the data allow for a variety of plausible solutions shown schematically in the gray region outlined in Fig. 1(a). Within this set of possible solutions, the robust regression chose the solution that best fits a-priori assumptions. The assumptions were defined through specific settings of the underlying optimization algorithm, which were not specific to the classification task. By contrast, our joint optimization approach selected the GAM model so that the classifier perfectly separated the two cohorts (Fig. 1(d)).
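The contrast between the sequential and joint strategies can be sketched as two optimization problems. The notation below is an illustrative assumption and not the formulation used in the paper: $x_i$ collects the imaging metrics of subject $i$, $z_i$ the confounder terms (e.g., age and socioeconomic status), $y_i \in \{-1,+1\}$ the cohort label, $B$ the GAM coefficients, $(w, b)$ the classifier parameters, $\mathcal{C}$ the control group, and $k$ the pattern size enforced by the sparsity constraint. The exact energy function, including the extension that accounts for GAM accuracy, is not given in this text and is therefore omitted.

```latex
% Sequential: fit the GAM on controls first, then classify the residuals
\begin{aligned}
\hat{B} &= \arg\min_{B} \sum_{i \in \mathcal{C}} \lVert x_i - B z_i \rVert_2^2, \\
(\hat{w}, \hat{b}) &= \arg\min_{\lVert w \rVert_0 \le k,\; b}
  \sum_{i} \log\!\Bigl(1 + \exp\bigl(-y_i\,[\,w^{\top}(x_i - \hat{B} z_i) + b\,]\bigr)\Bigr).
\end{aligned}

% Joint: the GAM coefficients B become free variables of the classification energy
\min_{B,\; \lVert w \rVert_0 \le k,\; b}
  \sum_{i} \log\!\Bigl(1 + \exp\bigl(-y_i\,[\,w^{\top}(x_i - B z_i) + b\,]\bigr)\Bigr)
```

In the sequential form, $\hat{B}$ is fixed before the classifier sees the data; in the joint form, the choice among plausible GAM solutions is guided by the classification objective.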

To complete hypothesis testing, we cross-validate our joint algorithm approach (i.e., Joi-GAM-Class) against alternative implementations using the baseline NCANDA imaging data set. The data set consists of structural MRI and microstructural DTI metrics collected in 705 adolescents: 671 that are minimal (no-to-low) drinking and 34 that are regular drinkers. The GAM is defined with respect to age and socioeconomic status, because these two variables are not matched across the two cohorts (and are therefore confounding variables). To apply cross-validation, a popular method to measure accuracy of machine learning approaches, the total data set is divided into subsets (i.e., folds) in which the cohorts in each subset are matched with respect to demographic factors other than age and socioeconomic status. Each implementation uses one subset for training. The accuracy of patterns identified during training is then evaluated on the second subset to ensure that solutions are not specific to the first subset. This process is repeated with the second subset used for training and first for testing. The test accuracy of classifiers is summarized with accuracy scores, which include measures for testing the resistance of implementations to confounding factors. Furthermore, we compute p-values representing the statistical significance of accuracy scores and patterns identified by each implementation. Here, we are the first to report progress on the third aim of NCANDA (i.e., identify imaging markers that predict early-onset AUD) by presenting patterns of neuromaturation that are impartial to confounding effects (such as age) and correctly classify adolescents who drink regularly.
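The two-fold procedure described above can be sketched as follows. This is a minimal illustration that only stratifies by cohort label; the matching of folds on demographic factors described in the text is not reproduced, and the function names `two_fold_split` and `two_fold_rounds` are hypothetical.

```python
import random

def two_fold_split(labels, seed=0):
    """Split sample indices into two folds, stratified by cohort label."""
    rng = random.Random(seed)
    fold_a, fold_b = [], []
    for cohort in sorted(set(labels)):
        members = [i for i, lab in enumerate(labels) if lab == cohort]
        rng.shuffle(members)
        half = (len(members) + 1) // 2  # first fold gets the extra sample
        fold_a.extend(members[:half])
        fold_b.extend(members[half:])
    return sorted(fold_a), sorted(fold_b)

def two_fold_rounds(fold_a, fold_b):
    """Yield (train, test) index pairs: train on one fold, test on the other, then swap."""
    yield fold_a, fold_b
    yield fold_b, fold_a

# Cohort sizes from the text: 0 = minimal drinker (671), 1 = regular drinker (34)
labels = [0] * 671 + [1] * 34
fold_a, fold_b = two_fold_split(labels)
for train_idx, test_idx in two_fold_rounds(fold_a, fold_b):
    n_regular_test = sum(labels[i] for i in test_idx)
    print(len(train_idx), len(test_idx), n_regular_test)
```

Stratifying by cohort keeps the 34 regular drinkers evenly divided, so each test fold contains enough positive samples to estimate sensitivity.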

In a conference paper³⁰, we first discussed the idea of jointly parameterizing GAM and classification to analyze two independently collected structural MRI data sets of participants ranging in age from 60 to 72 years (N = 74). The first data set contained participants infected with the Human Immunodeficiency Virus (HIV) and affected by HIV-Associated Neurocognitive Disorder as well as demographically matched controls. The second data set, which was matched to the first one, contained individuals diagnosed with Mild Cognitive Impairment and a control cohort. In the conference submission, the GAM was used to remove the effect of acquisition differences between the two data sets and the classifier to identify differences between HIV-Associated Neurocognitive Disorder and Mild Cognitive Impairment. The experiment revealed that our joint approach is more accurate than sequential methods in identifying group differences based on data not ideally constructed for classification. Here, we confirm this finding on the NCANDA data set.

Results

Comparison of Sequential and Joint Approaches

Our experiments on the NCANDA data set revealed that our joint approach Joi-GAM-Class (based on MRI and DTI metrics) was indifferent to confounding factors (i.e., age and socioeconomic status) and more accurate than alternative implementations, listed here:

No-GAM-Class: performed sparsity constrained classification on raw image scores (i.e., omitting GAM); the benchmark for analysis without removing the effects of confounding factors.
Seq-GAM-Class: popular sequential approach first parameterized an ordinary GAM and then performed sparsity-constrained classification.
Seq-GAM_Rob-Class: sequentially executed robust regression and sparse classification; the alternative to Seq-GAM-Class that accounted for image noise.
Joi_STR-GAM-Class: the proposed joint model confined to the structural (STR) MRI metrics; the only other implementation indifferent to the confounding factors.
Joi_DTI-GAM-Class: the proposed joint model confined to microstructural DTI metrics; as with Joi_STR-GAM-Class, this method provided a benchmark for single-image modality analysis.
Joi_OPT-GAM-Class: a simplified version of our proposed joint model suitable for optimizing group separation, but not indifferent to the effects of confounding factors.

Note, Table S2 of the supplement lists these and all other acronyms used throughout the article.

We measured the accuracy of each implementation using two-fold cross-validation. After training each implementation on the training data to classify minimal and regular drinkers, we measured their accuracies on the testing data by reporting sensitivity, specificity, Area Under the receiver operating characteristic Curve (AUC), and ‘normalized-accuracy’. ‘Normalized-accuracy’ computed the accuracy of an implementation in correctly labeling samples while accounting for differences in sample size between the two cohorts. To ensure the indifference of an implementation to the effects of confounding variables, we also reported ‘matched-accuracy’. To compute ‘matched-accuracy’, we first defined a subset of the test data in which the cohorts were matched with respect to all known demographic scores including age, socioeconomic status, and cohort size and then re-computed the normalized-accuracy with respect to this subset. We set a threshold for labeling an accuracy score as significant at p ≤ 0.002 based on a two-tailed Fisher’s exact test³¹ (i.e., the probability of a classifier’s output to be generated by randomly assigning samples to cohorts) or DeLong’s test³² (i.e., the probability of the output of one implementation to be generated by another implementation). This significance threshold was considered conservative as the number of implementations compared herein was small⁴. Unless otherwise stated, significant findings refer to the outcome of the Fisher’s exact test.
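To make the accuracy scores concrete, the sketch below computes sensitivity, specificity, and a normalized accuracy from confusion counts. It assumes that normalized accuracy is the mean of the two per-cohort rates (often called balanced accuracy); the text only states that the score accounts for differences in cohort size, so this definition is an assumption. The confusion counts are hypothetical.

```python
def classification_scores(tp, fn, tn, fp):
    """Return sensitivity, specificity, and normalized accuracy as fractions."""
    sensitivity = tp / (tp + fn)  # share of regular drinkers labeled correctly
    specificity = tn / (tn + fp)  # share of minimal drinkers labeled correctly
    # Assumption: normalized accuracy is the mean of the two per-cohort rates
    normalized_accuracy = (sensitivity + specificity) / 2
    return sensitivity, specificity, normalized_accuracy

# Hypothetical confusion counts for 34 regular and 671 minimal drinkers
sens, spec, norm_acc = classification_scores(tp=24, fn=10, tn=516, fp=155)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} normalized={norm_acc:.3f}")

# Labeling every subject as a minimal drinker gives a high raw accuracy
# (671 of 705 correct) but a normalized accuracy at chance level
_, _, trivial_norm = classification_scores(tp=0, fn=34, tn=671, fp=0)
print(f"raw={671 / 705:.3f} normalized={trivial_norm:.3f}")
```

The second call illustrates why a size-aware score matters for a sample with 34 regular and 671 minimal drinkers.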

Desirable implementations were those with significant normalized-accuracy and significant matched-accuracy. For each implementation, indifference to the effects of the confounding variable ‘age’ was calculated using the two-tailed Fisher’s exact test to measure the ability of the relevant classifier to cleanly separate minimal (no-to-low) alcohol-exposed adolescents into an older (i.e., above the age of 15.4; N = 335) and younger cohort (i.e., below the age of 15.5; N = 336). The two cohorts were matched with respect to all demographic factors (i.e., socioeconomic status, supratentorial volume, sex, ethnicity, scanner) except age. Implementations with p > 0.01 passed the age-test as the effect of age was non-existent or magnitudes weaker than the effects of regular drinking. Thus, desirable implementations that also passed the age-test were considered informative with respect to distinguishing regular drinkers from minimal alcohol exposed adolescents. Critically, all implementations passed the socioeconomic status test, i.e., a replication of the age-test applied to this variable. We thus omit discussion of this test.
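The age-test can be illustrated with a small, self-contained implementation of the two-tailed Fisher's exact test on a 2x2 table. The table layout (age group by classifier label) and the counts are assumptions for illustration; only the group sizes (335 older, 336 younger) and the p > 0.01 pass criterion come from the text.

```python
from math import comb

def fisher_exact_two_tailed(a, b, c, d):
    """Two-tailed Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def table_prob(x):
        # Hypergeometric probability of x counts in the top-left cell
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = table_prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Sum over all tables that are at most as likely as the observed one;
    # the small relative tolerance guards against floating-point ties
    total = sum(p for p in map(table_prob, range(lowest, highest + 1))
                if p <= p_observed * (1 + 1e-9))
    return min(total, 1.0)

# Hypothetical counts: rows = older / younger minimal drinkers,
# columns = labeled "regular" / labeled "minimal" by the classifier
p_age = fisher_exact_two_tailed(80, 255, 75, 261)
print("passes age-test:", p_age > 0.01)
```

A classifier whose labels track age would concentrate the "regular" labels in one row and yield a very small p-value, failing the test.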

Table 1 summarizes results. The classifier without data harmonization (No-GAM-Class) was the only implementation whose ‘normalized-accuracy’ score was significantly lower than chance. The sequential implementations (Seq-GAM-Class and Seq-GAM_Rob-Class) had significant ‘normalized-accuracy’ scores but non-significant ‘matched-accuracy’ scores. Compared to those implementations, the joint methods reported higher AUC, normalized-accuracy, and matched-accuracy scores. Although specificity was higher than sensitivity for all implementations, the difference between these was substantially smaller for the joint approaches. The smallest difference was observed for Joi_STR-GAM-Class (sensitivity: 70.6%; specificity: 76.9%). Joi_STR-GAM-Class was also informative as it passed the age-test and had significant normalized-accuracy and matched-accuracy scores. Among the joint approaches, the accuracy score was diminished when only DTI metrics were used (i.e., Joi_DTI-GAM-Class): this implementation also failed the age-test and did not have a significant matched-accuracy score. Joi_OPT-GAM-Class failed the age-test and had the largest difference between normalized-accuracy and matched-accuracy scores (dropped by 12.6%), but it achieved the highest accuracy score (80.8%). Joi-GAM-Class passed the age-test, had a high accuracy score, and the smallest difference between normalized-accuracy (75.9%) and matched-accuracy (77.1%) scores. These accuracy scores were higher than those of the only other informative implementation (i.e., Joi_STR-GAM-Class). Joi-GAM-Class was also the only implementation that was significantly better than No-GAM-Class and Seq-GAM-Class. On a trend level (p < 0.0003), it was also better than Seq-GAM_Rob-Class.

Table 1 Sensitivity, specificity, Area Under the receiver operating characteristic Curve (AUC), normalized-accuracy, matched-accuracy, and age-test (testing for the effect of age) of each implementation.

Pattern Analysis

As part of cross-validating an implementation, training consisted of parameter exploration, i.e., recording the identified pattern and corresponding accuracy for different parameter settings of the implementation. A pattern consists of a small number of MR-derived metrics that the implementation deemed informative for distinguishing the two cohorts. Figure 2 plots the normalized frequency of unique patterns identified by each implementation across all training runs. The following lists each implementation by the number of unique patterns identified: Joi_STR-GAM-Class (53 patterns), No-GAM-Class (72 patterns), Seq-GAM-Class (72 patterns), Joi_OPT-GAM-Class (72 patterns), Joi_DTI-GAM-Class (225 patterns), Joi-GAM-Class (381 patterns), and Seq-GAM_Rob-Class (853 patterns). Interestingly, Joi-GAM-Class recorded four informative (and dominant) patterns each appearing in at least 50% of the training runs.
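The frequency analysis can be sketched as a simple count over training runs. The sketch assumes that each training run records several patterns (one per parameter setting) and that the normalized frequency of a pattern is the fraction of runs in which it appears; this reading, the function names, and the toy runs are assumptions for illustration.

```python
from collections import Counter

def pattern_frequencies(runs):
    """Map each unique pattern to the fraction of training runs that recorded it."""
    counts = Counter()
    for patterns_in_run in runs:
        # Count each unique pattern at most once per run
        counts.update({frozenset(p) for p in patterns_in_run})
    return {pattern: n / len(runs) for pattern, n in counts.items()}

def dominant_patterns(runs, min_share=0.5):
    """Return (pattern, share) pairs appearing in at least min_share of the runs."""
    freqs = pattern_frequencies(runs)
    keep = [(p, share) for p, share in freqs.items() if share >= min_share]
    return sorted(keep, key=lambda item: -item[1])

# Toy training runs; region names are taken from the text, the runs are made up
base = {"lateral ventricles", "mid posterior corpus callosum"}
extended = base | {"centrum semiovale", "central corpus callosum"}
runs = [
    [base, extended],
    [base],
    [base, {"lateral ventricles", "centrum semiovale"}],
    [base, extended],
]
for pattern, share in dominant_patterns(runs):
    print(f"{share:.0%}", sorted(pattern))
```

Storing patterns as `frozenset` objects makes two selections of the same regions count as one pattern regardless of the order in which the regions were recorded.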

The four informative patterns of Joi-GAM-Class consist of the MR metrics listed in Table 2. The most frequently selected pattern (97.8%) consisted of the volumes of lateral ventricles and mid posterior corpus callosum. The second pattern (80.5%) included the first two MR metrics and two additional structural MRI metrics (i.e., volumes of centrum semiovale and central corpus callosum). The third pattern (54.7%) added DTI metrics (i.e., fractional anisotropy of anterior corona radiata and posterior thalamic radiation) and the fourth (52.9%) also included axial diffusivity of fornix and volume of cingulate gyrus (Fig. 3). Thus, this implementation provided consistency in the identified patterns.

Table 2 Informative patterns of Joi-GAM-Class and corresponding selected regions.

Alternative implementations also frequently selected the previously mentioned regions. The only MRI metrics not used by Joi-GAM-Class were the mean diffusivity of the corticospinal tract selected by Seq-GAM-Class, and the axial diffusivity of the medial lemniscus selected by Seq-GAM_Rob-Class.

Table 2 also lists the normalized- and matched-accuracy scores for the logistic classifier confined to the residual scores of the four patterns selected by Joi-GAM-Class. The fourth pattern, which included the MRI metrics of the other three patterns, had equivalent normalized-accuracy and matched-accuracy scores (79.4%). The classifier based solely on a single MRI metric achieved accuracy scores below 70% for most regions. The classifiers based on the fractional anisotropy of the anterior corona radiata (normalized-accuracy: 80%) and the posterior thalamic radiation (normalized-accuracy: 75.4%) were exceptions, but their matched-accuracy scores were below 70%.

Regarding group differences (see Figs 4 and 5), the volume of the mid posterior corpus callosum was significantly smaller ($p=0.0002$) in regular drinkers relative to minimal alcohol-drinking adolescents. The axial diffusivity of the fornix ($p=0.0005$) and the fractional anisotropy of the anterior corona radiata ($p < 0.0001$) and posterior thalamic radiation ($p < 0.0001$) were significantly higher in the regular drinking adolescents relative to those with minimal alcohol-exposure.

Discussion

Joi_STR-GAM-Class and Joi-GAM-Class were the only successful approaches for identifying regular drinking on a subject level. This finding supports our central hypothesis that typical sequential use of the GAM to isolate the effects of confounding variables on MR metrics would minimize information relevant for distinguishing groups (e.g., adolescents with a drinking history relative to those without a significant drinking history). Joi-GAM-Class (i.e., the more accurate of these methods) selected patterns that included structural MRI volumes of the lateral ventricles, centrum semiovale, corpus callosum, and cingulate gyrus and microstructural DTI measures of the fornix, corona radiata, and thalamic radiations. The integrity of each of these regions has been reported to be affected by alcohol misuse in studies using more traditional, hypothesis-driven, morphometric group analysis. When this type of analysis was applied to those eight MRI metrics, only four of them showed significant group differences on the NCANDA data set. We thus conclude that the outcome of machine learning models, such as the one proposed here, requires analyzing MRI metrics as a whole to gain knowledge about the effect of alcohol on individuals.

Strong agreement existed among the regions included in the four informative patterns identified by Joi-GAM-Class. While inter-dependencies between the repeated training runs with varying parameter settings of Joi-GAM-Class could account for this finding, this explanation fails to explain the consistency between the informative patterns identified by Joi-GAM-Class and alternative implementations. A more likely explanation for this consistency is the significant impact of regular drinking on the regions identified by Joi-GAM-Class.

The brain regions identified by Joi-GAM-Class are relevant with respect to the Alcohol Use Disorder (AUD) literature. For example, the centrum semiovale, the most frequently appearing region across all patterns, was modestly smaller in the regular than in the minimal drinking group. This finding is consistent with in vivo neuroimaging²⁶ and postmortem studies³³ reporting smaller centrum semiovale volume in heavy alcohol drinking relative to healthy control adults. Smaller than normal white matter volumes could indicate a disruption in adolescent brain development given that white matter continues to grow throughout early adulthood^34,35,36.

A number of studies report that the corpus callosum is sensitive to alcohol use disorder^26,37,38. The corpus callosum integrates information and mediates complex behaviors³⁹ and is larger and thicker in adolescents with higher intelligence^40,41 and better problem solving abilities⁴². The cingulate cortex has been associated with selective attention⁴³, conflict monitoring and decision making in controls⁴⁴ and alcoholics^45,46. The lateral ventricles are generally enlarged in heavy alcohol consuming adults and serve as a sensitive marker of alcoholic-level drinking^13,47,48.

Joi-GAM-Class also identified regions with altered DTI metrics in the regular drinkers relative to the minimal drinking adolescents. Although low fractional anisotropy and high mean radial diffusivity are often reported in heavy drinking youth¹⁵, the current study reports that axial diffusivity of the fornix, fractional anisotropy of the anterior corona radiata and posterior thalamic radiation were high in the regular drinking group. These findings were also reported previously⁴, albeit at a statistically insignificant level. A recent longitudinal study of detoxified alcohol-dependent male adolescents found evidence of low white matter integrity in the body of the anterior corona radiata¹⁵. Microstructural compromise of the fornix, a major fiber bundle connecting limbic structures, has been reported in adult alcoholics⁴⁹.

We complete the review of our morphological findings by noting that the importance of a single-region metric (i.e., its frequency of appearance in the training runs as specified in Table 2) was unrelated to its significance in discriminating the two cohorts, i.e., only half the scores were significantly different between groups. The importance of a single-region metric was also unrelated to its accuracy in classification, i.e., all single-region metrics reported low matched-accuracies. These observations were further supported by repeating two-fold cross-validation of the sequential procedure with the classifier (without sparsity constraints) being trained on 29 regional measurements. These 29 MRI metrics were identified by applying a two-tailed t-test to residual scores of the training dataset and retaining those with $p\le 0.01$ (i.e., the significance threshold that led to the highest classification accuracy). The resulting normalized-accuracy of the classifier based on these 29 metrics at 67.3% was significantly lower than that of Joi-GAM-Class. Thus, the type of machine learning applied here analyzed all potentially informative metrics as a whole¹² to determine those regions impacted by regular alcohol use on the developing adolescent brain. In support of this statement, Joi-GAM-Class received lower accuracy scores than those listed in Table 2 for ‘Pattern 4’, the informative pattern of Joi-GAM-Class consisting of all eight regional scores. This pattern is the first known imaging marker with respect to the NCANDA cohort that predicts (i.e., classified with significant accuracy) individuals with regular drinking habits at baseline.

For readers interested in the technical aspects of our proposed machine learning approach, the remainder of the discussion focuses on differences in the implementations and their impact on accuracy scores. We first note that No-GAM-Class, i.e., performing classification without the GAM model, failed the age-test and resulted in low accuracy scores, thereby supporting the need for properly modeling the effects of confounding factors. One way of modeling the effect is to perform the analysis on a subset of the data with the cohorts being carefully matched with respect to confounding factors. However, the sample size of a matched data set is often much smaller than the original dataset, thereby reducing statistical power. Alternatively, the effects of confounding factors can be removed via GAM.

When parameterizing a GAM independently from classification (i.e., sequential approaches), the residual effects of confounding factors can significantly affect the final classification, as observed here, since the sequential approaches (Seq-GAM-Class and Seq-GAM_Rob-Class) failed the age-test. That the joint implementations Joi_DTI-GAM-Class and Joi_OPT-GAM-Class also failed the age-test is a caution to check the output of regression-based approaches for the effects of confounding factors. The series of stringent statistical tests performed post hoc in this study identified those outputs that were not selected because of contributions of confounding factors. Based on those tests, the only informative patterns were determined by the joint implementations Joi_STR-GAM-Class and Joi-GAM-Class.

When confining the joint analysis to just one modality, classification achieved higher accuracy when based on structural metrics (i.e., Joi_STR-GAM-Class) than when based on DTI metrics (i.e., Joi_DTI-GAM-Class). The higher accuracy scores of Joi-GAM-Class over the single-modality implementations (i.e., Joi_STR-GAM-Class and Joi_DTI-GAM-Class) further highlight the importance of analyzing multiple modalities together.

In conclusion, only the joint approaches Joi_STR-GAM-Class and Joi-GAM-Class passed the age-test, showed significant normalized- and matched-accuracy scores, and succeeded in identifying informative patterns on a data set not ideally constructed for classification. Thus, our experiments support the hypothesis of this study.

Methods

Participants

At baseline⁴, NCANDA recruited 831 adolescents, of whom 28 were excluded for the current analysis due to brain abnormalities or missing MRI data. Of the remaining 803 youth, 671 (333 male and 338 female adolescents, ages 12 to 21 years) met the criteria for minimal (no-to-low) alcohol consumption¹⁷ and comprised the control group. The remaining 132 adolescents reported initiating moderate-to-heavy alcohol consumption: female participants consumed four or more drinks (beer, wine, or hard liquor) and male participants consumed five or more drinks (beer, wine, or hard liquor) on at least one occasion in their lifetime. Of these, 34 subjects met criteria for regular drinking (i.e., they drank a minimum of two alcoholic drinks at least once per week). The total data set on 705 youth (671 minimal and 34 regular drinkers) used in this study included demographic information and MRI scans acquired across the five NCANDA collection sites¹⁷, two of which used Siemens 3 T Tim Trio scanners (Siemens) and three of which used General Electric 3 T Discovery MR750 scanners (GE). Each participant was described by age, sex, self-reported ethnicity, socio-economic status (based on the highest education achieved by either parent)⁵⁰, MRI scanner type (GE or Siemens), and supratentorial volume (determined from MR images) (see Table 3).

Table 3 Demographics of the NCANDA Samples (N = 705) and corresponding p-values between the regular drinkers and minimal alcohol exposed cohort. The two cohorts happen to be properly matched (p > 0.1) with respect to all demographic factors but age (in years) and socioeconomic status, whose p-values are marked in bold. The statistic of the supratentorial volume is listed in cm³.


The two cohorts were matched (p > 0.1) on ethnicity (multinomial Chi-Square test⁵¹), sex, and MRI scanner type (binomial Chi-Square test⁵²). Age, socio-economic status, and supratentorial volume (i.e., the only other confounding factors³) were compared using unpaired, two-tailed t-tests⁵³. The two cohorts matched with respect to supratentorial volume but not age or socio-economic status. Most of the regular drinkers were older (18 years or older) and had higher socio-economic status than the control group.
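The matching check on a binary factor can be illustrated with a short sketch. It computes a plain Pearson Chi-Square statistic for a 2×2 table and compares it with the critical value for p = 0.1 at one degree of freedom. The control counts are those reported above; the sex split of the regular drinkers is hypothetical, the function name is ours, and no continuity correction is applied, so this is an illustration rather than the exact test used in the study.

```python
def chi_square_2x2(table):
    """Pearson chi-square statistic for a 2x2 contingency table (no continuity correction)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for r in range(2):
        for c in range(2):
            expected = row_totals[r] * col_totals[c] / total
            stat += (table[r][c] - expected) ** 2 / expected
    return stat


# Rows: minimal drinkers (333 male, 338 female, as reported in the text) and
# regular drinkers (HYPOTHETICAL 17 / 17 split, for illustration only).
sex_by_cohort = [[333, 338], [17, 17]]
statistic = chi_square_2x2(sex_by_cohort)

# Critical value of the chi-square distribution with 1 degree of freedom at p = 0.1.
CRITICAL_P10_DF1 = 2.706
matched = statistic < CRITICAL_P10_DF1
print(f"chi-square = {statistic:.4f}, matched on sex: {matched}")
```

A statistic below the critical value corresponds to p > 0.1, i.e., the criterion used above for calling the cohorts matched on that factor.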

Brain imaging metrics used for each individual included 32 MRI-derived structural volume scores extracted from the T1- and T2-weighted MRIs³, and 112 DTI-derived microstructural scores⁴. All scores were provided as data releases (Demographic Score Release: NCANDA DATA 00010 V5, Structural Score Release: NCANDA DATA 00011, DTI Score Release: NCANDA DATA 00012 V2) by the software platform Scalable Informatics for Biomedical Imaging Studies (SIBIS; sibis.sri.com)⁵⁴. The Section ‘Data Pre-processing’ of the supplement summarizes the pre-processing steps performed on these data, as described previously^3,4.

Implementations

All implementations used here were based on the sparse-logistic classification model¹². This method is trained to accurately classify samples by minimizing an energy function that encodes the underlying classification task as finding informative patterns (of MRI metrics) of a certain size. No-GAM-Class directly trained the classifier on the 144 raw imaging metrics of each subject. Training of the sequential approaches Seq-GAM-Class and Seq-GAM_Rob-Class consisted of first parameterizing a GAM for regressing out the effects of confounding factors (i.e., age and socio-economic status) before optimizing the classifier on the residual scores. The GAM used a linear model for capturing the relationship between the image metrics and socio-economic status and a quadratic model for capturing the relationship between the image metrics and age^3,4. Seq-GAM-Class used least-squares estimation and Seq-GAM_Rob-Class used robust regression (i.e., bisquare estimation, the default of ‘robustfit’ in Matlab2013b)²⁸ to determine the optimal setting of the GAM on the minimal drinkers of the training data set. The joint approaches (Joi_STR-GAM-Class, Joi_DTI-GAM-Class, Joi_OPT-GAM-Class, and Joi-GAM-Class) removed the effects of confounding factors while concurrently optimizing classification accuracy by embedding the GAM into the energy function of the classifier. While Joi_OPT-GAM-Class reported the result with respect to minimizing the energy function, Joi_STR-GAM-Class, Joi_DTI-GAM-Class, and Joi-GAM-Class went one step further and extended the energy function so that it accounted for the accuracy of the GAM in removing the effects of the confounding factors. Joi-GAM-Class (as well as Joi_OPT-GAM-Class) considered all 144 imaging metrics, while Joi_STR-GAM-Class was confined to the 32 macro-structural MRI metrics and Joi_DTI-GAM-Class to the 112 micro-structural DTI metrics. The accuracy of each implementation was measured via 2-fold cross-validation, described in further detail in the supplemental section on ‘Cross-Validation’.

For the technically inclined reader, the following subsections describe in detail the optimization algorithms used for training the sequential and joint implementations.

Training of the Sequential Approaches

The training of a sequential approach consisted of two steps: (1) determine the optimal setting of the GAM with respect to the ‘control group’ (i.e., minimal drinking cohort) and (2) identify the pattern, i.e., the subset of residual imaging scores most informative for group separation. The pattern was identified by computing the ‘weights’ of a sparse, logistic regression classifier¹² that resulted in the highest normalized-accuracy based on the training data.

To determine the optimal setting ${\alpha }_{i}\,:\,=({\alpha }_{i\mathrm{,0}},\ldots ,{\alpha }_{i\mathrm{,3}})$ of the GAM with respect to each image measurement type ‘i’, let ‘age_s’ be the age and ‘ses_s’ the socio-economic status of subject ‘s’. Then the GAM defined the relationship of the confounding factors to the corresponding image score i_s as

$${i}_{s} \sim {\alpha }_{i,0}+{\alpha }_{i,1}\cdot ag{e}_{s}+{\alpha }_{i,2}\cdot ag{e}_{s}^{2}+{\alpha }_{i,3}\cdot se{s}_{s}.$$

Assuming that the image scores were Gaussian distributed, then determining the optimal α_i was equivalent to maximizing a likelihood function parameterized by the mean of a Gaussian distribution. To define the likelihood function, we now introduce the mathematical notation summarized in Table S1 of the supplement. Specifically, the training data (i.e., one of the folds) consisted of two cohorts totaling N = 352 subjects with ${\mathbb{C}}$ representing the set of indices of the minimal drinkers. Each subject ‘s’ was described by the factor vector ${{\bf{d}}}_{s}\mathrm{=[1,}\,ag{e}_{s},ag{e}_{s}^{2},se{s}_{s}]$ (consisting of N_D = 3 subject specific demographic values) and up to ${N}_{F}=144$ image scores i_s. Training the GAM with respect to the data of the non-drinking cohort was then equivalent to fitting a matrix Φ so that the factor vector of each control subject was a predictor of the corresponding image scores i.e., ${{\bf{i}}}_{s} \sim {\rm{\Phi }}\cdot {{\bf{d}}}_{s}^{{\rm{{\rm T}}}}$. Assuming that ${{\bf{i}}}_{s} \sim {\mathscr{N}}({\rm{\Phi }}\cdot {{\bf{d}}}_{s}^{{\rm{{\rm T}}}},\,{\sqrt[{N}_{D}]{{\sigma }_{s}}}^{2})$ was normally distributed with ${\sigma }_{s}^{2}\in {\mathbb{R}}$ and referring to ${\Vert \cdot \Vert }_{2}$ as the l₂-norm, the optimal fitted matrix $\hat{{\rm{\Phi }}}$ was obtained by solving the following maximum likelihood problem

$$\begin{array}{lll}\hat{{\rm{\Phi }}} & := & {\rm{\arg }}\mathop{\max }\limits_{{\rm{\Phi }}}P(I|D,\,{\rm{\Phi }})={\rm{\arg }}\mathop{\max }\limits_{{\rm{\Phi }}}\prod _{s\in {\mathbb{C}}}P({{\bf{i}}}_{s}|{{\bf{d}}}_{s},{\rm{\Phi }},{\sigma }_{s})={\rm{\arg }}\mathop{\max }\limits_{{\rm{\Phi }}}\prod _{s\in {\mathbb{C}}}\frac{1}{{\sigma }_{s}\sqrt[{N}_{D}]{2\pi }}{e}^{-\frac{1}{2{\sigma }_{s}^{2}}{({{\bf{i}}}_{s}-{\rm{\Phi }}\cdot {{\bf{d}}}_{s}^{{\rm T}})}^{{\rm T}}({{\bf{i}}}_{s}-{\rm{\Phi }}\cdot {{\bf{d}}}_{s}^{{\rm T}})}\\ & = & {\rm{\arg }}\,\mathop{\min }\limits_{{\rm{\Phi }}}G({\rm{\Phi }})\,{\rm{with}}\,G({\rm{\Phi }})\,:\,=\sum _{s\in {\mathbb{C}}}\frac{1}{{\sigma }_{s}^{2}}\parallel {\rm{\Phi }}\cdot {{\bf{d}}}_{s}^{{\rm T}}-{{\bf{i}}}_{s}{\parallel }_{2}^{2},\end{array}$$

(1)

where D is the set of factor vectors and I is the corresponding set of image scores across all samples.

Interpreting $\mathrm{1/}{\sigma }_{s}^{2}$ as the ‘weight’ of each sample, the above minimization problem defined a robust regression of Φ that was solved via bi-square estimation. With respect to the ordinary GAM, ${\sigma }_{s}=\sigma $ was assumed to be uniform across all subjects so that computing $\hat{{\rm{\Phi }}}$ simplified to the least-square solution. Regardless of the specific computation of $\hat{{\rm{\Phi }}}$, the corresponding residual (desensitized) scores of all N subjects were determined via

$$R\,:=[{{\bf{r}}}_{1},{{\bf{r}}}_{2},\mathrm{..}.,{{\bf{r}}}_{N}]\,\,{\rm{with}}\,{{\bf{r}}}_{s}(\hat{{\rm{\Phi }}})\,:={{\bf{i}}}_{s}-\hat{{\rm{\Phi }}}\cdot {{\bf{d}}}_{s}^{{\rm{{\rm T}}}}.$$

(2)
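For a single image score, Eqs. (1) and (2) with uniform σ_s can be illustrated as follows. The sketch fits the quadratic-age, linear-ses model by ordinary least squares on a toy control group and then computes residuals for controls and for one drinker. It is not the Matlab implementation used in the study: the data values, the helper names (solve_linear_system, fit_gam, residual), and the use of the normal equations are assumptions made for this example, and the robust (bisquare) variant is not shown.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[k]] for k, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def factor_vector(age, ses):
    """d_s = [1, age_s, age_s^2, ses_s]."""
    return [1.0, age, age * age, ses]


def fit_gam(ages, ses, scores):
    """Least-squares estimate of (alpha_0, ..., alpha_3) for one image score."""
    rows = [factor_vector(a, s) for a, s in zip(ages, ses)]
    normal = [[sum(r[p] * r[q] for r in rows) for q in range(4)] for p in range(4)]
    rhs = [sum(r[p] * y for r, y in zip(rows, scores)) for p in range(4)]
    return solve_linear_system(normal, rhs)


def residual(alpha, age, ses, score):
    """r_s = i_s - alpha . d_s for one image score (Eq. 2)."""
    return score - sum(a * d for a, d in zip(alpha, factor_vector(age, ses)))


# HYPOTHETICAL control data generated from a known model (not NCANDA values).
control_ages = [12, 13, 14, 15, 16, 17, 18, 19, 20, 21]
control_ses = [12, 16, 14, 18, 12, 20, 16, 14, 18, 13]
true_alpha = [2.0, 0.5, -0.01, 0.3]
control_scores = [sum(a * d for a, d in zip(true_alpha, factor_vector(age, s)))
                  for age, s in zip(control_ages, control_ses)]

# Step 1: fit the GAM on the controls only.
alpha_hat = fit_gam(control_ages, control_ses, control_scores)

# Step 2: residuals for all subjects, including a hypothetical regular drinker
# whose score lies 1.0 below the control trajectory.
control_residuals = [residual(alpha_hat, a, s, y)
                     for a, s, y in zip(control_ages, control_ses, control_scores)]
drinker_score = sum(a * d for a, d in zip(true_alpha, factor_vector(19, 17))) - 1.0
drinker_residual = residual(alpha_hat, 19, 17, drinker_score)
print(round(drinker_residual, 3))
```

Because the GAM is fitted on controls only, the deviation of the drinker survives in the residual, while the control residuals are close to zero.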

Training of the sparse, logistic regression classifier consisted of minimizing a negative log probability with respect to the weights selecting the subset of informative residual image scores best separating both cohorts. In order to define the log probability, the association of each subject ‘$s$’ to a cohort was encoded by label ${z}_{s}$. If the subject was a regular drinker then ${z}_{s}=1$, and ${z}_{s}=-\,1$ if the subject was a minimally exposed individual. $Z\,:\,=[{z}_{1},{z}_{2},\ldots ,{z}_{N}]$ was the vector of label assignments of all subjects in the training fold. The logistic function was defined as $\theta (a)\,:\,=\,\mathrm{log}\,(1+\exp (\,-\,a))$, the weight vector ‘ω’ encoded the importance of each residual score of r_s in distinguishing the two cohorts, and $\nu \in {\mathbb{R}}$ was the ‘label offset’. Assuming that all samples were independently and identically distributed according to the binomial distribution

$$P({z}_{s}|{{\bf{r}}}_{s}(\hat{{\rm{\Phi }}}),\nu ,\omega )\,:\,=\frac{1}{1+{e}^{-{z}_{s}\cdot ({\omega }^{{\rm T}}\cdot {{\bf{r}}}_{s}(\hat{{\rm{\Phi }}})+\nu )}}=\frac{1}{1+{e}^{-{z}_{s}\cdot ({\omega }^{{\rm T}}\cdot ({{\bf{i}}}_{s}-\hat{{\rm{\Phi }}}\cdot {{\bf{d}}}_{s}^{{\rm T}})+\nu )}}=P({z}_{s}|{{\bf{i}}}_{s},{{\bf{d}}}_{s},\hat{{\rm{\Phi }}},\nu ,\omega ),$$

(3)

determining the optimal parameters was equivalent to minimizing the following logistic cost function

$$L(\hat{{\rm{\Phi }}},\nu ,\omega )\,:\,=-\,\mathrm{log}(\prod _{s=1}^{N}P({z}_{s}|{{\bf{r}}}_{s}(\hat{{\rm{\Phi }}}),\nu ,\omega ))=\sum _{s=1}^{N}\theta ({z}_{s}\cdot ({\omega }^{{\rm{{\rm T}}}}\cdot {{\bf{r}}}_{s}(\hat{{\rm{\Phi }}})+\nu ))$$

(4)

with respect to a sparse search space defined according to the l₀-‘norm’ ${\Vert \cdot \Vert }_{0}$ and a predefined number N_K < N_F of non-zero elements, i.e., the sparsity constraint

$${{\mathbb{S}}}_{{N}_{K}}\,:\,=\{\omega :{\Vert \omega \Vert }_{0}\le {N}_{K}\}.$$

(5)

In other words, parameterizing the sparse, logistic classifier reduces to determining the optimal parameters $\hat{\nu }$ and $\hat{\omega }$ for the following minimization problem

$$(\hat{\nu },\hat{\omega })\,:\,={\rm{\arg }}\,\mathop{\min }\limits_{\nu \in {\mathbb{R}},\omega \in {{\mathbb{S}}}_{{N}_{K}}}L(\hat{{\rm{\Phi }}},\nu ,\omega \mathrm{)}.$$

(6)

We determined its solution via penalty decomposition¹². The image scores associated with non-zero entries in $\hat{\omega }$ then defined the group separating pattern.
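To make the role of Eqs. (4) and (5) concrete, the following sketch evaluates the logistic cost for a given sparse weight vector and reads off the resulting pattern. The residual values, labels, and weights are hypothetical, and the function names are ours; the sketch only evaluates the cost and does not perform the penalty decomposition itself.

```python
import math


def theta(a):
    """theta(a) = log(1 + exp(-a)), evaluated without overflow for large |a|."""
    if a >= 0:
        return math.log1p(math.exp(-a))
    return -a + math.log1p(math.exp(a))


def logistic_cost(residuals, labels, omega, nu):
    """Eq. (4): sum over subjects of theta(z_s * (omega . r_s + nu))."""
    total = 0.0
    for r, z in zip(residuals, labels):
        score = sum(w * x for w, x in zip(omega, r)) + nu
        total += theta(z * score)
    return total


def pattern(omega):
    """Indices of the image scores with non-zero weight."""
    return [k for k, w in enumerate(omega) if w != 0.0]


# HYPOTHETICAL residual scores (3 per subject) and labels (+1 regular, -1 minimal).
residuals = [[1.0, 0.2, -0.3], [0.8, -0.1, 0.4], [-0.9, 0.3, 0.1], [-1.1, -0.2, -0.2]]
labels = [1, 1, -1, -1]

sparse_omega = [2.0, 0.0, 0.0]          # satisfies ||omega||_0 <= N_K with N_K = 1
cost_sparse = logistic_cost(residuals, labels, sparse_omega, nu=0.0)
cost_zero = logistic_cost(residuals, labels, [0.0, 0.0, 0.0], nu=0.0)
print(pattern(sparse_omega), cost_sparse < cost_zero)
```

With all weights at zero, every subject contributes log 2 to the cost, so any weight vector that separates the cohorts better than chance lowers the cost below N · log 2.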

Training of Joint Methods

Alternative to the sequential approach, the training of the joint approach consisted of simultaneously determining the optimal values for the variables $\hat{{\rm{\Phi }}}$ of the GAM and $(\hat{\nu },\hat{\omega })$ of the sparse, logistic regression. Specifically, the joint approaches were parameterized by maximizing the following joint probability

$$P(Z,I|{\rm{\Phi }},\nu ,\omega ,D)\,:\,=\prod _{s=1}^{N}P({z}_{s},{{\bf{i}}}_{s}|{{\bf{d}}}_{s},{\rm{\Phi }},\nu ,\omega ,\sigma )=\prod _{s=1}^{N}P({z}_{s}|{{\bf{i}}}_{s},{{\bf{d}}}_{s},{\rm{\Phi }},\nu ,\omega )\cdot P({{\bf{i}}}_{s}|{{\bf{d}}}_{s},{\rm{\Phi }},\sigma ).$$

(7)

$P({z}_{s}|{{\bf{i}}}_{s},{{\bf{d}}}_{s},{\rm{\Phi }},\nu ,\omega )$ was defined according to Eq. (3) and $P({{\bf{i}}}_{s}|{{\bf{d}}}_{s},{\rm{\Phi }},\sigma )$ according to the normal distribution of Eq. (1). Computing the log of that joint probability resulted in

$$\mathrm{log}\,P(Z,I|{\rm{\Phi }},\nu ,\omega ,D)=-\,\sum _{s=1}^{N}\theta ({z}_{s}\cdot ({\omega }^{T}\cdot ({{\bf{i}}}_{s}-{\rm{\Phi }}\cdot {{\bf{d}}}_{s}^{T})+\nu ))-\,\frac{1}{2{\sigma }^{2}}\sum _{s=1}^{N}||{{\bf{i}}}_{s}-{\rm{\Phi }}\cdot {{\bf{d}}}_{s}^{T}{||}_{2}^{2}.$$

(8)

The previous section parameterized the GAM with respect to the minimal drinkers (controls) so that any significant deviation in the image scores of the second cohort could be directly related to the existing clinical literature. To comply with that model, we confined the second sum of Eq. (8) to the controls and modeled the ‘input’ of the regular drinkers in parameterizing the GAM through the uninformative, uniform distribution represented by the constant $c\in {\mathbb{R}}$. Thus, the log of the joint probability was redefined as

$$\mathrm{log}\,P(Z,I|{\rm{\Phi }},\nu ,\omega ,D)=-\,\sum _{s=1}^{N}\theta ({z}_{s}\cdot ({\omega }^{T}\cdot ({{\bf{i}}}_{s}-{\rm{\Phi }}\cdot {{\bf{d}}}_{s}^{T})+\nu ))-\frac{1}{2{\sigma }^{2}}\sum _{s\in {\mathbb{C}}}||{{\bf{i}}}_{s}-{\rm{\Phi }}\cdot {{\bf{d}}}_{s}^{T}|{|}_{2}^{2}+c,$$

(9)

and its minimization as

$$\begin{array}{lll}(\hat{{\rm{\Phi }}},\hat{\nu },\hat{\omega }) & := & {\rm{\arg }}\mathop{\min }\limits_{{\rm{\Phi }},\nu ,\omega \in {{\mathbb{S}}}_{{N}_{K}}}-\,\mathrm{log}\,P(Z,I|{\rm{\Phi }},\nu ,\omega ,D)\\ & = & {\rm{\arg }}\mathop{\min }\limits_{{\rm{\Phi }},\nu ,\omega \in {{\mathbb{S}}}_{{N}_{K}}}(1-\gamma )\cdot L({\rm{\Phi }},\nu ,\omega )+\gamma \cdot G({\rm{\Phi }}),\end{array}$$

(10)

where $\gamma \,:\,=\frac{1}{2{\sigma }^{2}+1}$ weighted the importance between the logistic cost function and the GAM, $\sigma $ was fixed beforehand via parameter exploration, L(·) was defined according to Eq. (4), and G(·) according to Eq. (1). Note, if γ = 0 then the above equation simplifies to Joi_OPT-GAM-Class, the commonly used logistic regressor with the training of the GAM solely driven by group separation.
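The step from Eq. (9) to Eq. (10) can be spelled out under one assumption that is ours rather than stated in the text: G(Φ) is read here as the unweighted sum of squared residuals over the controls (Eq. (1) with σ_s = 1), so that the factor 1/(2σ²) is carried entirely by γ. Dropping the constant c and multiplying the negative log probability by the positive factor 2σ²/(2σ² + 1), neither of which changes the minimizer, gives

$$\frac{2{\sigma }^{2}}{2{\sigma }^{2}+1}\left(L({\rm{\Phi }},\nu ,\omega )+\frac{1}{2{\sigma }^{2}}G({\rm{\Phi }})\right)=\frac{2{\sigma }^{2}}{2{\sigma }^{2}+1}L({\rm{\Phi }},\nu ,\omega )+\frac{1}{2{\sigma }^{2}+1}G({\rm{\Phi }})=(1-\gamma )\cdot L({\rm{\Phi }},\nu ,\omega )+\gamma \cdot G({\rm{\Phi }}).$$

For example, σ² = 0.5 yields γ = 0.5, i.e., equal weighting of the two terms, whereas γ approaches 0 as σ grows, which recovers the case in which the GAM is driven by group separation alone.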

Given that finding the optimal solution of Eq. (10) was prone to a local minimum due to the non-convex energy function, the parameters Φ′ were initialized by the output of the GAM of Eq. (1). The local minimum for Equation (10) was then determined through an algorithm inspired by penalty decomposition¹². Specifically, we introduced $\hat{\psi }\in {\mathbb{S}}={{\mathbb{R}}}^{{N}_{F}}$, the non-sparse approximation of the weights $\hat{\omega }$. Eq. (10) was then equivalent to

$$(\hat{{\rm{\Phi }}},\hat{\nu },\hat{\psi },\hat{\omega })={\rm{\arg }}\mathop{{\rm{\min }}}\limits_{{\rm{\Phi }},\nu ,\psi \in {\mathbb{S}},\omega \in {{\mathbb{S}}}_{{N}_{K}}}(1-\gamma )\cdot L({\rm{\Phi }},\nu ,\psi )+\gamma \cdot G({\rm{\Phi }})\,{\rm{s}}{\rm{.t}}.\,\omega -\psi =0.$$

(11)

Introducing the weighting parameter $\rho > 0$, the solution of Eq. (11) was iteratively estimated by

$$({{\rm{\Phi }}}_{\rho },{\nu }_{\rho },{\psi }_{\rho },{\omega }_{\rho })\,:\,={\rm{\arg }}\mathop{\min }\limits_{{\rm{\Phi }},\nu ,\psi \in {\mathbb{S}},\omega \in {{\mathbb{S}}}_{{N}_{K}}}(1-\gamma )\cdot L({\rm{\Phi }},\nu ,\psi )+\gamma \cdot G({\rm{\Phi }})+\rho \cdot \parallel \omega -\psi {\parallel }_{2}^{2}.$$

(12)

As pointed out in Algorithm 1, the parameters of the logistic regression model were initialized as $\omega =0$, $\psi =0$, $\nu =0$¹² and then, together with Φ, updated via block coordinate descent. If the parameters converged, $\rho $ was increased and the procedure was repeated until the maximum of the absolute difference between the elements of the sparse weights ω_ρ and the non-sparse weights ψ_ρ was below a fixed threshold ${\varepsilon }_{P}$¹², i.e., letting ${\Vert \cdot \Vert }_{{\rm{\max }}}$ denote the maximum absolute element of a vector or matrix,

$$\parallel {\omega }_{\rho }-{\psi }_{\rho }{\parallel }_{{\rm{\max }}} < {\varepsilon }_{P}.$$

(13)

At each of these iterations, block coordinate descent improved the current estimate Φ′, ν′, ω′, and ψ′ of (Φ_ρ, ν_ρ, ω_ρ, ψ_ρ) by minimizing Eq. (12) fixing all variables but one and repeating this process until all variables converged. Keeping ν′, ω′, and ψ′ fixed, then the minimization problem of Eq. (12) simplified to

$${\rm{\Phi }}^{\prime} ={\rm{\arg }}\,\mathop{\min }\limits_{{\rm{\Phi }}}\,(1-\gamma )\cdot L({\rm{\Phi }},\nu ^{\prime} ,\psi ^{\prime} )+\gamma \cdot G({\rm{\Phi }}).$$

(14)

Since the penalty function was smooth and convex, Eq. (14) was solved via gradient descent. Interpreting the above minimization problem as desensitizing the image scores from the influence of demographic factors, Φ parameterized the GAM specified by G(Φ) and was regularized by L (·, ν′, ψ′) to account for the noise in the image measurements i_s.
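For reference, the gradient used by such a descent step follows directly from Eqs. (1) and (4). The expression below is our derivation rather than one given in the text; it treats ${{\bf{d}}}_{s}$ as a row vector and uses the shorthand ${a}_{s}$, which is introduced here only for readability. With ${a}_{s}:={z}_{s}\cdot ({\psi ^{\prime} }^{{\rm T}}\cdot ({{\bf{i}}}_{s}-{\rm{\Phi }}\cdot {{\bf{d}}}_{s}^{{\rm T}})+\nu ^{\prime} )$ and $\theta ^{\prime} (a)=-1/(1+{e}^{a})$, the chain rule gives

$${\nabla }_{{\rm{\Phi }}}[(1-\gamma )\cdot L({\rm{\Phi }},\nu ^{\prime} ,\psi ^{\prime} )+\gamma \cdot G({\rm{\Phi }})]=(1-\gamma )\sum _{s=1}^{N}\frac{{z}_{s}}{1+{e}^{{a}_{s}}}\,\psi ^{\prime} \cdot {{\bf{d}}}_{s}+\gamma \sum _{s\in {\mathbb{C}}}\frac{2}{{\sigma }_{s}^{2}}({\rm{\Phi }}\cdot {{\bf{d}}}_{s}^{{\rm T}}-{{\bf{i}}}_{s})\cdot {{\bf{d}}}_{s}.$$

Both terms are matrices of the same size as Φ. The first sum runs over all subjects and moves Φ toward better group separation, while the second is restricted to the controls and moves Φ toward the least-squares GAM fit.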

Next, block coordinate descent updated ν′ and ψ′ by keeping Φ′ and ω′ fixed so that Eq. (12) simplified to

$$(\nu ^{\prime} ,\psi ^{\prime} )={\rm{\arg }}\mathop{\min }\limits_{\nu ,\psi \in {\mathbb{S}}}(1-\gamma )\cdot L({\rm{\Phi }}^{\prime} ,\nu ,\psi )+\rho \cdot \parallel \omega ^{\prime} -\psi {\parallel }_{2}^{2}.$$

(15)

Again, gradient descent was employed to determine the minimum of that equation as the penalty function was smooth and convex. Finally, ω′ was updated by solving Eq. (12) with fixed Φ′, ν′, and ψ′, i.e., using the closed form solution to determine

$$\omega ^{\prime} \,:\,={\rm{\arg }}\mathop{{\rm{\min }}}\limits_{\omega \in {{\mathbb{S}}}_{{N}_{K}}}\parallel \omega -\psi ^{\prime} {\parallel }_{2}^{2}.$$

(16)
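The closed-form solution of Eq. (16) is the standard projection onto the set of vectors with at most N_K non-zero entries: the N_K entries of ψ′ with the largest magnitude are kept and all others are set to zero. A minimal sketch, with a hypothetical function name and toy values:

```python
def project_to_sparse(psi, n_k):
    """Closest vector to psi (in the l2 sense) with at most n_k non-zero entries."""
    order = sorted(range(len(psi)), key=lambda k: abs(psi[k]), reverse=True)
    keep = set(order[:n_k])
    return [value if k in keep else 0.0 for k, value in enumerate(psi)]


psi = [0.1, -2.0, 0.7, 0.0, 1.5]        # toy non-sparse weights
omega = project_to_sparse(psi, 2)
print(omega)  # [0.0, -2.0, 0.0, 0.0, 1.5]
```

Any other choice of support would zero out an entry of larger magnitude and therefore increase the squared distance to ψ′.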

Following the suggestion of Zhang et al.¹², block coordinate descent (i.e., Eqs. (14)–(16)) was repeated until the relative changes of Φ′, ν′, ω′, and ψ′ between iterations were smaller than a fixed threshold ${\varepsilon }_{B}$, i.e.

$${\rm{\max }}\,\{{\rm{\Delta }}({\rm{\Phi }}),{\rm{\Delta }}(\nu ),{\rm{\Delta }}(\omega ),{\rm{\Delta }}(\psi )\} < {\varepsilon }_{B}.$$

(17)

with ${\rm{\Delta }}(a)=\frac{\parallel a^{\prime} -a^{\prime\prime} {\parallel }_{{\rm{\max }}}}{{\rm{\max }}\,\{\parallel a^{\prime} {\parallel }_{{\rm{\max }}},1\}}$, where a′ and a″ denote the estimates of two consecutive iterations. Once converged, Φ_ρ, ν_ρ, ω_ρ, and ψ_ρ were updated according to Φ′, ν′, ω′, and ψ′, ρ was increased, and another block coordinate descent was initiated until ω_ρ and ψ_ρ converged according to Eq. (13), which was the case in all of our experiments. Additional comments about the joint optimization are provided in the supplement.
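The interplay of the inner and outer loops can be sketched in a simplified setting. The code below is a toy illustration and not the released Matlab implementation: Φ is held fixed (the classifier is trained on given residual scores, so the update of Eq. (14) and the γ weighting are omitted), the data are hypothetical, and the step size, the growth factor for ρ, the thresholds, and the iteration caps are our assumptions.

```python
import math


def sigmoid(a):
    """Numerically stable logistic sigmoid."""
    if a >= 0:
        return 1.0 / (1.0 + math.exp(-a))
    e = math.exp(a)
    return e / (1.0 + e)


def max_norm(v):
    return max(abs(x) for x in v)


def relative_change(new, old):
    """Delta(a) of Eq. (17)."""
    return max_norm([n - o for n, o in zip(new, old)]) / max(max_norm(new), 1.0)


def project(psi, n_k):
    """Eq. (16): keep the n_k entries of psi with the largest magnitude."""
    keep = set(sorted(range(len(psi)), key=lambda k: abs(psi[k]), reverse=True)[:n_k])
    return [p if k in keep else 0.0 for k, p in enumerate(psi)]


def update_nu_psi(r, z, nu, psi, omega, rho, steps=50):
    """Gradient descent on L(nu, psi) + rho * ||omega - psi||^2 (cf. Eq. 15)."""
    # Upper bound on the curvature, used as a safe inverse step size.
    lipschitz = 0.25 * sum(1.0 + sum(x * x for x in rs) for rs in r) + 2.0 * rho
    for _ in range(steps):
        grad_nu = 0.0
        grad_psi = [2.0 * rho * (p - w) for p, w in zip(psi, omega)]
        for rs, zs in zip(r, z):
            score = sum(p * x for p, x in zip(psi, rs)) + nu
            coeff = -zs * sigmoid(-zs * score)  # derivative of theta(z * score) w.r.t. score
            grad_nu += coeff
            for k, x in enumerate(rs):
                grad_psi[k] += coeff * x
        nu -= grad_nu / lipschitz
        psi = [p - g / lipschitz for p, g in zip(psi, grad_psi)]
    return nu, psi


def penalty_decomposition(r, z, n_k, eps_p=1e-3, eps_b=1e-4, rho=0.1, growth=3.0,
                          max_outer=30, max_inner=200):
    n_f = len(r[0])
    nu, psi, omega = 0.0, [0.0] * n_f, [0.0] * n_f   # initialization as in the text
    for _ in range(max_outer):
        for _ in range(max_inner):                   # block coordinate descent for fixed rho
            old_nu, old_psi, old_omega = nu, psi, omega
            nu, psi = update_nu_psi(r, z, nu, psi, omega, rho)
            omega = project(psi, n_k)
            change = max(relative_change([nu], [old_nu]),
                         relative_change(psi, old_psi),
                         relative_change(omega, old_omega))
            if change < eps_b:                       # stopping rule of Eq. (17)
                break
        if max_norm([w - p for w, p in zip(omega, psi)]) < eps_p:  # Eq. (13)
            break
        rho *= growth
    return nu, omega


# HYPOTHETICAL residuals: the first score is informative, the other two are noise.
residuals = [[1.2, 0.1, -0.05], [0.8, -0.2, 0.1], [1.5, 0.05, 0.2], [-0.3, 0.1, -0.1],
             [-1.0, -0.1, 0.1], [-0.7, 0.15, -0.15], [-1.4, 0.0, 0.05], [0.4, -0.05, 0.0]]
labels = [1, 1, 1, 1, -1, -1, -1, -1]
nu_hat, omega_hat = penalty_decomposition(residuals, labels, n_k=1)
selected = [k for k, w in enumerate(omega_hat) if w != 0.0]
print(selected, omega_hat)
```

As ρ grows, the entries of ψ outside the selected support are pulled toward zero, so the sparse and non-sparse weights eventually agree to within ε_P and the outer loop stops.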

Data availability

In compliance with NIH policy, the data releases NCANDA DATA 00010 V5, NCANDA DATA 00011, and NCANDA DATA 00012 V2 that support the findings of this study are released to the public according to the NCANDA Data Distribution agreement (see https://www.niaaa.nih.gov/research/major-initiatives/national-consortium-alcohol-and-neurodevelopment-adolescence for more detail).

Code availability

Our Matlab implementation of the proposed algorithm (GAM-Sparsity Constraint Logistic Regression V1) is available via https://www.nitrc.org/projects/gam_sparityreg.

Informed Consent

All procedures performed in this study were in accordance with the Declaration of Helsinki. All participants underwent informed consent processes at the visit with a research associate trained in human subject research protocols. Adult participants or the parents of minor participants provided written informed consent before participation in the study. Minor participants provided assent before participation. The Institutional Review Boards of each NCANDA site approved this study, and each site followed this procedure to obtain voluntary informed consent or assent, depending on the age of the participant.

References

Stiles, J. & Jernigan, T. L. The basics of brain development. Neuropsychology Review 20, 327–348 (2010).
Shaw, P. et al. Neurodevelopmental trajectories of the human cerebral cortex. Journal of Neuroscience 28, 3586–3594 (2008).
Pfefferbaum, A. et al. Adolescent development of cortical and white matter structure in the NCANDA sample: Role of sex, ethnicity, puberty, and alcohol drinking. Cerebral Cortex 26, 4101–4121 (2016).
Pohl, K. M. et al. Harmonizing DTI measurements across scanners to examine the development of white matter microstructure in 803 adolescents of the NCANDA study. NeuroImage 130, 194–213 (2016).
Lebel, C. et al. Diffusion tensor imaging of white matter tract evolution over the lifespan. NeuroImage 60, 340–352 (2012).
Johnston, L. D., O’Malley, P. M., Miech, R. A., Bachman, J. G. & Schulenberg, J. E. 2016 Overview Key Findings on Adolescent Drug Use (Monitoring the Future, 2017).
Cservenka, A. & Brumback, T. The Burden of Binge and Heavy Drinking on the Brain: Effects on Adolescent and Young Adult Neural Structure and Function. Frontiers in Psychology 8, 1111 (2017).
Squeglia, L. M. et al. Brain development in heavy drinking adolescents. American Journal of Psychiatry 172, 531–542 (2015).
Jacobus, J., Squeglia, L. M., Bava, S. & Tapert, S. F. White matter characterization of adolescent binge drinking with and without co-occurring marijuana use: a 3-year investigation. Psychiatry Research: Neuroimaging 214, 374–381 (2013).
Pfefferbaum, A. et al. Altered Brain Developmental Trajectories in Adolescents After Initiating Drinking. The American Journal of Psychiatry (2017).
Squeglia, L. M. et al. Neural Predictors of Initiating Alcohol Use During Adolescence. The American Journal of Psychiatry 174, 172–185 (2017).
Zhang, Y. et al. Extracting patterns of morphometry distinguishing HIV associated neurodegeneration from mild cognitive impairment via group cardinality constrained classification. Human Brain Mapping 37, 4523–4538 (2016).
Pfefferbaum, A., Rosenbloom, M., Deshmukh, A. & Sullivan, E. Sex differences in the effects of alcohol on brain structure. American Journal of Psychiatry 158, 188–197 (2001).
Ahmadi, A. et al. Influence of alcohol use on neural response to Go/No-Go task in college drinkers. Neuropsychopharmacology 38, 2197–2208 (2013).
Luciana, M., Collins, P. F., Muetzel, R. L. & Lim, K. O. Effects of alcohol use initiation on brain structure in typically developing adolescents. The American Journal of Drug and Alcohol Abuse 39, 345–355 (2013).
O’Halloran, L., Nymberg, C., Jollans, L., Garavan, H. & Whelan, R. The potential of neuroimaging for identifying predictors of adolescent alcohol use initiation and misuse. Addiction 112, 719–726 (2016).
Brown, S. A. et al. The National Consortium on Alcohol and NeuroDevelopment in Adolescence (NCANDA): A multi-site-study of adolescent development and substance use. Journal of Studies on Alcohol and Drugs 76, 895–908 (2015).
Karas, M. et al. Brain connectivity-informed regularization methods for regression. bioRxiv (2017).
Kuceyeski, A., Meyerhoff, D. J., Durazzo, T. C. & Raj, A. Loss in connectivity among regions of the brain reward system in alcohol dependence. Human Brain Mapping 34, 3129–3142 (2013).
Schulte, T. et al. How acute and chronic alcohol consumption affects brain networks: Insights from multimodal neuroimaging. Alcoholism: Clinical and Experimental Research 36, 2017–2027 (2012).
Zahr, N. M. Chapter 17 - Structural and microstructural imaging of the brain in alcohol use disorders. Handbook of Clinical Neurology 125, 275–290 (2014).
van Erp, T. G. et al. Subcortical brain volume abnormalities in 2028 individuals with schizophrenia and 2540 healthy controls via the ENIGMA consortium. Molecular Psychiatry 21, 547–553 (2015).
Worsley, K. J. et al. A general statistical analysis for fMRI data. NeuroImage 15, 1–15 (2002).
Wang, X.-F., Jiang, Z., Daly, J. J. & Yue, G. H. A generalized regression model for region of interest analysis of fMRI data. NeuroImage 59, 502–510 (2012).
Friston, K. J. et al. Statistical parametric maps in functional imaging: A general linear approach. Human Brain Mapping 2, 189–210 (1995).
Le Berre, A. P. et al. Sensitive biomarkers of alcoholism’s effect on brain macrostructure: similarities and differences between France and the United States. Frontiers in Human Neuroscience 9, 354–1–354–13 (2015).
Kochunov, P. et al. Heritability of fractional anisotropy in human white matter: A comparison of Human Connectome Project and ENIGMA-DTI data. NeuroImage 111, 300–311 (2015).
Holland, P. W. & Welsch, R. E. Robust regression using iteratively reweighted least-squares. Theory and Methods 6, 813–827 (1977).
Fritsch, V. et al. Robust regression for large-scale neuroimaging studies. Neuroimage 111, 431–441 (2015).
Zhang, Y., Park, S. H. & Pohl, K. M. Joint data harmonization and group cardinality constrained classification. In Medical Image Computing and Computer Assisted Interventions, vol. 9900 of Lecture Notes in Computer Science, 282–290 (Springer-Verlag, 2016).
Fisher, R. A. On the interpretation of χ² from contingency tables, and the calculation of P. Journal of the Royal Statistical Society 85, 87–94 (1922).
DeLong, E. R., DeLong, D. M. & Clarke-Pearson, D. L. Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics 44, 837–845 (1988).
Kril, J. J., Halliday, G. M., Svoboda, M. D. & Cartwright, H. The cerebral cortex is damaged in chronic alcoholics. Neuroscience 79, 983–998 (1997).
Pfefferbaum, A. et al. Variation in longitudinal trajectories of regional brain volumes of healthy men and women (ages 10 to 85 years) measured with atlas-based parcellation of MRI. NeuroImage 65, 176–193 (2013).
Giedd, J. N. et al. Child psychiatry branch of the national institute of mental health longitudinal structural magnetic resonance imaging study of human brain development. Neuropsychopharmacology Reviews 40, 43–49 (2015).
Raznahan, A. et al. Longitudinal four-dimensional mapping of subcortical anatomy in human development. Proceedings of the National Academy of Sciences 111, 1592–1597 (2014).
Kapogiannis, D., Kisser, J., Davatzikos, C., abd Jeffrey Metter, L. F. & Resnick, S. Alcohol consumption and premotor corpus callosum in older adults. European Neuropsychopharmacology 22, 704–710 (2012).
Pfefferbaum, A. et al. Regional brain structural dysmorphology in human immunodeficiency virus infection: Effects of acquired immune deficiency syndrome, alcoholism, and age. Biological Psychology 72, 361–370 (2012).
Hinkley, L. B. N. et al. The role of corpus callosum development in functional connectivity and cognitive processing. Plos One 7, e39804 (2012).
Hutchinson, A. D. et al. Relationship between intelligence and the size and composition of the corpus callosum. Neuroscience 192, 455–464 (2009).
Luders, E. et al. The link between callosal thickness and intelligence in healthy children and adolescents. Neuroimage 54, 1823–1830 (2011).
van Eimeren, L., Niogi, S. N., McCandliss, B. D., Holloway, I. D. & Ansari, D. White matter microstructures underlying mathematical abilities in children. Neuroreport 19, 1117–1121 (2009).
Posner, M. I. & Rothbart, M. K. Attention, self-regulation and consciousness. Philosophical Transactions of the Royal Society of London. Series B, Biological Sciences 353, 1915–1927 (1998).
Botvinick, M. M. Conflict monitoring and decision making: reconciling two perspectives on anterior cingulate function. Cognitive, Affective, and Behavioral Neuroscience 7, 356–366 (2007).
Schulte, T., Muller-Oehring, E. M., Sullivan, E. V. & Pfefferbaum, A. Synchrony of corticostriatal-midbrain activation enables normal inhibitory control and conflict processing in recovering alcoholic men. Biological Psychology 71, 269–278 (2012).
Le Berre, A.-P. et al. Impaired decision-making and brain shrinkage in alcoholism. European Psychiatry 29, 125–133 (2014).
Wobrock, T. et al. Effects of abstinence on brain morphology in alcoholism. European Archives of Psychiatry and Clinical Neurosciences 259, 143–150 (2009).
Pitel, A.-L., Chanraud, S., Sullivan, E. V. & Pfefferbaum, A. Callosal microstructural abnormalities in Alzheimer’s disease and alcoholism: same phenotype, different mechanisms. Psychiatry Research 184, 49–56 (2010).
Pfefferbaum, A., Rosenbloom, M., Rohlfing, T. & Sullivan, E. V. Degradation of association and projection white matter systems in alcoholism detected with quantitative fiber tracking. Biological Psychology 65, 680–690 (2009).
Akshoomoff, N. et al. The NIH toolbox cognition battery: results from a large normative developmental sample (PING). Neuropsychology 28, 1–10 (2014).
McHugh, M. L. The Chi-square test of independence. Biochemia Medica 23, 143–149 (2013).
Yates, F. Contingency tables involving small numbers and the χ ² test. Journal of the Royal Statistical Society 1, 217–235 (1934).
Zimmerman, D. A note on interpretation of the paired-samples t test. Journal of Educational and Behavioral Statistics 22, 349–360 (1997).
Nichols, B. N. & Pohl, K. M. Neuroinformatics software applications supporting electronic data capture, management, and sharing for the neuroimaging community. Neuropsychology Review 25, 356–368 (2015).

Acknowledgements

This work was supported by the U.S. National Institute on Alcohol Abuse and Alcoholism (AA021697, AA012388, AA017168, AA017347, AA010723, MH113406) and the National Institute of Health Office of Directors (AA021697-04S1). KMP was also supported by the Creative and Novel Ideas in HIV Research (CNIHR) Program through a supplement to the University of Alabama at Birmingham (UAB) Center For AIDS Research funding (NIH P30 AI027767). This funding was made possible by collaborative efforts of the Office of AIDS Research, the National Institute of Allergy and Infectious Diseases, and the International AIDS Society. EVS received support from the Moldow Women’s Hope and Healing Fund.

Author information

Sang Hyun Park and Yong Zhang contributed equally to this work.

Authors and Affiliations

Department of Robotics Engineering, Daegu Gyeongbuk Institute of Science and Technology, Daegu, South Korea
Sang Hyun Park
Colin Artificial Intelligence Lab, Richmond, BC, Canada
Yong Zhang
Department of Psychiatry & Behavioral Sciences, Stanford University, Stanford, CA, 94305, USA
Dongjin Kwon, Qingyu Zhao, Natalie M. Zahr, Adolf Pfefferbaum & Edith V. Sullivan
Center for Health Sciences, SRI International, Menlo Park, CA, 94025, USA
Dongjin Kwon, Natalie M. Zahr, Adolf Pfefferbaum & Kilian M. Pohl

Contributions

S.P., Y.Z., and N.M.Z. prepared the figures and helped in writing the manuscript. Q.Z., S.P., and Y.Z. conducted the experiments. A.P., D.K., and E.V.S. collected and processed the data. K.M.P. managed the project and wrote the manuscript. All authors reviewed the manuscript.

Corresponding author

Correspondence to Kilian M. Pohl.

Ethics declarations

Competing Interests

The authors declare no competing interests.

Additional information

Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Supplemental Material and Methods

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Park, S.H., Zhang, Y., Kwon, D. et al. Alcohol use effects on adolescent brain development revealed by simultaneously removing confounding factors, identifying morphometric patterns, and classifying individuals. Sci Rep 8, 8297 (2018). https://doi.org/10.1038/s41598-018-26627-7

Received: 07 July 2017
Accepted: 15 May 2018
Published: 29 May 2018
DOI: https://doi.org/10.1038/s41598-018-26627-7
