DNA Methylation Signature on Phosphatidylethanol, not Self-Reported Alcohol Consumption, Predicts Hazardous Alcohol Consumption in Two Distinct Populations

The process of diagnosing hazardous alcohol drinking (HAD) is based on self-reported data and is thereby vulnerable to bias. There has been an interest in developing epigenetic biomarkers for HAD that might complement clinical assessment. Because alcohol consumption has been previously linked to DNA methylation (DNAm), here, we aimed to select DNAm signatures in blood to predict HAD from two demographically and clinically distinct populations (Ntotal=1,549). We first separately conducted an epigenome-wide association study (EWAS) for phosphatidylethanol (PEth), an objective measure of alcohol consumption, and for self-reported alcohol consumption in Cohort 1. We identified 102 PEth-associated CpGs, including 32 CpGs previously associated with alcohol consumption or alcohol use disorders. In contrast, no CpG reached epigenome-wide significance on self-reported alcohol consumption. Using a machine learning approach, two subsets of CpGs from EWAS on PEth and on self-reported alcohol consumption from Cohort 1 were separately tested for the prediction of HAD in Cohort 2. We found that a subset of 130 CpGs selected from the EWAS on PEth showed an excellent prediction of HAD with area under the ROC curve (AUC) of 91.31% in training set and 70.65% in validation set of Cohort 2. However, CpGs preselected from the EWAS on self-reported alcohol consumption showed a poor prediction of HAD with AUC 75.18% in the training set and 57.60% in the validation set. Our results demonstrate that an objective measure for alcohol consumption is a more informative phenotype than self-reported data for revealing epigenetic mechanism. The PEth-associated DNAm signature in blood is a robust biomarker for alcohol consumption.


INTRODUCTION
Hazardous alcohol drinking (HAD) is detrimental to health and is highly correlated with medical comorbidities and psychiatric diseases 1,2 . Diagnosing HAD is challenging due to a lack of stable and objective measures for chronic heavy alcohol consumption 3 .
Phosphatidylethanol (PEth) is a lipid metabolite of ethanol formed from phosphatidylcholine in erythrocytes and has been proposed as a biomarker for alcohol consumption. Compared with self-reported data, PEth reliably detects ethanol levels up to 21 days after the last drink 4 , and the PEth level is highly correlated with alcohol consumption 5 . However, the clinical applicability of PEth is limited because its half-life the HAD predicting procedure using the preselected CpGs from the EWAS on AUDIT-C score was also performed. The analytical strategy is presented in Figure 1.

Sample descriptions
Cohort 1 (N=1,047): The DNA samples in Cohort 1 were from the Veterans Aging Cohort Study (VACS). The VACS is a longitudinal cohort of HIV-positive and HIVnegative participants seen in infectious disease and general medical clinics. The study is funded primarily by the National Institute on Alcohol Abuse and Alcoholism at the National Institutes of Health 42 . Data were obtained from the patients after they provided written consent; data were collected via telephone interviews, focus groups, and full access to the national Veterans Affairs (VA) electronic medical record system. All subjects in this subset of the VACS cohort were men.
Samples in Cohort 1 were divided into a discovery set (N=580) and a replication set (N=467) for EWAS and for predicting HAD. A majority of discovery samples were HIV-positive (~85.34%), and all replication samples were HIV-positive. HIV Viral Load (VL) was measured per standard of care by polymerase chain reaction as copies per milliliter. The adherence to antiretroviral therapy (ART adherence) was obtained by a survey in the same time window as the blood draw for the measurement of DNA methylation. Genomic DNA was extracted from whole blood using a standard method 12 .

Cohort 2 (N=502):
We recruited 502 HIV-negative healthy community volunteers who responded to advertisements placed either online or in local newspapers and at a community center in New Haven, CT 43 . The subjects were 18-50 years old. We excluded subjects who met the Diagnostic and Statistical Manual of Mental Disorders, 4th Edition (DSM-IVTR) (American Psychiatric Association, 1994) criteria for substance dependence on any drug or alcohol other than nicotine. Subjects with head injury or those who used prescribed medications for any psychiatric or medical disorders were also excluded. Women on oral contraceptives, women who were peri-and postmenopausal, women who had a prior hysterectomy and women who were pregnant, or lactating were excluded. Participants also received a physical examination during a separate session by a research nurse who assessed cardiovascular, renal, hepatic, pancreatic, hematopoietic, and thyroid functions to ensure that all participants were in good health. A breathalyzer test and urine toxicology screens were conducted at each appointment to ensure a drug-free status among participants. Cohort 2 was divided into a training set and a testing set for machine learning prediction of HAD.
Genomic DNA was extracted from whole blood using a standard method.
All phenotypic data in Cohort 1 and Cohort 2 were obtained in the same time window as the blood draws for DNA methylation profiling. The study was approved by the committee of the Human Research Subject Protection at Yale University and the IRB committee of the Connecticut Veteran Healthcare System.

Phosphatidylethanol (PEth) measurement
In this study, PEth was only measured in Cohort 1 using dried blood spot samples derived from frozen peripheral blood mononuclear cells stored at -80°C 5 . PEth can be detected at concentrations as low as 2 ng/ml. A study showed that the PEth value is linearly related to alcohol consumption 44 . In forensics, 20 ng/ml of PEth was used as a cutoff to detect harmful alcohol use 45 . The sensitivity of PEth has been reported to be 99% 44 , with several studies showing the assay to have perfect specificity, including in the presence of liver disease and hypertension. We previously reported that PEth was highly correlated with the AUDIT-C score from electronic records 46 .

Definition of hazardous alcohol drinking (HAD)
In Cohort 1, HAD was defined by a PEth level greater than 20 ng/ml and an AUDIT-C score greater than 4. In Cohort 2, HAD was defined by a 10-item AUDIT score greater than 8 for men and greater than 7 for women. Demographic and clinical variables for HAD versus non-HAD participants in Cohort 1 and Cohort 2 are presented in Table 1.

DNA methylation and data quality control
In Cohort 1, DNAm for the discovery sample was profiled by using the Illumina Infinium HumanMethylation450 Beadchip (Illumina HM450K) (San Diego, CA, USA). DNAm for Epigenome-wide significance was set at a false discovery rate (FDR) < 0.05 in the discovery sample. Significance in the replication sample was set at a nominal < 0.05.

First generalized linear model
We performed a linear model to adjust for the confounders mentioned above in both the discovery and replication models.

Principal component analysis (PCA) of intermediary residuals
We then performed a PCA on the resulting regression residuals. The top five principal components on the residuals (PCs) ( 1 , … , 5 ) were adjusted in the final model.

A final generalized linear model for identifying differential methylation
We performed a final generalized linear regression analysis for each methylation marker predicting the as a function of the natural logarithm of the PEth value adjusted for technical and biological factors and the top 5 PC residuals derived from the model above.

Meta-analysis of EWAS in Cohort 1
An EWAS meta-analysis was conducted by combining the discovery sample and the replication samples. For each CpG, we obtained effect size estimates and p-values from the two samples and weighted the effect size estimates by their estimated standard errors. Then, the summary statistics of the two samples were combined using a samplesize weighted meta-analysis using the METAL program 52 . Epigenome-wide significance was set at a FDR <0.05.

PolyGenic Methylation Score (PGMS)
We constructed a PGMS for each individual as a weighted sum of the individual CpG values using the effect size estimated from the EWAS as weights 14 . In detail, the PEthrelated CpGs identified in the meta-analysis were chosen to construct the PGMS. Then, the PGMS was applied to establish a prediction model for HAD in Cohort 2.

=
: the PGMS of individual ; : the estimated coefficient for CpG probe ; : the methylation -value for individual at CpG probe .

Adjusted and incremental adjusted
We used the adjusted to estimate the phenotypic variances explained by the DNA We applied the incremental adjusted (incremental ) as one of the parameters for feature selection as described below. The incremental was used to determine whether a new predictor increases the predictive ability above and beyond that provided by an existing model. It was calculated for each selected CpG or the linear combination of selected CpGs.

Feature selection using elastic net regularization (ENR)
CpG features were separately preselected from the EWAS results on PEth and on AUDIT-C in Cohort 1. Using the ENR method, we performed a 10-fold cross-validation for feature selection in the training sample of Cohort 2. Here, we randomly selected 80% of the samples in Cohort 2 and cross-validated them to obtain the values for the ENR tuning parameters. The following steps were taken to select the CpG features and to evaluate their performance.
Step 1. Preselection CpGs. Because DNAm of CpGs under the epigenome-wide significance threshold may collectively account for phenotype variation and may improve prediction of a phenotype, we preselected PEth-associated CpGs with a meta < 1E-04 from the meta-EWAS in Cohort 1 for both PEth and AUDIT-C. The preselected CpGs were used to establish the predictive model in the training set of Cohort 2.
Step 2. Importance ranking CpGs. In the training set of Cohort 2, we performed an ENR for feature selection among the preselected CpGs. We extracted the coefficients for the model with the lambda value corresponding to the minimum mean cross-validated error. This procedure was repeated times. We excluded the CpGs with percentage of zero coefficients larger than 95%. All selected CpGs were reranked according to the summation of the absolute value of the coefficients.
Step 3. Model building by ENR in the training set. CpG features were selected based on the area under the receiver operating characteristic curve (AUC), prediction accuracy, and the incremental for different numbers of CpG sets. The model with the best performance was determined, and the optimal values of the parameters in the best model were found by performing cross-validation in ENR.
Step 4. Model performance testing in the testing set. The performance of the CpG features selected from the training set were evaluated in the testing set using AUC, prediction accuracy, and the incremental .
All analyses were performed using R software (https://www.r-project.org/). ENR was performed using the function "cv.glmnet" in the "glmnet" package.

Biological interpretation of the prediction model
Gene enrichment analysis was performed using the CpGs from the final prediction model to understand the underlying biological significance. We applied the webaccessible, gene annotation term-based Database for Annotation, Visualization and Integrated Discovery (DAVID) for gene enrichment analysis (http://david.niaid.nih.gov) 53 . The expanded DAVID Knowledgebase integrates almost all major and well-known public bioinformatics resources 54 . A significant pathway was set as a nominal < 1.00E-02.

EWAS identifies new DNA methylation CpGs for PEth but not for self-reported alcohol consumption
Two analyses of EWAS on PEth values and on the AUDIT-C scores were separately conducted in Cohort 1. Phenotypically, as expected, PEth level and AUDIT-C score were highly correlated ( = 0.45, < 2.00E-16) ( Figure S1a). Compared to the non-HAD group, the HAD group had a greater AUDIT-C score and a higher level of PEth ( = 3.47E-33) ( Figure S1b).
In the discovery sample, the HAD group included more African Americans (AAs), had a higher rate of tobacco smokers and a lower level of ART adherence compared to the non-HAD group ( < 5.00E-02). In the replication sample, the prevalence of smoking in the HAD group was higher than in the non-HAD group, but this difference did not reach statistical significance ( Table 1). Smoking status was still adjusted in the model to address potential smoking effects.

Discovery EWAS on PEth and on AUDIT-C
We identified 9 epigenome-wide significant CpGs on PEth (FDR = 1.22E-04~4.68E-02) ( Figure S2a, Table S1). The EWAS analysis showed minimal inflation ( = 1.093) ( Figure S2b). The 9 significant CpGs were located on 7 genes: SLC7A11 We found no CpGs that reached an epigenome-wide significance threshold for self-reported AUDIT-C scores. Six of the 9 CpGs associated with PEth showed nominal association with AUDIT-C (nominal ranged from 3.50E-03 to 4.76E-02): cg06690548 (ABAT), and cg18590502 (CCDC71). It is noteworthy that all 9 CpGs associated with PEth showed the same direction as the associations with the AUDIT-C scores in the discovery set.
As expected, the analysis of the EWAS on AUDIT-C scores revealed no CpG reaching epigenome-wide significance in the replication sample. Only 2 of 9 CpGs associated with PEth were nominally associated with AUDIT-C scores (cg11376147 in SLC43A1 with = -3.69 and = 2.71E-04, cg26689780 in WDR1 with = 2.07 and = 3.92E-02) and showed the same direction as the association of PEth.
Interestingly, even with an increased sample size in the meta-analysis, we found no epigenome-wide significant CpG site of the meta-EWAS on AUDIT-C scores ( Figure   2b).

We further tested the correlation between the -values of the 13 CpGs with
Bonferroni significance and PEth. All 13 CpGs were significantly correlated with PEth levels after the model was adjusted for confounding factors (Figure 2c), four of the 13 CpGs were positively correlated with PEth, and the remaining 9 CpGs were negatively correlated with PEth.

PEth-associated CpG sites improves the prediction of HAD in Cohort 1
Because PEth itself was highly correlated with AUDIT-C scores and differed significantly between the HAD and the non-HAD groups, we were interested in whether PEth-associated CpG DNAm improved the prediction of HAD compared to the prediction of HAD using PEth alone. We found that the AUC was 74.2% for PEth alone, 76.9% with the 13 Bonferroni significant CpGs and PEth, and 88.3% with the 102 epigenome-wide significant CpGs and PEth ( Figure S4). Thus, DNAm features improved the prediction of hazardous alcohol consumption compared to PEth alone in the same cohort.

PGMS derived from 102 PEth-associated CpGs is correlated with alcohol consumption in an independent sample
To be consistent with the analysis in Cohort 1, we performed an EWAS on AUDIT-C score in Cohort 2. We found no epigenome-wide significant CpG for AUDIT-C. An EWAS for a full scale of AUDIT score also revealed no significant CpG.
We found that a PGMS constructed from the 102 PEth-associated CpGs was highly correlated with the self-reported 10-item AUDIT score in Cohort 2 ( = 0.40, < 8.09E-20). The incremental of the association between the PGMS corresponding to 102 PEth-related CpGs and the full AUDIT score was 0.1002, which implied that the PGMS explained 10.02% of the variance of the full AUDIT score in an independent population ( Figure S5a).
We further tested whether the PGMS derived from the PEth-associated CpGs was separately correlated with self-reported alcohol consumption (AUDIT-C, first 3items of AUDIT) and self-reported problem alcohol drinking behaviors (AUDIT-P, item 4-10 of full AUDIT). We found that the PGMS was significantly correlated with AUDIT-C score ( = 0.37, = 4.60E-16) ( Figure S5b) and AUDIT-P score ( = 0.35, = 3.70E-11) (Figure S5c). The correlation of the PGMS was slightly stronger with the AUDIT-C score than with the AUDIT-P score.

PEth-associated DNA methylation CpG sites predict HAD in Cohort 2
Notably, we found no statistically significant difference in the characteristics between the training set and the testing set in Cohort 2 ( In the testing set, we found that the model with the 130 CpGs showed an AUC of 70.60%, a balanced accuracy of 60.00%, and an incremental of 3.79% (Figure 3b).
The results show that the 130 selected PEth-associated CpGs enabled the good prediction of HAD. Notably, the panel of 130 CpGs included 48 epigenome-wide significant CpGs for meta-EWAS on PEth in Cohort 1. We summarized the information of the 130 selected CpGs in Table S3.
Using the same approach for the analysis of feature selection of AUDIT-C-

Biological interpretation of the 130 identified PEth-associated CpGs
The 130 CpGs from the final predictive model were annotated on 111 genes. Gene enrichment analysis using the 111 genes yielded 28 significant annotation terms ( < 1.00E-02, Figure S6)  Emerging evidence suggests that a set of epigenetic modification markers across different tissues is more stable and reproducible than we previously expected 56  The reproducible CpGs suggest a robust, consistent epigenetic response to alcohol consumption that can serve as biomarkers for clinical use. Using a machine learning approach, we identified a set of 130 CpGs that enables the distinction of HAD and non-HAD individuals. One of the common challenges for machine learning prediction is model overfitting. We took several steps to address this concern: 1) feature preselection and selection were conducted in two different cohorts; 2) the processes of feature selection and model evaluation were carried out in the same cohort but in different sets without overlapping samples; and 3) we applied a newly developed machine learning ENR method to select features in a combination of 10-fold crossvalidation. Compared to two traditional penalized regression methods, Ridge 57 and the least absolute shrinkage and selection operator (LASSO) 58 , ENR has the advantage of selecting informative features without compromising predictive accuracy and has been shown to outperform both the Ridge and LASSO methods 59 . With these strengths of the analytical approach, we showed that a panel of 130 CpGs performed fairly well with an AUC of 70.60%, a balanced accuracy of 60.00%, and an incremental of 3.79% in the testing sample set. Although the AUC in our study was less than the previously reported AUC of 0.90-0.99 with 144 CpGs 30 , our result is less likely to be inflated because of our analytical approach to avoid data overfitting. In the previous study, the model building and evaluation were performed using the same sample set while we performed the prediction analysis in the training and testing set separately.

DISCUSSION
Several limitations should be considered in interpreting the current findings. 1) There was a lack of power to detect sex-specific associations between CpGs and HAD.
It is well known that HAD in men and women is epidemiologically and mechanistically different. The individuals in Cohort 1 were all men and approximately 50% of the individuals in Cohort 2 were women. These samples are insufficient to seek sex-specific DNAm markers.
2) The DNAm signatures were identified from whole blood samples that other alcohol use-related phenotypes, e.g., alcohol use disorder, is necessary to confidently claim the predictive performance and accuracy for clinical use.
In summary, to the best of our knowledge, this is the first study to demonstrate that PEth is a robust phenotype for detecting subtle DNAm changes associated with alcohol consumption compared to self-reported alcohol use data. PEth-associated DNAm markers predicted HAD with a good accuracy. These findings suggest that

Demographic variables, clinical variables and methylation status for the VACS samples
were submitted to the GEO dataset (GSE117861) and are available to the public. All codes for analysis are also available upon a request to the corresponding author.

AUTHORS' CONTRIBUTIONS
XL was responsible for the bioinformatics data processing and statistical analysis. ACJ provided DNA samples and clinical data and contributed to the interpretation of findings and manuscript preparation. KS contributed to the manuscript preparation. JHK contributed to the interpretation of findings and manuscript preparation. RS provided DNA samples and clinical data and contributed to manuscript preparation. KX was responsible for the study design, study protocol, sample preparation, data analysis, interpretation of findings, and manuscript preparation. All authors read and approved the final manuscript.