New screening approach for Alzheimer’s disease risk assessment from urine lipid peroxidation compounds

The standard biological diagnosis of Alzheimer's disease (AD) is based on expensive or invasive procedures. Recent research has focused on molecular mechanisms involved from the early stages of AD, such as lipid peroxidation. Therefore, a non-invasive screening approach based on the determination of new lipid peroxidation compounds would be very useful. Well-defined early AD patients and healthy participants were recruited. Lipid peroxidation compounds were determined in urine using a validated analytical method based on liquid chromatography coupled to tandem mass spectrometry. Statistical analysis consisted of evaluating two different regression models, one linear (elastic net) and one non-linear (random forest), to discriminate between groups of participants. The regression models were fitted to data from several lipid peroxidation biomarkers in urine (isoprostanes, neuroprostanes, prostaglandins, dihomo-isoprostanes) as potential predictors of early AD. These prediction models achieved a fair validated area under the receiver operating characteristic curve (AUC-ROC > 0.68), and their results corroborated each other even though they are based on different analytical principles. A satisfactory early screening approach, using two complementary regression models, has been obtained from urine levels of some lipid peroxidation compounds, indicating the individual probability of suffering from early AD.


Materials and Methods
Study design and participants. Participants were recruited from the Neurology Unit (University and Polytechnic Hospital La Fe, Valencia, Spain). They were aged between 50 and 75 years and were classified into early AD (case group, n = 70) and healthy (control group, n = 29) according to neuropsychological tests, neuroimaging (nuclear magnetic resonance, computerized axial tomography), and CSF biomarkers (β-amyloid, total tau (t-Tau), phosphorylated tau (p-Tau)).
The study protocol was approved by the Ethics Committee (CEIC) of the Health Research Institute La Fe (Valencia, Spain). The methods were carried out in accordance with the relevant guidelines and regulations, and informed consent was obtained from all participants.
Urine samples analysis. Urine samples (n = 99) were collected in a sterile bottle and immediately stored at −80 °C until analysis (~6 months). As stated in a previous study, no long-term deterioration of the lipid peroxidation compounds was observed, since samples were not subjected to freeze-thaw cycles [26]. Samples were then treated following the optimum procedure established in previous work [26]. Briefly, samples were thawed on ice and 5 μL of the internal standard (IS) solution (PGF2α-d4, 10 μmol L⁻¹, and d4-10-epi-10-F4t-NeuroP, 6 μmol L⁻¹) were added to 1 mL of sample. Enzymatic hydrolysis was then performed by adding the enzyme β-glucuronidase and sodium acetate buffer (100 mmol L⁻¹, pH 4.9) and incubating for 2 hours at 37 °C. The reaction was stopped and the enzyme was precipitated with cold methanol and hydrochloric acid (37%, v/v), followed by centrifugation for 10 min (14,000 × g, 4 °C). The supernatant pH was adjusted to 6-7 with sodium hydroxide (2.5 mol L⁻¹). A clean-up and pre-concentration step was then carried out by solid-phase extraction (SPE). For this, the cartridges were first conditioned with methanol and H2O; the samples were then loaded onto the SPE cartridge, and the cartridge was washed with ammonium acetate (100 mmol L⁻¹, pH 7) and heptane. Elution was carried out with 2 × 500 μL of methanol (5% v/v CH3COOH). After that, the samples were evaporated in a vacuum evaporator and reconstituted in 100 μL of H2O (pH 3):CH3OH (85:15 v/v) containing 0.01% (v/v) CH3COOH. Finally, the samples were injected into the chromatographic system (UPLC-MS/MS).
The results were normalized to the creatinine levels, which were measured using a colorimetric kit (MicroVue creatinine EIA) and a spectrophotometer.
Chromatographic system. The chromatographic system consisted of a UPLC system (Waters Acquity) coupled to a Xevo TQD mass spectrometer (Waters, United Kingdom). The conditions used were: negative electrospray ionization (ESI−), capillary voltage of 2.0 kV, source temperature of 150 °C, desolvation temperature of 395 °C, cone gas (nitrogen) flow of 150 L h⁻¹, and desolvation gas flow of 800 L h⁻¹.
The LC conditions were selected to achieve appropriate chromatographic retention and resolution using a C18 column (2.1 × 100 mm, 1.7 μm) (Acquity UPLC BEH, Waters). The mobile phases consisted of water with 0.01% (v/v) CH3COOH (mobile phase A) and acetonitrile with 0.01% (v/v) CH3COOH (mobile phase B). The temperatures of the column and the autosampler were set at 55 °C and 4 °C, respectively. The injection volume was 8 µL and the flow rate was 0.45 mL min⁻¹. An elution gradient with a total run time of 8.5 min was applied. It consisted of 0.5 min at 80% A and 20% B, followed by a gradual change to 55% A and 45% B at 6 min; B was then increased to 95% over 0.2 min and kept constant for 0.8 min. Finally, the mobile phase composition was returned to the initial conditions and maintained for 1.3 min for system conditioning.
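The gradient program described above can be written as a small lookup of breakpoints. The following is only an illustrative sketch, not the instrument method: it assumes linear ramps between the stated points, and the 7.2 min breakpoint is an inference (a 1.3 min conditioning step ending at 8.5 min leaves 0.2 min for the return ramp). The names `GRADIENT` and `percent_b` are hypothetical.

```python
from bisect import bisect_right

# (time in min, % mobile phase B) breakpoints taken from the text.
# Assumption: the 7.2 min point is inferred from the 8.5 min total run time
# and the 1.3 min conditioning step; the text does not state it explicitly.
GRADIENT = [
    (0.0, 20.0),
    (0.5, 20.0),  # initial hold at 80% A / 20% B
    (6.0, 45.0),  # gradual change to 55% A / 45% B
    (6.2, 95.0),  # B raised to 95% over 0.2 min
    (7.0, 95.0),  # held for 0.8 min
    (7.2, 20.0),  # return to initial conditions (inferred duration)
    (8.5, 20.0),  # conditioning for 1.3 min
]


def percent_b(t_min):
    """Return % B at time t_min, assuming linear ramps between breakpoints."""
    if not GRADIENT[0][0] <= t_min <= GRADIENT[-1][0]:
        raise ValueError("time is outside the 0-8.5 min run")
    times = [t for t, _ in GRADIENT]
    i = bisect_right(times, t_min) - 1
    if i == len(GRADIENT) - 1:
        return GRADIENT[-1][1]
    (t0, b0), (t1, b1) = GRADIENT[i], GRADIENT[i + 1]
    return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)


if __name__ == "__main__":
    for t in (0.0, 3.25, 6.0, 6.2, 7.0, 8.5):
        b = percent_b(t)
        print(f"{t:4.2f} min: {100 - b:5.1f}% A / {b:5.1f}% B")
```

Writing the program as breakpoints makes it easy to check that the segment durations add up to the stated 8.5 min total.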
The detection was performed by multiple reaction monitoring (MRM) using the acquisition parameters obtained in previous work [26].
Statistical analysis. Data were summarized using the median and interquartile range (IQR) for continuous variables, and relative and absolute frequencies for categorical variables (Table 1). Prior to modelling, variables were log-transformed to avoid potentially strongly influential outliers due to the highly skewed nature of some variables (Fig. S1 in Supplementary Material). Then, an elastic-net-penalized logistic regression model was developed, including gender and age as covariates. The penalization parameter lambda was selected by performing 500 replications of ten-fold cross-validation. The lambda giving the minimum cross-validated error was selected in each replication, and the median of the selected lambda values was taken as the consensus lambda. Since the lambda at the minimum cross-validated error was used, an alternative variable selection method was performed as a sensitivity analysis. This alternative analysis consisted of a random forest using the method of Altmann et al. [27]. The final elastic net model was validated using bootstrap validation, following the procedure of Steyerberg et al. [28]. Statistical analyses were performed using R (version 3.5.0) and the R packages BootValidation (version 0.1.3), glmnet (version 2.0-16), and ranger (version 0.9.0).

Results
Table 1 shows the demographic and clinical data for both groups. Small differences in age and gender were observed between groups, so these variables were considered as covariates. The neuropsychological variables (Clinical Dementia Rating (CDR), Repeatable Battery for the Assessment of Neuropsychological Status (RBANS), Functional Activities Questionnaire (FAQ), Mini-Mental State Examination (MMSE)) and biological measures (CSF β-amyloid, CSF t-Tau, CSF p-Tau, temporal atrophy) used in the standard diagnosis showed significant differences between groups.
However, the demographic variables (age, gender, educational level, alcohol, smoking status, medication and comorbidity) did not show statistically significant differences between groups.
Screening model from urine lipid peroxidation biomarkers. The elastic net model selected five variables, corresponding to one isoprostane, one neuroprostane, one prostaglandin and two dihomo-isoprostanes (Table 3). The model also included gender and age, which were introduced as covariates. These predictor variables were combined in the model formula to estimate the individual probability (Pr) of suffering from AD. The alternative analysis using random forest selected the same five variables as the most important ones (Table 3), and all of them were also considered statistically significant by the Altmann method [27]. Classification performance was assessed by bootstrap validation for the elastic net model and by the out-of-bag (OOB) estimate for the random forest model. The bootstrap-validated area under the receiver operating characteristic curve (AUC-ROC) for the elastic net model was 0.682 and the OOB accuracy for the random forest model was 0.71, so their performance can be considered similar. Notably, for the elastic net results, the sensitivity and specificity profile shows a sharp decrease in sensitivity as specificity increases, forcing a choice between high sensitivity (0.97) at the cost of low specificity (0.31) and high specificity (0.93) at the cost of mediocre sensitivity (0.5) (Fig. 2).
Table 3. Results of the elastic net and random forest analyses. Coefficients of the elastic net model are interpreted as log-odds, so negative values indicate a negative association between higher concentration levels and risk of disease, and positive values indicate a positive association. Importance values and p-values for random forest are derived from the Gini index using the Altmann method.
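Because the elastic net coefficients are log-odds, an individual probability follows from the standard logistic function applied to the linear predictor. The sketch below shows only that generic step. The predictor names, intercept and coefficient values are hypothetical placeholders, not the fitted values from Table 3, and the predictors are assumed to be already on the scale used for fitting (log-transformed concentrations, plus age and gender).

```python
import math


def linear_predictor(intercept, coefficients, predictors):
    """Sum of intercept and coefficient * predictor (the log-odds)."""
    return intercept + sum(
        coefficients[name] * predictors[name] for name in coefficients
    )


def probability(intercept, coefficients, predictors):
    """Logistic transform of the log-odds into a probability in (0, 1)."""
    log_odds = linear_predictor(intercept, coefficients, predictors)
    return 1.0 / (1.0 + math.exp(-log_odds))


if __name__ == "__main__":
    # Hypothetical placeholder values for illustration only;
    # these are NOT the coefficients reported in the study.
    intercept = -1.0
    coefficients = {
        "log_marker_a": 0.8,   # positive: higher level, higher risk
        "log_marker_b": -0.5,  # negative: higher level, lower risk
        "age": 0.02,
        "gender_male": 0.3,
    }
    participant = {
        "log_marker_a": 1.2,
        "log_marker_b": 0.4,
        "age": 68,
        "gender_male": 1,
    }
    print(f"Pr = {probability(intercept, coefficients, participant):.3f}")
```

A positive coefficient raises the estimated probability as the predictor increases and a negative one lowers it, which is the reading of the signs described in the Table 3 caption.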

Discussion
The reliable determination of lipid peroxidation product levels in urine samples from well-defined healthy and early AD participants, together with the satisfactory classification performance of two complementary regression models, allowed the development of an early and non-invasive screening model to identify individuals at high risk of developing AD. The role of lipid peroxidation in AD development has been widely studied [10], but few studies have determined isoprostanoids as target metabolites in AD [17, 29]. In addition, the analytical methods used in most of these works were based on commercial kits or immunoassays, which are associated with low specificity in the determination of isomers [30]. In contrast, the present study used a previously validated analytical method based on mass spectrometry detection, providing high selectivity and sensitivity, as well as high reliability in the simultaneous determination of several isoprostanoid isomers [26].
In this work, two alternative modeling methods with completely different characteristics were used. First, elastic net logistic regression is based on standard generalized linear models; it therefore assumes a linear relationship between the predictors and the linear predictor, no interactions are assessed, and the results are fully interpretable, as in a standard logistic regression. On the other hand, random forest is a non-linear, non-parametric model that enables the assessment of higher-order interactions between variables, at the cost of lower statistical power than the elastic net model when the relationship is linear [33, 34]. Random forest does not provide an interpretable model, but it provides a list of the most important variables for predicting the response. The fact that both methods obtained very similar results lends robustness to our findings.
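For reference, the penalized objective minimized by an elastic net logistic regression can be written as below. This is the general form documented for the glmnet binomial family, not a statement of the exact settings used here: the mixing parameter α is not reported in the text, and the symbols N, x_i, y_i, β0 and β are introduced only for illustration.

```latex
\min_{\beta_0,\,\beta}\;
-\frac{1}{N}\sum_{i=1}^{N}
\Bigl[\, y_i\,(\beta_0 + x_i^{\top}\beta)
      - \log\bigl(1 + e^{\beta_0 + x_i^{\top}\beta}\bigr) \Bigr]
+ \lambda \Bigl[\, \frac{1-\alpha}{2}\,\lVert\beta\rVert_2^{2}
      + \alpha\,\lVert\beta\rVert_1 \Bigr]
```

The first term is the negative log-likelihood of a standard logistic regression, which is why the fitted coefficients keep their log-odds interpretation. The L1 part of the penalty can shrink coefficients exactly to zero, which is what makes the method usable for variable selection, and λ controls the overall strength of the shrinkage.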
In the literature, few AD predictive models using these sophisticated statistical tools can be found [21-23, 34], and most of them are based on neuroimaging measures [24]. However, none of them was based on the non-invasive determination of lipid peroxidation biomarkers in early AD patients.
The diagnostic indexes obtained from both models indicate that the results could constitute a satisfactory screening approach from the early stages of AD, with the consequent benefits for patients and the public health system. In fact, the high sensitivity obtained would allow reliable identification of high-risk patients in the early stages of AD, who would then be referred to a method with higher specificity to rule out false positives [17]. Nevertheless, further clinical validation using an external cohort of participants would be required in order to obtain a reliable diagnostic model.
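To make the sensitivity and specificity trade-off concrete, the two operating points reported for the elastic net model (0.97/0.31 and 0.5/0.93) can be applied to the study group sizes (70 cases, 29 controls). This is simple arithmetic on the reported figures, offered as an illustrative sketch. The function name `screening_counts` is hypothetical, and because the share of cases in this cohort (70 of 99) is far from what a real screening population would show, the predictive values it returns should not be read as population estimates.

```python
def screening_counts(sensitivity, specificity, n_cases, n_controls):
    """Expected confusion-matrix counts and predictive values for a cohort."""
    tp = sensitivity * n_cases       # cases correctly flagged
    fn = n_cases - tp                # cases missed
    tn = specificity * n_controls    # controls correctly cleared
    fp = n_controls - tn             # controls flagged (false positives)
    return {
        "tp": tp,
        "fn": fn,
        "tn": tn,
        "fp": fp,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


if __name__ == "__main__":
    # Operating points and group sizes as reported in the text.
    for label, sens, spec in (
        ("high sensitivity", 0.97, 0.31),
        ("high specificity", 0.50, 0.93),
    ):
        r = screening_counts(sens, spec, n_cases=70, n_controls=29)
        print(
            f"{label}: TP={r['tp']:.1f} FN={r['fn']:.1f} "
            f"TN={r['tn']:.1f} FP={r['fp']:.1f}"
        )
```

At the high-sensitivity point, about 68 of the 70 cases are flagged along with about 20 of the 29 controls, which is why a second, more specific method is needed downstream. At the high-specificity point, only about 2 controls are flagged, but half of the cases (35) are missed.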
Regarding the study limitations, the low number of controls compared to cases is explained by the difficulty of recruiting healthy participants with CSF biomarker data. Also, participants with other similar dementias were not included, so a differential AD diagnosis could not be achieved. Further clinical validation work will include a higher number of controls, as well as patients with similar pathologies. In addition, a follow-up study will be carried out to evaluate the variation of the levels of these compounds over time.

Conclusion
A set of new lipid peroxidation biomarkers has been determined in urine samples from well-defined participants (early AD, healthy) by means of a previously validated analytical method. The reliable results obtained were used to develop a preliminary, early and non-invasive screening model to identify individuals at high risk of suffering from AD, although it cannot be considered AD-specific. For this, two different regression models (linear, elastic net; non-linear, random forest) were developed. They showed similar performance in terms of variable selection and accuracy despite being based on different analytical principles, which lends robustness to the results.

Data Availability
The datasets generated during the current study are available from the corresponding author on reasonable request.