Plasma and urinary metabolomic profiles of Down syndrome correlate with alteration of mitochondrial metabolism

Down syndrome (DS) is caused by the presence of a supernumerary copy of the human chromosome 21 (Hsa21) and is the most frequent genetic cause of intellectual disability (ID). Key traits of DS are the distinctive facies and cognitive impairment. We conducted for the first time an analysis of the Nuclear Magnetic Resonance (NMR)-detectable part of the metabolome in plasma and urine samples, studying 67 subjects with DS and 29 normal subjects as controls selected among DS siblings. Multivariate analysis of the NMR metabolomic profiles showed a clear discrimination (up to 80% accuracy) between the DS and the control groups. The univariate analysis of plasma and urine revealed significant alterations in several metabolites. Remarkably, most of the altered concentrations were consistent with the 3:2 gene dosage model, suggesting effects caused by the presence of three copies of Hsa21 rather than two: the DS/normal ratio in plasma was 1.23 (pyruvate), 1.47 (succinate), 1.39 (fumarate), 1.33 (lactate) and 1.4 (formate). Several significantly altered metabolites are produced at the beginning of or during the Krebs cycle. Accounting for sex, age and fasting state did not significantly affect the main results of either the multivariate or the univariate analysis.

It is widely accepted that the Hsa21 gene product excess in a ratio of 3:2 when comparing trisomy 21 and normal cells is responsible for the typical features of DS 2,11-13 ; however, a pathogenetic model linking specific structural and functional aspects of Hsa21 to ID in DS is not yet known.
In his studies, Lejeune hypothesized that DS could be considered a metabolic disease. In the conference talk "Vingt Ans Après", he explained how the one carbon cycle could be involved in the pathogenesis of ID in subjects who do not have a gross anatomic defect of the brain, and he asserted: "the goal is to figure out where a link between mental deficiency and trisomy 21 should be sought" 14 .
To explain his thoughts, he compared the genotype to an orchestra in "concert": trisomy 21 is dis-concerting 15 . That means that the chemical bases of ID in these subjects are not coordinated. Through a careful cytological and biochemical analysis, it was demonstrated that some enzymes with increased activities are encoded by genes located on Hsa21, but also by genes located on the other chromosomes. For example, superoxide dismutase 1 (SOD1) activity, which is increased by 1.5 times in trisomy 21 children, belongs to the first group, while glutathione peroxidase (GPX1), which is also increased, belongs to the second one 15 .
In this work, we performed for the first time a metabolomic analysis of plasma and urine from Down syndrome and control subjects in order to give some insight into the metabolic processes possibly changed in DS as a result of gene imbalance. Metabolomics is a fairly recent discipline focusing on the comprehensive analysis of the metabolites in a biological system 16 . It studies metabolites, the small molecules that are end products of cellular processes and are collectively referred to as the "metabolome" 17 . The major challenge of metabolomics is to analyze as many endogenous metabolites as possible, as accurately as possible 16 . The metabolic profile could be considered an instantaneous "snapshot" of the cell physiology. Indeed, metabolomics is giving important outcomes in the clinical area, especially in identifying biomarkers or in defining disease pathophysiology 18,19 . Each of these profiles provides information that cannot be obtained directly from the genotype, gene expression profiles, or even from the proteome (the set of all the proteins expressed by the genome) of an individual 19 .
Blood serum, plasma and urine are the biological fluids generally used to examine the alterations of metabolite levels. The two main methods used to perform metabolomic analysis are nuclear magnetic resonance (NMR) spectroscopy and mass spectrometry (MS) coupled with separation techniques 20,21 . NMR, although characterized by a lower sensitivity than MS, is a very appropriate platform because it is highly reproducible, quantitative, and requires minimal sample manipulation.
The aim of this work is to verify the hypothesis that specific metabolic alterations may be detected in biological fluids of subjects with DS. To model systematic alterations of metabolites in subjects with DS, we chose to analyze plasma and urine samples. Untargeted 1H-NMR was used to measure the NMR-detectable part of the metabolome in these biological fluids in both subjects with DS and healthy control subjects recruited among the normal siblings of subjects with DS. Multivariate statistical analysis was performed to evaluate the discrimination accuracy between DS and controls on the basis of their NMR profiles. Univariate analysis was performed to identify metabolites that have significantly different concentrations in DS and control groups. All significant results are discussed in terms of genomics and biochemistry.

Results
Study Design. The main features of the analyzed cohorts and collected samples are described in the Materials and Methods section and summarized in Table 1. Given that in the pediatric age it is not always possible to collect all samples in a fasting state, and to exclude the possibility that breakfast could have altered the results if patients were examined later in the morning, we performed multivariate and univariate analysis for two groups of subjects: the "all" groups, which include fasting and non-fasting subjects, and the "fasting" groups, which include only fasting subjects.
A random sampling of 25 DS and 25 control subjects, repeated 50 times, still showed a good discrimination (80%) between DS and controls, ruling out the possibility that the statistically significant difference in mean age between the two groups affected the main results.
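The repeated balanced resampling described above can be sketched as follows. This is only an illustration under stated assumptions: the function names, the toy one-dimensional classifier and the synthetic values are hypothetical and are not the authors' pipeline or data; only the group sizes (25 per group) and the number of repetitions (50) come from the text.

```python
import random
from statistics import mean

def balanced_resampling(ds, ctrl, n_per_group, n_repeats, evaluate, seed=0):
    """Average a score over repeated, size-balanced random subsamples.

    ds, ctrl : lists with one entry per subject
    evaluate : hypothetical callable(ds_subset, ctrl_subset) -> accuracy in [0, 1]
    """
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    scores = []
    for _ in range(n_repeats):
        ds_subset = rng.sample(ds, n_per_group)      # drawn without replacement
        ctrl_subset = rng.sample(ctrl, n_per_group)
        scores.append(evaluate(ds_subset, ctrl_subset))
    return mean(scores)

def midpoint_accuracy(ds_subset, ctrl_subset):
    """Toy 1-D classifier (an assumption): split at the midpoint of the group means."""
    cut = (mean(ds_subset) + mean(ctrl_subset)) / 2
    ds_is_higher = mean(ds_subset) >= mean(ctrl_subset)
    hits = sum((x > cut) == ds_is_higher for x in ds_subset)
    hits += sum((x <= cut) == ds_is_higher for x in ctrl_subset)
    return hits / (len(ds_subset) + len(ctrl_subset))

# Synthetic, well-separated values (NOT study data), sized like the plasma groups.
ds_values = [2.0 + 0.01 * i for i in range(41)]
ctrl_values = [1.0 + 0.01 * i for i in range(25)]
avg_accuracy = balanced_resampling(ds_values, ctrl_values, 25, 50, midpoint_accuracy)
```

Equalizing the group sizes in every draw means that each repetition sees a different age composition, which is the property the resampling argument relies on.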
Regarding univariate analysis, the signals of 33 metabolites were unambiguously assigned (Supplementary Table S1) and integrated in the 1H-NMR spectra of plasma. They are listed in Tables 2 (all) and 3 (fasting). When all subjects were considered irrespective of the fasting state, acetate, acetoacetate, acetone, creatine, formate, L-glutamine, glycerol, pyruvate, succinate and Unk3 (an unknown metabolite, Supplementary Figure S2) were significantly increased in DS plasma with a DS/CTRL ratio > 1; instead, lysine and tyrosine were significantly reduced in DS with a DS/CTRL ratio < 1 (Table 2). From the 1D spectral patterns, Unk3 appears to give rise to a single detectable signal at 3.94 ppm, which makes 2D approaches unhelpful for its assignment.
When fasting subjects were selected, the same metabolites were also found to be significantly increased or decreased in DS, with the exception of acetoacetate and lysine, which were not significantly different (Table 3). To evaluate the presence of confounding factors such as age and sex, the metabolites that were significant in the univariate statistical analysis were tested with univariate and multivariate logistic regression (Supplementary Table S2).
A random sampling of 20 DS and 20 control subjects from the total groups, repeated 50 times, still showed a good discrimination (71.4%) of DS vs controls, ruling out the possibility that the statistically (but not clinically) significant difference in mean age between the two groups affected the main results.
Regarding univariate analysis, the signals of 30 metabolites were unambiguously assigned (Supplementary Table S3) and integrated in the 1H-NMR spectra of urine. They are listed in Tables 4 (all) and 5 (fasting). When all subjects were considered irrespective of the fasting state, phenylacetylglycine, trimethylamine-N-oxide (TMAO) and tyrosine were significantly increased in DS with a DS/CTRL ratio > 1; instead, glycine was reduced in DS with a DS/CTRL ratio < 1 (Table 4).
When fasting subjects were selected, ethanolamine, glutamate + glutamine and phenylacetylglycine were significantly increased in DS with a DS/CTRL ratio > 1; instead, leucine was significantly reduced in DS with a DS/CTRL ratio < 1 (Table 5).
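Tables 2-5 report, next to each raw p-value, a p-value after false discovery rate correction (pFDR). The correction procedure is not named in this section; the sketch below assumes the widely used Benjamini-Hochberg step-up method, and the function name and the example p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted in ascending order.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical raw p-values for four metabolites (not taken from the tables).
raw_p = [0.01, 0.04, 0.03, 0.005]
p_fdr = benjamini_hochberg(raw_p)
```

With many metabolites tested at once, a raw p < 0.05 can fail to survive this adjustment, which is why the text distinguishes between significance in terms of p and in terms of pFDR.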

Genomic Analysis.
We performed an analysis to correlate the metabolites we found altered in DS, the enzymes of the metabolic pathways related to these metabolites and the genomic location of the corresponding genes. The results are shown in Supplementary Table S4.

Discussion
In this work, we have conducted for the first time a systematic analysis of the NMR-detectable part of the metabolome in subjects with DS and normal controls recruited among their siblings. Theories about DS pathogenesis, in particular concerning ID, have considered different aspects of the data derived from different research themes, focusing on neuronal proliferation, neurotransmission modulation and oxidative stress as possible main mechanisms impaired in DS. The metabolic hypothesis was presented mainly by J. Lejeune in the 70s, although related systematic investigation has not yet been performed.
To date no report has been published on the metabolomic analysis in DS, although several studies, discussed below, have underlined alterations in single compounds in blood, urine or cells obtained from DS subjects compared to normal subjects suggesting that specific alterations of metabolic pathways could play a critical role in the pathogenesis of DS 14 .
Over time, several alterations in metabolite concentrations have been described in the blood of DS subjects in comparison with control subjects: i) increased levels of phenylalanine and tyrosine in blood serum following L-phenylalanine load and due to a lower hydroxylation rate of phenylalanine 22 ; ii) lower plasma levels of free histidine, lysine, tyrosine, phenylalanine, leucine, isoleucine and tryptophan 23 ; iii) increased plasma concentrations of leucine, isoleucine, cysteine and phenylalanine at an age vulnerable to Alzheimer changes 24 ; iv) decreased plasma concentration of serine at any age, possibly due to a dosage effect of the gene for cystathionine beta synthase 25 ; v) increased plasma lysine concentration in patients above 10 years old, possibly due to premature aging 25 . More recently, the concentrations of metabolites related to the methylation cycle, such as cysteine, cystathionine, choline and dimethylglycine, were found to be significantly elevated in DS plasma by MS analysis 26 , as were the plasma levels of S-adenosylhomocysteine and S-adenosylmethionine, which had however been found to be decreased in a previous report 27 . Discrepancies in the results for some metabolite measurements could be due to the use of different methods or to differences in the investigated populations 26 .
Although it has been proposed that oxidative stress has a main role in the pathogenesis of DS, urinary biomarkers of oxidative stress have not been studied in this condition. A urinary tyrosyl radical, produced from the oxidation of L-tyrosine by the myeloperoxidase-H2O2 system of macrophages and neutrophils 28 , has been proposed as an oxidative stress biomarker in hypothyroid DS children 29 .
Metabolomic studies were also conducted on amniotic fluid samples from fetuses with DS compared with those of non-syndromic fetuses by MS analysis showing an elevation of phenylpyruvate that inhibits the metabolism of tetrahydrobiopterin 30 ; decreased levels of glycine and glutamate, involved in the neurotransmission processes 31 , and an increased level of glutamine were also measured by high-performance liquid chromatography (HPLC).
The recent availability of powerful techniques opens the way to investigate a large number of metabolites in biological fluids. This allows the measurement of the concentration of specific metabolites of interest by analyzing the spectra that are acquired for the whole complement of substances present in the fluid. On the other hand, using plasma and urine as the study models provides the possibility to visualize a balance of the metabolism at the level of the whole body. This makes sense in a genetic condition in which all cell types in all organs present with the same basic defect. A first key result is the multivariate analysis of the data obtained from the plasma analysis, which has allowed us to discriminate with accuracies of the order of 80% changes in the NMR metabolomic profile between the DS and normal groups (Fig. 1).
Table 2. Univariate statistical analysis of plasma samples. List of metabolites whose concentration levels (in arbitrary units) have been determined in all samples (DS, n = 41; CTRL, n = 25). The p-value (p) of the univariate Wilcoxon-Mann-Whitney test for each metabolite is reported together with the p-value calculated after false discovery rate correction (pFDR). The effect size, using the Cliff's delta formulation, was also calculated to aid the identification of the meaningful signals, giving an estimation of the magnitude of the separation between the different groups. Metabolites that show significant concentration differences in the two groups (p-value < 0.05) and/or show values in the interval next to 3:2 are reported in bold. *Values in the interval next to 3:2 (range 1.3-1.7).
The univariate analysis was based on 33 metabolites, whose NMR signals were identified and integrated in plasma spectra (Supplementary Table S1 and Tables 2 and 3). The results showed a systematic deviation for a subgroup of the analyzed substances, including several metabolites involved in central metabolic processes related to mitochondrial metabolism such as the Krebs cycle, glycolysis and oxidative phosphorylation (OXPHOS) in DS.
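The captions of Tables 2-5 state that the effect size was computed with the Cliff's delta formulation. A minimal sketch of that statistic follows. The magnitude cut-offs (0.147, 0.33, 0.474) are commonly used conventions and are an assumption here, since this section does not state which thresholds were applied; the function names and the example values are hypothetical.

```python
def cliffs_delta(x, y):
    """Cliff's delta: P(x > y) - P(x < y), estimated over all cross-group pairs."""
    greater = sum(1 for a in x for b in y if a > b)
    less = sum(1 for a in x for b in y if a < b)
    return (greater - less) / (len(x) * len(y))

def magnitude(delta):
    """Label |delta| using conventional cut-offs (assumed, not taken from the paper)."""
    d = abs(delta)
    if d < 0.147:
        return "negligible"
    if d < 0.33:
        return "small"
    if d < 0.474:
        return "medium"
    return "large"

# Hypothetical concentrations in arbitrary units (not study data).
ds_group = [3, 4, 5]
ctrl_group = [1, 2, 3]
delta = cliffs_delta(ds_group, ctrl_group)   # 8 of 9 pairs favour DS, one tie
label = magnitude(delta)
```

Because the statistic only counts which group wins each pairwise comparison, it is insensitive to outliers and complements the rank-based Wilcoxon-Mann-Whitney test named in the captions.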
Metabolomic analysis by NMR provides highly precise measures of the relative concentrations of metabolites, which may be affected by the fasting state of the subject at the time the investigated fluid is collected. Remarkably, univariate analysis revealed, even after correcting for fasting state, a high level of specific metabolite changes in DS samples. Sex of the subjects was uniformly distributed in our groups (Table 1), and accounting for sex did not significantly affect the main result that the plasma metabolome profile is different in DS vs normal subjects (Supplementary Figure S1). Regarding age, it is well known that the risk of having a child with DS increases with maternal age 32 . Therefore, children with DS are often younger than their siblings, who constitute the control group. This led to a statistically significant difference between the mean age of the DS group and that of the control group (DS = 11.8 ± 7.1 and control = 16.8 ± 7.6, mean age ± SD). However, we considered it a critical advantage to have normal siblings of DS subjects as controls in the study of a genetic disease, because this gives the most similar genetic background between the two groups, suggesting that differences may be due to the extra chromosome in the affected siblings rather than to a more generalized genetic difference. In addition, the repeated random resampling of subgroups with different mean ages fully confirmed the difference between the two groups. Moreover, the significant metabolites identified by univariate statistical analysis (Tables 2 and 3) maintained their significance when adjusted for sex and age using a multivariate logistic regression model (Supplementary Table S2). Previous works have demonstrated that dosable enzymes whose genes are located on Hsa21 adhere to the 3:2 overexpression model expected in trisomy 21 33-36 .
Therefore, we have chosen to consider a metabolite to be increased based on either a statistically significant difference or a biologically significant ratio of the DS/normal medians near 3:2 (1.5), even if not statistically significant due to the distribution of the values. An analogous reasoning was applied to identify a decreased concentration. Interestingly, there was a basic scheme with two alternatives: a DS/normal ratio very near to 1, implying that the metabolite is not involved in an alteration of metabolism in DS, or a ratio very near to 3:2 or 2:3, consistent with the hypothesis that the pertinent kinetic reactions are actually proceeding at 150% of their normal velocity, with the consequent yield of 150% or 67% of the relative final product, depending on the structure of the pathway. It should be noted that several significantly increased metabolites with a 3:2 ratio with respect to healthy control samples are metabolic products of enzymes whose genes, except for one, are not located on Hsa21 (see Supplementary Table S4), suggesting that the interactions of Hsa21 gene products with other genes or proteins are potentially responsible for DS phenotypic variations. Interestingly, the metabolites we found increased are pyruvate, which connects the Krebs cycle and glycolysis; fumarate and succinate, intermediates of the Krebs cycle; lactate, the end product of anaerobic glycolysis; formate, involved in mitochondrial one-carbon metabolism; and creatine, involved in the process of energy-dependent muscle activity. It is important to underline that the increases of pyruvate, succinate, formate and creatine are significant in terms of pFDR and classified as medium or large in terms of Cliff's delta (Tables 2 and 3). These alterations suggest a systematic imbalance of the Krebs cycle and of different pathways of mitochondrial metabolism.
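The ratio criterion can be illustrated with a small sketch that applies the intervals given in the table captions (1.3-1.7 for values next to 3:2, 0.58-0.76 for values next to 2:3) to the plasma DS/normal ratios quoted in the abstract. The function name and the treatment of the interval bounds as inclusive are assumptions.

```python
# Intervals taken from the table captions; inclusive bounds are an assumption.
NEAR_3_2 = (1.3, 1.7)
NEAR_2_3 = (0.58, 0.76)

def classify_ratio(ratio):
    """Place a DS/CTRL ratio in the 3:2 interval, the 2:3 interval, or neither."""
    if NEAR_3_2[0] <= ratio <= NEAR_3_2[1]:
        return "near 3:2"
    if NEAR_2_3[0] <= ratio <= NEAR_2_3[1]:
        return "near 2:3"
    return "outside both intervals"

# Plasma DS/normal ratios as quoted in the abstract.
plasma_ratios = {
    "pyruvate": 1.23,
    "succinate": 1.47,
    "fumarate": 1.39,
    "lactate": 1.33,
    "formate": 1.4,
}
labels = {name: classify_ratio(r) for name, r in plasma_ratios.items()}
```

Under these intervals, succinate, fumarate, lactate and formate fall in the 3:2 range, while the pyruvate ratio of 1.23 lies just below its lower bound; pyruvate is nevertheless counted as increased because the criterion also accepts a statistically significant difference.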
Although several works have pointed out the role of mitochondria in DS 37,38 , no alteration of the metabolites related to the Krebs cycle had been reported to date in DS. The Krebs cycle is a central metabolic pathway for the regulation of cell metabolism and energy homeostasis 39 .
Table 3. Univariate statistical analysis of plasma samples (fasting subjects subset). List of metabolites whose concentration levels (in arbitrary units) have been determined in samples from fasting subjects (DS, n = 25; CTRL, n = 21). The p-value (p) of the univariate Wilcoxon-Mann-Whitney test for each metabolite is reported together with the p-value calculated after false discovery rate correction (pFDR). The effect size, using the Cliff's delta formulation, was also calculated to aid the identification of the meaningful signals, giving an estimation of the magnitude of the separation between the different groups. Metabolites that show significant concentration differences in the two groups (p-value < 0.05) and/or show values in the interval next to 3:2 are reported in bold.
The increase of plasma lactate we found in DS samples is consistent with the increase of basal levels of lactate found in fibroblasts from DS patients 42 and supports the hypothesis that DS cells, in which OXPHOS is impaired 42,43 , activate glycolysis to meet their energy demands. Consistently, in autism spectrum disorder, another neurodevelopmental disease associated with impaired mitochondrial metabolism 44 , abnormal levels of metabolites associated with activation of glycolysis, such as serum lactate and pyruvate, were found 45 . A very interesting and novel finding is the increase of plasma creatine in DS samples. Creatine is phosphorylated in mitochondria by ATP derived from oxidative phosphorylation, and the resulting phosphocreatine is subsequently exported from mitochondria and used by the cytosolic creatine kinase to resupply ATP for muscle activity 46 .
We can suppose that plasma accumulation of creatine, probably due to the OXPHOS impairment, could account for the muscle weakness, another typical DS phenotype.
Defective mitochondrial biogenesis has also been extensively described in DS 37 . Indeed, the overexpression of the Hsa21 gene NRIP1 (21q11.2-21q21.1) is known to impair the activity of the transcriptional coactivator PPARGC1A, causing a mitochondrial dysfunction that is reversed by metformin, a drug that induces PPARGC1A expression 47 . Impairment of the methyl cycle has indeed been documented and also affects mitochondrial methyl availability and glutathione levels in DS 48 . Finally, a work of Coppus and Coll. 49 has confirmed some metabolic alterations in DS. By HPLC, they found, for example, a decrease of tyrosine in DS compared with controls. Here, we also observed decreased tyrosine levels (not significant in terms of pFDR but classified as medium in terms of Cliff's delta; Tables 2 and 3) with a ratio DS/control = 0.87, which was near the 2:3 ratio. Tyrosine is a precursor of thyroid hormones, whose level is often decreased in DS, so it would be interesting to test the hypothesis of a correlation between tyrosine and thyroid hormones in subjects with DS. Our preliminary analysis failed to find such a correlation when routine laboratory analysis data for thyroid hormone levels in the children investigated here were correlated to plasma tyrosine levels.
Table 4. Univariate statistical analysis of urine samples. List of metabolites whose concentration levels (in arbitrary units) have been determined in all samples (DS, n = 51; CTRL, n = 20). The p-value (p) of the univariate Wilcoxon-Mann-Whitney test for each metabolite is reported together with the p-value calculated after false discovery rate correction (pFDR). The effect size, using the Cliff's delta formulation, was also calculated to aid the identification of the meaningful signals, giving an estimation of the magnitude of the separation between the different groups. Metabolites that show significant concentration differences in the two groups (p-value < 0.05) and/or show values in the interval next to 3:2 or 2:3 are reported in bold. *Values in the interval next to 3:2 (range 1.3-1.7). **Values in the interval next to 2:3 (range 0.58-0.76).
Scientific Reports | (2018) 8:2977 | DOI:10.1038/s41598-018-20834-y
As far as the possible relationship between the metabolomic profile in DS and ID is concerned, we used the "Lejeune machine" 14 to identify which biochemical pathways presumably involved in ID were affected, according to our data and previously published data for plasma (Fig. 3).
The four sections of the machine, drawn decades before current systems biology diagrams, represent the critical conditions for the nervous system to work: neuron proliferation (DNA "bases"), neurotransmission ("Mediateurs"), nerve fiber insulation ("Gaines") and availability of energy ("Energie"). According to this scheme, our data suggest that there is a key alteration in the production of energy. To cite J. Lejeune: "Even if the network is correct and the insulating system properly developed, genetic mistakes can prevent the function. Generally speaking, one gets the feeling that the machine is running but cannot develop its full power. Exactly like a motor to which the fuel is not provided in correct amount"; "One would believe that either the brain does not dispose of enough energy or that some toxic is impairing its ignition process" 14 .
Table 5. Univariate statistical analysis of urine samples (fasting subjects subset). List of metabolites whose concentration levels (in arbitrary units) have been determined in samples from fasting subjects (DS, n = 26; CTRL, n = 14). The p-value (p) of the univariate Wilcoxon-Mann-Whitney test for each metabolite is reported together with the p-value calculated after false discovery rate correction (pFDR). The effect size, using the Cliff's delta formulation, was also calculated to aid the identification of the meaningful signals, giving an estimation of the magnitude of the separation between the different groups. Metabolites that show significant concentration differences in the two groups (p-value < 0.05) and/or show values in the interval next to 3:2 or 2:3 are reported in bold. *Values in the interval next to 3:2 (range 1.3-1.7). **Values in the interval next to 2:3 (range 0.58-0.76).
At variance with blood, urine is a biofluid characterized by a large daily variability due to the effect of food/beverage intake and a possible modulation of the urine metabolome by the circadian clock 50 . Dilution resulting from the different hydration status of the donors was corrected by using probabilistic quotient normalization (PQN). Nevertheless, multiple collections of urine samples from the same individual are usually needed to extract the characteristic individual phenotype from the urinary "metabolic" noise 50-54 . Here, it was possible to obtain only a single urine sample per donor; still, some meaningful differences were detectable between control and DS subjects, which complement the results obtained on plasma. The discrimination between DS and control groups by multivariate analysis appeared essentially unaffected by fasting state, sex or age (Fig. 2 and Supplementary Figure S3). This result is of particular interest because of the above-discussed fluctuation of metabolite concentrations in urine. A lower number of metabolites was found to be altered by univariate analysis (none significant in terms of pFDR, except for leucine in fasting subjects), still confirming that metabolites involved in key steps of general metabolism are altered (Tables 4 and 5).
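The PQN dilution correction mentioned for urine can be sketched in a few lines. This is a simplified illustration rather than the authors' processing code: it assumes spectra already reduced to equal-length lists of bin intensities, uses the per-bin median across samples as the reference spectrum, and omits the total-area normalization that often precedes PQN. The function name and the example spectra are hypothetical.

```python
from statistics import median

def pqn_normalize(spectra):
    """Probabilistic quotient normalization of spectra given as lists of bin intensities."""
    n_bins = len(spectra[0])
    # Reference spectrum: per-bin median across all samples.
    reference = [median(s[j] for s in spectra) for j in range(n_bins)]
    normalized = []
    for s in spectra:
        # Quotients against the reference, skipping bins where the reference is zero.
        quotients = [s[j] / reference[j] for j in range(n_bins) if reference[j] > 0]
        dilution = median(quotients)  # most probable dilution factor for this sample
        normalized.append([v / dilution for v in s])
    return normalized

# Hypothetical spectra: the same profile at three different dilutions.
base = [1.0, 2.0, 4.0, 8.0]
spectra = [base, [2 * v for v in base], [0.5 * v for v in base]]
corrected = pqn_normalize(spectra)
```

Using the median quotient rather than the total signal makes the estimated dilution factor robust to a few metabolites that genuinely differ between samples.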
Higher levels of citrate have been previously reported in peripheral blood mononuclear cells and lymphoblastoid cells from children with DS 55 . Here, we found an increase of citrate in urine of fasting subjects with a ratio DS/control = 1.33, which was near the 3:2 ratio, although the differences in citrate levels between the two groups are not significant.
To verify whether enzymes involved in pathways related to the metabolites that we found to be significantly altered in DS are encoded on Hsa21, we mapped the genomic location of the corresponding genes (Supplementary Table S4). The currently analyzed set revealed only one enzyme gene, FTCD, located on 21q22.3 and encoding formimidoyltransferase cyclodeaminase. The FTCD enzyme is involved in the most common inborn error of folate metabolism, an autosomal recessive disorder causing glutamate formiminotransferase deficiency 56 . This enzyme catalyzes two reactions of histidine metabolism, in particular the degradation of N-formimino-L-glutamic acid to form 5,10-methenyltetrahydrofolate, L-glutamate, and ammonia (KEGG pathway 2.1.2.5, Supplementary Table S4). L-glutamate is one of the compounds with an increased level in the urine of DS subjects in the fasting group, not significant in terms of pFDR but classified as medium in terms of Cliff's delta (Table 5). Further study is needed in this regard by analyzing other metabolites. A possible explanation for mapping only one Hsa21 gene among those found in the KEGG pathway database and listed in Supplementary Table S4 could be that the alteration of an enzyme or regulatory gene on Hsa21 propagates its 3:2 effect to subsequent steps of the metabolic chain, thus affecting enzymes encoded on other chromosomes. From this point of view, a deep analysis of the recently described HR-DSCR, suspected to contain unknown fundamental genetic determinants for ID in DS 10 , is recommended. Although the size of HR-DSCR is lower than the mean size of a single protein-coding gene 57 , this mean size is mainly due to introns, whose size may also be extremely low 58 , and there is also the possibility that short non-coding RNA or micro-RNA might be encoded in this currently "desertic" region. The deletion of a single copy of HR-DSCR from trisomic cultured cells via CRISPR/Cas9 59 could also allow the demonstration of phenotypic effects, and the separation of the effects on metabolism in DS due to HR-DSCR from those due to Hsa21 genes distant from this critical region.
The drawing has been modified here by the authors by means of coloring. Red = increased (at p < 0.05 and/or with a 1.3-1.7 DS/CTRL ratio); orange = increased, although at p ≥ 0.05 and not within the 1.3-1.7 ratio range; blue = decreased (at p < 0.05 and/or with a 0.58-0.76 DS/CTRL ratio); green = decreased, although at p ≥ 0.05 and not within the 0.58-0.76 ratio range; yellow/violet = increased/decreased, according to literature data, respectively 24-27,49 . S-adenosylhomocysteine and S-adenosylmethionine plasma levels were decreased according to Pogribna and Coll. 27 , and increased according to Obeid and Coll. 26 . The yellow gear above the "Krebs" gear represents cystathionine. Other explanations in the text.
Another interesting metabolite that increases in urine, although not significantly in terms of pFDR but classified as medium in terms of Cliff's delta, is TMAO. This metabolite is not processed by enzymes encoded by human genes, and a role for the human microbiota has been hypothesized. A literature analysis confirmed that TMAO is a gut-microbiota-dependent metabolite 60 , and several works also reveal that TMAO has a role in the onset of cardiovascular diseases 61-63 and kidney diseases 64-66 . Biagi and Coll. 67 analyzed the gut microbiota (GM) in DS subjects, considering that the premature aging that occurs in DS may be due to changes in the GM. Deterioration of the GM plays an important role in the aging of the general population as well 68 . This study revealed that the DS GM is predominantly composed of Firmicutes, Actinobacteria and Bacteroidetes. The most represented families in the DS GM were Ruminococcaceae (39%) and Clostridiales (9%) 67 . These bacteria are positively associated with TMAO levels 60 , so further analysis could be conducted to understand its association with DS.
Our study reveals that DS subjects present some alterations in metabolic pathways; however, more analyses are necessary to find out the main mechanism that determines this unbalanced concentration of some metabolites. The NMR approach used proved extremely powerful in providing an efficient high-throughput untargeted picture of the metabolic fingerprint of the DS subjects. Nevertheless, the technique suffers from sensitivity limitations: only metabolites with concentrations ≥1 μM are measurable with confidence. Obtaining a confirmation of the proposed alteration in metabolic pathways would require targeted mass spectrometry analyses, which are beyond the scope of the present study. Possible developments might also be the determination of absolute rather than relative concentrations of metabolites, the incorporation of a larger body of metabolome data in a metabolic network model 69,70 and the study of the relationship between each metabolite and the protein or mRNA expression level of the enzymes involved in its processing, e.g. by generating quantitative, validated transcriptome maps providing DS/normal tissue ratios such as those already available for normal human tissues 71 . In addition, while in this work we focused on the diagnosis of DS as the invariant phenotype to be studied at the biochemical level, further characterization of variability within our cohort of subjects at the genetic level (e.g., single nucleotide polymorphism, SNP, analysis) or the phenotypic level (in particular, quantitative assessment of the grade of ID) could uncover fine relationships between DS features able to explain part of the variability observed in DS, for instance a more severe grade of disease in the presence of a clearer deviation of the concentration of a specific metabolite.
Thinking of DS as a metabolic disease would result in a change of perspective, especially from the point of view of possible treatment. The focus must be shifted from what is upstream (gene excess or gene defect) to what is downstream (gene product). The "blocked" mechanism that determines ID severity and specific molecule protagonists of this complex mechanism might be identified, as occurred for other complex diseases: "Phenylketonuria, galactosemia, vitamine B6 dependant homocystinuria, to take few examples, can be properly handled and the children protected against mental deficiency. Who could believe that during the coming years no new progress will be achieved?" 14 .

Materials and Methods
Ethics Statement. The study was approved by the independent Ethics Committee of the University Hospital St. Orsola-Malpighi Polyclinic, Bologna, Italy. Informed written consent was obtained from all participants: the patient, if over 18, or the parents were required to sign the informed consent for the collection of urine, blood and clinical data in order to participate in the study. All methods were performed in accordance with the Ethical Principles for Medical Research Involving Human Subjects of the Helsinki Declaration.
Case selection. A total of 137 children/young adults were recruited to the study from February 3, 2014 to December 12, 2016, including 97 with DS and 40 healthy children/young adults who were siblings of the children with DS and had no evidence of abnormal karyotypes. Inclusion criteria for children with DS were a diagnosis of Down syndrome with homogeneous or mosaic trisomy 21 and age > 2 years. Exclusion criteria for children with DS were distress at birth, severe prematurity (gestational age < 35 weeks) or severe neurologic disease at birth. The study was proposed to all subjects who matched the above-mentioned criteria and were consecutively admitted to the Day Hospital of the Neonatology Unit, Sant'Orsola-Malpighi Polyclinic, Bologna, in the context of the routine follow-up provided for DS.
Regarding the metabolomic analysis planned in the study, we were able to perform analysis and obtain results only in a subset of this group (Table 1)  the fasting state of the subject; a delayed treatment of the sample following its transfer from the Day Hospital to the University Laboratory (>2 hours); a macroscopic alteration of the blood/urine sample following centrifugation. Fasting before biological sample withdrawal was preferred; however, if the patient had had breakfast before blood and/or urine collection, the drinks and food consumed after midnight were recorded.
The metabolomic results were eventually obtained from a total of 67 children with DS (mean age = 11.3 yrs ± 7.0 standard deviation, SD) and 29 control subjects (CTRL, mean age = 16.6 yrs ± 7.8 SD). The sex distribution of the DS and CTRL groups was similar (Table 1), as confirmed by the lack of significant differences in Fisher's test.
Plasma and Urine Sample Preparation. Preanalytical treatment of blood and urine samples followed standard operating procedures 72,73 . All procedures were conducted carefully and under sterile conditions to avoid contamination.
Blood samples were collected in EDTA-coated blood collection tubes and kept at room temperature. They were treated within two hours of the blood draw. The sample was transferred to a new tube and centrifuged at 1250 g for 10 min. The plasma fraction was isolated and centrifuged a second time at 800 g for 30 min, and the supernatant was transferred to new tubes without touching the pellet or the bottom of the tube and divided into aliquots of 300 µL. All plasma samples were rapidly stored in a −80 °C freezer until subsequent analysis. The exclusion criteria for plasma samples were treatment more than two hours after the draw and evident contamination of the plasma by residual erythrocytes at the end of the treatment.
Urine samples were collected in a sterile plastic cup with lid and kept refrigerated at +4 °C if immediate processing was not possible. They were treated within two hours of collection. The sample was transferred to a new tube and centrifuged at 2500 g for 5 min at +4 °C (refrigerated centrifuge). After centrifugation, the sample was filtered through a 0.20 µm cut-off filter in order to avoid contamination of the metabolome with soluble molecules derived from cellular components. The filtered urine was transferred to sterile cryovials in 1.0 mL aliquots. All urine samples were rapidly stored in liquid nitrogen until subsequent analysis.
The exclusion criteria for urine samples were treatment more than two hours after collection and formation of sandy sediment after centrifugation.
NMR sample preparation. NMR samples were prepared according to standard procedures 72 .
Frozen samples were thawed at room temperature and shaken before use. A total of 300 µL of each plasma sample was added to 300 µL of a phosphate sodium buffer (
NMR experiments. The NMR analysis was conducted at CERM, the Magnetic Resonance Center of the University of Florence, Sesto Fiorentino (FI), Italy. 1H-NMR spectra for all samples were acquired using a Bruker 600 MHz spectrometer (Bruker BioSpin) operating at 600.13 MHz proton Larmor frequency and equipped with a 5 mm PATXI 1H-13C-15N and 2H-decoupling probe including a z-axis gradient coil, an automatic tuning-matching (ATM) unit and an automatic refrigerated sample changer (SampleJet, Bruker BioSpin). A BTO 2000 thermocouple was used to stabilize the temperature at the sample to within approximately 0.1 K. Before measurement, samples were kept for 5 minutes inside the NMR probe head for temperature equilibration at 300 K (urine samples) or 310 K (plasma samples).
Plasma is a heterogeneous mixture composed of thousands of metabolites as well as macromolecules such as proteins and lipoproteins. Because of this heterogeneity, three one-dimensional 1H NMR spectra were acquired for each plasma sample with water peak suppression and different pulse sequences that allowed the selective observation of different molecular components: (i) a standard NOESY (Nuclear Overhauser Effect Spectroscopy) 74 1D presat (noesygppr1d.comp; Bruker BioSpin) pulse sequence, using 32 scans, 98,304 data points, a spectral width of 18,028 Hz, an acquisition time of 2.7 s, a relaxation delay of 4 s and a mixing time of 0.1 s. This pulse sequence is designed to obtain a spectrum in which the signals of both metabolites and high molecular weight molecules (lipids and lipoproteins) are visible. (ii) a standard CPMG 75 (cpmgpr1d.comp; Bruker BioSpin) pulse sequence, using 32 scans, 73,728 data points, a spectral width of 12,019 Hz and a relaxation delay of 4 s. This pulse sequence is designed for the selective observation of small molecule components in solutions containing macromolecules. (iii) a standard diffusion-edited 76 (ledbgppr2s1d.comp; Bruker BioSpin) pulse sequence, using 32 scans, 98,304 data points, a spectral width of 18,028 Hz and a relaxation delay of 4 s. This pulse sequence is designed for the selective observation of macromolecule components in solutions containing small molecules; the resulting spectrum generally contains only the lipid, lipoprotein and protein signals.
Urine is a very complex biofluid but, unlike plasma, it is mainly composed of low molecular weight metabolites. Thus, for each urine sample, only one one-dimensional 1H NMR spectrum was acquired, with water peak suppression and a standard NOESY 74 pulse sequence using 32 scans, 98,304 data points, a spectral width of 18,028 Hz, an acquisition time of 2.7 s, a relaxation delay of 4 s and a mixing time of 0.1 s.
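The NOESY parameters quoted above can be checked against each other with a short calculation. The sketch below assumes the usual relation between acquisition time, number of acquired data points and spectral width (acquisition time = data points / (2 × spectral width)); the function name is illustrative and is not taken from the study. Under that assumption the NOESY settings reproduce the stated 2.7 s, and the CPMG settings, for which no acquisition time is reported, would correspond to roughly 3.1 s.

```python
def acquisition_time(data_points: int, spectral_width_hz: float) -> float:
    """Acquisition time in seconds.

    Assumption: data_points counts real plus imaginary points, so the
    acquisition time is data_points / (2 * spectral_width_hz).
    """
    return data_points / (2.0 * spectral_width_hz)


# Parameters quoted in the text for the NOESY and CPMG experiments.
noesy_aq = acquisition_time(98_304, 18_028)
cpmg_aq = acquisition_time(73_728, 12_019)

print(f"NOESY acquisition time: {noesy_aq:.2f} s")
print(f"CPMG acquisition time (derived, not reported in the text): {cpmg_aq:.2f} s")
```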
Acquisition of the plasma and urine spectra required approximately 4 days of NMR time. Samples of DS subjects and controls were mixed and acquired in a fully randomized order to avoid any batch effects.
NMR spectral processing and analysis. Free induction decays were multiplied by an exponential function equivalent to a 0.3 Hz line-broadening factor before applying the Fourier transform. Transformed spectra were automatically corrected for phase and baseline distortions and calibrated. The urine and plasma spectra were calibrated to the reference signal of TMSP at δ 0.00 ppm and to the glucose doublet at δ 5.24 ppm, respectively, using TopSpin 3.5 (Bruker BioSpin).
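As an illustration of the exponential multiplication step, the sketch below weights each point of a free induction decay by exp(-π · LB · t), the standard form of exponential apodization, with LB = 0.3 Hz. The processing in the study was done in TopSpin; the function name, the toy signal and the dwell time used here are hypothetical and serve only to show the arithmetic.

```python
import math


def apply_line_broadening(fid, dwell_time_s, lb_hz=0.3):
    """Multiply FID point k, sampled at t = k * dwell_time_s, by exp(-pi * LB * t)."""
    return [
        point * math.exp(-math.pi * lb_hz * k * dwell_time_s)
        for k, point in enumerate(fid)
    ]


# Hypothetical toy FID: constant unit amplitude, one point every 0.5 s,
# so that the weighting function itself is visible in the output.
toy_fid = [1.0 + 0.0j] * 5
weighted = apply_line_broadening(toy_fid, dwell_time_s=0.5)

for k, value in enumerate(weighted):
    print(f"t = {k * 0.5:.1f} s  weight = {value.real:.4f}")
```

The first point is left unchanged and later points are progressively attenuated, which trades a small loss of resolution for a better signal-to-noise ratio after Fourier transformation.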
Each spectrum in the region 10.00-0.20 ppm was segmented into 0.02 ppm chemical shift bins, and the corresponding spectral areas were integrated using the AMIX software (Bruker BioSpin). Binning reduces the total number of variables and compensates for small shifts in the signals, making the analyses more robust and reproducible.
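The binning step can be sketched as follows. This is not the AMIX implementation: it is a minimal illustration that sums the intensities of the points falling in each 0.02 ppm bin between 10.00 and 0.20 ppm (490 bins), using the sum as a stand-in for the integrated area. The function name and the toy data points are hypothetical.

```python
def bin_spectrum(ppm, intensity, high=10.0, low=0.2, width=0.02):
    """Sum intensities into equal-width bins running from `high` down to `low` ppm."""
    n_bins = round((high - low) / width)
    bins = [0.0] * n_bins
    for shift, value in zip(ppm, intensity):
        if low < shift <= high:
            # Bin 0 starts at `high`; clamp to guard against rounding at the edge.
            index = min(int((high - shift) / width), n_bins - 1)
            bins[index] += value
    return bins


# Hypothetical points: two fall in the first bin, two lie outside the region.
toy_ppm = [9.99, 9.985, 5.01, 0.21, 0.10, 10.50]
toy_intensity = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
binned = bin_spectrum(toy_ppm, toy_intensity)

print(len(binned), binned[0], binned[249], binned[489])
```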
For urine samples, normalization was applied to the obtained bins to minimize dilution effects caused, for example, by variation in fluid intake; the area of each bin was normalized using probabilistic quotient normalization (PQN), calculated with exclusion of the water region (4.40-5.00 ppm).
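A minimal sketch of probabilistic quotient normalization is given below. It makes two assumptions that the text does not specify: the reference spectrum is taken as the bin-wise median across all samples, and no preliminary total-area normalization is applied. Bins in the water region are excluded from the calculation of the dilution factor but are still rescaled. The function name and the toy spectra are hypothetical.

```python
from statistics import median


def pqn_normalize(spectra, bin_centers_ppm, exclude=(4.40, 5.00)):
    """Divide each spectrum by the median of its bin-wise quotients to a reference."""
    keep = [
        j for j, shift in enumerate(bin_centers_ppm)
        if not (exclude[0] <= shift <= exclude[1])
    ]
    # Assumption: the reference is the median spectrum of the data set.
    reference = {j: median(s[j] for s in spectra) for j in keep}
    normalized = []
    for spectrum in spectra:
        quotients = [spectrum[j] / reference[j] for j in keep if reference[j] > 0]
        dilution_factor = median(quotients)
        normalized.append([value / dilution_factor for value in spectrum])
    return normalized


# Hypothetical urine profiles: the same composition at three dilutions,
# with an unrelated water signal in the bin centered at 4.70 ppm.
centers = [1.0, 2.0, 4.7, 6.0, 7.0]
toy_spectra = [
    [2.0, 4.0, 100.0, 6.0, 8.0],
    [4.0, 8.0, 500.0, 12.0, 16.0],
    [1.0, 2.0, 10.0, 3.0, 4.0],
]
pqn = pqn_normalize(toy_spectra, centers)
for row in pqn:
    print(row)
```

After normalization the three metabolite profiles coincide, while the water bin, which did not contribute to the dilution factor, still differs between samples.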
No scaling of the binned data was performed; the data were only mean-centered before the multivariate statistical analyses.
Statistical analysis. Various multivariate statistical techniques were applied to the obtained bins using in-house R scripts (R 3.0.2).
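Mean-centering without scaling amounts to subtracting from every bin its average over all samples, leaving the variance of each bin untouched. A minimal sketch, with a hypothetical function name and toy values:

```python
def mean_center(matrix):
    """Subtract the column (bin) mean from every row (sample); no variance scaling."""
    n_samples = len(matrix)
    means = [sum(column) / n_samples for column in zip(*matrix)]
    return [[value - mean for value, mean in zip(row, means)] for row in matrix]


# Hypothetical data: three samples (rows) by two bins (columns).
toy_bins = [[1.0, 10.0], [3.0, 20.0], [5.0, 30.0]]
centered = mean_center(toy_bins)
print(centered)
```

Because no scaling is applied, intense signals keep a larger weight than weak ones in the subsequent multivariate models.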
Unsupervised Principal Component Analysis (PCA) was used to obtain a preliminary outlook of the data (visualization in a reduced space, cluster detection, screening for outliers, presence of batch effects or instrumental bias).
Partial Least Squares (PLS) was employed to perform supervised data reduction and classification between samples from healthy and diseased volunteers. Canonical Analysis (CA) was used in combination with PLS to increase supervised data reduction and classification.
The global accuracy for classification was assessed by means of a Monte Carlo validation scheme. Accordingly, each dataset was randomly divided into a training set (90% of the data) and a test set (10% of the data). The training set was used to build the model, whereas the test set was used to validate its discriminant and predictive power; this operation was repeated 200 times. For each model, the resultant confusion matrix was reported and its discrimination accuracy, specificity and sensitivity were estimated according to standard definitions. Their 95% confidence intervals are provided in the figure legends. Each classification model was also validated using a permutation test; the permutation was repeated 100 times and the resulting p-value was calculated.
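The validation scheme can be sketched as below. The classifier is deliberately a stand-in: a nearest-centroid rule replaces the PLS-CA model used in the study, and the data are synthetic values drawn at random rather than study data (only the group sizes, 67 and 29, follow the text). All function names are hypothetical. The point of the sketch is the repeated random 90%/10% split and the pooled confusion matrix from which accuracy, sensitivity and specificity follow.

```python
import random


def fit_centroids(X, y):
    """Stand-in classifier (NOT PLS-CA): one mean vector per class."""
    centroids = {}
    for label in sorted(set(y)):
        rows = [x for x, lab in zip(X, y) if lab == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids


def predict(centroids, x):
    """Assign x to the class with the closest centroid."""
    def sq_dist(label):
        return sum((a - b) ** 2 for a, b in zip(x, centroids[label]))
    return min(centroids, key=sq_dist)


def monte_carlo_cv(X, y, positive, repeats=200, test_fraction=0.10, seed=0):
    """Repeat a random train/test split and pool the test-set confusion matrix."""
    rng = random.Random(seed)
    n_test = max(1, round(len(y) * test_fraction))
    counts = {"tp": 0, "tn": 0, "fp": 0, "fn": 0}
    for _ in range(repeats):
        idx = list(range(len(y)))
        rng.shuffle(idx)
        test, train = idx[:n_test], idx[n_test:]
        model = fit_centroids([X[i] for i in train], [y[i] for i in train])
        for i in test:
            predicted_positive = predict(model, X[i]) == positive
            if y[i] == positive:
                counts["tp" if predicted_positive else "fn"] += 1
            else:
                counts["fp" if predicted_positive else "tn"] += 1
    tp, tn, fp, fn = (counts[k] for k in ("tp", "tn", "fp", "fn"))
    return {
        **counts,
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn) if tp + fn else float("nan"),
        "specificity": tn / (tn + fp) if tn + fp else float("nan"),
    }


# Synthetic one-feature data, for illustration only.
data_rng = random.Random(1)
X = [[data_rng.gauss(1.4, 0.3)] for _ in range(67)]
X += [[data_rng.gauss(1.0, 0.3)] for _ in range(29)]
y = ["DS"] * 67 + ["CTRL"] * 29

result = monte_carlo_cv(X, y, positive="DS")
print(result)
```

With 96 samples, each 10% test set contains 10 samples, so 200 repetitions pool 2000 test predictions into one confusion matrix.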
The metabolites whose peaks in the spectra were well defined and resolved were assigned (Supplementary Tables S1 and S3) and their levels analyzed. The assignment was carried out using an internal NMR spectral library of pure organic compounds, public databases such as the Human Metabolome Database 77 , stored reference NMR spectra of metabolites, spiking NMR experiments and literature data 78,79 . Matching between new NMR data and databases was performed using the AMIX software.
Before univariate analysis, each metabolite signal was aligned to a reference chemical shift value, so that the signal was aligned across all the spectra. The relative concentrations of the various metabolites were calculated by integrating the corresponding signals in defined spectral ranges 80 , using an in-house tool for signal deconvolution. Unassigned signals were labeled as unknown.
The nonparametric Wilcoxon-Mann-Whitney test was used for the determination of the meaningful metabolites; a p-value < 0.05 was considered statistically significant. In order to reduce false discoveries, False Discovery Rate correction (FDR) was then applied using the Benjamini and Hochberg method 81 and the resulting p-values are reported as p FDR .
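The Benjamini and Hochberg adjustment can be written in a few lines. The sketch below follows the standard step-up procedure: each sorted p-value is multiplied by m/rank, and monotonicity is then enforced from the largest p-value downwards. The function name and the example p-values are hypothetical and are not results of the study.

```python
def benjamini_hochberg(p_values):
    """Return FDR-adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest, keeping a running minimum
    # so that adjusted values never decrease as raw p-values increase.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values for four metabolites.
raw_p = [0.01, 0.04, 0.03, 0.20]
p_fdr = benjamini_hochberg(raw_p)
print([round(p, 4) for p in p_fdr])
```

In this toy case the second and third metabolites have raw p-values below 0.05 but adjusted values just above it, which shows how the correction can change which signals are retained.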
The effect size, using the Cliff's delta (Cd) formulation 82 , was also calculated to aid in the identification of the meaningful signals by giving an estimate of the magnitude of the separation between the groups. The magnitude was assessed using the thresholds provided by Romano et al. 83 , i.e. |Cd| < 0.147 "negligible", |Cd| < 0.33 "small", |Cd| < 0.474 "medium", otherwise "large".
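Cliff's delta is the difference between the proportion of pairs in which a value from the first group exceeds a value from the second and the proportion in which it is smaller. The sketch below computes it directly from all pairs and applies the thresholds quoted above; the function names and the example values are hypothetical and are not study data.

```python
def cliffs_delta(group_a, group_b):
    """(number of pairs with a > b minus pairs with a < b) / (all pairs)."""
    greater = sum(1 for a in group_a for b in group_b if a > b)
    smaller = sum(1 for a in group_a for b in group_b if a < b)
    return (greater - smaller) / (len(group_a) * len(group_b))


def magnitude(cd):
    """Label the effect size using the thresholds quoted in the text."""
    size = abs(cd)
    if size < 0.147:
        return "negligible"
    if size < 0.33:
        return "small"
    if size < 0.474:
        return "medium"
    return "large"


# Hypothetical relative concentrations of one metabolite in two groups.
ds_values = [1.2, 1.5, 1.1, 1.4]
ctrl_values = [1.0, 1.3, 0.9]
cd = cliffs_delta(ds_values, ctrl_values)
print(round(cd, 3), magnitude(cd))
```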
Univariate and multivariate logistic regression models were applied to the most significant metabolites to assess the presence of confounding factors such as the sex and age of the children, and the respective odds ratios and p-values were calculated.
Genomic Analysis. The KEGG pathway database (http://www.genome.jp/kegg/pathway.html) was used to identify key enzymes upstream or downstream of the metabolites with an altered concentration in DS subjects. The NCBI Gene database (https://www.ncbi.nlm.nih.gov/gene) was used to search for the chromosomal location of these enzymes in order to verify a specific role of Hsa21.
Literature Analysis. The Human Metabolome Database (http://www.hmdb.ca/) and the PubMed database (https://www.ncbi.nlm.nih.gov/pubmed/) were used to search for previous descriptions in human plasma and urine of the metabolites with an altered concentration in DS.
Data availability. The datasets generated and analyzed during the current study are made available as Supplementary Datasets S1 (plasma data) and S2 (urine data).