Longitudinal relationship of amino acids and indole metabolites with long-term body mass index and cardiometabolic risk markers in young individuals

Amino acid metabolites in biofluids are associated with high body mass index (BMI) and cardiometabolic abnormalities. However, prospective investigations regarding these associations are few, particularly among young individuals. Moreover, little is presently known about the impact of long-term high BMI. Using data from the DOrtmund Nutritional and Anthropometric Longitudinally Designed study (111 males and 107 females), we prospectively investigated relations between repeatedly measured urinary levels of 33 metabolites and (1) previously identified long-term BMI trajectory groups from childhood into late adolescence and (2) cardiometabolic risk markers in late adolescence–young adulthood, in sex-specific linear mixed regression models. Males with long-term overweight had lower indole-3-acetic acid when compared to others. Further, methionine, isoleucine, tryptophan, xanthurenic acid, and indole-3-carboxaldehyde were negatively associated with C-reactive protein (CRP), but 5-hydroxyindole-3-acetic acid was positively associated with CRP. No associations were observed in females. Long-term overweight from childhood into late adolescence is associated with decreased urinary levels of gut bacteria-derived indole-3-acetic acid, and several urinary amino acids, including gut bacteria-derived indole-3-carboxaldehyde, are negatively associated with CRP later in life. Taken together, our data suggest that indole metabolites and their gut bacteria producers play potentially important roles in overweight-related inflammation.

Similarly, amino acid metabolites are positively related to cardiometabolic risk markers (CRM) in children, adolescents, and young adults, albeit with a few exceptions. The positive associations include branched-chain amino acids (BCAAs) with insulin resistance (IR) 15,21,22 , triglycerides 23 , and triglycerides to high-density lipoprotein-cholesterol (HDL-C) ratio 24 ; valine, leucine, and isoleucine 10 , phenylalanine and tyrosine 21 , and kynurenine and the kynurenine to tryptophan ratio (KTR) 25 with IR; cysteine with fasting plasma glucose (FPG), leptin, and IR 12 ; indole-3-propionic acid with HDL-C 16 ; arginine, histidine, and serine with insulin sensitivity 5 ; KTR with C-reactive protein (CRP), interferon-gamma, interleukin (IL)-6, IL-10, and alpha-1-acid glycoprotein 26 ; and isoleucine and valine with IR 27 . The few reported negative associations include BCAAs with FPG 23 and adiponectin 24 ; tryptophan with CRP, interferon-gamma, IL-6, IL-10, and alpha-1-acid glycoprotein 26 ; and numerous amino acids with CRP and IL-6 28 . The relations of BCAAs with FPG 23 and adiponectin 24 , and that of isoleucine and valine with IR 27 , are restricted to males, while the relations of BCAAs with triglycerides 23 and with the triglycerides to HDL-C ratio 24 are restricted to females. The mechanistic involvement of BCAAs in the development of IR includes persistent activation of the mTOR signaling pathway and accumulation of toxic BCAA metabolites that trigger mitochondrial dysfunction 29 , and the involvement of tryptophan and KTR in inflammation is through the immunoregulatory role of indoleamine 2,3-dioxygenase 1 30 .
Notably, only a few of the aforementioned studies are prospective 9,21,24-26 . Thus, more prospective investigations are needed. Moreover, the one-time measurement of BMI in earlier studies indicates a relation between a static state of body composition and amino acid metabolites. Given that body composition is dynamic, particularly among children and adolescents 31 , it is necessary to investigate whether long-term BMI is associated with similar or different amino acid metabolites. In addition, our knowledge of the relation between BMI and amino acid metabolites, and between amino acid metabolites and CRM, might be incomplete because earlier studies sampled amino acids at only one time point. Static sampling of amino acids does not adequately capture the complex interplay of the several phases of host metabolism 19 . Further, the restriction of some findings to either males or females warrants sex-specific analysis. Aberrant amino acid metabolite levels may reveal metabolites and/or metabolic pathways perturbed as a result of a high BMI, and the identification of amino acids related to CRM may improve our understanding of the pathophysiological mechanisms underlying cardiometabolic abnormalities.
In a group of apparently healthy young individuals in whom BMI was measured annually over a 15-year period from childhood to late adolescence, we previously identified distinct long-term BMI trajectory groups 32 . In a subset of this study sample, we therefore sought, in sex-specific analyses, to (1) examine the association between long-term BMI trajectory groups and repeatedly measured urine metabolites, mostly amino acids, during childhood and adolescence, and (2) investigate whether these metabolites are prospectively associated with the levels of 11 CRM in late adolescence-young adulthood.

Results
Study population. Our study population comprised 111 males and 107 females. As published before 32 , the overweight trajectory was present only in males, and more males than females followed the high-normal weight trajectory. In the majority of individuals, urine was sampled at ages 9, 17, and 18. In late adolescence-young adulthood, males had higher levels of SBP and FPG, while females had higher HDL-C, CRP, adiponectin, chemerin, and leptin. Males were heavier and longer than females at birth, males consumed more energy and protein daily than females, and a lower percentage of mothers of males than of mothers of females were employed (Table 1).
Further details of the study participants' age at urine sampling (metabolite measurement) are shown in Suppl. Table 1a,b. In both sexes, over half of the participants had their three measurements at ages 9, 17, and 18, and about 10% had measurements at ages 9, 16, and 17.

Multivariable linear regression. The association between BMI trajectory and metabolites. As shown in Suppl. Table 2, except for kynurenine and KTR in males, the linear time models generally yielded a better fit for the metabolites. Therefore, except for these two metabolites, which we fitted with quadratic time models, all the remaining metabolites in both sexes were fitted with linear time models. Table 2 shows the p-values and q-values of the association between BMI trajectory group and each of the 32 metabolites and KTR in males and females. The results indicate that, at p-value ≤ 0.05 and q-value ≤ 0.05, BMI trajectory group was significantly associated with 3-methoxy-p-tyramine (p-value = 0.005, q-value = 0.045), indole-3-acetamide (p-value = 0.005, q-value = 0.045), and indole-3-acetic acid (p-value = 0.0001, q-value = 0.002) in males. However, BMI trajectory group was not associated with any metabolite in females. For this reason, subsequent mean comparisons were performed only for 3-methoxy-p-tyramine, indole-3-acetamide, and indole-3-acetic acid in males.
In humans, the ultimate source of indole-3-acetic acid is tryptophan. Hence, in sensitivity analyses, BMI trajectory group differences in indole-3-acetic acid were further adjusted for tryptophan. Adjusting for this precursor slightly attenuated the above result (Suppl. Table 3). This result was similar with further adjustment for urine creatinine or osmolality.
The association between metabolites and kynurenine and tryptophan ratio, and cardiometabolic risk markers. Collectively, the metabolites and KTR explained between 26-54% and 10-89% of the variance in CRM in males and females, respectively (Table 3). However, at p-value ≤ 0.05 and q-value ≤ 0.05, only the explained variance of CRP in males was statistically significant. The metabolites and KTR collectively explained 54% of the variance in CRP in males (p-value = 0.002, q-value = 0.025). Therefore, for subsequent analysis, we report associations between these metabolites and CRP in males.

Table 1. Basic characteristics of the study population, 111 males and 107 females. a n (%); b Median (25th percentile, 75th percentile). n = count, % = percentage, BMI = Body mass index, HDL-C = High-density lipoprotein-cholesterol.

Discussion
The aims of this study were to examine sex-specific associations between long-term BMI trajectory groups and repeatedly measured urine metabolites, mostly amino acids, among apparently healthy young individuals, and to investigate whether these metabolites are associated with the levels of subsequent CRM in late adolescence-young adulthood. We observed that individuals who belonged to the overweight trajectory group, that is, those persistently overweight from childhood into late adolescence, had lower levels of urinary indole-3-acetic acid when compared to other individuals. Furthermore, we found that methionine, isoleucine, tryptophan, xanthurenic acid, and indole-3-carboxaldehyde levels were independently negatively associated, and 5-hydroxyindole-3-acetic acid levels were independently positively associated, with CRP in late adolescence-young adulthood. Interestingly, both findings were restricted to males.
Our sex-specific findings extend previous findings of studies in children and adolescents comprising both sexes. These include a study reporting that high BMI was associated with a decreased KTR in males, but not in females 14 . Moreover, BCAAs were associated with a decreased FPG in males, but increased triglycerides in females 23 , as well as with a decreased adiponectin in males, but with an increased triglycerides to HDL-C ratio in females 24 . The fact that these studies 14,23,24 and ours consistently showed inverse associations in males suggests that our findings are unlikely to be spurious. Possible explanations for these observed sex-specific findings include sex-related differences in protein turnover, amino acid handling systems, and the influence of sex hormones 33 . Sex-related differences are thought to primarily develop after adolescence due to the more marked hormonal differences 33 . However, the fact that our study included measurement of metabolites in childhood indicates either that an impact of sex hormones occurs before adolescence, or that all the aforementioned mechanisms together provide a complete explanation of the sex-specific findings. These sex-specific findings warrant further investigation.
Overweight, either cross-sectionally or prospectively, has not been previously linked to indole-3-acetic acid in children and adolescents. The few cross-sectional studies in adults regarding indole-3-acetic acid have yielded mixed results 34-37 . In agreement with our study, Liu et al. observed that BMI negatively correlated with sputum indole-3-acetic acid 35 , a result at variance with others in which no relation was observed between BMI and serum indole-3-acetic acid 34 , between BMI and faecal indole-3-acetic acid 36 , and between BMI and plasma indole-3-acetic acid in older men 37 . The fact that BMI trajectory group differences were observed only for an indolic tryptophan metabolite indicates that the impact of long-term BMI is very specific and does not result in a global disruption of all amino acid metabolism. However, our study failed to demonstrate an association with indole-3-propionic acid, an indole metabolite that has been cross-sectionally related to BMI in children and adolescents 16 . Indeed, the consistent inverse relation of BMI with the indole metabolites in our study and in that of Farook et al. 16 suggests that high BMI impacts indolic tryptophan metabolism, with the resulting metabolites being dependent on the duration of high BMI.

Model 1: adjusted for time. Model 2: adjusted for time, birth weight and length, maternal BMI, maternal pregnancy weight gain, breastfeeding duration, birth order, maternal education, maternal employment, smoking in household, physical activity, daily energy intake, percentage of energy from protein, and two-way interactions of physical activity, daily energy intake, and percentage of energy from protein with BMI trajectory. *Simulated adjusted p-value.

Table 3. Overall significance of regression model with cardiometabolic risk markers as dependent variable and 32 metabolites and kynurenine and tryptophan ratio as independent variables.
The bolded cardiometabolic risk marker indicates a significant model at p-value ≤ 0.05 and q-value ≤ 0.05. *Metabolites and cardiometabolic risk markers were normalized before statistical analysis.

Indole-3-acetic acid has been shown to drive the emergence of obesity by acting on the extended reward network 36 . However, the current study modeled the effect of BMI trajectory group, since it makes intuitive sense that changes in BMI impacted the levels of indole-3-acetic acid and our BMI measurements precede the metabolite measurements. Nonetheless, one cannot exclude the presence of a bidirectional association between BMI and indole-3-acetic acid.
In humans, indole-3-acetic acid is a tryptophan metabolite that is largely generated by the direct or indirect metabolism of the gut microbiota 38,39 . Approximately 4-6% of tryptophan undergoes bacterial degradation to yield indole metabolites 40 . Since human de novo production of indole metabolites is unlikely 40 , the level of indole-3-acetic acid is either a reflection of dietary intake (in this case, tryptophan-containing foods and cruciferous vegetables) that is modulated by the gut bacteria, or entirely a reflection of changes in gut microbial metabolism. There is a correlation between vegetable intake and calorie intake 41 ; therefore, controlling for calorie intake and protein intake, and their interaction with BMI trajectory group, minimizes the influence of differential consumption of cruciferous vegetables and tryptophan-containing foods on our findings. The fact that tryptophan only minimally attenuated BMI trajectory group differences in indole-3-acetic acid also suggests that this association is independent of this precursor.
Clostridium bartlettii, several Bacteroides species, and other bacteria have been reported to produce indole-3-acetic acid 42 . This suggests that in individuals with long-term overweight, dysbiosis and/or altered metabolic functions of the gut microbiota result in decreased synthesis or increased utilization of indole-3-acetic acid. Although the potential of the gut bacteria of healthy humans to produce indole is very heterogeneous 43 , the absence 44 and lower levels 45 of indoles in the serum of germ-free animals compared to their conventionally raised counterparts suggest that these bacteria will be either absent or decreased in overweight individuals. Prospectively linking long-term overweight status with the abundance and function of the gut microbiota would help identify the altered bacteria, and linking the identified bacteria to levels of indole-3-acetic acid would help identify its producer in our study population. There are other gut microbiota-controlled tryptophan pathways 46 . Therefore, whether and to what extent long-term overweight affects these pathways should be evaluated in future studies. Ultimately, indole-3-acetic acid may help to further characterize young individuals with different long-term BMI phenotypes. Overall, the indolic tryptophan metabolic pathway seems to be highly dysregulated by long-term overweight.
Secondly, we observed that the levels of methionine, isoleucine, tryptophan, xanthurenic acid, and indole-3-carboxaldehyde from childhood into late adolescence were inversely related to CRP levels in late adolescence-young adulthood, but the levels of 5-hydroxyindole-3-acetic acid were directly related to CRP, a well-established marker of inflammation. Interestingly, all these metabolites are independently related to CRP, indicating that their association is not confounded by their precursors. While there is some evidence that changes in amino acids occur in inflammatory states 47 , it is less clear whether these changes are a cause or a consequence of inflammation. Only a few prospective investigations in children and adolescents have implicated amino acid metabolites in cardiometabolic risks 21,24-26 , and of these studies only one reported relations with CRP 26 . In agreement with our results, Kosek et al. 26 found an inverse association between tryptophan and CRP. Similarly, a cross-sectional investigation that included children and adolescents is in line with our findings of the inverse relation of methionine and tryptophan with CRP 28 .
To our knowledge, no previous study in children, adolescents, or young adults has prospectively linked methionine, isoleucine, xanthurenic acid, indole-3-carboxaldehyde, and 5-hydroxyindole-3-acetic acid to CRP. Nevertheless, cross-sectional investigations in older adults are in line with our findings. These include an inverse association of tryptophan 48 and xanthurenic acid 49 with CRP, and a direct association between 5-hydroxyindole-3-acetic acid and CRP 50 . Findings from other studies in adults that contrast with ours include a lack of association of isoleucine 35 and of methionine and tryptophan 50 with CRP, and a positive relation of isoleucine 51,52 with CRP. Although no study has reported an association between indole-3-carboxaldehyde and CRP, exogenous administration of indole-3-carboxaldehyde decreased the production of several inflammatory cytokines 53 . Consistent with our findings of several amino acids being related to CRP, a review concluded that several amino acids are involved in the development of future cardiometabolic abnormalities such as IR 54 . Since we adjusted for the precursors (tryptophan and serotonin) of indole-3-carboxaldehyde, xanthurenic acid, and 5-hydroxyindole-3-acetic acid, it is unlikely that their association with CRP is due to levels of these precursors or intake of diets rich in them. Systemic inflammation drives amino acid metabolism 46 ; however, our longitudinal study provides direct evidence that variation in amino acid metabolites predates systemic inflammation.
The inflammatory roles of the CRP-related amino acid metabolites in the current study have been documented. Methionine, an aliphatic, sulfur-containing, essential amino acid, exerts anti-inflammatory activity through the activation of endogenous antioxidant enzymes such as methionine sulfoxide reductase A, and by counteracting oxidative stress through the biosynthesis of glutathione 55 . Isoleucine, a branched-chain amino acid, exerts anti-inflammatory activity through interference with the action and/or synthesis of prostaglandins 56 , and by inducing the expression of β-defensins via G-protein-coupled receptors and ERK/MAPK signaling pathways 57 . Tryptophan, an essential aromatic amino acid, plays a critical role in controlling systemic inflammatory responses through its endothelium-derived metabolite 5-methoxytryptophan 58 . Xanthurenic acid is one of the several metabolites in the kynurenine pathway of tryptophan metabolism 59 . It reduces interferon-gamma production 60 and possesses powerful antioxidant properties 61 . Indole-3-carboxaldehyde (indole-3-aldehyde), a gut microbiota-derived indole derivative of tryptophan catabolism produced by the action of Lactobacillus-encoded tryptophanase, inhibits inflammation through activation of the aryl hydrocarbon receptor in lymphoid cells 62 . Finally, 5-hydroxyindole-3-acetic acid is the most abundant end product of both central and peripheral enzymatic degradation of serotonin 63 . Although the direct biological activity of 5-hydroxyindole-3-acetic acid is yet to be documented, its excessive production by endocrine and neuroendocrine tumors 63 and the large quantities of reactive oxygen species that are produced during its biosynthesis from serotonin 64 might explain its association with systemic inflammation.
Epidemiological studies have reported associations between a proinflammatory state among adolescents and young adults and an increased risk of future cardiovascular diseases 65 ; thus, the amino acids associated with elevated CRP in the present study could be biomarkers of a proinflammatory phase leading to the development of cardiovascular diseases.
One major strength of this work is that it is a prospective study in apparently healthy young individuals. This prospective investigation indicates that the obtained exposure-outcome associations are reliable. Moreover, the repeated measurement of metabolites ensures that the metabolite levels of our participants are well captured, thereby providing greater confidence in our findings. In addition, for both research aims, we adjusted for protein and calorie intake since the metabolic fate of amino acids is dependent on their dietary availability. We also corrected for variations in physical activity and several early life factors. Only a few previous studies adequately controlled for these covariates. Finally, metabolites were profiled in urine which is a non-invasive biosample. Considering that the concentrations of several urine and plasma amino acids are correlated 66 , urine may be a substitute for plasma in profiling amino acids, especially in large epidemiological studies.
Notwithstanding the strengths of our study, one limitation is that this study is observational, so it cannot confirm causal relationships. It would be necessary to determine whether a causal relation exists between long-term BMI trajectory groups and these metabolites, and between these metabolites and CRP, and if it exists the magnitude of their causal relations. One approach that may improve causal inference of association of these metabolites with CRP within an observational study setting is by examining whether genetically determined levels of these metabolites are related to CRP in Mendelian randomization analysis. Secondly, the homogeneous nature of our study population, in terms of geographical location and socioeconomic background, may limit generalizability of our findings. These findings should be externally validated on more geographically, socioeconomically, and ethnically diverse populations. Besides, residual and unmeasured confounders such as medications might have influenced our results. Additionally, single imputation of covariates may have underestimated the uncertainty around the effect estimates in our adjusted models. Several host factors such as protein/amino acid absorption, catabolism and uptake, and excretion as well as kidney function might also confound our results. Future studies should consider the assessment of the aforementioned host factors such as measurement of faecal levels of these metabolites in order to rule out malabsorption. Moreover, our findings speculate about the role of the gut microbiota, without measuring the faecal levels of these metabolites. Faecal as well as circulating levels of these metabolites would have strengthened our findings.
In conclusion, this prospective investigation in apparently healthy young individuals shows that, independent of several factors, males who were overweight from childhood into late adolescence have decreased urinary levels of gut bacteria-derived indole-3-acetic acid, and that several urinary amino acids, including gut bacteria-derived indole-3-carboxaldehyde, negatively predict serum CRP in late adolescence-young adulthood. Although the same metabolite is not related to both long-term overweight and CRP, the fact that decreasing levels of the indole metabolites indole-3-acetic acid and indole-3-carboxaldehyde are associated with long-term overweight and elevated CRP, respectively, suggests that indole metabolites and their gut bacteria producers play an important role in overweight-related inflammation among young individuals. These current findings further underline how the host phenotype and the microbiota interact to influence health outcomes. More human prospective studies and animal studies are needed to better understand the role of altered indolic tryptophan metabolism and the gut microbiota in overweight-associated systemic inflammation.
Scientific Reports | (2020) 10:6399 | https://doi.org/10.1038/s41598-020-63313-z

Subjects and Methods
Study design and participants. The study sample was selected from participants of the DOrtmund Nutritional and Anthropometric Longitudinally Designed (DONALD) study, an ongoing, open cohort study conducted in Dortmund, Germany. This study has collected data on the diet, growth, development, and metabolism of healthy children and adolescents since 1985. In the first few study years, approximately 300 participants >2 years old were also recruited. Since then, approximately 35-40 infants have been newly recruited every year. The regular visits begin at 3 months of age. The participants return for three more visits in the first year, two in the second year, and thereafter annually until young adulthood. Yearly examinations include 3-day weighed dietary records, anthropometric measurements, collection of 24-h urine samples, interviews on lifestyle, and medical examinations. Since 2005, participants >18 y have been invited for subsequent examinations with fasting blood withdrawal. Parental examinations (anthropometric measurements, lifestyle interviews) take place every four years. Due to the specific design of the DONALD study, participants with well-educated mothers are overrepresented. Further details of the DONALD study have been described elsewhere 67 . The study was approved by the Ethics Committee of the University of Bonn (approval number 098/06) and conducted according to the guidelines of the Declaration of Helsinki.

Statement attesting to informed consent for study participation.
For all examinations in the DONALD study, written consent was obtained from a parent and/or legal guardian of the participants during their childhood phase, and written consent was also obtained from the study participants themselves later in life. The statement attesting to informed consent from a parent and/or legal guardian for study participation was provided as follows: "I consent to my child being medically examined and physically measured by qualified personnel. I am prepared to answer questions about my child regarding pregnancy and birth, development and diseases, as well as lifestyle factors and diet, and to prepare dietary protocols. I agree to answer questions about my own health status and lifestyle and to have my height and weight measured".

Study sample.
The sample for the current analyses was a subset of the 354 male and 335 female DONALD study participants in whom we previously identified long-term BMI trajectory groups from age four to 18 32 . These individuals were singletons born at full term (36 to 42 weeks), had a birth weight greater than or equal to 2500 g, and had at least one BMI measurement in each of childhood (4-9.9 years), early adolescence (10-14.9 years), and late adolescence (15-18 years). Four trajectory groups (overweight, high-normal weight, mid-normal weight, and low-normal weight) and three trajectory groups (high-normal weight, mid-normal weight, and low-normal weight) were identified in males and females, respectively 32 . Of these individuals, 218 (111 males and 107 females) provided 24-hour urine samples at three time points over the course of BMI measurement, and one blood sample in late adolescence-young adulthood (between age 18 and 39 years) (Fig. 3). The current analyses were based on data from these 111 males and 107 females.
Assessment of anthropometric variables, dietary intake and physical activity, early life and socio-economic characteristics. Anthropometric data for all participants were obtained by measurements that adhere to standard procedures during their annual study visit. BMI, standardized BMI z-scores, and their trajectories were determined as previously reported 32 . Three-day weighed dietary records were used to collect nutritional data for three consecutive days. Using all three-day dietary data, the nutrient intake was computed using the continuously updated in-house nutrient database, LEBTAB, and the individual mean daily intake of energy and protein over the three record days was calculated. Further, physical activity was assessed using a detailed questionnaire about the duration and frequency of both organized and non-organized physical activity. Physical activity was then expressed as daily energy expenditure in metabolic equivalent task-hours. Parents were interviewed about familial and socio-economic characteristics. Information on birth anthropometrics was extracted from a standardized document ("Mutterpass") given to all pregnant women in Germany. The BMI of the participants' mothers at study entry was also documented.
Urine sampling, preparation, and metabolic profiling of metabolites. Urine sampling. Annual 24-h urine sampling was scheduled for participants older than 3 or 4 years. The sampling, conducted at home, followed a standardized procedure after detailed instruction of the families. Participants were asked to void their bladders upon getting up in the morning; this urine sample was completely discarded. This set the start of the collection, which ended with voiding the bladder the next morning. The participants stored the urine samples in preservative-free, Extran-cleaned (Extran, MA03; Merck, Darmstadt, Germany) 1 L plastic containers at less than −12 °C. The samples were transferred to the study centre, where they were stored at −22 °C until analysis. Details of urine sampling have been presented elsewhere 68 .
The 32 metabolites profiled were gamma-aminobutyric acid, methionine, valine, leucine, isoleucine, picolinic acid, quinolinic acid, serotonin, dopamine, tyrosine, phenylalanine, tyramine, 3-methoxy-p-tyramine, homovanillic acid, kynurenic acid, 5-hydroxyindole-3-acetic acid, tryptophan, kynurenine, xanthurenic acid, indole-3-acetamide, indole-3-acetic acid, indole-3-lactic acid, indole-3-propionic acid, indole-3-carboxaldehyde, indole-3-carboxylic acid, anthranilic acid, 3-hydroxyanthranilic acid, tryptamine, 5-methoxyindole-acetic acid, tryptophan methyl ester, 5-hydroxy-tryptophan, and 3-hydroxykynurenine. The metabolites showed a coefficient of variation lower than 1%, and matrix effects, evaluated by the matrix match calibration approach, were minimal. Details of urine sample preparation and targeted metabolic profiling of amino acid metabolites can be found elsewhere 69 .
Blood sampling and measurement of cardiometabolic risk markers. In late adolescence-young adulthood, venous blood samples were drawn after an overnight fast, centrifuged within 15 min, and frozen at −80 °C.
Circulating concentrations of HDL-C, FPG, CRP, IL-6, IL-18, adiponectin, chemerin, and leptin were measured. HDL-C was measured with the Advia 1650-Chemistry System analyser (Siemens Healthcare Diagnostics, Eschborn, Germany), FPG on a Roche/Hitachi Cobas c 311 analyzer (Basel, Switzerland), triglycerides and CRP with the Roche/Hitachi Cobas c311 analyser (Roche Diagnostics, Mannheim, Germany), IL-6 with the Human IL-6 Quantikine HS ELISA (R&D Systems, Wiesbaden, Germany), IL-18 with the Human IL-18 ELISA (Medical and Biological Laboratories, Nagoya, Japan), adiponectin with the Human Total Adiponectin/Acrp30 Quantikine ELISA kit (R&D Systems), chemerin with the Human Chemerin ELISA (BioVendor, Brno, Czech Republic), and leptin with the Leptin Quantikine ELISA (R&D Systems), as previously described 70,71 .
Systolic (SBP) and diastolic blood pressures (DBP) were measured at the study center according to standardized procedures with an automated oscillometric device (Datascope Accutorr Plus, Mahwah, NJ, USA). For each participant, three consecutive BP measurements were taken at 3-min intervals after an initial 5-min rest and following a non-strenuous part of the examination. During the measurements, the participant's back was supported in an upright position, with the right forearm resting on a table at heart level, and the legs uncrossed with feet on the floor. Therefore, in the present study, eleven CRM (SBP and DBP, serum triglycerides, HDL-C, FPG, CRP, IL-6, IL-18, adiponectin, chemerin, and leptin) were analysed.
Statistical analyses. Participant characteristics. All statistical analyses were conducted using SAS 9.4 (SAS®, SAS Institute, Cary, NC). Sex-specific baseline characteristics of the study sample (n = 218, males = 111, females = 107) are presented as medians and corresponding 25th and 75th percentiles or as counts (percentages) where appropriate. Differences in baseline characteristics between the sexes were explored using the Mann-Whitney U test and the chi-square test for continuous and categorical data, respectively.

Multivariable linear regression.
The association between BMI trajectory groups and metabolites. All 32 metabolites and the KTR were normalized by calculating their rank normal scores using the Blom method. We used linear mixed models (PROC MIXED) to examine the association between BMI trajectory (categorical predictor variable) and each of the 33 (32 metabolites and KTR) dependent variables.
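As an illustration of the rank-based normalization, the following minimal sketch computes Blom scores, z = Φ⁻¹((r − 3/8)/(n + 1/4)), where r is the rank of an observation among n values. It is not the SAS code used in the study; the function name `blom_scores`, the example values, and the choice to give tied values their average rank are assumptions made for this example.

```python
from statistics import NormalDist


def blom_scores(values):
    """Rank-based inverse normal scores with Blom's offsets (3/8 and 1/4)."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    pos = 0
    while pos < n:
        # Find the run of tied values starting at sorted position `pos`.
        end = pos
        while end + 1 < n and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1  # average 1-based rank (assumed tie rule)
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    std_normal = NormalDist()
    return [std_normal.inv_cdf((r - 0.375) / (n + 0.25)) for r in ranks]


# Hypothetical urinary concentrations in arbitrary units (not study data).
example = [12.1, 3.4, 55.0, 7.8, 7.8]
scores = blom_scores(example)
```

The transformation keeps the ordering of the measurements but maps them onto an approximately standard normal scale, which limits the influence of skewed concentrations and extreme values.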
First, we determined the trend of each dependent variable over time (participants' ages) by fitting unconditional linear and quadratic time models. We examined the convergence information and selected, between the linear and quadratic time models, the one that minimized the corrected Akaike's Information Criterion. When the quadratic model fitted better, time was expressed as a deviation from its mean to reduce multicollinearity between the linear and quadratic terms of the curvilinear model. The REPEATED statement was included in PROC MIXED, with the individual specified as the subject. The Kenward-Roger method was used for calculating degrees of freedom. Because the repeated measurements were unequally spaced, we specified an exponential spatial covariance structure.
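Two of these modeling choices can be illustrated numerically. The sketch below is not the study's SAS code: it shows, for hypothetical measurement ages, (a) how centering time lowers the correlation between the linear and quadratic time terms, and (b) an exponential spatial covariance matrix of the form σ² exp(−d/ρ), in which the covariance between two measurements decays with the time distance d between them. The ages and the values of `sigma2` and `rho` are assumptions chosen only for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def sp_exp_cov(times, sigma2, rho):
    """Exponential spatial covariance: sigma2 * exp(-|t_i - t_j| / rho)."""
    return [[sigma2 * math.exp(-abs(s - t) / rho) for t in times] for s in times]


# Hypothetical, unequally spaced measurement ages in years.
ages = [3.0, 5.0, 8.0, 12.0, 17.0]

# Correlation between the linear and quadratic terms, before and after centering.
raw_r = pearson(ages, [a * a for a in ages])
mean_age = sum(ages) / len(ages)
centered = [a - mean_age for a in ages]
centered_r = pearson(centered, [c * c for c in centered])

# Within-individual covariance matrix; sigma2 and rho are illustrative values.
cov = sp_exp_cov(ages, sigma2=1.0, rho=4.0)
```

Because the covariance depends on the actual time gap rather than on the visit index, this structure remains meaningful when visits are not equally spaced.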
Then, we included the BMI trajectory group (categorical) as a fixed effect and tested its association with each dependent variable. We obtained the overall significance (type 3 F-test p-value) of the BMI trajectory and estimated the corresponding false discovery rate (FDR)-adjusted p-value (q-value) with PROC MULTTEST, correcting for the 33 dependent variables. An association was considered statistically significant when both the p-value and the q-value were ≤ 0.05. Thus, model 1 included BMI trajectory, time, and the BMI trajectory by time interaction (linear model), plus quadratic time and the BMI trajectory by quadratic time interaction when the curvilinear model was selected.
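The joint p-value and q-value criterion can be sketched as follows. The text does not state which FDR procedure was requested in PROC MULTTEST; the Benjamini-Hochberg step-up adjustment is assumed here, and the function name `bh_qvalues` and the p-values are hypothetical.

```python
def bh_qvalues(pvalues):
    """Benjamini-Hochberg adjusted p-values (assumed FDR method)."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone adjusted values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        qvalues[i] = running_min
    return qvalues


# Hypothetical type 3 F-test p-values for five dependent variables.
pvals = [0.001, 0.045, 0.02, 0.20, 0.008]
qvals = bh_qvalues(pvals)

# Significant only when both the raw and the adjusted p-value are <= 0.05.
significant = [p <= 0.05 and q <= 0.05 for p, q in zip(pvals, qvals)]
```

In this example the second test has a raw p-value below 0.05 but an adjusted value above it, so it would not be declared significant under the combined criterion.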
When the BMI trajectory group effect was significant for a dependent variable in model 1, we further examined the heterogeneity across BMI trajectory groups as the marginal mean difference (and 95% confidence interval, CI) in the dependent variable using the LSMEANS statement. The confidence limits of the mean differences and the p-values were adjusted for multiple comparisons with the simulate method. Model 1 was further adjusted for the minimal adjustment set of covariates identified with a directed acyclic graph for the association between BMI trajectory and the dependent variables (model 2). These covariates were birth weight and length, maternal BMI, maternal pregnancy weight gain, breastfeeding duration, birth order, maternal education, maternal employment, smoking in the household, physical activity, daily energy intake, percentage of energy from protein, and two-way interactions of physical activity, daily energy intake, and percentage of energy from protein with BMI trajectory.
Association between metabolites and cardiometabolic risk markers. To address the second aim, that is, whether metabolites are related to the 11 CRM (SBP, DBP, triglycerides, HDL-C, FPG, CRP, IL-6, IL-18, adiponectin, chemerin, and leptin), we constructed a two-stage model. In the first stage, we modeled the normalized concentrations of the 32 metabolites and KTR as a function of time using linear mixed models (PROC MIXED) with random coefficients and an unstructured covariance structure, and obtained individual-specific predicted values (best linear unbiased predictors, BLUPs). In the second stage, in a multivariate multivariable linear model, the normalized CRM (dependent variables) were regressed on the BLUP estimates of the 33 independent variables (32 metabolites and KTR). We considered further only those CRM for which this model (model 1) was significant, defined as an overall F-statistic p-value ≤ 0.05 and q-value ≤ 0.05. Within significant models, individual metabolites were considered significant at a p-value ≤ 0.05. In model 2, we adjusted for birth weight and length, maternal BMI, maternal pregnancy weight gain, breastfeeding duration, birth order, maternal education, maternal employment, smoking in the household, physical activity, daily energy intake, percentage of energy from protein, and BMI trajectory. The effect estimates and their 95% CIs represent the SD change in the CRM associated with a 1-SD change in each metabolite.
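The logic of the two-stage approach can be sketched in a deliberately simplified form. This is not the study's model: the first stage below uses a random-intercept BLUP with variance components assumed known (`tau2`, `sigma2`) and the grand mean standing in for the fixed intercept, instead of a fitted random-coefficients model, and the second stage regresses a single standardized marker on a single standardized predictor without covariates. All names and data are hypothetical.

```python
from statistics import mean, pstdev


def random_intercept_blup(groups, tau2, sigma2):
    """Stage 1 (simplified): shrink each individual's mean toward the grand mean.

    groups: dict mapping participant id -> repeated normalized measurements
    tau2:   between-individual variance (assumed known for this sketch)
    sigma2: residual variance (assumed known for this sketch)
    """
    grand = mean(v for vals in groups.values() for v in vals)
    blups = {}
    for pid, vals in groups.items():
        n_i = len(vals)
        shrink = n_i * tau2 / (n_i * tau2 + sigma2)  # more data, less shrinkage
        blups[pid] = grand + shrink * (mean(vals) - grand)
    return blups


def standardize(xs):
    m, s = mean(xs), pstdev(xs)
    return [(x - m) / s for x in xs]


def ols_slope(x, y):
    """Stage 2 (simplified): slope of y on x by ordinary least squares."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


# Hypothetical repeated metabolite scores and one later marker value per person.
metabolite = {"a": [0.2, 0.4, 0.3], "b": [-1.0, -0.8],
              "c": [1.1, 0.9, 1.3, 1.0], "d": [-0.3, -0.1]}
marker = {"a": 1.2, "b": 2.5, "c": 0.6, "d": 1.9}

blups = random_intercept_blup(metabolite, tau2=1.0, sigma2=0.25)
ids = sorted(metabolite)
x = standardize([blups[i] for i in ids])
y = standardize([marker[i] for i in ids])
beta = ols_slope(x, y)  # SD change in the marker per 1-SD change in the predictor
```

Summarizing the repeated measurements into one predicted value per individual lets a marker measured once be related to the whole measurement history; standardizing both variables gives the slope its SD-per-SD interpretation.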
Handling of missing covariates. All covariates except birth weight and length, breastfeeding duration, and birth order had missing values. The percentage of missing values per covariate ranged from 2 to 30% in males and from 1 to 18% in females. Using the SAS procedure PROC MI, we created a single imputed dataset with one burn-in iteration using the fully conditional specification method, with linear regression for continuous variables (maternal BMI, gestational weight gain), logistic regression for ordinal categorical variables (smoking in the household), and the discriminant function for nominal categorical variables (maternal education and maternal employment). Although single imputation does not account for the uncertainty due to missing data, we opted for this technique over multiple imputation, which is generally preferable, because of the complexity of applying multiple imputation in longitudinal datasets 72 , the unstable results in longitudinal mixed-model analysis after multiple imputation 73 , and the generally low proportions of missingness in our data.

Data availability
The datasets generated during and/or analysed during the current study are available from the corresponding author on reasonable request and approval by the principal investigator.