Machine learning, applied to complex multidimensional data, is shown to provide personalized dietary recommendations to control blood glucose levels. This is a step towards integrating the gut microbiome into personalized medicine.
Weather forecasters were once the unfortunate subjects of countless jokes. Predicting the weather from multiple interacting meteorological factors that are greatly influenced by underlying geography seemed no better than extracting a forecast from a cloudy crystal ball. The complexity and individuality of the human body pose similar hurdles to making accurate predictions in personalized medicine. But access to huge amounts of data, refinement of mathematical models and enhanced computing power have transformed predictive meteorology1, and the same is beginning to apply to human health. Writing in Cell, Zeevi et al.2 approach the complex problem of how an individual's blood glucose concentration will be affected by particular foods, the microorganisms in their gut and other aspects of their physiology, and present a predictive model that enables personalized food recommendations.
Obesity and type 2 diabetes mellitus are sweeping the developed world3. An individual's post-prandial glycaemic response (PPGR), a measure of how much blood glucose levels rise after a meal, is a predictor of risk for developing type 2 diabetes4 — the greater the rise, the greater the risk. Because of this link, specific guidelines for how a person can maintain glycaemic control would be extremely useful5. Zeevi et al. equipped 800 people with subcutaneous probes that measured their blood glucose levels every five minutes over the course of a week (Fig. 1). With the exception of 5,107 standardized meals provided across the group, the participants ate their typical meals and logged detailed dietary records. The contents of the 52,005 meals were then analysed alongside more than 1.5 million glucose measurements.
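To make the idea of a PPGR concrete, here is a minimal sketch of one common way to summarize a post-meal glucose rise: the incremental area under the glucose curve above the pre-meal baseline. The article does not say how PPGR was computed, so the function name `incremental_auc`, the use of the first reading as the baseline, the clipping of dips below baseline to zero and the example readings are all assumptions made for illustration.

```python
def incremental_auc(glucose, interval_min=5.0):
    """Approximate the area under the glucose curve above the pre-meal baseline.

    glucose: readings taken at a fixed interval, starting at meal time.
    interval_min: minutes between readings (five minutes, as in the study).
    Assumption: the first reading is the baseline, and readings below the
    baseline contribute zero rather than a negative area.
    """
    if len(glucose) < 2:
        return 0.0
    baseline = glucose[0]
    # Rise above baseline at each time point, clipped at zero.
    rise = [max(g - baseline, 0.0) for g in glucose]
    # Trapezoidal rule over consecutive pairs of readings.
    return sum((a + b) / 2.0 * interval_min for a, b in zip(rise, rise[1:]))


# Illustrative (made-up) readings in mg/dl over 20 minutes.
example = [100.0, 110.0, 120.0, 110.0, 100.0]
print(incremental_auc(example))
```

A larger value means a bigger or longer-lasting rise after the meal, which is the quantity the personalized predictions aim to keep low.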
The data revealed significant interpersonal variability of PPGRs to the identical (standardized) meals and to similar self-reported meals. Furthermore, the foods that induced the highest PPGRs differed greatly between individuals: a banana had a bigger effect than a cookie for one person, but the opposite was true for another. These insights could explain why standard dietary interventions for controlling PPGR are not uniformly effective across a population.
To make sense of the highly personalized glycaemic responses to food, the authors turned to the vast amount of data collected for each individual (Fig. 1). Included in their analyses were physiological characteristics, such as body-mass index; blood markers, such as cholesterol levels; behavioural data gathered from a questionnaire, for example activity level and sleep habits; and profiles of the participants' gut microbiomes (their resident gut microorganisms), including species composition and combined genome sequences. The data immediately revealed that an individual's PPGRs correlate with known risk factors for developing type 2 diabetes, such as body-mass index and systolic blood pressure. However, other, less obvious aspects of the composite medical profile also correlated with PPGRs, including the presence of particular taxa in the microbiome, such as the Enterobacteriaceae, and particular bacterial genes, such as those involved in chemotactic movement.
The authors then used a 'decision tree' machine-learning method to create an algorithm that would incorporate all these pieces of metadata. This approach proved to be predictive for PPGRs in cross-validation with the cohort of 800 participants: each person's PPGRs could be predicted by an algorithm generated using data from the other 799 participants. The algorithm also predicted PPGRs of an independent cohort of 100 individuals whose data were not used to train the algorithm.
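The cross-validation scheme described above can be sketched in a few lines. This is not the authors' algorithm: as a stand-in it uses a single-split regression tree (a 'stump'), and the names `fit_stump` and `leave_one_participant_out`, the two features (carbohydrate in grams, sleep in hours) and all the numbers are hypothetical. What it does show is the key point that every prediction for a participant comes from a model that never saw that participant's data.

```python
from statistics import mean


def fit_stump(rows, targets):
    """Fit a depth-one regression tree: one feature, one threshold, two leaf means."""
    best = None
    for f in range(len(rows[0])):
        values = sorted({r[f] for r in rows})
        # Candidate thresholds are midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0
            left = [y for r, y in zip(rows, targets) if r[f] <= t]
            right = [y for r, y in zip(rows, targets) if r[f] > t]
            lm, rm = mean(left), mean(right)
            sse = sum((y - lm) ** 2 for y in left) + sum((y - rm) ** 2 for y in right)
            if best is None or sse < best[0]:
                best = (sse, f, t, lm, rm)
    if best is None:
        # No feature varies, so fall back to predicting the overall mean.
        overall = mean(targets)
        return lambda row: overall
    _, f, t, lm, rm = best
    return lambda row: lm if row[f] <= t else rm


def leave_one_participant_out(data):
    """data maps participant id -> list of (features, ppgr) pairs, one per meal.

    Returns participant id -> list of (predicted, observed) pairs, where each
    prediction comes from a model trained only on the other participants.
    """
    results = {}
    for held_out in data:
        train = [(x, y) for pid, meals in data.items() if pid != held_out
                 for x, y in meals]
        model = fit_stump([x for x, _ in train], [y for _, y in train])
        results[held_out] = [(model(x), y) for x, y in data[held_out]]
    return results


# Hypothetical toy data: features are [carbohydrate (g), sleep (h)].
toy = {
    "p1": [([20.0, 6.0], 15.0), ([80.0, 7.0], 60.0)],
    "p2": [([25.0, 8.0], 20.0), ([90.0, 7.0], 70.0)],
    "p3": [([30.0, 7.5], 18.0), ([85.0, 8.0], 65.0)],
}
for pid, pairs in leave_one_participant_out(toy).items():
    print(pid, pairs)
```

Holding out whole participants, rather than individual meals, matters here: meals from the same person share that person's physiology and microbiome, so splitting by meal would let the model peek at the individual it is being tested on.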
The authors identified several features in the metadata that were associated with an individual's PPGRs. As expected, increased carbohydrate consumption was closely tied to a raised PPGR. The presence of dietary fibre in meals led to an increased PPGR shortly after consumption, but decreased PPGR in the following 24 hours. There were also several features that were predictive of PPGRs that did not relate to meal consumption, including sleep, physical activity and aspects of the microbiome.
Overall, this approach was statistically more accurate at predicting glycaemic response than the current gold standard, which is based on the carbohydrate content of a meal. In a final test, the authors recruited 26 new participants and tailored meal recommendations (such as chicken recommended for one person, but withheld from another) for each participant using either their algorithm or expert interpretation of those individuals' PPGRs to specific meals. The recommendations based on the model improved PPGRs and stability of blood glucose levels similarly to the improvement achieved by the expert recommendations.
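One simple way to compare two predictors of glycaemic response, such as a carbohydrate-only rule and a richer model, is to correlate each set of predictions with the measured PPGRs. The sketch below uses the Pearson correlation coefficient for this. The article does not name the statistic the authors used, so the choice of Pearson's r and the function name `pearson_r` are assumptions, and no study data are reproduced here.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences of numbers."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(vx * vy)


def better_predictor(measured, predictions_a, predictions_b):
    """Return 'a' or 'b' for whichever predictions correlate more strongly
    with the measured responses (ties go to 'a')."""
    r_a = pearson_r(predictions_a, measured)
    r_b = pearson_r(predictions_b, measured)
    return "a" if r_a >= r_b else "b"
```

In practice the comparison would be made on held-out meals only, and a formal test would be needed before calling one predictor statistically more accurate than the other.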
Although associations have previously been made between aspects of the gut microbiome and diseases ranging from obesity to autism6,7, the mechanisms underlying such links are mostly unknown. One of the big advantages of Zeevi and colleagues' approach is that such mechanisms need not be known for it to work. Nevertheless, this study provides a roadmap for generating and testing hypotheses about mechanisms. For example, do the Akkermansia muciniphila bacteria, which degrade the glycoprotein mucins that line the gut, and whose presence was found by the authors to correlate with higher PPGRs, causally contribute to this glycaemic response, and if so, how? The authors' large human data set and machine-learning approach provide an excellent launching point for mechanistic studies that are likely to be generalizable and relevant to people.
At present, most microbiome researchers would not want to emulate weather forecasters and be asked to predict an individual's response to diet or medication on the basis of their microbiome profile. However, when combined with a machine-learned algorithm that incorporates additional metrics of host biology, such prediction seems much less daunting. The application of machine-learning methods to endpoints beyond PPGRs, such as progression towards or treatments for autoimmune diseases, cardiovascular disease and cancer, is likely to follow rapidly. In the era of 'big data' science, in which we can measure an enormous number of parameters, harnessing the most-predictive aspects of high-dimensional data will be extremely powerful. Although the previous picture of how the complexity of individual microbiome profiles could inform personalized medicine was cloudy, this study provides the grounds for an optimistic forecast.
About this article
European Journal of Nutrition (2016)