A reference map of potential determinants for the human serum metabolome

The collection of metabolites circulating in human blood, termed the serum metabolome, contains a plethora of biomarkers and causative agents. Although the origin of specific compounds is known, we have a poor understanding of the key determinants of most metabolites. Here, we measured the levels of 1251 circulating metabolites in serum samples from a healthy human cohort of 491 individuals, and devised machine learning algorithms to predict their levels in held-out subjects based on a comprehensive profile consisting of host genetics, gut microbiome, clinical parameters, diet, lifestyle, anthropometric measurements and medication data. Notably, we obtained statistically significant predictions for over 76% of the profiled metabolites. Despite using the strict out-of-sample prediction metric, which is a lower bound for the explained variance, diet and microbiome each explained hundreds of metabolites, with over 50% of the variance explained in some metabolites. We further validated the robustness of the microbiome-related associations by showing a high replication rate in two geographically independent cohorts that were not available to us when developing the algorithms. We also demonstrate that some of these interactions are causal, as some metabolites we predicted to be positively associated with bread increased in level following a randomized clinical trial of bread intervention. Microbiome-explained metabolites were enriched with unnamed metabolites, and we devised an algorithm that accurately predicts their biological pathway, finding that they mainly include food components, aromatic amino acids and secondary bile acid derivatives. Overall, our results unravel potential determinants of over 800 metabolites, paving the way towards a mechanistic understanding of alterations in metabolites under different conditions and towards designing interventions for manipulating circulating metabolite levels.
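The out-of-sample metric mentioned above can be illustrated with a minimal sketch. It is not the study's algorithm: it assumes a single synthetic feature and a plain least-squares line as a stand-in model, and every name in it (`fit_line`, `out_of_sample_r2`, `feature`, `metabolite`) is hypothetical. The only point it makes is that each subject's level is predicted by a model fitted without that subject, so the resulting R² tends to be a conservative estimate of explained variance.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for one feature; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx > 0 else 0.0
    return slope, mean_y - slope * mean_x


def out_of_sample_r2(xs, ys, n_folds=5, seed=0):
    """Cross-validated R^2: each prediction comes from a model that never saw that subject."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::n_folds] for i in range(n_folds)]
    preds = [0.0] * len(xs)
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        slope, intercept = fit_line([xs[i] for i in train], [ys[i] for i in train])
        for i in held_out:
            preds[i] = slope * xs[i] + intercept
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


# Synthetic data only (assumption): 491 simulated subjects, matching the cohort size.
rng = random.Random(42)
feature = [rng.gauss(0, 1) for _ in range(491)]
metabolite = [2.0 * x + rng.gauss(0, 1) for x in feature]  # depends on the feature
noise_only = [rng.gauss(0, 1) for _ in feature]            # unrelated to the feature

r2_signal = out_of_sample_r2(feature, metabolite)
r2_noise = out_of_sample_r2(feature, noise_only)
```

In this sketch a metabolite that truly depends on the feature yields a clearly positive held-out R², while an unrelated one stays near zero and can dip slightly below it, which is why a significantly positive out-of-sample score is a strict criterion.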

Click on a Dataset ID in the table below to learn more, and to find out who to contact about access to these data

Dataset ID | Description | Technology | Samples
EGAD00001006247 | | | 3
EGAD00001006354 | | | 1