Multi-omics analyses of the ulcerative colitis gut microbiome link Bacteroides vulgatus proteases with disease severity

Abstract

Ulcerative colitis (UC) is driven by disruptions in host–microbiota homoeostasis, but current treatments exclusively target host inflammatory pathways. To understand how host–microbiota interactions become disrupted in UC, we collected and analysed six faecal- or serum-based omic datasets (metaproteomic, metabolomic, metagenomic, metapeptidomic and amplicon sequencing profiles of faecal samples and proteomic profiles of serum samples) from 40 UC patients at a single inflammatory bowel disease centre, as well as various clinical, endoscopic and histologic measures of disease activity. A validation cohort of 210 samples (73 UC, 117 Crohn’s disease, 20 healthy controls) was collected and analysed separately and independently. Data integration across both cohorts showed that a subset of the clinically active UC patients had an overabundance of proteases that originated from the bacterium Bacteroides vulgatus. To test whether B. vulgatus proteases contribute to UC disease activity, we first profiled B. vulgatus proteases found in patients and bacterial cultures. Use of a broad-spectrum protease inhibitor improved B. vulgatus-induced barrier dysfunction in vitro, and prevented colitis in B. vulgatus monocolonized, IL10-deficient mice. Furthermore, transplantation of faeces from UC patients with a high abundance of B. vulgatus proteases into germfree mice induced colitis dependent on protease activity. These results, stemming from a multi-omics approach, improve understanding of functional microbiota alterations that drive UC and provide a resource for identifying other pathways that could be inhibited as a strategy to treat this disease.

Fig. 1: Multi-omic diversity correlates with IBD disease activity.
Fig. 2: Multi-omic analysis of IBD disease activity.
Fig. 3: Integrated metagenomic–metaproteomic analyses reveal Bacteroides proteases distinguishing a subset of active UC patients.
Fig. 4: Assessing proteolysis in UC patients and Bacteroides supernatant.
Fig. 5: Protease inhibition protects from B. vulgatus and faecal transplant induced pathology in vitro and in vivo.

Data availability

Metabolomic data, proteomic data and additional supplementary files for reanalyzing the data collected here are available online at https://massive.ucsd.edu (Cohort 1 proteomics and metabolomics study ID MSV000082094, Cohort 2 study ID MSV000086509, Cohort 2 metabolomics study ID MSV000084908). Proteomic data and supplementary files for reanalyzing data collected from the faecal transplant study and Bacteroides supernatant are under MassIVE identifiers MSV000086510 and MSV000086511, respectively. Genomic data has been uploaded through EBI https://www.ebi.ac.uk/ena under the study identifiers PRJEB42151 for Cohort 1 and PRJEB42155 for Cohort 2. Comparisons with data generated in this study were also made with proteomics data downloaded from the IBD multi-omics database (https://ibdmdb.org/tunnel/public/HMP2/Proteomics/1633/rawfiles). Databases used in this study include UniRef50 (https://www.uniprot.org/downloads), the human proteome (https://www.uniprot.org/proteomes/UP000005640), mouse proteome (https://www.uniprot.org/proteomes/UP000000589), B. vulgatus proteome (https://www.uniprot.org/proteomes/UP000002861), B. theta proteome (https://www.uniprot.org/proteomes/UP000001414), B. dorei proteome (https://www.uniprot.org/proteomes/UP000005974), a microbial genome database (https://biocore.github.io/wol/) and a human gut microbiome database (https://db.cngb.org/microbiome/genecatalog/genecatalog_human/). Source data are available for in vitro and in vivo experiments. Source data are provided with this paper.

Code availability

The code used in the analysis and visualization of data is available at https://github.com/knightlab-analyses/uc-severity-multiomics.

Acknowledgements

P.S.D., R.H.M. and C.S. were supported through a UCSD training grant from the NIH/NIDDK Gastroenterology Training Program (T32 DK007202). P.S.D. was also supported by an American Gastroenterology Association Research Scholar Award. We thank E. Griffis, D. Bindels and the Nikon Imaging Center at UCSD for help with confocal microscopy, and the UCSD Neuroscience Microscopy Shared Facility (NS047101). This study was supported in part by NIDDK-funded San Diego Digestive Diseases Research Center (P30 DK120515, D.J.G., P.S.D.) and the UCSD Collaborative Center of Multiplexed Proteomics.

Author information

Contributions

R.H.M., D.J.G., R.K., P.S.D., H.C., A.T.G., B.C. and P.C.D. conceived and designed the study. R.H.M., P.S.D., R.A.Q., H.C., A.T.G., B.C., Y.V.-B., Q.Z. and R.K. developed the methodology. R.H.M., M.M.O., K.W., M.C.-T., R.A.Q., G.H., L.D.G. and M.B. acquired multi-omics data. R.H.M. and Y.V.-B. analysed multi-omics data. B.C., N.D., R.R.G., L.E.B. and H.C. conducted animal studies. R.H.M. and C.S. conducted mammalian and bacterial culture studies. R.H.M., P.S.D., H.C., Y.V.-B., Q.Z., R.K., D.J.G., R.A.Q. and P.C.D. interpreted the data. R.H.M., P.S.D. and D.J.G. wrote the manuscript. R.H.M., P.S.D., C.S., R.A.Q., Y.V.-B., R.K., M.R., W.J.S., D.J.G., Q.Z., Y.Z., A.T.G., B.C. and H.C. reviewed and revised the manuscript.

Corresponding authors

Correspondence to Rob Knight or David J. Gonzalez.

Ethics declarations

Competing interests

R.H.M., P.S.D. and D.J.G. have jointly filed for a patent based on this work (International Application No. PCT/US2020/057784). Over the course of the publication process, R.H.M. started employment at Precidiag Inc., a company that has licensed the patent based on this work. All other authors declare no competing interests.

Peer review information

Nature Microbiology thanks Daniel Figeys and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Study Design and Database Generation.

Paired faecal and serum samples were collected from 40 patients with ulcerative colitis (UC) of varying severity. A separately analyzed cohort of 210 faecal samples (73 UC, 117 Crohn’s disease (CD) and 20 healthy controls) was also collected. Samples were processed for proteomics using a Tandem Mass Tag multiplexing workflow. Faecal samples were also subjected to both 16S and shotgun metagenomic analyses for microbial composition and gene quantification, respectively. In parallel, a metabolomics workflow was performed on faecal samples, in which collected MS2 spectra were analyzed for both metabolites and peptides in two separate computational pipelines. A custom database was compiled from the metagenome of faecal samples to enable a comparative analysis between shotgun metagenomic and metaproteomic data sets. This eliminated database-dependent bias, and the shared reference was used for estimating copy number.

Extended Data Fig. 2 A multiplexing approach improves the depth and sparsity of metaproteomics data.

a, Multiplexed metaproteomic methods increase the total number of proteins quantified. The bar graph shows the total number of proteins identified, using identical database methodology, for the 102 UC samples from the IBD multi-omics database, the 40 UC samples from cohort 1 of this study and the 205 samples from cohort 2 of this study. b, Multiplexed metaproteomic methods improve the number of proteins quantified per sample. Displayed are the mean ± SD of the proteins identified per sample from the studies shown in (a). Data are derived from n = 102, 40 and 205 biologically independent samples, as described for (a). One-way ANOVA P values adjusted for multiple comparisons are shown (P < 0.0001). c, Multiplexed metaproteomic methods decrease the sparsity of metaproteomic studies. The percentage of missing quantification values for proteins in each data set is shown.
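The sparsity measure in panel c can be illustrated with a minimal Python sketch. It assumes a proteins-by-samples table in which a missing quantification value is stored as None. The function name percent_missing and the toy table are hypothetical and are not taken from the study's code.

```python
from typing import Optional, Sequence


def percent_missing(table: Sequence[Sequence[Optional[float]]]) -> float:
    """Percentage of missing (None) entries in a proteins x samples table."""
    total = sum(len(row) for row in table)
    if total == 0:
        raise ValueError("table contains no entries")
    missing = sum(1 for row in table for value in row if value is None)
    return 100.0 * missing / total


# Hypothetical toy table: 3 proteins (rows) quantified across 4 samples (columns).
toy_table = [
    [1.2, None, 3.4, 0.8],
    [None, None, 2.1, 1.0],
    [0.5, 0.7, 0.9, 1.1],
]
sparsity = percent_missing(toy_table)
```

A lower percentage means a denser table, which is the sense in which panel c describes multiplexing as decreasing sparsity.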

Source data

Extended Data Fig. 3 Characterizing uneven samples.

a, Alpha diversity (using Pielou’s evenness metric) by disease activity as shown in Fig. 1b, but highlighting classification of samples as uneven when Pielou evenness is below 0.5. Best-fit linear regression lines with 95% confidence intervals are shown and an R² statistic is reported from an ordinary least-squares regression using the formula (Disease Activity + Diagnosis + Disease Activity:Diagnosis). b, 16S beta-diversity is strongly influenced by community evenness. The weighted UniFrac distance metric was used and each sample was classified by community evenness, diagnosis and whether the most abundant 16S feature was from the family Enterobacteriaceae. c, Characterizing the most abundant 16S features. Each sample was classified as either “Uneven” (Pielou evenness < 0.5) or “Other” as shown in (a). Abundances of each amplicon sequence variant were summed by their highest-resolution taxonomic annotation, and the most abundant feature of each sample is represented in a donut plot. The inside ring represents the fractional composition of each patient subgroup and the outside ring represents the number of patients within each subgroup that share a similar most abundant feature. Less common features for each patient subgroup are counted as “Other”.
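For readers unfamiliar with the metric, Pielou's evenness is the Shannon entropy of a sample's feature abundances divided by the natural log of the number of observed features. The sketch below applies the 0.5 cut-off stated in the legend; the function names and toy counts are hypothetical, and returning 0.0 for a sample with a single observed feature is an assumption of this sketch rather than something the source specifies.

```python
import math


def pielou_evenness(counts):
    """Pielou's evenness J' = H' / ln(S) for a vector of feature counts."""
    observed = [c for c in counts if c > 0]
    richness = len(observed)
    if richness == 0:
        raise ValueError("sample has no observed features")
    if richness == 1:
        # Assumption of this sketch: a single-feature sample is treated as maximally uneven.
        return 0.0
    total = sum(observed)
    shannon = -sum((c / total) * math.log(c / total) for c in observed)
    return shannon / math.log(richness)


def classify_sample(counts, threshold=0.5):
    """Label a sample 'Uneven' when its evenness falls below the threshold, else 'Other'."""
    return "Uneven" if pielou_evenness(counts) < threshold else "Other"


# Toy counts (illustrative only): one dominated community and one balanced community.
print(classify_sample([97, 1, 1, 1]))     # Uneven
print(classify_sample([10, 10, 10, 10]))  # Other
```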

Extended Data Fig. 4 Comparison of genera annotations from genes and proteins correlated to disease severity.

The genus composition of genes and proteins correlated to disease activity was compared with different levels of sparsity as a requirement for being deemed “correlated”. Stacked bar charts summarize the number of genes or proteins from the 10 most common genus assignments when correlated to either partial Mayo severity in UC cohorts or CDAI in CD patients. Only genes or proteins with |r| > 0.3 from linear regression were included. a, Genus composition of significantly positively and negatively correlated genes from the MG with no sparsity requirement. b, Genus composition of significantly positively and negatively correlated proteins from the MP with no sparsity requirement. c, Genus composition of associated proteins as in Fig. 3a, but without removing host proteins (genus Homo). d, Genes correlated to disease activity from the MG when filtering out genes appearing in less than 40% of patients within each category. e, Summary of comparing the portions of positively and negatively correlated genes and proteins from each patient cohort when examining the top 10 genera identified in the MG. This analysis is analogous to Fig. 3b, but displaying the top MG genera.
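The two filters named in this legend, a correlation cut-off of |r| > 0.3 and an optional prevalence requirement of 40%, can be illustrated with a short sketch. It assumes that r is the Pearson coefficient (which equals the r of a simple linear regression) and that a feature counts as present when its abundance is above zero. The source applies the prevalence filter within each patient category, whereas this sketch applies it to a single group for brevity. All names and values are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient; returns 0.0 if either vector is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def prevalence(values):
    """Fraction of samples in which the feature is detected (abundance > 0)."""
    return sum(1 for v in values if v > 0) / len(values)


def correlated_features(abundance, activity, min_abs_r=0.3, min_prevalence=0.0):
    """Keep features that pass the prevalence filter and have |r| above the cut-off."""
    kept = {}
    for name, values in abundance.items():
        if prevalence(values) < min_prevalence:
            continue
        r = pearson_r(values, activity)
        if abs(r) > min_abs_r:
            kept[name] = r
    return kept


# Toy data (illustrative only): five samples with increasing disease activity.
activity = [0, 1, 2, 3, 4]
abundance = {
    "gene_a": [0, 1, 2, 3, 4],  # rises with activity
    "gene_b": [4, 3, 2, 1, 0],  # falls with activity
    "gene_c": [0, 0, 0, 0, 5],  # correlated, but detected in only 20% of samples
    "gene_d": [1, 2, 1, 2, 1],  # uncorrelated
}

print(sorted(correlated_features(abundance, activity)))
print(sorted(correlated_features(abundance, activity, min_prevalence=0.4)))
```

With no prevalence requirement the sparse `gene_c` is retained; with the 40% requirement it is dropped, which is the effect the panels with and without a sparsity requirement are contrasting.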

Extended Data Fig. 5 Comparison of genera and functional annotations from genes and proteins correlated to disease severity in CD subtypes.

a, Genus-level bar charts of significantly correlated genes or proteins stratified by CD subtype. The genus composition of genes and proteins from either the MG or MP were correlated to CDAI and shown in stacked bar charts. Only genes or proteins with |r| > 0.3 from linear regression were included, and the top 10 genera are displayed with other genera compiled into an “Others” category. b, CD subtypes genus-level association comparison. The portion of genes or proteins correlated with disease activity from (a) is plotted by a Log10 comparison between the proportion of positive to negative correlations. Genes correlated to disease activity from the MG when filtering out genes appearing in less than 40% of patients within each category. c, CD subtypes functional association comparison. This analysis is analogous to (b) but summarizing the associations to KEGG functional category annotations in the MP.

Extended Data Fig. 6 Patients with overproduction of Bacteroides vulgatus proteases have increased endoscopic and histological severity.

a, Bacteroides protease production corresponds to increased endoscopic severity. The disease activity of overproducers, underproducers, and other patients are individually plotted over boxplots. Two-tailed t-test p-values are displayed above the boxplots. Sample sizes include n = 16, 14 and 71 for overproducers, underproducers and others, respectively. Boxplots are defined by the median, quartiles and 1.5x inter-quartile range. b, Bacteroides protease production corresponds to a patient population with a decreased proportion of patients in histological remission. Each UC patient sample was categorized by Bacteroides vulgatus protease production category and the percent of patients in histological remission is shown in a bar graph with the number of samples in each category displayed above each bar. Histological remission is defined here as Geboes Grade 3 = 0.

Extended Data Fig. 7 Peptide fragments are increased in active UC patients and Bacteroides protease enriched patients.

a, Comparison of peptide fragments identified in patients with varying abundance of Bacteroides proteases. Overproducers from UC cohort 1 had increased peptide fragments in comparison to other patients (two-tailed t-test P = 3.5E-2). Data were derived from n = 8, 9, 23 UC cohort 1 samples and n = 6, 6, 49 UC cohort 2 samples from patients classified as underproducer, overproducer and other, respectively. b, Peptide termini indicate unique proteolysis of human and microbial proteins. The frequency of each amino acid within the N and C terminus of human and de novo peptides was compared to either the human proteome or the total amino acid content of de novo peptides. The Y-axis represents the percent difference of each residue and the letter indicates the amino acid associated with the difference. The N and C terminus are shown separately and each residue is colored by chemical property (Green = polar, Black = Hydrophobic, Red = Acidic, Blue = Basic, Purple = Neutral). c, Peptide fragment identification comparison by disease activity in UC cohort 1. Boxplots with a two-tailed t-test p-value are shown (P = 4.7E-3). Data were derived from n = 18, 12, 10 patient samples with low, moderate or high disease activity, respectively. d, Peptide fragment identification comparison by disease and disease activity state for cohort 2 samples. Boxplots are shown with overlaid two-tailed t-test p-values. Data were derived from n = 19 healthy controls, n = 39, 30, 12 UC samples, and n = 64, 30, 8 CD samples from patients of low, moderate and high activity, respectively. Boxplots in (a,c,d) are defined by the median, quartiles and 1.5x inter-quartile range.
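The terminus comparison in panel b can be sketched as follows for the de novo case, where the background is the total amino acid content of the same peptides. This sketch assumes that "percent difference" means the terminal-residue frequency minus the background frequency, both expressed as percentages; the legend does not state whether an absolute or a relative difference was used. The function names and toy peptides are hypothetical.

```python
from collections import Counter


def residue_percentages(residues):
    """Percentage frequency of each residue in an iterable of one-letter codes."""
    counts = Counter(residues)
    total = sum(counts.values())
    return {aa: 100.0 * n / total for aa, n in counts.items()}


def terminus_enrichment(peptides, terminus="N"):
    """Terminal-residue frequency minus background frequency, in percentage points."""
    if terminus not in ("N", "C"):
        raise ValueError("terminus must be 'N' or 'C'")
    index = 0 if terminus == "N" else -1
    terminal = residue_percentages(p[index] for p in peptides if p)
    background = residue_percentages(aa for p in peptides for aa in p)
    return {aa: terminal.get(aa, 0.0) - pct for aa, pct in background.items()}


# Toy peptides (illustrative only, not sequences from the study).
toy_peptides = ["KAAA", "KAGA", "GAAK", "AAAG"]

n_term = terminus_enrichment(toy_peptides, "N")
print(n_term["K"])  # 31.25: K starts 50% of peptides but is 18.75% of all residues
```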

Extended Data Fig. 8 Determining the impact of Bacteroides species on TEER using co-culture, supernatants and protease inhibitors.

a, Bacteroides vulgatus and Bacteroides dorei, but not other Bacteroides species, disrupt Caco-2 epithelial barriers. Barplots show the mean and standard deviation of the change in TEER at different time points. Data were derived from n = 3 independent cultures collected over n = 2 independent experiments. b, Growth curves of Bacteroides vulgatus with protease inhibitors under different growth conditions. OD600 was measured at indicated time points and a non-linear fit is shown. Data were derived from n = 3 independent cultures collected over n = 1 independent experiment. c, Supernatants from Bacteroides in mid-log phase growth do not significantly impact TEER. B. vulgatus and B. theta were grown to mid-log phase, and their supernatants were concentrated and added to Caco-2 monolayers. TEER was measured at the initial time point and compared to TEER measured after 1, 4, and 8 h of incubation. Plotted are the mean and SEM from n = 3 independent experiments each representing the mean of n = 3 independent wells/experiment (n = 4 wells/experiment for B. vulgatus group). No significant differences were found at any time point.

Source data

Extended Data Fig. 9 Additional measurements from faecal transplant experiments.

a–f, Barplots showing the mean ± SD of macroscopic organ measurements from fecal transplant of UC patient samples in IL10-/- mice with or without administration of a protease inhibitor. Dots represent one mouse, with each group representing results from 3 UC patient fecal samples with each sample given to 3 co-housed mice. Measurements include final weight of the mice (a), colon weight (b), ratios of the colon weight to length (c), caecum weight (d), fat pad weight (e), liver weight (f). g–h, Barplots showing the mean ± SEM for the concentration of an intestinal inflammatory marker, fecal lipocalin2 (g), and amount of 16S rRNA in the spleen of mice for an estimate of the splenic bacterial load (h). Each dot in g–h represents the mean of n = 3 mice transplanted with the same UC fecal sample (with the exception of a mean from n = 2 mice for one patient sample in the Abundant Proteases + Inhibitor Cocktail group) from n = 2 independent experiments. i, Metaproteome genera composition of mice transplanted with UC fecal samples. Fecal samples taken at 8 weeks from mice transplanted with one high-protease sample (H19) and one control patient sample (L3) were analyzed by mass spectrometry-based metaproteomics. Stacked barplots are shown for each mouse displaying the proportion of protein signal derived from the most common genera. j, Molecular function of B. vulgatus proteases identified in mice receiving UC fecal samples. The relative abundance of each B. vulgatus protease is shown in stacked barplots grouped by the Gene Ontology molecular function associated with each protein. k, Top B. vulgatus or B. dorei proteases associated with the fecal samples of mice receiving the H19 sample. Each protein is ranked by pi-score, which combines two-sided t-test p-values and the fold-change difference between all H19 and L3 samples. l, Cumulative protease comparisons. A Venn diagram is shown comparing the protein names of B. vulgatus or B. dorei proteases from four independent proteomics experiments performed in this study. A full list of the Bacteroides proteases identified in this analysis can be found in Supplementary Table 4.
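The legend does not spell out the pi-score formula used for panel k. One commonly used formulation multiplies the log2 fold change by the negative log10 of the p-value, and the sketch below assumes that form; it may differ from what the authors computed. The protein names, fold changes and p-values are hypothetical placeholders, and the p-values are taken as given rather than computed from a t-test.

```python
import math


def pi_score(log2_fold_change, p_value):
    """Assumed pi-score: log2 fold change multiplied by -log10(p)."""
    if not 0 < p_value <= 1:
        raise ValueError("p_value must be in (0, 1]")
    return log2_fold_change * -math.log10(p_value)


def rank_by_pi_score(results):
    """Sort (name, log2_fold_change, p_value) tuples by descending pi-score."""
    scored = [(name, pi_score(lfc, p)) for name, lfc, p in results]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical inputs: (protein, log2 fold change between groups, two-sided t-test p-value).
toy_results = [
    ("protease_A", 2.0, 0.01),
    ("protease_B", 0.5, 0.05),
    ("protease_C", 3.0, 0.5),
]

for name, score in rank_by_pi_score(toy_results):
    print(name, round(score, 3))
```

Under this assumed formula a large fold change with a weak p-value (`protease_C`) ranks below a moderate fold change with strong statistical support (`protease_A`), which is the point of combining the two quantities in one score.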

Extended Data Fig. 10 Working hypothesis.

The results of our study may indicate that certain species from the genus Bacteroides, particularly those recently reclassified under the genus Phocaeicola (for example, Bacteroides vulgatus and Bacteroides dorei), may be implicated in the transition from remission to active disease in UC. We hypothesize that a stressor in the UC gut, such as nutrient deprivation or cell-to-cell competition, may increase protease production and prompt a switch from carbohydrates to proteins as a nutrient source. Some of these proteases may be involved in the disruption of the epithelial barrier, allowing an influx of innate immune cells that further exacerbate disease.

Supplementary information

Supplementary Information

Supplementary Table 1 and Figs. 1–6.

Reporting Summary

41564_2021_1050_MOESM3_ESM.xlsx

Supplementary Table 2 Association of UC clinical variables to alpha and beta diversity. For alpha diversity, Kruskal–Wallis tests were performed on each of the listed categorical variables, and linear regression was applied for quantitative variables. P values are reported for all tests, and r values are reported for quantitative variables. When more than two categories were present in categorical variables, the P value between the two largest categories was reported. For beta diversity, categorical variables were tested using PERMANOVA, and continuous variables were tested using Adonis. P values followed by pseudo-F values for each category are reported. For continuous variables, R² values are reported after pseudo-F values. Testing was based on Bray–Curtis distance matrices, unless otherwise specified. Significance level is indicated according to P value (* <0.05, ** <0.01, *** <0.001).

41564_2021_1050_MOESM4_ESM.xlsx

Supplementary Table 3 Features of importance to predicting UC disease activity. Feature importance values from the 100 random forest iterations predicting UC disease activity are reported for each omic data type from both UC cohorts. The top-100 features from each data type are provided, ranked by the summed importance scores from both cohorts. From the combined datasets, the top-100 features and annotation information related to each feature’s data type are also provided. These data correspond to the random forest results shown in Fig. 2c.

41564_2021_1050_MOESM5_ESM.xlsx

Supplementary Table 4 Bacteroides proteases identified in UC patients, bacterial supernatants and faecal material from humanized mice. Protein names of peptidases or proteases identified throughout the multiple proteomic experiments from this study are listed. Lists are provided for proteases from B. vulgatus or B. dorei that were positively associated (r > 0.3) with UC patient disease activity from either cohort. Additionally, proteases or peptidases identified in the supernatant from different species of Bacteroides are listed. Finally, from a metaproteomic analysis of the faecal material from mice humanized by UC patient faecal samples, B. vulgatus or B. dorei proteases increased (π > 1) in mice transplanted with a sample overabundant in proteases compared with mice transplanted without overabundant proteases.

Source data

Source Data Fig. 4

Tables containing de novo peptide identifications and data from experiments of B. vulgatus supernatant protease activity with different inhibitors.

Source Data Fig. 5

Multiple files related to in vitro and in vivo studies displayed in Fig. 5. This includes the raw data from Caco-2 co-cultures with B. vulgatus and protease inhibitors (Fig. 5b,c), the original image files from Fig. 5d, the quantification of cell morphology (Fig. 5e), original data from the monocolonization experiments (Fig. 5f–i) and the original data from faecal transplant studies (Fig. 5j–n).

Source Data Extended Data Fig. 2

Tables containing the underlying data from Extended Data Fig. 2.

Source Data Extended Data Fig. 8

Tables containing the underlying data from Extended Data Fig. 8.

About this article

Cite this article

Mills, R.H., Dulai, P.S., Vázquez-Baeza, Y. et al. Multi-omics analyses of the ulcerative colitis gut microbiome link Bacteroides vulgatus proteases with disease severity. Nat Microbiol 7, 262–276 (2022). https://doi.org/10.1038/s41564-021-01050-3

  • DOI: https://doi.org/10.1038/s41564-021-01050-3
