A host–microbiota interactome reveals extensive transkingdom connectivity

Sonnert, Nicole D.; Rosen, Connor E.; Ghazi, Andrew R.; Franzosa, Eric A.; Duncan-Lowey, Brianna; González-Hernández, Jaime A.; Huck, John D.; Yang, Yi; Dai, Yile; Rice, Tyler A.; Nguyen, Mytien T.; Song, Deguang; Cao, Yiyun; Martin, Anjelica L.; Bielecka, Agata A.; Fischer, Suzanne; Guan, Changhui; Oh, Julia; Huttenhower, Curtis; Ring, Aaron M.; Palm, Noah W.

doi:10.1038/s41586-024-07162-0

Article
Published: 20 March 2024

Nature volume 628, pages 171–179 (2024)

Abstract

The myriad microorganisms that live in close association with humans have diverse effects on physiology, yet the molecular bases for these impacts remain mostly unknown¹,²,³. Classical pathogens often invade host tissues and modulate immune responses through interactions with human extracellular and secreted proteins (the ‘exoproteome’). Commensal microorganisms may also facilitate niche colonization and shape host biology by engaging host exoproteins; however, direct exoproteome–microbiota interactions remain largely unexplored. Here we developed and validated a novel technology, BASEHIT, that enables proteome-scale assessment of human exoproteome–microbiome interactions. Using BASEHIT, we interrogated more than 1.7 million potential interactions between 519 human-associated bacterial strains from diverse phylogenies and tissues of origin and 3,324 human exoproteins. The resulting interactome revealed an extensive network of transkingdom connectivity consisting of thousands of previously undescribed host–microorganism interactions involving 383 strains and 651 host proteins. Specific binding patterns within this network implied underlying biological logic; for example, conspecific strains exhibited shared exoprotein-binding patterns, and individual tissue isolates uniquely bound tissue-specific exoproteins. Furthermore, we observed dozens of unique and often strain-specific interactions with potential roles in niche colonization, tissue remodelling and immunomodulation, and found that strains with differing host interaction profiles had divergent interactions with host cells in vitro and effects on the host immune system in vivo. Overall, these studies expose a previously unexplored landscape of molecular-level host–microbiota interactions that may underlie causal effects of indigenous microorganisms on human health and disease.

**Fig. 1: Assembling a host exoproteome–microbiome interaction atlas using BASEHIT.**

**Fig. 2: Organizational principles of human microbiome–host exoprotein interactions.**

**Fig. 3: Shared and divergent host exoprotein-binding patterns define distinct subsets of phylogenetically related bacterial strains.**

**Fig. 4: Exoprotein interactions imply key roles in bacterial colonization and disease modulation.**

**Fig. 5: Differential effects of exoprotein-binding and non-binding strains.**

Data availability

All data supporting this study are included in the paper and its associated supplementary tables or deposited in publicly available databases. Source Data is available for all figures (Figs. 1–5 and Extended Data Figs. 1–10). Raw BASEHIT sequence data were deposited and are available at the NCBI Sequence Read Archive with the BioProject identifier: PRJNA1039280. Mapped barcode data have been deposited and are available at Zenodo (https://doi.org/10.5281/zenodo.10606150)⁵¹. RNA sequencing data and whole-genome sequences for Staphylococcus strains were also deposited and can be found at PRJNA1039280. Public databases used: bioBakery 3 (https://github.com/biobakery), Species Genome Bin (http://segatalab.cibio.unitn.it/data/Pasolli_et_al.html), ProTraits (http://protraits.irb.hr/), UniProt (https://www.uniprot.org/), Gene Ontology (https://geneontology.org/), protein physical properties⁵⁵ and the Human Protein Atlas (https://www.proteinatlas.org). Source data are provided with this paper.

Code availability

The custom code for the analysis of BASEHIT data has been deposited and is available at Zenodo (https://doi.org/10.5281/zenodo.10606150)⁵¹.

References

1. Ruff, W. E., Greiling, T. M. & Kriegel, M. A. Host–microbiota interactions in immune-mediated diseases. Nat. Rev. Microbiol. 18, 521–538 (2020).
2. Fan, Y. & Pedersen, O. Gut microbiota in human metabolic health and disease. Nat. Rev. Microbiol. 19, 55–71 (2021).
3. Fischbach, M. A. Microbiome: focus on causation and mechanism. Cell 174, 785–790 (2018).
4. Niemann, H. H., Schubert, W. D. & Heinz, D. W. Adhesins and invasins of pathogenic bacteria: a structural view. Microbes Infect. 6, 101–112 (2004).
5. Poole, J., Day, C. J., von Itzstein, M., Paton, J. C. & Jennings, M. P. Glycointeractions in bacterial pathogenesis. Nat. Rev. Microbiol. 16, 440–452 (2018).
6. Chatterjee, S., Basak, A. J., Nair, A. V., Duraivelan, K. & Samanta, D. Immunoglobulin-fold containing bacterial adhesins: molecular and structural perspectives in host tissue colonization and infection. FEMS Microbiol. Lett. 368, fnaa220 (2021).
7. Foster, T. J., Geoghegan, J. A., Ganesh, V. K. & Hook, M. Adhesion, invasion and evasion: the many functions of the surface proteins of Staphylococcus aureus. Nat. Rev. Microbiol. 12, 49–62 (2014).
8. Langley, R., Patel, D., Jackson, N., Clow, F. & Fraser, J. D. Staphylococcal superantigen super-domains in immune evasion. Crit. Rev. Immunol. 30, 149–165 (2010).
9. Rooijakkers, S. H. & van Strijp, J. A. Bacterial complement evasion. Mol. Immunol. 44, 23–32 (2007).
10. Okumura, R. et al. Lypd8 promotes the segregation of flagellated microbiota and colonic epithelia. Nature 532, 117–121 (2016).
11. Gur, C. et al. Binding of the Fap2 protein of Fusobacterium nucleatum to human inhibitory receptor TIGIT protects tumors from immune cell attack. Immunity 42, 344–355 (2015).
12. Walch, P. et al. Global mapping of Salmonella enterica–host protein–protein interactions during infection. Cell Host Microbe 29, 1316–1332.e12 (2021).
13. Penn, B. H. et al. An Mtb–human protein–protein interaction map identifies a switch between host antiviral and antibacterial responses. Mol. Cell 71, 637–648.e5 (2018).
14. Schweppe, D. K. et al. Host–microbe protein interactions during bacterial infection. Chem. Biol. 22, 1521–1530 (2015).
15. Weimer, B. C., Chen, P., Desai, P. T., Chen, D. & Shah, J. Whole cell cross-linking to discover host–microbe protein cognate receptor/ligand pairs. Front. Microbiol. 9, 1585 (2018).
16. Nicod, C., Banaei-Esfahani, A. & Collins, B. C. Elucidation of host–pathogen protein–protein interactions to uncover mechanisms of host cell rewiring. Curr. Opin. Microbiol. 39, 7–15 (2017).
17. Martinez-Martin, N. Technologies for proteome-wide discovery of extracellular host–pathogen interactions. J. Immunol. Res. 2017, 2197615 (2017).
18. Wood, L. & Wright, G. J. Approaches to identify extracellular receptor–ligand interactions. Curr. Opin. Struct. Biol. 56, 28–36 (2019).
19. Wang, E. Y. et al. High-throughput identification of autoantibodies that target the human exoproteome. Cell Rep. Methods 2, 100172 (2022).
20. Korotkova, N. et al. A subfamily of Dr adhesins of Escherichia coli bind independently to decay-accelerating factor and the N-domain of carcinoembryonic antigen. J. Biol. Chem. 281, 29120–29130 (2006).
21. Berger, C. N., Billker, O., Meyer, T. F., Servin, A. L. & Kansau, I. Differential recognition of members of the carcinoembryonic antigen family by Afa/Dr adhesins of diffusely adhering Escherichia coli (Afa/Dr DAEC). Mol. Microbiol. 52, 963–983 (2004).
22. Garrett, W. S. et al. Enterobacteriaceae act in concert with the gut microbiota to induce spontaneous and maternally transmitted colitis. Cell Host Microbe 8, 292–300 (2010).
23. Brbic, M. et al. The landscape of microbial phenotypic traits and associated genes. Nucleic Acids Res. 44, 10074–10090 (2016).
24. Jung, P. et al. Isolation and in vitro expansion of human colonic stem cells. Nat. Med. 17, 1225–1227 (2011).
25. Lee, S. M. et al. Bacterial colonization factors control specificity and stability of the gut microbiota. Nature 501, 426–429 (2013).
26. Van Rossum, T., Ferretti, P., Maistrenko, O. M. & Bork, P. Diversity within species: interpreting strains in microbiomes. Nat. Rev. Microbiol. 18, 491–506 (2020).
27. Crost, E. H. et al. Utilisation of mucin glycans by the human gut symbiont Ruminococcus gnavus is strain-dependent. PLoS ONE 8, e76341 (2013).
28. Hall, A. B. et al. A novel Ruminococcus gnavus clade enriched in inflammatory bowel disease patients. Genome Med. 9, 103 (2017).
29. Kostic, A. D. et al. Genomic analysis identifies association of Fusobacterium with colorectal carcinoma. Genome Res. 22, 292–298 (2012).
30. Castellarin, M. et al. Fusobacterium nucleatum infection is prevalent in human colorectal carcinoma. Genome Res. 22, 299–306 (2012).
31. Kostic, A. D. et al. Fusobacterium nucleatum potentiates intestinal tumorigenesis and modulates the tumor–immune microenvironment. Cell Host Microbe 14, 207–215 (2013).
32. Gur, C. et al. Fusobacterium nucleatum supresses anti-tumor immunity by activating CEACAM1. Oncoimmunology 8, e1581531 (2019).
33. Abed, J. et al. Colon cancer-associated Fusobacterium nucleatum may originate from the oral cavity and reach colon tumors via the circulatory system. Front. Cell. Infect. Microbiol. 10, 400 (2020).
34. Parhi, L. et al. Breast cancer colonization by Fusobacterium nucleatum accelerates tumor growth and metastatic progression. Nat. Commun. 11, 3259 (2020).
35. Matsui, S. et al. Human Fat2 is localized at immature adherens junctions in epidermal keratinocytes. J. Dermatol. Sci. 48, 233–236 (2007).
36. Jonca, N. et al. Corneodesmosomes and corneodesmosin: from the stratum corneum cohesion to the pathophysiology of genodermatoses. Eur. J. Dermatol. 21, 35–42 (2011).
37. Johnson, N. C. XG: the forgotten blood group system. Immunohematology 27, 68–71 (2011).
38. Uhlen, M. et al. Proteomics. Tissue-based map of the human proteome. Science 347, 1260419 (2015).
39. Bourhis, E. et al. Wnt antagonists bind through a short peptide to the first β-propeller domain of LRP5/6. Structure 19, 1433–1442 (2011).
40. Kahn, M. Can we safely target the WNT pathway? Nat. Rev. Drug Discov. 13, 513–532 (2014).
41. Anastas, J. N. & Moon, R. T. WNT signalling pathways as therapeutic targets in cancer. Nat. Rev. Cancer 13, 11–26 (2013).
42. Carvalheiro, T. et al. Leukocyte associated immunoglobulin like receptor 1 regulation and function on monocytes and dendritic cells during inflammation. Front. Immunol. 11, 1793 (2020).
43. Weiskopf, K. et al. Engineered SIRPα variants as immunotherapeutic adjuvants to anticancer antibodies. Science 341, 88–91 (2013).
44. Blondel, C. J. et al. CRISPR/Cas9 screens reveal requirements for host cell sulfation and fucosylation in bacterial type III secretion system-mediated cytotoxicity. Cell Host Microbe 20, 226–237 (2016).
45. Sauer, M. M. et al. Catch-bond mechanism of the bacterial adhesin FimH. Nat. Commun. 7, 10738 (2016).
46. Adrian, J., Bonsignore, P., Hammer, S., Frickey, T. & Hauck, C. R. Adaptation to host-specific bacterial pathogens drives rapid evolution of a human innate immune receptor. Curr. Biol. 29, 616–630.e5 (2019).
47. Baker, E. P. et al. Evolution of host–microbe cell adherence by receptor domain shuffling. eLife 11, e73330 (2022).
48. Xiang, H. et al. Crystal structures reveal the multi-ligand binding mechanism of Staphylococcus aureus ClfB. PLoS Pathog. 8, e1002751 (2012).
49. The UniProt Consortium. UniProt: the universal protein knowledgebase in 2021. Nucleic Acids Res. 49, D480–D489 (2021).
50. Carpenter, B. et al. Stan: a probabilistic programming language. J. Stat. Softw. 76, 1–32 (2017).
51. andrewGhazi/basehitmodel: basehitmodel-0.1.0. Zenodo https://doi.org/10.5281/zenodo.10606151 (2024).
52. Pasolli, E. et al. Extensive unexplored human microbiome diversity revealed by over 150,000 genomes from metagenomes spanning age, geography, and lifestyle. Cell 176, 649–662.e20 (2019).
53. The Gene Ontology Consortium. The Gene Ontology Resource: 20 years and still going strong. Nucleic Acids Res. 47, D330–D338 (2019).
54. Zhou, X., Kao, M. C. & Wong, W. H. Transitive functional annotation by shortest-path analysis of gene expression data. Proc. Natl Acad. Sci. USA 99, 12783–12788 (2002).
55. Wang, T. & Tang, H. The physical characteristics of human proteins in different biological functions. PLoS ONE 12, e0176234 (2017).
56. Benjamini, Y. & Hochberg, Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. R. Stat. Soc. B 57, 289–300 (1995).
57. Beghini, F. et al. Integrating taxonomic, functional, and strain-level profiling of diverse microbial communities with bioBakery 3. eLife 10, e65088 (2021).
58. Suzek, B. E. et al. UniRef clusters: a comprehensive and scalable alternative for improving sequence similarity searches. Bioinformatics 31, 926–932 (2015).
59. Hyatt, D. et al. Prodigal: prokaryotic gene recognition and translation initiation site identification. BMC Bioinformatics 11, 119 (2010).
60. Asnicar, F. et al. Precise phylogenetic analysis of microbial isolates and genomes from metagenomes using PhyloPhlAn 3.0. Nat. Commun. 11, 2500 (2020).
61. Segata, N., Bornigen, D., Morgan, X. C. & Huttenhower, C. PhyloPhlAn is a new method for improved phylogenetic and taxonomic placement of microbes. Nat. Commun. 4, 2304 (2013).
62. Sukumaran, J. & Holder, M. T. DendroPy: a Python library for phylogenetic computing. Bioinformatics 26, 1569–1571 (2010).

Acknowledgements

We thank all members of the Palm, Ring and Huttenhower laboratories for helpful advice and assistance. This work was supported by a grant from the Leona M. and Henry B. Helmsley Charitable Trust (3083 to N.W.P. and A.M.R.). N.W.P. is additionally supported by an NIH Director’s New Innovator Award (DP2DK125119), the NIA and NIGMS (R01AG068863 and RM1GM141649), a Pew Scholar Award, the Chan Zuckerberg Initiative, Aligning Science Across Parkinson’s, F. Hoffmann-La Roche Ltd, and gifts from the Mathers Family Foundation and Ludwig Family Foundation. A.M.R. is additionally supported by an NIH Director’s Early Independence Award (DP5OD023088), a Pew-Stewart Scholar award, and gifts from the Mathers Family Foundation, the Ludwig Family Foundation and the Robert T. McCluskey Foundation. C.E.R. and N.D.S. were supported by the National Science Foundation Graduate Research Fellowship Program. The computations in this paper were run in part on the FASRC Cannon cluster supported by the FAS Division of Science Research Computing Group at Harvard University. Illustrations in Figs. 1a, 4a and 5a,c were generated with BioRender (https://biorender.com).

Author information

These authors contributed equally: Nicole D. Sonnert, Connor E. Rosen, Andrew R. Ghazi

Authors and Affiliations

Department of Immunobiology, Yale School of Medicine, New Haven, CT, USA
Nicole D. Sonnert, Connor E. Rosen, Brianna Duncan-Lowey, Jaime A. González-Hernández, John D. Huck, Yi Yang, Yile Dai, Tyler A. Rice, Mytien T. Nguyen, Deguang Song, Yiyun Cao, Anjelica L. Martin, Agata A. Bielecka, Suzanne Fischer, Aaron M. Ring & Noah W. Palm
Department of Microbial Pathogenesis, Yale School of Medicine, New Haven, CT, USA
Nicole D. Sonnert
Department of Biostatistics, Harvard T. H. Chan School of Public Health, Boston, MA, USA
Andrew R. Ghazi, Eric A. Franzosa & Curtis Huttenhower
Infectious Disease and Microbiome Program, Broad Institute of MIT and Harvard, Cambridge, MA, USA
Andrew R. Ghazi
The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
Changhui Guan & Julia Oh
Department of Pharmacology, Yale School of Medicine, New Haven, CT, USA
Aaron M. Ring

Contributions

C.E.R., N.D.S., N.W.P. and A.M.R. designed the study. C.E.R. and N.D.S. established the BASEHIT platform and performed BASEHIT screens. C.E.R., Y.D., S.F. and A.M.R. created the exoprotein yeast display library. A.R.G. developed the BASEHIT statistical model and performed associated analysis. E.A.F. performed the global network and phylogenetic analysis. N.D.S. and C.E.R. performed all other analyses. C.E.R., N.D.S., A.A.B. and Y.C. acquired and grew bacteria for BASEHIT screens. C.E.R., N.D.S., B.D.-L., J.A.G.-H., J.D.H. and T.A.R. contributed essential reagents for and performed orthogonal validations. N.D.S., B.D.-L. and J.A.G.-H. performed the in vitro functional experiments. N.D.S., Y.Y., M.T.N. and D.S. assessed potential phenotypes and performed the in vivo experiments. Y.Y. performed the whole-genome sequencing of Staphylococcus strains. C.G. and J.O. contributed Staphylococcus strains. A.L.M. assisted with the gnotobiotic mouse experiments. C.H., A.M.R. and N.W.P. supervised the study. C.E.R., N.D.S., A.R.G., E.A.F., C.H., A.M.R. and N.W.P. wrote the paper with input from all authors.

Corresponding authors

Correspondence to Aaron M. Ring or Noah W. Palm.

Ethics declarations

Competing interests

C.E.R., N.W.P. and A.M.R. are inventors of patents related to the BASEHIT technology and specific host–microorganism interactions discovered through BASEHIT. N.W.P. is a co-founder of Artizan Biosciences and Design Pharmaceuticals. All other authors declare no competing interests.

Peer review

Peer review information

Nature thanks Mikhail Savitski and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data figures and tables

Extended Data Fig. 1 Yeast exoproteome library composition and diversity and bacterial strain collection composition and diversity.

a, Extracellular protein sequences are curated and cloned into a standardized backbone featuring a C-terminal epitope tag. Proper display is confirmed via epitope tag staining, as well as binding by conformation-specific antibodies or endogenous ligands for a subset of proteins. b, Schematic of expression construct used in the yeast display library. c, Proportion of the human exoproteome represented in the yeast display library. d, Each protein is represented by multiple barcodes, with a median of 20 barcodes per protein. Boxplot shows median, IQR, and whiskers extend to 1.5× IQR for n = 3,406 epitopes from 3,336 proteins in the library. e, Tissue expression (defined as Normalized Expression (NX) > 10 in the Human Protein Atlas) of proteins in the library, grouped by barrier, immune, and sterile tissues. f, Percentage of proteins in the library belonging to highly represented protein families. g, Number of strains from indicated genera, showing all genera with 9 or more strains. h, Number of strains from indicated species, showing all species with 5 or more strains. i, Number of strains from different body sites, showing all body sites with 5 or more strains.

Extended Data Fig. 2 BASEHIT optimization with AIEC identifies conditions that yield selectivity and specificity and are broadly specific across diverse known host-microbe interactions.

a, Enrichment of CD55 and CEACAM1 by AIEC using different bead:cell ratios. Enrichment is defined as the fold change in frequency of reads for the indicated protein in the post-selection library relative to the pre-selection library. Enrichment of both CD55 and CEACAM1 decreases with increasing cell:bead ratio. b, Enrichment of CD55 and CEACAM1 by AIEC labelled with variable concentrations of sulfo-NHS-biotin reagent. Increasing or decreasing concentrations of biotin decrease enrichment of CD55 and CEACAM1. c, Enrichment of CD55 and CEACAM1 by various E. coli strains with or without expression of Dr-family adhesins as indicated. CD55 and CEACAM1 are specifically enriched by the Dr-adhesin containing AIEC strain. d, Exoproteome-wide host exoprotein binding pattern of AIEC determined by BASEHIT. CD55 and CEACAM1 are enriched substantially more than any other protein. Data in a, b represent the mean ± s.d., from n = 3 independent samples. e, Diverse bacterial strains with previously described interactions with human exoproteins were screened by BASEHIT and assessed for enrichment. Interactions that were successfully detected by BASEHIT are shown as filled circles, while interactions that BASEHIT failed to detect are shown as empty circles. The overall rate of detection of previously reported interactions (54%) is shown in the pie chart on the right.
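The enrichment definition in panel a can be written out as a short calculation. The following is a minimal sketch, assuming each library is summarized as a mapping from protein name to read count; the function names and all read counts are hypothetical and invented for illustration, not taken from the study's data or code.

```python
def read_frequency(counts, protein):
    """Fraction of all reads in a library that map to `protein`."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("library has no reads")
    return counts.get(protein, 0) / total


def enrichment(pre_counts, post_counts, protein):
    """Fold change in read frequency, post-selection relative to pre-selection."""
    pre = read_frequency(pre_counts, protein)
    if pre == 0:
        raise ValueError(f"{protein} is absent from the pre-selection library")
    return read_frequency(post_counts, protein) / pre


# Hypothetical read counts, invented for illustration only.
pre = {"CD55": 10, "CEACAM1": 10, "ARSA": 980}
post = {"CD55": 400, "CEACAM1": 300, "ARSA": 300}

for name in pre:
    print(name, round(enrichment(pre, post, name), 2))
```

Normalizing by total reads in each library, rather than comparing raw counts, keeps the fold change comparable between libraries sequenced to different depths.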

Extended Data Fig. 3 Impacts of biotinylation and bacterial cell density on the detection of interactions via BASEHIT.

a, Four bacterial strains with differing interaction profiles were grown and labeled with a titration of biotin ranging from 50 nM to 500 µM and then screened by BASEHIT. The enrichments of each protein hit are shown across all conditions, along with the enrichments of two predicted inert proteins, the coronaviral spike protein 229E-S1 and the arylsulfatase ARSA, which serve as internal negative controls. The biotin concentration used for labelling in our large-scale screen (5 µM) is highlighted in teal. Across all tested interactions, 5 µM biotin exhibited enrichments within two-fold of the “optimal” condition, and no appreciable enrichment of inert proteins was observed under any conditions. Data represent the mean ± s.d. from n = 3 independent experiments. b, Five strains were screened via BASEHIT at bacterial amounts ranging from 50 µL of 0.25 OD/mL to 10 OD/mL per well. The enrichments of hits identified in the BASEHIT screen are shown, as well as those of the predicted inert proteins 229E-S1 and ARSA. The density used in our large-scale BASEHIT screen, 5 OD/mL, is highlighted in each graph. Across all tested interactions, an input of 50 µL of 5 OD/mL provided enrichment within two-fold of the “optimal” condition, and no appreciable enrichment of inert proteins was observed under any conditions. The density of bacterial particles was determined via volumetric counts for 97 strains used in our large-scale BASEHIT screen (all strains were at ~5 OD/mL). The five strains selected approximated the lower and upper bounds of particle density (~1 × 10⁷ to ~3 × 10⁸ particles/mL).
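The "within two-fold of the optimal condition" criterion can be expressed as a one-line check. This sketch assumes that "optimal" means the condition with the highest enrichment for a given interaction, which the caption does not state explicitly; the titration values are hypothetical.

```python
def within_fold_of_optimal(enrichment_by_condition, chosen, fold=2.0):
    """True if the chosen condition's enrichment is within `fold` of the best one."""
    best = max(enrichment_by_condition.values())
    return enrichment_by_condition[chosen] * fold >= best


# Hypothetical titration: biotin concentration (µM) -> enrichment of one hit.
titration = {0.05: 3.0, 0.5: 12.0, 5: 20.0, 50: 31.0, 500: 9.0}

print(within_fold_of_optimal(titration, chosen=5))
print(within_fold_of_optimal(titration, chosen=500))
```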

Extended Data Fig. 4 Modelling and scoring procedure metrics.

a, A histogram of the protein barcode representation in the input library. The wide spread on the log₁₀ x-axis indicates a high degree of variability. The model accounts for this by using barcode input concentration as an offset term. Each tick mark across the x-axis below the histogram represents a protein. b, A Venn diagram showing interaction counts that pass each of the three hit-calling thresholds for the standard threshold set (95% interval excludes zero, estimated effect size > 0.5, and concordance score > 0.75). c, A plot of normalized counts demonstrating the utility of the concordance threshold. Both interactions shown have about the same interaction score (around 1.9) and similarly variable inputs in the Pre library (top panels), but the concordance between normalized output counts (bottom panels) in the TFF2:HM645 interaction is much higher than in SLC6A9:HM1171. Grey cells represent zero counts. d, A histogram of concordance scores for all interactions in the assay. Dashed vertical lines indicate the stringent and standard thresholds. e, Saturation curves from repeated rarefaction analysis. Given that both sets of thresholds have roughly plateaued, we can conclude that we have identified most of the interactions that are detectable under the experimental conditions. f, Comparison of the results of an initial run of the scoring method against five repeated runs where the standard deviation of the normal prior on interaction scores varied from 0.075 to 0.3. Each dot represents the score of a particular interaction. Only interactions that were a hit in at least one run are shown. The middle panel uses the same value as the initial run, showing the extent of Monte Carlo error. As expected, the rank and relative magnitude of scores are highly consistent between runs, while narrower priors lead to lower scores and fewer hits and wider priors lead to higher scores and more hits. The two distinct groups of interactions visible in the panels with wide priors represent subpopulations of interactions that are either more or less amenable to the zero-inflation component of the model.
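The three standard thresholds in panel b can be combined as in the sketch below. It is illustrative only and is not the basehitmodel implementation: it assumes the model yields posterior draws of an interaction score, uses the posterior mean as the "estimated effect size", and takes the concordance score as a precomputed input because its definition is not given here. The draws and the concordance values (0.90 and 0.40) are invented.

```python
import random
import statistics


def call_hit(draws, concordance, min_effect=0.5, min_concordance=0.75):
    """Apply the three standard thresholds from the caption to one interaction."""
    cuts = statistics.quantiles(draws, n=40)  # cut points every 2.5%
    lower, upper = cuts[0], cuts[-1]          # equal-tailed 95% interval
    checks = {
        "interval_excludes_zero": lower > 0 or upper < 0,
        "effect_size": statistics.mean(draws) > min_effect,  # assumption: mean
        "concordance": concordance > min_concordance,
    }
    return all(checks.values()), checks


# Simulated posterior draws centred near the score of about 1.9 from panel c.
rng = random.Random(0)
draws = [rng.gauss(1.9, 0.3) for _ in range(2000)]

# Same score distribution, different (hypothetical) concordance values.
concordant_hit, _ = call_hit(draws, concordance=0.90)
discordant_hit, reasons = call_hit(draws, concordance=0.40)
print(concordant_hit, discordant_hit, reasons)
```

As in panel c, two interactions with near-identical scores can be separated by the concordance threshold alone.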

Extended Data Fig. 5 Proteins from multiple tissues bind bacteria with a power-law distribution, and bacteria from different tissues or phyla show similar distributions of host protein binding.

a, Plot of number of bacterial strains bound (interaction called as a hit) for proteins expressed in multiple host tissues. Tissue expression is defined as Human Protein Atlas normalized expression NX > 10. b, Plot of number of proteins bound (interaction called as a hit) for all bacteria with hits as well as for all bacteria including non-binders. c, Same plot as b but depicting strains isolated from specific tissues. Maximum and mean reported for bacteria with one or more hits. d, Same plot as b but depicting strains from indicated phyla. Maximum and mean reported for bacteria with one or more hits.

Extended Data Fig. 6 Biophysical properties are significantly different between interacting and non-interacting proteins.

Proteins that bound at least one bacterial strain (“Targets”) are compared with “Non-targets” for various biophysical properties as indicated. FDR shown is for a two-tailed Wilcoxon Rank-Sum test. Box plots show median, IQR, and whiskers extending to 1.5× IQR, for n = 631 “Targets” and n = 2,705 “Non-targets”.
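When one rank-sum test is run per biophysical property, the resulting p-values need a multiple-testing adjustment before being reported as FDR values. The sketch below assumes the Benjamini–Hochberg procedure (listed in the references) is the adjustment used, which this caption does not state. The input p-values are hypothetical; in practice each would come from a two-sided rank-sum test such as `scipy.stats.mannwhitneyu(targets, non_targets, alternative="two-sided")`.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values
    # never increase as the raw p-value decreases.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values, one per biophysical property.
raw = [0.01, 0.04, 0.03, 0.005]
fdr = benjamini_hochberg(raw)
print([round(q, 4) for q in fdr])
```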

Extended Data Fig. 7 Relationships between similarity in strains’ interaction profiles and their phylogenetic distance.

a, We computed a phylogenetic tree over 108 genomes of tested strains based on ~400 broadly distributed protein families. We compared distances in this tree with similarity of strains’ interaction profiles using Spearman correlation (n = 5,565 strain pairs). Phylogenetic distance is expressed in units of amino acid substitutions per amino acid site. Interaction similarity was measured as the Jaccard overlap score between strains’ sets of human protein binding partners (ignoring strains with no binding partners). b, We separately considered the subset of n = 907 strain pairs with phylogenetic distance <0.02 substitutions per site, which was largely synonymous with a conspecific relationship in taxonomy. In both regimes, interaction similarity and phylogenetic distance were strongly and significantly negatively correlated. In both cases a two-tailed Mantel test with 10⁴ permutations with FDR adjustments was performed.
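The comparison described in this caption combines three pieces: Jaccard overlap between binding-partner sets, Spearman correlation over strain pairs, and a permutation-based Mantel test. The following is a minimal standard-library sketch of that combination, not the authors' analysis code; the six strains, their partner sets and their distances are hypothetical. As a side note, 5,565 is the number of unordered pairs among 106 items, which would be consistent with two of the 108 genomes being excluded for having no binding partners, although the caption does not say so.

```python
import math
import random


def jaccard(a, b):
    """Jaccard overlap between two sets of binding partners."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def midranks(values):
    """Ranks starting at 1; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        for k in range(start, end + 1):
            ranks[order[k]] = (start + end) / 2 + 1
        start = end + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the midranks."""
    rx, ry = midranks(x), midranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    return sxy / math.sqrt(sxx * syy)


def upper(matrix, order):
    """Upper-triangle entries after relabelling rows and columns by `order`."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def mantel(a, b, permutations=10_000, seed=0):
    """Two-tailed Mantel test: permute strain labels of `b`, keep `a` fixed."""
    identity = list(range(len(a)))
    a_vec = upper(a, identity)
    observed = spearman(a_vec, upper(b, identity))
    rng = random.Random(seed)
    extreme = 0
    for _ in range(permutations):
        perm = identity[:]
        rng.shuffle(perm)
        if abs(spearman(a_vec, upper(b, perm))) >= abs(observed) - 1e-12:
            extreme += 1
    return observed, (extreme + 1) / (permutations + 1)


# Hypothetical strains: two groups of close relatives with invented partners.
partners = {
    "A1": {"P1", "P2", "P3"}, "A2": {"P1", "P2"}, "A3": {"P1", "P2", "P3", "P4"},
    "B1": {"P5", "P6"}, "B2": {"P5", "P6", "P7"}, "B3": {"P6", "P7", "P1"},
}
names = list(partners)
phylo = [[0.0 if s == t else (0.01 if s[0] == t[0] else 0.4) for t in names] for s in names]
similarity = [[jaccard(partners[s], partners[t]) for t in names] for s in names]

rho, p_value = mantel(phylo, similarity)
print(math.comb(106, 2), round(rho, 3), round(p_value, 4))
```

Permuting strain labels, rather than shuffling individual pairs, respects the fact that the pairwise entries of a distance matrix are not independent of one another.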

Extended Data Fig. 8 Superbinder Staphylococcus show highly overlapping sub-networks.

a, Network of 7 S. pasteuri and 8 other Staphylococcus superbinders, highlighted in green and orange respectively. The 5 proteins bound by the most strains are labeled. b, Overlap in interaction profiles across strains. Proteins are binned according to whether they are bound by more than half of the S. pasteuri strains (“Pasteuri core”), or by multiple superbinder strains or by only one (“Multiple” and “Unique”, respectively). c, Top proteins bound by multiple superbinders. Overall interaction profiles of proteins bound by 7 or more superbinder strains are colored according to the strains they recognize, including all other Staphylococcus strains as well as non-Staphylococcus strains. d, Interactions for skin-expressed proteins CDSN, FAT2, and XG for all 519 bacterial strains organized by tissue of origin. Dashed red line at 0.5 represents hit threshold.

Extended Data Fig. 9 Phylogenetic specificity of interactions with tissue-specific proteins across all tested strains.

The interaction scores for all 519 tested strains are shown for the indicated proteins, which are highlighted in Fig. 4a. Strains are colored by phylum, and all scores above the hit threshold line at 0.5 are indicated and labeled with the genus of the strain. Parentheses indicate the frequency of hits within a genus.

Extended Data Fig. 10 Ruminococcus gnavus and Fusobacterium strains influence host cell binding and function.

a, Representative flow cytometry plots of CD7-binding and non-binding R. gnavus strains labelling mock, CD7-, and CD55-expressing EXPI293 cells as shown in Fig. 5b. b, Representative flow cytometry plots of THP-1 phagocytosis of CFSE-labelled Fusobacterium spp. and of fluorescein-labelled E. coli K12 BioParticles incubated with unlabelled Fusobacterium spp. from Fig. 5d,e.

Supplementary information

Supplementary Information

Reporting Summary

Supplementary Tables 1–21

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Sonnert, N.D., Rosen, C.E., Ghazi, A.R. et al. A host–microbiota interactome reveals extensive transkingdom connectivity. Nature 628, 171–179 (2024). https://doi.org/10.1038/s41586-024-07162-0

Received: 07 May 2022
Accepted: 05 February 2024
Published: 20 March 2024
Issue Date: 04 April 2024
DOI: https://doi.org/10.1038/s41586-024-07162-0
