Global analysis of biosynthetic gene clusters reveals conserved and unique natural products in entomopathogenic nematode-symbiotic bacteria

Microorganisms contribute to the biology and physiology of eukaryotic hosts and affect other organisms through natural products. Xenorhabdus and Photorhabdus (XP) living in mutualistic symbiosis with entomopathogenic nematodes generate natural products to mediate bacteria–nematode–insect interactions. However, a lack of systematic analysis of the XP biosynthetic gene clusters (BGCs) has limited the understanding of how natural products affect interactions between the organisms. Here we combine pangenome and sequence similarity networks to analyse BGCs from 45 XP strains that cover all sequenced strains in our collection and represent almost all XP taxonomy. The identified 1,000 BGCs belong to 176 families. The most conserved families are denoted by 11 BGC classes. We homologously (over)express the ubiquitous and unique BGCs and identify compounds featuring unusual architectures. The bioactivity evaluation demonstrates that the prevalent compounds are eukaryotic proteasome inhibitors, virulence factors against insects, metallophores and insect immunosuppressants. These findings explain the functional basis of bacterial natural products in this tripartite relationship.

Software and code
Data collection
Bruker Compass DataAnalysis 4.3 was used to collect chromatography and mass spectrometry data. Bruker TopSpin 4.0 was used for NMR data. Geneious Prime 2021 was used for sequence data. Tecan iControl 2.0 was used for absorbance (cell viability) measurements. X-ray diffraction data were processed with the program package XDS (version February 5, 2021).

Data analysis
Raw genome sequencing data were trimmed using Trimmomatic 0.39. Genomes were assembled with SPAdes 3.10.1 and annotated with Prokka 1.12. Completed genome sequences were analyzed and viewed in Geneious Prime 2021. Biosynthetic gene clusters were annotated with antiSMASH 5.0, explored with BiG-FAM 1.0.0, and classified with BiG-SCAPE 1.0.0, using the MIBiG repository 2.0 as reference and the Pfam database 32.0. Cytoscape 3.7.2 was used to visualize the BiG-SCAPE network. Pangenome analysis and visualization were performed with anvi'o 6.1. Statistical data on biosynthetic gene clusters were analyzed and evaluated using Origin 2020b and Microsoft Excel (Office 365). Compound production in wild-type and mutant strains, as well as MS/MS data, were analyzed with Bruker Compass DataAnalysis 4.3 and MetaboliteDetect 2.1. IC50 values (cytotoxicity and IOC proteasome inhibition) were calculated by sigmoidal fitting in GraphPad Prism 8.4.3 or 9.0.2. Data from insect immunity bioassays were subjected to one-way ANOVA using PROC GLM of the SAS program (SAS Institute, 1989) for continuous variables. All results were plotted using SigmaPlot 12.0. Conventional crystallographic rigid-body, positional, and temperature-factor refinements were carried out with REFMAC5 5.0.32. The protein crystal structure was determined and analyzed with the CCP4 Program Suite 7.1.016. Model building was performed with Coot 0.8.7.
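The one-way ANOVA mentioned for the insect immunity bioassays can be illustrated with a minimal Python sketch. This is not the authors' SAS PROC GLM workflow; it only shows how the F statistic and its degrees of freedom are obtained for a continuous response measured in several groups. The function name `one_way_anova`, the group names, and all numbers are hypothetical and chosen purely for illustration.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of numeric groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical, made-up measurements (not data from the study).
control = [1.0, 2.0, 3.0]
treatment_a = [2.0, 3.0, 4.0]
treatment_b = [5.0, 6.0, 7.0]

f_stat, df_between, df_within = one_way_anova([control, treatment_a, treatment_b])
print(f"F({df_between}, {df_within}) = {f_stat:.2f}")
```

The P value would then follow from the F distribution with these degrees of freedom, for example via `scipy.stats.f.sf(f_stat, df_between, df_within)` if SciPy is available.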

Life sciences study design

Sample size
For bacterial genomes, no sample-size calculation was performed. The involved strains (covering all strains in our collection and representing almost all Xenorhabdus and Photorhabdus taxonomy) for pangenome and sequence similarity network analysis were chosen in order to maximize our ability to obtain a comprehensive biosynthetic gene cluster atlas. Sample sizes of compound production in strains and bioassays were based on previous work (

Replication
In general, all experiments were performed at least three independent times with representative data shown. All attempts to repeat compound determination and bioassays were successful. By comparing Xenorhabdus and Photorhabdus genome sequencing data available in NCBI, we were able to assess the reproducibility of the results (that is, the biosynthetic gene clusters) from each study (https://www.nature.com/articles/s41564-017-0039-9). Despite the different genome assemblies, the same biosynthetic gene clusters were recovered independently when using a consistent strain sample. Therefore, genome sequencing was not replicated.

Randomization
For analysis of hemocyte-spreading behavior, a region containing 100 cells was randomly chosen under the microscope for counting spread cells. For Galleria infection assays, the larvae were purchased and bred in-house to similar weight and size. Larvae deemed equivalent were randomly assigned to the experimental groups.
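Random assignment of equivalent larvae to experimental groups can be sketched as follows. This is an illustrative assumption about how such an allocation might be scripted, not the procedure the authors used; the function `assign_to_groups`, the larva identifiers, the group names, and the seed are all hypothetical.

```python
import random


def assign_to_groups(subject_ids, group_names, seed=None):
    """Shuffle subjects and deal them round-robin into the named groups."""
    rng = random.Random(seed)  # a fixed seed makes the allocation reproducible
    shuffled = list(subject_ids)
    rng.shuffle(shuffled)

    groups = {name: [] for name in group_names}
    for i, subject in enumerate(shuffled):
        groups[group_names[i % len(group_names)]].append(subject)
    return groups


# Hypothetical identifiers for 30 larvae of similar weight and size.
larvae = [f"larva_{i:02d}" for i in range(1, 31)]
assignment = assign_to_groups(larvae, ["control", "wild_type", "mutant"], seed=1)

for name, members in assignment.items():
    print(name, len(members))
```

Dealing the shuffled list round-robin keeps group sizes equal (or within one of each other), and recording the seed allows the same allocation to be regenerated later.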

Blinding
Blinding was not relevant to this study, as we analyzed all sequenced genomes in our strain collection.

Animals and other organisms

Laboratory animals
The beet armyworm (Spodoptera exigua) was used in insect immunity assays. The wax moth (Galleria mellonella) was used for infection assays with Xenorhabdus szentirmaii and mutants thereof. Because only larvae were used, their sex could not be determined; it becomes apparent only in adults. S. exigua larvae were reared on an artificial diet (https://agris.fao.org/agris-search/search.do?recordID=KR19910051407) at 25 ± 2 °C and a relative humidity of 60 ± 5% with a 16 h:8 h (L:D) photoperiod. Under these rearing conditions, S. exigua underwent five larval instars (L1-L5) before pupation. Larvae remain in the L5 stage for three days and then become prepupae. All insect immunity experiments used larvae on day 1 of L5 (L5D1). Adults were provided with 10% sucrose for oviposition. G. mellonella larvae were purchased from Zoohaus Haindl, Frankfurt am Main.

Wild animals
The study did not involve wild animals.

Field-collected samples
S. exigua larvae were collected from a Welsh onion (Allium fistulosum L.) field in Andong, Korea, in 1994. The colony has been reared in the laboratory for more than 26 years under the conditions described above (25 ± 2 °C, 16 h:8 h (L:D) photoperiod, and 60 ± 5% relative humidity).

Ethics oversight
No ethical approval or guidance was required because no insect pests were used in this study.