Hydrogenotrophic methanogenesis is the key process in the obligately syntrophic consortium of the anaerobic ameba Pelomyxa schiedti

Pelomyxa is a genus of anaerobic amoebae that live in consortia with multiple prokaryotic endosymbionts. Although the symbionts represent a large fraction of the cellular biomass, their metabolic roles have not been investigated. Using single-cell genomics and transcriptomics, we have characterized the prokaryotic community associated with P. schiedti, which is composed of two bacteria, Candidatus Syntrophus pelomyxae (class Deltaproteobacteria) and Candidatus Vesiculincola pelomyxae (class Clostridia), and a methanogen, Candidatus Methanoregula pelomyxae. Fluorescence in situ hybridization and electron microscopy showed that Ca. Vesiculincola pelomyxae is localized inside vesicles, whereas the other endosymbionts occur freely in the cytosol, with Ca. Methanoregula pelomyxae enriched around the nucleus. Genome and transcriptome-based reconstructions of the metabolism suggests that the cellulolytic activity of P. schiedti produces simple sugars that fuel its own metabolism and the metabolism of a Ca. Vesiculincola pelomyxae, while Ca. Syntrophus pelomyxae energy metabolism relies on degradation of butyrate and isovalerate from the environment. Both species of bacteria and the ameba use hydrogenases to transfer the electrons from reduced equivalents to hydrogen, a process that requires a low hydrogen partial pressure. This is achieved by the third endosymbiont, Ca. Methanoregula pelomyxae, which consumes H2 and formate for methanogenesis. While the bacterial symbionts can be successfully eliminated by vancomycin treatment without affecting the viability of the amoebae, treatment with 2-bromoethanesulfonate, a specific inhibitor of methanogenesis, killed the amoebae, indicating the essentiality of the methanogenesis for this consortium.


Fluorescence in situ hybridization
FISH preparations were performed according to the protocol described in [6].For all our probes, 30% formamide was used as optimal concentration.Hybridization of the probes was performed at 46 o C for six hours then the slides were washed once at 48 o C for 15 minutes.The slides were mounted using Vectashield mounting medium with DAPI (H-1200, Vector Laboratories).Images were acquired using a Leica SP8 confocal microscope, with gating enabled.Deconvolution of the acquired images was performed using Huygens Professional 22.10 and further processing was performed using ImageJ 1.53n.

Electron microscopy
Fixation of the cells was performed using 2.5% glutaraldehyde in 0.1 M cacodylate buffer as described previously [7].The fixed cells were dehydrated in an ethanol series, transferred to acetone and then embedded in EPON resin.Ultrathin sections were prepared on an ultramicrotome (Reichert-Jung Ultracut E) with a diamond knife.Sections were stained with uranyl acetate and lead citrate and examined using a JEOL 1011 transmission electron microscope.

Transcriptome amplification
Transcriptome sequencing was performed based on a modified SmartSeq2 protocol [8] designed to capture also the bacterial transcripts.The modifications include bacterial cell lysis either using rLysozyme or Mutanolysin, polyadenylation of the transcripts using PolyA polymerase and blocking of the adenylation of rRNA using the EMBR-seq strategy [9].EMBR primers were designed based on the 16S, 23S, 5S, for bacteria and 18S, 28S and 5.8S for the eukaryote.20 µM of EMBR primer pools were prepared by mixing equimolar concentrations of the ordered EMBR primers (Table S2).
Briefly, a single cell of Pelomyxa schiedti was isolated by micromanipulation, washed two times in the same buffer, before being placed in a clear 0.2 ml PCR tube containing 0.1 µl of Recombinant RNAse inhibitor (Takara Bio, USA) and 0.9 µl of 0.2% Triton X-100 (Sigma-Aldrich, USA).The bacteria were lysed either with rLysozyme Solution (Merck Millipore, USA) or Mutanolysin (Sigma-Aldrich, USA).When rLysosyme was used, the stock solution of rLysozyme was diluted 100x using nuclease-free water.From the diluted rLysozyme, 0.7 µl were added to the isolated cell together with 0.3 µl of PolyA Polymerase buffer (M0276S, New England Biolabs, UK).When the bacteria were lysed with Mutanolysin, 0.5 µl of nucleasefree water, 0.2 µl of 1000U/mL Mutanolysin, and 0.3 µl PolyA Polymerase buffer were added to the picked cells.In both situations, the mixture was incubated at 37 o C for 15 minutes.Meanwhile the adenylation mixture was prepared by mixing 1.25 µl of 20 µM EMBR primer pool, 1 µl of 10 mM ATP (New England Biolabs, UK), 1 µl of PolyA polymerase (New England Biolabs, UK) and 1.75 µl of nuclease-free water.After incubation 0.5 µl of adenylation mix was added to the reaction and the tube was incubated for 10 minutes at 37 o C. Following incubation, 1 µl of 10 mM dNTP (Thermo Fisher Scientific, USA), 0.25 µl of 20 µM oligo dT and 0.25 µl of EMBR primer pool was added to the reaction then the tube was incubated at 65 o C for five minutes to inactivate the PolyA polymerase followed by 58 o C for one minute and 72 o C for three minutes then the tube was placed immediately on ice.The downstream protocol followed directly the SmartSeq2 protocol starting from the "reverse transcription" step [8].When the EMBR-seq primers were omitted, the primers volume was simply replaced by nuclease-free water.A total of 12 libraries were prepared: five libraries using rLysozyme as bacterial lysis reagent (three of these without EMBR primers, two with EMBR primers) and seven libraries using Mutanolysin (two of these without EMBR primers, five of them with EMBR primers).

Library preparation and sequencing
Transcriptome libraries were prepared using Nextera XT DNA Library preparation Kit (Illumina, San Diego, CA).Five of the prepared single-cell transcriptome libraries were sequenced on Illumina NovaSeq (PE 2x150bp) at Macrogen, Inc., South Korea, and seven of them were sequenced on Illumina NextSeq (PE 2x50bp) at Genomics Core Facility, EMBL, Heidelberg, Germany.

Genome assembly and binning
The single-cell genome assemblies were carried out using the seven single-cell genome sequencing experiments P. schiedti deposited previously [7] in NCBI as part of the Bio Project PRJNA672820.For each single-cell metagenome the raw reads were adapter and quality trimmed using Trimommatic 0.36 [10] with a quality threshold of 15.Afterwards the reads were assembled individually for each single-cell using SPAdes 3.13.0[11] in single-cell (--sc) mode and with the k-mers of 21, 33, 55, 77, 99, and 121.From the assemblies, the eukaryotic data was removed by blastn against the P. schiedti reference genome (NCBI accession GCA_020536535.1).All scaffolds that had a significant hit (>90% identity and >80% coverage) towards the reference genome were classified as eukaryotic and removed.The remaining data was binned using the tetraESOM [12] method.Afterwards, the bins from individual single-cell metagenomes that contained identical small subunit rRNA (SSU) sequences were merged together and clustered further using cd-hit-est [13] at an identity level of 99%.Only the bins that were present in all seven single-cell assemblies were analyzed further.Using this approach, we generated five bins: a Ca.Methanoregula pelomyxae.bin, a Victivales bin, a Ca.Vesiculincola pelomyxae, Ca.Syntrophus pelomyxae, and an Acetomicrobium sp.bin.
Each of the five bins were reassembled by mapping back the raw reads using BBMap from the BBTools package and taking just the mapped Illumina reads.Reassembly was performed using SPAdes 3.13.0using the same parameters as described above.The reassemblies were performed both individually for each singlecell genome, as well as merged, by combining the reads for one bin from all seven single-cells.During reassembly we also added the Nanopore reads (accession no.SRX9527949) to improve the overall contiguity of the genomes.From the resulting assemblies, sequences below 500 bp were discarded.The bins were also checked for any potential contamination using a combination of blastn and blastp as described previously [14].
Finally, for each selected bin, a manual improvement of the assemblies was performed by aligning all generated assemblies to each other using MAUVE [15] and selecting the most contiguous scaffolds from each bin until all the unique sequences are included in the final assembly.In the case of Ca.Vesiculincola pelomyxae, the resulting assemblies were extremely fragmented due to extremely high genome coverage (>5000x).To improve the assemblies, the mapped reads were further normalized using BBNorm from the BBTools package, at an average coverage of 60x.The normalized reads were used to reassemble the genome and a manual improvement step was performed using MAUVE as described above.

Gap closure of Ca. Methanoregula pelomyxae chromosome using PCR
To circularize the assembly of Ca.Methanoregula pelomyxae, specific primers were designed (Table S3) based on the MAUVE alignment to amplify a missing portion of the genome for circularization.The amplification was carried out using Q5 DNA polymerase according to the manufacturer's protocol with an annealing temperature of 55 o C for 30 seconds and extension at 72 o C for 7 minutes.This resulted in a PCR product of ~8kbp that was cloned into pJET1.2vector using CloneJET PCR Cloning Kit (Thermo Fisher Scientific, USA) and sequenced in both directions using the primer walking strategy.The resulting DNA sequence was mapped back to the contig ends and used to circularize the archaeal chromosome.

Genome annotation
The annotation of the symbiotic bacterial genome sequences was performed using Prokka 1.14.6 automatic annotation pipeline [16].This annotation was further completed using 18].
All the annotation sources were merged in a single GenBank file using emapper2gbk [19] and imported into Pathway Tools v 23.0 [20], for further curation and analysis.For all pathways of interest, the annotations were manually curated.

Phylogenetic analysis
For SSU rRNA gene phylogenies several datasets were created by downloading closely related sequences from NCBI.Where it was the case, the 16S rRNA sequences of the bacterial symbionts of Pelomyxa palustris [21] were also included in the datasets.The sequences were aligned using MAFFT 7 [22] with the G-INS-i algorithm followed by manual inspection.Trimming of the alignment was done using BMGE [23] using default parameters.Maximum Likelihood (ML) phylogenetic trees were constructed using IQ-TREE 1.6.12[24] with the Model Finder Plus setting and with 1000 non-parametric bootstrap replicates.
For phylogenomic analysis of Ca.Vesiculincola pelomyxae, we used GTDB-tk [25] to generate a phylogenomic dataset from the 120 conserved bacterial genes.We used GTDB-tk identify to get the single copy genes from the genome of Ca.Vesiculincola pelomyxae and GTDB-tk align to align these genes to representatives already present in the GTDB database [26].We limited the database to the families Acutalibacteraceae, Ruminococcaceae and Ethanoligenenaceae from the order Oscilospiralles.The aligned and untrimmed dataset generated by GTDB-tk align, was further refined to keep at least one representative for each genus from the dataset.The alignment was trimmed using BMGE, with default parameters.The final alignment contained 230 taxa and 27340 amino acid positions.A ML phylogenetic tree was inferred by IQ-TREE 1.6.12using the Posterior Mean Site Frequency (PMSF) empirical model with a LG+F+G guide tree.
The branch supports were estimated using the ultrafast bootstrapping strategy with 10000 bootstrap replicates.

Expression analysis
Raw reads generated by transcriptome sequencing were adapter and quality trimmed using fastp [27].
The trimmed reads were mapped and separated by BBSplit from the BBTools package, using the four genomes (Pelomyxa schiedti, Ca.Vesiculincola pelomyxae, Ca.Methanoregula pelomyxae, and Ca.Syntrophus pelomyxae) as references.The mapped reads for each genome were merged together from all seven transcriptomes and mapped back to their respective genomes using Bowtie 2 [28].FeatureCounts [29] was used to assign each read to gene features.The read count was used to calculate Transcripts Per Million (TPM) for each gene using a python script.TPM was used to evaluate expression levels of the genes for each organism.