Anaerococcus urinimassiliensis sp. nov., a new bacterium isolated from human urine

To date, thirteen species are validly assigned to the genus Anaerococcus. Most of the species in this genus are anaerobic and of human origin. Anaerococcus urinimassiliensis sp. nov., strain Marseille-P2143T, a member of the family Peptoniphilaceae, was isolated from the urine of a 17-year-old boy affected by autoimmune hepatitis and membranoproliferative glomerulonephritis using the culturomics approach. In the current study, a taxono-genomics method was employed to describe this new species. Cells of strain Marseille-P2143T were Gram-positive cocci that formed translucent colonies on blood agar. Its genome was 2,189,509 bp long with a 33.5 mol% G + C content, and its 16S rRNA gene exhibited 98.48% sequence similarity with Anaerococcus provencensis strain 9402080T. When Anaerococcus urinimassiliensis strain Marseille-P2143T was compared with closely related species, the genomic similarity values ranged from 71.23% with A. hydrogenalis strain DSM 7454T (NZ_ABXA01000052.1) to 90.64% with A. provencensis strain 9402080T (NZ_HG003688.1). This strain extends the repertoire of known bacteria of the human urinary tract.

The genus Anaerococcus, which belongs to the phylum Firmicutes, was first described in 2001 [1]. Members of this bacterial genus are mainly anaerobic Gram-positive cocci [2]. They are mostly encountered in the human vagina but can also be detected in the nostrils or on the skin [3]. Anaerococcus spp. have been involved in human infections and have been isolated from different sites of the human body, such as peritoneal, ovarian and cervical abscesses, an arthritic knee, bacteremia, foot ulcers, a sternal wound and vaginoses [4][5][6]. Currently, the genus Anaerococcus contains 13 validly described species with standing in nomenclature [7]. The culturomics concept was recently developed in our laboratory as an alternative method to expand the known human gut repertoire by multiplying culture conditions and combining them with rapid identification by matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS) [8][9][10][11]. Microorganisms isolated and identified by culturomics can be used in further studies [12] and explored for their diagnostic and/or therapeutic potential. The systematic description of new bacterial species recovered from patients may contribute to the description of emerging infections, but can also lead to other discoveries. For example, a Eubacterium limosum strain isolated by culture from a gut sample has been used to test the biotransformation of specific pollutants, the insecticides methoxychlor and dichlorodiphenyltrichloroethane (DDT) [13]. Studies conducted on the probiotic Escherichia coli strain Nissle 1917 showed that a bacterial strain can produce a visceral analgesic, which could form the basis for the development of new visceral pain therapies [14].
Currently, clinical trials are using bacterial cocktails to correct dysbiosis by fecal transplantation.
We report here, through a taxono-genomics strategy [15], the description of Anaerococcus urinimassiliensis strain Marseille-P2143 (= CSUR P2143 = DSM 103473), a new bacterium isolated from the urine of a young boy affected by autoimmune hepatitis associated with membranoproliferative glomerulonephritis and classified in the family Peptoniphilaceae. This new bacterial species was briefly described in a new species announcement in 2016 [16].

Strain isolation and identification by MALDI-TOF MS.
The initial growth was obtained after 10 days of incubation in an anaerobic blood culture vial (Becton Dickinson, Le Pont-de-Claix, France) supplemented with 5 mL of 0.2 μm filtered rumen fluid. A pure culture of strain Marseille-P2143 was then obtained after 48 h of incubation at 37 °C on 5% sheep blood-Columbia agar medium (bioMérieux, Marcy l'Etoile, France) in an anaerobic atmosphere generated using the GENbag Anaer system (bioMérieux), as previously described [16]. Strain Marseille-P2143T could not be identified by MALDI-TOF MS despite several attempts performed as described elsewhere [17]. The screening was performed on a Microflex LT spectrometer (Bruker Daltonics, Bremen, Germany) as previously reported [18]. The reference spectrum obtained (Fig. 1) was imported and analyzed using the Biotyper software (version 3.0) [19] against the Bruker database, which is continually supplemented with the local URMS database (https://www.mediterranee-infection.com/urms-data-base/).

Strain identification and phylogenetic tree.
To classify this bacterium, the 16S rRNA gene was amplified using the primer pair fD1 and rP2 (Eurogentec, Angers, France) and sequenced using the Big Dye Terminator v1.1 Cycle Sequencing Kit and a 3500xL Genetic Analyzer capillary sequencer (Thermo Fisher, Saint-Aubin, France), as previously described [12]. The 16S rRNA nucleotide sequence was assembled and corrected using CodonCode Aligner software (http://www.codoncode.com). For phylogenetic analysis, sequences of the phylogenetically closest species were retrieved by a BLASTn search against the NCBI 16S rRNA reference sequence database (refseq_rna) (https://blast.ncbi.nlm.nih.gov/Blast.cgi?PROGRAM=blastn&PAGE_TYPE=BlastSearch&LINK_LOC=blasthome), which was also used to calculate 16S rRNA gene sequence similarities. "The All-Species Living Tree" Project of Silva [20]. The alignment was performed using MUSCLE [21]. The evolutionary history was inferred using the Maximum Likelihood method based on the Tamura-Nei model [22]. The tree with the highest log likelihood (-5398.79) is shown. The percentage of trees in which the associated taxa clustered together is shown next to the branches. Initial tree(s) for the heuristic search were obtained automatically by applying the Neighbor-Join and BioNJ algorithms to a matrix of pairwise distances estimated using the Maximum Composite Likelihood (MCL) approach, and then selecting the topology with the superior log likelihood value. The tree is drawn to scale, with branch lengths measured in the number of substitutions per site. The analysis involved 18 nucleotide sequences. The codon positions included were 1st + 2nd + 3rd + Noncoding. All positions containing gaps and missing data were eliminated. There were a total of 1184 positions in the final dataset. Evolutionary analyses were conducted in MEGA software (version X) (https://www.megasoftware.net/) [23].
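To illustrate how a pairwise 16S rRNA similarity can be derived from an alignment once gapped positions are discarded, the following is a minimal Python sketch. It assumes two sequences that have already been aligned (for example with MUSCLE) and skips every column containing a gap or an ambiguous base, mirroring the complete-deletion rule stated above. The function name `percent_identity` and the toy sequences are hypothetical and are not part of the study's pipeline; BLASTn computes identity over local alignments and may return slightly different values.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over columns where both sequences carry an unambiguous base."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    valid = set("ACGTU")
    compared = 0
    matches = 0
    for base_a, base_b in zip(aligned_a.upper(), aligned_b.upper()):
        # Complete deletion: skip columns with a gap or missing/ambiguous data.
        if base_a not in valid or base_b not in valid:
            continue
        compared += 1
        if base_a == base_b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


# Toy aligned fragments (hypothetical, not real 16S rRNA data).
seq_1 = "ACGTAC-GTTAGN"
seq_2 = "ACGTTCAGTTAGA"
print(f"{percent_identity(seq_1, seq_2):.2f}% identity")
```

In this toy case, 11 of the 13 columns are comparable and 10 of them match, so the gap and the ambiguous base do not count against the score.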
Phenotypic characteristics and biochemical features.
The optimum growth conditions of the strain were determined by culturing it under different temperatures, atmospheres, pH values and salinities. The strain was cultured and incubated under aerobic, anaerobic (GENbag anaer, bioMérieux Limited, France) and microaerophilic (GENbag Microaer, bioMérieux Limited, France) conditions on Columbia agar enriched with 5% sheep blood (bioMérieux Limited, France) at the following temperatures: 25, 28, 37, 45, and 55 °C. The pH conditions used were 5.5, 6, 6.5, 7, 7.5, 8, 8.5, 9 and 9.5, and the salinity conditions used were 0%, 5%, 10%, 25%, 50% and 100%. The phenotypic characteristics of the strain, such as Gram staining, motility, and oxidase and catalase activities, were determined using standard microbiological methods as previously described [23]. These phenotypic and biochemical characteristics were tested for strain Marseille-P2143 incubated at 37 °C for 48 h. The use of carbon sources was assayed with API 50 CH strips, which were interpreted after incubation at 37 °C for 24 h. Antibiotic susceptibility was determined using the disc diffusion method according to the instructions of the CA-SFM / EUCAST 2020 (https://www.eucast.org/clinical_breakpoints_and_dosing/eucast_setting_breakpoints/), in reference to the EUCAST disk diffusion method for susceptibility testing of Bacteroides fragilis group isolates [24]. The antibiotic discs used were the following: erythromycin (15 μg/ml), penicillin G (10 IU), doxycycline (30 μg/ml), rifampicin (30 μg/ml), vancomycin (30 μg/ml), clindamycin (15 μg/ml), fosfomycin (50 μg/ml), amoxicillin (25 μg/ml), colistin (15 μg/ml), gentamicin (500 μg/ml), amoxicillin-clavulanic acid (30 μg/ml), ceftriaxone (30 μg/ml), colistin (50 μg/ml), trimethoprim-sulfamethoxazole (25 μg/ml), oxacillin (5 μg/ml), imipenem (10 μg/ml), tobramycin (10 μg/ml), and metronidazole (4 μg/ml).
For fatty acid analysis, the bacterial strain, cultivated on COS medium for 48 h under aerobic conditions, was collected in triplicate in three tubes containing approximately the same amount of biomass, which were then weighed. We obtained an average of 130 mg of biomass per tube, and cellular fatty acid methyl ester (FAME) analysis was performed by gas chromatography/mass spectrometry (GC/MS) as described by Sasser et al. [25]. GC/MS analyses were carried out as previously described [26].

Genome sequencing and assembly.
Genomic DNA was extracted using the EZ1 biorobot with the EZ1 DNA tissue kit (Qiagen, Hilden, Germany) and then sequenced on a MiSeq sequencer (Illumina Inc, San Diego, CA, USA) with the Nextera Mate Pair sample prep kit and Nextera XT Paired End (Illumina), as previously described [27]. The assembly was performed using a pipeline combining several software tools (Velvet [28], SPAdes [29] and SOAPdenovo [30]) on trimmed data (MiSeq and Trimmomatic [31] software) or untrimmed data (MiSeq software only). GapCloser [32] was used to reduce assembly gaps. Scaffolds shorter than 800 base pairs (bp) and scaffolds with a depth value lower than 25% of the mean depth were removed. The best assembly was selected using different criteria (number of scaffolds, N50, number of N). The degree of genomic similarity of strain Marseille-P2143T (NZ_FQRX00000000. [33]. These genomes were then aligned with scapper (https://github.com/tseemann/scapper). The aligned genomes were then used to build the phylogenetic tree using the Maximum Likelihood method in MEGA X software [34].
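The scaffold clean-up rule applied during assembly (discard scaffolds shorter than 800 bp or with a depth below 25% of the mean depth) can be sketched in Python as follows. This is an illustrative sketch only, not the pipeline used in the study: the `Scaffold` class, the `filter_scaffolds` function and the example values are hypothetical, and the mean depth is assumed here to be the unweighted mean across scaffolds, which the text does not specify.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List


@dataclass
class Scaffold:
    name: str
    length_bp: int
    depth: float  # mean read depth over the scaffold


def filter_scaffolds(
    scaffolds: List[Scaffold],
    min_length_bp: int = 800,
    min_depth_fraction: float = 0.25,
) -> List[Scaffold]:
    """Keep scaffolds that are long enough and sufficiently covered."""
    if not scaffolds:
        return []
    # Assumption: "mean depth" is the unweighted mean across all scaffolds.
    depth_cutoff = min_depth_fraction * mean(s.depth for s in scaffolds)
    return [
        s for s in scaffolds
        if s.length_bp >= min_length_bp and s.depth >= depth_cutoff
    ]


# Hypothetical scaffolds: s2 is too short, s3 is too shallow.
example = [
    Scaffold("s1", 500_000, 100.0),
    Scaffold("s2", 650, 95.0),
    Scaffold("s3", 12_000, 10.0),
    Scaffold("s4", 3_000, 105.0),
]
kept = filter_scaffolds(example)
print([s.name for s in kept])
```

A length-weighted mean depth would be an equally plausible reading of the rule and would only change the line that computes `depth_cutoff`.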
Genome annotation and analysis.
Open reading frames (ORFs) were predicted using Prodigal [35] with default parameters. Predicted ORFs spanning a sequencing gap region (containing N) were excluded. The predicted bacterial proteome was searched with BLASTP (E-value of 1e-03, coverage of 0.7 and identity percentage of 30) against the Clusters of Orthologous Groups (COG) database. If no matches were found, we searched the nr database [36] using BLASTP with an E-value of 1e-03, a coverage of 0.7 and an identity percentage of 30. An E-value of 1e-05 was used only if the sequence length was less than 80 amino acids (aa). Each protein was searched for PFAM domains (PFAM-A and PFAM-B) using the hmmscan analysis tool. RNAmmer [37] and tRNAScan-SE [38] were used to find rRNA and tRNA genes. ORFans were identified when BLASTP returned no hit with an E-value lower than 1e-03 for alignment lengths greater than 80 amino acids; for alignment lengths smaller than 80 amino acids, an E-value of 1e-05 was used. Artemis [39] was used for data management and visualization of genomic characteristics. The in-house MAGI software was used to analyze the average level of nucleotide sequence similarity at the genome level. It calculated the average genomic identity of gene sequences (AGIOS) among the genomes compared [15]. This software uses Proteinortho [40] to detect orthologous proteins in pairwise genomic comparisons. The corresponding genes were then retrieved, and the average percentage of nucleotide sequence identity among the orthologous ORFs was determined using the Needleman-Wunsch global alignment algorithm. We also used the Genome-to-

Ethical approval.
All methods were carried out in accordance with relevant guidelines and regulations and conformed to the Declaration of Helsinki.

Results and discussion
Identification of strain Marseille-P2143.
The mass spectrum of strain Marseille-P2143 was not present in the Bruker MALDI-TOF MS database. Thus, we were not able to identify this strain using this instrument. However, 16S rRNA gene sequencing indicated that it exhibited 98.48% sequence similarity with Anaerococcus provencensis strain 9402080T (GenBank accession number NR_133036.1), the phylogenetically closest species according to a BLAST search of the NCBI database (Fig. 2). We consequently proposed to classify strain Marseille-P2143T as the type strain of a new species within the genus Anaerococcus [43].
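The reasoning behind this proposal relies on the similarity cut-offs cited in the Conclusion: 98.65% for the 16S rRNA gene, 95% for OrthoANI and 70% for DDH. The Python sketch below shows one way to express that comparison. The function and dictionary names are hypothetical, and passing the 90.64% value reported with A. provencensis as an OrthoANI value is an assumption, since the abstract does not name the metric.

```python
from typing import Dict, Optional

# Cut-offs quoted in this article's Conclusion (percent).
SPECIES_THRESHOLDS = {"16S rRNA": 98.65, "OrthoANI": 95.0, "DDH": 70.0}


def below_species_thresholds(
    sim_16s: Optional[float] = None,
    orthoani: Optional[float] = None,
    ddh: Optional[float] = None,
) -> Dict[str, bool]:
    """Report, for each supplied metric, whether it falls below its cut-off."""
    supplied = {"16S rRNA": sim_16s, "OrthoANI": orthoani, "DDH": ddh}
    return {
        metric: value < SPECIES_THRESHOLDS[metric]
        for metric, value in supplied.items()
        if value is not None
    }


# 98.48% is the reported 16S rRNA similarity with A. provencensis;
# 90.64% is assumed here to be an OrthoANI value (see lead-in).
result = below_species_thresholds(sim_16s=98.48, orthoani=90.64)
print(result)
```

Metrics that are not supplied are simply left out of the result, so a missing DDH value is not mistaken for a value below the cut-off.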

Conclusion
On the basis of unique phenotypic features, including the MALDI-TOF spectrum, a 16S rRNA sequence similarity lower than 98.65%, and OrthoANI and DDH values lower than 95% and 70%, respectively, with the phylogenetically closest species with standing in nomenclature, we formally propose strain Marseille-P2143T as the type strain of Anaerococcus urinimassiliensis sp. nov., a new species within the genus Anaerococcus. Anaerococcus urinimassiliensis (u.ri.ni.mas.si.li.en'sis. N.L. adj. masc. urinimassiliensis, composed of urini, from the Latin urina, urine, and massiliensis, from Massilia, the Roman name of Marseille, France, where strain Marseille-P2143 was first isolated). The colonies are thin, translucent and 50 µm in diameter. Cells are Gram-positive, anaerobic cocci with a diameter ranging from 0.5 to 0.7 µm. They do not produce catalase or oxidase, but exhibit alkaline phosphatase, leucine arylamidase, esterase lipase, α-galactosidase, naphthol-AS-BI-phosphohydrolase and β-galactosidase activities. D-glucose, D-galactose, D-maltose, D-lactose, D-fructose, N-acetyl-glucosamine, salicin, and sucrose are metabolized. The genome of strain Marseille-P2143 is 2,189,509 bp long with a 33.5 mol% G + C content. Its 16S rRNA gene sequence and whole-genome sequence are deposited in GenBank under accession numbers LN898272.1 and NZ_FQRX00000000.1, respectively. The type strain Marseille-P2143T (= CSUR P2143 = DSM 103473) was isolated from the urine of a 17-year-old boy suffering from autoimmune hepatitis and membranoproliferative glomerulonephritis. This new species extends the repertoire of known bacteria of the human urinary tract [44].

Description of Anaerococcus urinimassiliensis sp. nov.
Anaerococcus urinimassiliensis (u.ri.ni.mas.si.li.en'sis. L. gen. masc. urini, of urine, and massiliensis, referring to the Latin name of Marseille, where strain Marseille-P2143 was cultivated).
Cells are Gram-positive, non-motile, and negative for catalase and oxidase activities. They have a mean diameter of 0.6 µm. On blood agar after 48 h of incubation at 37 °C, colonies of strain Marseille-P2143 appear translucent white. Optimum growth is observed at pH 7.5. The major cellular fatty acid was hexadecanoic acid (57%), while unsaturated fatty acids such as 9-octadecenoic acid and