Mycobacterium terramassiliense, Mycobacterium rhizamassiliense and Mycobacterium numidiamassiliense sp. nov., three new Mycobacterium simiae complex species cultured from plant roots

Three slowly growing mycobacteria named strain AB308, strain AB215 and strain AB57 were isolated from the tomato plant roots. The 16S rRNA and rpoB gene sequence analyses suggested that each strain was representative of one hitherto unidentified slowly-growing Mycobacterium species of the Mycobacterium simiae complex. Genome sequencing indicated that each strain contained one chromosome of 6.015–6.029 Mbp. A total of 1,197, 1,239 and 1,175 proteins were found to be associated with virulence and 107, 76 and 82 proteins were associated with toxin/antitoxin systems for strains AB308, AB215 and AB57, respectively. The three genomes encode for secondary metabolites, with 38, 33 and 46 genes found to be associated with polyketide synthases/non-ribosomal peptide synthases and nine, seven and ten genes encoding for bacteriocins, respectively. The genome of strain AB308 encodes for one questionable prophage and three incomplete prophages, while only incomplete prophages were predicted in AB215 and AB57 genomes. Genetic and genomic data indicate that strains AB308, AB215 and AB57 are each representative of a new Mycobacterium species that we respectively named Mycobacterium terramassiliense, Mycobacterium numidiamassiliense and Mycobacterium rhizamassiliense.

Characterization of strain AB308. Strain AB308 exhibited scotochromogenic yellow, circular, smooth colonies on Middlebrook 7H10 medium within two weeks but culture was negative in 2%-salt Middlebrook 7H10 medium. Growth occurred in the range of 28-42 °C with an optimum growth at 37 °C. Strain AB308 exhibited Gram-positive and Ziehl-Neelsen stained red cells, which were non-motile and measured 0.58 ± 0.005 µm wide and 1.32 ± 0.09 µm long (Fig. 3A). The catalase test was positive and the oxidase test was negative. Strain AB308 tested positive for acid-phosphatase, esterase, leucine and valine arylamidase, lipase, naphtol-AS-BI-phosphohydrolase and tween 80 hydrolysis but tested negative for alkaline phosphatase, alpha and beta glucosidase, beta-glucuronisidase, esculin and gelatin hydrolysis, N-acetyl-beta-glucosaminidase, niacin production, nitrate reductase, pyrazinamidase and urease. This phenotypic profile differed from those of M. interjectum and M. paraense, the two closest organisms based on 16S rRNA gene sequence similarity: growth at 42 °C, negative urease and Tween 80 hydrolysis were distinguishing strain AB308 from M. interjectum and M. paraense ( Table 2).
The draft genome sequence of strain AB308 was composed of 17 contigs assembled into 12 scaffolds without any extra-chromosomal replicons (Fig. 4). Analyzing the replication origin genome using Ori-Finder 9 predicted one OriC region (647-bp) splited by the dnaA gene (Supplementary File 2). The 647-bp predicted OriC region showed no homology sequence in DoriC database 10 (5,626,5,533,5,375 and 5,020, respectively). Of 5,730 genes predicted in the strain AB308 genome, 5,678 encode for proteins and 52 encode for RNAs including one complete ribosomal operon and 49 tRNAs. A total of 4,401 genes (77.51%) were assigned as putative function by COGs or by NR blast and 139 genes (2.45%) were identified as ORFans. The remaining 955 genes (16.82%) were annotated as hypothetical proteins. A total of 2,582 genes (45.47%) were found to be associated with the mobilome, including 217 phage proteins. Further genome analysis predicted one questionable prophage and three incomplete prophages (Fig. 5). COG analyses found (6.9%) genes coding for secondary metabolites biosynthesis, transport and catabolism, with 18 genes found to be associated with polyketide synthases and 20 genes with non-ribosomal peptide synthases. A total of 1,197 proteins were found to be associated with virulence, 107 proteins were associated with toxin/antitoxin systems (17 toxin, 26 antitoxin and 64 unidentified toxin/antitoxin proteins). Nine genes encode for bacteriocins while no gene was associated with the resistome. We identified many genes assigned to COG functional categories for transport and metabolism of lipids (9.67%), amino acid transport and metabolism (4.3%) and energy production and conversion (5.28%) ( These phenotypic, genetic and genomic features indicated that strain AB308 was representative of a previously un-described species that we named Mycobacterium terramassiliense with strain AB308 T as the type strain. Characterization of strain AB57. Strain AB57 exhibited scotochromogenic yellow, circular, smooth colonies on Middlebrook 7H10 medium within two weeks but culture was negative on 2%-salt Middlebrook 7H10 medium. Growth occurred at 28-37 °C with an optimum growth at 30 °C. Strain AB57 exhibited Gram-positive cells stained in red using the Ziehl-Neelsen staining. Cells were non-motile and measured 0.67 ± 0.01 µm wide and 1.32 ± 0.14 µm long (Fig. 3B) was positive for alkaline phosphatase, beta glucosidase, esterase, leucine and valine arylamidase, lipase and naphtol-AS-BI-phosphohydrolase but it was negative for alpha glucosidase, beta-glucuronisidase, esculin and gelatin hydrolysis, N-acetyl-beta-glucosaminidase, nitrate reductase, pyrazinamidase and urease. This phenotypic profile differed from those of M. montefiorense and M. simiae, the two closest organisms based on the 16S rRNA gene sequence similarity. Indeed, the type of pigmentation distinguished strain AB57 from M. montefiorense and M. simiae while negative urease test distinguished strain AB57 from M. simiae ( Table 2). The draft genome sequence of AB57 is composed of six contigs assembled into one 6,015,465-bp-long scaffold. Using Ori-Finder 9 , three OriC regions (488 bp, 793 bp and 110 bp) splited by the dnaA gene were predicted (Supplementary File 2). The 793-bp predicted OriC region showed the highest homology sequence with Mycobacterium indicus pranii MTCC 9506, while the 488-bp and the 110-bp Oric regions showed no homology sequence in DoriC database 10 . The genome length of strain AB57 is smaller than those of M. parascrofulaceum, strain AB215, M. triplex and AB308 strain (6.56, 6.24, 6.38 and 6.02 Mb, respectively) but larger than those of M.  16 and 65.85%, respectively). The 5,730-gene content of strain AB57 is smaller than those of M. parascrofulaceum, M. triplex, M. interjectum and strains AB215 and AB308 (6,456, 5,988, 5,953, 5,834 and 5,678, respectively) but larger than those of M. simiae, M. genavense and M. sherrisii (5,533, 5,375, 5,020, respectively). Of the 5,730 predicted genes, 5,626 were protein-coding genes and 53 were RNAs including a unique complete ribosomal operon and 50 tRNAs. A total of 4,427 genes (78.69%) were assigned as putative function by COGs or by NR blast and 127 genes (2.26%) were identified as ORFans. The remaining 991 genes (16.19%) were annotated as hypothetical proteins. A total of 2,588 genes (46%) were found to be associated with the mobilome, including 183 phage proteins. Further genome analysis predicted one incomplete prophage (Fig. 5). The genome of strain AB57 has the genetic potential to produce secondary metabolites, with 18 genes found to be associated with polyketide synthases and 28 genes with non-ribosomal peptide synthases. A total of 1,175 proteins were found to be associated with virulence, 82 proteins were associated with toxin/antitoxin systems (11 toxin, 15 antitoxin and 56 unidentified toxin/antitoxin proteins). Seven genes encoded for bacteriocins while no gene was associated with the resistome. We identified 12.03% genes assigned to COG functional categories for transport and metabolism of lipids, secondary metabolites biosynthesis, transport and catabolism (8.48%), amino acid transport and metabolism (3.96%) and energy production and conversion (5.05%) ( Table 3).  Table 4). These phenotypic, genetic and genomic features indicated that strain AB57 was representative of a previously un-described species that we named Mycobacterium rhizamassiliense with strain AB57 T as the type strain. Characterization of strain AB215. Strain AB215 exhibited scotochromogenic yellow, circular, smooth colonies on Middlebrook 7H10 medium within two weeks but culture failed on 2%-salt Middlebrook 7H10 medium. Growth occurred at 28 °C-37 °C with an optimum growth at 30 °C. Strain AB215 exhibited Gram-positive, red Ziehl-Neelsen-stained cells which were non-motile and measured 0.65 ± 0.06 µm wide and 1.23 ± 0.18 µm long (Fig. 3C). The catalase test was positive and the oxidase test was negative. Strain AB215 was positive for beta-glucosidase, esterase, leucine and valine arylamidase but it was negative for acid and alkaline phosphatase, alpha glucosidase, beta-glucuronisidase, esculin and gelatin hydrolysis, N-acetyl-beta-glucosaminidase, nitrate Growth at 42 °C  (5,678,5,626,5,533,5,375 and 5,020, respectively). Of the 5,888 predicted genes, 5,834 were protein-coding genes and 53 were RNAs including one complete ribosomal operon and 50 tRNAs. A total of 4,484 genes (76.86%) were assigned as putative function (by COGs or by NR blast) and 198 genes (3.39%) were identified as ORFans. The remaining 986 genes (16.9%) were annotated as hypothetical proteins. A total of 2,620 genes (44.91%) were found to be associated with the mobilome, including 217 phage proteins. Further genome analysis predicted two incomplete prophage regions (Fig. 5). The genome of strain AB215 has the genetic potential to produce secondary metabolites, with 15 genes found to be associated with polyketide synthases and 18 genes with non-ribosomal peptide synthases. A total of 1,239 genes were found to be associated with virulence, 76 proteins were associated with toxin/antitoxin systems (eight toxin, 16 antitoxin and 52 unidentified toxin/antitoxin proteins). Ten genes encoded for bacteriocins while no gene was associated with the resistome. We identified 11.06% genes assigned to COG functional categories for transport and metabolism of lipids, secondary metabolites biosynthesis, transport and catabolism (7.47%), amino-acid transport and metabolism (3.98%) and energy production and conversion (4.99%) ( Table 3). Strain AB215 exhibited respectively the highest average nucleotide identity and DNA-DNA  (Fig. 6, Table 4).
These phenotypic, genetic and genomic features indicated that strain AB215 was representative of a previously un-described species that we named Mycobacterium numidiamassiliense with strain AB215 T as the type strain.

Discussion
Investigating mycobacteria associated with tomato plant rhizosphere, we isolated and cultured three mycobacteria strains which exhibited phenotypic, genetic and genomic features indicative of three different, previously un-described species. Negative controls in culture experiments remained negative after a drastic decontamination of the root surface supported that indeed the three isolates are part of roots of the tomato plant. We showed the presence of mycobacteria only in the roots in agreement with an Illumina-based analysis of rice plants indicating that mycobacteria were abundant in roots, less abundant in stems and absent in seeds 1 . On the other hand, it was shown experimentally that M. avium can internalize in the roots of tomato plants and spread in its different tissues, namely stem, leaves and fruits 11 . We are here reporting on natural infection. Accordingly, the number of colony-forming units (CFUs) estimated in the roots was 60 CFU/g which is five logs lesser than the 10 6 CFU/g detected in roots in the experimentally model 11 .
Our study is the third one reporting the culture of mycobacteria in plant roots after the isolation of a Mycobacterium related to Mycobacterium sacrum from root nodules of the wild legume Sphaerophysa salsula; and the isolation of Mycobacterium frederiksbergens from root nodules of the vegetable Astragalus armatus 12 . However, at least eleven isolates were recovered from plant rhizosphere, including one Mycobacterium phlei isolate 3 and four Mycobacterium poriferae related isolates from soil rhizosphere of wheat plants 13 , four M. gilvum isolates from the rhizosphere of Phragmites australis plants 14 and two M. rhodesiae related isolates from the rhizosphere of Arctic native plants 15 . Some of these rhizosphere mycobacteria can colonize plant roots and promote its growth (3), some others contribute to hydrocarbons biodegradation such as pyrene, benzo[a]pyrene and diesel in the rhizosphere 14,15 . These twelve Mycobacterium strains belong to the Mycobacterium parafortuitum 16,17 and Mycobacterium smegmatis complexes 18 whereas the three isolates here reported belong to the M. simiae complex, currently the largest Mycobacterium complex comprising 18 described species 19 .
Genome sequence-derived analyses of the Carbohydrate-Active Enzymes (Cazy) content in strains AB308 T , AB215 T  Further genome analysis did not show the presence of structural nitrogen fixation genes (nif), namely nifH, nifD and nifK. However, the three genomes encode for nifU-like domains, the only common region between the nifU protein from nitrogen-fixing bacteria and rhodobacterial species 12 . The role of this protein is not fully elucidated but it has been proposed to be required for the full These three isolates being deposited into publicly available culture collections are now available for the community to further assess their characteristics and potential values. Cells are Gram-stain-positive bacilli and are acid-alcohol-fast. Colonies are scotochromogenic yellow, circular, smooth and grow on 5% sheep blood agar, Lowenstein Jensen medium and on Middlebrook 7H10 agar at an optimal temperature of 37 °C after one-week incubation. M. terramassiliense exhibits positive catalase and Tween 80 hydrolysis and negative results for niacin production, nitrate reductase, oxidase test and urease. M. terramassiliense contains α, α′, keto/epoxy/ω-1 and dicarboxy/ω-carboxy mycolic acids and exhibits a 6,029,590-bp draft genome with 68.39% G + C content, 5,678 protein-coding genes and 52 predicted RNA genes.

Description of
The type strain AB308 T is deposited in the Collection de Souches de l'Unité des Rickettsies (CSUR P2565) and in the DSMZ-Deutsche Sammlung von Mikroorganismen und Zellkulturen GmbH (DSM 103275). Its genome is deposited in the GenBank under accession number (NZ_FTRV00000000).  Cells are Gram-stain-positive bacilli and are acid-alcohol-fast. Colonies are scotochromogenic orange yellow, circular, smooth and grow on 5% sheep blood agar, Lowenstein Jensen medium and on Middlebrook 7H10 agar at an optimal temperature of 30 °C after one-week incubation. M. rhizamassiliense exhibits positive catalase and negative results for niacin production, nitrate reductase, oxidase test, Tween 80 hydrolysis and urease. M. rhizamassiliense contains α, α′ and keto/epoxy/ω-1 mycolic acids and exhibits a 6,015,465-bp draft genome with 67.22% G + C content, 5,626 protein-coding genes and 53 predicted RNA genes.

Description of
The type strain AB57 T is deposited in the Collection de Souches de l'Unité des Rickettsies (CSUR P2566) and in the DSMZ-Deutsche Sammlung von Mikroorganismen und Zellkulturen GmbH (DSM 103273). Its genome is deposited in the GenBank under accession number (FUFA00000000). Cells are Gram-stain-positive bacilli and are acid-alcohol-fast. Colonies are scotochromogenic dark orange/ yellow, circular, smooth and grow on 5% sheep blood agar, Lowenstein Jensen medium and on Middlebrook 7H10 agar at an optimal temperature of 30 °C after one-week incubation. M. numidiamassiliense exhibits positive catalase and negative results for niacin production, nitrate reductase, oxidase test, Tween 80 hydrolysis and urease. It contains α, α′ and keto/epoxy/ω-1 mycolic acids and exhibits a 6,248,949-bp draft genome with 65.85% G + C content, 5,834 protein-coding genes and 54 predicted RNA genes.

Description of
The type strain AB215 T is deposited in the Collection de Souches de l'Unité des Rickettsies (CSURP2567) and in the DSMZ-Deutsche Sammlung von Mikroorganismen und Zellkulturen GmbH (DSM 103274). Its genome is deposited in the GenBank under accession number (FUEZ00000000).

Materials and Methods
A 500-mg quantity of secondary roots, main root, stem and leaves of twelve tomato plants were sterilized separately by immersion for 1 minute in a 30% bleach followed by another three rinses in distilled sterile water for 5 minutes, samples were then crushed using a sterile piston pellet in 2-mL sterile tubes-Eppendorf containing 1 mL of sterile phosphate buffered saline (PBS). The homogenates were than filtered through a 5 µM filter to remove the debris, 1 ml of each filtrate was then mixed with 3 mL chlorhexidin 1% in 15-mL tubes and shacked for 20 minutes and the remaining volume was then completed with PBS. The tubes were centrifuged at 4,000 g/min for 20 minutes, the supernatants were eliminated and 200 µL of PBS were added to the pellet, 100 µL were then cultured in MOD 9 medium 5 and incubated at 30 °C.
Morphological features. The colonies were sub-cultured on Middlebrook 7H10 solid agar (Becton Dickinson, Le Pont de Claix, France). Matrix-assisted laser desorption ionization-time of flight-mass spectrometry (MALDI-TOF-MS) profile did not match any of the profiles entered in the Bruker database (version December, 2015). Optical microscopy was performed using a Leica DM 2500 microscope (Leica, Wetzlar, Germany) after Gram staining and Ziehl-Neelsen staining and pictures were taken using a Nikon Digital Sight DS-U1 Camera System (Nikon, Tokyo, Japan). For electron microscopy, 10 µL of a bacterial suspension in PBS were deposited on a 400-mesh Formvar carbon-coated copper grid (Euromedex, Strasbourg, France) for 10 minutes after the glow discharge. The sample was contrasted using a 5% ammonium molybdate solution. Images were collected using a Tecnai G2 (FEI, Munich, Germany), operating at 200 keV and bacteria were measured using Image J software, version 1. Pigment production in the dark and catalase activity were determined according to the standard procedures 22 .

Phenotypic characterization.
Extraction and analysis of mycolic acids. Mycolic acids were prepared as described previously 23 with some modifications. At least 5 inoculation loops were collected from a culture plate and transferred into a glass tube containing 2 mL of potassium hydroxyde 9 M. Mycolic acids were hydrolysed at 100 °C during 2 hours. Three mL of 6 N hydrochloric acid were added. Free mycolic acids were then extracted with 2 mL of chloroform. The organic phase was collected and dried at 40 °C under a stream of nitrogen. Free mycolic acids were then dissolved in 100 µL of a methanol-chloroform mixture (50:50, v/v) and subjected to electrospray-mass spectrometry analysis after a 2,000 fold dilution in methanol. Samples were analyzed in the Sensitivity Negative ionization mode using a Vion IMS QTof high resolution mass spectrometer (Waters, Guyancourt, France). Samples were infused at a 10 µL/min flow, after fluidics wash with a chloroform/methanol solution (50:50) and monitored from 500 to 2,000 m/z during 2 minutes. Ionization parameters were set as follow: capillary voltage 2.5 kV, cone voltage 50 V, source and desolvation temperatures 120/650 °C. Mass calibration was performed automatically during analysis using a Leucine Enkephalin solution at 50 pg/µL (Lockmass 554.2620 m/z). Mass spectra were then combined between 800 and 1400 m/z and smoothed for subsequent data interpretation. Mycolic acids were described according to previously detailed structures 24,25 . Here, keto, epoxy and ω-1methoxy mycolic acid subclasses could not be distinguished because of their identical chemical formula.
Protein profile analysis. A small part of a colony was picked on a Middlebrook 7H10 solid-medium using a sterile tip and applied directly on a ground-steel MALDI target plate. Then, one µL of a matrix solution (saturated α-cyano-4hydroxycinnamic acid in 50% acetonitrile and 2.5% trifluoroacetic acid) (Bruker Daltonics) was used to over-lay the sample. After 5 minute-drying, the plate was loaded into the Microflex LT (Bruker Daltonics) mass spectrometer. Spectra were recorded following the parameters as previously described 26 . All signals with resolution ≥400 were automatically acquired using AutoXecute acquisition control in flexControl software version 3.0 and the identifications were obtained by MALDI Biotyper software version 3.0 with the Mycobacteria Library v2.0 database (version December, 2015).
Genome sequencing. Genomic DNA was sequenced on the MiSeq Technology (Illumina Inc, San Diego, CA, USA) with the 2 applications: paired end and mate pair. Both strategies were barcoded in order to be mixed respectively with 11 other genomic projects prepared according to the Nextera XT DNA sample prep kit (Illumina) and with 11 others projects according to the Nextera Mate Pair sample prep kit (Illumina). To prepare the paired end library, 1 ng of gDNA was fragmented and amplified by limited PCR (12 cycles), introducing dual-index barcodes and sequencing adapters. After purification on AMPure XP beads (Beckman Coulter Inc, Fullerton, CA, USA), the libraries were then normalized and pooled for sequencing on the MiSeq. Automated cluster generation and paired end sequencing with dual indexed 2x250-bp reads were performed in a 9-hours run. Total i²nformation of 9.0 Gb was obtained from a 1,019 k/mm 2 cluster density with a cluster passing quality control filters of 90.2% (17,374,744 passed filtered reads). Within this run, the index representation was 8.20% for AB215, 8.09% for AB57 and 6.38% for AB308. The number of end reads was 1,424,260 for AB215, 1,405,843 for AB57 and 1,108,717 for AB308. These reads were trimmed and filtered according to the read qualities.
The mate pair library was prepared with 1.5 µg of genomic DNA using the Nextera mate pair Illumina guide. The genomic DNA sample was simultaneously fragmented and tagged with a mate pair junction adapter. The profile of the fragmentation was validated on an Agilent 2100 BioAnalyzer (Agilent Technologies Inc, Santa Clara, CA, Phylogenetic analysis. Phylogenetic analyses based on the 16S rRNA gene sequence were performed using MEGA7 42 with the maximum likelihood method and complete deletion option, based on the Tamura-Nei model for nucleotide sequences. Initial trees for the heuristic search were obtained automatically by applying the neighbor-joining and BIONJ algorithms to a matrix of pairwise distances estimated using the maximum composite likelihood (MCL) method. Statistical support for internal branches of the trees was evaluated by bootstrapping with 1000 iterations.