Introduction

Non-tuberculous mycobacteria (NTM) are isolated from human and environmental samples all over the world1, including tropical areas and remote countries and territories2. Our recent investigation of mycobacteria in French Polynesia, a remote French territory composed of 117 islands in the South Pacific3, indicated that rapidly-growing mycobacteria formed the most prevalent group of NTM in human respiratory tract specimens3. Among such organisms, sequencing the 16 S rRNA, rpoB and hsp65 genes suggested that strain M26, isolated from sputum collected in August 2011 from an 86-year-old French Polynesian man living in Taravao (17 44’ S, 149 18’ W) was representative of a new species. Strain M26 displayed 91.47% rpoB gene sequence similarity with Mycobacterium rhodesiae CIP 106806T (GenBank: EU109302), 98.93% 16 S rRNA gene sequence similarity with Mycobacterium smegmatis ATCC 19420T (GenBank: NR115233) and 95.59% hsp65 gene sequence similarity with Mycobacterium conceptionense CIP 108544T (GenBank: AY859678)3. Since strain M26 was regarded as a potentially undescribed species of the genus Mycobacterium, it was further characterized.

For phenotype, we observed that strain M26 exhibited yellow-orange, circular, convex, smooth colonies on Middlebrook 7H10 medium. Exposing colonies to light during incubation did not change their color. Growth occurred from 28–42 °C with optimum growth at 30–37 °C, where colonies are detected at 48-hour subculture. Growth was also observed on trypticase soy agar and egg-based Coletsos medium, and Middlebrook 7H9 broth. Strain M26 exhibited Gram-positive, red Ziehl-Neelsen stained cells, which were non-motile and measured 0.54 ± 0.1 μm wide and 3.32 ± 1.46 μm on an average (Fig. 1). The catalase, niacin and Tween 80 hydrolysis tests were negative at 37 °C. Culture was positive when adding nitrobenzoic acid or up to 2% salt in Middlebrook 7H10 solid medium. Strain M26 hydrolyzed the complex polysaccharides esculin and gelatin. Further enzyme activities included pyrazinamidase, alkaline phosphatase, alpha-glucosidase, beta-glucosidase, urease but it was negative for nitrate reductase, pyrolidonyl arylamidase, beta-glucuronidase, beta-galactosidase, and N-acetyl beta-glucosaminidase. This phenotypic profile was unique, and specifically M26 strain phenotypically differed from M. smegmatis and M. conceptionense, the two closest organisms based on 16 S rRNA and hsp65 gene sequence similarity (Fig. 2). In particular, results of catalase, growth at 5% NaCl, Tween 80 hydrolysis, nitrate reductase, gelatin hydrolysis distinguish strain M26 from M. smegmatis4. Also, the absence of scotochromogenic colonies, negative catalase activity and negative Tween hydrolysis at 37 °C all distinguished strain M26 from M. rhodesiae, the closest species by rpob sequencing4 (Fig. 2). Accordingly, M26 strain exhibited a reproducible matrix-assisted laser desorption ionization-time of flight-mass spectrometry (MALDI-TOF-MS) profile that did not match any of the profiles entered in the Bruker database (version December, 2015). Furthermore, strain M26 was susceptible to rifabutin (minimal inhibitory concentration (MIC) < 0.015 mg/L), rifampicin (MIC, 0.06 mg/L), ethambutol (MIC, 1 mg/L), clarithromycin (MIC, 2 mg/L), azithromycin (MIC, 2 mg/L), linezolid (MIC, 0.5 mg/L), amikacin (MIC, 0.25 mg/L), ciprofloxacin (MIC, 0.125 mg/L), levofloxacin (MIC, 0.06 mg/L), moxifloxacin (MIC, <0.03 mg/L), sulfamethoxazole (MIC, 1 mg/L) and resistant to penicillin G (MIC, 256 mg/L), imipenem (MIC, 64 mg/L), doxycyclin (MIC, 4 mg/L) and minocyclin (MIC, 1 mg/L). This antibiotic susceptibility profile also distinguished strain M26 from M. rhodesiae, which is resistant to rifampicin with a MIC of 25 mg/L. Phenotypic traits therefore suggested that strain M26 was a representative strain of a previously undescribed Mycobacterium species.

Figure 1: Transmission electron microscopy of “Mycobacterium massilipolynesiensis” sp. nov. strain M26T, using a Morgani 268D (Philips) at an operating voltage of 60 kV.
figure 1

The scale bar represents 500 nm.

Figure 2: Phylogenetic tree based on rpoB sequences, bootstrapped 100 times, highlighting the position of Mycobacterium massilipolynesisensis sp. nov. M26T (in bold) relative to other nontuberculous mycobacteria (branch point highlighted by a red arrow).
figure 2

The tree was constructed using the ClustalW and MEGA5 software programs. Only homology distance values above 90% are displayed in the tree. The corresponding GenBank accession numbers are displayed in parentheses. Bar: 0.02 substitution per nucleotide position.

Further, we found that the genome of strain M26 comprised one 5,732,017-bp long chromosome without plasmid and a 67.54% GC content (Fig. 3). The genome of strain M26, comprising 12 contigs assembled into nine scaffolds, was ordered using Mauve software and Mycobacterium gilvum genome as nearest sequenced complete genome reference. The ordered scaffolds were then assembled into one complete sequence. Analyzing the replication origin in strain M26 genome using Ori-Finder5,6 predicted two OriC regions splited by the dnaA gene and located at scaffold00008 (508 and 368 bp) (Supplementary File 1). The 368 bp predicted OriC region showed homology sequence with Mycobacterium rhodosiae NBB3 and Mycobacterium avium 104 in DoriC database7 (Supplementary File 1). The genome of strain M26 encodes 5,500 protein-coding genes and 52 RNAs including two complete rRNA operons and 46 tRNA. A total of 4,086 genes (74.29%) were assigned as putative function (by COGs or by NR blast), whereas 166 genes (3.02%) were identified as ORFans. The remaining genes were annotated as hypothetical proteins (989 genes, 18%). A total of 2,876 proteins were found to be associated with mobilome. Eleven genes encode for bacteriocins and 100 proteins are associated with toxin/antitoxin system. We detected no gene encoding for proteins potentially involved in the resistome in the genome of strain M26. One protein was found to belong to the FxsA bacterial family of cytoplasmic membrane proteins. The molecular function of FxsA is unknown, but its over-expression in Escherichia coli has been shown to lessen the exclusion of phage T7 in those cells with an F plasmid. Analyzing the mobilome of strain M26 predicted one incomplete 23-kbp prophage region related to the Herpesviridae family, which lost the structural and replication protein required for a phage (Fig. 3, Fig. 4). We detected no gene for clustered regularly interspaced short palindromic repeats (CRISPR). We identified a large number of genes assigned to COG functional categories for transport and metabolism of lipids (8.9%), secondary metabolites biosynthesis, transport and catabolism (7.3%), amino acid transport and metabolism (6.3%) and energy production and conversion (6%) in the strain M26 genome (Table 1). This genomic content probably reflects the high lipid content in strain M26, as is known for the genus Mycobacterium. Mycobacterial lipids play important roles in virulence and drug susceptibility8. The genome of M26 also has the potential to produce secondary metabolites, with 73 genes found to be associated with polyketide synthases (PKSs) and non-ribosomal peptide synthetase (NRPSs). Some of these genes are orthologous to some Mycobacterium tuberculosis genes such as Pks 139. NRPSs encode for peptide synthetase orthologous to those of Myxococcus stipitatus and siderophore biosynthesis protein, similar to those of Streptomyces davawensis and Streptomyces griseus. Siderophores are implicated in the acquisition of ferric iron in most microorganisms.

Figure 3: Graphical circular map of M26 genome.
figure 3

Figure 3 shows the genome sequence of M26 strain (Black) compared to three Mycobacterium genomes. The outermost ring highlights the genome sequence of M26 and annotated Prophage and OriC regions, shown in blue and red respectively. The scaffolds were ordered using Mauve software and the complete genome sequence of M. gilvum (Second circle). The remaining rings show BLAST comparison with M. gilvum. M. phlei and M. rhodesiae. The innermost rings show GC skew (purple/green) and GC content (black).

Figure 4: Genomic organization of M26 uncomplete prophage region.
figure 4

 The brown color represents the attachment site; the green the integrase, the blue indicates prophage-like proteins.

Table 1 Number of genes associated with the 25 general COG functional categories.

Phylogenomic analysis showed that the strain M26 genome is closest to Mycobacterium phlei genome, with average nucleotide identity of 76.6% (Fig. 5). The strain M26 genome is smaller than those of M. smegmatis str. mc2 155 and M. rhodesiae NBB3 (6.99 Mb and 6.42 Mb, respectively); its GC% content is higher than those of M. smegmatis str. mc2 155 and M. rhodesiae NBB3 (67.4% and 65.49%, respectively). We further analyzed 32 Mycobacterium species genomes in addition to the M26 strain genome to construct the Mycobacterium pangenome (Table 2). These 33 Mycobacterium species yielded a pangenome of 181,387 genes. The core genome (the set of genes shared by all the studied species) is composed of 1,474 orthologous genes (45.29%) and the accessory genome (the set of genes present in some but not all the species) is composed of 8,406 orthologous genes and 8,963 unique genes (genes unique to individual species) (54.71%). M26 strain encodes 361 unique genes (0.19%). A total of 14,356 genes encoded in other Mycobacterium species, are absent in the M26 genome (7.54%) (Fig. 6A). Functional annotation of M26 strain unique genes revealed that 84.21% have unknown function and 3.88% are implicated in lipid transport and metabolism; while 48.87% of genes absent in the M26 strain genome have unknown function, 9.5% are implicated in transcription and 19% are implicated in transport and metabolism (Fig. 6B). M. phlei is an environmental organism seldom responsible for opportunistic infection10,11,12,13. In a few patients, M. phlei was isolated from the respiratory tract, as was the case for strain M2614,15. In the situation where a clinical non-tuberculous clinical strain M26 exhibited an uncertain phylogenetic position, whole genome sequencing analysis unambiguously confirmed that it was a representative of a previously undescribed new species most closely related to M. phlei, with a 76.6% Average Nucleotide Identity (ANI).

Figure 5: Aligned whole genomes phylogenomic tree based on Mauve alignment identity matrix.
figure 5

ANI, Average Nucleotide Identity.

Table 2 Mycobacterial genomes used in pangenomic analysis.
Figure 6: Pangenomic analysis and functional annotation of 33 Mycobacterium genome species.
figure 6

(A) Pangenome distribution (B) COGs distribution of unique genes and genes absent in M26 strain genome.

The unique phenotypic and genetic characteristics of strain M26 support the fact that it is representative of a hitherto undescribed species in the genus Mycobacterium. Strain M26T (=CSUR P1385T = DSM 46850T) is the type strain of “Mycobacterium massilipolynesiensis” sp. nov. [masilipɔlinezjɛnsis], whose name is composed of Massilia, the Latin name for Marseilles, France, and Polynesia, referring to French Polynesia, where it was isolated. It is proposed to be a novel species belonging to the Mycobacterium genus, based upon rpoB discrimination within the nontuberculous rapidly-growing mycobacteria group and its genome features. A remarkable genomic Mycobacterium massilipolynesiensis feature is the large number of genes assigned to the transport and metabolism of lipids and secondary metabolite biosynthesis, transport and catabolism. Older suggestion that a new bacterial species be reported on the basis of several isolates16,17 would have prevented to draw attention of microbiologists to this new species. Alternatively, its free availability in public culture collection will allow microbiologists to report on additional isolates in order to precise the potential of M. massilipolynesiensis as an opportunistic pathogen; on the wave of culturomics hundreds of new bacterial species with only one cultured representative18,19.

Methods

To further characterize strain M26, growth was observed after sub-culture on Middlebrook 7H10 solid agar (Becton Dickinson, Le Pont de Claix, France) at 28–42 °C for two weeks. The development of colonies was inspected daily by the naked eye. Optical microscopy was performed using a Leica DM 2500 microscope (Leica, Wetzlar, Germany) after Gram staining and Ziehl-Neelsen staining and pictures were taken using a Nikon Digital Sight DS-U1 Camera System (Nikon, Tokyo, Japan). Transmission electron microscopy (Morgani 268D; Philips, Eindhoven, The Netherlands) was used to determine the size of the microorganisms after negative staining. Carbon source utilization and enzyme activities were determined by inoculating a API® Coryne strip (bioMérieux) as indicated by the manufacturer. For Tween 80 hydrolysis, two drops of Tween 80 reagent (Sigma-Aldrich, L’Isle-d’Abeau, France) and neutral red as indicator were mixed with 1 mL of sterile distilled water into a screw-cap tube. The tube was inoculated with a loopful of the microorganism and incubated at 37 °C in the dark with the cap tight. The tube was visually inspected daily up for to 10 days of incubation. Niacin production was detected using BBL ™Taxo ™TB Niacin test strips (Becton Dickinson) as described by the manufacturer. Salt tolerance of strain M26 was tested by supplementing the Middlebrook 7H10 solid medium with 0–5% NaCl. Matrix-assisted laser-desorption/ionization time-of-flight mass spectrometry (MALDI-TOF- MS) protein analysis was carried out as previously described20. Drug susceptibility was tested using the broth microdilution method in Middlebrook 7H9 medium (Becton Dickinson) as previously described21.

Genome sequencing and assembly

DNA was extracted from strain M26 and cultured for seven days in one MGIT medium (Becton Dickinson) tube supplemented with 0.8 mL of Oleic Albumin Dextrose Catalase (OADC) enrichment (Becton Dickinson) at 37 °C. The inoculated broth was then centrifuged at 8,000 g for 10 min and the pellet was resuspended in 160 μL of G2 buffer (EZ1 DNA Tissue kit, Qiagen, Courtaboeuf, France), mixed with 40 μL of lysozyme at 25 mg/mL (Euromedex, Strasbourg, France). DNA was extracted using a BioRobot EZ1 Advanced XL (Qiagen). Genomic M26 DNA was sequenced on the MiSeq Technology (Illumina Inc, San Diego, CA, USA) with both paired-end and mate-pair applications. The gDNA was quantified by a Qubit assay with the high sensitivity kit (Life Technologies, Carlsbad, CA, USA) to 95.1 ng/μL and was barcoded respectively in order to be mixed with 12 other projects with the Nextera Mate Pair sample prep kit (Illumina) and with 15 others projects with the Nextera XT DNA sample prep kit (Illumina). To prepare the paired-end library, dilution was performed to require 1ng of each genome as input. The “tagmentation” step fragmented and tagged the DNA. Limited cycle PCR amplification (12 cycles) completed the tag adapters and introduced dual-index barcodes. After purification on AMPure XP beads (Beckman Coulter Inc, Fullerton, CA, USA), the libraries were normalized on specific beads according to the Nextera XT protocol (Illumina). Normalized libraries were pooled for sequencing on the MiSeq. The pooled single strand library was loaded onto the reagent cartridge and then onto the instrument along with the flow cell. Automated cluster generation and paired end sequencing with dual index reads were performed in a single 39-hour run in 2 × 251-bp. Total information of 5.9 Gb was obtained from a 643 K/mm2 cluster density with cluster passing quality control filters of 91.8% (12,365,000 clusters). Within this run, the index representation for M26 was 6.43%. The 729,975 paired-end reads were trimmed and filtered according to the read qualities. Two separated mate-pair libraries were prepared with 1.36 and 1.5 μg of genomic DNA using the Nextera mate pair Illumina guide. The genomic DNA sample was simultaneously fragmented and tagged with a mate pair junction adapter. The fragmentation profiles were validated on an Agilent 2100 BioAnalyzer (Agilent Technologies Inc, Santa Clara, CA, USA) with a DNA 7500 labchip. The size of the DNA fragments ranged from 1 to 11 kb with an optimal size at 4.58 and 5.27 kb, respectively. No size selection was performed and 503 and 600 ng respectively of tagmented fragments were circularized. The circularized DNA was mechanically sheared to small fragments with an optimal at 871 bp on the Covaris device S2 in microtubes (Covaris, Woburn, MA, USA). The libraries profiles were quantified and visualized on a High Sensitivity Bioanalyzer LabChip (Agilent Technologies Inc). The libraries were normalized at 2 nM, pooled, denatured and diluted at 15 pM for sequencing. One M26 library was loaded in two different 2 × 251-bp runs (at 506 K/mm2 cluster density with cluster passing quality control filters of 97%, the index representation was determined at 5.71%, at 852 K/mm2 cluster density with cluster passing quality control filters of 94.53%, the index representation was 4.04%). The second library was loaded and the run performed at 587 K/mm2 cluster density with cluster passing quality control filters of 96.3% the index representation was 7.44%). These three runs in mate-pair strategy generated a total of 2,004,521 pass filter paired-reads for M26. Illumina reads were trimmed using Trimmomatic22, then assembled using Spades software23,24. Contigs were combined together by SSpace25 and Opera software v1.226, helped by GapFiller V1.1027 to reduce the set. Some manual refinements using CLC Genomics v7 software (CLC bio, Aarhus, Denmark) and homemade tools made in Python improved the genome. Finally, the draft genome of strain M26 consists into nine scaffolds. The replication origin in strain M26 genome was predicted using Ori-Finder5,6 (http://tubic.tju.edu.cn/Ori-Finder/) and homology with other OriC regions was predicted using blast algorithm in DoriC database7 (http://tubic.tju.edu.cn/doric/). The 16 S rRNA gene sequence of strain M26 was deposited in GenBank under accession number LN909503 and its complete genome under accession number FAXD00000000.

Genome annotation

Non-coding genes and miscellaneous features were predicted using RNAmmer28, ARAGORN29, Rfam30, PFAM31, and Infernal32. Coding DNA sequences (CDSs) were predicted using Prodigal33 and functional annotation was achieved using BLAST+34 and HMMER335 against the UniProtKB database36. To refine the search of mobile genetic elements in the genome of strain M26, PHAST software predicted prophage37 completed by the Actinobacteriophage Database at phageDB.org and the online plasmid search tool: https://plasmid.med.harvard.edu/PLASMID/Home.xhtml. Phylogenomic tree was constructed using an identity matrix based on Mauve program genome alignment38. Average Nucleotide Identity values were calculated between strain M26 and sequenced Mycobacterium species genomes using BLASTN and a home-made software, following the algorithm described by Goris et al.39. The genomes of 32 Mycobacterium species were downloaded from Genbank (Table 2). The ORFeome of these species, in addition to M26 genome, were annotated using Prodigal33 in order to normalize ORFs predictions. The Pangenomic analyses were computed using OrthoMCL software with >60% amino acid identiy in all-against-all BLASTP algorithm40.

Additional Information

How to cite this article: Phelippeau, M. et al. “Mycobacterium massilipolynesiensis” sp. nov., a rapidly-growing mycobacterium of medical interest related to Mycobacterium phlei. Sci. Rep. 7, 40443; doi: 10.1038/srep40443 (2017).

Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.