Use of next generation sequence to investigate potential novel macrolide resistance mechanisms in a population of Moraxella catarrhalis isolates

Although previous studies have confirmed that 23S rRNA gene mutations could be responsible for most macrolide resistance in M. catarrhalis, a recent study suggested otherwise. Comparative genomics based on next-generation sequencing has revolutionized the search for potential novel drug resistance mechanisms. In this study, two pairs of resistant and susceptible M. catarrhalis isolates with different multilocus sequence types were investigated for potential differential genes or informative single nucleotide polymorphisms (SNPs). The identified genes and SNPs were then evaluated in 188 clinical isolates. From an initial selection of 12 differential genes and 12 informative SNPs, 10 differential genes (mboIA, mcbC, mcbI, mboIB, MCR_1794, MCR_1795, lgt2B/C, dpnI, mcbB, and mcbA) and 6 SNPs (C619T of rumA, T140C of rplF, G643A of MCR_0020, T270G of MCR_1465, C1348A of copB, and G238A of rrmA) were identified as possibly linked to macrolide resistance in M. catarrhalis. Most of the identified differential genes and SNPs are related to methylation of ribosomal RNA (rRNA) or DNA, especially MCR_0020 and rrmA. Further studies are needed to determine the function and/or evolution of the identified genes and SNPs, and to establish whether novel or combined mechanisms are truly involved in macrolide resistance in M. catarrhalis.

Findings from a recent study in Japan revealed that M. catarrhalis strains with high-level macrolide resistance also exhibit mutations in ribosomal proteins L4 (V27A and R161C) and L22 (K68T) [14]. Interestingly, and in contrast to M. catarrhalis, mutation of the 23S rRNA gene is usually not the main cause of macrolide resistance in other bacterial species (such as Streptococcus pneumoniae and Streptococcus pyogenes) [15][16][17]. Moreover, multilocus sequence typing (MLST) results from our previous studies showed a very high level of heterogeneity among M. catarrhalis isolates [10,13]. Given this background, it is possible that several mechanisms are involved in M. catarrhalis macrolide resistance.
To investigate potential novel mechanisms involved in macrolide resistance in M. catarrhalis, with special emphasis on new relevant genes or informative single nucleotide polymorphisms (SNPs), we studied two macrolide-resistant and two macrolide-susceptible M. catarrhalis isolates in detail using genomic sequencing. The aim was to screen for other possible resistance genes or mutations (apart from 23S rRNA gene mutation) responsible for macrolide resistance in M. catarrhalis, and to confirm the findings in a large collection of clinical isolates.
Firstly, we intended to gain further insight into whether the A2330T mutation is solely responsible for macrolide resistance in M. catarrhalis, or whether other mechanisms, including methylases, efflux pumps, or other genes or mutations [14], alone or in combination, are involved. Secondly, given that macrolide-resistant M. catarrhalis differ so much from other macrolide-resistant cocci, we assumed that macrolide resistance in M. catarrhalis may be associated with the distinct genomic background of this organism.
To answer these two questions, comparative genomics and multiple molecular typing methods for population genetics were used in this study. Based on the genome-wide data of two pairs of macrolide-susceptible and macrolide-resistant isolates (n = 4), and further evaluation in 188 clinical isolates, we found that six informative SNPs and ten differential genes possibly contribute to macrolide resistance in M. catarrhalis.

Materials and Methods
Statement. All  In addition, 21 macrolide-resistant M. catarrhalis isolates (including 11XR4410 and 13R13685) and 167 macrolide-susceptible M. catarrhalis isolates (including 11XR1696 and 13R13726) from our previous study [13] were randomly selected to evaluate the comparative genomic results (Table 1).
Antimicrobial susceptibility testing. As previously published, all isolates (n = 188) were tested for susceptibility to erythromycin and azithromycin using the disc diffusion method (Thermo Fisher, Oxoid, Basingstoke, UK) according to the CLSI 2010 guideline [18]. The macrolide-resistant M. catarrhalis isolates were then confirmed by the E-test method (bioMérieux, Marcy l'Etoile, France), which also provided the minimum inhibitory concentrations (MICs). Staphylococcus aureus ATCC 25923 was used for quality control.
DNA extraction. Isolates were grown overnight at 35 °C on blood agar plates, and DNA was extracted using the QIAamp DNA Mini Kit (Qiagen, Dusseldorf, Germany) following the manufacturer's instructions.
Pulsed-field gel electrophoresis (PFGE) and copB polymerase chain reaction-restriction fragment length polymorphism (PCR-RFLP) analysis were performed on the four isolates as previously described [10].
Next-generation genomic sequencing (NGS). Genome sequencing was performed on the two pairs of M. catarrhalis isolates (one pair susceptible and the other resistant). DNA libraries were constructed from genomic DNA using kits provided by Illumina Inc. according to the manufacturer's instructions. Libraries with an insert size of 500 bp were prepared for each isolate. DNA manipulation steps, including formation of single-molecule arrays and cluster growth, and paired-end sequencing were performed on an Illumina HiSeq 2500 sequencer according to standard protocols. The Illumina base-calling pipeline was used to process the raw fluorescent images and call sequences. Raw reads of low quality from paired-end sequencing (those with consecutive bases covered by fewer than five reads) were discarded. The BioProject accession number for the four isolates (11XR4410, 11XR1696, 13R13726 and 13R13685) is PRJNA338378.
Differential gene definition in macrolide resistant and susceptible groups. The paired-end reads from each of the four genome-sequenced isolates were mapped to the previously published M. catarrhalis BBH18 reference genome (GenBank accession number: CP002005.1) and to plasmid pLQ510 of M. catarrhalis isolate E22 (GenBank accession number: NC_011131.1) using the Burrows-Wheeler Aligner (BWA) software. Nucleotide base coverage of each gene on the BBH18 and plasmid pLQ510 genomes was assessed for each isolate using Samtools mpileup (http://samtools.sourceforge.net/). Based on the Samtools mpileup results, the average coverage of each gene was calculated. Genes were considered differential if they were present in one group and absent in the other or, if present in both, showed significantly (P < 0.05) different levels of coverage between the resistant and susceptible groups.
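The per-gene coverage step described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names, the per-position depth dictionary (as might be parsed from mpileup output) and the `min_cov` presence cutoff are all hypothetical, and the significance test behind the P < 0.05 criterion is not specified in the text, so only the presence/absence rule is shown.

```python
def mean_gene_coverage(depth_by_pos, start, end):
    """Average read depth over a gene spanning [start, end] (1-based, inclusive).

    depth_by_pos maps a reference position to its read depth; positions
    missing from the mapping are counted as zero coverage.
    """
    length = end - start + 1
    total = sum(depth_by_pos.get(pos, 0) for pos in range(start, end + 1))
    return total / length


def is_present(coverage, min_cov=5.0):
    # min_cov is a hypothetical cutoff chosen for illustration only;
    # the study does not state the threshold it used.
    return coverage >= min_cov


def presence_absence_differential(cov_resistant, cov_susceptible, min_cov=5.0):
    """True when a gene is present in every isolate of one group and
    absent from every isolate of the other group."""
    res = [is_present(c, min_cov) for c in cov_resistant]
    sus = [is_present(c, min_cov) for c in cov_susceptible]
    return (all(res) and not any(sus)) or (all(sus) and not any(res))


# Toy example: a 4-bp "gene" with one uncovered position.
depths = {100: 30, 101: 32, 102: 28}
print(mean_gene_coverage(depths, 100, 103))  # 22.5
```

With only two isolates per group, a presence/absence rule of this kind is easy to check by eye; any statistical comparison of coverage levels would need the test and its assumptions to be stated explicitly.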
For SNPs, the number of nucleotide bases of a test isolate that matched the reference genome (ref) and the number that differed from it (alt) were determined for each isolate. High quality SNPs were defined as SNPs that satisfied one of the following criteria: alt/(alt + ref) > 0.95 (the base differs from the reference genome) or alt/(alt + ref) ≤ 0.05 (the base matches the reference genome). A high quality SNP which satisfied the criterion alt/(alt + ref) > 0.95 in at least one isolate was considered to be an informative candidate SNP.
PCR screening of differential genes. PCR was performed on 188 M. catarrhalis isolates (21 macrolide resistant and 167 macrolide susceptible) collected at Peking Union Medical College Hospital (PUMCH) between 2010 and 2013, to detect the differential genes identified as defined above: mboIA, mcbC, mcbI, mboIB, MCR_1794, MCR_1795, lgt2B/C, dpnI, mcbB, MCR_0360, MCR_0361, and mcbA (see Table 3 for primer sequences and Supplementary Table S1 for a full description of the genes). A detailed flow chart of the study is shown in Fig. 1.
Table 2. MLST allelic profiles, sequence types (ST) and copB types of the four sequenced isolates.
Isolate   abcZ  adk  efp   fumC  glyBeta  mutY  ppa  trpE  ST         copB type
13R13726  61 a  28   3     3     14       36    17   2     327 b      II
13R13685  8     3    2     2     17       15    8    2     NP-ST-5 b  II
11XR1696  3     22   34 a  9     8        3     8    2     312 b      II
11XR4410  3     3    2     2     17       15    3    2     NP-ST-4 b  II
PCR and sequencing analysis of the informative candidate SNPs. PCR was performed to detect the rumA, rplF, MCR_0016, MCR_0020, MCR_1465, copB and rrmA genes (see Table 3) [10]. Owing to budget constraints, only 73 of the 188 isolates were tested for the presence of the identified informative candidate SNPs (Fig. 1). DNA sequencing was performed on the 73 isolates using the same primers used for PCR amplification, providing bidirectional coverage. The obtained sequences were aligned to those of the wild-type reference M. catarrhalis strain BBH18 (GenBank accession number NR_103214.1).
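The alt/(alt + ref) criteria used to define high quality and informative candidate SNPs can be expressed as a short sketch. The function names are hypothetical, and the code assumes that the ratio is evaluated per site and per isolate and that a site must give a high quality call in every isolate; the text does not state either point explicitly.

```python
def classify_call(ref_count, alt_count):
    """Classify one site in one isolate from its ref/alt base counts.

    Returns "alt" if alt/(alt+ref) > 0.95, "ref" if alt/(alt+ref) <= 0.05,
    and None for ambiguous or uncovered sites.
    """
    total = ref_count + alt_count
    if total == 0:
        return None  # no coverage, so no call can be made
    fraction = alt_count / total
    if fraction > 0.95:
        return "alt"
    if fraction <= 0.05:
        return "ref"
    return None


def is_informative_candidate(counts_per_isolate):
    """counts_per_isolate is a list of (ref_count, alt_count), one per isolate.

    Assumption: the site must be high quality in every isolate and carry
    the alternative base in at least one of them.
    """
    calls = [classify_call(ref, alt) for ref, alt in counts_per_isolate]
    if any(call is None for call in calls):
        return False
    return "alt" in calls


# Toy example with four isolates: two carry the alternative base.
site = [(0, 40), (38, 0), (41, 0), (0, 37)]
print(is_informative_candidate(site))  # True
```

Sites where an isolate falls between the two thresholds are simply dropped here, which mirrors the strictness of the stated criteria but may discard true variants in low-coverage regions.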

Results
General genome features of the studied M. catarrhalis isolates. Detailed      of analysis), ST312, and ST327. Furthermore, two novel sequence variants, abcZ 61 and efp 34, were present in strains 13R13726 and 11XR1696, respectively (Table 2). Isolate 13R13685 was broadly similar to 11XR4410, differing at only two loci (abcZ and ppa). All four isolates belonged to copB type II. PFGE analysis was used to determine the clonal relationship of the four isolates; four pulsotypes were found (data not shown), suggesting that the isolates originated from different clones.
Comparison of the frequency distribution of informative candidate SNPs in the two groups (resistant, n = 21 vs. susceptible, n = 52) using a χ² test indicated that the frequencies of the A2144T, A2330T, and C2480T mutations of 23S rRNA, C619T of rumA, T140C of rplF, G643A of MCR_0020, T270G of MCR_1465, C1348A of copB, and G238A of rrmA differed significantly between the two groups (p < 0.05). In contrast, the frequencies of the A1249G mutation of MCR_0016, G695A of MCR_1465, and A1205C of copB were very similar in the two groups (p > 0.05) (Table 5).
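As an illustration of the group comparison described above, the following sketch computes a Pearson χ² test for a 2 × 2 table (SNP present or absent, by resistant or susceptible group). It is a hedged example rather than the authors' analysis: the function name is hypothetical, no continuity correction is applied (the text does not say whether one was used), and the counts in the usage line are invented for demonstration and are not data from the study.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (1 degree of freedom, no continuity
    correction) for the table
        [[a, b],    e.g. resistant:   SNP present, SNP absent
         [c, d]]         susceptible: SNP present, SNP absent
    Returns (statistic, p_value).
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    denom = row1 * row2 * col1 * col2
    if denom == 0:
        raise ValueError("a row or column total is zero")
    stat = n * (a * d - b * c) ** 2 / denom
    # With 1 degree of freedom the statistic is a squared standard normal,
    # so the upper-tail probability is erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Invented counts for illustration only (21 resistant, 52 susceptible).
stat, p = chi_square_2x2(18, 3, 5, 47)
print(f"chi2 = {stat:.2f}, significant at 0.05: {p < 0.05}")
```

When expected cell counts are small, as can happen with only 21 resistant isolates, an exact test is often preferred over the χ² approximation.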

Discussion
Based on our literature review, very few studies have examined the molecular mechanisms involved in M. catarrhalis macrolide resistance. In most studies, 23S rRNA gene mutation is singled out as being responsible for the majority of cases of macrolide resistance [10,11]. However, whether 23S rRNA gene mutation is the only mechanism leading to macrolide resistance in M. catarrhalis remains unknown, and the existence of macrolide-resistant strains without any 23S rRNA gene mutations suggests that it is not.
Comparative genomics is a practical tool which has been widely used in the study of drug resistance mechanisms [21]. Specifically, if the isolates to be compared are derived from the same patient and have a similar genetic background, comparative genomics can provide important information, including the discovery of novel mechanisms. Unfortunately, in our study, we could not find sufficient numbers of M. catarrhalis isolates with the same MLST or pulsed-field gel electrophoresis types, as there was considerable genetic diversity among the isolates. As a result, not all of the SNPs and genes identified in this study can be considered to be highly associated with macrolide resistance, and we therefore had to confirm some of them in a large collection of clinical isolates.
Based on the comparative genomics results combined with evaluation in a large collection of clinical isolates, some genes and SNPs possibly involved in macrolide resistance were identified (Figs 1 and 5). Most of the identified genes and SNPs are related to the methylation of ribosomal RNA (rRNA) or DNA, especially MCR_0020 and rrmA (Supplementary Table S1). Owing to the limited literature on the M. catarrhalis genome and on its resistance to macrolides, some of the genes (such as mboIA and mboIB) and informative candidate SNPs (such as C619T of rumA and T140C of rplF) identified in this study (Supplementary Table S1) have not been previously reported. It is unclear how methylation of rRNA or DNA is associated with macrolide resistance in M. catarrhalis. Further investigation is therefore needed, including the function of each gene, the crystal structure of the protein involved, and how these relate to macrolide resistance.
Furthermore, we analyzed the distribution of the 12 identified differential genes and 12 SNPs in the two groups (resistant, n = 21 vs. susceptible, n = 52) in order to find possible gene combinations associated with macrolide resistance (Fig. 5). Among the 12 candidate SNPs, 5 (23S rRNA_A2144T, 23S rRNA_A2330T, 23S rRNA_C2480T, MCR_0020_G643A, MCR_1465_T270G) were detected only in the resistant group, whilst among the differential genes, only one (lgt2B/C) was detected exclusively in the susceptible group. The remaining SNPs and differential genes were detected in both groups. In addition, among the remaining SNPs, the C619T of rumA, T140C of rplF, and G238A of rrmA mutations were always found together in the resistant group, whereas no definite pattern was apparent in the susceptible group. Moreover, the distribution frequencies of the mcbB, mcbC and mcbI genes were similar between the two groups, as were those of the mboIA and mboIB genes and of the MCR_1794 and MCR_1795 genes. These findings suggest that the genes within each of these sets may be functionally linked in the cell.
Interestingly, the 12 identified SNPs and 12 differential genes could also be used to differentiate individual isolates, even two isolates that shared the same MLST sequence type, such as strains xm21 (NP-ST-3) and c17 (NP-ST-3).
Based on the above results, we surmise that the molecular mechanism of macrolide resistance might not be as simple as previously thought [10][11][12][13][14], and that some genes or SNPs (such as MCR_0020 and rrmA) might be involved in this process. Many of the identified genes are related to the methylation of ribosomal RNA (rRNA) or DNA and may, alone or in combination with one another or with 23S rRNA gene mutation, be responsible for macrolide resistance in M. catarrhalis. However, the functions of these identified genes or SNPs, and the crystal structures of their translated proteins, are still unknown. We consider these findings to be exploratory and hypothesis generating; they require confirmation in future studies.
This study has several limitations. First, the four isolates used as the main anchor of the study were chosen arbitrarily. It is possible that a different set of M. catarrhalis isolates would have yielded different candidate differential genes or informative SNPs. Second, the Samtools pileup pipeline can yield fewer SNPs than other pipelines, so some SNPs potentially related to macrolide resistance might have been missed. Third, the selection of candidate differential genes and informative candidate SNPs was mainly based on a review of the relevant literature and further evaluation in 188 clinical isolates; some relevant genes and SNPs may have been overlooked because of the limited bioinformatic analysis. Fourth, none of the 10 annotated differential genes was assessed for expression under normal culture conditions and/or in the presence of macrolides. Finally, we cannot rule out the involvement of the other 88 hypothetical proteins and 1862 SNPs not discussed here, which may include potentially important macrolide resistance determinants. More studies are needed to fully understand the mechanism of macrolide resistance in M. catarrhalis. Our limited budget prevented us from carrying out more detailed studies.