Estimation of the mutation rate of Mycobacterium tuberculosis in cases with recurrent tuberculosis using whole genome sequencing

Comín, Jessica; Cebollada, Alberto; Samper, Sofía

doi:10.1038/s41598-022-21144-0

Article
Open access
Published: 06 October 2022

Jessica Comín¹, Alberto Cebollada², Aragonese Working Group on Molecular Epidemiology of Tuberculosis (EPIMOLA) & Sofía Samper¹,³,⁴

Scientific Reports volume 12, Article number: 16728 (2022)

An Author Correction to this article was published on 27 February 2023

This article has been updated

Abstract

The study of tuberculosis latency is problematic because of the difficulty of isolating the bacteria in the dormant state. Despite this, several in vivo approaches have been taken to mimic the latency process. Our group has studied the evolution of the bacteria in 18 cases of recurrent tuberculosis. We found that HIV positive patients develop recurrent tuberculosis earlier, generally in the first two years (p value = 0.041). The genomes of the 36 paired Mycobacterium tuberculosis isolates (first and relapsed isolates) showed that none of the SNPs found within each pair was observed more than once, indicating that they were not directly related to the recurrence process. Moreover, some IS6110 movements were found in the paired isolates, indicating the presence of different clones within the patient. Finally, our results suggest that the mutation rate remains constant throughout the period, as no correlation was found between the number of SNPs and the time to relapse.

Introduction

Mycobacterium tuberculosis has afflicted and co-evolved with man over thousands of years. Its success is due to its ability to infect a host and persist in a dormant state for years¹,². During this period, the host is asymptomatic and not infectious, which makes this state very hard to study. The bacteria stay in the granuloma, a barrier made by the cells of the immune system, until poorly characterised signals or a weakening of the immune system allow the bacteria to escape and cause active disease³. In recent years, some in vitro approaches have been designed to increase the knowledge of latency⁴, and a debate about the physiological state of M. tuberculosis during this period has emerged. The number of mutations accumulated during a time period can be used as a molecular clock to study the evolution of the pathogen⁶,⁷,⁸,⁹. Two studies reported apparently contradictory results. Ford et al.¹⁰ used bacteria isolated from macaque lesions that mimic those of latent tuberculosis (TB) infection and concluded that the generation time for latent TB would be similar to that for active TB, so the bacteria would be physiologically active. On the other hand, Colangeli et al.¹¹, studying an outbreak in New Zealand, concluded that generation times during latency are longer than during active TB. Recently Colangeli et al.¹², pairing index cases with their TB contacts as an approach to latency, concluded that both studies were correct: during the first two years of latency, the generation time is similar to that of the active disease, while later it increases for long periods of time and a reduced mutation rate is observed.

Looking for a novel in vivo approach, we analysed isolates from individuals known to have developed several episodes of active TB in order to study the evolution of the bacteria during the period between these episodes. Since 2004, all strains of M. tuberculosis have been genotyped in Aragon, Spain, which allowed us to identify the cases of TB relapse. The DNA previously used for genotyping remained in storage and could be used for whole genome sequencing (WGS) to analyse the variability of the isolates in the different episodes of the disease.

Results

Patient selection and risk factors

A search for recurrent TB among all cases in Aragon from 2004 until 2019 identified 127 patients (4.97%). Genotyping of the isolates revealed that 114 patients were infected by strains with the same or a very similar RFLP pattern, which would imply a potential relapse, while 13 patients were infected with non-related strains, i.e., re-infections, so these were not considered for this study. Among the potential relapses, we selected the cases with at least one year between episodes. Eighty-one patients had isolates with less than one year between them, so this group was discarded. Based on the time between the isolates, the remaining cases were split into two groups: cases with ≥ 1 year but ≤ 2 years between isolates (12 patients), and cases with more than two years between the diagnosis of their isolates (21 patients). Eighteen pairs of M. tuberculosis isolates with available DNA from at least two different episodes were studied: twelve patients in the > 2 years group and six patients in the ≥ 1 but ≤ 2 years group (Fig. 1). The lineages of the selected isolates are shown in Fig. 2.

Several of the selected patients had risk factors for developing TB. At least 38.9% were HIV+, 22.2% declared a high alcohol consumption, 38.9% were smokers and 16.7% were intravenous (IV) drug users. All patients received the standard treatment for susceptible TB. Although some of them did not follow it correctly, no drug resistance developed. We split the patients into two groups, [1–2 years until relapse] and (2–14 years until relapse], in order to investigate whether any of these risk factors was related to a shorter or longer relapsing period (time between the first and the second episode). The intervals were fixed according to the results obtained by Colangeli et al.¹², with 160 months being the maximum time between episodes observed in our study. Results are shown in Table 1.

Table 1 Risk factors of the 18 selected cases.

HIV status differed significantly between the two groups (p value = 0.035), showing that HIV positive patients suffered relapse in the first two years more frequently than HIV negative patients.

Analysis of the genomes

SNPs versus relapsing period

The number of SNPs between the first isolate and its corresponding relapsed isolate ranged from 0 to 8. These SNPs were usually in the relapsed isolates, but we also found some in the earliest isolates that were absent from the relapsed ones, showing that different clones co-existed in the patient. The mutation effect of the SNPs and the functional categories of the affected genes were analyzed; 26.3% belonged to the cell wall and cell processes category and 21.1% to the intermediary metabolism and respiration category. None of them was present in more than one case, so they do not seem to be directly implicated in relapse. The detailed SNPs can be found in Table 2.

Table 2 SNPs found between the first and the relapsed paired isolates.

To relate the number of SNPs that arose between the first and the relapsed isolates of the same patient to the time between the diagnosis of both isolates (in months), which serves here as a proxy for the latency period, we reproduced the analysis of Colangeli et al.¹² using the Poisson regression model. Like Colangeli et al. 2020, we found this correlation not significant (p value = 0.34), meaning that the isolates with a longer relapsing period were not necessarily those with more SNPs. The results are shown in Fig. 3. In addition, we observed that pairs of L4.1 had a higher mutation rate per genome per year (0.93 SNPs) than other sublineages (0.58 SNPs in L4.8 and 0.32 in L4.3), which is above the average found in this study (0.64 SNPs).
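As a rough illustration of the per-genome, per-year figures quoted above, the following Python sketch computes a pooled rate per sublineage. The text does not state whether the reported averages were pooled or averaged per pair, so pooling is an assumption here, and the pairs listed are hypothetical values, not data from Table 2.

```python
def snps_per_genome_year(pairs):
    """Pooled rate: total SNPs divided by total years between isolates.

    `pairs` is a list of (snp_count, months_between_isolates) tuples.
    Pooling (rather than averaging per-pair rates) is an assumption.
    """
    total_snps = sum(snps for snps, _ in pairs)
    total_years = sum(months for _, months in pairs) / 12.0
    return total_snps / total_years


# Hypothetical (SNP count, months between isolates) pairs by sublineage.
pairs_by_sublineage = {
    "L4.1": [(3, 24), (1, 36)],
    "L4.3": [(1, 48), (0, 12)],
}

rates = {name: snps_per_genome_year(p) for name, p in pairs_by_sublineage.items()}
for name, rate in rates.items():
    print(f"{name}: {rate:.2f} SNPs per genome per year")
```

A pooled estimate weights long follow-up intervals more heavily than short ones, which tends to make it less sensitive to single pairs with very short intervals.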

Mutation rate versus generation time

In order to analyse the correlation between the mutation rate and the relapsing period, we used the Poisson model, as described by Colangeli et al.¹². The generation time was fixed at 18 h, as seen in M. tuberculosis actively replicating in vitro. Results are shown in Fig. 4. The mutation rate tended to decrease with longer relapsing periods, a trend that was marginally significant (p value = 0.061).

When the data were split into relapsing periods of [1–2] and (2–14] years, the mutation rate was slightly lower for the second period, estimated at 2.728 × 10⁻¹⁰ [95% confidence interval (CI): 1.433 × 10⁻¹⁰, 5.193 × 10⁻¹⁰] mutations per (bp × generation), than for the first period, estimated at 2.798 × 10⁻¹⁰ [95% CI: 1.209 × 10⁻¹⁰, 6.477 × 10⁻¹⁰]. However, this difference was not statistically significant (p value = 0.96). Results are shown in Fig. 5.

IS6110 copies variation between the isolates

All the IS6110 copies of the isolates were analyzed using the WGS data. We found several IS6110 movements between the first and the relapsed isolates, which we attribute to the presence of different clones. The relapsed strain gained extra IS6110 copies in P7 (two copies gained) and P12 (one copy gained). On the other hand, we also observed four cases in which the relapsed isolate had lost some IS6110 copies that were present in the first isolate: P10 (one copy lost), P13 (three copies lost), P14 (one copy lost) and P15 (one copy lost). These extra and absent copies usually had a lower number of reads than the fixed copies, indicating that they had not spread through the whole bacterial population. All these movements were observed in strains with more than 2 years between the isolates, except for the P14 pair (14 months). The IS6110 locations can be found in Table S1.

Discussion

Studying M. tuberculosis latency in humans is difficult because the dormant bacteria cannot be isolated until active disease develops. Much has been published regarding latent TB and the percentages of reactivation and disease, but latency in patients who have already had the disease has not been studied. Different approaches have been used to mimic this process¹⁰,¹¹,¹². This work shows, for the first time, results obtained using isolates from patients with recurrent TB. Aragon, a region in the north of Spain, has a low incidence of TB. Thanks to the surveillance protocol carried out in this region since 2004, all M. tuberculosis isolates are genotyped and registered, which makes it possible to trace the clinical TB history of the patients. Around 5% of the TB cases in our population correspond to recurrent TB. Of them, 89.8% were TB cases with isolates showing identical IS6110-RFLP patterns, indicating a potential relapse. Most of them (71%) were later considered treatment failures. In contrast, 10.2% of the patients had isolates with different genotypes, considered as reinfections. Among the total of TB cases in our community, reinfection occurs in 0.5%, reflecting that reinfection is uncommon in our population. These data agree with a previous study in the Madrid population, which showed 87.5% relapses and 12.5% reinfections among the cases with recurrent TB¹³. However, a study in the Canary Islands showed a higher reinfection percentage (44%) versus 55% relapses¹⁴. A more extreme result was obtained in a study in London, in which 72.6% of the patients with repeated episodes were classified as reinfections against 27.4% relapses¹⁵. The large variation among the different studies suggests that the results largely depend on the population sample studied. It would be very interesting to analyse the reinfection cases in each of the studies to understand the reasons for these differences.

In regions where TB is endemic, a higher percentage of recurrent TB has been found. Around 9.5% of TB patients had recurrent TB in Malawi (39.6% had relapse and 14.4% reinfection; the rest were undetermined)¹⁶, and a study carried out in India showed that the majority of relapses occurred among HIV negative people (95% of TB recurrences) while the majority of reinfections occurred among HIV positive people (75% of TB recurrences)¹⁷.

Regarding the epidemiological and risk factors of the relapsed TB cases studied, we found that relapse occurred significantly earlier in HIV positive patients (in the first two years after the first episode) than in HIV negative patients (p value = 0.035), which is consistent with a compromised immune system. Colangeli et al.¹² found no risk factor significantly associated with earlier reactivation; however, they acknowledged that the clinical cases they studied generally had no comorbidity.

The number of SNPs between the pairs ranged from 0 to 8. Remarkably, three of the 18 pairs had more than 5 SNPs between the first and the relapsed isolate, a distance usually interpreted as ruling out recent transmission¹⁸, even though the bacteria were isolated from the same patient. This could be related to clinical characteristics of the patients, such as immunosuppression, HIV status or treatment adherence. Surprisingly, several SNPs were found in the first isolates that were absent in the relapsed isolates, as if they had reverted. This phenomenon was extreme in P12, in which six out of the seven SNPs found were absent in the relapsed isolate. The explanation could be the presence of different clones in the patient¹⁹,²⁰. In this way, a different clone would have been isolated in each disease episode, resulting from different bottlenecks and selective pressures acting on the original strain²¹,²². Reinfection with an identical strain has been described as a limitation of this kind of study, but in our case it can be discarded, as only one of the pairs belonged to a large endemic cluster (P4, with 0 SNPs). The rest of the pairs were infected with orphan or small-outbreak strains of up to four cases, unlike other studies with large endemic clusters and high TB prevalence²².

Like Colangeli et al.¹², we did not find a significant correlation between the number of SNPs and the time between episodes. However, it is possible that P8 (160 months between episodes and 0 SNPs) is altering the trend of SNP accumulation as the time between episodes increases. This is one limitation of working with a small sample size: a single point can have a great impact on the results. None of the SNPs found seemed related to recurrence, as all were unique and therefore not common to more than one pair of isolates. It has been described that 0.5 SNPs per genome per year is the standard mutation rate for M. tuberculosis¹⁰. Some studies, in which multiple MDR/XDR isolates coming from the same patients were sequenced, have reported that selective pressure and antibiotic resistance can increase this mutation rate to more than 3 SNPs¹⁷,²¹. Although all strains had been under the selective pressure of treatment, they did not reach such a high rate, maybe because they were drug susceptible. The mean mutation rate found in our study was 0.64 SNPs, slightly above the standard, due to the high mutation rate found in L4.1, almost double the standard.

The correlation between the mutation rate and the relapsing period was only marginally significant (p value = 0.0613), unlike in Colangeli et al.¹², who found it significant. It is important to note that the approaches were completely different: they used transmission events to mimic the latency period, taken as the time between the diagnosis of the two cases, while we used isolates from the same patient who had a previous TB episode. We eliminated all patients with less than one year between the diagnosis of the episodes, as this was considered a treatment failure, while Colangeli et al. 2020 had latency periods starting from one month, which was not possible in our clinical cases as a minimum of 6 months of treatment was required. When we split the data into [1–2 years] and (2–14 years], we observed only a small, non-significant difference in the mutation rate. This difference was much smaller than that found by Colangeli et al. 2020 (as high as 8 × 10⁻¹⁰ for early latency), suggesting that the mutation rate was constant during the relapsing period in recurrent TB cases. The mutation rate found in our study, 2.7 × 10⁻¹⁰, was similar to that found by Ford et al. 2011 (2 × 10⁻¹⁰)¹⁰, so both are further from the one found by Colangeli et al. 2020. Our results may resemble those of Ford et al. 2011 because the approaches are similar: they used lesions from the same macaques to study latency and we used relapsed isolates from the same patients.

The analysis of the IS6110 element showed differences in the number of IS6110 copies in six of the pairs studied, affecting more than one IS copy in several pairs. It has been observed that IS6110 transposes more often under severe starvation conditions²³, which could be similar to the conditions the mycobacteria encounter in the granuloma⁴. It was surprising that in four of the pairs studied, the relapsed isolates had lost 1 to 3 copies that were present in the first isolates. Notably, the number of reads obtained in the fastQ files for these copies was considerably lower than for the rest of the IS copies. This suggests that those lost copies were not yet fixed in the whole bacterial population, and therefore that a selection among the different clones present in the same patient had taken place²⁴. It could be that the copies lost in the relapsed isolates had some deleterious effect on the mycobacteria, as the relapsed bacteria were the ones without those IS copies. The fact that five out of the six pairs with IS6110 movements had more than 2 years of relapsing period supports the idea that IS elements transpose more during the asymptomatic state of the patient²³.

The main limitation in analysing the evolution of the bacteria during the dormancy period is the approach used to approximate this state. There is no perfect approach, as it is impossible to reproduce what is happening inside the granuloma of a particular patient, but we think that using isolates from the same patient is the closest way to do it. The difficulty of obtaining the complete epidemiological information of the patients is another limitation, because it prevents an accurate reconstruction of the course of the disease episodes. Another limitation is that some of the SNPs could be the result of a sequencing error or of laboratory handling, which would have a huge impact on the mutation rate. In addition, although there were more cases of potential relapse in our records, DNA of the isolates was not available. We decided not to re-cultivate these stored isolates to avoid further manipulation that could introduce errors such as additional SNPs that were not present in the original strains.

In conclusion, patients with HIV seemed to suffer reactivation in the first two years after the initial episode of TB more frequently than HIV negative patients. In addition, IS6110 movements occurred more frequently in patients with more than two years between episodes, and it seems that different clones of the original strain could be responsible for the first and the following episodes. No correlation was found between the number of SNPs and the time between episodes, nor between the mutation rate and the relapsing period, only a trend towards lower rates over longer periods. Finally, the mutation rate seemed to be constant throughout the period between episodes.

Material and methods

Selection of samples and patients

Of around 2553 cases of TB in Aragon since 2004, we first looked for those with more than one isolate more than 1 year apart and of a similar genotype. We used Bionumerics v6.7 software (v7.6, Applied Maths, Kortrijk, Belgium) to confirm that both isolates coming from the same patient shared an identical IS6110-RFLP pattern. Eighteen pairs of M. tuberculosis isolates with available DNA were included in this study. When there were more than two isolates from the same patient, the two most distant in isolation date were considered for the study of evolution during relapse. All data remained anonymous during the epidemiological search. Our regional ethical committee (Comité de Ética de la Investigación de la Comunidad Autónoma de Aragón, Record No. 20/2018) approved the methodology used in this work, detailed in project 18/0336.

DNA of the bacterial isolates was obtained using the cetrimonium bromide method, as previously described²⁵. No human DNA was sequenced. All DNA extractions were stored at −20 °C until sequencing. All the isolates were genotyped by IS6110-RFLP and spoligotyping as previously described²⁶,²⁷. The genetic patterns obtained were stored and analysed in Bionumerics database software.

SNP annotation and lineage identification

Thirty-six isolates corresponding to 18 different patients were sequenced using Ion Torrent technology according to the manufacturer’s instructions. The fastQ files obtained were mapped against the reference M. tuberculosis strain H37Rv (NC_000962.3) in order to obtain the Binary Alignment Map (BAM) and Variant Call Format (VCF) files used for the SNP study. The fastQ files were uploaded to Bionumerics software for the study and comparison of the genomes. The SNP annotation was carried out using Snippy software (default parameters) and the Integrative Genomics Viewer (IGV), from the Broad Institute²⁸. The effect of each mutation (synonymous or non-synonymous) was determined using the Genewise platform (https://www.ebi.ac.uk/Tools/psa/genewise/). All mutation positions refer to the H37Rv reference strain. For lineage identification, the SNP-based classification established by Coll et al. 2014²⁹ was used. This classification assigns specific SNPs to each TB lineage and sublineage.

IS6110 location

All the reads containing the first or the last 30 base pairs of the IS6110 sequence were extracted. These reads consist of the beginning or the end of the IS6110 along with part of the gene in which the IS is inserted. After the extraction of the sequences, BLAST searches were run against Tuberculist and Bovilist to identify the insertion point. BLAST was also run automatically by the script, but manual BLAST was required for some ambiguous points. The R script used is in the Supplementary Materials.
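The read-extraction step described above can be sketched as follows. This is not the authors' R script, which is in the Supplementary Materials; it is a minimal Python illustration under stated assumptions. `TOY_IS` is a made-up 80 bp stand-in rather than the real IS6110 sequence, the function name `find_flanks` is hypothetical, and exact string matching is used where a real pipeline would likely tolerate mismatches.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def find_flanks(reads, is_seq, k=30):
    """Collect genomic flanks from reads that span an IS boundary.

    A read qualifies if it contains the first or last k bases of `is_seq`
    on either strand. The returned flank is the part of the read lying
    outside the IS element; it is what would then be sent to BLAST.
    """
    start, end = is_seq[:k], is_seq[-k:]
    hits = []
    for read in reads:
        for strand in (read, reverse_complement(read)):
            i = strand.find(start)
            if i > 0:  # bases before the IS start are the upstream flank
                hits.append(("upstream", strand[:i]))
            j = strand.find(end)
            if j != -1 and j + k < len(strand):  # bases after the IS end
                hits.append(("downstream", strand[j + k:]))
    return hits


# Toy insertion sequence (hypothetical, NOT the real IS6110 sequence).
TOY_START = "ATGCCGTTAGCAAGTCCTGATCGGATTACA"
TOY_END = "CCTAGGTTCAGCTAACGGTACCATGGTCAA"
TOY_IS = TOY_START + "ACGT" * 5 + TOY_END

reads = [
    "TTTTTGGGGG" + TOY_IS[:40],                      # spans the IS start
    TOY_IS[-40:] + "CAGTCAGTCA",                     # spans the IS end
    reverse_complement("TTTTTGGGGG" + TOY_IS[:40]),  # same junction, other strand
    "ACGTACGTACGTACGT",                              # unrelated read
]
hits = find_flanks(reads, TOY_IS)
for side, flank in hits:
    print(side, flank)
```

Counting how many reads support each flank would give the per-copy read depth that the Results use to distinguish fixed copies from those present in only part of the population.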

Mutation rate calculation

The mutation rate per (bp × generation) was calculated as previously described¹⁰, adjusting the parameters to our own data. Briefly, the mutation rate per bp × generation is defined as

$$\mu = \frac{\sum_{i=1}^{n} m_{i}}{N \sum_{i=1}^{n} \left( t_{i} / g \right)}$$

where μ is the mutation rate, m is the number of SNPs between the first and the relapsed isolate, N is the genome size (since we had, on average, reads covering 97.4% of the M. tuberculosis genome, N = 0.974 × L where L is reference genome size), t is time since infection (in hours, Table 1), and g is generation time (in hours).
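The formula can be evaluated directly, as in the minimal Python sketch below. The genome length of NC_000962.3 (4,411,532 bp), the 97.4% coverage and the 18 h generation time come from the text or the reference record; the month-to-hour conversion factor and the example pairs are assumptions made only for illustration and are not data from Table 1.

```python
H37RV_LENGTH_BP = 4_411_532   # L: length of the H37Rv reference (NC_000962.3)
COVERAGE = 0.974              # mean fraction of the genome covered by reads
GENERATION_TIME_H = 18.0      # g: in vitro generation time used in the Results
HOURS_PER_MONTH = 730.5       # assumption: 365.25 days / 12 months * 24 h


def mutation_rate(snps, months, generation_time_h=GENERATION_TIME_H):
    """Pooled mutation rate per (bp x generation).

    Implements mu = sum(m_i) / (N * sum(t_i / g)) with N = COVERAGE * L,
    where t_i is given in months and converted to hours.
    """
    if len(snps) != len(months):
        raise ValueError("snps and months must have the same length")
    n_sites = COVERAGE * H37RV_LENGTH_BP
    generations = sum(m * HOURS_PER_MONTH / generation_time_h for m in months)
    return sum(snps) / (n_sites * generations)


# Hypothetical pairs: SNP counts and months between first and relapsed isolate.
example_snps = [0, 2, 1, 3]
example_months = [14, 30, 60, 100]

rate = mutation_rate(example_snps, example_months)
print(f"{rate:.3e} mutations per (bp x generation)")
```

Because g appears only in the denominator's t/g term, the estimate scales linearly with the assumed generation time: doubling g doubles μ.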

Statistical methods

Poisson regression was used to model the variation of mutation rate over a range of generation times. To control for deviations from distributional assumptions, a robust (sandwich) variance estimator was used. Poisson models were used to obtain mutation rates per (bp × generation) by using bp × generation as an offset. Two Poisson models were fitted according to the relapsing period (one for 1–2 years including n = 6 pairs and another for 2–14 years including n = 12 pairs). We also fitted a Poisson model using the relapsing period as a continuous independent variable. The hypothesis tested was whether the parameter associated with the relapsing period was significantly different from 0. To test the Poisson model parameters, a two-sided chi-square test using the robust variance was used. R version 4.0.5 (2021-03-31) was used for all statistical analyses. Poisson regression for all models was implemented in R using a generalized linear model function, with robust variance provided by the sandwich package³⁰.
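The model with the relapsing period as a continuous covariate can be sketched in a few lines. The analysis itself was done in R with a generalized linear model and the sandwich package; the Python code below is only an illustrative re-implementation under stated assumptions: a Newton-Raphson fit, an HC0-type sandwich variance (the exact variant used is not stated in the text), and hypothetical data. The function name `fit_poisson_offset` is also hypothetical.

```python
import math


def fit_poisson_offset(y, x, exposure, max_iter=100, tol=1e-12):
    """Fit log E[y_i] = b0 + b1 * x_i + log(exposure_i) by Newton-Raphson.

    Returns the estimates, fitted means, and a two-sided p value for b1
    from a 1-df chi-square (Wald) test using an HC0 sandwich variance.
    """
    b0 = math.log(sum(y) / sum(exposure))  # start at the constant-rate fit
    b1 = 0.0

    def moments(b0, b1):
        mu = [e * math.exp(b0 + b1 * xi) for xi, e in zip(x, exposure)]
        a00 = sum(mu)
        a01 = sum(m * xi for m, xi in zip(mu, x))
        a11 = sum(m * xi * xi for m, xi in zip(mu, x))
        return mu, a00, a01, a11

    for _ in range(max_iter):
        mu, a00, a01, a11 = moments(b0, b1)
        u0 = sum(yi - m for yi, m in zip(y, mu))                  # score for b0
        u1 = sum((yi - m) * xi for yi, m, xi in zip(y, mu, x))    # score for b1
        det = a00 * a11 - a01 * a01
        d0 = (a11 * u0 - a01 * u1) / det   # Newton step = information^-1 * score
        d1 = (a00 * u1 - a01 * u0) / det
        b0, b1 = b0 + d0, b1 + d1
        if abs(d0) + abs(d1) < tol:
            break

    mu, a00, a01, a11 = moments(b0, b1)
    det = a00 * a11 - a01 * a01
    inv01, inv11 = -a01 / det, a00 / det   # second row of the inverse information
    r2 = [(yi - m) ** 2 for yi, m in zip(y, mu)]
    m00 = sum(r2)
    m01 = sum(r * xi for r, xi in zip(r2, x))
    m11 = sum(r * xi * xi for r, xi in zip(r2, x))
    # Sandwich variance of b1: row of A^-1 times the "meat" times the same row.
    var_b1 = inv01 * inv01 * m00 + 2 * inv01 * inv11 * m01 + inv11 * inv11 * m11
    z = b1 / math.sqrt(var_b1)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # P(chi-square with 1 df > z^2)
    return {"b0": b0, "b1": b1, "mu": mu, "p_value": p_value}


# Hypothetical data: SNPs per pair and years between episodes.
SITES = 0.974 * 4_411_532                 # covered bp of the H37Rv genome
GENERATIONS_PER_YEAR = 365.25 * 24 / 18   # assuming an 18 h generation time
years = [1.2, 1.5, 1.8, 3.0, 5.0, 8.0, 13.0, 10.0]
snps = [0, 2, 1, 3, 1, 4, 0, 2]
exposure = [SITES * GENERATIONS_PER_YEAR * t for t in years]  # bp x generation

result = fit_poisson_offset(snps, years, exposure)
print(f"b1 = {result['b1']:.3f}, p = {result['p_value']:.3f}")
```

With bp × generation as the offset, exp(b0 + b1·x) is the mutation rate per (bp × generation) at relapsing period x, so a negative b1 corresponds to a rate that declines with longer periods.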

Data availability

The genomes of the studied isolates have been deposited in GenBank with the accession numbers SAMN26037035-SAMN26037070 and the BioProject ID PRJNA808219, https://www.ncbi.nlm.nih.gov/bioproject/?term=PRJNA808219.

Change history

27 February 2023
A Correction to this paper has been published: https://doi.org/10.1038/s41598-023-30479-1

References

1. Esmail, H., Barry, C. E. 3rd., Young, D. B. & Wilkinson, R. J. The ongoing challenge of latent tuberculosis. Philos. Trans. R. Soc. Lond. Ser. B, Biol. Sci. 369(1645), 20130437 (2014).
2. Getahun, H., Matteelli, A., Chaisson, R. E. & Raviglione, M. Latent Mycobacterium tuberculosis infection. N Engl J Med. 372(22), 2127–2135 (2015).
3. Veatch, A. V. & Kaushal, D. Opening Pandora’s Box: Mechanisms of Mycobacterium tuberculosis Resuscitation. Trends Microbiol. 26(2), 145–157 (2018).
4. Gibson, S. E. R., Harrison, J. & Cox, J. A. G. Modelling a silent epidemic: a review of the in vitro models of latent tuberculosis. Pathog (Basel, Switzerland). 7(4), 88 (2018).
5. Behr, M. A., Edelstein, P. H. & Ramakrishnan, L. Revisiting the timetable of tuberculosis. BMJ 362, 1–10. https://doi.org/10.1136/bmj.k2738 (2018).
6. Weller, C. & Wu, M. A generation-time effect on the rate of molecular evolution in bacteria. Evolution 69(3), 643–652 (2015).
7. Hershkovitz, I. et al. Detection and molecular characterization of 9,000-year-old Mycobacterium tuberculosis from a Neolithic settlement in the Eastern Mediterranean. PLoS ONE 3(10), 3426 (2008).
8. Wirth, T. et al. Origin, spread and demography of the Mycobacterium tuberculosis complex. PLoS Pathog. 4(9), e1000160 (2008).
9. Arnold, C. Molecular evolution of Mycobacterium tuberculosis. Clin. Microbiol. Infect. Off. Publ. Eur. Soc. Clin. Microbiol. Infect Dis. 13(2), 120–128 (2007).
10. Ford, C. B. et al. Use of whole genome sequencing to estimate the mutation rate of Mycobacterium tuberculosis during latent infection. Nat. Genet. 43(5), 482–486 (2011).
11. Colangeli, R. et al. Whole genome sequencing of Mycobacterium tuberculosis reveals slow growth and low mutation rates during latent infections in humans. PLoS ONE 9(3), e91024 (2014).
12. Colangeli, R. et al. Mycobacterium tuberculosis progresses through two phases of latent infection in humans. Nat. Commun. 11(1), 4870 (2020).
13. Cacho, J. et al. Recurrent tuberculosis from 1992 to 2004 in a metropolitan area. Eur. Respir. J. 30(2), 333–337 (2007).
14. Caminero, J. A. et al. Exogenous reinfection with tuberculosis on a European island with a moderate incidence of disease. Am. J. Respir. Crit. Care Med. 163(3 Pt 1), 717–720 (2001).
15. Afshar, B., Carless, J., Roche, A., Balasegaram, S. & Anderson, C. Surveillance of tuberculosis (TB) cases attributable to relapse or reinfection in London, 2002–2015. PLoS ONE 14(2), e0211972 (2019).
16. Guerra-Assunção, J. A. et al. Recurrence due to relapse or reinfection with Mycobacterium tuberculosis: a whole-genome sequencing approach in a large, population-based cohort with a high HIV infection prevalence and active follow-up. J. Infect. Dis. 211(7), 1154–1163 (2015).
17. Shanmugam, S. et al. Whole genome sequencing based differentiation between re-infection and relapse in Indian patients with tuberculosis recurrence, with and without HIV co-infection. Int. J. Infect. Dis. IJID Off. Publ. Int. Soc. Infect. Dis. 113(Suppl), S43–S47 (2021).
18. Lalor, M. K. et al. The use of whole-genome sequencing in cluster investigation of a multidrug-resistant tuberculosis outbreak. Eur. Respir. J. 51(6), 1702313. https://doi.org/10.1183/13993003.02313-2017 (2018).
19. Gagneux, S. Ecology and evolution of Mycobacterium tuberculosis. Nat. Rev. Microbiol. 16(4), 202–213 (2018).
20. Moreno-Molina, M. et al. Genomic analyses of Mycobacterium tuberculosis from human lung resections reveal a high frequency of polyclonal infections. Nat. Commun. 12(1), 2716 (2021).
21. Xu, Y. et al. In vivo evolution of drug-resistant Mycobacterium tuberculosis in patients during long-term treatment. BMC Genomics 19(1), 640. https://doi.org/10.1186/s12864-018-5010-5 (2018).
22. Pérez-Lago, L. et al. Recurrences of multidrug-resistant tuberculosis: Strains involved, within-host diversity, and fine-tuned allocation of reinfections. Transbound. Emerg. Dis. 69(2), 327–336. https://doi.org/10.1111/tbed.13982 (2022).
23. Gonzalo-Asensio, J. et al. New insights into the transposition mechanisms of IS6110 and its dynamic distribution between Mycobacterium tuberculosis Complex lineages. PLoS Genet. 14, e1007282 (2018).
24. Tanaka, M. M. Evidence for positive selection on Mycobacterium tuberculosis within patients. BMC Evol Biol. 4, 1–8 (2004).
25. van Soolingen, D., de Haas, P. E., Hermans, P. W. & van Embden, J. D. DNA fingerprinting of Mycobacterium tuberculosis. Methods Enzymol. 235, 196–205 (1994).
26. Van Embden, J. D. A. et al. Strain identification of Mycobacterium tuberculosis by DNA fingerprinting: Recommendations for a standardized methodology. J. Clin. Microbiol. 31, 406–409 (1993).
27. Kamerbeek, J. et al. Simultaneous detection and strain differentiation of Mycobacterium tuberculosis for diagnosis and epidemiology. J. Clin. Microbiol. 35, 907–914 (1997).
28. Robinson, J. T. et al. Integrative genome viewer. Nat. Biotechnol. 29(1), 24–26 (2011).
29. Coll, F. et al. A robust SNP barcode for typing Mycobacterium tuberculosis complex strains. Nat. Commun. 5, 4–8 (2014).
30. Zeileis, A., Köll, S. & Graham, N. Various versatile variances: an object-oriented implementation of clustered covariances in R. J. Stat. Softw. 95(1), 1–36 (2020).

Acknowledgements

The authors would like to thank Ainhoa Telletxea and Montserrat Gutierrez for proofreading the manuscript. We would like to thank the EPIMOLA group for supplying the genotyped bacterial DNA used in this work and to acknowledge the use of Servicio General de Apoyo a la Investigación-SAI, Universidad de Zaragoza (Servicio de Análisis Microbiológico), and Servicios Científico Técnicos, IACS (Servicio de Secuenciación y Genómica Funcional and Servicio de Biocomputación). This work was supported by the Carlos III Health Institute in the context of a Grant (FIS18/0336) co-funded by European Regional Development Fund/European Social Fund “A way to make Europe”/“Investing in your future” and J.C. was awarded a scholarship by the Government of Aragon/European Social Fund, “Building Europe from Aragon”.

Author information

A list of authors and their affiliations appears at the end of the paper.

Authors and Affiliations

Instituto Aragonés de Ciencias de la Salud, C/de San Juan Bosco, 13, 50009, Zaragoza, Spain
Jessica Comín & Sofía Samper
Unidad de Biocomputación, Instituto Aragonés de Ciencias de la Salud, C/de San Juan Bosco, 13, 50009, Zaragoza, Spain
Alberto Cebollada
Fundación IIS Aragón, C/de San Juan Bosco, 13, 50009, Zaragoza, Spain
María José Iglesias & Sofía Samper
CIBER de Enfermedades Respiratorias, Av. Monforte de Lemos, 3-5. Pabellón 11, Planta 0, 28029, Madrid, Spain
María José Iglesias & Sofía Samper
Universidad de Zaragoza, Zaragoza, Spain
María José Iglesias, Daniel Ibarz & María Carmen Lafoz
Hospital Universitario Miguel Servet, Zaragoza, Spain
Jesús Viñuelas
Hospital General Universitario San Jorge, Huesca, Spain
Luis Torres
Hospital Universitario Lozano Blesa, Zaragoza, Spain
Juan Sahagún
Salud Pública, Gobierno de Aragón, Zaragoza, Spain
Felipe Esteban de Juanas & María Carmen Malo

Authors

Jessica Comín
Alberto Cebollada
Sofía Samper

Consortia

Aragonese Working Group on Molecular Epidemiology of Tuberculosis (EPIMOLA)

María José Iglesias, Daniel Ibarz, Jesús Viñuelas, Luis Torres, Juan Sahagún, María Carmen Lafoz, Felipe Esteban de Juanas & María Carmen Malo

Contributions

S.S.: conceptualization, funding acquisition, writing the manuscript. J.C.: laboratory work, data analysis, writing the manuscript. A.C.: statistical analysis, biocomputational work, writing the manuscript. EPIMOLA: genotyping surveillance, epidemiological support.

Corresponding author

Correspondence to Jessica Comín.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

The original online version of this Article was revised: The original version of this Article contained errors, where the values in the columns ‘[1-2 years]’, ‘[2-14 years]’ and ‘p value’ in Table 1, and consequently in the Results and the Discussion section were incorrect due to an error in data assembly. Full information regarding the correction made can be found in the correction for this Article.

Supplementary Information

Supplementary Information 1.

Supplementary Information 2.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Comín, J., Cebollada, A., Aragonese Working Group on Molecular Epidemiology of Tuberculosis (EPIMOLA). et al. Estimation of the mutation rate of Mycobacterium tuberculosis in cases with recurrent tuberculosis using whole genome sequencing. Sci Rep 12, 16728 (2022). https://doi.org/10.1038/s41598-022-21144-0

Received: 23 June 2022
Accepted: 22 September 2022
Published: 06 October 2022
DOI: https://doi.org/10.1038/s41598-022-21144-0
