Detection of Mycobacterium tuberculosis multiple strains in sputum samples from patients with pulmonary tuberculosis in south western Uganda using MIRU-VNTR

Infections with multiple strains of Mycobacterium tuberculosis are now widely recognized as a common occurrence. Identification of patients infected with multiple strains provides insight into both the disease dynamics and the epidemiology of tuberculosis. Analysis of Mycobacterial Interspersed Repetitive Unit-Variable-Number Tandem Repeats (MIRU-VNTR) has been shown to be highly sensitive in detecting multiple M. tuberculosis strains, even in sputum. The goal of this study was to identify cases of multiple M. tuberculosis strain infections among patients diagnosed with pulmonary tuberculosis in Southwestern Uganda and to assess factors associated with multiple strain infections. DNA extracted directly from 78 sputum samples, each from an individual patient, was analyzed using standard 24-locus MIRU-VNTR typing. Five (6.4%) of the 78 patients were infected with multiple strains of M. tuberculosis; all of them were newly diagnosed cases, and two-thirds of them were co-infected with HIV. Exact regression analysis projected that the natives were more likely to harbor multiple strains (OR 0.981, 95% CI 0–7.926), as were those with a high microbial load (OR 0.390, 95% CI 0–3.8167). Although these findings were not statistically significant, owing to the small sample size, they point to a critical component of disease dynamics that has clinical implications and emphasize the need for a study using a larger cohort. It is also essential to study the potential factors associated with higher risk of exposure to newly diagnosed and HIV-positive patients at the community level. In addition, our ability to detect multiple M. tuberculosis strains using standard 24-locus MIRU-VNTR typing, especially with allelic diversity in loci 2059 and 3171, which are excluded from the 15-locus MIRU-VNTR set, leads us to recommend this genotyping technique, especially in areas with tuberculosis endemicity similar to that of this study.

www.nature.com/scientificreports/

There is a significant difference in how these two mechanisms, clonal diversity and mixed infection, generate within-host diversity. Clonal diversity involves sporadic polymorphism resulting from sequential adaptive mutations (microevolutions) [8][9][10], whereas mixed infection involves a host acquiring an entirely new MTBC genome through successive or concurrent exposure to different strains 11,12. Considering that members of MTBC have highly conserved genomes 13,14, high-quality methods are required to identify small alterations within the infecting mycobacterial population. So far, various Polymerase Chain Reaction (PCR)-based approaches have been used to demonstrate multiple strains within the same sputum sample 10,15 or in different sputum samples from the same patient 8,16. Mycobacterial Interspersed Repetitive Units-Variable Number of Tandem Repeats (MIRU-VNTR) analysis, initially developed in 2001 17, identifies such changes in the genome by detecting variation in the copy number of repeats in highly variable regions of the MTBC genome 18,19. This method was found to be adequate for large-scale prospective studies due to its short turnaround time, but it still lacked the discriminatory power required for long-term, population-based studies, which must account for a large number of samples and recent strain evolution. However, in 2006, Supply et al. proposed an expanded set of 24 MIRU loci that has high discriminatory power and is thus recommended for phylogenetic studies 20. This study aimed to identify multiple MTBC strain infections among patients with pulmonary tuberculosis (PTB) in a high TB incidence area using MIRU-VNTR analysis and to determine factors that could be associated with mixed strain infection in this area. TB incidence in Southwestern Uganda is high, at 253 cases per 100 000 people per year 21,22.
It has been shown that multiple MTB strain infections are more common among people living in areas with a high TB burden 4,7,11,23, and accurate identification of this condition not only provides insight into disease trends but also helps in the management and control of TB 16,[24][25][26][27].

Results
Prevalence of multiple strains infection. MIRU Table 2). Based on the unweighted pair group method  (Table 3). MIRU-VNTRplus similarity search indicated that four of the patients with multiple strains had two distinct strains, while only one patient (#2264) had three strains. Furthermore, in one of the patients with two distinct strains (#63), the strains belonged to two distinct sub-lineages, whereas in the rest of the patients the strains belonged to the same sub-lineage (Fig. 1). Patient #202 harbored isoniazid-resistant strains with mutations in both the katG and inhA regions (Table 2).

Discussion
Multiple strain infections in TB are now recognized as common occurrences, and identifying patients with multiple MTB strains is critical in clinical practice, public health and molecular epidemiology. This is because it not only provides insight into disease patterns but also aids in the management and control of TB. This study revealed that about one in every sixteen PTB patients (6.4%) was infected with multiple strains of MTB. This prevalence is similar to the 7.1% reported in Kampala, Uganda 11 but much lower than the 11% observed   11,23,31 , however, the culture step can drastically change the clonal composition, thus influencing the frequency with which multiple strains are detected 9,29,32. In our study, we used DNA isolated directly from processed sputum samples. Detection of multiple strains directly from sputum samples has been successfully documented 29. Our findings are also much higher than the 2.8% reported in Malawi 12 and the 3.2% reported in Zambia 33, but lower than the 9.6% 34 and 10% 35 reported in Botswana. Differences between the study settings may partly account for this variation; for instance, the annual risk of TB infection in Malawi is approximately 1% 3,36, whereas in Botswana it is 3% 37.
Our study also revealed that, unlike the relapse patients, none of whom were infected with multiple strains, all (100%) of the patients with multiple strain infections were newly diagnosed cases. This is consistent with the findings of the Mubende study 23, which observed that the majority (87.5%) of patients with multiple strains were newly diagnosed cases. This might reflect a high level of transmission and heterogeneity of strains in this category of patients (Cohen et al., 2012). This hypothesis is supported by the proportion of newly diagnosed cases that exhibited multiple strain infection, an attribute that is reported to indicate high transmission rates [38][39][40]. Furthermore, a third (9.7%) of the patients with multiple strain infections were also HIV-infected. This finding is consistent with other studies, in which nearly all people with multiple strain TB infections were HIV positive 11,23. This appears to support the notion of a link between multiple strain TB infection and HIV/TB co-infection 16,23,34,41,42. Given the high prevalence of HIV and HIV/TB co-infection in this region 43, it is plausible that HIV-induced immune deficiency exposes patients to the risk of concurrent infections. HIV removes the protection against reinfection that a person would otherwise have while battling an ongoing infection, thereby creating a scenario in which a person can be infected again before clearing the ongoing infection (Elizabeth Glaser Foundation, 2015).

Study limitations.
A notable limitation of this study is the small number of patients with multiple strains, which may have limited the precision with which we could estimate the relationship between the feature of interest and the likelihood of outcome/exposure. However, exact logistic regression was selected to make such estimates, since this type of analysis relies on conditional maximum likelihood estimation within the sub-populations formed by the factors included in the model and is suited to rare events. Another limitation is that, since genotyping was done on genomic DNA extracted directly from MTB-positive sputum samples, low levels of DNA could have been obtained, especially from samples with a very low/trace MTB load, leaving insufficient DNA to detect multiple strains. However, performing PCR on cultured isolates is associated with a drastic change in clonal composition 9,29,32.

Conclusion
The findings of this study reveal that the rate of multiple strain infection in SWU is 6.4%, with all the affected patients having been diagnosed with TB for the first time and two-thirds of them being HIV positive. Although these findings were not statistically significant, owing to the small sample size, they point to a critical component of disease dynamics that has clinical implications, thus emphasizing the need for a study using a larger cohort. It is also essential to establish the potential factors associated with the high risk of exposure to newly diagnosed patients at the community level. Our ability to detect multiple M. tuberculosis strains using standard 24-locus MIRU-VNTR typing, with allelic diversity in loci such as 2059 and 3171, which are excluded from the 15-locus MIRU-VNTR set, leads us to recommend this genotyping technique, especially in areas with M. tuberculosis endemicity similar to that observed in this study.

Materials and methods
Ethical consideration. This study was approved by the Institutional Review Board of Mbarara University of Science and Technology (MUST-IRB) and the Uganda National Council for Science and Technology (UNCST reference number HS 2379). The health facility administrators and the prime minister granted permission to access their facilities and refugee camps, respectively. Written informed consent was obtained from each patient who participated in the study.
Patients, sample collection and processing. All methods of this study were carried out in accordance with the approved guidelines. Sputum samples evaluated in this study were from individuals involved in an ongoing epidemiological study in Southwestern Uganda, from which some papers have been published 44,45. Sputum samples were collected between May 2018 and April 2019 from patients seeking health care services at Nakivale HC111, Kabale Regional Referral Hospital or Mbarara Regional Referral Hospital who consented to the study after filling out an informed consent form. The patients were over 18 years of age, had been diagnosed with PTB using either smear microscopy or Cepheid GeneXpert, and reported not having undergone TB treatment in the preceding month.
DNA extraction. The DNA was extracted directly from clinical sputum samples using the cetyltrimethylammonium bromide (CTAB)-based method described in CLSI 46, with minor modifications since these were sputum samples already processed for GeneXpert analysis. The ZN microscopy-diagnosed samples underwent the same processing as the Cepheid GeneXpert-diagnosed samples: the GeneXpert MTB/RIF sample reagent (Cepheid, Sunnyvale, CA, USA) was added to the sputum at a 2:1 ratio, and the mixture was manually agitated twice during a 15-min incubation at room temperature. Briefly, sputum samples (containing the GeneXpert processing fluid) were centrifuged for 15 min at 14,000 rpm and the supernatant discarded. The pellet was resuspended in 400 μL of TE buffer before being heat-killed at 95 °C for 30 min. Subsequently, 50 μL of lysozyme (10 mg/mL in Tris-EDTA [TE] buffer) was added to the sample, which was then incubated overnight at 37 °C. After incubation, 70% of sodium dodecyl sulfate and 5 μL of proteinase K (20 mg/mL) were added, and the sample was incubated at 65 °C for 15 min. One hundred microliters of the pre-heated CTAB/NaCl solution (prepared by mixing 10% CTAB with 0.7 M NaCl and heating at 65 °C for 30 min) was added to the sample tube together with 100 μL of 5 M NaCl and gently mixed by pipetting. The mixture was then incubated for 10 min at 65 °C, after which 750 μL of chloroform:isoamyl alcohol (24:1) was added and gently swirled. The mixture was then centrifuged for 5 min at 14,000 rpm, after which the aqueous layer was carefully transferred to a new microcentrifuge tube containing 450 μL of ice-cold isopropanol and incubated for 30 min at −20 °C. The mixture was then centrifuged for 15 min at 14,000 rpm and the supernatant discarded. The pellet was then rinsed with ice-cold 70% ethanol and centrifuged for 5 min at 14,000 rpm, following which the ethanol was removed and the sample allowed to air dry for approximately 15 min.
The pellet was then resuspended in 50 μL of TE buffer and incubated at 65 °C to aid resuspension, after which the sample was ready for downstream use. The quantity of the extracted DNA was measured using a NanoDrop 3300 Fluorospectrometer (ThermoFisher Scientific), after which all the samples were confirmed as MTB by PCR detection of a 123 bp fragment of IS6110, which is common to the members of the MTB complex. Drug susceptibility testing to screen for rifampicin and isoniazid resistance was carried out using high-resolution melting curve analysis as described in Micheni et al. 45.
Single nucleotide polymorphic (SNP) typing. SNP typing was then performed on these samples to screen for the MTBC lineages that commonly occur in this region, as previously described in Micheni et al. 44, using lineage 3 and 4 specific primers (Rv004C for MTB L4-U, Rv2962C for MTB L4-NU and Rv0129C for MTB L3) and their accompanying hybridization probes. Briefly, the assays were performed in a 20 μl reaction mixture containing 3.75 μl of PCR water, 1.25 μl (0.5 μM final concentration) of each primer, 0.625 μl (0.25 μM final concentration) of each probe, 9.5 μl of 2X Lunar ® Universal genotyping master mix, and 3 μl (5-50 ng) of extracted genomic DNA. RT-PCR was carried out in a Bio-Rad CFX96 Touch™ that was programmed for PCR amplification and a melting curve stage. For each of the three uniplex assays, the amplification program consisted of a pre-PCR stage at 95 °C for 10 min, followed by 45 cycles of denaturation at 95 °C for 30 s, primer annealing (50 °C for Rv004C, 52 °C for Rv0129C or 51 °C for Rv2962C) for 30 s and extension at 60 °C for 30 s. The melting curve analysis consisted of denaturation of the amplicons at 95 °C for 10 s to produce single-stranded DNA, probe annealing at 65 °C for 5 s with continuous acquisition mode to capture the fluorescence, and probe melting over a temperature range of 40-80 °C. The MTBC lineages were identified based on differences in melting temperature (Tm). H37Rv (L4-NU), kc32969 (L4-U) and delicus (L3) genomic DNA were used as positive controls, while a no-template mix served as the negative control.
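As a quick consistency check on the reaction set-up described above, the component volumes can be summed. The sketch below assumes two primers (forward and reverse) and two hybridization probes per uniplex assay; the text says "each primer" and "each probe" without giving the counts, so these numbers and the variable names are assumptions made for illustration.

```python
# Component volumes (in microlitres) for one uniplex SNP-typing reaction,
# as listed in the text.
# ASSUMPTION: two primers and two probes per assay (not stated explicitly).
N_PRIMERS = 2
N_PROBES = 2

components_ul = {
    "PCR water": 3.75,
    "primers": 1.25 * N_PRIMERS,
    "probes": 0.625 * N_PROBES,
    "2X master mix": 9.5,
    "genomic DNA": 3.0,
}

# Sum the components and compare with the stated 20 microlitre reaction.
total_ul = sum(components_ul.values())
print(f"Total reaction volume: {total_ul} microlitres")
```

Under these assumptions the components add up to the stated 20 μl reaction volume.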
MIRU-VNTR typing. The MIRU-VNTR PCRs were performed on genomic DNA extracted from the sputum samples using primers specific for sequences flanking the MIRU units (see Table 5). The PCR was designed to amplify a standard set of 24 MIRU-VNTR loci from the genomic DNA retrieved from each sample. Each MIRU locus was amplified individually using the reaction mix and amplification profile described by Supply 28, with slight modifications. Briefly, assays for the various simplex reactions were prepared according to Supply (2005). Two microliters (5-50 ng) of extracted genomic DNA was added to the PCR pre-mix and amplified in the Multi-Gene™ OptiMax Thermal Cycler (Massachusetts, USA), which was programmed with a pre-PCR stage at 95 °C for 15 min, followed by 40 cycles of denaturation at 94 °C for 60 s, primer annealing at 59 °C for 60 s and extension at 72 °C for 90 s, and a final extension at 72 °C for 10 min. In all the assays, M. tuberculosis H37Rv was used as the positive control and sterile water as the negative control. Ten microliters of each PCR product were separated electrophoretically on 2% agarose gels for 3 h, with a 100-bp DNA ladder (Solis Biodyne™, Estonia) serving as the size marker. The corresponding MIRU-VNTR bands in the gel images were reported as Roman numerals representing the number of repeats per locus, as described in the protocol reference table by Supply (2005) 28. For any sample that revealed multiple bands at any of the MIRU loci, the PCR was repeated to confirm the results. Multiple strains were concluded to be present if a sample had double alleles at more than one locus, whereas samples with varying copy numbers at a single locus were considered to reflect single-strain evolution rather than multiple strains.
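The decision rule in the last sentence above can be sketched as a small function. This is an illustrative sketch only, not the authors' analysis code: the function name, the input layout (a mapping from locus name to the confirmed repeat copy numbers at that locus) and the returned labels are assumptions, and the allele values in the example are hypothetical.

```python
from typing import Dict, List


def classify_sample(alleles_by_locus: Dict[str, List[int]]) -> str:
    """Classify one sample from its confirmed MIRU-VNTR allele calls.

    alleles_by_locus maps a locus name to the repeat copy numbers
    observed at that locus (one entry per confirmed band).
    """
    # Loci at which more than one distinct allele (band) was observed.
    loci_with_multiple_alleles = [
        locus
        for locus, alleles in alleles_by_locus.items()
        if len(set(alleles)) > 1
    ]
    if len(loci_with_multiple_alleles) > 1:
        # Double alleles at more than one locus: multiple strains.
        return "multiple strains"
    if len(loci_with_multiple_alleles) == 1:
        # Variation at a single locus: treated as single-strain evolution.
        return "single strain (clonal variant)"
    return "single strain"


# Hypothetical allele calls for illustration; only three loci are shown.
example = {"2059": [2, 3], "3171": [1, 3], "0154": [2]}
print(classify_sample(example))
```

In practice the input would hold calls for all 24 loci, recorded only after the repeat PCR had confirmed any multi-band result.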

Statistical analysis.
Patients' biodata and the presence or absence of multiple strain infection were entered and validated in Microsoft Excel® 2013. The data were then exported to Stata (Stata/SE 14.2 for Windows, Stata Corp, College Station, TX) for statistical analysis. The chi-square test was used to compare proportions and determine the relationship between the independent factors and the dependent variable (presence of multiple strain infection), with statistical significance considered at a 95% level of confidence. Since the feature of interest (multiple strain infection) was found in a small number of patients, an exact bivariate logistic regression analysis was performed to obtain odds ratios for factors that could be associated with the occurrence of multiple strains of M. tuberculosis among PTB patients in our setting. Exact logistic regression was selected because it computes conditional maximum likelihood estimates, which remain usable when the event of interest is rare in the sample population described by the model's factors. We did not specify a statistical significance threshold, as per the recent statistical guidelines 47.
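To illustrate what conditional maximum likelihood estimation means for a single binary factor, the sketch below computes a conditional odds ratio from a 2x2 table using only the Python standard library. It is not the Stata procedure used in the study: the function name, the bisection search and its bounds, and the counts in the example are all assumptions made for illustration, and the counts are hypothetical rather than study data.

```python
from math import comb, exp, log


def conditional_mle_odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """Conditional maximum likelihood odds ratio for the 2x2 table

                   outcome+   outcome-
        exposed        a          b
        unexposed      c          d

    conditioning on both margins (noncentral hypergeometric model).
    """
    n1, n2, m = a + b, c + d, a + c
    lo, hi = max(0, m - n2), min(n1, m)  # possible values of a given margins
    if lo == hi:
        raise ValueError("margins fix the table; odds ratio not estimable")
    if a == lo:
        return 0.0           # estimate sits on the lower boundary
    if a == hi:
        return float("inf")  # estimate sits on the upper boundary

    support = range(lo, hi + 1)
    weights = [comb(n1, u) * comb(n2, m - u) for u in support]

    def expected_a(log_psi: float) -> float:
        # Mean of the conditional distribution at odds ratio exp(log_psi),
        # computed on the log scale to avoid overflow.
        terms = [log(w) + u * log_psi for u, w in zip(support, weights)]
        peak = max(terms)
        probs = [exp(t - peak) for t in terms]
        return sum(u * p for u, p in zip(support, probs)) / sum(probs)

    # The estimate solves expected_a(log_psi) == a; the mean increases with
    # log_psi, so bisection works. ASSUMPTION: the root lies in [-30, 30].
    low, high = -30.0, 30.0
    for _ in range(200):
        mid = (low + high) / 2
        if expected_a(mid) < a:
            low = mid
        else:
            high = mid
    return exp((low + high) / 2)


# Hypothetical counts, for illustration only.
print(conditional_mle_odds_ratio(3, 1, 1, 3))
```

When the observed count sits at the edge of its possible range, as happens easily with only a handful of events, the conditional estimate is 0 or infinite; statistical packages may report a median unbiased estimate in that situation instead.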