Introduction

Tuberculosis (TB), caused by members of the Mycobacterium tuberculosis complex (MTBC), is one of humanity's oldest scourges and one of the leading causes of death from a single infectious agent globally, accounting for about 1.5 million deaths and 10 million new cases each year1,2,3. These cases result from a primary infection, an endogenous reactivation of a primary infection, or an exogenous reinfection with a new strain4. Historically, it was presumed that TB was caused by a single strain and that any recurrence was due to reactivation of the same strain that caused the first episode5. Infection with multiple strains at a single point in time was hardly considered. However, in the mid-1970s, phage typing demonstrated that different strains of MTBC can infect a patient at the same time6, either as a result of a single transmission event involving multiple distinct strains or due to multiple transmission events7. Multiple strain infections can be due to either mixed infections or clonal diversity, and drawing a clear distinction between the two in clinical settings is somewhat difficult. However, the two mechanisms differ significantly in how they generate within-host diversity. Clonal diversity involves sporadic polymorphism resulting from sequential adaptive mutations (microevolution)8,9,10, whereas mixed infection involves a host acquiring an entirely new MTBC genome through successive or concurrent exposure to different strains11,12. Considering that members of the MTBC have highly conserved genomes13,14, high-resolution methods are required to identify small alterations within the infecting mycobacterial population. So far, various polymerase chain reaction (PCR)-based approaches have been used to demonstrate multiple strains within the same sputum sample10,15 or in different sputum samples from the same patient8,16.
Mycobacterial Interspersed Repetitive Units-Variable Number of Tandem Repeats (MIRU-VNTR) analysis, initially developed in 200117, identifies such changes through variation in the copy number of repeats in highly variable regions of the MTBC genome18,19. The method was found to be adequate for large-scale prospective studies because of its short turnaround time, but it still lacked the discriminatory power required for long-term, population-based studies that must account for large numbers of samples and recent strain evolution. In 2006, however, Supply et al. proposed an expanded set of 24 MIRU loci that has high discriminatory power and is thus recommended for phylogenetic studies20. This study aimed to identify multiple MTBC strain infections among patients with pulmonary tuberculosis (PTB) in a high TB incidence area using MIRU-VNTR analysis, and to determine factors that could be associated with mixed strain infection in this area. TB incidence in Southwestern Uganda is high, at 253 cases per 100 000 people per year21,22. Multiple MTB strain infections have been shown to be more common among people living in high TB burden areas4,7,11,23, and accurate identification of this condition not only provides insight into disease trends but also helps in the management and control of TB16,24,25,26,27.

Results

Prevalence of multiple strains infection

MIRU-VNTR typing was performed on 78 sputum samples, each from an individual PTB patient. The majority of these samples (91%; 71/78) were from newly diagnosed cases, while 9% (7/78) were from relapse patients. Ten (12.8%) samples were from refugees residing in the resettlement camps and 6 (7.7%) were from patients in prison. According to the HIV status records, 39.7%, 24.4% and 35.9% of the patients were HIV positive, HIV negative or of unknown status, respectively. Five of the 78 patients (6.4%; 95% CI 0.864–0.976) were found to harbor more than one strain of M. tuberculosis. All patients infected with multiple strains were newly diagnosed (p = 0.468), and three of them were HIV positive (see Tables 1 and 2). Two of the five samples showed double alleles at three loci (patients #202 and #10,546), while two other samples had three alleles at one locus and two alleles at another locus (patients #228 and #2264), with allelic diversity noted at loci 424, 1644, 2059, 3171, 3192 and 3690 (see Table 2). Based on the unweighted pair group method with arithmetic mean (UPGMA) analysis of the standard 24-locus MIRU-VNTR, a total of 12 different strains with 58 unique patterns were identified: Uganda I, Uganda II, EAI, LAM, Haarlem, Cameroon, Ghana, URAL, TUR, S, Bovis and Caprae. Six (7.7%) samples did not match any strain in the database and were hence regarded as unique. The Uganda I and Uganda II sub-lineages accounted for 23.1% and 19.2% of the strains, respectively, whereas the Ghana, TUR and S strains were each identified in 1.3% of the samples. The animal strains M. bovis and M. caprae were present in 2.6% and 1.3% of the samples, respectively (Table 3). The MIRU-VNTRplus similarity search indicated that four of the patients with multiple strains had two distinct strains, while only one patient (#2264) had three strains.
Furthermore, in one of the patients with two distinct strains (#63), the strains belonged to two distinct sub-lineages, whereas the remaining patients had strains belonging to the same sub-lineage (Fig. 1). Patient #202 harbored isoniazid-resistant strains with mutations in both the katG and inhA regions (Table 2).
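
As a hedged illustration of the prevalence estimate, the following sketch computes the proportion 5/78 and a 95% confidence interval using the Wilson score method. The choice of the Wilson method and the function name `wilson_interval` are assumptions made for illustration only; the text does not state how its interval was derived, and the sketch is not intended to reproduce the published figures.

```python
import math

def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 for ~95%)."""
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    p_hat = successes / n
    denom = 1 + z**2 / n
    centre = (p_hat + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return centre - half_width, centre + half_width

# Counts taken from the text: 5 of 78 patients harbored multiple strains.
prevalence = 5 / 78
low, high = wilson_interval(5, 78)
print(f"prevalence = {prevalence:.1%}, 95% CI {low:.1%} to {high:.1%}")
```

With only five events, any such interval is wide, which is in line with the caution the study expresses about its small sample.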

Table 1 Prevalence of M. tuberculosis multiple infections and comparison between patients with multiple versus single strain infections in Southwestern Uganda.
Table 2 Patients in Southwestern Uganda harboring more than one strain of M. tuberculosis identified using the standardized 24-locus MIRU-VNTR.
Table 3 Distribution of MTB lineages.
Figure 1

UPGMA tree based on the standard 24-locus MIRU-VNTR of MTB recovered from PTB patients in Southwestern Uganda. Lineage sub-types were identified by MIRU-VNTRplus similarity search.

Exact regression analysis of factors associated with multiple strains infection

The conditional maximum likelihood estimates from the bivariable exact regression revealed that none of the patients’ demographic variables, such as age and sex, was linked to multiple strain infections (Table 4). The model projected that patients who tested with high (> 10 AFB/OIF; OR 0.390, 95% CI 0–3.8167) and medium (1–10 AFB/OIF; OR 0.300, 95% CI 0–2.9200) MTB loads in their samples were more likely to be diagnosed with multiple strains than those who tested with very low loads (1–9 AFB/100 OIF) when all other factors were held constant. Furthermore, natives were more likely to harbor multiple strains (OR 0.981, 95% CI 0–7.926) than refugees. However, because the numbers of refugees and of individuals with multiple strain infections are so small, drawing firm inferences from these findings is difficult.

Table 4 Exact bivariate logistic regression of factors associated with multiple strains of M. tuberculosis infections among PTB patients in Southwestern Uganda.

Discussion

Multiple strain infections in TB are now recognized as common occurrences, and identifying patients with multiple MTB strains is critical in clinical practice, public health and molecular epidemiology, because it not only provides insight into disease patterns but also aids in the management and control of TB. This study revealed that about one in every sixteen PTB patients (6.4%) was infected with multiple strains of MTB. This prevalence is similar to the 7.1% reported in Kampala, Uganda11, but much lower than the 11% observed in Mubende, Uganda23. The disparities in these estimates are probably due to differences in the sensitivity of the genotyping techniques used to differentiate between MTB strains: diverse genotyping approaches are employed to identify mixed infections, and the sensitivity of each method varies. Whereas the Mubende and Kampala investigations used 15-locus MIRU-VNTR typing, this study used 24-locus MIRU-VNTR typing with single-target conventional PCR. This method has been demonstrated to be very sensitive and discriminative, rendering it the gold standard in the diagnosis of multiple strain infections29,30. Other discrepancies may result from the laboratory methods used. As with any other genotyping approach, multiple strains can only be detected when there are sufficient DNA copies of each strain in the sample being studied. Many studies, including the Kampala and Mubende studies, use culture to increase the mycobacterial population11,23,31; however, the culture step can drastically change the clonal composition, thus influencing the frequency with which multiple strains are detected9,29,32. In our study, we used DNA isolated directly from processed sputum samples. Detection of multiple strains directly from sputum samples has been successfully documented29.
Our findings are also much higher than the 2.8% reported in Malawi12 and the 3.2% reported in Zambia33, but lower than the 9.6%34 and 10%35 reported in Botswana. Differences between the study settings may partly account for this; for instance, the annual risk of TB infection in Malawi is approximately 1%3,36, while in Botswana it is 3%37.

Our study also revealed that all (100%) of the patients with multiple strain infections were newly diagnosed cases, whereas none of the relapse patients was infected with multiple strains. This is consistent with the findings of the Mubende study23, which observed that the majority (87.5%) of patients with multiple strains were newly diagnosed cases. This might reflect a high level of transmission and heterogeneity of strains in this category of patients (Cohen et al., 2012). This hypothesis is supported by the proportion of newly diagnosed cases that exhibited the multiple strain infection phenomenon, an attribute that is reported to indicate high transmission rates38,39,40. Furthermore, three of the five patients with multiple strain infections were also HIV-infected (9.7% of the HIV-positive patients). This finding is consistent with other studies, in which nearly all people with multiple strain TB infections were HIV positive11,23, and appears to support the notion of a link between multiple strain TB infection and HIV/TB co-infection16,23,34,41,42. Given the high prevalence of HIV and HIV/TB co-infection in this region43, it is plausible that HIV-induced immune deficiency exposes patients to the risk of concurrent infections. HIV removes the protection against reinfection that a host normally has while battling an ongoing infection, thereby creating a scenario in which a person can be infected again before clearing the ongoing infection (Elizabeth Glaser Foundation, 2015).

Study limitations

A notable limitation of this study is the small number of patients with multiple strains, which may have limited the precision with which we could estimate the relationship between the features of interest and the likelihood of the outcome. However, exact logistic regression was selected to make such estimates because it yields conditional maximum likelihood estimates that remain valid when the outcome is rare within the sub-populations formed by the factors included in the model. Another limitation is that, because genotyping was done on genomic DNA extracted directly from MTB-positive sputum samples, low levels of DNA could have been obtained, especially from samples with very low/trace MTB load, leaving insufficient DNA to detect multiple strains. However, performing PCR on cultured isolates is associated with a drastic change in clonal composition9,29,32.

Conclusion

The findings of this study reveal that the rate of multiple strain infection in Southwestern Uganda is 6.4%, with all affected patients being newly diagnosed TB cases and three of the five being HIV positive. Although these findings are not statistically significant owing to the small sample size, they point to a critical component of disease dynamics that has clinical implications, emphasizing the need for a study using a larger cohort. It is also essential to establish the potential factors associated with the high risk of exposure among newly diagnosed patients at the community level. Our ability to detect multiple M. tuberculosis strains using the standard 24-locus MIRU-VNTR typing, with allelic diversity at loci such as 2059 and 3171, which are excluded from the 15-locus MIRU-VNTR set, leads us to recommend the use of this genotyping technique, especially in areas with M. tuberculosis endemicity similar to that observed in this study.

Materials and methods

Ethical consideration

This study was approved by the Institutional Review Board of Mbarara University of Science and Technology (MUST-IRB) and the Uganda National Council for Science and Technology (UNCST reference number HS 2379). The health facility administrators and the prime minister granted permission to access their facilities and refugee camps, respectively. Written informed consent was obtained from each patient who participated in the study.

Patients, sample collection and processing

All methods of this study were carried out in accordance with the approved guidelines. Sputum samples evaluated in this study were from individuals involved in an ongoing epidemiological study in Southwestern Uganda, from which some papers have been published44,45. Sputum samples were collected between May 2018 and April 2019 from patients seeking health care services at Nakivale HC III, Kabale Regional Referral Hospital or Mbarara Regional Referral Hospital who consented to the study by completing an informed consent form. The patients were over 18 years of age, had been diagnosed with PTB using either smear microscopy or Cepheid GeneXpert, and reported not having undergone TB treatment in the preceding month.

DNA extraction

The DNA was extracted directly from clinical sputum samples using the cetyltrimethylammonium bromide (CTAB)-based method described in CLSI46, with minor modifications because these sputum samples had already been processed for GeneXpert analysis. The ZN microscopy-diagnosed samples underwent the same processing as the Cepheid GeneXpert-diagnosed samples: the GeneXpert MTB/RIF sample reagent (Cepheid, Sunnyvale, CA, USA) was added to the sputum at a 2:1 ratio, and the mixture was manually agitated twice during a 15-min incubation at room temperature. Briefly, sputum samples (containing the GeneXpert processing fluid) were centrifuged for 15 min at 14,000 rpm and the supernatant was discarded. The pellet was resuspended in 400 μL of Tris–EDTA (TE) buffer before being heat-killed at 95 °C for 30 min. Subsequently, 50 μL of lysozyme (10 mg/mL in TE buffer) was added to the sample, which was then incubated overnight at 37 °C. After incubation, 70% of sodium dodecyl sulfate and 5 μL of proteinase K (20 mg/mL) were added, and the sample was incubated at 65 °C for 15 min. One hundred microliters of the pre-heated CTAB/NaCl solution (prepared by mixing 10% CTAB with 0.7 M NaCl and heating at 65 °C for 30 min) was added to the sample tube together with 100 μL of 5 M NaCl and gently mixed by pipetting. The mixture was then incubated for 10 min at 65 °C, after which 750 μL of chloroform:isoamyl alcohol (24:1) was added and gently swirled. The mixture was centrifuged for 5 min at 14,000 rpm, after which the aqueous layer was carefully transferred to a new microcentrifuge tube containing 450 μL of ice-cold isopropanol and incubated for 30 min at −20 °C. The mixture was then centrifuged for 15 min at 14,000 rpm and the supernatant was discarded. The pellet was rinsed with ice-cold 70% ethanol and centrifuged for 5 min at 14,000 rpm, after which the ethanol was removed and the pellet was allowed to air dry for approximately 15 min. The pellet was then resuspended in 50 μL of TE buffer and incubated at 65 °C to aid resuspension, after which the sample was ready for downstream use.

The quantity of the extracted DNA was determined using a NanoDrop 3300 Fluorospectrometer (ThermoFisher Scientific), after which all the samples were confirmed as MTB by PCR detection of a 123 bp fragment of IS6110, which is common to members of the MTB complex. Drug susceptibility testing for rifampicin and isoniazid was carried out using high-resolution melting curve analysis as described in Micheni et al.45.

Single nucleotide polymorphism (SNP) typing

SNP typing was then performed on these samples, as previously described in Micheni et al.44, to screen for the commonly occurring MTBC lineages in this region, using lineage 3 and 4 specific primers (Rv004C for MTB L4-U, Rv2962C for MTB L4-NU and Rv0129C for MTB L3) and their accompanying hybridization probes. Briefly, the assays were performed in a 20 μl reaction mixture containing 3.75 μl of PCR water, 1.25 μl (0.5 μM final concentration) of each primer, 0.625 μl (0.25 μM final concentration) of each probe, 9.5 μl of 2X Lunar® Universal genotyping master mix, and 3 μl (5–50 ng) of extracted genomic DNA. RT-PCR was carried out in a Bio-Rad CFX96 Touch™ that was programmed for PCR amplification and a melting curve stage. For each of the three uniplex assays, the amplification program consisted of a pre-PCR stage at 95 °C for 10 min, followed by 45 cycles of denaturation at 95 °C for 30 s, primer annealing (50 °C for Rv004C, 52 °C for Rv0129C or 51 °C for Rv2962C) for 30 s and extension at 60 °C for 30 s. The melting curve analysis consisted of denaturation of the amplicons at 95 °C for 10 s to produce single-stranded DNA, probe annealing at 65 °C for 5 s, and probe melting over a temperature range of 40–80 °C, with continuous acquisition mode used to capture the fluorescence. The MTBC lineages were identified based on differences in melting temperature (Tm). H37Rv (L4-NU), kc32969 (L4-U) and delicus (L3) genomic DNA were used as positive controls, while a non-template mix served as the negative control.
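
The reaction composition can be checked with simple arithmetic. The sketch below sums the listed component volumes and back-calculates the stock concentrations implied by the stated final concentrations. It assumes two primers (forward and reverse) and two hybridization probes per uniplex assay; the text says only "each primer" and "each probe", so these counts, like the variable and function names, are illustrative assumptions.

```python
# Component volumes (in microliters) as listed in the text.
# ASSUMPTION: two primers (forward and reverse) and two hybridization
# probes per uniplex assay; the text does not state these counts.
N_PRIMERS = 2
N_PROBES = 2

volumes_ul = {
    "PCR water": 3.75,
    "primers": 1.25 * N_PRIMERS,
    "probes": 0.625 * N_PROBES,
    "2X master mix": 9.5,
    "genomic DNA": 3.0,
}
total_ul = sum(volumes_ul.values())

def stock_concentration_um(final_um: float, added_ul: float, total_volume_ul: float) -> float:
    """Stock concentration implied by C_stock * V_added = C_final * V_total."""
    return final_um * total_volume_ul / added_ul

primer_stock_um = stock_concentration_um(0.5, 1.25, total_ul)
probe_stock_um = stock_concentration_um(0.25, 0.625, total_ul)

print(f"total reaction volume: {total_ul} ul")
print(f"implied primer stock: {primer_stock_um} uM, implied probe stock: {probe_stock_um} uM")
```

Under these assumptions the components add up to the stated 20 μl, which suggests the assumed primer and probe counts are at least consistent with the recipe.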

MIRU-VNTR typing

The MIRU-VNTR PCRs were performed on genomic DNA extracted from the sputum samples using primers specific for sequences flanking the MIRU units (see Table 5). The PCR was designed to amplify the standard set of 24 MIRU-VNTR loci from the genomic DNA retrieved from each sample. Each MIRU locus was amplified individually using the reaction mix and amplification profile described by Supply28, with slight modifications. Briefly, assays for the various simplex reactions were prepared according to Supply (2005). Two microliters (5–50 ng) of extracted genomic DNA was added to the PCR pre-mix and amplified in the MultiGene™ OptiMax Thermal Cycler (Massachusetts, USA), which was programmed with a pre-PCR stage at 95 °C for 15 min, followed by 40 cycles of denaturation at 94 °C for 60 s, primer annealing at 59 °C for 60 s and extension at 72 °C for 90 s, and a final extension at 72 °C for 10 min. In all the assays, M. tuberculosis H37Rv was used as the positive control and sterile water as the negative control. Ten microliters of each PCR product was separated electrophoretically on 2% agarose gels for 3 h, with a 100-bp DNA ladder (Solis Biodyne™, Estonia) serving as the size marker. The corresponding MIRU-VNTR bands in the gel images were reported as Roman numerals representing the number of repeats per locus, as described in the protocol reference table by Supply (2005)28. For any sample that revealed multiple bands at any of the MIRU loci, the PCR was repeated to confirm the results. Multiple strains were concluded to be present if a sample had double alleles at more than one locus, while samples that had varying copy numbers at a single locus were considered to reflect single strain evolution rather than multiple strains.
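
The decision rule for calling multiple strains can be expressed as a short function. The following is a minimal sketch under stated assumptions: the function name `classify_sample`, the dictionary layout, the output labels and the example copy numbers are all hypothetical and are not taken from the study's data or software.

```python
from typing import Mapping, Sequence

def classify_sample(alleles_by_locus: Mapping[str, Sequence[int]]) -> str:
    """Apply the rule from the text to one sample's MIRU-VNTR profile.

    alleles_by_locus maps a locus name to the repeat copy numbers
    (one per confirmed band) observed at that locus.
    """
    loci_with_multiple_alleles = [
        locus
        for locus, alleles in alleles_by_locus.items()
        if len(set(alleles)) >= 2
    ]
    if len(loci_with_multiple_alleles) > 1:
        return "multiple strains"
    if len(loci_with_multiple_alleles) == 1:
        return "single strain (clonal variant)"
    return "single strain"

# Hypothetical profiles: copy numbers are invented for illustration,
# and only three of the 24 loci are shown.
mixed = {"424": [2, 4], "2059": [1, 2], "3690": [3]}
clonal = {"424": [2], "2059": [1, 2], "3690": [3]}
single = {"424": [2], "2059": [2], "3690": [3]}

for name, profile in [("mixed", mixed), ("clonal", clonal), ("single", single)]:
    print(name, "->", classify_sample(profile))
```

Counting loci rather than bands mirrors the stated criterion: several alleles at one locus alone do not trigger a multiple strain call.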

Table 5 PCR primer sequences and MIRU-VNTR locus designations1 used in this study.

Statistical analysis

Patients’ biodata and the multiple strain infection results (presence or absence) were entered and validated in Microsoft Excel® 2013. The data were then exported to Stata (Stata/SE 14.2 for Windows, StataCorp, College Station, TX) for statistical analysis. The chi-square test was used to compare proportions and determine the relationship between independent factors and the dependent variable (presence of multiple strain infection), with statistical significance considered at a 95% level of confidence. Since the feature of interest (multiple strain infection) was found in a small number of patients, an exact bivariate logistic regression analysis was performed to obtain odds ratios for factors that could be associated with the occurrence of multiple strains of M. tuberculosis among PTB patients in our setting. Exact logistic regression was selected because it computes conditional maximum likelihood estimates, which remain reliable when the event of interest is rare within the sample population described by the model's factors. We did not specify a statistical significance threshold as per the recent statistical guidelines47,48.
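
To illustrate how an exact test handles sparse counts such as these, the sketch below applies Fisher's exact test to a 2x2 table built from the counts reported in the Results (5 of 71 newly diagnosed and 0 of 7 relapse patients with multiple strains). Fisher's exact test is swapped in here as a simple stand-in; it is not the chi-square test or the exact logistic regression that the study ran in Stata, so its p-value need not match the reported one. The function name is hypothetical.

```python
from math import comb

def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def prob(k: int) -> float:
        # Hypergeometric probability of k events in row 1 given fixed margins.
        return comb(row1, k) * comb(row2, col1 - k) / comb(n, col1)

    k_min = max(0, col1 - row2)
    k_max = min(col1, row1)
    p_observed = prob(a)
    # Sum the probabilities of all tables no more likely than the observed one
    # (a small relative tolerance guards against floating-point ties).
    return sum(
        prob(k)
        for k in range(k_min, k_max + 1)
        if prob(k) <= p_observed * (1 + 1e-9)
    )

# Rows: newly diagnosed, relapse. Columns: multiple strains, single strain.
p_value = fisher_exact_two_sided(5, 66, 0, 7)
print(f"Fisher's exact p = {p_value:.3f}")
```

Conditioning on the table margins is the same idea that underlies exact logistic regression, which is why both are suited to outcomes observed in only a handful of patients.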