Molecular typing of Mycobacterium kansasii using pulsed-field gel electrophoresis and a newly designed variable-number tandem repeat analysis

Molecular epidemiological studies of Mycobacterium kansasii are hampered by the lack of highly discriminatory genotyping modalities. The purpose of this study was to design a new, high-resolution fingerprinting method for M. kansasii. The complete genome sequence of the M. kansasii ATCC 12478 reference strain was searched for satellite-like repetitive DNA elements comprising tandem repeats. A total of 24 variable-number tandem repeat (VNTR) loci with potential discriminatory capacity were identified. Of these, 17 were used to study polymorphism among 67 M. kansasii strains representing six subtypes (I-VI). The results of VNTR typing were compared with those of pulsed-field gel electrophoresis (PFGE) with AsnI digestion. Six VNTRs (VNTR 1, 2, 8, 14, 20 and 23) differentiated the analyzed strains with the same discriminatory capacity as the full 17-locus panel. VNTR typing and PFGE in conjunction revealed 45 distinct patterns, including 11 clusters with 33 isolates and 34 unique patterns. The Hunter-Gaston discriminatory index was 0.95 and 0.66 for PFGE and VNTR typing, respectively, and 0.97 for the two methods combined. In conclusion, this study delivers a new typing scheme, based on VNTR polymorphism, and recommends it as a first-line test prior to PFGE analysis in a two-step typing strategy for M. kansasii.

VNTR 19 was the only locus capable of discriminating between all M. kansasii subtypes. The HGDIs for each VNTR locus are shown in Table 1. The most discriminatory loci were VNTR 8 and VNTR 20 (HGDI = 0.38), whereas the lowest diversity was observed for VNTR 14 (HGDI = 0.09). A schematic representation of allelic diversity at the VNTR 8 locus is shown in Fig. 1.
The loci most difficult to assess with respect to copy number upon gel sizing (i.e. those containing short repeats of 20 bp or less, namely VNTR 19, 20 and 24) were sequenced to verify the gel-based estimates. The copy number of VNTR 19 was in full agreement with that based on gel sizing for the representatives of M. kansasii types I, II and VI. Yet, in the genotype V strain, a single copy difference was noted (Suppl.

PFGE.
Three restriction endonucleases (AseI, DraI, and XbaI) were tested for PFGE profiling.
The latter two consistently resulted in DNA band smearing, making the profiles unreadable (data not shown), whereas AseI provided good-quality restriction patterns with clear band separation. PFGE with AseI was performed on the entire study sample (67 isolates).
Subtype II isolates either belonged to one of two clusters (l/AC; l/AE), each comprising two isolates, or harbored unique profiles (GDI = 0.71; HGDI = 0.9).
Of the eleven VNTR-PFGE clusters, three (a/H; p/AI; s/AB) contained isolates from more than one country and three (a/F; a/H; b/X) contained both clinically relevant and irrelevant isolates. One cluster (p/AI) accommodated isolates of both clinical and environmental origin.

Discussion
In the epidemiology of infectious diseases, including those of mycobacterial etiology, the key issues are identifying sources of infection, transmission links, and dissemination in the environment. For this to be accomplished, high-resolution inter-strain discrimination, or typing, is of utmost importance 33 .
Molecular typing methods have substantially improved our knowledge of the epidemiology of mycobacterial infections and genuinely assisted in the fight against them.
A significant source of genetic polymorphism in mycobacteria, and one of the main driving forces of their genome evolution, is repetitive DNA. One large group of repeated DNA elements comprises insertion sequences (ISs) 5,[33][34][35] . Another, equally important group comprises tandem repeats (TRs), also known as variable-number tandem repeats (VNTRs), which are short direct repeats organized as head-to-tail arrays and classified as micro-, mini- or macrosatellites, depending on their repeat unit size 36 . The first VNTRs described in mycobacteria were those identified in M. tuberculosis, collectively referred to as mycobacterial interspersed repetitive units (MIRUs) 37 . Although the panel of MIRU-VNTR loci has been modified over the years, the standardized 24-locus MIRU-VNTR typing is currently recognized as the reference typing system for the M. tuberculosis complex [38][39][40] . VNTR-based typing involves PCR amplification of a predefined set of VNTR loci, using primers that anneal to their flanking sequences, followed by amplicon size determination to deduce the number of TR units within each locus. The result is a numerical code, which corresponds to the numbers of repeats at each locus and serves as a strain's molecular fingerprint. All this renders multiple-locus VNTR analysis (MLVA) robust, easy to perform, and time- and cost-effective. The portability of the results (MLVA codes) facilitates their storage and exchange between laboratories. More importantly, the discriminatory power of MLVA is often high, surpassing that of other typing schemes 41,42 .
VNTRs have also been described in several NTM species, including M. abscessus, M. avium, M. intracellulare, and M. marinum 25,27,29,[43][44][45][46] . However, until this study, the only evidence for the existence of VNTR-like loci in M. kansasii came from a 30-year-old study by Hermans et al. 47 . DNA homologous to a major polymorphic tandem repeat (MPTR), originating from M. tuberculosis, was shown to be present in M. kansasii and M. gordonae 47 . In this work, VNTRs were searched for directly in the M. kansasii genomes. Based on the applied criteria, a total of 24 VNTR loci were identified in the genome of the M. kansasii ATCC 12478 reference strain. Only 17 of these were used for typing the study isolates. VNTR locus 19 was efficient in differentiating between all M. kansasii types except type III, whose isolates failed to produce a PCR product. Nevertheless, given the rarity of type III, PCR amplification of VNTR locus 19 offers an alternative to the lengthier, standard PCR-REA assays for discrimination of M. kansasii genotypes, including the two most prevalent clinical types, I and II.
The difference in the VNTR 19 copy number for M. kansasii type V between gel sizing (i.e. 8 copies) and direct sequencing (i.e. 7 copies) may be due to miscalculation of the band size on the gel. Inaccurate allele calling is one of the major nontechnical errors affecting the reproducibility of VNTR typing 48 .
As far as evolutionary genomics is concerned, mycobacteria are remarkably homogeneous at the molecular level. This is also reflected by the slow molecular clock of VNTR loci. For instance, in M. tuberculosis, the mean mutation rate per VNTR locus per year was estimated at 10⁻⁴, which translates into a mean of 0.05 pairwise (single-locus) changes per 24-locus genotype expected in sets of isolates originating from the same clone over 10 years 57 25,44 . Given the long collection time of the study sample (16 years) and the fact that the isolates were of different origins (both environmental and clinical) and from different geographical locales, the overall high clustering rate (CR) reflects genome homoplasy, which is a product of convergent evolution, rather than ongoing active transmission. Interestingly, isolate 2193.11 had a VNTR pattern somewhat different from those of the other subtype II isolates at loci 1, 2, 3 and 4. Its unique VNTR profile might be associated with its geographical origin (this isolate was the only subtype II isolate from Poland).
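As a rough check on the M. tuberculosis figure quoted above, the expected number of pairwise single-locus changes can be approximated from the per-locus rate. This is only a sketch under a stated assumption, namely that two isolates descending from the same clone accumulate changes independently, so that both lineages contribute over the full period; the symbols μ (mutation rate per locus per year), L (number of loci) and t (years) are introduced here for illustration and do not appear in the cited work.

```latex
\begin{align*}
E[\text{pairwise changes}] &\approx 2 \,\mu\, L\, t \\
  &= 2 \times 10^{-4} \times 24 \times 10 \\
  &= 0.048 \approx 0.05
\end{align*}
```

Under this assumption, identical 24-locus profiles are by far the most likely outcome for recently related isolates, which is why identical profiles alone cannot establish recent transmission in a sample collected over many years.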
Since the mid-1990s, PFGE has been the mainstay of molecular typing of NTM. Although a number of methods have been introduced over these two decades, PFGE still holds the leading position as a typing scheme in molecular epidemiological studies of NTM diseases, including those due to M. kansasii 33,58 . This is because PFGE outperforms most of the later-invented typing methods in terms of discriminatory potential 5,59 .
The results of this study show PFGE to be as discriminatory as in the study by Zhang et al. 13 . In the three PFGE assays performed in that study with different enzymes (XbaI, DraI, and AseI), the GDI was similar between the assays and much the same as in our analysis (0.39-0.49 vs 0.49) 13 . Much lower GDI values for M. kansasii PFGE typing were reported by other authors, even when using the same enzymes, AseI (GDI = 0.25) or DraI (0.04-0.23), as well as with SpeI (GDI = 0.18) [5][6][7]19 .
As for the clustering, our results were again close to those of Zhang et al. 13 (CR = 68% vs 61-69%), albeit rather distant from those of the remaining four aforesaid studies, in which the clustering rate ranged from 88.3 to 99.3% [5][6][7]19 . These differences may be explained by geographical- and/or population-related specificities of the study samples. Some technicalities might also be at play. Since PFGE depends chiefly on DNA quality, the typing results can be influenced by the method of DNA isolation, the electrophoresis/running conditions, and the pulsotype interpretative criteria applied (e.g. correlation algorithms, cutoff values). Whereas PFGE is the most powerful typing system for M. kansasii, it is time-consuming, labor-intensive, and resource- and expertise-demanding. Thus, a new, simple, cost-effective and high-throughput typing method would be of great advantage. All these criteria are met by the VNTR typing designed in this study. Yet, the discriminatory ability of MLVA was lower than that of PFGE analysis. This was apparent from both the GDI (0.35 vs 0.49) and HGDI (0.66 vs 0.95) scores.
Encouragingly, when the two methods were used together, the resolution power of the combination increased over that of PFGE alone, as reflected by both diversity and discriminatory indexes (GDI = 0.49 vs 0.67; HGDI = 0.95 vs 0.97). At the same time, the clustering rate decreased noticeably, by 31 and 19 percentage points when compared with MLVA and PFGE alone, respectively (49% vs 80% and 68%). These observations prompted us to propose a two-step typing strategy for M. kansasii, which involves MLVA as a first screening method, performed on the entire study sample, followed by PFGE profiling, performed only within the VNTR-defined clusters.
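The two-step strategy proposed above can be sketched as follows. This is a minimal illustration, not the workflow used in the study: the function name `two_step_typing`, the `pfge_typing` callback and the isolate labels and codes are all hypothetical, and PFGE typing is assumed to return one pulsotype label per isolate.

```python
from collections import defaultdict


def two_step_typing(vntr_codes, pfge_typing):
    """Step 1: group isolates by VNTR code. Step 2: run PFGE only inside clusters.

    vntr_codes  -- dict mapping isolate name to its VNTR code (a tuple of copy numbers)
    pfge_typing -- hypothetical callable: list of isolates -> dict isolate -> pulsotype
    Returns a dict mapping isolate name to (VNTR code, pulsotype or None).
    """
    by_code = defaultdict(list)
    for isolate, code in vntr_codes.items():
        by_code[code].append(isolate)

    result = {}
    for code, members in by_code.items():
        if len(members) == 1:
            # Unique by MLVA: no PFGE needed for this isolate.
            result[members[0]] = (code, None)
        else:
            # VNTR-defined cluster: resolve further by PFGE.
            pulsotypes = pfge_typing(members)
            for isolate in members:
                result[isolate] = (code, pulsotypes[isolate])
    return result


# Invented example data (shortened 3-locus codes for readability).
vntr_codes = {"A": (3, 2, 5), "B": (3, 2, 5), "C": (3, 2, 5), "D": (4, 2, 5)}
pfge_lookup = {"A": "X", "B": "X", "C": "Y"}
result = two_step_typing(
    vntr_codes, lambda isolates: {i: pfge_lookup[i] for i in isolates}
)
print(result)
```

In this toy input, isolate D is already unique after MLVA and never reaches PFGE, while the three-isolate VNTR cluster is split by PFGE into a two-isolate cluster (A, B) and a unique isolate (C).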
In conclusion, this study delivers a new typing scheme, based on VNTR polymorphisms, and recommends it as a first-line test prior to PFGE analysis in a two-step typing strategy for M. kansasii. This strategy, though requiring evaluation against large-scale samples, offers a promising tool for mapping outbreaks and delineating transmission patterns of M. kansasii infections.

Isolates.
A total of 67 M. kansasii isolates, representing six subtypes of the species (I-VI), were included in the study (Suppl. Table 1). The isolates were purchased from the American Type Culture Collection (ATCC) (n = 1) or collected between 2000 and 2015 from Poland (n = 51), the Netherlands (n = 7), the Czech Republic (n = 3), Spain (n = 2), Belgium (n = 1), Germany (n = 1) and Italy (n = 1). Sixty-two isolates were of clinical origin and represented as many unrelated patients diagnosed as having (or not having) M. kansasii disease, according to the criteria of the American Thoracic Society (ATS) 3 . Five isolates were recovered from different environmental sites (Suppl. Table 1).
The isolates were identified as M. kansasii by using high pressure liquid chromatography (HPLC) methodology, in accordance with the Centers for Disease Control and Prevention (CDC) guidelines 60 or by means of the GenoType Mycobacterium CM/AS assay (Hain Lifescience, Nehren, Germany).
PFGE. The PFGE analysis was performed as described previously by Kwenda et al. 19 with modifications (Suppl. Materials and Methods).
The gel images were analyzed using BioNumerics ver. 5.0 software (Applied Maths, Sint-Martens-Latem, Belgium). Three molecular-weight size marker (MWSM) lanes in each 15-well gel enabled normalization within and across gels.
The cosine correlation algorithm was used to define PFGE profiles. Band positions were assigned manually, with computer assistance, and the band tolerance was set at 2%. Two isolates exhibiting >80.4% profile similarity were considered clonal. This cut-off value was derived empirically from an analysis of several PFGE profiles of the same isolate obtained in two independent PFGE assays.
Search for VNTR loci and VNTR typing. The whole-genome sequence of the M. kansasii ATCC 12478 reference strain (GenBank, NCBI, Reference Sequence: NC_022663.1) was screened for repetitive DNA elements with Tandem Repeats Finder version 4.00 64 and visualized with the Vector NTI Software (Thermo Fisher Scientific, Waltham, USA). The results were filtered on the basis of the following criteria: (i) minimum and maximum fragment size obtained on agarose gel: 100 bp and 2,000 bp, respectively; (ii) minimum number of repeat units: 4.5. Twenty-four loci were then selected on the basis of 93% conservation between the VNTRs.
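The size and copy-number criteria above lend themselves to a simple filter over tandem-repeat records. The sketch below is illustrative only: the `TandemRepeat` record, its field names and the `flank_bp` primer-flank allowance are assumptions made here, not the output format of Tandem Repeats Finder, and the 93% conservation criterion is not modelled.

```python
from dataclasses import dataclass


@dataclass
class TandemRepeat:
    """Hypothetical record for one tandem-repeat array found in a genome."""
    start: int       # first base of the array (1-based)
    end: int         # last base of the array (inclusive)
    period_bp: int   # repeat unit size
    copies: float    # number of repeat units (may be fractional)


def passes_filters(tr, flank_bp=100, min_copies=4.5,
                   min_amplicon=100, max_amplicon=2000):
    """Apply criteria (i) and (ii); flank_bp is an assumed flank per primer side."""
    array_bp = tr.end - tr.start + 1
    amplicon_bp = array_bp + 2 * flank_bp
    return tr.copies >= min_copies and min_amplicon <= amplicon_bp <= max_amplicon


# Invented candidates: one acceptable, one with too few copies, one too long.
candidates = [
    TandemRepeat(1000, 1264, 53, 5.0),
    TandemRepeat(5000, 5059, 20, 3.0),
    TandemRepeat(9000, 11499, 100, 25.0),
]
kept = [tr for tr in candidates if passes_filters(tr)]
print(kept)
```

Only the first invented candidate survives: the second has fewer than 4.5 repeat units, and the third would give an amplicon longer than 2,000 bp.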
The flanking sequences of each VNTR locus, determined in the M. kansasii ATCC 12478 reference strain and several other M. kansasii clinical strains 65,66 with the CLC Genomics Workbench Software 8 (Qiagen, Nehren, Germany), were used to design oligonucleotide primers and PCR protocols with the Vector NTI Software (Thermo Fisher Scientific, Waltham, USA) (Suppl. Table 2). The PCR reactions were performed with a TopTaq Master Mix kit, as recommended by the manufacturer (Qiagen, Hilden, Germany), with 50 ng of template DNA in a final volume of 25 µL. After initial denaturation at 95 °C for 3 min, the reaction mixture was run through 35 cycles of denaturation at 95 °C for 30 s, annealing at a temperature specific for the particular VNTR locus (Suppl. Table 2) for 30 s, and extension at 72 °C for 60 s, followed by a final extension at 72 °C for 5 min. The amplicons were separated electrophoretically at 3.5 V/cm in 1% agarose gels in 0.5× TBE buffer and visualized by staining with ethidium bromide (0.5 μg/mL) and exposure to UV light (λ = 320 nm).
To assign the number of repeats (i.e. the allele) at each VNTR locus from the amplicon size, an amplicon-length-based allele calling table was used (Suppl. Table 3). The table was configured following computer-assisted analysis of the VNTR sequences of the M. kansasii ATCC 12478 reference strain. Briefly, the in silico-deduced number of repeats at each locus was rounded to the closest integer value. Repeat numbers below or above this value were assigned to amplicon lengths calculated by subtracting or adding a multiple of the repeat size at a given locus from or to the amplicon size determined for the M. kansasii ATCC 12478 reference strain.
For each isolate, the final result was a 17-digit VNTR code, corresponding to the number of repeats at each VNTR locus.
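The allele-calling logic described above can be sketched in a few lines. This is an illustration under stated assumptions rather than the published table: the function names, the example sizes (500 bp reference amplicon, 53 bp repeat, 5.3 in silico copies) and the nearest-size matching rule are all hypothetical, and half values are assumed to round up.

```python
import math


def build_allele_table(ref_amplicon_bp, ref_copies, repeat_bp, max_copies=20):
    """Map copy number -> expected amplicon size for one locus.

    The reference copy number is rounded to the closest integer (halves round
    up, an assumption), and other alleles are obtained by adding or subtracting
    whole repeat units from the reference amplicon size.
    """
    ref_int = math.floor(ref_copies + 0.5)
    table = {}
    for copies in range(max_copies + 1):
        size = ref_amplicon_bp + (copies - ref_int) * repeat_bp
        if size > 0:  # skip physically impossible sizes
            table[copies] = size
    return table


def call_allele(observed_bp, table):
    """Return the copy number whose expected size is closest to the observed band."""
    return min(table, key=lambda copies: abs(table[copies] - observed_bp))


# Invented locus: 500 bp reference amplicon, 5.3 copies in silico, 53 bp repeat.
table = build_allele_table(ref_amplicon_bp=500, ref_copies=5.3, repeat_bp=53)
calls = tuple(call_allele(bp, table) for bp in (560, 450))
print(calls)
```

Repeating the call for each of the 17 loci and concatenating the copy numbers in a fixed locus order would give the per-isolate VNTR code. For short repeats, neighbouring alleles differ by only a few base pairs, so nearest-size matching on a gel becomes unreliable.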
The VNTR copy number for loci containing repeats of 20 bp or less (i.e. VNTR 19, 20 and 24) was verified for representatives of the M. kansasii subtypes by Sanger sequencing (Suppl.
Cluster definition. A PFGE cluster was defined as two or more isolates assumed to be clonal, based on the 80.4% cutoff value for the similarity between two PFGE patterns. A VNTR cluster was defined as two or more isolates sharing identical 17-locus VNTR typing profiles.
A PFGE-VNTR combined cluster was defined as two or more isolates whose PFGE patterns met the 80.4% similarity cutoff and whose 17-locus VNTR typing profiles were identical.
The clustering rate was defined as the percentage of clustered isolates among all isolates genotyped 67 .
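Given one pattern label per isolate, the cluster definitions and the clustering rate above reduce to simple counting. The sketch below assumes that each isolate has already been assigned a pattern label (for PFGE, after applying the similarity cutoff); the function name and labels are hypothetical. The example input reuses the totals reported for the combined analysis (11 clusters with 33 isolates, 34 unique patterns), but the even split of three isolates per cluster is invented purely for illustration.

```python
from collections import Counter


def clustering_summary(patterns):
    """Summarise clusters from a list holding one pattern label per isolate."""
    counts = Counter(patterns)
    # A cluster is any pattern shared by two or more isolates.
    cluster_sizes = [n for n in counts.values() if n >= 2]
    clustered = sum(cluster_sizes)
    return {
        "patterns": len(counts),
        "clusters": len(cluster_sizes),
        "clustered_isolates": clustered,
        "unique_patterns": len(counts) - len(cluster_sizes),
        "clustering_rate": clustered / len(patterns),
    }


# 11 clusters of 3 isolates each (invented sizes) plus 34 unique patterns.
patterns = [f"cluster_{i}" for i in range(11) for _ in range(3)]
patterns += [f"unique_{i}" for i in range(34)]
summary = clustering_summary(patterns)
print(summary)
```

With these totals, the sketch yields 45 distinct patterns and a clustering rate of 33/67, i.e. about 49%.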

Construction of dendrograms. Similarities were calculated by the Cosine coefficient (PFGE, and PFGE and VNTR typing combined) or Pearson's correlation coefficient (VNTR typing) algorithms. Dendrograms were constructed with BioNumerics ver. 5.0 software (Applied Maths, Sint-Martens-Latem, Belgium) by using the unweighted pair-group method with arithmetic averages (UPGMA) with 2% band position tolerance.
Calculation of discriminatory power. As numerical indices of the discriminatory power of each typing method, the genetic diversity index (GDI) and the Hunter and Gaston discriminatory index (HGDI) were used. The GDI was calculated, for each typing method, as the quotient of the total number of genetic patterns to the total number of isolates. The HGDI was calculated with the following formula:

$$\mathrm{HGDI} = 1 - \frac{1}{N(N-1)} \sum_{j=1}^{s} n_j (n_j - 1)$$

where N is the total number of isolates, s is the number of types, and n_j is the number of isolates representing each type 68 .
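Both indices can be computed directly from a list holding one pattern label per isolate. The following is a minimal sketch that uses the GDI definition given above and the standard Hunter-Gaston expression for the HGDI; the function names and the six-isolate example are hypothetical.

```python
from collections import Counter


def gdi(patterns):
    """Genetic diversity index: number of distinct patterns / number of isolates."""
    return len(set(patterns)) / len(patterns)


def hgdi(patterns):
    """Hunter-Gaston discriminatory index for a list of pattern labels."""
    n_total = len(patterns)
    if n_total < 2:
        raise ValueError("HGDI needs at least two isolates")
    counts = Counter(patterns)
    # Sum n_j * (n_j - 1) over all types, then normalise by N * (N - 1).
    same_type_pairs = sum(n * (n - 1) for n in counts.values())
    return 1 - same_type_pairs / (n_total * (n_total - 1))


# Invented example: six isolates falling into three patterns.
patterns = ["a", "a", "a", "b", "b", "c"]
print(gdi(patterns), hgdi(patterns))
```

For this toy input the GDI is 3/6 = 0.5 and the HGDI is 1 - 8/30, about 0.73. The HGDI can be read as the probability that two isolates drawn at random without replacement have different types, which is why it depends on how evenly isolates are spread across types, not just on how many types there are.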