Rapid high-resolution melting genotyping scheme for Escherichia coli based on MLST derived single nucleotide polymorphisms

Routinely used typing methods including MLST, rep-PCR and whole genome sequencing (WGS) are time-consuming, costly, and often low throughput. Here, we describe a novel mini-MLST scheme for Escherichia coli as an alternative method for rapid genotyping. Using the proposed mini-MLST scheme, 10,946 existing STs were converted into 1,038 Melting Types (MelTs). To validate the new mini-MLST scheme, in silico analysis was performed on 73,704 strains retrieved from EnteroBase, resulting in discriminatory power D = 0.9465 (CI 95% 0.9726–0.9736) for mini-MLST and D = 0.9731 (CI 95% 0.9726–0.9736) for MLST. Moreover, validation on clinical isolates was conducted, with significant concordance between mini-MLST and MLST, rep-PCR and WGS. To conclude, the great portability, efficient processing, cost-effectiveness, and high throughput of mini-MLST represent immense benefits, even when accompanied by a slightly lower discriminatory power than other typing methods. This study showed that mini-MLST is well suited to screening and subgrouping large sets of isolates and/or quick strain typing during outbreaks. In addition, our results showed its suitability for prospective surveillance monitoring of emergent and high-risk E. coli clones.

www.nature.com/scientificreports/

Bacterial strain typing using HRM detection of single nucleotide polymorphisms (SNPs), called mini-MLST or minim typing, has already been successfully applied to Klebsiella pneumoniae 5, Staphylococcus aureus 6, Enterococcus faecium 7 and Streptococcus pyogenes 8. These methods are based on detecting allele-specific SNPs derived from well-established MLST schemes used worldwide. Regarding E. coli, HRM-based methods have already been used for multiple purposes such as species identification 9, detecting and quantifying enterotoxigenic virulence factors 10, detecting AmpC, ESBL and carbapenemase genes 11 and, in combination with ligation-mediated real-time PCR, molecular typing on a local level 12.
Here we describe a novel mini-MLST scheme for E. coli as an alternative method for rapid genotyping suitable for routine clinical practice. We compared the proposed mini-MLST with commonly used molecular typing methods such as rep-PCR and MLST on an outbreak of E. coli at the Neonatal Department at the University Hospital Brno (UHB). In addition, our novel mini-MLST scheme was also compared to WGS on E. coli strains collected during a surveillance study at the Department of Hematology and Oncology (UHB).

Results
Method design. Firstly, all SNPs in the E. coli MLST 13 loci were identified using Minimum SNPs software 14 .
We used our in-house MLST2MELT software to predict mini-MLST alleles for all 6 regions (Table 1).

Table 1 (partial; only these entries are recoverable):
[...]-368; adk322 FW, GGC ATC AAT GTT GAT TAC GTTC; 122; 43, 44, 45, 46, 47, 48, 49, 52; adk322 RV, GGC GGA TTG AAT TTA ACG T
fumC; 227-390; fumC327 FW, CTG CGC AAG CAA CTC ATT C; 152; 50, 51, 52, 55, 56, 57

[...] Fig. 1. The corresponding difference curves are shown in Fig. 2. The corresponding melting peaks can be found in Supplementary Fig. S1. Representatives from each obtained HRM curve were sequenced using MLST primers to determine the GC content in specific mini-MLST loci. The GC values were subsequently used to identify mini-MLST alleles. The melting temperature (Tm) values for each mini-MLST allele were calculated (Table 2). The HRM curves from four isolates had a non-standard shape and differed from the remaining HRM curves in at least one mini-MLST locus (an example of non-standard HRM curves can be found in Supplementary Fig. S2). Using Sanger sequencing, we found these isolates to be a mixture of at least two different strains, and they were therefore excluded from further analyses. MelTs were assigned to each isolate based on the acquired HRM curves and our MelT conversion key. From 165 isolates, 34 different MelTs were determined (Table 3). To correlate MelTs and STs, a subgroup of 110 isolates including at least one isolate of each obtained MelT was subjected to complete MLST. Those 110 isolates [...]

Mini-MLST as a tool for outbreak investigation. An increased incidence of extended-spectrum β-lactamase (ESBL)-producing E. coli was observed in March and May 2016 at the Neonatal Department (UHB). In total, 15 ESBL E. coli isolates were isolated from blood cultures, rectal swabs, and neonates' urine. Environmental swabs (room, baths, incubators), health-worker swabs and breast milk samples (sterilized prior to administration to the neonate) were also tested. A single ESBL E. coli isolate was recovered from breast milk and none from the environmental and health-worker swabs.
All isolates were subjected to molecular typing analysis using rep-PCR, MLST and mini-MLST. As a control, four ESBL E. coli urine isolates from different departments at the UHB were added to our analyses. The isolates were differentiated into 4 rep-profiles, 4 STs and 4 MelTs (Fig. 3), indicating 100% concordance between the methods; none of the methods provided higher discriminatory power than the others. Based on typing, the breast milk isolate was identical to the isolates recovered from the neonates. As a result, an immediate sterilizer inspection was performed, and crucial technical damage resulting in its impaired function was discovered. After replacing the sterilizer, no further ESBL E. coli cases in breast milk were observed.

WGS typing, MLST, rep-PCR and mini-MLST concordance. The concordance of WGS, MLST, rep-PCR and mini-MLST and a comparison of their discriminatory power were tested on a subset of isolates obtained during a local epidemiological study conducted at the Department of Internal Medicine-Hematology and Oncology, UHB between 5/2019 and 7/2019. In total, 21 ESBL E. coli isolates were obtained from 14 patients. The 21 isolates were differentiated into 14 WGS clusters (cut-off for clustering isolates together was set to 10 allele differences in a total of 4,637 analyzed genes), 11 STs, 11 MelTs and 12 rep-profiles (Fig. 4). While the ST and mini-MLST results were in exact concordance, the rep-PCR and WGS data analysis divided samples belonging to ST58 into two different clusters. In addition, WGS further sorted ST131 into three WGS clusters against one MelT and one rep-profile.

Discussion
MLST and PFGE are still considered the gold standards for molecular typing. However, the development and application of WGS-based techniques are rising, along with a reduction in their costs. WGS has unsurpassed discriminatory power compared with other typing methods but is also the most demanding approach in terms of difficulty and data analysis. In contrast, HRM-based methods are extremely cheap, fast and at the same time very robust and easily portable between laboratories. HRM has already been effectively used to detect and identify antimicrobial resistance, to screen and identify target mutations, and to evaluate bacterial population structure and genetic diversity 4. Mini-MLST typing schemes have already been successfully validated for K. pneumoniae 5, S. aureus 6, E. faecium 7 and S. pyogenes 8. Regarding E. coli, the use of HRM methods has been increasing over time: apart from a number of methods designed for species identification, methods to quantify virulence and resistance genes are currently available [9][10][11]. In terms of molecular typing, there are two methods available. The first one is designed to identify ST131 as an internationally spread high-risk clone 16. This method is only capable of distinguishing ST131 from non-ST131 strains, which is not sufficient in most cases. The second method is based on ligation-mediated real-time PCR followed by HRM 12. While it retains the advantages of HRM (speed, cost, low labor intensity), its main disadvantages remain reproducibility and transferability, due to the lack of support within a globally recognized scheme (e.g., MLST). Although this method may be used for direct isolate comparison during a local outbreak investigation, it is not suitable for comparing large sample sets, long-term studies, or inter-laboratory studies.
The mini-MLST approach is based solely on the GC content of the target locus. Therefore, different sequences with the same GC content are practically indistinguishable using HRM analysis alone. Specifically, this means that hundreds of MLST alleles are converted into just a few mini-MLST alleles (typically 3-10), which is reflected in the lower discriminatory power of mini-MLST compared with MLST. However, the major advantages of mini-MLST are cost-effectiveness, rapid performance, robustness, and great reproducibility accompanied by lower analytical complexity, resulting in straightforward interpretation (Table 4) 2,17,18. The total price per isolate is approximately $5, which is substantially lower than for the majority of other typing methods (e.g., $50 per complete MLST, $150 per WGS). The whole analysis including results evaluation takes about 2.5 h (excluding DNA isolation). Moreover, during this study we showed that it is possible to optimize the reaction mixture and temperature to be identical for all existing mini-MLST schemes. This is a considerable advantage as it allows our laboratory to simultaneously type up to 48 isolates of different bacterial species (using 384-well plates).

Table 3. mini-MLST and MLST genotyping results of 110 selected E. coli isolates.
Considering mini-MLST's portability, our approach may accelerate typing in other laboratories and is particularly suitable for larger laboratories with a significant number of isolates to be analyzed. Mini-MLST's robustness is based on unique HRM curves produced by fragments with different GC content while using an optimal fragment length (70-200 bp). For longer fragments, the impact of GC content differences on Tm may be reduced and the HRM curves may not be clearly distinguishable 6. Since mini-MLST is derived from the globally recognized MLST, it has great portability and, together with its high reproducibility, can be implemented in laboratories worldwide without substantial effort.
To compare mini-MLST with the routinely used MLST and to validate mini-MLST for typing large sample sets, we performed an in silico analysis on 73,704 strains retrieved from the EnteroBase database 15. With a D value of 0.9465 (CI 95% 0.9726-0.9736) for mini-MLST and a D value of 0.9731 (CI 95% 0.9726-0.9736) for MLST, mini-MLST proved to have discriminatory power comparable to MLST and to be a suitable method with sufficient discriminatory power for large population studies and long-term screening. Mini-MLST's validation in routine clinical practice was performed against rep-PCR, MLST and WGS on clinical isolates collected during the local outbreak and surveillance study at the UHB. During the local epidemiological study at the Department of Internal Medicine, all typing methods were compared not only with each other but also evaluated against WGS, as it provides the highest currently achievable discriminatory potential. This comparison resulted in 11 MelTs, 11 STs, 12 rep-profiles and 14 WGS clusters. Overall, the proposed mini-MLST typing scheme showed great correlation with all three aforementioned methods, in addition to the essential advantages mentioned above. In this case, however, the limited number of isolates needs to be taken into account.
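The D values above can be illustrated with a short calculation. The sketch below assumes that D is Simpson's index of diversity in the Hunter-Gaston form, which is the usual convention in typing studies; the text does not state the formula explicitly. The function name and the toy type labels are hypothetical and are not taken from the EnteroBase data.

```python
from collections import Counter


def simpson_diversity(type_labels):
    """Simpson's index of diversity (Hunter-Gaston form).

    D = 1 - sum(n_j * (n_j - 1)) / (N * (N - 1)),
    where n_j is the number of isolates of type j and N the total.
    """
    n_total = len(type_labels)
    if n_total < 2:
        raise ValueError("at least two isolates are required")
    counts = Counter(type_labels).values()
    same_type_pairs = sum(c * (c - 1) for c in counts)
    return 1.0 - same_type_pairs / (n_total * (n_total - 1))


# Hypothetical toy data: the same five isolates typed by two methods.
# The coarser method merges types "b" and "c" into a single type "y".
mlst_types = ["a", "a", "b", "c", "d"]
mini_mlst_types = ["x", "x", "y", "y", "z"]

d_mlst = simpson_diversity(mlst_types)
d_mini = simpson_diversity(mini_mlst_types)
print(f"MLST D = {d_mlst:.4f}, mini-MLST D = {d_mini:.4f}")
```

In this toy example the method that merges types yields the lower D, which mirrors the direction of the difference reported between mini-MLST and MLST.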
Due to the expanding use of next-generation sequencing (incl. WGS), new alleles and/or their combinations (new STs) are being discovered almost daily. To be able to type strains from the newest STs, the conversion key is updated monthly. The current version of the conversion key is available at http://www.cmbgt.cz/mini-mlst/t6353. Of the 10,946 STs known as of 3/2021, only 39 STs are marked as MelT0, i.e., with no specified MelT. In most cases, this is caused by the absence of a specific allele in the source MLST database. In this case, we cannot predict the mini-MLST allele (number of GC bases) and thus cannot determine the specific MelT. Even though the conversion key contains an indication of the missing allele (marked as a -3 error in the conversion key) and the result is MelT0, the HRM curves can still be obtained. Thus, MelT0 does not necessarily mean that the isolate is impossible to type. If the particular allele is added to the source data, the change will be reflected in the conversion key after the next update.
Mini-MLST can be used not only to resolve an acute epidemiological situation as a rapid typing method, but also for prospective monitoring of high-risk bacterial strains' occurrence. This is possible due to the correlation between the biological properties and the ST/genotype, previously described for E. coli high-risk clones belonging to ST38 (MelT669), ST69 (MelT387), ST73 (MelT737), ST95 (MelT738), ST131 (MelT653), ST155 (MelT175), ST393 (MelT343), ST405 (MelT923), ST410 (MelT166) and ST648 (MelT359) [19][20][21] . At the same time, the better we know the local bacterial population structure and its properties, the better we can respond to the occurrence of new or emergent virulent and/or multidrug-resistant strains. A two-step approach using mini-MLST can be advantageously used for both prospective surveillance and retrospective molecular typing. Our results clearly showed that the isolates distinguished by mini-MLST are similarly distinguished by WGS-based typing. Therefore, only isolates from the same MelT should be processed to the next typing step during an outbreak investigation involving WGS. This will allow hospitals to concentrate focus and resources specifically on the outbreak strains and their subsequent in-depth typing. On the other hand, in studies characterizing a large bacterial population, it is advantageous both in terms of time and cost to select strains from different MelTs as it prevents sequencing identical strains and acquiring redundant information.
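The two selection strategies described above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the isolate names, the function names and the rule of picking the alphabetically first isolate as a representative are all hypothetical, and only the MelT-based grouping itself comes from the text.

```python
from collections import defaultdict


def group_by_melt(isolate_melts):
    """Group isolate names by their assigned MelT."""
    groups = defaultdict(list)
    for isolate, melt in isolate_melts.items():
        groups[melt].append(isolate)
    return dict(groups)


def outbreak_candidates(isolate_melts, index_isolate):
    """Outbreak scenario: keep only isolates sharing the index isolate's MelT."""
    target = isolate_melts[index_isolate]
    return sorted(name for name, melt in isolate_melts.items() if melt == target)


def population_representatives(isolate_melts):
    """Population study: one isolate per MelT (here simply the first by name)."""
    groups = group_by_melt(isolate_melts)
    return {melt: min(names) for melt, names in groups.items()}


# Hypothetical screening result: isolate name -> MelT number.
screen = {"iso1": 653, "iso2": 653, "iso3": 737, "iso4": 166}

to_sequence_outbreak = outbreak_candidates(screen, "iso1")
to_sequence_population = population_representatives(screen)
print(to_sequence_outbreak)
print(to_sequence_population)
```

In the outbreak case only the isolates that share a MelT with the index isolate go on to WGS; in the population case each MelT contributes a single isolate, so identical strains are not sequenced repeatedly.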
To conclude, mini-MLST has great portability, low labor intensity, great cost-efficiency and very high throughput, which represent immense benefits, even when accompanied by slightly lower discriminatory power than other typing methods. Our results showed that mini-MLST is well suited to rapid and cost-effective screening and subgrouping of large isolate sets and/or quick strain typing during outbreaks. In addition, it is also suitable for prospective surveillance monitoring of emergent and high-risk E. coli clones.

[...] 13. All genes were concatenated and aligned using MEGA7 software 22. The SNPs that change the percentage content of G + C were identified and selected for further analysis, as A ↔ T and C ↔ G nucleotide changes generally cannot be detected using HRM analysis. The mdh gene was excluded from further analyses as no significant SNPs were found within it.
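The SNP selection criterion described above (keep only SNPs that change G + C content) can be sketched as follows. This is an illustrative sketch, not the Minimum SNPs or MEGA7 workflow used in the study; the function names and the toy alignment are hypothetical, and gaps or ambiguity codes are simply treated as non-G/C characters.

```python
GC_BASES = {"G", "C"}


def changes_gc_content(ref_base, alt_base):
    """True if the substitution swaps a G/C base for an A/T base or vice versa."""
    return (ref_base.upper() in GC_BASES) != (alt_base.upper() in GC_BASES)


def gc_informative_positions(aligned_alleles):
    """Return 0-based alignment columns where alleles differ in G/C membership.

    Columns with only A<->T or C<->G variation are skipped, because such
    changes leave the G + C count of the fragment unchanged.
    """
    length = len(aligned_alleles[0])
    if any(len(allele) != length for allele in aligned_alleles):
        raise ValueError("alleles must be aligned to the same length")
    positions = []
    for i in range(length):
        is_gc = {allele[i].upper() in GC_BASES for allele in aligned_alleles}
        if len(is_gc) > 1:
            positions.append(i)
    return positions


# Hypothetical toy alignment of three alleles:
# column 1 has C/T (changes GC content), column 2 has G/C and
# column 4 has A/T (neither changes GC content).
toy_alleles = ["ACGTA", "ACCTA", "ATGTT"]
informative = gc_informative_positions(toy_alleles)
print(informative)
```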
The primer sets were designed using Primer3 v 0.4.0 (http://bioinfo.ut.ee/primer3-0.4.0/) and targeted conserved regions flanking previously identified SNPs (Table 1). Based on the available literature, the optimal amplicon length for HRM analysis ranges from 50 to 200 bp 5-8.

Mini-MLST scheme design. Predicting the HRM curve and assigning the melting type (MelT) were carried out using our in-house MLST2MELT software. Each analyzed locus sequence was processed as follows. The specific forward and reverse primers for each locus were located in the sequence using a simple regex search. The number of G and C bases between a pair of primers in every gene in the selected sequence was counted and stored in a table containing all analyzed sequence types. The rows of the table were then sorted by increasing number of G and C bases in every analyzed gene. Finally, a MelT number was assigned to each ST by assigning the number one to the first row of the table and increasing this number by one for every ST whose G and C base numbers differed from those of the previous ST. [...] Table S1.

DNA isolation. Genomic DNA (gDNA) was isolated using Chelex 100 Resin (Bio-Rad, USA). The overnight bacterial cultures were homogenized in 100 μL of 5% w/v Chelex 100 Resin by vortexing. The obtained suspensions were incubated for 10 min at 100 °C and then centrifuged for 2 min at 15,500 rcf. Each strain's supernatant containing gDNA was transferred into a clean microtube. For WGS, gDNA was purified using the GenElute Bacterial Genomic DNA Kit (SIGMA-ALDRICH, USA).
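The MelT assignment procedure described under the mini-MLST scheme design can be sketched as below. This is not the in-house software itself, only a minimal reading of the steps in the text. Several points are assumptions: the reverse primer is searched as its reverse complement, primer bases are excluded from the G + C count, rows are sorted lexicographically by the per-locus counts, and STs with a missing allele receive MelT 0. The primers, sequences and ST labels are hypothetical toy data.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def gc_between_primers(allele_seq, fw_primer, rv_primer):
    """Count G and C bases between the primer sites, or None if not found."""
    pattern = (re.escape(fw_primer.upper()) + "(.*?)"
               + re.escape(reverse_complement(rv_primer)))
    match = re.search(pattern, allele_seq.upper())
    if match is None:
        return None
    inner = match.group(1)
    return inner.count("G") + inner.count("C")


def assign_melts(gc_profiles):
    """Map each ST to a MelT number from its tuple of per-locus GC counts.

    STs whose profile contains None (allele missing) are given MelT 0.
    """
    complete = {st: p for st, p in gc_profiles.items() if None not in p}
    melts = {st: 0 for st in gc_profiles if st not in complete}
    melt_number = 0
    previous = None
    for st, profile in sorted(complete.items(), key=lambda item: item[1]):
        if profile != previous:  # new GC profile, so a new MelT
            melt_number += 1
            previous = profile
        melts[st] = melt_number
    return melts


# Hypothetical toy primers and allele: inner fragment is "GGCAT".
toy_gc = gc_between_primers("ACGTGGCATTCAA", "ACGT", "TTGA")

# Hypothetical GC profiles for two loci per ST.
toy_profiles = {
    "ST_a": (3, 5),
    "ST_b": (3, 5),
    "ST_c": (2, 7),
    "ST_d": (3, 6),
    "ST_e": (None, 5),
}
toy_melts = assign_melts(toy_profiles)
print(toy_gc, toy_melts)
```

Note that ST_a and ST_b share a GC profile and therefore collapse into one MelT, which is the mechanism by which many STs map onto fewer MelTs.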

Mini-MLST.
Mini-MLST was performed on a Bio-Rad CFX96 platform (Bio-Rad, USA). The reaction mixture contained 10 μL 2 × SensiFAST HRM mix (Meridian Bioscience, UK), 0.4 μM of each primer, 1 μL of gDNA (30 ng/µL) and deionized water to a final volume of 20 μL. Thermocycling parameters were: 95 °C for 3 min, 40 cycles of 95 °C for 5 s, 65 °C for 10 s and 72 °C for 20 s, followed by one cycle of 95 °C for 2 min and 50 °C for 20 s, terminated by HRM ramping from 72 to 88 °C, increasing by 0.1 °C at each step. The results were interpreted using the current version of our conversion key, which is available for free download at http://www.cmbgt.cz/mini-mlst/t6353. The conversion key is updated monthly.
Multilocus sequence typing. In total, 110 strains were selected for a full MLST sequencing scheme according to the protocol described by Wirth et al. 15. The current version of the E. coli MLST database is available at http://enterobase.warwick.ac.uk/species/ecoli/download_7_gene.

Rep-PCR.
To generate DNA fingerprint patterns, rep-PCR primers REP1R and REP2I were used 24, following previously described protocols 25. The PCR amplicons were analyzed using an Agilent 2100 Bioanalyzer (Agilent Technologies, USA), and the rep-profiles were determined using a previously described algorithm 26.