Novel blaCTX-M variants and genotype-phenotype correlations among clinical isolates of extended spectrum beta lactamase-producing Escherichia coli

The rapid emergence of multiresistant microbial pathogens, dubbed superbugs, is a serious threat to human health. Extended spectrum beta lactamase (ESBL)-producing Escherichia coli is a superbug causing worldwide outbreaks, necessitating timely and accurate tracking of resistant strains. Accordingly, this study was designed to investigate the spread of ESBL-producing Escherichia coli isolates, to analyze the effect of different genotypic and phenotypic factors on in vitro resistance patterns, and to assess the diagnostic value of commonly used ESBL genetic markers. For that purpose, we cultured 250 clinical isolates and screened their susceptibility to beta-lactam antibiotics. Among 12 antibiotics screened, only imipenem seems to have remained resilient. We subsequently analyzed the ESBL phenotype of Escherichia coli isolates and examined potential associations between their resistance phenotypes and patient-related factors. ESBL genotyping of 198 multiresistant isolates indicated that 179 contained at least one blaCTX-M gene. As we statistically dissected the data, we found associations between overall resistance and body site / type of disease. Additionally, we confirmed the diagnostic value of testing both blaCTX-M-1 and blaCTX-M-15 in providing better prediction of overall resistance. Finally, on sequencing the amplification products of detected blaCTX-M genes, we discovered two novel variants, which we named blaCTX-M-14.2 and blaCTX-M-15.2.


Results
Distribution of the clinical isolates. The study started by screening 250 clinical isolates recovered from different specimens (urine, stool, blood, pus, wounds, swabs, and drains) for antibiotic-resistant Escherichia coli. Out of the screened isolates, 198 were confirmed to be E. coli by standard laboratory methods.
In vitro antibiotic susceptibility pattern of the isolates. All isolates were screened for resistance against 12 antibiotics (Table S2) by the disc diffusion method. All isolates were resistant to penicillin and ampicillin. The next less effective antibiotic was the 2 nd generation cephalosporin, cefoxitin, to which 193 isolates (97.5%) appeared to be resistant. Regarding 3 rd generation cephalosporins, 142 isolates (71.7%) were resistant to cefotaxime and 153 isolates (77%) were resistant to ceftazidime, while 182 isolates (92%) were resistant to piperacillin and 105 isolates (53%) were resistant to piperacillin/tazobactam. Resistance to aztreonam, representing the monobactam class, was exhibited by 52% of the isolates. Finally, only five isolates (2.5%) were resistant to imipenem, representing the carbapenem class. This percentage of resistance was the lowest among all tested classes of beta-lactam antibiotics ( Fig. 2A,C and Table S1).
To allow further analyses and comparisons, we defined a resistance score (R Score) as the number of beta lactams to which an isolate is resistant-out of 12 beta lactams tested. We also attempted to cluster all isolates based on their resistance pattern, and the clustering pattern was presented as a heatmap (Fig. 2B) and an unweighted pair group method with arithmetic mean (UPGMA) tree (Fig. 3). The heatmap has the advantage of showing the overall trends and patterns within the isolates, while the tree has the advantage of showing major clusters, as well as rare patterns (e.g., isolate 1H06, resistant to all antibiotics but AMC). Both methods highlighted some dominant clusters, most prominent of which are four that we named w, x, y, and z (Figs 2B, 3). Cluster z is the largest, representing isolates resistant to all antibiotics but imipenem, while Clusters w and y represent isolates resistant to all but IPM/ATM and IPM/TZP, respectively. Cluster x was not among the largest, but the most alarming, representing isolates resistant to all 12 beta-lactams. esBL activity of the isolates. ESBL activity is usually expressed as synergism between the tested antibiotic and a mixture of amoxicillin/clavulanic acid. This is experimentally tested by disc diffusion and is typically exhibited as a keyhole (merged inhibition zone) formed between the tested antibiotic disc and a central amoxicillin/ clavulanic acid disc on a typical surface-inoculated plate as detailed in Materials and Methods ( Supplementary  Fig. S1A).
When this test was applied to all 198 E. coli isolates, 134 showed positive synergism, indicative of their ESBL activity. Some isolates were resistant to the five antibiotics, which suggested the possibility that they expressed more than one ESBL gene.
A positive ESBL activity for a particular isolate is also expressed as an increase in the inhibitory zone diameter of either cefotaxime or ceftazidime by ≥5 mm when tested in combination with clavulanic acid vs. its zone diameter when tested alone ( Supplementary Fig. S1B). When we used this method, 164 isolates showed positive results. This suggests that the combination disc method might be more sensitive than the double disc synergy (DDS) method.
Factors affecting variability in resistance scores. E. coli's natural habitat is the intestine; however, in disease cases, the bacteria are frequently isolated from extra-intestinal sites. We thus investigated whether there is a difference in the overall resistance between intestinal and extra-intestinal isolates. Although the median R scores of both groups were equal (10), their means were slightly different (9.85 for intestinal isolates vs. 9.35 for extra-intestinal isolates). However, that difference was marginally significant (t-test p value = 0.033, but Mann-Whitney p value was 0.44, Fig. 4A).
Next, we tested whether the site of isolation or the nature of the disease might have an effect on the level of resistance. Overall, isolates from different body sites had more or less comparable R score values, except those from urine specimens (Fig. 4B) R   198  198  193  191  182  174  160  153  142  105  103  5   I   0  0  4  7  5  10  26  15  23  53  21  2   S   0  0  1  0  11  14  12  30  33  40  74  Isolates are shown on the X-axis and antibiotics on the Y-axis. Both isolates and antibiotics are ordered by hierarchical clustering, reflected by the horizontal and vertical trees, respectively. Antibiotic abbreviations are used as tabulated in A. Four clusters of interest (w, x, y, and z) representing different resistance patterns are annotated at the bottom of the heatmap. The heatmap was created in R by the basic heatmap function. C. A stacked bar plot summarizing the disc diffusion resistance patterns of the 198 E. coli isolates to 12 antibiotics (Y axis). Antibiotics are sorted by the same order in which they are clustered in B.
www.nature.com/scientificreports www.nature.com/scientificreports/ any body site from which fewer than three specimens were collected. Analysis of Variance (ANOVA) showed a significant difference between the means of all samples (p value = 0.003), but post hoc tests using FDR or Holm-Bonferroni adjustments indicated that the only significant pairwise differences were between the mean R scores of urine vs. blood and urine vs. stool isolates (FDR-adjusted p values = 0.0058 for both pairs; Holm-Bonferroni-adjusted p values = 0.01 for urine vs. blood and 0.012 for urine vs. stool). This significant difference between blood and urine (both being extra-intestinal) might explain the lack of statistical significance in resistance between all extra-intestinal isolates vs. the more consistent intestinal isolates.
As for disease associations, we observed a higher total resistance in isolates from cancer patients compared to those from patients with other conditions (Mean R score difference = 1.22; t-test p value = 2.284 × 10 −8 ; Mann-Whitney p value = 2.266 × 10 −5 , Fig. 4C). However, because 15 out of 41 cancer patient isolates were from blood specimens whereas no blood specimens were obtained from non-cancer patients, it is possible that the higher resistance observed among these isolates is simply because they were from blood, possibly reflecting cases of secondary septicemia due to failed initial antibiotic therapy. Thus, we repeated the comparison between isolates from cancer and non-cancer patients after filtering out blood samples. Still, the differences between the R scores of these two classes of isolates remained significant (mean R score difference = 1.18, t-test p value = 2.96 × 10 −5 ; Mann-Whitney p value = 0.0009, Fig. 4D).
The most prominent difference between isolates from cancer and non-cancer patients was manifested by their resistance to the most effective antibiotic in this study, imipenem (Table 1). Four out of five imipenem-resistant and two out of two imipenem-intermediate isolates were from cancer patients, indicating an enrichment of resistance to that last-resort antibiotic in cancer patients (Chi-square p value ≤ 0.0001), whereas the distribution of imipenem-sensitive samples was 35:156 (cancer to non-cancer patients, Table 1).
Regarding patterns of resistance to specific antibiotics, we found that aztreonam resistance was enriched in isolates from blood and pus (100%), while the percentage of aztreonam-resistant isolates in urine and stool was 50%. Likewise, cefotaxime resistance was predominant in isolates from blood and pus (100%), but, in urine and stool, the percentage of resistant isolates was 50%.
Two more antibiotics (TZP and CAZ) had substantial resistance in most isolates, but with a significantly smaller proportion of resistant bacteria isolated from urine specimens, a significantly larger proportion of TZP-resistant bacteria from stool, and a significantly larger proportion of CAZ-resistant bacteria from blood and pus (Chi-square p values = 0.0001 and 0.0003, respectively, Table 2).
It is to be noted that we excluded in this statistical analysis any tissues from which two or fewer samples were isolated.  www.nature.com/scientificreports www.nature.com/scientificreports/ Antibiotic-antibiotic correlations. It is logically expected that bacteria may have similar susceptibility patterns to antibiotics belonging to closely related chemical classes. To test this hypothesis, we used the resistance data (Table S1) from all 198 samples to build a correlation matrix between different tested antibiotics ( Fig. 5), with exclusion of penicillin and ampicillin, to which all isolates were resistant-with no exception.
This analysis indicated that cefotaxime (CTX), aztreonam (ATM), cefuroxime (CXM), and ceftazidime (CAZ) were the most correlated. For instance, the Spearman's correlation coefficient r s between CTX and ATM was 0.589; between CTX and CAZ was 0.577; between CTX and CXM was 0.535; finally, CTX and TZP had a correlation coefficient = 0.510. All other Spearman's correlation coefficients were ≤0.5 (Fig. 5).
polymerase chain reaction (pCR) screening of beta-lactamase genes and genotype-phenotype correlation. For the determination of beta-lactamase genotype among the isolates, one-product and multiplex PCR assays were used to detect representative genes (as detailed in Materials and Methods). Different patterns of gene presence/absence were observed (Supplementary Table S1), and these patterns were used to generate a UPGMA tree leading to different clusters, the largest of which represents isolates positive for bla CTX-M-15 and bla TEM genes (Fig. 3).
Next, the isolates were screened by universal CTX primers (MA1/MA2 primers 11 ) to detect the presence of any bla CTX-M genes, then a multiplex PCR was performed to show which CTX-M group is present in each isolate. Afterward, primers for the most predominant members of each CTX-M group were designed and used for further screening (Supplementary Table S3).
One hundred seventy nine isolates were successfully amplified with the universal PCR primers (MA1 and MA2) 11 , which indicates that they contain at least one of the bla CTX-M genes. Those MA1/MA2-positive isolates were screened by a multiplex PCR to determine to which of the five CTX-M groups (1, 2, 8, 9,    , used together, seem to provide better prediction of overall resistance as well as resistance to specific beta lactams. Regarding overall resistance, the R score was found to be significantly higher in isolates in which either bla CTX-M-1 or bla CTX-M-15 was detected by PCR (Fig. 6). More interestingly, isolates that tested positive for both bla CTX-M-1 and bla CTX-M-15 had a significantly higher mean R score than those with a positive PCR result for only  Likewise, both primer pairs seemed to be needed for the best prediction of aztreonam (ATM), cefotaxime (CTX), and ceftazidime (CAZ) resistance (Table 3). For example, 62.2% of the ATM-sensitive isolates tested positive with only one of the primer pairs. Conversely, in the cases where PCR was positive with both primer pairs, only three isolates were still sensitive in spite of the presence of both genes (Table 3).
With cefotaxime resistance, the results were even more striking, as none of the 64 isolates that tested positive with both PCR primer pairs was actually sensitive to CTX. While 18 of the 33 CTX-sensitive isolates tested negative for both, 15 of these 33 isolates had one of the two genes (Table 3). Specifically, among the 15 sensitive isolates that tested positive to one gene, 13 (86.7%) were bla CTX-M-15 positive and the remaining two were bla CTX-M-1 positive. This isn't too surprising given the predominance of the M-15 alleles among our samples.
Resistance to CAZ followed a quite similar pattern to that of CTX in relation to PCR results. This, too, could be expected from the relatively high positive correlation between the two antibiotics (r s = 0.577, Fig. 5).
Universal primers for other beta-lactamases genes (TEM and SHV) were also used for beta-lactamase genotyping or gene screening. Only 132 isolates generated positive TEM PCR products, while 12 isolates were SHV-positive (Table S1). Of note, isolates containing more than one type of bla genes (e.g., bla TEM and bla CTX-M ) showed more phenotypic resistance than those containing only one bla gene (data not shown).  (Table S1).
Among the bla CTX-M-15 and bla CTX-M-14 like genes amplified in this study, we discovered and confirmed two novel alleles ( Supplementary Fig. S2), which we named bla CTX-M-15.2 and bla CTX-M-14.2 , respectively. Both novel alleles were confirmed by resequencing and deposited in NCBI (Accession numbers: KX013145.1 and KX013146.1, respectively).

Discussion
Escherichia coli is a natural inhabitant of the mammalian intestine, but is also recognized as one of the main causes of nosocomial, intestinal and extra-intestinal infections 12 , and a leading cause of morbidity and mortality over the world. According to the World Health Organization (WHO) 2015 report, E. coli causes disease to about 550 million people annually, of whom 230,000 eventually die 13 .
More importantly and worrisomely, E. coli is among the most likely bacteria to evolve into incurable superbugs in 2050, by developing resistance to every known antibiotic, posing a public health threat that is expected to exceed cancer's mortality 2 . Indeed, recent reports of colistin-resistant E. coli [14][15][16][17] raised alarm around the globe and ushered in the start of an era of pan-resistant bacteria.  www.nature.com/scientificreports www.nature.com/scientificreports/ Antimicrobial resistance in E. coli has gained much attention since it is the best studied Gram-negative organism in humans. Cases of developed resistance against beta-lactam antibiotics and others are increasingly documented, and several studies reported the increase of E. coli resistance to more than one drug or class of drugs 18 . Surveillance data show that resistance of E. coli is higher when antimicrobial agents have been used for longer durations, especially beta-lactam antibiotics 18 . This continuous exposure to antibiotics, which exerts selection on the bacteria, is boosted by the bacterial ability to acquire resistance factors from its neighbors (via horizontal gene transfer), thus allowing the emergence of strain variants that carry diverse resistance traits.
In agreement with global reports, an alarming increase in resistance to beta-lactam antibiotics (even to the extended-spectrum subclass) among clinical E. coli isolates is highlighted by the results of this study. Among 198 multi-resistant clinical isolates, 179 (90.4%) contained at least one bla CTX-M gene, of which 86.6% were positive for bla CTX-M-15 . A possible reason behind those high percentages is that many clinical isolates were recovered from the National Cancer Institute, in which high amounts of antibiotics are administered to patients pre-and post-operatively. Yet, even with exclusion of isolates from cancer patients, the percentage and distribution of resistance genes didn't change much, as detailed in the Results section (Fig. 4E).
This high frequency of finding bla CTX-M-15 -containing isolates agrees with previous reports on bla CTX-M-15 being the most common ESBL in the Middle East and North Africa 19 . This increasing predominance of the bla CTX-M-15 allele might be due to the powerful ability of its gene products (Ctx-M-15 and its variants) to hydrolyze ceftazidime, cefotaxime and aztreonam, which probably offers the bacteria a selective advantage especially when multiple antibiotics are concomitantly or consecutively prescribed.
Among 12 antibiotics screened in this study, only imipenem may be considered to have remained resilient; however, it will not be long until extended exposure to imipenem will have its impact on selecting for bacteria that are resistant to this last-resort antibiotic. Recent reports have already described the emergence of carbapenem resistance in Enterobacteriaceae 20 .
Of note, the pairwise correlation between bacterial resistance patterns to all antibiotics tested in this study was explored. Since all isolates were resistant to penicillin and ampicillin, it was not possible to test any correlations with these two relatively old antibiotics. On the other hand, a significant but partial pairwise correlation was found between the four antibiotics: CTX, ATM, CXM, and CAZ. This partial correlation reflects possible common mechanisms of resistance to these antibiotics, yet highlights that there are still discrepancies between how bacteria resist those beta lactams. Discrepancies could be due to slight differences in resistance mechanisms, differences in enzyme specificities, or potential chemical differences.
Associations were also explored between levels and patterns of resistance to different antibiotics and some clinical aspects, such as the site of isolation or the disease condition (cancer or non-cancer). When levels of resistance of bacteria from various sites of isolation were compared, there was no striking difference in R Scores, except with urine isolates, which had slightly but significantly lower R Scores. Yet, these urine isolates had remarkably the highest variance between their R scores (Fig. 4B,E).
When cancer was tested as a possible factor for higher risk of multiple resistance phenotypes, it was found as a significant predictor of R Score; however, this statistically significant difference is diluted by the difference of sampling sites, as many of the cancer patient isolates were recovered from blood and other tissues, while the majority of other isolates were from stool and urine. When non-cancer isolates were compared, the same pattern of lower resistance among urine isolates was observed.
It is still plausible to suggest a tendency of higher resistance in bacteria isolated from cancer patients, as they often suffer from chemotherapy-induced immunosuppression, which necessitates the use of multiple antibiotics in their therapy regimens. An example in this study is the detection of rare resistance to imipenem in only five isolates, four of which were from patients with cancer. This disproportionate distribution suggests that resistance to imipenem started developing in cancer patients, but not in other patients who were sampled in the study. On the other hand, as opposed to the effect of 'having cancer' as a predictor, we think there is not enough sample size or statistical power to fully explain the higher variance observed in urine samples. The major factor here seems to be whether the patient has cancer or not.
After phenotypic determination of susceptibility and resistance phenotypes, we resorted to molecular tools to explore the distribution of representative beta-lactamase genes, with focus on bla CTX-M genes and their variants. We found that bla CTX-M containing isolates were resistant to a wider range of beta-lactam antibiotics than those not containing a bla CTX-M gene (Table 3). This could be an indication that the presence of a bla CTX-M gene in a bacterium might be leveraged as a good biomarker for high resistance to beta-lactam antibiotics, and ought to be implemented in routine antibiotic susceptibility testing protocols.
A major goal of this study was to explore the extent of agreement/correlation between phenotypic and molecular assays (i.e., resistance phenotype-genotype correlation). For this purpose, more than one phenotype can be measured and considered. For example, phenotypic resistance to single antibiotics may be used as an indicator, especially third-and fourth-generation antibiotics, or those considered as last resort, such as imipenem. Another phenotypic indicator is the breadth of resistance to beta-lactams or the level of multiresistance to this class (expressed here as R score). Analyzing phenotype-genotype correlations is especially beneficial in the case of secondary mutations, which are sometimes overlooked, although they often play an important role in potentiating or attenuating resistance phenotypes.
A significant association between overall resistance to beta lactams and molecular tests was found with the use of a single CTX-M primer pair, but the use of two primer pairs (targeting bla CTX-M-1 and bla CTX-M-15 ) had unexpectedly higher predictive value of the overall resistance scores (R scores, Fig. 6).
This observation is bolstered by parallel genotype-phenotype clustering analysis (Fig. 3), which provides a good summary of the study and shows a partial but obvious phenotype-genotype agreement. Although the phenotype and genotype UPGMA trees (Fig. 3) do not fully superimpose, it is visually striking that most isolates positive to two bla CTX-M genes are within the highly resistant clusters (clusters x and z, Figs 2B, 3). On the other (2019) 9:4224 | https://doi.org/10.1038/s41598-019-39730-0 www.nature.com/scientificreports www.nature.com/scientificreports/ hand, only one isolate that had none of the screened resistance genes, 2A05 (labeled in green, Fig. 3), had a highly resistant phenotype (R score = 10.5 and part of Cluster z, Fig. 3).
Finally, perhaps the most impactful finding of this study, other than the genotype-phenotype correlations, is the determination of novel bla CTX-M gene variants, in spite of the vast predominance of bla CTX-M-15 among isolates. This discovery of two novel but rare alleles highlights the value of using a relatively large number of isolates, as the dominance of one bla CTX-M genotype (M-15 as reported here and in several studies from the region) may mask other novel variants. Thus using a large number of isolates (here, bla CTX-M was screened in 198 isolates and its gene was sequenced from 88 of them) is a key to finding novel variants, which might eventually spread and be the next dominant genotypes. The two novel alleles were named bla CTX-M-14.2 and bla CTX-M-15.2 . To avoid muddling the bla CTX-M nomenclature, we refrained from using new serial numbers until further phenotypic studies are conducted on whether the sequence variations offer any difference in the MIC, or in the enzymatic activity and enzyme kinetics.
In conclusion, using antibiotic sensitivity data for a couple of hundred clinical E. coli isolates, we were able to investigate a few clinical and molecular variables for their diagnostic value, and we discovered novel beta lactamase alleles. Additionally, we explored phenotype-phenotype and phenotype-genotype correlations within and between the isolates and the tested antibiotics. Future studies with larger data sets may be able to build networks and models to move from diagnosis to prediction of resistance emergence and hopefully interference with its pervasive dissemination.

Materials and Methods
ethical statement. All experiments and study protocols were in accordance with relevant guidelines, regulations, and international declarations for biomedical research ethics. All protocols, including sampling, handling biobanked samples, and analyzing anonymized patient records, were approved by the Ethics Committee of Faculty of Pharmacy, Cairo University (Approval# 923; year 2012 to AAR and MAA). Informed consent was obtained from all hospitalized patients or their legal guardians according to each hospital's ethical protocols. The authors had no access to any patient-identifying data, as the samples were obtained from hospital laboratories with anonymized copies of patient data that are only related to diagnosis, site of infection, and underlying diseases.
Microorganisms. Two hundred and fifty bacterial isolates were obtained from biobanked stocks in clinical laboratories of Qasr El-Ainy Educational Hospital, Abu-El-Rish Children Hospital, and the National Cancer Institute in Cairo-Egypt in the period between January and April 2013. Isolates were identified by conventional microbiological methods to the genus level 21 . Complete identification of the recovered isolates was carried out by biochemical reactions including oxidase test, citrate utilization, growth on Triple sugar iron (TSI), H 2 S production, and Indole Methyl Red Vogues Proskauer Citrate (IMViC) test, as well as API system (bioMérieux, Lyon, France).
Determination of antimicrobial susceptibility pattern by disc diffusion method and calculation of a Resistance Score. The sensitivity of the tested isolates against 12 antibiotics, representing different classes of beta-lactams (Supplementary Table S2), was determined by the disc diffusion method according to the Clinical and Laboratory Standards Institute (CLSI) guidelines [22][23][24] .
A resistance score (R Score) was defined for each isolate. The score was defined as the number of antibiotics to which this isolate is resistant. If an isolate was judged as intermediately resistant to an antibiotic, it was given a score of 0.5, while any resistant call was given a score of 1.
ESBL confirmatory tests using disc diffusion. Double disc synergy test (DDST). The Double disc synergy test (DDST) used in this study was modified from Jarlier's double-disc synergy (DDS) method 25 . Discs of cefotaxime, cefepime, ceftazidime, and aztreonam were placed around an amoxicillin/clavulanic acid disc at a distance of 20 mm (center to center 25,26 ). Positive ESBL production was marked by the observation of a keyhole ( Supplementary Fig. S1A).
Molecular identification of ESBL genotype using PCR. PCR was used to detect the presence of antibiotic resistance genes in E. coli showing phenotypic resistance to beta lactamases. Bacterial colonies were boiled in TE buffer for 5 min, flash centrifuged, and then the supernatant of boiled colonies was used as a PCR template. Go Taq DNA polymerase enzyme (Promega, Madison, WI, USA) was used for all amplifications. Primers are shown in Supplementary Table S3.
Amplification conditions for all reactions were: initial Denaturation at 94 °C for 5 min; 30 cycles of denaturation at 94 °C for 1 min, annealing at 52 °C for 45 s, and elongation at 72 °C for 1 min; then a final elongation at 72 °C for 10 min. These conditions were applied to all one-product or multiplex PCR reactions except for the following primer pairs: TEM, SHV, CTX-M-1, CTX-M-2, and CTX-M-15 which were annealed at 56 °C, 43 °C, 46 °C, 50 °C, and 50 °C, respectively. (2019) 9:4224 | https://doi.org/10.1038/s41598-019-39730-0 www.nature.com/scientificreports www.nature.com/scientificreports/ Nucleotide sequence of amplified bla CtX-M genes. To confirm the identity of bla CTX-M genes detected by PCR, and to determine any possible new bla CTX-M alleles specific to this study, we extracted the DNA out of the electrophoretic bands representing positive bla CTX-M PCR products (using QIAGen Gel purification kit, QIAGen, Germantown, MD, USA). This DNA was afterward sequenced by the Sanger method (Clinilab, Cairo, Egypt and EtonBio, San Diego, CA, USA) with the MA1 and MA2 primers 11 .
All obtained sequences were visually revised and corrected for minor electropherogram errors, then were compared to sequence databases by BlastN and BlastX with default settings and parameters 29 . statistical analysis. Several programs and software packages were used for data visualization, graphing, summary statistics, and hypothesis testing: these are the R statistical platform (https://www.r-project.org), Data Desk version 6.3 (Ithaca, NY, USA), and GraphPad Prism (La Jolla, CA, USA). Among R-packages used in data analysis and visualization are Readxl (version 1.0), Beanplot (version 1.2), and Corrplot (version 0.84). UPGMA clustering (http://genomes.urv.cat/UPGMA) was used for generating a distance matrix and a tree, which was drawn with FigTree (http://tree.bio.ed.ac.uk/software/figtree/). Graphs and trees were sometimes labeled or recolored on Adobe Illustrator CS5 version 15.0.0 (Adobe Systems, Inc., San Jose, CA, USA).

Data Availability
All data generated or analyzed during this study are included in the published article and its supplementary information files. New Sequences identified in this study have been deposited in NCBI GenBank with the accession numbers KX013145.1 and KX013146.1 (http://www.ncbi.nlm.nih.gov).