Changes in emm types and superantigen gene content of Streptococcus pyogenes causing invasive infections in Portugal

Fluctuations in the clonal composition of Group A Streptococcus (GAS) have been associated with the emergence of successful lineages and with upsurges of invasive infections (iGAS). This study aimed at identifying changes in the clones causing iGAS in Portugal. Antimicrobial susceptibility testing, emm typing and superantigen (SAg) gene profiling were performed for 381 iGAS isolates from 2010–2015. Macrolide resistance decreased to 4%, accompanied by the disappearance of the M phenotype and an increase of the iMLSB phenotype. The dominant emm types were: emm1 (28%), emm89 (11%), emm3 (9%), emm12 (8%), and emm6 (7%). There were no significant changes in the prevalence of individual emm types, emm clusters, or SAg profiles when comparing to 2006–2009, although an overall increasing trend was recorded during 2000–2015 for emm1, emm75, and emm87. Short-term increases in the prevalence of emm3, emm6, and emm75 may have been driven by concomitant SAg profile changes observed within these emm types, or reflect the emergence of novel genomic variants of the same emm types carrying different SAgs.

iGAS in the UK due to an emm3 lineage with an altered prophage profile 11 , and the spread of an emm89 clade lacking the hyaluronic acid capsule synthesis locus in North America and Europe since the 2000s [12][13][14] . Both the emm3 and emm89 epidemic lineages were associated with a change in the dominant profile of prophage-encoded genes relative to the previously dominant lineages of the respective emm types 11,12 , supporting the usefulness of methodologies like superantigen (SAg) gene profiling as complementary typing methods to further discriminate isolates sharing the same emm type 15 .
The molecular surveillance of GAS recovered from human infections worldwide is therefore crucial for providing information on possible shifts in clone prevalence with an impact on vaccine development, as well as for the early detection of clones with enhanced virulence, transmission, or antimicrobial resistance. Previous studies showed that the GAS population causing invasive disease in Portugal is genetically diverse, despite the dominance of the emm1 clone [16][17][18] .

The great majority of the isolates were recovered from blood (n = 330). Other isolate sources included pleural fluid (n = 23), ascitic fluid (n = 12), synovial fluid (n = 10), cerebrospinal fluid (n = 4), and bone biopsy (n = 2). Of the 381 isolates, 193 (51%) were recovered from female patients. Patient age ranged between 1 day and 97 years (median 58 years). The majority of the isolates were recovered from adults (≥18 years, n = 295, 77%), mostly from those ≥65 years old (n = 151, 40%). Among children, the majority of the isolates were from patients ≤5 years old (n = 67, 18%). Table S1).
Five emm types accounted for 63% of the isolates, namely emm1 (28%), emm89 (11%), emm3 (9%), emm12 (8%), and emm6 (7%) (Table 1). Although the majority of the emm clusters identified in this study were dominated by one emm type (Table 1 and Fig. 1), the cluster distribution did not directly reflect the prevalence of the respective dominant emm types due to the presence of multiple emm types in several clusters, including E3, E4, and E6.
Among the studied isolates, emm1 (and, as such, cluster A-C3) and emm cluster E4 were slightly overrepresented among paediatric and adult patients, respectively (p = 0.013 and p = 0.025, respectively), and emm cluster E1 was more prevalent in males (p = 0.008). However, all these associations lost statistical significance after the false-discovery rate (FDR) correction.
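The loss of significance after correction can be illustrated with a minimal Python sketch of the Benjamini-Hochberg linear step-up procedure. That this is the exact FDR variant applied in the study is an assumption, the function name `bh_adjust` is hypothetical, and the list of p-values combines the three nominal values quoted above with made-up non-significant values, since the full set of tests performed is not listed here.

```python
def bh_adjust(pvalues):
    """Return Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone adjusted values.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# First three values are the nominal p-values quoted in the text;
# the remaining seven are hypothetical padding (assumption).
nominal = [0.008, 0.013, 0.025, 0.20, 0.35, 0.50, 0.62, 0.74, 0.88, 0.95]
adjusted = bh_adjust(nominal)
significant = [p for p in adjusted if p < 0.05]
```

Under these assumed inputs, none of the adjusted values falls below 0.05, which is the pattern described above; with a different number of tests the outcome could differ.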
The absence of the hasABC locus encoding the GAS capsule biosynthesis pathway was used as a surrogate for the identification of the recently emerged acapsular emm89 clade 12 . Among the 42 iGAS isolates presenting emm89 in this study, only 3 (isolated in 2010 and 2011) were positive for the capsule locus.
Antimicrobial resistance. All 381 isolates were susceptible to penicillin, chloramphenicol, vancomycin, and linezolid. Fourteen isolates (4%) were resistant to erythromycin (Table 1), of which nine were constitutively resistant to clindamycin (cMLSB phenotype) and carried the erm(B) gene, while five presented inducible resistance to clindamycin (iMLSB phenotype), harbouring the erm(TR) gene. Despite the small number of macrolide resistant isolates, their genetic diversity was high [Simpson's index of diversity, SID (CI95%) = 0.846 (0.755-0.937)], with six different emm types identified.
Two isolates presented intermediate resistance to levofloxacin (MIC = 4 and 6 µg/ml). Both belonged to emm28, were resistant to erythromycin, clindamycin (cMLSB), and bacitracin, and carried the mutation S79Y in the quinolone resistance-determining region (QRDR) of the parC gene. One emm89 isolate presented high-level levofloxacin resistance (MIC > 32 µg/ml) and carried mutation S79F in parC and mutation E85K in gyrA.

Discussion
The iGAS isolates recovered throughout Portugal between 2010 and 2015 were genetically diverse, with SID values similar to the ones obtained for iGAS isolates recovered during 2006-2009 17 . Still, the five most prevalent emm types, namely emm types 1, 89, 3, 12, and 6, comprised 63% of the isolates, with emm1 persisting as the leading invasive emm type (28%). Twenty-one of the forty emm types identified in this study (94% of the isolates) are included in the 30-valent M protein-based vaccine currently under development. This vaccine could potentially cover up to 96% of the isolates of this study, considering the presumed cross-protection against a number of non-vaccine serotypes 19 . These results are in agreement with the overall scenario in Europe and the US in contemporary periods, although with some variations in the ranking of the top emm types 7,8,[20][21][22][23][24][25][26] . In contrast, remarkable heterogeneity is found in the Southern hemisphere and developing regions, where the diversity of emm types is significantly higher, resulting in a much lower estimated coverage of the 30-valent vaccine [27][28][29] .
The increasing trend in emm75 is in agreement with the rise in prevalence of this emm type from 0.5% in 2006-2009 to 5% in 2010-2015 (p = 0.006), which was not significant after FDR correction. This increase in emm75 was somewhat surprising, since we previously found this emm type to be significantly underrepresented among iGAS when compared with pharyngeal isolates recovered in the same period in Portugal 18 . In agreement, an emm75 strain was recently selected for a controlled human infection model of GAS pharyngitis based on its limited virulence 31 . In our previous study including iGAS and pharyngitis isolates, a high diversity among emm75 isolates was observed 18 . Among the emm75 isolates from 2010-2015, five different SAg profiles were identified, but 13/19 isolates presented SAg25. Previously, only two emm75-SAg25 isolates had been identified in Portugal, both recovered from pharyngeal infections in the period of 2000-2005 18 . The increasing trend in emm75 among iGAS could result from the emergence of this particular lineage from 2013 onwards. At present, it is not possible to know if this lineage is particularly prone to causing invasive disease or if it increased equally among non-invasive infections in Portugal.
The prevalence of emm87 has been gradually increasing among iGAS in Portugal, with no apparent new lineage emerging in recent years when considering SAg profiles, which remained the same (mostly SAg20). Isolates of emm87 have been associated with familial and hospital clusters of iGAS and proposed to be highly transmissible 32,33 , but have not been specifically associated with iGAS when compared with contemporary non-invasive isolates 18,34 .
In Portugal, the recent acapsular emm89 clade emerged among iGAS in 2007 and quickly outcompeted the previously circulating emm89 isolates carrying the hasABC locus 12 . Accordingly, among the 42 emm89 isolates recovered between 2010 and 2015, only 3 isolates carried the capsule locus (Fig. 3). The prevalence of emm89 among iGAS did not present an increasing trend, nor did it increase significantly in the period following the introduction of the new clade when compared with previous years, in contrast to what we reported among isolates from skin and soft tissue infections 30 . This indicates that the new clade was highly successful in outcompeting the previously circulating emm89 isolates in all infection types, but is not associated with an enhanced ability to cause infection in normally sterile sites.

www.nature.com/scientificreports

isolates presented SAg2. In 2009 SAg51 emerged and became the most common SAg profile in 2010-2015 (n = 9), followed by SAg2 (n = 8) and three other SAg profiles that emerged in this period, namely SAg72 (n = 5), SAg26 (n = 2), and SAg16 (n = 1). Further studies are needed to clarify if the isolates presenting the new SAg profiles within emm3 and emm6 emerged from the previously dominant emm3-SAg8 and emm6-SAg2 lineages by loss or gain of SAg genes, or if they represent distinct genetic clades that could underlie the rise in prevalence of both emm types during 2010-2012 (Fig. 2).
Erythromycin resistance (4%) decreased relative to the previously studied period of 2006-2009 17 (8%, p = 0.026) (Fig. 4). The overall decreasing trend in macrolide resistance recorded among invasive GAS in the period of 2000-2015 (p < 0.001) mirrors the one previously reported for isolates recovered from pharyngitis and skin and soft tissue infections 30,35 . Despite this decrease, the genetic diversity of the macrolide resistant isolates remained high.
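A trend of this kind is typically assessed with the Cochran-Armitage test. The following is a minimal Python sketch of its asymptotic (normal approximation) form. The function name `cochran_armitage`, the equally spaced period scores, and the counts in the example are all hypothetical and are not the study's data.

```python
import math


def cochran_armitage(events, totals, scores=None):
    """Two-sided Cochran-Armitage test for a linear trend in proportions.

    events[i] is the number of resistant isolates in period i and
    totals[i] the number of isolates tested in that period.
    Returns (z, p_value) using the normal approximation.
    """
    k = len(events)
    if scores is None:
        scores = list(range(k))  # equally spaced periods (assumption)
    n = sum(totals)
    p_bar = sum(events) / n  # pooled proportion under the null hypothesis
    # Score-weighted deviation of observed from expected events.
    t_stat = sum(s * (r - m * p_bar) for s, r, m in zip(scores, events, totals))
    s1 = sum(m * s for s, m in zip(scores, totals))
    s2 = sum(m * s * s for s, m in zip(scores, totals))
    variance = p_bar * (1 - p_bar) * (s2 - s1 * s1 / n)
    z = t_stat / math.sqrt(variance)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return z, p_value


# Hypothetical resistant / tested counts for three consecutive periods.
z_down, p_down = cochran_armitage([30, 20, 10], [150, 200, 250])
# Hypothetical counts with a constant 10% proportion: no trend expected.
z_flat, p_flat = cochran_armitage([10, 20, 30], [100, 200, 300])
```

A negative z indicates a decreasing proportion across periods. With small counts an exact or permutation version of the test may be preferable to this approximation.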
The genetic determinants of tetracycline resistance are often horizontally transferred together with macrolide resistance determinants in the same mobile genetic elements 36 . Among the 30 tetracycline-resistant iGAS isolates from 2010-2015 in Portugal (8%), only 5 were also resistant to erythromycin, 3 of which belong to a lineage of emm77-SAg30 isolates carrying erm(TR) and tet(O) that had not been previously identified among iGAS in Portugal. Although the tetracycline resistance rate did not decrease significantly relative to 2006-2009, an overall decreasing trend was observed during 2000-2015 (p = 0.002).
A limitation of this study is that isolate submission was voluntary, without any audit, preventing us from controlling any possible bias on the selection of the isolates submitted by each lab. Although we expect that not all isolates recovered from iGAS were submitted, the inclusion of 40 laboratories distributed throughout the country provided us with a representative collection of isolates, limiting the impact that any strain selection bias could have on the results and conclusions of the study. Screening of SAg and resistance genes by PCR alone presents another limitation given the possible occurrence of false-positives and false-negatives. In order to reduce the potential impact of this limitation on the results, we have used carefully optimised multiplex PCR reaction conditions, including both positive and negative controls in each reaction 15,17 . The high correlation between SAg profiles and the results of other typing methods 15 , as well as between the resistance genotypes determined and the respective resistance phenotypes and lineages, supports the accuracy of the PCR results.

This is the first study providing detailed molecular epidemiological data on iGAS infections in a Southern European country in the current decade. The results suggest that the emm type and emm cluster composition of GAS causing invasive disease in Portugal has remained stable since the second half of the 2000s decade, presenting no major changes in prevalence of individual emm types or clusters 17 . However, there have been changes in the SAg gene content within multiple emm types, which may reflect the ongoing horizontal transfer of phage-encoded genes between GAS lineages, or the emergence of new genetic clades. In some cases, these changes seem to be associated with temporal fluctuations in the prevalence of the respective emm types.
Streptococcal SAgs can directly contribute to the emergence of new successful lineages through their role in virulence and the immune response 37 . On the other hand, changes in SAg gene content reflect the loss and acquisition of prophages that often carry other virulence factors or antimicrobial resistance determinants that could also contribute to the success of those lineages 38 . Given that the emergence of clades with increased success within previously circulating emm types has been reported on multiple occasions 11,13 , the continued molecular surveillance of GAS infections using methods capable of further discriminating isolates sharing the same emm type is critical for identifying novel lineages that could drive increases in iGAS disease.

Materials and Methods
Bacterial isolates. Forty clinical microbiology laboratories distributed throughout Portugal were asked to submit, on a voluntary basis, all GAS isolated from normally sterile sites between January 2010 and December 2015. The study was approved by the Institutional Review Board of the Centro Académico de Medicina de Lisboa. Since only anonymized demographic patient information was used and the samples used were collected within the normal diagnostic procedure by the attending physician, the study was exempt from obtaining written informed consent from the patients. All methods were performed in accordance with the relevant guidelines and regulations. Strains were identified by the submitting laboratories and confirmed in our laboratory by colony morphology, β-haemolysis, and the presence of the characteristic Lancefield group A antigen (Oxoid, Basingstoke, UK).

Molecular typing. The emm type was determined for all isolates according to the protocols and recommendations of the CDC (http://www.cdc.gov/streplab/groupa-strep/emm-typing-protocol.html), and the first 240 bases of each sequence were compared to the sequences deposited in the CDC emm database using the CDC BLAST tool (http://www2a.cdc.gov/ncidod/biotech/strepblast.asp). The presence of 11 SAg genes (speA, speC, speG, speH, speI, speJ, speK, speL, speM, smeZ, and ssa) was tested by two previously described multiplex PCR reactions, using the chromosomally encoded genes speB and speF as positive control fragments 15 . All emm89 isolates were screened for the presence of the has locus by PCR 12 .
Antimicrobial susceptibility testing. Susceptibility tests were performed for all isolates by disk diffusion according to the guidelines and interpretative criteria of the Clinical and Laboratory Standards Institute (CLSI) 39 , using the following disks (Oxoid, Basingstoke, UK): penicillin, vancomycin, erythromycin, tetracycline, levofloxacin, chloramphenicol, clindamycin, and linezolid. Macrolide resistance phenotypes were determined by the double-disk test 39 . E-test strips (BioMérieux, Marcy l'Etoile, France) and CLSI interpretative criteria 39 were used for MIC determination in levofloxacin non-susceptible isolates and in all cases of intermediate susceptibility by disk diffusion. Susceptibility to bacitracin was determined using BD BBL™ Taxo™ A Disks (Becton, Dickinson and Company, Sparks, MD, USA).
Detection of genetic determinants of antimicrobial resistance. The screening for the genetic determinants of resistance to macrolides, tetracycline and fluoroquinolones was performed as previously described 17 . Briefly, erythromycin-resistant isolates were tested for the presence of the mef, erm(A), and erm(B) genes by multiplex PCR, followed by a second PCR to distinguish between mef(A) and mef(E) in mef-positive isolates. Tetracycline-resistant isolates were PCR-screened for the presence of the tet(K), tet(L), tet(M), and tet(O) genes. For levofloxacin non-susceptible isolates, the QRDRs of the gyrA and parC genes were amplified by PCR and sequenced.
Statistical analysis. The diversity of the isolates according to different typing methods was evaluated using the SID with corresponding 95% confidence intervals (CI95%) 40 , calculated using an online tool (http://www.comparingpartitions.info). Two-tailed Fisher's exact test and odds ratios were used to identify significant pairwise associations. The Cochran-Armitage test was used to evaluate trends. The p-values for multiple tests were corrected using the FDR linear procedure 41 . Values of p < 0.05 were considered statistically significant.
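For readers who want to reproduce a diversity estimate locally, the sketch below computes Simpson's index of diversity (SID) with an approximate 95% confidence interval in Python. It uses a commonly applied large-sample variance approximation; whether the online tool uses exactly this formula is an assumption. The function name `simpson_diversity` is hypothetical, and the counts (14 isolates split over six types) are illustrative values rather than the study's actual distribution.

```python
import math


def simpson_diversity(counts):
    """Return (SID, lower, upper) for a list of per-type isolate counts.

    SID = 1 - sum(n_j * (n_j - 1)) / (N * (N - 1)).
    The interval is SID +/- 2 * sqrt(variance), with the large-sample
    approximation variance = (4 / N) * (sum(pi_j^3) - (sum(pi_j^2))^2).
    """
    n = sum(counts)
    sid = 1 - sum(c * (c - 1) for c in counts) / (n * (n - 1))
    freqs = [c / n for c in counts]
    sum_sq = sum(f ** 2 for f in freqs)
    sum_cu = sum(f ** 3 for f in freqs)
    variance = (4 / n) * (sum_cu - sum_sq ** 2)
    half_width = 2 * math.sqrt(variance)
    return sid, sid - half_width, sid + half_width


# Hypothetical distribution of 14 isolates over six types (assumption).
sid, lower, upper = simpson_diversity([4, 4, 2, 2, 1, 1])
print(f"SID = {sid:.3f} ({lower:.3f}-{upper:.3f})")
```

With these illustrative counts the output is close to the SID reported above for the macrolide-resistant isolates, but other distributions could give similar values, so this should not be read as the observed emm type breakdown.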

Data availability
The datasets generated and analysed during the current study are available in the Zenodo repository, https://doi.org/10.5281/zenodo.3441765.