Development and external validation of a clinical prediction model for MRSA carriage at hospital admission in Southeast Lower Saxony, Germany

In countries with low endemic Methicillin-resistant Staphylococcus aureus (MRSA) prevalence, identification of risk groups at hospital admission is considered more cost-effective than universal MRSA screening. Predictive statistical models support the selection of suitable stratification factors for effective screening programs. Currently, there are no universal guidelines in Germany for MRSA screening. Instead, a list of criteria is available from the Commission for Hospital Hygiene and Infection Prevention (KRINKO) based on which local strategies should be adopted. We developed and externally validated a model for individual prediction of MRSA carriage at hospital admission in the region of Southeast Lower Saxony based on two prospective studies with universal screening in Braunschweig (n = 2065) and Wolfsburg (n = 461). Logistic regression was used for model development. The final model (simplified to an unweighted score) included history of MRSA carriage, care dependency and cancer treatment. In the external validation dataset, the score showed a sensitivity of 78.4% (95% CI: 64.7–88.7%), and a specificity of 70.3% (95% CI: 65.0–75.2%). Of all admitted patients, 25.4% had to be screened if the score was applied. A model based on KRINKO criteria showed similar sensitivity but lower specificity, leading to a considerably higher proportion of patients to be screened (49.5%).

Several studies confirmed that infections with Methicillin-resistant Staphylococcus aureus (MRSA) are associated with increased morbidity and mortality 1-3 as well as with high treatment and consecutive costs 4 . While universal screening at hospital admission is recommended only if MRSA prevalence is high 5 , targeted screening of risk groups was shown to be cost-effective in intermediate-prevalence countries 2,5-8 . Predictive models for the identification of MRSA carriers can contribute to optimizing screening strategies in hospitals, which have to balance sensitivity against the costs and efforts of a higher proportion of persons to be screened 9 .
For Germany, an intermediate-prevalence country, the DIMDI (German Institute for Medical Documentation and Information) recommends selective MRSA screening of patients at risk 10 . However, there are no clear guidelines on how to select these patients. In 2008, the Commission for Hospital Hygiene and Infection Prevention (KRINKO) at the Robert Koch Institute released a list of 11 factors indicating an increased risk of MRSA colonization at hospital admission 11 . In 2014, the list was slightly revised and now contained 10 risk factors, some of which have to appear in combination; a definite screening recommendation is not yet available 12 . A complex list of risk factors is, however, hardly applicable in everyday clinical practice 13 . The development of a regional screening strategy as proposed by KRINKO 12 could, thus, be a sensible approach. This study aimed at the development and external validation of a predictive model for MRSA carriage at hospital admission in the region of Southeast Lower Saxony. The proposed model should translate into a regional screening recommendation that is easy to use in everyday clinical practice.

Materials and methods
Study design and population. This study was conducted in the catchment area of the "Hygienenetzwerk Südostniedersachsen", a cooperation of healthcare providers in eight municipalities in Southeast Lower Saxony to combat hygiene-relevant pathogens. For the construction of the training dataset, we performed a universal screening of newly admitted patients in two hospitals in Braunschweig (with a total of four locations, covering the majority of the city's population) over 2 weeks. This dataset was used to determine admission prevalence of MRSA carriage and relevant risk factors. All patients aged 18 or above who were admitted between the 18th of November and the 2nd of December 2013 were asked to complete a self-administered questionnaire on the day of admission. Completed questionnaires were collected by study staff; patients could receive assistance in answering the questionnaire if needed. Patients unable to consent (e.g., due to language barriers or consciousness level) could be represented by a next of kin or had to be excluded.
For external validation, a temporally and spatially independent study was performed. For this study, all patients who tested positive for MRSA at admission to Wolfsburg hospital (universal screening in place, hospital also located in the catchment area) between the 7th of September 2015 and the 9th of March 2016 and who met the inclusion criteria received the questionnaire; MRSA-negative participants were recruited from all inpatients on two separate days within this period using the same questionnaire as well as the same inclusion and exclusion criteria.
Questionnaire. The selection of the risk factors examined in the questionnaire was based on the list of "Risk populations for colonization with MRSA" published by KRINKO in 2008 11 . Further variables were included based on literature search and expert opinion. In addition, data on age, sex, education, and occupation were collected. In total, 34 variables per patient (representing potential risk factors) were examined.

Laboratory analyses.
To identify MRSA carrier status, a combined nasopharyngeal swab and an additional swab of chronic wounds (if present) were taken within 48 h after admission. No additional swabs (e.g. from devices) were taken. In the case of multiple inpatient admissions during the study period, only the first admission was evaluated. MRSA was determined by cultivation on selective media. Confirmation of the species (e.g., by catalase plus coagulase) was followed by confirmation of Oxacillin resistance by a second independent method (resistance gene determination by PCR or VITEK 2).

Data management and statistical analysis.
Questionnaire data were read in automatically using TeleForm (Cardiff Software, Vista, California 92081, USA), were continuously monitored and validated, and were linked individually with the results of MRSA screening. In the main analysis, we used simple imputation for missing information on single disease statuses in the questionnaire (imputing missing values as not having had the respective disease).
Groups of patients with positive and negative MRSA status were compared using univariable analyses. χ 2 tests or Fisher's exact tests were used for categorical variables. The Wilcoxon rank-sum test was used for the continuous variable age.
Using multivariable logistic regression analysis, we developed a model to predict MRSA carrier status (dependent variable) at the time of hospital admission. Following the Hosmer-Lemeshow suggestion, variables with p < 0.25 in univariable analyses were included in the multivariable analysis 13 . Logistic regression with backward selection was performed using fractional polynomials for age as implemented in the Stata function "mfp" to allow for non-linear effects (p < 0.2 for selecting polynomials instead of linear effects). A p value of ≥ 0.05 (based on Wald tests) was selected as parameter exclusion criterion in the backward selection. For the final model, a probability cut-off for an optimal balance of sensitivity and specificity was determined by the Youden index; the corresponding AUC (area under the receiver operating characteristic (ROC) curve) was calculated. Bootstrap sampling (1,000 replications) was applied to assess stability of variable selection and the effect estimates of the predictive model.
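The cut-off selection by the Youden index can be illustrated with a minimal Python sketch. It is not the Stata code used in the study; the function name `youden_cutoff` and the toy probabilities are hypothetical and serve only to show how J = sensitivity + specificity - 1 is maximized over candidate cut-offs.

```python
def youden_cutoff(probs, labels):
    """Return (cutoff, J) maximizing J = sensitivity + specificity - 1.

    probs  : predicted probabilities of MRSA carriage (hypothetical values)
    labels : observed carrier status, 1 = MRSA-positive, 0 = MRSA-negative
    A patient is classified as positive if the probability is >= cutoff.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    best_cutoff, best_j = None, -1.0
    # Every distinct predicted probability is a candidate cut-off.
    for cutoff in sorted(set(probs)):
        tp = sum(1 for p, y in zip(probs, labels) if p >= cutoff and y == 1)
        tn = sum(1 for p, y in zip(probs, labels) if p < cutoff and y == 0)
        j = tp / positives + tn / negatives - 1
        if j > best_j:
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j


# Toy data, not taken from the study.
toy_probs = [0.005, 0.005, 0.02, 0.03, 0.2, 0.4]
toy_labels = [0, 0, 0, 1, 1, 1]
cutoff, j = youden_cutoff(toy_probs, toy_labels)
```

In this toy example the classes are perfectly separable, so the selected cut-off is the lowest probability among the positives. Real data will generally yield J < 1.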
Additionally, a simplified score based on the predictive factors of the model was derived. The derived score was then applied to the validation dataset; subsequently, sensitivity, specificity, and positive and negative predictive values (PPV and NPV) were calculated. Calibration in the validation dataset was assessed by regressing MRSA status on the log odds of the predictor probabilities, which were calculated using the logistic regression model derived in the training dataset 14 . A calibration curve, showing the observed proportions of MRSA-positivity stratified by quantiles of predictions with at least 60 patients per group, was used to investigate goodness of fit among the whole range of predictions (R package 'rms' 15 version 5.1-3.1). The model and score were also compared to a model representing the KRINKO risk factors. As there was no question on history of contact with a known MRSA carrier, the KRINKO score had to be constructed without this criterion.
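The diagnostic measures named above follow directly from a 2×2 table of score result against MRSA status. The following Python sketch shows the arithmetic; the function name `diagnostic_metrics` and the counts are hypothetical and are not the study data.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Compute diagnostic accuracy measures from 2x2 table counts.

    tp: score-positive MRSA carriers      fp: score-positive non-carriers
    fn: score-negative MRSA carriers      tn: score-negative non-carriers
    """
    total = tp + fp + fn + tn
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        # Share of all patients flagged by the score for microbiological screening.
        "proportion_screened": (tp + fp) / total,
    }


# Hypothetical counts for illustration only.
metrics = diagnostic_metrics(tp=8, fp=20, fn=2, tn=70)
```

Note that PPV, NPV and the proportion screened depend on the prevalence in the sample, so values computed in a case-enriched dataset would need reweighting before they can be read as admission-level figures.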
Statistical analyses were performed with Stata IC 12.1 (StataCorp, College Station, US) and R version 3.6.1 (www.R-project.org).

Sensitivity analyses.
We performed an additional univariable analysis with non-imputed variables as a complete case analysis. Since the number of MRSA-positive patients in the training dataset was considerably lower than expected a priori, we also performed a sensitivity analysis in which we used the validation dataset for model development and the training dataset for validation to evaluate whether the small number of cases in the training dataset might have affected our analyses.

Ethics and informed consent. This study received ethics approval from the Ethics Committee of Hanover Medical School (ethics approval reference number 1980-2013 and amendments). All research was performed in accordance with relevant guidelines and regulations. All study participants and/or their legal representatives provided written informed consent.

Results
Baseline characteristics in the training dataset. Within the recruitment period, 2556 patients were admitted, and 2065 MRSA swabs (80.8%) were obtained. Thirty-eight percent of all admitted patients (n = 973) did not provide informed consent (neither themselves nor their legal representatives). During data processing, another 382 individuals (14.9%) were found not to meet the inclusion criteria. Finally, 1201 participants (47.0%) were included in the analysis.
Stability assessment. The same three variables were shown to be the most stably selected parameters when the model building process underwent a bootstrapping evaluation. "MRSA history" was selected in almost 95% of runs, "care dependency" in 52% and "under cancer treatment" in about 45%. Age was a relatively stable predictor as well, chosen in about 44% of runs (mutually exclusive with care dependency), followed by antibiotic treatment in the past 6 months with about 32%. The remaining variables were selected in less than 20% of the runs.

Score building.
A simplified score could be built including the same three variables. Each positive response to one of the three questions resulted in one point, so that score values between 0 and 3 could be reached. Since the probability cut-off of the regression model was extremely low at 0.01, there was no need to weight the score according to the regression coefficients. A single positive response to any of the three questions was enough to pass the probability threshold. When applying the score to the training dataset, 21.9% (263/1201) of all admitted patients had to be screened microbiologically to reach the reported diagnostic prediction accuracy.
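The scoring rule described above can be written as a short Python sketch. The function and parameter names are hypothetical; the rule itself (one point per positive response, screening from one point upwards) follows the description in the text.

```python
def mrsa_score(mrsa_history: bool, care_dependency: bool, cancer_treatment: bool) -> int:
    """Unweighted score: one point per positive response, range 0 to 3."""
    return int(mrsa_history) + int(care_dependency) + int(cancer_treatment)


def should_screen(score: int, threshold: int = 1) -> bool:
    """Flag a patient for microbiological screening.

    A threshold of 1 reflects that a single positive response was enough
    to pass the probability cut-off of the regression model.
    """
    return score >= threshold


# Example: a care-dependent patient without MRSA history or cancer treatment.
example_score = mrsa_score(mrsa_history=False, care_dependency=True, cancer_treatment=False)
example_decision = should_screen(example_score)
```

Because the score is unweighted, it reduces to asking whether any of the three questions is answered with "yes", which is what makes it practical at admission.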
The developed model and the model based on the KRINKO risk factors resulted in similarly high sensitivities for the detection of MRSA carriers (Table 3). However, the specificity was lower for the derived KRINKO risk factors model, so that the proportion of patients classified to be microbiologically screened was considerably higher when using KRINKO criteria than when using our proposed score (49.5% vs. 25.4%).

Sensitivity analyses.
Univariable analysis of non-imputed data revealed only minor differences, and did not result in any qualitative change of results when compared to the primary analysis. Because of the limited number of MRSA-positive participants in the training dataset, we additionally performed a model building process based on the validation dataset using the same procedure as described for the training dataset. Again, no major differences in the performance of the model on the validation dataset were observed.

Discussion
We developed a model for individual prediction of MRSA carriage at the time of hospital admission for the region of Southeast Lower Saxony. The final model contained three predictors and could be easily transformed into a simple clinical score. In external validation, its diagnostic prediction accuracy was superior to a screening algorithm based on the KRINKO risk factor list when applied to our study region. Various sensitivity analyses provided evidence that the existing limitations of the underlying datasets did not affect overall results.
MRSA prevalence at hospital admission was 2.0% (95% CI: 1.5-2.7%) in our study. This is in line with other German studies that reported prevalence values between 1.6 and 2.3% 7, 14-16 . For the catchment area of the hospitals under study, MRSA prevalence in a population-based study was reported to be 1.3% (95% CI: 0.6-3.0%) 16 .
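As a worked illustration of how such a prevalence estimate and its confidence interval can be obtained, the Python sketch below computes a Wilson score interval. Two assumptions are made here that the text does not state: that the estimate refers to 42 MRSA carriers among the 2065 swabs obtained, and that a Wilson interval was used. The function name `wilson_ci` is hypothetical.

```python
import math


def wilson_ci(successes, n, z=1.959964):
    """Wilson score interval for a binomial proportion (default: 95% level)."""
    p = successes / n
    denom = 1 + z ** 2 / n
    center = (p + z ** 2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return center - half_width, center + half_width


# Assumed counts: 42 carriers among 2065 swabs (see lead-in).
prevalence = 42 / 2065
lower, upper = wilson_ci(42, 2065)
```

Under these assumptions the sketch gives a prevalence of about 2.0% with an interval of roughly 1.5% to 2.7%, which agrees with the reported values. Other interval methods (for example an exact binomial interval) would give slightly different bounds.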
In univariable analyses, ten potentially predictive factors for MRSA carriage were found. They were either directly associated with MRSA carriage or related to increased age and morbidity. This is in line with other studies where a history of MRSA carriage was often the strongest predictor 17,18 . In addition, high age [19][20][21] and parameters associated with "contact with healthcare" 22 , such as history of antibiotic therapy or inpatient treatment, presence of chronic diseases, living in a long-term care facility, dialysis or skin disease were regularly described as risk factors for MRSA carriage [19][20][21][22][23] .
The final predictive model included three parameters (MRSA history, care dependency, and being under cancer treatment). They were confirmed as the three most stable risk factors in the bootstrapping analysis. Age was the next important variable in our stability analysis. Increased age is a well-known predictor for MRSA carriage in many other studies [19][20][21][22] , which was also confirmed in the univariable analysis in our study. Despite the fact that age was included in many of the models in the stability analysis, we decided not to use age for the prediction model, because age and care dependency were selected in a mutually exclusive manner in the bootstrap runs, and care dependency was the more stable predictor and easier to include in simple clinical decision support tools than the continuous variable age. Testing patients on geriatric wards might nevertheless be an alternative or additive screening concept, as increased MRSA colonization is common in this group 17,18,24 . A simple score with high diagnostic prediction accuracy was derived. Such a score is very easy to apply in clinical practice as information on all three risk factors usually is readily available at the time of admission. Thus, there is no need for access to additional data sources, as has been the case for predictive models developed in other studies 19,21,22,25 .
Our score differs from others proposed, which often included a higher number of predictors and sometimes additional weighting factors, making implementation in admission settings and emergency rooms more difficult 18,22,26 . Harbarth et al. 22 considered the high proportion of patients to be microbiologically tested in their study and the high logistical and financial costs to be a problem, and reduced the number of predictors in their model from nine to four. As a consequence, the proportion to be screened decreased from about 70% to 50%, while sensitivity decreased slightly from 86 to 84%. However, these results were confirmed only in an internal validation setting without external validation datasets.
Another study proposed an incremental risk score containing three unweighted factors (recent antibiotic treatment, intra-hospital transfer and inpatient treatment within the past 2 years), but excluded patients with known MRSA history and used the information on risk factors obtained from the electronic patient records 25 . For the classification rule of ≥ 1 risk factors present, the sensitivity was 87% in internal and 88% in external validation, with the number of patients to be screened as high as 70% and 58%, respectively; sensitivity declined dramatically to 61% and 44% when using two risk factors or more as the classification rule 25 .
Compared with a model based on the KRINKO risk factor list, our model and the derived score showed a similar sensitivity (78.4% vs. 80.4%), and higher specificity (70.3% vs. 41.8%), while the proportion to be screened was considerably lower (25.4% vs. 49.2%).
Two further studies examined the diagnostic prediction accuracy of screening based on KRINKO's 2008 criteria 11 , and found predictive accuracy comparable to that in our study (sensitivity of 78.9% and 77.6%, and a proportion to be screened of 41.1% and 50.6% 7,18 ).
The considerably lower proportion of persons requiring microbiological screening at admission suggests that our score is cost-effective. Creamer et al. 26 also showed for their institution in Ireland that admission screening of MRSA-risk patients decreased costs by 60% compared to a form of screening where all patients were screened on admission. The ultimate aim of MRSA screening programmes is to decrease the incidence of hospital-associated MRSA infections. Reilly et al. 27 showed in a Scottish study that MRSA screening can actually accomplish this.
Limitations. Our study has several limitations, which correspond to the quality of data collection during routine clinical practice. A major limitation of our study was the number of MRSA-positive patients in the training dataset, which reduced the statistical power of the analyses. In total, only 1201 of 2556 patients could be included in the training dataset (47%), with the proportion of MRSA carriers being even lower at 38% (16 of 42). To assess whether the low number of MRSA-positive cases in the training dataset might have affected model building, we performed a sensitivity analysis in which we used the validation dataset for training and vice versa; results were virtually unchanged. Due to the study design with patient-administered questionnaires, only a few participants in intensive care units could be included. Since MRSA carriage is known to be higher there 2,6,28,29 , the screening recommendation might not be applicable to patients requiring intensive care support at admission; a screening recommendation might thus be extended to all patients admitted to intensive care units. The sampling schemes of the training and validation dataset were different. While the training dataset was derived using a classic surveillance setting with low MRSA prevalence, the validation dataset was designed in a more balanced way based on an ongoing universal screening program so that a larger number of MRSA-positive individuals could be included. This was done deliberately based on the experience with the training dataset but could have affected how the results of the study can be generalized. The model based on the KRINKO criteria did not include one of the criteria mentioned by KRINKO because it was not part of the questionnaire. Since the classification rule for the KRINKO model corresponded to at least one positive criterion, our analysis might have slightly underestimated the true sensitivity of the KRINKO model, while it would have overestimated specificity.

Conclusions
We developed and externally validated a score for the identification of MRSA carriers at hospital admission in the region of Southeast Lower Saxony. The score showed better diagnostic prediction accuracy than the previous overall German screening considerations, with a lower proportion of individuals to be screened, and is easily applicable in clinical practice. The validity of the score outside the catchment area needs to be examined. Furthermore, it needs to be evaluated if additional universal screening of patients in intensive care units or geriatric patients leads to an improvement in sensitivity without disproportionately decreasing specificity.

Data availability
All data generated or analysed during this study are available from the corresponding author on reasonable request.