A data-driven mathematical model of multi-drug resistant Acinetobacter baumannii transmission in an intensive care unit

Major challenges remain when attempting to quantify and evaluate the impacts of contaminated environments and heterogeneity in the cohorting of health care workers (HCWs) on hospital infections. Data on the detection rate of multidrug-resistant Acinetobacter baumannii (MRAB) in a Chinese intensive care unit (ICU) were obtained to accurately evaluate the level of environmental contamination and also to simplify existing models. Data-driven mathematical models, including mean-field and pair approximation models, were proposed to examine the comprehensive effect of integrated measures including cohorting, increasing nurse-patient ratios and improvement of environmental sanitation on MRAB infection. Our results indicate that for clean environments and with strict cohorting, increasing the nurse-patient ratio results in an initial increase and then a decline in MRAB colonization. In contrast, in contaminated environments, increasing the nurse-patient ratio may lead to either a consistent increase or an initial increase followed by a decline of MRAB colonization, depending on the level of environmental contamination and the cohorting rate. For developing more effective control strategies, the findings suggest that increasing the cohorting rate and nurse-patient ratio are effective interventions for relatively clean environments, while cleaning the environment more frequently and increasing hand washing rate are suitable measures in contaminated environments.

In this supplementary material, we provide a description of the contact rate, the detailed derivation of the relationship between the detection rate and the number of colonized patients, the pair-approximation model with cohorting of nurses, the detailed configuration of the two kinds of networks, and the event-driven stochastic simulation method on the network.

The detailed description of contact rate
"Contact" was defined as any physical contact with a patient or the patient's surroundings or indwelling devices, as described previously 1. The contact rate was defined as the number of contacts per patient per hour. The contact rates between patients and nurses or doctors were obtained through direct observation for a total of 59 hours over 37 days in August, September, and October 2008. Observations were mainly conducted in the morning and afternoon shifts (about 8 hours during the daytime). All of the patients present during the investigation time were observed. The average duration of each observation per patient was 6.0 ± 1.7 minutes. The total numbers of contacts during the observation periods were 1,236 for nurses and 267 for doctors. Our data show that the contact rates in different periods could differ considerably, which made it difficult to obtain an accurate contact rate per unit of time. However, the contact rates for the morning and afternoon shifts were 5.8 and 1.1 for the doctors and 24.2 and 12.1 for the nurses, respectively. We therefore deduced that the contact rate of patients with nurses was approximately 4.6 times that of patients with doctors, which matches the ratio of the total observed contacts (1,236/267 ≈ 4.6).

The relationship between the detection rate and number of colonized patients
The detection rate of MRAB in the environment is defined by the proportion of positive samples, which could represent the concentration of bacteria in the environment.
The bacteria in the environment come from colonized patients, so a correlation between the detection rate and the number of colonized patients can be expected. Comparing Figs. 1(a) and 1(b) in the main text suggests that these two data sets have similar trends. Moreover, Pearson's correlation coefficient between the number of colonized patients and the detection rates of bacteria in a ward was calculated to be 0.57 ($p = 7.24 \times 10^{-5}$). Further, to show the relationship between the number of colonized patients and the MRAB detection rates more directly, we used a linear function, an exponential function and a power-law function to fit the mean detection rates. We calculated the coefficient of determination $R^2$, a statistical measure of how well the fitted curve approximates the real data points. We found that the $R^2$ of the power-law function was the largest of the three (0.8878). Hence, a power-law dependence was chosen to fit the data. Let $W(t)$ be the detection rate of MRAB in the ICU environment and $P_C(t)$ be the number of colonized patients; then $W(t)$ is fitted as a power-law function of $P_C(t)$, and the goodness of fit is shown in Fig. 1(c) of the main text. From a biological perspective, this relationship is sensible.
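The model comparison described above can be illustrated with a minimal sketch. It assumes that the exponential and power-law curves are fitted by ordinary least squares after a log transformation, which the text does not state, and that $R^2$ is then evaluated on the original scale. The function names (`linear_regression`, `r_squared`, `compare_fits`) and the example data are hypothetical and are not the ICU data.

```python
import math


def linear_regression(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(ys, fitted):
    """Coefficient of determination, computed on the original scale."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - f) ** 2 for y, f in zip(ys, fitted))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


def compare_fits(xs, ys):
    """Fit linear, exponential and power-law curves; return R^2 of each.

    Assumption: xs (colonized patients) and ys (detection rates) are positive,
    and the nonlinear curves are fitted via log-linearization.
    """
    # Linear: y = b0 + b1 * x
    b1, b0 = linear_regression(xs, ys)
    linear = [b0 + b1 * x for x in xs]

    # Exponential: y = a * exp(b * x), fitted as log y = log a + b * x
    log_ys = [math.log(y) for y in ys]
    b, log_a = linear_regression(xs, log_ys)
    exponential = [math.exp(log_a + b * x) for x in xs]

    # Power law: y = a * x**b, fitted as log y = log a + b * log x
    log_xs = [math.log(x) for x in xs]
    b, log_a = linear_regression(log_xs, log_ys)
    power_law = [math.exp(log_a) * x ** b for x in xs]

    return {
        "linear": r_squared(ys, linear),
        "exponential": r_squared(ys, exponential),
        "power_law": r_squared(ys, power_law),
    }


# Hypothetical illustration: synthetic data that follow a square-root law.
patients = [1, 2, 3, 4, 5, 6, 7, 8]
detection = [0.1 * p ** 0.5 for p in patients]
scores = compare_fits(patients, detection)
best = max(scores, key=scores.get)
```

On real data the three $R^2$ values would typically be closer to each other than in this synthetic example, and a direct nonlinear least-squares fit may give slightly different coefficients than the log-linearized fit used here.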

The pair-approximation model with cohorting of nurses
Here we provide more detailed descriptions of the pair-approximation model with cohorting of nurses. The parameter p is defined to be the proportion of nurses who are cohorted.
With the cohorting of nurses, the total number of nurses

The model is as follows:

Similar to model (2), only a part of the nodes and pairs is included in the equations; the others can be derived from the following equalities.

Values of parameters corresponding to node events in the pairwise model were the same as those in the mean-field model (1), except for the admission rate. The admission of patients in the mean-field model is a constant input, but in practice it depends on the number of beds that are available, so we assume that it is proportional to the number of empty nodes in the contact network. Hence, in the pairwise model (2), the admission rate Λ is assumed to be Λ E , where E is the average number of empty beds in the ward obtained from our data. According to the result of Keeling 2, the rates of edge events in the pairwise model can be converted from the rates for the mean-field model according to the following equalities:

Configuration of two networks
Two kinds of networks are considered in this paper. Specifically, for the strict cohorting network, edges are connected strictly according to the cohort, so the degree of every node is fixed. For the random network 3, edges are connected randomly, so it is not the actual degrees but the average degrees for doctors and nurses that are the same as those of the strict cohorting network. The random network is therefore generated by randomly connecting pairs of nodes, given the average degrees of all kinds of nodes.
The strict cohorting network is a special case of the random network in which the edges are connected strictly according to the grouping.
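The difference between the two configurations can be sketched as follows. This is a simplified illustration that includes only nurses and patients (doctors are omitted), assumes each patient is linked to exactly one nurse under strict cohorting, and matches average degrees by giving the random network the same number of edges. The function names and the ward size are hypothetical and are not taken from the paper.

```python
import random


def strict_cohort_network(n_patients, n_nurses):
    """Each nurse is linked to a fixed, equally sized block of patients."""
    if n_patients % n_nurses != 0:
        raise ValueError("this sketch assumes equally sized cohorts")
    size = n_patients // n_nurses
    return [
        (f"N{j}", f"P{i}")
        for j in range(n_nurses)
        for i in range(j * size, (j + 1) * size)
    ]


def random_network(n_patients, n_nurses, n_edges, rng):
    """Draw n_edges distinct nurse-patient links uniformly at random."""
    all_pairs = [
        (f"N{j}", f"P{i}") for j in range(n_nurses) for i in range(n_patients)
    ]
    return rng.sample(all_pairs, n_edges)


def degrees(edges):
    """Number of links attached to each node that has at least one link."""
    deg = {}
    for u, v in edges:
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1
    return deg


# Hypothetical ward: 12 patients and 4 nurses.
strict_edges = strict_cohort_network(12, 4)
random_edges = random_network(12, 4, len(strict_edges), random.Random(1))
strict_deg = degrees(strict_edges)
random_deg = degrees(random_edges)
```

Because both edge lists have the same length, the average nurse degree and the average patient degree coincide in the two networks, while individual degrees are fixed only in the strict cohorting case.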

Stochastic simulation on the network
Event-driven stochastic simulation 4 on the network was conducted in this paper. According to the Gillespie algorithm 5, the key points of event-driven stochastic simulation are how to determine the time to the next event and which event happens each time. Let $R_{\mathrm{total}}$ denote the sum of the rates of all possible events. Viewing the event process as a Poisson process, the time interval between two consecutive events follows an exponential distribution, so the time until the next event can be obtained by sampling a random number $r_1$ from the exponential distribution with parameter $R_{\mathrm{total}}$. To determine which event occurs next, the rates of all possible events are first stored in an array and their cumulative sum is computed. A random number $r_2$ is then sampled from the uniform distribution on $[0, R_{\mathrm{total}}]$, and the first event whose cumulative rate exceeds $r_2$ is chosen. It is worth noting that, in stochastic simulation on a contact network, each event affects the state of only its neighborhood of contacts. So, if an event occurs at a node, the event rates of both this node and its neighbors should be updated accordingly.
The main steps of the algorithm can be summarized as follows:
1. Set the time value t = 0 and initialize the states and event rates of all nodes in a given network.
2. Generate random numbers $r_1$ and $r_2$ to determine the time to the next event and which event occurs next.
3. Advance the time by the randomly generated interval $r_1$ and update the states of the associated nodes.
4. Go back to Step 2 unless the number of reactants is zero or the simulation time has been exceeded.

Table 1.
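The steps above can be sketched in code. The example below is a minimal illustration and not the MRAB model itself: it assumes a hypothetical two-state process in which a node with state 0 becomes colonized (state 1) at rate `beta` times its number of colonized neighbors, and a colonized node clears at rate `gamma`. The names `node_rate`, `simulate`, `beta` and `gamma` are introduced here for illustration only.

```python
import random
from bisect import bisect_right
from itertools import accumulate


def node_rate(node, state, neighbors, beta, gamma):
    """Rate of the single event that can currently happen at this node."""
    if state[node] == 1:
        return gamma  # clearance of a colonized node
    return beta * sum(state[nb] for nb in neighbors[node])  # colonization


def simulate(neighbors, initial_state, beta, gamma, t_max, rng):
    """Event-driven simulation; returns a list of (time, number colonized)."""
    # Step 1: set t = 0 and initialize the states and event rates.
    state = dict(initial_state)
    nodes = list(neighbors)
    rates = {n: node_rate(n, state, neighbors, beta, gamma) for n in nodes}
    t = 0.0
    history = [(t, sum(state.values()))]
    while True:
        cumulative = list(accumulate(rates[n] for n in nodes))
        r_total = cumulative[-1]
        if r_total <= 0.0:
            break  # no event can occur any more
        # Step 2: r1 gives the waiting time, r2 selects the event.
        r1 = rng.expovariate(r_total)
        if t + r1 > t_max:
            break  # simulation time exceeded
        r2 = rng.random() * r_total
        # First node whose cumulative rate exceeds r2.
        index = min(bisect_right(cumulative, r2), len(nodes) - 1)
        chosen = nodes[index]
        # Step 3: advance time and flip the state of the chosen node.
        t += r1
        state[chosen] = 1 - state[chosen]
        # Only the chosen node and its neighbors need new event rates.
        for n in [chosen, *neighbors[chosen]]:
            rates[n] = node_rate(n, state, neighbors, beta, gamma)
        history.append((t, sum(state.values())))
        # Step 4: loop back to Step 2.
    return history


# Hypothetical three-node chain with all nodes initially colonized.
chain = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
start = {"a": 1, "b": 1, "c": 1}
clearance_only = simulate(chain, start, 0.0, 1.0, 1e9, random.Random(0))
with_spread = simulate(chain, start, 0.5, 1.0, 10.0, random.Random(0))
```

Recomputing the cumulative sum at every step costs time proportional to the number of nodes. This is adequate for a ward-sized network, and larger networks would usually call for a more efficient selection structure.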