Influence of a patient transfer network of US inpatient facilities on the incidence of nosocomial infections

Antibiotic-resistant bacterial infections are a substantial source of morbidity and mortality and have a common reservoir in inpatient settings. Transferring patients between facilities could be a mechanism for the spread of these infections. We assess whether a network of hospitals, linked by inpatient transfers, contributes to the spread of nosocomial infections, and investigate how network structure may be leveraged to design efficient surveillance systems. We construct a network defined by the transfer of Medicare patients across US inpatient facilities using a 100% sample of inpatient discharge claims from 2006–2007. We show an association between network structure and C. difficile incidence: a 1% increase in a facility's C. difficile incidence is associated with a 0.53% increase in the C. difficile incidence of neighboring facilities. Finally, we use network science methods to determine which facilities to monitor to maximize surveillance efficiency. An optimal surveillance strategy for selecting "sensor" hospitals, based on their network position, detects 80% of the C. difficile infections using only 2% of hospitals as sensors. Selecting a small fraction of facilities as "sensors" could be a cost-effective mechanism to monitor emerging nosocomial infections.


Transfer network
We examine the structural connectivity and geographic characteristics of the static transfer network in Fig. S1. In terms of network topology, the in-degree distribution has a broader tail than the out-degree distribution. The network has an average (local) clustering coefficient of 0.51. This coefficient measures the probability that any two hospitals connected to an index hospital are in turn connected to each other, forming a closed triad (a cycle of three nodes and three edges). A random graph with the same number of nodes and edges yields an average local clustering coefficient of 0.0057 ± 0.0001 (SE), which is substantially lower than the observed value, a finding that likely reflects the network's geographic embeddedness. The average shortest path length of the network is 4.69. To put this number in perspective, we performed network randomizations using a slight variant of the directed configuration model that preserves both in-degree and out-degree distributions (S1). This approach gave rise to an average shortest path length of 3.6 ± 0.4 (SE). The observed network is therefore a somewhat "larger world" than would be expected by chance, but this is almost certainly driven by the underlying geography and the objective of keeping transfers as short as possible. In fact, about 90% of the transfers are to hospitals less than 200 km away.
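The comparison between the observed clustering and a degree-preserving null model can be sketched as follows. This is a minimal illustration using networkx on a randomly generated toy graph, since the Medicare transfer network itself is not public; only the procedure (directed configuration model preserving both degree sequences) mirrors the text.

```python
import networkx as nx

# Toy directed "transfer" network standing in for the real one.
G = nx.gnp_random_graph(200, 0.05, directed=True, seed=42)

# Average local clustering coefficient of the undirected projection.
obs_clust = nx.average_clustering(G.to_undirected())

# Degree-preserving null model: the directed configuration model keeps
# every node's in- and out-degree but rewires the edges at random.
din = [d for _, d in G.in_degree()]
dout = [d for _, d in G.out_degree()]
R = nx.DiGraph(nx.directed_configuration_model(din, dout, seed=1))
R.remove_edges_from(list(nx.selfloop_edges(R)))
rand_clust = nx.average_clustering(R.to_undirected())
```

Repeating the randomization many times and recording the mean and standard error of `rand_clust` gives the null distribution the observed value is compared against.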
Degree assortativity is the concept that nodes with many connections tend to be connected to other nodes with many connections (S2, S3). When the static network is taken as undirected, we can use the assortativity coefficient to measure the extent to which the degrees of hospitals in each pair of connected hospitals are similar. We obtain a slightly negative value of -0.06, but similar values of -0.005 ± 0.001 (SE) also arise from randomizations of the network using the algorithm discussed above. Consequently, there is no statistically significant assortativity in the network over and above what would be expected by chance given the network's degree distributions.
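A sketch of this measurement, again on a toy undirected graph: the assortativity coefficient is compared against a degree-preserving rewiring obtained via double edge swaps, which keep every node's degree fixed while shuffling who is connected to whom.

```python
import networkx as nx

# Toy undirected graph standing in for the transfer network taken as undirected.
G = nx.barabasi_albert_graph(300, 3, seed=0)
r_obs = nx.degree_assortativity_coefficient(G)

# Degree-preserving randomization via double edge swaps.
R = G.copy()
nx.double_edge_swap(R, nswap=5 * R.number_of_edges(), max_tries=10**6, seed=1)
r_rand = nx.degree_assortativity_coefficient(R)
```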

Robustness of the transfer extraction
Since patient transfers are not explicit in the data but must instead be inferred from it, we investigated the robustness of some of the results to our definition of what constitutes a hospital transfer. Instead of requiring readmission on the day of discharge, we relaxed this definition by allowing the readmission to take place also on the day after discharge. Fig. S2 shows the edges induced by the same-day rule (red edges) and the additional edges that result from the relaxed rule (blue edges). This relaxation leads to 67472 additional transfers (a 7.2% increase). There are 11827 new edges that appear on the transfer network (a 15.6% increase), with an average transfer load of 1.2 and a standard deviation of 0.7. For the connections that appear under both rules, the difference in transfer loads averages 0.7 transfers with a standard deviation of 1.9. The distributions of edge weights for both cases are shown in the upper left panel of Fig. S3, and the two distributions appear visually very similar to one another. The weight distribution of the additional edges, as well as the distribution of weight differences for the edges common to both cases, can be seen in the upper right panel of Fig. S3. The range of this distribution is much more constrained than that of the actual weight distributions. The number of transfers increases, but the deviations are minimal. Note also that neither measure is exact: the same-day rule yields a lower bound on the number of transfers, while the relaxed rule yields an upper bound. Given the similarity of these findings across the two rules, in the following we work with the lower bound (same-day discharge and readmission).
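The inference step can be sketched with pandas. The column names and records below are hypothetical stand-ins for the claims data; the logic is simply to pair each patient's consecutive stays and flag a transfer when the readmission gap is within the allowed number of days (0 for the same-day rule, 1 for the relaxed rule).

```python
import pandas as pd

# Hypothetical claims table: one row per inpatient stay.
claims = pd.DataFrame({
    "patient":   ["A", "A", "B", "B", "C"],
    "hospital":  ["H1", "H2", "H3", "H4", "H1"],
    "admit":     pd.to_datetime(["2006-01-01", "2006-01-05", "2006-02-01",
                                 "2006-02-03", "2006-03-01"]),
    "discharge": pd.to_datetime(["2006-01-05", "2006-01-09", "2006-02-02",
                                 "2006-02-04", "2006-03-02"]),
})

def infer_transfers(claims, max_gap_days=0):
    """Pair consecutive stays of the same patient; a transfer is a
    readmission within max_gap_days of discharge (0 = same-day rule,
    1 = relaxed rule) at a different hospital."""
    c = claims.sort_values(["patient", "admit"])
    nxt = c.groupby("patient").shift(-1)        # the patient's next stay
    gap = (nxt["admit"] - c["discharge"]).dt.days
    mask = gap.between(0, max_gap_days) & (nxt["hospital"] != c["hospital"])
    return pd.DataFrame({"from": c.loc[mask, "hospital"],
                         "to": nxt.loc[mask, "hospital"]})
```

On this toy table, the same-day rule finds one transfer (H1 to H2), while the relaxed rule additionally picks up the one-day gap (H3 to H4).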

C. difficile incidence correlation by hospital type
In order to gain more insight about the influence of different types of facilities, we divided the hospitals into three categories, namely general hospitals, rehabilitation facilities, and other hospitals. We plot in Fig. S4 the C. difficile incidence vs. the mean C. difficile incidence in the neighborhood. Two separate "regimes" of correlation were apparent for general hospitals (Fig. S4a, 4546 out of 5667 hospitals), separated by a mean hospital incidence of 3.1%. The high incidence regime is populated by just 149 hospitals. In the low incidence regime, the two variables are reasonably strongly correlated (Pearson correlation coefficient 0.50, 95% CI: 0.48, 0.52), whereas in the high incidence regime there is no correlation (correlation coefficient -0.07, 95% CI: -0.23, 0.09). Linear regression of the mean incidence in neighboring hospitals on incidence in the case hospital reveals a similar trend, with the regression coefficient estimated as 0.53±0.02 in the low incidence regime and -0.03±0.04 in the high incidence regime. The regression coefficient quantifies the average magnitude of the association: a coefficient of 0.53 means that a 1% variation in C. difficile incidence translates on average into a 0.53% variation in the incidence of the neighboring hospitals. In contrast to general hospitals, the plot reveals just one regime for rehabilitation hospitals (Fig. S4b, 526 hospitals), where the correlation coefficient is 0.33 (95% CI: 0.25, 0.40) and the estimated slope in the linear regression is 0.049±0.006. The correlation in incidences for the remaining hospitals (Fig. S4c, 535 hospitals) was weaker (0.18, 95% CI: 0.09, 0.26), with the slope estimated as 0.10±0.02.
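The confidence intervals quoted above are the standard kind obtained for a Pearson correlation via the Fisher z-transformation; a minimal sketch on synthetic data (the data below are made up, not the hospital data):

```python
import math
import numpy as np

# Synthetic x, y with a true correlation of about 0.45.
rng = np.random.default_rng(0)
x = rng.normal(size=500)
y = 0.5 * x + rng.normal(size=500)

r = np.corrcoef(x, y)[0, 1]          # Pearson correlation coefficient
z = math.atanh(r)                    # Fisher z-transform
se = 1.0 / math.sqrt(len(x) - 3)     # approximate standard error of z
lo = math.tanh(z - 1.96 * se)        # back-transform the 95% interval
hi = math.tanh(z + 1.96 * se)
```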

Finding a threshold for the two regimes in C. difficile incidence correlation for general hospitals
Although in the main text we showed only the data for C. difficile incidence below 0.05 for general hospitals, when plotting the C. difficile incidence versus the mean C. difficile incidence in the hospital neighborhood we found two regimes over the full range. To set the cut-off separating the two regimes objectively, we computed the Pearson correlation coefficient of each regime as a function of the cut-off and kept the value at which the correlation coefficient of the low incidence cloud was maximal. See Fig. S5.
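The threshold search can be sketched as a simple scan over candidate cut-offs, keeping the one that maximizes the low-regime correlation. The data below are synthetic, constructed to have a correlated low regime and a flat high regime around 0.05; only the scanning procedure reflects the text.

```python
import numpy as np

# Synthetic hospital incidences and neighborhood averages: correlated
# below 0.05, flat above (toy stand-in for the real data).
rng = np.random.default_rng(1)
inc = rng.uniform(0.0, 0.10, size=2000)
nbr = np.where(inc < 0.05, 0.5 * inc, 0.025) + rng.normal(0, 0.005, 2000)

def corr(a, b):
    return np.corrcoef(a, b)[0, 1]

# Scan candidate cut-offs; keep the one maximizing the low-cloud correlation.
cutoffs = np.linspace(0.01, 0.09, 81)
best = max(cutoffs, key=lambda c: corr(inc[inc < c], nbr[inc < c]))
```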

Adjusting for hospital size
It could be argued that hospital size, as a proxy for its activity, would be a major predictor of C. difficile incidence. In order to control for hospital size, we fitted three different models to the C. difficile incidence data in the low incidence regime for general hospitals, which is the regime showing the strongest signal (see the two previous sections). The first model is a logistic regression of C. difficile incidence on the average C. difficile incidence in the network neighborhood of the central hospital, the second model uses hospital size as the predictor, and the third model uses both. The complete model is written as logit(ρ_i) = β_0 + β_net ⟨ρ⟩_i + β_s (s_i/100), where ρ_i is the incidence at hospital i, ⟨ρ⟩_i is the average incidence in the network neighborhood of hospital i, and s_i/100 is the size of hospital i in number of beds divided by 100. The main result is that each additional 100 beds is associated with a 7% increase in the odds of C. difficile incidence, while a 10% increase in the average C. difficile incidence in the neighborhood is associated with a 3000% increase in the odds. Thus the network neighborhood has a much stronger effect on C. difficile incidence. Results are shown in Table S2.
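As a quick check of how these odds effects map onto logistic-regression coefficients, the arithmetic can be worked out directly. This is illustrative only, not a re-fit of the model; it assumes the 10% neighborhood increase means an absolute increase of 0.10 in incidence.

```python
import math

# +7% odds per 100 beds implies the size coefficient:
beta_size = math.log(1.07)

# +3000% odds (i.e. a factor of 31) for a 0.10 increase in neighborhood
# incidence implies the neighborhood coefficient:
beta_net = math.log(31.0) / 0.10

# Implied odds multiplier for a 0.01 (one percentage point) increase
# in neighborhood incidence:
mult = math.exp(beta_net * 0.01)     # equals 31 ** 0.1, about 1.41
```

So even a one-percentage-point shift in neighborhood incidence moves the odds by roughly 41%, far more than several hundred extra beds.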

Community-acquired infections vs. hospital-acquired infections
In order to know whether the correlation in C. difficile incidence is due to transfers of patients between hospitals or to a common reservoir in the geographical vicinity of the hospitals, one would need genetic data to establish that the same strain flows from one hospital to another. These data are not available in our case. At most, we can show that the incidence at a hospital i, ρ_i, is more strongly correlated with the average incidence of the hospitals that are within a certain distance D and connected through the transfer network, ⟨ρ⟩_i^Net(D), than with the average incidence of all the hospitals within that distance, ⟨ρ⟩_i^Com(D). We checked that the two indicators are not strongly correlated with each other by computing their Pearson correlation coefficient as a function of the distance D, as can be seen in Fig. S6a. The regression coefficients as a function of the distance D are shown in Fig. S6b; the inset shows the R² associated with the regression, also as a function of D. The results show that the effect associated with hospitals connected through the transfer network saturates at around 40 from 100 km on, while the effect of the average over all hospitals up to a given distance grows with D: it is less important for distances below 250 km but reaches the effect of connected hospitals from there on. In conclusion, the average incidence in network-neighbor hospitals is substantially more representative of a hospital's own incidence than the incidence in the geographical vicinity unconstrained by the transfer network for distances smaller than 250 km, and the two effects are comparable at larger distances.
We also standardized the values of ⟨ρ⟩_i^Net(D) and ⟨ρ⟩_i^Com(D) by subtracting their means and dividing by their standard deviations. In this way all variables have comparable ranges. The values of the standard deviation for the network and community indicators are shown in Fig. S6c. We then applied the same regression technique to obtain the results in Fig. S6d, which corroborate our finding in Fig. S6b that the network neighbors have a stronger influence on the incidence of a hospital than hospitals at similar distances not connected via transfers, and extend the result to the full range of distances explored.
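The two indicators can be sketched as follows. All identifiers, coordinates, and incidences below are made up for illustration; the point is the distinction between averaging over network neighbors within distance D versus all hospitals within distance D.

```python
import math

def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371 * math.asin(math.sqrt(h))

# Toy hospitals: id -> coordinates and incidence; 'neighbors' holds
# transfer-network partners (all values hypothetical).
coords = {1: (40.0, -75.0), 2: (40.3, -75.2), 3: (41.5, -74.0), 4: (40.1, -75.1)}
incid = {1: 0.02, 2: 0.03, 3: 0.05, 4: 0.01}
neighbors = {1: {2, 3}}

def indicators(i, D):
    """Return (<rho>_Net(D), <rho>_Com(D)) for hospital i."""
    within = [j for j in coords if j != i
              and haversine_km(coords[i], coords[j]) <= D]
    net = [j for j in within if j in neighbors.get(i, set())]
    avg = lambda js: sum(incid[j] for j in js) / len(js) if js else float("nan")
    return avg(net), avg(within)
```

For hospital 1 at D = 50 km, only hospitals 2 and 4 are within range, and only hospital 2 of those is a transfer partner, so the network indicator averages over {2} while the community indicator averages over {2, 4}.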

Optimal sensor set
We determine the best sensor set we could have possibly chosen given the observed data. In order to do this, we use a greedy algorithm (S4), as the number of possible combinations of hospitals to use as sensors grows exponentially in the number of hospitals, so exhaustive search is not feasible for any but the smallest hospital transfer networks. Our fast algorithm, which (like any heuristic) is not guaranteed to give the optimal answer, chooses the sensors sequentially. We first compute the number of cases each hospital would detect and choose the one that detects the highest number of cases. We then re-compute how many new cases would be covered by each subsequent hospital if added to the existing sensor set. This continues until the sensor set covers all cases. As mentioned above, this procedure does not guarantee that we will choose the optimal sensor set for a given number of sensors N, but it is very efficient and yields an effective sensor set not far from the optimal one. In order to check that our solution is sufficiently close to the actual best solution, we used simulated annealing (S5). Simulated annealing is suitable for large-scale optimization problems, especially ones where the desired global extremum is hidden among many poorer local extrema. There is an objective function, in our case the coverage of cases, to be maximized, but the space over which that function is defined is not simply the N-dimensional space of N continuously variable parameters. Rather, it is a discrete, but very large, configuration space with a factorially large number of elements, so that it cannot be explored exhaustively. This method ensures that the global maximum (or minimum, depending on what is searched) is obtained, although it may take a diverging time.
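The sequential procedure is a standard greedy max-coverage loop; a minimal sketch, where the `coverage` mapping (hospital to the set of case IDs it would detect) and its contents are hypothetical:

```python
# Toy coverage map: hospital -> set of case IDs it would detect.
coverage = {
    "H1": {1, 2, 3, 4},
    "H2": {3, 4, 5},
    "H3": {5, 6},
    "H4": {7},
}

def greedy_sensors(coverage):
    """Repeatedly add the hospital covering the most not-yet-detected cases."""
    uncovered = set().union(*coverage.values())
    sensors = []
    while uncovered:
        best = max(coverage, key=lambda h: len(coverage[h] & uncovered))
        if not coverage[best] & uncovered:
            break                      # no hospital adds new cases
        sensors.append(best)
        uncovered -= coverage[best]
    return sensors
```

Note that the second pick here is H3 rather than the individually larger H2, because H2's cases are mostly covered by H1 already; this re-computation of marginal gains is what distinguishes the greedy loop from simply ranking hospitals once.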
We checked that we obtained the same result with progressively slower cooling schedules, so that we can trust the result to be the global extremum. The result of this method is in agreement with the result of the fast sequential algorithm.
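The simulated-annealing check can be sketched as below: for a fixed sensor-set size k, swap one sensor in and one out per step, accepting worse moves with a temperature-dependent probability. The parameters, toy coverage map, and swap move are illustrative choices, not the paper's exact implementation.

```python
import math
import random

def sa_sensors(coverage, k, steps=20000, t0=1.0, cooling=0.9995, seed=0):
    """Maximize case coverage over sensor sets of size k by annealing."""
    rng = random.Random(seed)
    hospitals = list(coverage)
    score = lambda s: len(set().union(*(coverage[h] for h in s)))
    current = set(rng.sample(hospitals, k))
    cur = best = score(current)
    best_set = set(current)
    t = t0
    for _ in range(steps):
        out = rng.choice(sorted(current))
        inn = rng.choice([h for h in hospitals if h not in current])
        cand = (current - {out}) | {inn}
        cs = score(cand)
        # Always accept improvements; accept worse moves with prob e^(delta/t).
        if cs >= cur or rng.random() < math.exp((cs - cur) / t):
            current, cur = cand, cs
            if cur > best:
                best, best_set = cur, set(current)
        t *= cooling                   # slow geometric cooling schedule
    return best_set, best

# Toy coverage map (hypothetical case IDs), sensor set of size 2.
cov = {"H1": {1, 2, 3, 4}, "H2": {3, 4, 5}, "H3": {5, 6}, "H4": {7}}
sensors, covered = sa_sensors(cov, k=2)
```

Slowing the cooling (a `cooling` value closer to 1, with more steps) corresponds to the "slower and slower" runs used to gain confidence that the global extremum was found.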
In Fig. S7 we show the results of finding the sensor set that maximizes the number of detected cases in the training dataset for the static network case. This method is data-based and tries to maximize the number of detected cases without any sensor-choosing strategy other than the optimization procedure. In this case we find that with as few as 26 sensors (0.46% of hospitals), we can detect 88% of the cases. This very high performance is, however, likely a consequence of over-fitting the model to the observed (training) data. Using this set of hospitals as sensors on a new dataset of patient transfers would likely result in lower (and more variable) performance of the sensor system.
Finally, in Fig. S8 we can see a map depicting the sensor set that is the result of the optimization for the aggregated case.

Robustness of sensor set performance
The performance of statistical methods is generally quantified using some error metric, and most fitting procedures attempt to minimize this error in the process of finding suitable values for model parameters.
It is often possible to reduce this training error by increasing model complexity, but generally the goal of modeling is for the model to perform well on a test data set, ideally an independent data set, that the model was not trained on. Good performance on a test data set, quantified by a low test error, indicates better generalization and guards against overfitting, which refers to the model adapting to the training data "too well" at the expense of poor generalizability to different realizations of data from the same data-generating mechanism.
In analogy with this approach to statistical learning, we performed a series of analyses to investigate the performance of sensor sets derived from one set of data and tested on another. The objective of the analysis is twofold. First, it enables us to ascertain the validity of our methods when applied to test data, i.e., data not used to select the set of sensors. Second, given that there are likely temporal correlations in the data, it enables us to study the performance of sensor sets on data that are temporally far removed from the training data.
Here we divided our data into disjoint (non-overlapping) windows of width L, using values of 1 month, 2 months, 4 months, 6 months, and a year for L. For each window width, we take the first window as our training data and use all subsequent windows as different realizations of test data. We used the training data to generate the sensor sets (based on in-degree, out-degree, and the greedy algorithm; we exclude the random strategy here because for it there is no real distinction between training and testing) and evaluated the relative efficacy and the percentage of cases detected separately for each test data window.
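The windowing scheme can be sketched as follows. The case records, the two-sensor budget, and the use of raw case counts as the "training" criterion are all illustrative simplifications (the paper's strategies rank by degree or use the greedy optimization); only the split-train-score structure mirrors the text.

```python
from collections import Counter

# Toy case records: (month, hospital) pairs over one year (hypothetical).
cases = [(m, h) for m in range(12) for h in ("H1", "H2", "H3")] + \
        [(m, "H4") for m in range(0, 12, 3)]

def detected_fraction(window_cases, sensors):
    hit = sum(1 for _, h in window_cases if h in sensors)
    return hit / len(window_cases)

L = 2                                   # window width in months
windows = [[c for c in cases if w * L <= c[0] < (w + 1) * L]
           for w in range(12 // L)]

# "Train" on the first window: pick the two hospitals with most cases there.
counts = Counter(h for _, h in windows[0])
sensors = {h for h, _ in counts.most_common(2)}

# Score the same sensor set on every later window as test data.
test_scores = [detected_fraction(w, sensors) for w in windows[1:]]
```

Plotting `test_scores` against the window index is the analogue of Figs. S9–S11: flat curves indicate robustness to temporal separation from the training window.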
Although intuitively one might expect the sensor sets to perform worse the greater the temporal separation between the training window and the test window, we found that our methods were robust against this separation. Little variation is observed as the validation window becomes more and more temporally separated from the training window used to construct the sensor sets (see Figs. S9–S11). This is especially counterintuitive for the sensor set obtained using the greedy algorithm because, in principle, we are over-fitting our model to the data, which should result in more variability. Nevertheless, temporal correlations in the dynamics of the system make it well behaved in this sense. An important lesson is that it is possible to determine efficient sensor sets even from outdated data.

List of hospitals included in the sensor sets
In this section we list the first 26 hospitals included in the in-degree and out-degree strategies, as well as those that arise from the greedy optimization approach.
In-degree strategy

TABLES
Abbreviations: Acquired immunodeficiency syndrome (AIDS), interquartile range (IQR), standard deviation (SD).

Fig. S5. The black curve shows the correlation coefficient of the cloud in the low incidence regime, given the cut-off marked on the x-axis; the red curve shows the corresponding coefficient for the high incidence regime.

Fig. S6. The network indicator ("hospital-acquired infections") is represented by the red curve and the community indicator by the blue curve. Error bars show the 95% confidence intervals. The effect of connected hospitals is larger than that of unconnected hospitals at least up to 250 km. c) Standard deviation of the network indicator (red) and the community indicator (blue) as a function of distance. d) Results of the regressions after standardization; they support the results in panel b (same color codes).

Fig. S7. Finding the optimal number of sensors for the best sensor selection (static network). a) shows the efficacy and b) the fraction of detected cases, both as a function of the fraction of hospitals used as sensors. There is a peak at a very low fraction of sensors, but this point corresponds to no more than 30% of detected cases. The second peak, located at around 0.005 (using 0.5% of hospitals as sensors), detects over 80% of the cases.

Fig. S9. Due to temporal correlations in the data, the sensor sets derived from the first slice of data, when applied to the remaining slices as test data, perform comparably to their performance on the training set. In all plots, the results for the training set are shown as black solid lines, while the red dashed lines refer to the sensor set applied to the test data sets. From left to right and top to bottom, the plots refer to window widths of 1 (a), 2 (b), 4 (c), 6 (d), and 12 (e) months.

Fig. S10
In-degree strategy: information vs. validation sets. The panels are arranged as above. Due to the temporal correlation of the data, the sensor sets derived from the first slice of data, when applied to the remaining slices as test data, perform comparably to their performance on the training set. Nevertheless, compared with the other strategies, the results here are slightly more variable between training and test data.