Study of risk factors for healthcare-associated infections in acute cardiac patients using categorical principal component analysis (CATPCA)

Using categorical principal component analysis, we aimed to determine the relationship between health care-associated infections (HAIs) and diagnostic categories (DCs) in patients with acute heart disease using data collected in the Spanish prospective ENVIN-HELICS intensive care registry over a 10-year period (2005–2015). A total of 69,876 admissions were included, of which 5597 developed HAIs. Two 2-component CATPCA models were developed. In the first model, all cases were included; the first component was determined by the duration of the invasive devices, the ICU stay, the APACHE II score and the HAIs; the second component was determined by the type of admission (medical or surgical) and by the DCs. No clear association between DCs and HAIs was found. Cronbach’s alpha was 0.899, and the variance accounted for (VAF) was 52.5%. The second model included only admissions that developed HAIs; the first component was determined by the duration of the invasive devices and the ICU stay; the second component was determined by the inflammatory response, the mortality in the ICU and the HAIs. Cronbach’s alpha value was 0.855, and VAF was 46.9%. These findings highlight the role of exposure to invasive devices in the development of HAIs in patients with acute heart disease.


Methods
Study sample. To conduct our study, the database of the national ENVIN-HELICS registry was used. This is a voluntary prospective registry in which the majority of Spanish ICUs participate, which collects significant information about HAIs in patients admitted for more than 24 h in the ICU between April and June. The data were obtained prospectively over a period of 10 years (from 2005 to 2015). We selected all adult patients aged over 18 whose cause of admission to the ICU was acute cardiac disease, in accordance with the categories shown in the DC variable in Table 1. The database contained a total of 175,014 admissions, of which 69,876 fulfilled this criterion.
Definition of the variables. Basic demographic data and the variables shown in Table 1 were included.
Only HAIs acquired during the ICU stay and occurring more than 48 h after admission were analysed. Both the HAI definitions and the microbiological diagnostic criteria were drawn up in accordance with the criteria of the European Centre for Disease Prevention and Control (ECDC) 12 . Device-associated HAIs were included (Table 1): ventilator-associated pneumonia (VAP), catheter-associated urinary tract infection (CA-UTI), catheter-related bloodstream infection (CRBSI), and ventilator-associated tracheobronchitis (VAT) 13 . In addition, other HAIs not associated with invasive devices were included: bloodstream infection secondary to other infection sites (BSI-S), health care-associated pneumonia (HAP), urinary tract infection not associated with urinary catheters (NCA-UTI), surgical site infection (SSI), and the miscellaneous group. The inflammatory response to the infection was classified according to the Second International Sepsis Definitions Conference 14 .
Statistical analysis. A basic descriptive analysis was conducted on the incidence of the different types of HAIs, both those associated with invasive devices and the other HAIs acquired in the ICU. Missing data were tested for randomness using Little's MCAR test.
CATPCA 11,15 is a technique derived from linear principal component analysis (PCA), which reduces a set of numerical variables to a smaller number of uncorrelated components with the smallest possible loss of information. CATPCA performs a process known as "optimal quantification", which allows nominal, ordinal, and numerical variables to be transformed into quantitative variables. CATPCA generates as many components as there are variables in the model, although generally only the 2 or 3 components that account for the greatest quantity of variance are used.
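The linear PCA step that CATPCA builds on can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used in the study: it covers only ordinary PCA on numeric data (not optimal quantification), the three variables and their values are hypothetical toy data, and the function names `correlation_matrix` and `first_component` are introduced here for illustration. The first component is extracted from the correlation matrix by power iteration, and its share of the total variance is the eigenvalue divided by the number of variables.

```python
import math

def correlation_matrix(columns):
    """Pearson correlation matrix for a list of equal-length numeric columns."""
    n = len(columns[0])
    standardized = []
    for col in columns:
        mean = sum(col) / n
        dev = [x - mean for x in col]
        norm = math.sqrt(sum(d * d for d in dev))
        standardized.append([d / norm for d in dev])
    return [[sum(a * b for a, b in zip(u, v)) for v in standardized]
            for u in standardized]

def first_component(corr, iterations=1000):
    """Largest eigenvalue and unit eigenvector of corr, by power iteration.

    The all-ones start vector is adequate for this toy example, where all
    correlations are positive; it is not a general-purpose choice.
    """
    v = [1.0] * len(corr)
    for _ in range(iterations):
        w = [sum(r * x for r, x in zip(row, v)) for row in corr]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(x * sum(r * y for r, y in zip(row, v))
                     for x, row in zip(v, corr))
    return eigenvalue, v

# Hypothetical toy data for six admissions (not taken from the registry).
device_days = [1, 2, 3, 4, 5, 6]
icu_days = [2, 1, 4, 3, 6, 5]
apache_ii = [3, 1, 2, 6, 4, 5]

corr = correlation_matrix([device_days, icu_days, apache_ii])
eigenvalue, vector = first_component(corr)
vaf = eigenvalue / len(corr)  # share of total variance in component 1
loadings = [math.sqrt(eigenvalue) * x for x in vector]
print(f"VAF of first component: {vaf:.1%}")
print("loadings:", [round(l, 3) for l in loadings])
```

Because the eigenvalues of a correlation matrix sum to the number of variables, retaining the first few components keeps most of the variance when the variables are strongly correlated.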
In the generated graph, numerical, ordinal, and nominal variables are shown as vectors after transformation. The vector length, given by the component loadings, expresses the correlation between the component and the variable: it is an indicator of the variance accounted for (VAF) and of the variable's contribution to the component. The cosines of the angles between the vectors represent the correlation coefficient between them: an angle close to zero indicates a high correlation between variables, a 90° angle indicates no relationship and a 180° angle indicates an inverse relationship.
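The geometric reading of the loading plot can be sketched in a few lines. This is an illustrative sketch only: the loadings below are hypothetical values, not results from the models, and `variable_vaf` and `angle_cosine` are names introduced here. It assumes the usual convention that a variable's VAF is the sum of its squared loadings over the retained components, and that the cosine of the angle between two vectors approximates the correlation between the corresponding transformed variables.

```python
import math

def variable_vaf(loading):
    """Sum of squared loadings over the retained components."""
    return sum(c * c for c in loading)

def angle_cosine(a, b):
    """Cosine of the angle between two loading vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical 2-component loadings (component 1, component 2).
device_days = (0.90, 0.10)
icu_stay = (0.85, 0.20)
admission_type = (0.05, 0.80)

print(round(variable_vaf(device_days), 2))
print(round(angle_cosine(device_days, icu_stay), 2))        # small angle
print(round(angle_cosine(device_days, admission_type), 2))  # near 90 degrees
```

Vectors pointing in nearly the same direction give a cosine close to 1, perpendicular vectors give 0, and opposite vectors give -1, which corresponds to the three cases described above.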
In the case of categorical variables, the graphic representation is produced in the form of centroids for each category. The proximity between the centroids indicates the relationship between the categories. It is possible to produce a graph that combines the vectors and the position of the centroids of the categorical variables.
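As a rough sketch of how such centroids can be obtained, the snippet below averages the two-dimensional object scores of all cases that share a category. The scores and category labels are hypothetical, and `category_centroids` is a name introduced here; it assumes that a category centroid is simply the mean position of the cases belonging to that category.

```python
from collections import defaultdict

def category_centroids(scores, categories):
    """Mean (x, y) object score for each category label."""
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for (x, y), category in zip(scores, categories):
        entry = sums[category]
        entry[0] += x
        entry[1] += y
        entry[2] += 1
    return {category: (sx / n, sy / n)
            for category, (sx, sy, n) in sums.items()}

# Hypothetical object scores on two components and a categorical variable.
scores = [(0.0, 0.0), (2.0, 2.0), (1.0, 5.0)]
diagnosis = ["HF", "HF", "Endocarditis"]

print(category_centroids(scores, diagnosis))
```

Categories whose centroids lie close together are made up of cases with similar positions on the components, which is what the proximity reading relies on.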
To discover the relationship between DCs and HAIs, a first CATPCA model was produced for the entire sample, including a total of 10 variables (Table 1). A second model was produced including only patients who developed HAIs, which incorporated the inflammatory response to the infection and eliminated the variables related to admission. In both cases, a model with the 2 components with the greatest quantity of variance was chosen, the outliers were eliminated, and categories of low prevalence were merged. Table 1 shows the level of analysis of the variables used.
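Merging low-prevalence categories before fitting can be sketched as below. The 5% threshold, the label "Other" and the example counts are assumptions made for illustration; the text does not state the cut-off or the merged label actually used.

```python
from collections import Counter

def merge_rare(values, min_share=0.05, other_label="Other"):
    """Replace categories whose relative frequency is below min_share."""
    counts = Counter(values)
    total = len(values)
    rare = {cat for cat, n in counts.items() if n / total < min_share}
    return [other_label if v in rare else v for v in values]

# Hypothetical diagnostic categories for 100 admissions.
diagnoses = ["AMI"] * 60 + ["HF"] * 37 + ["Endocarditis"] * 3
merged = merge_rare(diagnoses)
print(Counter(merged))
```

Very small categories tend to be placed at extreme positions in the plot, so merging them is a common way to keep the solution stable.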
A VAF of more than 0.3 was accepted as indicating a significant effect of a variable on the component 16 , and a Cronbach's alpha value of more than 0.7 was considered acceptable 17 .
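The link between Cronbach's alpha and VAF can be sketched as follows, assuming the convention commonly described for CATPCA in which alpha is computed from the eigenvalue λ and the number of variables M as α = M(λ − 1) / ((M − 1)λ). The text does not state which formula was used, so this is an assumption; `cronbach_alpha` is a name introduced here. With the 10 variables of the first model and a total VAF of 52.5% (total eigenvalue 5.25), the formula gives a value consistent with the reported 0.899.

```python
def cronbach_alpha(n_variables, eigenvalue):
    """Cronbach's alpha from an eigenvalue (assumed CATPCA convention)."""
    m = n_variables
    return m * (eigenvalue - 1.0) / ((m - 1.0) * eigenvalue)

n_variables = 10                        # first model, per the text
total_eigenvalue = 0.525 * n_variables  # total VAF of 52.5%
alpha = cronbach_alpha(n_variables, total_eigenvalue)
print(round(alpha, 3))
```

Under this convention, alpha exceeds 0 only when the eigenvalue exceeds 1, that is, when the components explain more variance than a single variable would.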

Description.
A total of 69,876 admissions (65.8% male) were included out of a total of 175,014 patients admitted to 231 ICUs during the study period. The average age was 66.8 years (CI 95%, 66.7 to 66.9). There were a total of 5,597 HAIs in 3,616 patients; therefore, 5.2% had one or more infections (CI 95%, 5.01 to 5.34). Table 2 summarizes the incidence of the different HAIs.
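The reported interval for the proportion of infected patients can be checked from the counts given in the text. The sketch below assumes a normal-approximation (Wald) interval with z = 1.96; the text does not state which method was used, and `wald_interval` is a name introduced here.

```python
import math

def wald_interval(events, n, z=1.96):
    """Proportion with a normal-approximation confidence interval."""
    p = events / n
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p, p - half_width, p + half_width

# 3616 patients with at least one HAI among 69,876 admissions.
p, low, high = wald_interval(3616, 69876)
print(f"{p:.2%} (95% CI {low:.2%} to {high:.2%})")
```

Under this assumption the limits round to 5.01% and 5.34%, matching the interval reported above.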

Discussion
To our knowledge, this is the first study on HAIs that uses this methodology instead of the traditional statistical approach based on the inferential model. A regression analysis would be the most appropriate technique to explore the risk factors for HAIs and would provide confidence intervals for each risk factor. However, the HAI concept itself encompasses heterogeneous entities, and this would require the study of each infection www.nature.com/scientificreports/ individually. In our study, 3 variables of interest were qualitative variables with multiple categories, with very different sample weights, and often with artificial or irrelevant boundaries between very similar clinical entities; this makes it difficult not only to interpret the results in a regression analysis, but the preliminary descriptive analysis of these variables would be very complex as it would require multiple contingency tables with different levels. The technique used, CATPCA, makes it possible to determine nonlinear relationships between variables and to discover patterns without a pre-established hypothesis 11 . It also makes it possible to include variables of a different nature in a single model and offers us an overview of the same through its graphic representation, which makes it a very suitable tool for the study of large datasets, without the restrictions of other statistical techniques. Although rarely used in clinical studies, CATPCA is widely used in other scientific disciplines. On this point, it is worth remembering that an exploratory statistical technique was employed to study the outbreak of cholera in London in 1854 18 , 100 years before modern data mining. The first model aims to identify the association of various risk factors and the development of HAIs in a total of 69,876 patients with cardiac diagnoses admitted to the ICU. 
In the model, we consider classification by DC, factors present on admission such as APACHE II (which includes factors such as age and previous comorbidities), and hospital complexity evaluated by medical-surgical character. We included the risk associated with the need for invasive devices, an essential factor in the development of HAIs 2 . The graph, which reflects the relationship established by the model between the categories of variables, distributes said categories according to their severity in the first component and medical-surgical character in the second component.
In the first component, we observe a distribution from lower to higher severity of patients, which is reflected in the APACHE II score, the DCs, the duration of the invasive devices and the development of HAIs. However, DCs and different HAIs are not associated with each other in a natural way, indicating that the development of HAIs is related more to the intrinsic severity of the patient and to the exposure to invasive devices maintained (in turn, a consequence of the severity) than to the aetiological classification on admission.
The first model has a large proportion of patients with less severe disease who do not develop HAIs. To verify whether the lack of association between HAIs and DCs is maintained, the second model limits the study to patients who presented some type of HAIs. As in the first model, the first component is principally explained by exposure to invasive devices. The second component has a different significance related to the inflammatory response to the infection and its impact on mortality.
The distribution of the infections in the first component shows a predisposition to infection associated with the devices (VAP, CRBSI) in patients with a high intrinsic severity (APACHE II greater than 20) and exposure times to devices in excess of 5-6 days. Regarding the distribution in the second component (lower vs. upper area) according to its impact in terms of inflammatory response and severity, particularly noteworthy is the low significance of UTI, VAT and SSI; in contrast, HAP, VAP, CRBSI and BSI-S are strongly associated with the development of sepsis and mortality. The distance between VAP and VAT in the second model highlights the differences between the two processes 19,20 .
The distribution of DCs is related to their intrinsic severity and therefore to exposure to invasive devices during admission. As in the first model, the surgical patient is isolated in a quadrant independent of the other categories, which suggests that surgical patients should be treated as a specific entity in studies aimed at researching HAIs 21 .
There is an interesting discovery with regard to the distribution of certain DCs in relation to the prognosis: The first model places endocarditis, non-ACS CS and CA in the first component as categories with higher severity, mortality and incidence of infections. However, this perspective changes in the second model: the second component specifically expresses the prognosis associated with the development of HAIs, but CA and non-ACS CS strikingly lose the importance of the first model. In the case of patients with CA, the high incidence of pneumonias in which bronchoaspiration plays a key role is well known 22,23 . However, premature mortality of neurological or cardiac origin would reduce the impact of HAIs.
Non-ACS CS demonstrates a similar behaviour to CA with regard to its position in the second model, which indicates that the principal determining factor of the prognosis is not infections.
Particularly noteworthy is the different behaviour of complicated AMI in the two models. The first model links complicated AMI to related categories such as heart failure (HF) and cardiogenic pulmonary oedema (CPE), distancing them from serious HAIs. However, in the second model, it is closer to HAIs with a higher inflammatory response and mortality: the HAIs show a major prognostic significance in this subgroup, which differentiates it from non-ACS CS.
The high incidence of HAIs is well documented in registries of patients with AMI 24 and specific studies on CS after ACS 25,26 . Recent studies point to the higher severity of CS after ACS 27 compared to non-ACS-CS, which includes patients with acutely decompensated chronic heart failure.
Both models demonstrate the importance of external factors in the development of HAIs in acute cardiac patients: exposure to invasive devices is the principal determining factor of HAIs, as shown by the VAF of those variables in the first component of both models. Increased exposure to devices in patients with a higher intrinsic severity may partially explain the development of HAIs; however, mechanisms have been suggested that favour the development of HAIs in more acute cardiac patients, such as immune paralysis 28,29 . Our study shows that DCs on admission play a secondary role in the development of HAIs; however, it also reveals the existence of diagnostic groups with similar behaviour or, on the contrary, markedly different behaviour in terms of severity, prognosis and development of HAIs.
Our study has certain limitations. CATPCA is an exploratory technique in which no hypotheses are established based on a dependent variable and consequently no causal relationships are established, only associations. The ENVIN-HELICS registry is aimed at the study of HAIs in critical patients and does not contain specific information about the DCs that could be of interest, such as a category for CS owing to acute ischaemic heart disease or the performance of coronary intervention. Nor does it record the need for circulatory support with ECMO, which is a recognized risk factor in the development of HAIs 30 .
In our study, only the APACHE II score was available. In postoperative cardiac patients, higher scores on the EUROSCORE index have been related to a greater risk of HAIs 31 ; however, the role of other severity scores in nonsurgical cardiac patients 32,33 is not well defined.
The most important clinical implication of our study is that it shows the existence of higher risk subgroups within acute cardiac patients, on which prevention measures to reduce the incidence of HAIs, such as reducing the duration of invasive devices, should be focused.

Conclusion
By applying the CATPCA methodology in the ENVIN-HELICS registry, it is possible to obtain an overview of the factors involved in the development of HAI in acute cardiac patients, highlighting the role of exposure to invasive devices in the development of HAI and revealing the consequences of HAI in terms of severity in certain specific DCs.