Identification of distinct immune activation profiles in adult humans

Latent infectious agents, microbial translocation, some metabolites and immune cell subpopulations, as well as senescence modulate the level and quality of activation of our immune system. Here, we tested whether various in vivo immune activation profiles may be distinguished in a general population. We measured 43 markers of immune activation by 8-color flow cytometry and ELISA in 150 adults, and performed a double hierarchical clustering of biomarkers and volunteers. We identified five different immune activation profiles. Profile 1 had a high proportion of naïve T cells. By contrast, Profiles 2 and 3 had an elevated percentage of terminally differentiated and of senescent CD4+ T cells and CD8+ T cells, respectively. The fourth profile was characterized by NK cell activation, and the last profile, Profile 5, by a high proportion of monocytes. In search of etiologic factors that could determine these profiles, we observed a high frequency of naïve Treg cells in Profile 1, contrasting with a tendency to a low percentage of Treg cells in Profiles 2 and 3. Moreover, Profile 5 tended to have a high level of 16s ribosomal DNA, a direct marker of microbial translocation. These data are compatible with a model in which specific causes, such as the frequency of Treg or the level of microbial translocation, shape specific profiles of immune activation. It will be of interest to analyze whether some of these profiles preferentially drive some of the morbidities known to be fueled by immune activation, such as insulin resistance, atherothrombosis or liver steatosis.

Persistent immune activation (IA) fuels major chronic morbidities, including insulin resistance, metabolic syndrome, diabetes, atherothrombosis, neurocognitive disorders and liver steatosis. A model for IA is HIV-1 infection under efficient combined antiretroviral therapy. In people living with HIV-1, the immune system remains potentially activated, even if viral replication is controlled by the treatment 1. To better characterize IA in this model, we previously measured a series of cell-surface and soluble markers in 120 efficiently treated HIV patients. A hierarchical clustering analysis identified 5 different IA profiles in these people 2. To test whether the IA profiles were robust rather than specific to the 120 patients we had analyzed, we recruited 20 more HIV patients with divergent bioclinical characteristics, and performed the hierarchical clustering analysis again 3. Once more, we observed 5 different IA profiles in these 140 HIV patients. We also analyzed the possibility that these IA profiles were the consequence of different causes. In favour of this model, we found a link between microbial translocation and one of the IA profiles 3.
The general population, particularly in old age, shares many of the causes of IA with people living with HIV-1. First, we all harbour infectious agents that trigger our immune system 4 . Second, there is a low level of microbial translocation in each individual that increases over time 5 . Third, metabolic disorders that may stimulate the immune system also increase with age 4 . Fourth, as with any senescent cell, immune cells in aging individuals release factors responsible for IA 4 . And last, the efficiency of Treg cells, a CD4+ T cell subpopulation whose function is to downregulate IA, decreases over time 6 . Therefore, we reasoned that a more general population www.nature.com/scientificreports/ might also present with different IA profiles driven by these various etiologic factors. To test this hypothesis, we looked in the present study for the presence of distinct IA profiles in a general population, and for potential etiologic factors linked to these profiles.

Materials and methods
Study design. We recruited 150 adults over 55 years and below 70 years of age, affiliated to the French Social Security system who volunteered for a free health checkup at a Social Security Center in Nîmes, France. Pregnant women, people under immunomodulatory treatment or with diseases likely to modify their immune system were not included. This study was approved by the French Ethics Committee Sud Est IV. All methods were carried out in accordance with the French guidelines and regulations. All individuals had provided written informed consent. The trial was registered on ClinicalTrials.gov under the reference NCT04028882.
For Treg quantification, cells were first fixed with Reagent 1 of the IntraPrep Permeabilization kit (Beckman Coulter) in the dark, and then stained with the CD4-FITC/CD45RA-ECD/CD25-PC7/CD127-APC750 cocktail of antibodies. Secondly, cells were permeabilized and an anti-FoxP3-APC antibody was added. Finally, red blood cells were lysed using Reagent 2. After one hour, cells were washed with Reagent 3.
Cells were run on a Navios flow cytometer and results were analyzed using Kaluza® software (Beckman Coulter). A minimum of 20,000 lymphocytes were gated to analyze the subpopulations. We controlled the inter-run variability with the same batch of Rainbow 8-peak beads (Beckman Coulter). During the study, no voltage adjustment was necessary to keep the beads within their respective defined targets.
Soluble immunologic markers in peripheral blood. ELISA was used to quantify soluble TNF receptor I (sTNFRI), soluble CD14 and soluble CD163 (sCD163) (Quantikine, R&D Systems, Rennes, France), as well as tissue Plasminogen Activator (tPA) and soluble Endothelial Protein C Receptor (sEPCR) (Asserachrom, Stago, Asnières-sur-Seine, France) in plasma collected in EDTA Vacutainer tubes (Becton Dickinson, Le Pont-de-Claix, France) and frozen. C-Reactive Protein (CRP) and immunoglobulins (Ig) were measured by turbidimetry in plasma collected in the same way. 16s ribosomal bacterial DNA was measured in plasma by quantitative PCR as previously described 7.
Statistical analysis. All data were standardized before statistical analysis. First, a visual assessment of the possibility to cluster the data was made using principal component analysis, and also by seeking a cluster structure in the distance matrix. Next, the Hopkins statistic was calculated, with a value close to 1 indicating the strongest clustering tendency 8. Second, we determined the optimal number of clusters using several indexes (e.g., Silhouette 9, Gap statistic 10). The majority rule was used to determine the optimal number of clusters. Third, we performed two hierarchical clustering analyses. One clustering analysis was carried out for volunteers, using the Euclidean distance between individuals, and the other for markers, using 1 − |correlation| as the distance. For both of them, Ward's minimum variance method was used as the linkage method. We then generated a heatmap using the classification of volunteers and markers. We evaluated the appropriateness of the classification through an internal validation test. We used two indexes based on compactness and separation: (i) the silhouette width, which varies between −1 and 1, representing a wrong and a perfect classification, respectively 9, and (ii) the Dunn index, which varies from 0 to infinity and should be maximized 11,12. In addition, to analyze the significance of the hierarchical classification, we performed a permutation test on volunteer groups. To this aim, we computed the ratio of between-group to within-group variability in the true groups using the lda function of the MASS package (MASS_7.3-51.6) of the R software (R version 4.0.2 (2020-06-22)). This function computes the singular values, which give the ratio of the between- and within-group standard deviations on the linear discriminant variables. Their squares are the canonical F-statistics. The sum of all singular values provides the global ratio of between-group to within-group variability.
In parallel we generated random groups by permutation of group labels. We repeated the permutation 1000 times and visualized their distribution compared to the ratio obtained from true groups.
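The permutation procedure described above can be sketched in a few lines. The following Python example is a minimal illustration and not the R code used in the study: it swaps the lda-based singular values for a simpler statistic (between-group over within-group sum of squares), and the function names, the add-one p-value correction and the fixed seed are assumptions made for the example.

```python
import random


def between_within_ratio(rows, labels):
    """Between-group / within-group sum of squares, summed over all variables.

    Simplified stand-in for the lda-based ratio; assumes the within-group
    sum of squares is not zero.
    """
    n_vars = len(rows[0])
    grand = [sum(r[k] for r in rows) / len(rows) for k in range(n_vars)]
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(label, []).append(row)
    between = 0.0
    within = 0.0
    for members in groups.values():
        centre = [sum(r[k] for r in members) / len(members) for k in range(n_vars)]
        between += len(members) * sum((c - g) ** 2 for c, g in zip(centre, grand))
        within += sum((r[k] - centre[k]) ** 2 for r in members for k in range(n_vars))
    return between / within


def permutation_test(rows, labels, n_perm=1000, seed=0):
    """Compare the observed ratio with ratios obtained after shuffling labels."""
    rng = random.Random(seed)
    observed = between_within_ratio(rows, labels)
    shuffled = list(labels)  # group sizes are preserved by shuffling
    null = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        null.append(between_within_ratio(rows, shuffled))
    # Add-one correction (an assumption here) keeps the p-value above zero
    p_value = (1 + sum(v >= observed for v in null)) / (n_perm + 1)
    return observed, null, p_value
```

Calling `permutation_test(rows, labels)` on standardized marker values returns the observed ratio, the list of permuted ratios (which can be plotted as a histogram) and an empirical p-value.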
In order to characterize each immune profile, a V-test was calculated 13. The larger the absolute value of the V-test, the more characteristic the variable is of the profile. All analyses were performed using R software version 3.6.1 (R Development Core Team, A Language and Environment for Statistical Computing, Vienna, Austria, 2016. https://www.R-project.org/).
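For readers unfamiliar with the V-test, the sketch below shows one common form of the statistic: the difference between the group mean and the overall mean of a marker, divided by the standard error of the mean of a group of that size drawn without replacement. This is an illustrative Python version under stated assumptions, not the authors' code; the function name is hypothetical, and the use of the sample variance (n − 1 denominator) is an assumption that the text does not specify.

```python
import math


def v_test(values, in_group):
    """V-test of one marker for one profile.

    values   : marker value for every individual
    in_group : parallel list of booleans, True for members of the profile
    Assumes the profile is a strict, non-empty subset and the marker varies.
    """
    n = len(values)
    group = [v for v, flag in zip(values, in_group) if flag]
    n_g = len(group)
    overall_mean = sum(values) / n
    group_mean = sum(group) / n_g
    # Sample variance over all individuals (assumption: n - 1 denominator)
    variance = sum((v - overall_mean) ** 2 for v in values) / (n - 1)
    # Standard error with the finite-population correction (n - n_g) / (n - 1)
    std_error = math.sqrt(variance / n_g * (n - n_g) / (n - 1))
    return (group_mean - overall_mean) / std_error
```

A positive value indicates that the marker is higher in the profile than in the whole sample, a negative value that it is lower.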
We used the Mann-Whitney test to compare markers and IA profiles. The links between biomarkers were determined by Spearman rank correlations.
Fig. 1. IgG, IgA, IgM, and sCD163 peripheral blood levels were used as markers of B-cell and monocyte activation, respectively. Inflammation was evaluated via sTNFRI and CRP concentrations, and endothelium activation via sEPCR and tPA concentrations in peripheral blood.
For both markers and individuals, a clustering tendency was observed (Hopkins statistic: 0.68 and 0.73, respectively). We performed two independent hierarchical clustering analyses, one for the activation markers and another one for the volunteers. The number of clusters chosen for markers and donors was 2 and 5, respectively, as these corresponded to the results obtained with the majority of the indexes we tested. Thus, the analysis of volunteers identified 5 groups of individuals presenting with different IA profiles (Fig. 2). Concerning the internal validation step, the Dunn index and the silhouette width were 0.26 and 0.06 for individuals, and 0.39 and 0.12 for markers, respectively. To show that the groups identified by hierarchical classification reflect a true structure of the data, we generated random groups of the same size as the true ones. The within-group and between-group variability make it possible to assess the quality of clusters, as "good" clusters are compact (individuals in the same group have similar properties, reflected in a low within-group variability) and far from each other (individuals in different groups present distinct profiles, reflected in a high between-group variability). Hence, random groups, or data with no defined clusters, would show a lower ratio of between-group to within-group variability. The distribution of the ratio of the between- and within-group standard deviations on the linear discriminant variables is represented in Fig. 3. The histogram shows that the between/within ratio is much higher for the real groups, as even the "best" random grouping exhibits a much lower ratio. Indeed, the median value for the random groups is 15.22 (minimum 12.04, maximum 18.92), whereas the value is 42.85 for the true groups identified by the hierarchical classification (p < 0.001). Next, we computed a Principal Component Analysis of the volunteers (Fig. 4).
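As an illustration of the two internal validation indexes reported above, the following Python sketch computes the mean silhouette width and the Dunn index from a set of points and their cluster labels. It is a minimal example under stated assumptions rather than the implementation used in the study: the function names are hypothetical, Euclidean distance is used throughout, and the Dunn index is taken in its simplest form (smallest distance between points of different clusters divided by the largest distance between points of the same cluster).

```python
import math


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean_silhouette(points, labels):
    """Average silhouette width (between -1 and 1); assumes at least two clusters."""
    clusters = set(labels)
    widths = []
    for i, p in enumerate(points):
        # Mean distance from p to the members of each cluster, p itself excluded
        mean_dist = {}
        for c in clusters:
            d = [euclidean(p, q) for j, q in enumerate(points)
                 if labels[j] == c and j != i]
            if d:
                mean_dist[c] = sum(d) / len(d)
        own = labels[i]
        if own not in mean_dist:  # singleton cluster: width set to 0 by convention
            widths.append(0.0)
            continue
        a = mean_dist[own]                                       # cohesion
        b = min(v for c, v in mean_dist.items() if c != own)     # separation
        widths.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(widths) / len(widths)


def dunn_index(points, labels):
    """Smallest between-cluster distance over largest within-cluster distance.

    Assumes at least one cluster contains two distinct points.
    """
    min_between = math.inf
    max_within = 0.0
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            d = euclidean(points[i], points[j])
            if labels[i] == labels[j]:
                max_within = max(max_within, d)
            else:
                min_between = min(min_between, d)
    return min_between / max_within
```

Both functions take standardized values as `points` and the cluster assignment as `labels`; compact, well-separated clusters give a silhouette width close to 1 and a large Dunn index.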

Links between immune activation profiles and etiologic factors.
In humans, various factors may be responsible for chronic immune activation. For instance, a deficiency in the mechanisms responsible for downregulating IA may result in overactivity of the immune system. Microbial translocation, the entry into our organism of microbial products originating from our microbiota, is another potential cause of IA. In a given individual, only some of these etiologic factors may be at work, e.g., immune dysregulation, but not microbial translocation. Therefore, we tested the hypothesis that the IA profiles that we unraveled might be
Another potential cause of IA is the intensity of microbial translocation, which increases with age 5. A direct marker of microbial translocation is the presence of bacterial DNA in the circulation, quantified by PCR targeting conserved sequences of the 16s ribosomal gene (rDNA). Strikingly, we observed a higher, although not significantly different, level of rDNA in Profile 5 people compared with the other volunteers (45 ± 59 versus 14 ± 13 copies/mL, p = 0.242, Fig. 6G).

Discussion
In this study, we have shown that various IA profiles may be distinguished using an unsupervised learning method in a population of adults volunteering for a health checkup. We revealed 5 distinct IA profiles that may be characterized according to their levels of CD4+ T cell, CD8+ T cell, NK cell and monocyte frequency, activation, and/or differentiation. One profile, Profile 1, has the lowest level of differentiated and senescent T cells, of activated NK cells, and the lowest percentage of monocytes. It may therefore be considered as the least activated profile. Of note, it is the group with the highest percentage of women. Profiles 2 and 3 are notable for their reduced percentages of naïve T cells and their elevated percentages of differentiated and senescent T cells. These profiles may therefore be considered as the T cell activated profiles. In Profile 4 it is the NK cells that are activated, and in Profile 5 it is the frequency of circulating monocytes that is noteworthy.
A second finding of this study is that some of these IA profiles are linked to potential causes of IA. Thus, Profile 5 individuals, characterized by a high frequency of peripheral blood monocytes, have a high circulating bacterial DNA load. However, the difference in rDNA level between Profile 5 and the other profiles was not significant, probably due to the small number of participants with this profile (n = 5). Nevertheless, this link between microbial translocation and the frequency of circulating monocytes is in line with the observation that bacterial products may boost monopoiesis via TLR signaling 15. Profiles 2 and 3, which are distinguished by their high degree of CD4+ T cell and CD8+ T cell differentiation and senescence, tend to have low percentages of Treg. As Treg are known to interrupt the process of T cell activation 16, these low Treg levels might participate in the increased T cell differentiation and senescence specific to these profiles.
This hypothesis is supported by our observation of an inverse correlation between the percentage of Treg cells on the one hand, and of differentiated and senescent CD4+ T cells on the other. Conversely, the high percentage of naïve Treg cells in Profile 1 may at least partly explain the low level of T cell differentiation and senescence in that profile.
One of the limitations of our study is that it is cross-sectional, highlighting only correlations. Further analysis is needed to definitively establish causative links between etiologic factors and IA profiles. Also, additional etiologic factors, different from the ones we tested, could shape the IA profiles, for instance the genetic background and the clinical history. Moreover, the stability of IA profiles over time has to be verified. Our study is also limited by the technology we used. Thus, we did not test the functionality of the various immune cell subpopulations we analyzed. The immune phenotyping could also be made more precise by using single-cell transcriptomic analysis, high-dimensional flow cytometry or CyTOF. On the other hand, our ultimate goal is to identify a simple signature of easily measurable markers characteristic of immune activation profiles that could fuel immune activation-induced morbidities. This goal may be achievable in routine practice with the tools we used.
Globally, we show that different IA profiles may be distinguished in a general population, and that some of these profiles are linked to potential etiologic factors such as Treg frequency and microbial translocation.
We propose a model in which, in each individual, one or a few specific causes of IA shape a specific IA profile. Of interest, the different IA profiles we describe here may fuel different morbidities among those driven by IA, such as insulin resistance, atherothrombosis or liver steatosis. Under this hypothesis, immune profiling might help to tailor the prevention and the screening of these IA-induced diseases. Moreover, deciphering the soluble immune factors favoring each of these morbidities might open the way to new therapeutic strategies. In the near future, the immune activation profile of each individual might be identified via a simple signature of a reduced number of easily measurable activation markers, and this immune activation profile might predict the chronic morbidities this individual is at risk of developing.