Multiplex analysis of 40 cytokines does not allow separation between endometriosis patients and controls

Endometriosis is a common gynaecological condition characterized by severe pelvic pain and/or infertility. The combination of nonspecific symptoms and invasive laparoscopic diagnostics has prompted researchers to evaluate potential biomarkers that would enable a non-invasive diagnosis of endometriosis. Endometriosis is an inflammatory disease; thus, different cytokines represent potential diagnostic biomarkers. As panels of biomarkers are expected to enable better separation between patients and controls, we evaluated 40 different cytokines in plasma samples of 210 patients (116 patients with endometriosis; 94 controls) from two medical centres (Slovenian, Austrian). Results of the univariate statistical analysis showed no differences in concentrations of the measured cytokines between patients and controls, which was confirmed by principal component analysis showing no clear separation between these two groups. To test the hypothesis of a more profound (non-linear) differentiating dependency between features, machine learning methods were used. We trained four common machine learning algorithms (decision tree, linear model, k-nearest neighbour, random forest) on data from plasma levels of proteins and patients' clinical data. The constructed models, however, did not separate patients with endometriosis from the controls with sufficient sensitivity and specificity. This study thus indicates that plasma levels of the selected cytokines have limited potential for diagnosis of endometriosis.

B cells, adhesion and cell chemotaxis. Cytokines via inflammation can therefore influence the onset and progression of endometriosis 11. These proteins include growth factors, interferons, interleukins (IL) and chemokines 12,13. Chemokines are a group of small (8-10 kDa) pro-inflammatory polypeptides and signalling proteins that induce chemotaxis and are involved in the inflammatory response. Based on the distance between the first two cysteine residues, chemokines can be divided into four groups, namely C (γ chemokines), CC (β chemokines), CXC (α chemokines), and CX3C (δ chemokines). The CXC group of chemokines can be further subdivided according to the presence/absence of the ELR (glutamic acid-leucine-arginine) motif 14. Since cytokines and chemokines can be released into the bloodstream, their plasma/serum concentrations can easily be determined and thus represent potential biomarkers for the non-invasive diagnosis of endometriosis. Several thorough reviews by May et al., Rižner, Gupta et al. and Nisenblat et al. describe potential biomarkers for endometriosis reported from 1984 to 2015 [15][16][17][18]. In addition, Borrelli et al. systematically reviewed published studies on chemokines as potential biomarkers of endometriosis; in total 27 different chemokines have been evaluated, and the majority of the studies focused on the diagnostic potential of CXCL8, CCL2 and CCL5 19. The authors of these systematic reviews emphasized the importance of employing high quality standardized procedures when evaluating biomarkers for the diagnosis of endometriosis, from sample collection and storage to the collection of more detailed clinical data. These reviews also emphasized the need for multicentre validation studies performed on an independent set of patients from different populations.
In our previous study we evaluated the concentrations of 16 cytokines and other secretory proteins in peritoneal fluid and serum samples from patients with ovarian endometriosis, benign ovarian cysts and healthy women 20. In peritoneal fluid the models with the highest diagnostic accuracies included: (i) IL-8 and the ratio of ficolin2 to glycodelin and (ii) the ratio of biglycan to leptin and also the ratio of RANTES to IL-6; both in combination with age; the model with the highest diagnostic accuracy had an area under the curve (AUC) of 0.9. In serum the best characteristics were shown for models including: (i) the ratio between leptin and glycodelin and (ii) the ratio between ficolin2 and glycodelin; again both in combination with age; here the models with the highest diagnostic accuracies had slightly lower AUCs of 0.86 and 0.85, respectively 20. The present study was performed on a different set of patient samples that were collected from two medical centres (Slovenian, Austrian) and included evaluation of 40 different cytokines, mainly chemokines, in plasma samples. We decided to evaluate a different set of proteins from those in the aforementioned studies in order to broaden the set of potential biomarkers for further validation studies that could include previous, as well as potential novel, biomarkers. To the best of our knowledge this is the first study that evaluated such a broad spectrum of inflammatory proteins in plasma samples from a large, well-defined group of patients with different types of endometriosis. The aim of the present study was therefore to evaluate whether a single cytokine or a combination of cytokines in a large, well-defined patient population can differentiate patients with endometriosis from control patients.
If we identified cytokines with diagnostic potential we planned to design a diagnostic model with sufficient sensitivity and specificity, based on the plasma concentrations of cytokines and gathered patients' clinical data, and with the use of appropriate statistical and bioinformatics analysis.

Materials and Methods
Study design and sample source. The prospective case-control study was approved by both (i.e. Slovenian and Austrian) National Medical Ethics Committees (0120-127/2016-2 and EMMA 545/2010, respectively) and all the participants signed their written informed consent before being included in the study. Inclusion criteria comprised endometriosis-like symptoms (i.e. infertility and/or pain) as well as benign gynaecological conditions (i.e. different types of cysts and/or myomas). Exclusion criteria included pregnancy, age below 18 or above 50 years, menopausal status, gynaecological malignancies, other types of cancer, cancelled operation, HIV infection and the presence of haemolysis in plasma samples. The aim was to collect approximately 200 samples, with an approximately one-to-one ratio of patients and controls, to achieve more than 80% statistical power (probability of rejecting the null hypothesis if it is false) and a Type I error rate of less than 5%, under the assumption that mean concentrations of cytokines noticeably differ between conditions. Patient enrolment took place from March 2013 to September 2016 at the Departments of Obstetrics and Gynaecology, University Medical Centre Ljubljana, Slovenia and the Medical University Vienna, Austria. At both departments patients were recruited by senior gynaecologists with the help of study nurses. Blood samples were analysed in 2016. The time interval between recruitment/surgery and blood analysis (index test) ranged from a few weeks to 3 years. On the day of the surgery (Vienna) or one day to one week before surgery (Ljubljana) blood samples were collected according to a strict standard operating procedure. Blood samples of 4 ml were taken into BD Vacutainer tubes (#368861, Becton Dickinson and Company, NJ, USA). Within one hour after collection the samples were centrifuged at 1400 g for 10 min at 4 °C. The plasma was aspirated and samples were aliquoted into 100 μL volumes and stored at −80 °C until analysis.
Participants were interviewed regarding their ethnic origin, lifestyle (i.e. diet, smoking status, sport and recreation, stress level), medical history, especially with regard to the different types of pain that are associated with endometriosis (pelvic pain, dysmenorrhea, dyschezia and dyspareunia), as well as medication intake a week prior to surgery and the use of oral contraceptives and hormonal therapy, current or in the three months prior to surgery. The intensity of dysmenorrhea and dyspareunia was evaluated using a validated visual analogue scale of 10 points. The reference test was laparoscopy (in exceptional cases laparotomy) with visualization of typical lesions and histological evaluation. Laparoscopy and laparotomy were performed by expert surgeons with at least ten years of experience. In total, 210 of 233 patients met the inclusion criteria; of these, 116 were characterized laparoscopically (or by laparotomy) and histologically by the presence of endometriosis and 94 by its absence (Table 1, Fig. 1).
Additional pathologies/conditions were identified after the surgical procedure. The phase of the menstrual cycle was estimated based on the date of the last menstruation and the thickness, as well as appearance, of the endometrium determined by ultrasound. The study was designed to meet the principles of the Declaration of Helsinki.
Biomarker measurements. All methods were performed in accordance with the relevant guidelines and regulations. The Luminex xMAP multiplexing and the Bio-Plex Pro™ Human Chemokine Assay platforms (#171ak99mr2, lots: #64025638, #64040537 Bio-Rad Laboratories, CA, USA) were used according to the manufacturer's protocol. Briefly, the method is based on 5.5 μm polystyrene beads that are labelled with two different fluorescent dyes in different ratios assigned for each individual antibody, thus enabling quantification of 40 different cytokines, mainly chemokines, in each sample (Table 2). The intra-assay and inter-assay variability of the Human Chemokine Assay, as specified by the producer, was 2-6% CV and 2-8% CV, respectively. The samples were anonymized and the person performing the assays was blind to the identity of the samples and the result of the surgery. According to the producer's instruction manual, plasma was diluted fourfold prior to analysis. Bio-Plex™ Manager Software with 5-parameter logistic regression modelling was used to calculate final concentrations. Calibrations and verifications were performed prior to every analysis with the use of commercially available and recommended kits (MPX-PVER-K25, MPX-CAL-K25; Luminex, Austin, Texas, USA). Clinical data that were obtained from the patients (i.e. metadata) were included in the statistical modelling.
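The conversion from a measured bead signal to a plasma concentration can be illustrated with a short sketch. It assumes the commonly used five-parameter logistic (5PL) form; the curve parameters, the example concentration and the function names are hypothetical and do not come from the study or from the Bio-Plex software. Only the fourfold dilution factor is taken from the text.

```python
# Sketch of a 5PL standard curve and its inverse (assumed textbook form):
#   y = d + (a - d) / (1 + (x / c) ** b) ** g
# a: response at zero concentration, d: response at saturation,
# c: mid-range concentration, b: slope, g: asymmetry.

def five_pl(x, a, b, c, d, g):
    """Signal predicted by the 5PL curve for concentration x."""
    return d + (a - d) / (1.0 + (x / c) ** b) ** g


def inverse_five_pl(y, a, b, c, d, g):
    """Concentration that yields signal y on the same 5PL curve."""
    return c * (((a - d) / (y - d)) ** (1.0 / g) - 1.0) ** (1.0 / b)


# Hypothetical curve parameters for one analyte (illustration only).
params = dict(a=20.0, b=1.2, c=150.0, d=25000.0, g=0.8)

DILUTION_FACTOR = 4  # plasma was diluted fourfold prior to analysis

well_concentration = 80.0  # hypothetical concentration in the diluted well
signal = five_pl(well_concentration, **params)

# Read the concentration back from the curve, then correct for dilution.
recovered = inverse_five_pl(signal, **params)
plasma_concentration = recovered * DILUTION_FACTOR

print(f"signal = {signal:.1f}")
print(f"plasma concentration = {plasma_concentration:.1f}")
```

In practice the five parameters are fitted to the standards of each plate; signals outside the fitted range cannot be inverted reliably, which is one reason analytes can be reported as out of range.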
The data were processed using Microsoft Excel 2003, and for statistical analysis we used GraphPad Prism Software version 5.00 for Windows (San Diego, CA, USA), R programming language 21 version 3.4.3 (2017-11-30) -"Kite-Eating-Tree" and R Studio version 1.1.383 with packages such as mice, caret and ggplot2. Corrected P value of <0.05 was considered significant.

Statistics. For univariate statistical analysis a two-sided Wilcoxon rank-sum test (Mann-Whitney U test) was used to assess the statistical significance of the difference in plasma concentrations of 40 different cytokines and chemokines between endometriosis patients (also stratified by type of endometriosis) and the control group of women. Results of the univariate analysis were then corrected for multiple testing using Bonferroni's correction. The Shapiro-Wilk test was used to assess the normality of the distributions. Fisher's exact and Chi-square tests were used for comparison of categorical variables. Results of the descriptive analysis (i.e. patients' clinical data) were presented as mean ± standard deviation (SD), while the concentrations of the measured proteins were presented as median and also as mean ± SD (Tables 1 and 2, respectively). Before further analysis we excluded proteins with reported out-of-range concentrations (i.e. GM-CSF, CXCL5). In addition to the remaining single proteins, further variables were constructed that represented ratios of the protein concentrations. A batch effect between samples collected in different centres was identified with principal component analysis (PCA) and removed using mean-centring and normalisation of standard deviations of all protein features across samples from each batch.
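The batch-effect correction described above (mean-centring and scaling to unit standard deviation within each centre) can be sketched for a single protein feature as follows. The concentrations and the helper name are hypothetical, and the use of the sample standard deviation is an assumption; the study itself performed this step in R.

```python
from statistics import mean, stdev


def standardise_within_batch(values, batches):
    """Mean-centre and scale one feature to unit SD separately per batch."""
    corrected = [0.0] * len(values)
    for batch in set(batches):
        idx = [i for i, b in enumerate(batches) if b == batch]
        batch_values = [values[i] for i in idx]
        m = mean(batch_values)
        s = stdev(batch_values)  # sample SD (assumption)
        for i in idx:
            corrected[i] = (values[i] - m) / s
    return corrected


# Hypothetical concentrations of one protein; the second centre reads higher.
concentrations = [12.0, 15.0, 11.0, 14.0, 30.0, 36.0, 33.0, 39.0]
centres = ["Ljubljana"] * 4 + ["Vienna"] * 4

corrected = standardise_within_batch(concentrations, centres)

for centre in ("Ljubljana", "Vienna"):
    vals = [v for v, c in zip(corrected, centres) if c == centre]
    print(centre, round(mean(vals), 6), round(stdev(vals), 6))
```

After this transformation each centre has mean 0 and SD 1 for the feature, so a systematic offset between centres no longer dominates the leading principal components. The trade-off is that any true biological difference between the two populations is removed along with the technical one.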
Machine learning. Machine learning algorithms such as decision tree 22, generalised linear model 23, weighted k-nearest neighbour 24 and random forest 25 were applied to identify proteins or panels of proteins that would discriminate patients with endometriosis from the controls. The R packages rpart, glmnet, kknn and randomForest were used to implement the aforementioned models. The selected machine learning methods represent very popular, yet intrinsically different, classes of classification algorithms. Each employed method is sufficiently simple to produce interpretable results, but at the same time powerful enough to model complex and often non-linear interactions between input features. To ensure robustness of the reported results, 4-fold repeated cross-validation (4-fold repeated CV) was used. For each classifier the average accuracy across all folds and repetitions was reported. The reported accuracy was compared with the accuracy of a hypothetical random classifier trained on the same data to assess the diagnostic potential of the trained models. When the number of samples was not equal in the modelled groups, balanced accuracy, which takes into account the imbalanced representation of samples, was applied instead of regular accuracy. We included additional clinical data in our analysis, such as the use of hormonal therapy and/or oral contraception three months prior to surgery and medication intake a week prior to surgery, as potentially important confounders or effect modifiers. The obtained metadata are included in Table 1 and in Supplementary Table S1.
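The evaluation scheme described above (repeated k-fold cross-validation scored with balanced accuracy and compared with a chance-level reference) can be sketched as follows. This is an illustration only: the classifier is a placeholder that always predicts the majority class, standing in for the actual rpart, glmnet, kknn and randomForest models, and the number of repetitions (10) and the seed are assumptions. Only the group sizes (116 and 94) and the 4 folds come from the text.

```python
import random


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall; 0.5 is chance level for two classes."""
    recalls = []
    for cls in sorted(set(y_true)):
        idx = [i for i, y in enumerate(y_true) if y == cls]
        recalls.append(sum(y_pred[i] == cls for i in idx) / len(idx))
    return sum(recalls) / len(recalls)


def repeated_kfold(n_samples, k, repeats, seed=0):
    """Yield (train_idx, test_idx) pairs for repeated k-fold CV."""
    rng = random.Random(seed)
    for _ in range(repeats):
        order = list(range(n_samples))
        rng.shuffle(order)
        folds = [order[i::k] for i in range(k)]
        for f in range(k):
            train = [i for j, fold in enumerate(folds) if j != f for i in fold]
            yield train, folds[f]


def majority_classifier(train_labels):
    """Placeholder model: always predicts the most frequent training label."""
    winner = max(set(train_labels), key=train_labels.count)
    return lambda features: winner


# Group sizes as in the study: 116 patients with endometriosis, 94 controls.
labels = ["case"] * 116 + ["control"] * 94

scores = []
for train, test in repeated_kfold(len(labels), k=4, repeats=10, seed=1):
    model = majority_classifier([labels[i] for i in train])
    predictions = [model(None) for _ in test]
    scores.append(balanced_accuracy([labels[i] for i in test], predictions))

mean_score = sum(scores) / len(scores)
print(f"folds evaluated: {len(scores)}, mean balanced accuracy: {mean_score:.3f}")
```

A classifier that ignores the features would reach 55% plain accuracy here simply by predicting the larger group, whereas its balanced accuracy stays at 50%. This is why balanced accuracy is the more honest yardstick when group sizes differ.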

Results
Characteristics of the patient cohorts. Our case group comprised 116 patients with different types of endometriosis (Tables 1 and S1). Staging of endometriosis was done according to the revised American Society for Reproductive Medicine classification 3. Minimal to mild endometriosis was present in 72 patients (62%) and moderate to severe in 40 patients (35%), and for four (3%) patients the information regarding the extent of endometriosis was not known. Patients with endometriosis were 32 ± 6 years of age (range between 19 and 50 years) and had a body mass index (BMI) of 23 ± 5 kg/m² (range between 16 and 50 kg/m²). According to the menstrual phase, 59 patients (51%) were in their secretory and 49 (42%) in their proliferative phase (Table 1), six (5%) patients were on oral contraceptives at the time of the hospitalization, and for two (2%) patients this information was missing.
Patients with benign gynaecological conditions (i.e. different types of cysts and/or myoma), unexplained infertility and/or severe pain where laparoscopy excluded the presence of endometriosis totalled 94 controls. Controls were 32 ± 8 years of age (range between 18 and 50 years) and had a BMI of 24 ± 4 kg/m² (range between 18 and 42 kg/m²). In total 41 (44%) controls were in the secretory and the same number of controls were in the proliferative phase of their menstrual cycle (Table 1), while four (4%) controls were taking oral contraceptives at the time of the surgery, and for eight (8%) controls the information was missing or the phase of the menstrual cycle could not be determined.
Three months prior to surgery the vast majority of our study participants were not on hormonal therapy: only 8.5% of controls and 10.3% of endometriosis patients used hormonal therapy (mainly progesterone and progestins), and an additional 11.7% of controls and 12.9% of patients with endometriosis were on oral contraception (Table 1). A week before surgery 54 patients with endometriosis (47%) and 39 controls (42%) were taking medications, mostly analgesics, anti-inflammatory and anti-rheumatic products and psychoanaleptics. More than half of the patients with endometriosis (59%) and less than half of the controls (48%) were non-smokers (Table 1). Sport or recreation two days before surgery was reported for 39 patients with endometriosis (34%) and 19 (20%) controls.
The two study groups did not differ in age, menstrual phase, use of hormonal therapy and oral contraceptives three months prior to surgery, use of other medications a week before surgery, and smoking status. However, they differed in BMI distribution (P < 0.05), frequency of dysmenorrhea (P < 0.01), intensity of dysmenorrhea (P < 0.05) and in presence of additional pathologies/conditions such as fallopian tube related pathologies (P < 0.01), cysts (P < 0.01) and inflammation related conditions (P < 0.05). Most of the study participants (59%) were of Slovene or Austrian origin and all of the participants were of European descent. This clinical information is summarized in Tables 1 and S1.

Levels of cytokines in patients with endometriosis and in controls.
In all 210 plasma samples the concentrations of all 40 different cytokines were measured (Table 2). Univariate statistical analysis revealed no statistically significant differences in cytokine levels between all patients and controls. We also compared plasma concentrations of cytokines from patients with different types of endometriosis with those of controls, and identified eleven potential biomarkers for a specific type of endometriosis. In total we identified seven potential biomarkers for peritoneal endometriosis (i.e. CCL21, CCL11, CCL26, CX3CL1, CCL1, IL-6, and CCL3), two for the presence of peritoneal and ovarian endometriosis (i.e. CXCL11, CXCL12), one for peritoneal and deep infiltrating endometriosis (i.e. IFN-γ) and two for all three types of endometriosis (i.e. CCL15, CXCL12). The most differential proteins for peritoneal endometriosis were CCL1, CCL3 and CCL21 (Fig. 2), but after correction for multiple testing the boundary of statistical significance was set at P < 0.001 and the differences in the concentrations of these proteins were not statistically significant.
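As a rough check on the corrected significance boundary, the Bonferroni threshold is simply the nominal level divided by the number of tests. The sketch below evaluates it for 40 tests (all measured cytokines) and for 38 tests (after exclusion of GM-CSF and CXCL5). Which count was actually used is an assumption, as is the example P value; the text only reports the boundary as P < 0.001.

```python
ALPHA = 0.05  # nominal significance level


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under Bonferroni's correction."""
    return alpha / n_tests


def is_significant(p_value, alpha, n_tests):
    """True if a raw P value survives Bonferroni's correction."""
    return p_value < bonferroni_threshold(alpha, n_tests)


for n_tests in (40, 38):  # candidate numbers of tests (assumption)
    print(n_tests, round(bonferroni_threshold(ALPHA, n_tests), 5))

# Hypothetical raw P value: nominally significant, but not after correction.
example_p = 0.01
print(is_significant(example_p, ALPHA, 40))
```

Both candidate counts give a per-test threshold close to 0.001, which is consistent with the boundary stated in the text.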

Analysis of cytokines by different machine learning approaches did not allow separation between cases and controls. As more biomarkers potentially increase the reliability of a diagnostic test, we decided to use machine learning to evaluate whether a panel of proteins, with or without incorporation of metadata, can differentiate between our two phenotypes. Results of the PCA showed that there was no meaningful separation between patients with endometriosis and the controls based on the measured plasma levels of cytokines (Fig. 3). The highest average classification performance was achieved by the random forest algorithm (balanced accuracy = 59%, see Fig. 4a), with the signal of the six most influential features (i.e. proteins) illustrated as boxplots in Fig. 4. The obtained accuracy was not sufficiently different from random chance. Next, we trained the random forest model on different numbers of protein features to test the hypothesis that a model trained on fewer proteins would generate better diagnostic characteristics (i.e. higher sensitivity, specificity and AUC) than a model using the whole panel of proteins and protein ratios at once. Results showed that a combination of three proteins would generate the highest combination of the selected diagnostic characteristics, with a sensitivity of 40%, specificity of 65% and an AUC of 0.61 (Fig. 5), which, however, is still far from being acceptable for diagnostics.
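To put the reported sensitivity and specificity into perspective, the short sketch below derives two summary measures from them: balanced accuracy and Youden's J. The confusion-matrix counts are hypothetical; they were chosen only so that they approximate the reported 40% and 65% in a cohort of 116 cases and 94 controls, and they are not taken from the study.

```python
def sensitivity(tp, fn):
    """Fraction of true cases that the test flags as positive."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Fraction of true controls that the test flags as negative."""
    return tn / (tn + fp)


def balanced_accuracy(sens, spec):
    """Average of sensitivity and specificity; 0.5 is chance level."""
    return (sens + spec) / 2.0


def youden_j(sens, spec):
    """Youden's J statistic; 0 means no better than chance, 1 is perfect."""
    return sens + spec - 1.0


# Values reported for the three-protein random forest model.
reported_sens, reported_spec = 0.40, 0.65
print(round(balanced_accuracy(reported_sens, reported_spec), 3))
print(round(youden_j(reported_sens, reported_spec), 3))

# Hypothetical counts approximating those values (116 cases, 94 controls).
tp, fn, tn, fp = 46, 70, 61, 33
print(round(sensitivity(tp, fn), 3), round(specificity(tn, fp), 3))
```

With a sensitivity of 40% and a specificity of 65%, the balanced accuracy is 52.5% and Youden's J is 0.05, which is very close to what a coin flip would achieve.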
We then added metadata features (included in Tables 1 and S1) along with protein levels to the training data, but this did not improve the overall fit of the models and we could still not see a clear separation between patients with endometriosis and controls (Fig. 6). Performing separate analyses on individual types of endometriosis with the inclusion of metadata revealed no significant differences (Fig. 7). Comparing all four stages (minimal, mild, moderate and severe) of endometriosis, as well as comparing minimal/mild and moderate/severe endometriosis with controls, did not yield significantly different features, except for the TNFα/CCL27 protein ratio, which was consistently reported by the random forest algorithm as the most valuable feature for separating patients with minimal/mild endometriosis from controls. However, despite the high importance of TNFα/CCL27, the accuracy achieved by the algorithm remained modest (57.4%). We also did not observe any significant differences between patients and controls when they were divided with respect to medication intake (use of any kind of medication, use of nonsteroidal anti-inflammatory drugs, or use of any type of hormonal medication). Other personal and clinical data also showed no differences in the plasma profiles of the patients and controls.

Discussion
Endometriosis is a common benign gynaecological condition that is characterized by the presence of endometrial lesions in the peritoneal cavity and is thus also described as a chronic inflammatory disease, for which diagnostic biomarkers applicable for clinical use have not yet been identified 16. Cytokines have already been investigated as potential biomarkers of endometriosis in the blood and/or peritoneal fluid. In addition to individual inflammatory proteins, panels of cytokines in conjunction with other proteins have also been studied, although the results of these studies varied 16,19,26. The majority of these studies investigated IL-6 and IFN-γ, which are involved in the activation and differentiation of inflammatory cells and also in the pathogenesis of endometriosis [27][28][29][30].
In the current study there were no statistically significant differences in cytokine plasma levels between patients with different types of endometriosis and controls. Although we found potential importance of the ratio TNFα/CCL27 for separating patients with minimal/mild endometriosis from the control group of patients, the accuracy achieved by the algorithm was insufficient. Univariate analysis revealed the lowest P values when we compared concentrations of the cytokines/chemokines CCL1, CCL3 and CCL21 in patients with peritoneal endometriosis and controls. We found no published studies evaluating blood levels of these cytokines/chemokines in patients with endometriosis. Borrelli et al. evaluated the levels of CCL21 in the peritoneal fluid of 36 patients with endometriosis and 27 controls and reported no significant differences 31. Other studies focused on the expression of the corresponding genes. Unchanged or changed expression (i.e. increased/decreased) in the endometrium from patients with endometriosis was reported for CCL21, with no explanation of how these changes might contribute to the aetiology or pathogenesis of endometriosis [32][33][34]. The expression and/or role of CCL1 and its receptor (i.e. CCR8) in endometrial tissue was studied by Shi et al. 35,36, who reported higher expression and a potential role in the pathogenesis of endometriosis. Although these three cytokines/chemokines identified in our study have so far not been sufficiently investigated in endometriosis, our experimental data link changes in concentrations of these proteins to peritoneal endometriosis, which implies that CCL1, CCL3 and CCL21 might have a role in the aetiology and pathogenesis of this type of endometriosis.
Studies that evaluated blood concentrations of cytokines and chemokines as potential biomarkers of endometriosis are scarce, as most studies so far evaluated the diagnostic potential of inflammatory proteins in peritoneal fluid and/or tissue samples (i.e. eutopic/ectopic endometrium) of patients with endometriosis. Kalu   39 . Among studies evaluating cytokines as blood biomarkers, Rocha et al. 40 followed the criteria for case-control studies in which case and control groups originate from the same cohort 41. In their study all of the patients presented with at least one endometriosis-like symptom (i.e. chronic pelvic pain and/or infertility and/or potential presence of endometrioma based on the ultrasound). After the laparoscopic operation and histological evaluation patients were divided into two groups: patients with endometriosis (n = 44) and a control group of patients (n = 31). Concentrations of seven different cytokines (i.e. IL-2, IL-4, IL-6, IL-10, CCL2, CXCL10, and CCL11) were simultaneously determined using a cytometric bead array, and the results showed that, based on the panel of these cytokines and clinical data, it was not possible to predict the presence of endometriosis in a group of symptomatic patients 40.

Figure: Nested cross-validation results for three machine learning methods on patients with ovarian (A,B), peritoneal (C,D) and deep infiltrating endometriosis (E,F) and control samples. 5-fold CV was repeated 10 times without any parameter learning or sharing allowed between the folds to ensure generalisation and robustness of the obtained models. Results suggest that machine learning models cannot differentiate between different types of endometriosis and controls with an accuracy that exceeds that of random chance.

(2019) 9:16738 | https://doi.org/10.1038/s41598-019-52899-8 www.nature.com/scientificreports
Recently, Aalamat et al. published a systematic review on the use of multiplex technology for identification of potential novel biomarkers of endometriosis among inflammation-associated proteins 42. They reported that the majority of studies that adopted multiplex technology evaluated potential novel biomarkers of endometriosis in peritoneal fluid 20,31,[43][44][45][46][47]. Although peritoneal fluid is collected by a semi-invasive method, it is the most representative sample and closely reflects the inflammatory changes that are associated with the pathogenesis of endometriosis 48. Based on the literature and our published studies 20, we conclude that cytokines and chemokines in peritoneal fluid have a far greater diagnostic potential for endometriosis than their plasma or serum concentrations. The results of our study are also in concordance with the study conducted by Lee et al., which evaluated the diagnostic potential of pro-inflammatory oxylipins and cytokines in serum samples of 103 women undergoing laparoscopy. Results of their study showed limited diagnostic potential of the measured circulating biomarkers for the diagnosis of endometriosis, warranting additional studies to evaluate the exact role of systemic inflammation in endometriosis 49.
We evaluated a broad spectrum of inflammatory proteins in plasma samples of patients with different types of endometriosis and of controls with several different multifactorial benign gynaecological conditions, both within a well-defined cohort. We followed detailed protocols, obtained a large set of clinical data, included different nationalities, and combined high-throughput methodology with advanced statistical approaches. Nevertheless, our results were consistent with several previous studies indicating limited diagnostic potential of circulating cytokines for the diagnosis of endometriosis. Having said this, the presented results need to be considered carefully as they might be subject to various sources of bias and noise. Self-reporting of metadata by patients, undetected technical batch effects and unpredictable statistical fluctuations are all potential sources of bias and thus limiting factors of the current study.

Conclusions
In this study we evaluated the diagnostic potential of 40 different cytokines in plasma samples from 210 patients with different types of endometriosis and control group of patients from two medical centres. Although several studies have associated inflammation with the development and progression of endometriosis, and inflammatory cytokines in endometrial tissue, peritoneal fluid and blood have been evaluated as potential biomarkers for endometriosis, the published results are inconsistent and identified no clinically useful biomarker to date. Based on the evaluated plasma concentrations of these 40 different cytokines, clinical data and appropriate statistical analysis we were unable to develop a diagnostic algorithm that would separate patients with endometriosis from the control group of patients with sufficient sensitivity and specificity. For development of a model with potential clinical applicability, which would enable diagnosis of patients with endometriosis with sufficient accuracy, further approaches of targeted and non-targeted "omics" technologies will be needed in conjunction with appropriate statistical/bioinformatics methods. These have to be followed by independent validation studies to confirm the results obtained in a research setting.

Data availability
All data are fully available without restriction.