Diagnostic accuracy of the different hormonal tests used for the diagnosis of autonomous cortisol secretion

We aimed to evaluate the diagnostic accuracy of the different tests commonly used in the evaluation of adrenal incidentalomas (AIs) for the identification of autonomous cortisol secretion (ACS) and comorbidities potentially related to ACS. In a retrospective study of patients with AIs ≥ 1 cm, we evaluated the diagnostic reliability and validity of the dexamethasone suppression test (DST), urinary free cortisol (UFC), ACTH, late-night salivary cortisol (LNSC), and dehydroepiandrosterone-sulphate (DHEAS) for the diagnosis of comorbidities potentially related to ACS. Diagnostic indexes were also calculated for UFC, ACTH, LNSC, and DHEAS considering the DST as the gold standard test for the diagnosis of ACS, using three different post-DST cortisol thresholds (138 nmol/L, 50 nmol/L and 83 nmol/L). We included 197 patients with AIs for whom the results of all five tests were available. At diagnosis, 85.9% of patients with one or more AIs had at least one comorbidity potentially related to ACS, whereas 9.6% had ACS as defined by post-DST cortisol > 138 nmol/L. The reliability of UFC, ACTH, LNSC, and DHEAS for the diagnosis of ACS was low (kappa index < 0.30). Of them, LNSC reached the highest diagnostic accuracy for ACS identification (AUC = 0.696 [95% CI 0.626–0.759]). The diagnostic performance of these tests for comorbidities potentially related to ACS was poor; of them, the DST was the most accurate (AUC = 0.661 [95% CI 0.546–0.778]) and had the strongest association with these comorbidities (OR 2.6, P = 0.045). Patients presenting with increased values of both DST and LNSC had the strongest association with hypertension (OR 7.1, P = 0.002) and with cardiovascular events (OR 3.6, P = 0.041). In conclusion, LNSC was the test showing the highest diagnostic accuracy for the identification of ACS when a positive DST was used as the gold standard for its diagnosis. The DST showed the strongest association with comorbidities potentially related to ACS.
The definition of ACS based on the combination of elevated DST and LNSC levels improved the identification of patients with increased cardiometabolic risk.

www.nature.com/scientificreports/ of the DST in this setting. On the other hand, under routine clinical practice conditions, the diagnostic performance of the DST and complementary tests for the identification of comorbidities potentially related to ACS seems to be poor. We hypothesized that the identification of cardiometabolic morbidities potentially related to ACS in patients with AIs could improve with the use of a panel of tests commonly used to characterize adrenal function, either individually or in combination. Moreover, taking an increased DST result as the gold standard for ACS definition, in line with current European clinical guidelines 2, we evaluated the reliability and validity for the diagnosis of ACS of four tests routinely used for the evaluation of adrenal function: plasma ACTH, age- and sex-adjusted serum dehydroepiandrosterone sulphate (DHEA-S) levels, UFC and LNSC.

Methods
Patients. We retrospectively queried the electronic registry of the hormone laboratory of Hospital Universitario Ramón y Cajal to identify all patients in whom a DST had been performed between 2013 and 2020. We reviewed their medical records and selected those patients aged 18 to 90 years who presented with incidentally discovered unilateral and/or bilateral AIs of at least 10 mm in the largest diameter. We excluded patients with: (i) a known diagnosis of hereditary syndromes associated with adrenal tumours; (ii) chronic treatment with glucocorticoids or drugs that might affect dexamethasone metabolism; (iii) treatment with oral hormonal contraceptives during the 6 weeks preceding the test; (iv) AIs identified during the extension study of an extra-adrenal cancer; (v) overt syndromes of adrenal hormone excess; (vi) adrenocortical carcinoma; (vii) adrenal metastasis from extra-adrenal tumours; and (viii) missing information in the results of one or more of the five tests evaluated here (Fig. 1). We analysed patients' data obtained during their initial evaluation and at their last available follow-up visit.
Clinical evaluation. Demographic information such as age and sex; presence of comorbidities potentially related to ACS (hypertension, type 2 diabetes, obesity, dyslipidaemia, cerebrovascular and cardiovascular disease); body mass index (BMI); and systolic and diastolic blood pressure were extracted from medical records. Obesity was defined as a BMI equal to or greater than 30 kg/m². Hypertension was defined as systolic blood pressure equal to or greater than 140 mmHg and/or diastolic blood pressure equal to or greater than 90 mmHg, or treatment with blood pressure lowering medications. Diagnosis of type 2 diabetes and dyslipidaemia was based on current standards 9,10. Cardiovascular disease was defined as ischemic heart disease or heart failure, and cerebrovascular disease as transient ischemic attack or acute stroke.
The management decision regarding AIs (either observation or surgery) after the last follow-up visit was also registered.
Biochemical and hormonal evaluation. A routine biochemical profile after an 8 h overnight fast was obtained at diagnosis and at the last available follow-up visit. Biochemical profiles included fasting plasma glucose, total cholesterol, LDL-cholesterol, HDL-cholesterol, triglycerides and HbA1c (the latter was available in only 55 cases). Hormonal studies at the initial evaluation included urinary catecholamines and/or urinary metanephrines, DST, UFC, ACTH, DHEA-S and LNSC.
DST, UFC, ACTH, age- and sex-adjusted DHEA-S, and LNSC were analysed as continuous and categorical variables. When the DST was taken as the gold standard for calculating the reliability and validity of the other tests of adrenal function for ACS diagnosis, we evaluated not only the post-DST cortisol cut-off of 138 nmol/L (5.0 µg/dL) 2, but also the 50 nmol/L (1.8 µg/dL) and 83 nmol/L (3.0 µg/dL) cut-off values. For the evaluation of the diagnostic accuracy of the DST for the identification of comorbidities potentially related to ACS, the > 50 nmol/L threshold was employed, based on the results of the ROC curves and on previous studies that found this cut-off to be the most sensitive for this purpose [11][12][13][14][15]. UFC levels above the upper limit of the reference range in our laboratory were considered elevated. In addition, patients with UFC levels within the reference range were classified into two groups (normal-low or normal-high UFC levels) using 1930 nmol/24 h (70 µg/24 h) as the threshold, because this value was associated with the highest specificity for the diagnosis of ACS according to the results of the ROC curve. Patients with UFC levels two-fold above the reference range were diagnosed with overt Cushing's syndrome and excluded from the study (Fig. 1). ACTH levels below 2 pmol/L (10 pg/mL) were considered low. LNSC levels above the upper limit of the reference range in our laboratory were considered elevated. DHEA-S levels were considered to be elevated or decreased according to age- and sex-specific reference ranges in our laboratory.
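The three post-DST cortisol cut-offs and their unit equivalents can be illustrated with a short sketch. It is not part of the study's analysis (which was run in STATA); the function names are hypothetical, and the conversion factor of about 27.59 nmol/L per µg/dL is the standard one for cortisol rather than a value stated in the text.

```python
# Illustrative sketch only: names are hypothetical, not taken from the study.
# Standard conversion for cortisol (molar mass about 362.46 g/mol):
# 1 µg/dL corresponds to roughly 27.59 nmol/L.
NMOL_PER_L_PER_UG_PER_DL = 27.59

# Post-DST cortisol cut-offs evaluated in the study, in nmol/L.
DST_THRESHOLDS_NMOL_L = (50.0, 83.0, 138.0)


def nmol_l_to_ug_dl(cortisol_nmol_l: float) -> float:
    """Convert a serum cortisol concentration from nmol/L to µg/dL."""
    return cortisol_nmol_l / NMOL_PER_L_PER_UG_PER_DL


def dst_positive(cortisol_nmol_l: float, threshold_nmol_l: float) -> bool:
    """Return True when post-DST cortisol is strictly above the cut-off."""
    return cortisol_nmol_l > threshold_nmol_l


def classify_dst(cortisol_nmol_l: float) -> dict:
    """Map each study cut-off to whether the result counts as positive."""
    return {t: dst_positive(cortisol_nmol_l, t) for t in DST_THRESHOLDS_NMOL_L}
```

For example, a post-DST cortisol of 60 nmol/L is positive at the 50 nmol/L cut-off but negative at the 83 and 138 nmol/L cut-offs, which is why the choice of threshold changes who is labelled as having ACS.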
Imaging studies. At diagnosis, abdominal computed tomography or magnetic resonance imaging was obtained in all patients with AIs. Tumour size (largest diameter), uni- or bilaterality, presence of necrosis, calcification and atypical characteristics, lipid content and radiodensity measured in Hounsfield units (HU) were registered. In bilateral AIs, the recorded tumour size was that of the largest AI. The adrenal tumour was classified as lipid-rich when attenuation was low (< 10 HU) in a CT performed without contrast administration or when the washout in a CT with contrast was rapid (> 60% absolute washout or > 40% relative washout) 4. Computed tomography was repeated in 99 patients and magnetic resonance imaging was repeated in 80 patients during follow-up.
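The lipid-content rule above can be sketched as follows. The text gives only the cut-offs; the washout formulas used here are the conventional CT definitions and are an assumption about how the percentages were obtained. Function and parameter names are hypothetical, and the sketch assumes the enhanced attenuation is positive and differs from the unenhanced value.

```python
from typing import Optional


def absolute_washout(unenhanced_hu: float, enhanced_hu: float, delayed_hu: float) -> float:
    """Absolute percentage washout: (enhanced - delayed) / (enhanced - unenhanced) * 100."""
    return 100.0 * (enhanced_hu - delayed_hu) / (enhanced_hu - unenhanced_hu)


def relative_washout(enhanced_hu: float, delayed_hu: float) -> float:
    """Relative percentage washout: (enhanced - delayed) / enhanced * 100."""
    return 100.0 * (enhanced_hu - delayed_hu) / enhanced_hu


def is_lipid_rich(unenhanced_hu: Optional[float] = None,
                  enhanced_hu: Optional[float] = None,
                  delayed_hu: Optional[float] = None) -> bool:
    """Apply the cut-offs quoted in the text: < 10 HU, > 60% absolute, > 40% relative."""
    # Low attenuation on unenhanced CT is sufficient on its own.
    if unenhanced_hu is not None and unenhanced_hu < 10.0:
        return True
    # Otherwise look for rapid washout on contrast-enhanced CT.
    if enhanced_hu is not None and delayed_hu is not None:
        if relative_washout(enhanced_hu, delayed_hu) > 40.0:
            return True
        if unenhanced_hu is not None and enhanced_hu != unenhanced_hu:
            if absolute_washout(unenhanced_hu, enhanced_hu, delayed_hu) > 60.0:
                return True
    return False
```

With hypothetical attenuation values of 20 HU (unenhanced), 80 HU (enhanced) and 40 HU (delayed), the absolute washout is about 66.7% and the relative washout is 50%, so the lesion would be classified as lipid-rich under either washout criterion.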
Statistical analysis. We checked continuous variables for normality using the Shapiro-Wilk test, and for homogeneity of variances using Levene's test. Categorical variables were expressed as counts and percentages, whereas continuous variables were expressed as mean ± standard deviation or median and interquartile range (IQR) as appropriate. Odds ratios (with 95% confidence intervals) and mean differences were calculated as association measures using logistic regression models or linear regression β coefficients. For normally distributed variables, we used Student's t test to compare differences between two groups. The chi-square test was used for the comparison of categorical variables between independent groups. Cox regression analysis was used to estimate hazard ratios during follow-up. Reliability was evaluated with the kappa index and the specific positive and negative agreement indexes. Nonparametric receiver operating characteristic (ROC) curve analysis was used to determine the accuracy of the different hormonal tests, either individually or in combination, for the diagnosis of ACS and of comorbidities potentially related to ACS. In all cases, a two-tailed P value < 0.05 was considered statistically significant. All statistical analyses were performed using STATA 15.
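As a reminder of what the reliability measures capture, the sketch below computes Cohen's kappa and the specific positive and negative agreement indexes from a 2×2 table of an index test against the DST. It is a minimal illustration using the textbook formulas, not the STATA routine used in the study; the function name and the cell counts in the example are hypothetical.

```python
def agreement_indexes(a: int, b: int, c: int, d: int) -> dict:
    """Agreement between an index test and a reference test from a 2x2 table.

    a: both tests positive
    b: index test positive, reference negative
    c: index test negative, reference positive
    d: both tests negative
    """
    n = a + b + c + d
    observed = (a + d) / n
    # Agreement expected by chance, from the marginal totals.
    expected = ((a + b) * (a + c) + (c + d) * (b + d)) / (n * n)
    kappa = (observed - expected) / (1.0 - expected)
    # Specific agreement: agreement restricted to positive or negative ratings.
    positive_agreement = 2 * a / (2 * a + b + c)
    negative_agreement = 2 * d / (2 * d + b + c)
    return {
        "kappa": kappa,
        "positive_agreement": positive_agreement,
        "negative_agreement": negative_agreement,
    }


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
example = agreement_indexes(a=20, b=5, c=10, d=15)
```

When positive results are rare, negative agreement can be high even though kappa is low, because most of the agreement comes from the large number of concordant negatives and kappa discounts the agreement expected by chance.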

Results
Cardiometabolic profile at diagnosis and during follow-up. Following the inclusion and exclusion criteria, 197 of the 709 patients with AIs consecutively evaluated between 2013 and 2020 at our centre were included in the analysis. No statistically significant differences were detected between the patients with AIs included in and excluded from the study, with the exception of higher post-DST cortisol, lower ACTH levels and a larger tumour size in the former (Supplementary Material Table S1). Baseline characteristics of the cohort included in the present study are summarized in Table 2. At diagnosis, 19 patients (9.6%) had ACS (as defined by a post-DST cortisol > 138 nmol/L) and 169 patients (85.9%) presented with one or more comorbidities potentially related to ACS. The prevalence of obesity was 31%, yet no statistically significant differences in post-DST cortisol levels were found between patients with and without obesity (59 ± 49 nmol/L vs 71 ± 82 nmol/L, respectively, P = 0.316). Four patients presenting with non-functioning AIs > 4 cm underwent adrenalectomy, and active surveillance was carried out in the remainder. After a median follow-up of 30.6 (IQR = 2.0-114.7) months, 6 out of 120 patients with non-functioning AIs developed ACS and 23 patients developed one or more new comorbidities: 20 (23.0%) developed dyslipidaemia; 6 (8.8%) developed hypertension; 9 (11.5%) became obese; 6 (4.5%) were diagnosed with type 2 diabetes; and 5 (3.2%) suffered a cardiovascular event. No cerebrovascular events were registered during follow-up.

Reliability and accuracy of LNSC, UFC, ACTH and DHEAS for the diagnosis of ACS. The degree of agreement (reliability) of LNSC, UFC, ACTH and DHEA-S for the diagnosis of ACS was low, independently of the DST threshold used for the definition of ACS, with kappa indexes below 0.3 for all tests. However, the specific negative agreement was high, around 80-90%. Regarding their validity, the highest specificity was reached when the ACS definition was based on the 138 nmol/L (5.0 µg/dL) threshold. Nevertheless, all tests had poor sensitivity for the diagnosis of ACS independently of the DST threshold employed (Table 3).

Association of the individual ACS diagnostic tests' results with comorbidities potentially related to ACS. Seventy-six (38.6%) patients showed a DST serum cortisol level > 50 nmol/L (1.8 µg/dL) at diagnosis. These patients had a two-fold higher risk of comorbidities potentially related to ACS than those with DST ≤ 50 nmol/L. The prevalence of dyslipidaemia and hypertension in patients with DST > 50 nmol/L was 1.8 and 2.5 times higher than in patients with DST ≤ 50 nmol/L, respectively (Table 4). However, the diagnostic performance of the DST to predict the presence of one or more comorbidities potentially related to ACS, either individually or collectively, was poor, because all areas under the ROC curve were below 0.67 (Fig. 4). UFC was > 3862 nmol/24 h in 2 (1.0%) patients, whereas another 22 (11.2%) subjects showed normal-high (1931-3862 nmol/24 h) UFC concentrations. The prevalence of hypertension was three times higher in patients with normal-high UFC than in patients with normal-low UFC (< 1931 nmol/24 h) (Table 4). LNSC was above the reference range in 30 (15.2%) patients, who had a higher prevalence of hypertension and lower HDL-c levels when compared with patients showing LNSC levels within the reference range (Table 4). Basal ACTH levels were < 2 pmol/L in 68 (34.5%) patients and DHEAS levels were below the age- and sex-adjusted reference ranges in 48 (24.4%) patients. No differences were found in the prevalence of ACS-related comorbidities according to ACTH or DHEAS levels. The AUCs for the diagnosis of ACS-related comorbidities were poor for UFC, LNSC, ACTH and DHEAS levels, and did not even reach that of the DST ROC curve (Fig. 4). Even when the five tests (including the DST) were used in combination for the prediction of comorbidities potentially related to ACS, the AUC was modest (0.70 [0.58-0.82]).
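To make the AUC values quoted above easier to interpret, the following sketch computes a nonparametric AUC as the probability that a randomly chosen patient with the condition has a higher test value than a randomly chosen patient without it, counting ties as one half. This is the standard rank-based definition, offered as an illustration rather than the exact STATA procedure used; the function name and the example values are hypothetical.

```python
from typing import Sequence


def roc_auc(values_with_condition: Sequence[float],
            values_without_condition: Sequence[float]) -> float:
    """Nonparametric AUC: fraction of (with, without) pairs ranked correctly."""
    score = 0.0
    for x in values_with_condition:
        for y in values_without_condition:
            if x > y:
                score += 1.0   # pair ranked correctly
            elif x == y:
                score += 0.5   # tie counts as half
    return score / (len(values_with_condition) * len(values_without_condition))


# Hypothetical test values: 7.5 of the 9 possible pairs are ranked correctly.
auc = roc_auc([3.0, 4.0, 5.0], [1.0, 2.0, 4.0])
```

Under this reading, an AUC of 0.5 means the test ranks patients no better than chance and 1.0 means perfect separation, so values between 0.66 and 0.70 indicate only modest discrimination.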
When we evaluated the combined use of the tests for the diagnosis of comorbidities potentially related to ACS, the best association was that of the combination of a DST > 50 nmol/L and an LNSC > 149 nmol/L, which was present in 19 patients in our cohort. These patients had increased risks of hypertension (OR 7.1, 95% CI 1.6-31.6) and cardiovascular events (OR 3.6, 95% CI 1.2-11.3) (Table 5).
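The wide confidence intervals around these odds ratios reflect the small number of patients in the combined-positive group. The sketch below shows how an unadjusted odds ratio and its Wald 95% confidence interval follow from a 2×2 table. The study derived its estimates from logistic regression models, so this is only an approximation of the idea; the function name and the counts are hypothetical and are not the study's data.

```python
import math


def odds_ratio_ci(a: int, b: int, c: int, d: int, z: float = 1.96):
    """Unadjusted odds ratio with a Wald confidence interval on the log scale.

    a: exposed with the outcome       b: exposed without the outcome
    c: unexposed with the outcome     d: unexposed without the outcome
    All four cells must be non-zero.
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR): small cells inflate it and widen the interval.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts for illustration only.
or_value, ci_lower, ci_upper = odds_ratio_ci(a=10, b=5, c=20, d=40)
```

Because the standard error depends on the reciprocal of each cell count, a subgroup of only 19 patients will tend to give an interval that is wide even when the point estimate is large.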

Discussion
Our study confirms that, when used as single tests, plasma ACTH, LNSC, UFC and DHEA-S have poor sensitivity for the diagnosis of ACS. The combination of the four tests, however, improved diagnostic accuracy for ACS, reaching an AUC of 0.73 in the ROC curve. On the other hand, the diagnostic accuracy of the DST for the prediction of comorbidities potentially related to ACS is low, although the other tests routinely used for the study of AIs performed even worse. The association of a positive DST with hypertension and cardiovascular events seems to increase when combined with increased LNSC levels, whereas the addition of ACTH, DHEA-S or UFC did not improve the strength of this association.
Several studies found that patients with AIs and elevated post-DST cortisol concentrations had worse cardiometabolic profiles and increased mortality compared with patients reaching adequate cortisol suppression after this test [11][12][13]17. It is currently debated which DST threshold should be used for the diagnosis of ACS. Several studies suggested that 50 nmol/L is the most sensitive threshold to identify patients with AIs and increased cardiometabolic risk [11][12][13][14][15]. Along these lines, Morelli et al. 14 demonstrated that, in patients with AIs, post-DST cortisol levels increased with the number of chronic complications. In another study 15 that used artificial neural networks, the optimal cut-off of post-DST cortisol levels for detecting patients with increased cardiovascular events was 50 nmol/L (accuracy 67.3%; AUC 0.673). Furthermore, in another study 18 an increased risk of cardiovascular events was observed with post-DST cortisol values above 41 nmol/L (1.5 µg/dL). Our study found that, although there were some associations between DST results and cardiometabolic comorbidities, the DST had a poor diagnostic performance for the presence of these comorbidities. This finding is in agreement with earlier studies 14,15,18, which support the view that post-DST cortisol is not accurate enough to predict either the occurrence of post-surgical hypocortisolism or the improvement of surgical complications in patients with AIs. The poor performance of the DST and other tests of adrenal function in the prediction of comorbidities potentially related to ACS might be explained by the multifactorial origin of these prevalent cardiometabolic disorders. Hence, ACS as a single factor is unlikely to fully predict them, especially when some factors known to increase cardiometabolic risk, such as older age 19 and subclinical co-secretion of other hormones like aldosterone 20, are also associated with the presence of AIs.
Other factors such as obesity, which can promote hyperinsulinism and thus the development of AIs, could be indirectly associated with cortisol production as well 21. However, until better or more reliable markers of ACS become available, the DST with a serum cortisol threshold of > 1.8 µg/dL seems to be the most sensitive single test to identify ACS patients at risk of cardiometabolic comorbidities. Moreover, in the presence of an elevated post-DST cortisol concentration, an elevated LNSC identifies patients at even higher cardiometabolic risk.
The performance of UFC, DHEA-S, ACTH and LNSC levels for the diagnosis of ACS was poor and, for the identification of comorbidities potentially related to ACS, was even poorer than that of the DST in our study. This finding supports the recommendation of most professional societies to use the DST for the evaluation of ACS in AIs 4,7,8. At present, UFC is not recommended for the diagnosis of ACS, given that fewer than 20% of patients with ACS present with elevated UFC levels 5,22. The role of DHEA-S in the diagnosis of ACS is currently controversial [23][24][25][26][27]. In our study, DHEA-S as a single test or in combination with the DST did not achieve a better diagnostic performance for comorbidities potentially related to ACS than the DST alone. Previous studies found basal ACTH levels > 2 pmol/L in up to 50% of patients with ACS and < 2 pmol/L in as many as 20% of patients with normal cortisol metabolism, also suggesting a poor diagnostic performance for ACS 28. We found basal ACTH levels to have a weak association with the results of the DST, but no association with cardiometabolic comorbidities. LNSC, an easy, stress-free and cost-effective alternative to late-night serum cortisol, also showed limited utility for the diagnosis of ACS, as suggested by previous studies 29. Of the tests of adrenal function studied here, LNSC showed the greatest reliability for the diagnosis of ACS as defined by the DST, and patients with elevated LNSC and post-DST cortisol levels were those with the worst cardiometabolic profiles. Moreover, we found that the combination of basal plasma ACTH, UFC, LNSC and DHEA-S significantly increased the diagnostic accuracy for ACS compared with their use as single tests, reaching an AUC of 0.73 in the ROC curve.
This is in line with the recommendation of most guidelines and experts in this field of using the combination of several hormonal parameters to evaluate the presence of ACS 2,4-10 .
Our study, however, is not free of limitations, starting with its retrospective design. Because we only included patients in whom all the diagnostic tests had been obtained, and that decision was made on clinical grounds by their physicians, there is a possibility of selection bias towards a subset of more complicated patients, as larger tumour size, higher DST results and lower ACTH levels were found in the included patients compared with the excluded patients. However, we included all consecutive patients fulfilling the inclusion criteria during the study period within a single institution, thus allowing for comparable laboratory results. We did not evaluate osteoporosis, which is a recognized comorbidity related to ACS, because it was inconsistently assessed in the medical records. Therefore, the association of the results of the different tests with osteoporosis could not be evaluated. The metabolism of dexamethasone varies widely among patients 30. Although we excluded patients with known factors associated with false positive results in the DST, such as treatment with oral hormone contraceptives or other drugs known to alter dexamethasone metabolism, alcoholism, and psychiatric illness, some of these conditions might not have been registered in the medical records, and dexamethasone levels were not routinely evaluated during the DST 31. Furthermore, other factors could also lead to false positive results in the DST 32. Added to this is the known variability between techniques and assay kits for cortisol measurement 33 and the intra-assay variability, which increases in the range of low cortisol levels. Furthermore, in our institution UFC and LNSC are measured by immunochemiluminescence, which is inferior to the liquid chromatography/tandem mass spectrometry assays recommended nowadays 34.
This limitation is supported by the results of a recent study 35 that demonstrated that, with the use of liquid chromatography/tandem mass spectrometry assays, low DHEA-S levels were associated with diabetes, an association that was lost when DHEA-S was measured by immunochemiluminescence. Future studies are needed to identify more reliable and accurate markers of cortisol autonomy. In this regard, urine metabolomics 34 and functional imaging studies such as adrenal iodomethylnorcholesterol scintigraphy hold promise.

Table 4. Baseline features and association of ACS-diagnostic tests with the diagnosis of comorbidities potentially related to ACS. Differences in quantitative variables are expressed as mean differences (d) between the ACS and NFAI groups; differences in qualitative variables are expressed as odds ratios (OR) with 95% confidence intervals (in brackets). ACS autonomous cortisol secretion, DST dexamethasone suppression test, DHEAS dehydroepiandrosterone sulphate, NFAI non-functioning adrenal incidentalomas, LNSC late-night salivary cortisol, UFC urinary-free cortisol.

Conclusion
LNSC is the test with the highest diagnostic accuracy for ACS identification when a positive DST is used as the gold standard for ACS diagnosis. Comorbidities potentially related to ACS cannot be predicted by any single test of adrenal function, possibly reflecting their multifactorial nature. In fact, the association of the tests evaluated here with comorbidities potentially related to ACS was poor. As a single test, the DST had the strongest association with comorbidities potentially related to ACS. Patients with elevated DST results and elevated LNSC levels had the highest cardiometabolic risk in our cohort.