Validation of the diagnostic efficacy of O-RADS in adnexal masses

The aim of this study was to validate the performance of the Ovarian-Adnexal Reporting and Data System (O-RADS) series of models proposed by the American College of Radiology (ACR) in the preoperative diagnosis of adnexal masses (AMs). Two experienced sonologists examined 218 patients with AMs and recorded their assessments after each examination. Pathological findings were used as the reference standard. Of the 218 lesions, 166 were benign and 52 were malignant. Based on the receiver operating characteristic (ROC) curve, we defined a malignant lesion as O-RADS > 3 (i.e., lesions in O-RADS categories 4 and 5 were considered malignant). The area under the curve (AUC) of O-RADS (v2022) was 0.970 (95% CI 0.938–0.988), which was not significantly different from that of O-RADS (v1) combined with the Simple Rules Risk (SRR) assessment model, which had the largest AUC of 0.976 (95% CI 0.946–0.992) (p = 0.1534), but was significantly higher than that of O-RADS (v1) (AUC = 0.959, p = 0.0133) and subjective assessment (AUC = 0.918, p = 0.0255). The O-RADS series models have good diagnostic performance for AMs. Of these, O-RADS (v2022) has higher accuracy and specificity than O-RADS (v1). The accuracy and specificity of O-RADS (v1), however, can be further improved when combined with SRR assessment.

In addition, the guideline also sets out management strategies for each risk category to standardize the clinical management of patients with different lesion categories.
From the introduction of O-RADS to its widespread use in the clinic, a large number of studies are still needed to validate its diagnostic efficacy. This study aims to validate the diagnostic efficacy of the O-RADS series models by comparing them with the subjective assessment of experienced sonologists. We also combined O-RADS (v1) with the SRR assessment model in the course of the study, intending to improve the diagnostic specificity of O-RADS (v1).

Materials and methods
Patients. This prospective study was approved by the ethics committee of our hospital (Peking Union Medical College Hospital). All experiments were performed in accordance with relevant guidelines and regulations, and all patients undergoing the examination signed informed consent. We conducted a prospective study of 425 patients with AMs seen at Peking Union Medical College Hospital between June 2021 and July 2022. The inclusion criterion was presentation with an AM (detected on imaging or clinical palpation). If a patient had multiple lesions at the same time, only the most suspicious lesion was selected. The exclusion criteria were as follows: (1) no surgical treatment after US, or no specific pathological type was provided; (2) failed image quality audit. We ultimately included 218 lesions from 218 patients. The flow chart of the study is shown in Fig. 1.
Ultrasound examinations. US examinations were performed using Nuewa R9 equipment (Myriad Medical, Shenzhen, China). All ultrasound examinations were performed by two of the authors (N.S., Y.Y.), both of whom are sonologists in the Gynecology Specialty Group. Both sonologists received training on the O-RADS lexicon and O-RADS guidelines prior to the start of the study.
Transvaginal US was the preferred modality, but we used transabdominal US if the patient was unable to undergo transvaginal US or if the lesion was too large. Of the 218 patients, two underwent transabdominal US only because they had no history of sexual intercourse, 58 underwent combined transabdominal and transvaginal US because of the large size of the lesion (transabdominal US was used to measure the overall extent of the lesion, and transvaginal US was used to observe the details of the lesion and its blood flow), and the remaining patients underwent transvaginal US only. During the examination, both sonologists were required to capture the following images of the lesion: (1) greyscale images of the largest transverse and longitudinal planes of the lesion, with and without measurement markers; (2) color Doppler images of the area with the most abundant blood flow; and (3) images of suspicious signs of the lesion, such as the presence of papillae and ascites. All images were reviewed by a sonologist with more than 15 years of experience in gynecology at our hospital. Images ultimately included in the study had to meet the following criteria: (1) the images were clear; (2) the images were acquired according to the above requirements.
Once the examination was completed, the two sonologists jointly gave a subjective assessment of the lesion based on their experience and jointly assigned the lesion an O-RADS (v1) category based on the O-RADS lexicon and guidelines. The above assessments were based on the characteristics of the lesions included in the study. Because O-RADS category 4 spans a wide range of malignancy risk (10-50%), similar to the inconclusive category in the IOTA SRs 10,11, category 4 lesions were further assessed using the SRR assessment model, and in this study a malignancy rate of 10% was used as the SRR cut-off value to distinguish benign from malignant lesions 3,12. For O-RADS category 4 lesions, the corresponding SRR malignancy rate was calculated, and lesions with a malignancy rate of less than 10% were downgraded to category 3. In addition, the two sonologists retrospectively analysed the images and gave a revised O-RADS classification based on the terminology recorded during the examination and O-RADS US (v2022). If the two sonologists disagreed during the assessment, the senior sonologist responsible for the review decided the final outcome.
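The downgrade rule described above can be illustrated with a minimal Python sketch. The function names and the representation of the SRR output as a probability between 0 and 1 are assumptions made for illustration; they are not part of the O-RADS or IOTA software, and the SRR risk itself is taken as an input rather than computed.

```python
def combine_orads_v1_with_srr(orads_v1_category: int,
                              srr_risk: float,
                              cutoff: float = 0.10) -> int:
    """Apply the SRR-based downgrade rule to an O-RADS (v1) category.

    orads_v1_category: O-RADS (v1) category assigned by the sonologists.
    srr_risk: malignancy risk from the SRR assessment model, as a
              fraction in [0, 1] (hypothetical input format).
    cutoff: 10% threshold used in this study.
    """
    if not 0.0 <= srr_risk <= 1.0:
        raise ValueError("srr_risk must be a fraction between 0 and 1")
    # Only category 4 lesions are re-assessed; a risk strictly below
    # the cut-off moves the lesion to category 3.
    if orads_v1_category == 4 and srr_risk < cutoff:
        return 3
    return orads_v1_category


def is_predicted_malignant(category: int) -> bool:
    """Study definition of a malignant prediction: O-RADS > 3."""
    return category > 3


if __name__ == "__main__":
    # Hypothetical lesions: (O-RADS v1 category, SRR risk)
    for category, risk in [(4, 0.04), (4, 0.35), (5, 0.02), (3, 0.01)]:
        combined = combine_orads_v1_with_srr(category, risk)
        print(category, risk, combined, is_predicted_malignant(combined))
```

In this sketch a category 4 lesion with an SRR risk of exactly 10% stays in category 4, because the text specifies downgrading only for lesions with less than 10% malignancy risk.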
Gold standard. The pathological diagnosis of surgically excised tissue was used as the gold standard during the study, and the tumors were classified according to the World Health Organization's International Classification of Ovarian Neoplasms 13. As borderline tumors require surgical intervention as much as malignant tumors, they were classified as malignant in the course of this study 10.
Data analysis. SPSS 25.0 (IBM Corporation, Armonk, NY) and MedCalc 20.022 (MedCalc Software, Ostend, Belgium) software were used for statistical analysis during the study. Continuous variables were expressed as mean ± standard deviation, and categorical variables were expressed as frequencies. Comparisons of continuous variables were assessed using unpaired t-tests, and comparisons of categorical variables were made using chi-square tests and Fisher's exact tests. Receiver operating characteristic (ROC) curves were applied to calculate and compare the area under the curve (AUC) and to determine the best cut-off value. A p value < 0.05 was considered statistically significant.
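As an illustration of how an AUC and a cut-off-based sensitivity and specificity can be obtained from ordinal category scores, the following Python sketch uses the rank-based (Mann-Whitney) definition of the AUC. It is not the MedCalc procedure used in the study, the function names are illustrative, and the scores are hypothetical toy data, not study data.

```python
from itertools import product


def auc_mann_whitney(scores_malignant, scores_benign):
    """AUC as P(malignant score > benign score) + 0.5 * P(tie)."""
    n_pairs = len(scores_malignant) * len(scores_benign)
    if n_pairs == 0:
        raise ValueError("both groups must be non-empty")
    wins = sum(
        1.0 if m > b else 0.5 if m == b else 0.0
        for m, b in product(scores_malignant, scores_benign)
    )
    return wins / n_pairs


def sensitivity_specificity(scores_malignant, scores_benign, cutoff=3):
    """Sensitivity and specificity when 'score > cutoff' predicts malignancy."""
    true_pos = sum(s > cutoff for s in scores_malignant)
    true_neg = sum(s <= cutoff for s in scores_benign)
    return true_pos / len(scores_malignant), true_neg / len(scores_benign)


# Hypothetical O-RADS categories for a toy sample (NOT study data).
toy_malignant = [5, 5, 4, 4, 3]
toy_benign = [2, 2, 3, 3, 4, 2]

if __name__ == "__main__":
    print("AUC:", auc_mann_whitney(toy_malignant, toy_benign))
    print("Sens/Spec at > 3:", sensitivity_specificity(toy_malignant, toy_benign))
```

With ordinal scores such as O-RADS categories, ties between benign and malignant lesions are common, which is why tied pairs are counted as half a win in this sketch.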

Results
Clinical and image characteristics of the lesions. We studied 218 lesions in 218 patients, comprising 166 benign and 52 malignant lesions; their clinical and image characteristics are shown in Table 1. The mean age of the 218 patients was 44.78 ± 13.72 years (range 16-80 years). Patients with benign lesions were younger than those with malignant lesions (p < 0.001). In addition, the maximum diameter of benign lesions was significantly smaller than that of malignant lesions (p < 0.001). There were also statistically significant differences in the O-RADS category and blood flow scores between benign and malignant lesions (all p < 0.001). The pathological types of the lesions are shown in Table 2.
The results of these classification systems. The results of the subjective assessment and O-RADS (v1) assessment of the 218 lesions are shown in Table 3, and the results of the O-RADS (v2022) assessment are shown in Table 1. The malignancy rates of lesions subjectively assessed as benign and as malignant were 4.35% and 90%, respectively, and the malignancy rates for O-RADS 2, O-RADS 3, O-RADS 4 and O-RADS 5 were 0%, 0%, 36% and 94.4% for O-RADS (v1) and 0%, 0%, 46.2% and 94.4% for O-RADS (v2022), respectively.

Discussion
Ovarian cancer is a common gynecological tumor, and its accurate preoperative assessment is extremely important 14. Because O-RADS is a newly proposed US classification system, most studies are still validating its diagnostic efficacy and observer agreement 15-21. Currently, subjective assessment by senior sonologists is considered the most accurate method for diagnosing AMs 22. In this study, we evaluated the performance of the O-RADS series of models in the preoperative identification of benign and malignant AMs and compared them separately with other commonly used clinical assessment methods. The overall results indicated that O-RADS, particularly O-RADS (v2022), has high diagnostic efficacy and can assist sonologists in the accurate preoperative assessment of benign and malignant AMs.
As in previous studies, there were statistically significant differences between benign and malignant AMs in this study in terms of patient age, lesion size, lesion type, and lesion blood flow score (all p < 0.001) 17,21. Meanwhile, Di Legge et al. 23 and Bruno et al. 24 noted that even for small lesions, some ultrasound features, such as irregular contour, absence of acoustic shadowing, vascularized solid areas, ≥ 1 papillae, vascularized septum and moderate-to-severe ascites, also play a role in differentiating benign from malignant lesions. The results of this study showed that the malignancy rates for O-RADS 2, O-RADS 3, O-RADS 4 and O-RADS 5 were 0%, 0%, 36% and 94.4% for O-RADS (v1) and 0%, 0%, 46.2% and 94.4% for O-RADS (v2022), respectively, with the malignancy rate of O-RADS 3 being less than the 1-10% provided in the guidelines 7. One of the reasons for this may be the small sample size of this study and the small number of pathological types involved.
The low specificity of O-RADS (v1) has received a lot of attention 10,19,21. Lan Cao et al. 10 suggested that the diagnostic accuracy and specificity of O-RADS (v1) could be effectively improved if multilocular cysts and smooth solid masses in category 4 of O-RADS (v1) were classified as benign. Based on existing studies, O-RADS (v2022) provides a more specific classification of multilocular cysts and smooth solid masses in O-RADS category 4 10,19,21. It also downgrades smooth bilocular cysts ≥ 10 cm and smooth solid lesions with acoustic shadowing and a color score (CS) of 2-3 to category 3. During the study, 11 benign lesions were successfully downgraded when classified using O-RADS (v2022), with significant improvements in diagnostic accuracy (from 84.4% to 89.4%) and specificity (from 79.5% to 86.1%) without altering sensitivity. These 11 lesions included three ovarian fibromas and one Brenner tumor (with acoustic shadowing, CS = 2). Ovarian fibromas are the most common type of sex cord-stromal tumor, and the lesions tend to present as smooth solid masses with acoustic shadowing and a small or moderate amount of blood flow (CS = 2-3) 25. According to the O-RADS (v1) classification criteria, these lesions are mostly classified as O-RADS 4 7. When O-RADS > 3 is used as a predictor of malignancy, the lesions are therefore often classified in the malignant category. The O-RADS (v2022) classification system classifies smooth solid lesions with acoustic shadowing and a color score of 2-3 as category 3. With this classification method, some ovarian fibromas and fibrothecomas with typical US features can be correctly classified as benign, effectively avoiding unnecessary surgery in some patients.
Lan Cao et al. 10 proposed that O-RADS (v1) category 4 lesions are similar to the uncertain category in the IOTA SRs. To calculate the specific malignancy risk of lesions in the SRs model, the IOTA group developed the SRR assessment model in 2016 12. Numerous studies have confirmed that the IOTA SRR assessment, the ADNEX model and O-RADS can help in the differentiation of benign and malignant masses 15-21,26. In this study, O-RADS (v1) category 4 lesions were assessed for malignancy risk using SRR assessment and downgraded using 10% as the cut-off value, resulting in a combined assessment model with the largest AUC (0.976, 95% CI 0.946-0.992). However, the AUC of the combined model was not significantly different from the AUC of O-RADS (v2022) (p = 0.1534). Thanks to its higher sensitivity, O-RADS (v1) is able to detect malignancies and minimise missed diagnoses, but its lower specificity may lead to over-treatment of patients with AMs in the clinic 15,19,21. As with the O-RADS (v2022) classification system, when O-RADS (v1) was combined with the SRR assessment, its specificity was significantly improved (from 79.5% to 90.4%, p = 0.006) without reducing diagnostic sensitivity. However, this is a single-centre study, and much research is needed to determine the diagnostic efficacy of O-RADS (v1) combined with SRR assessment and how to further improve the specificity of O-RADS (v1) diagnosis.
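The reported specificity and accuracy figures can be cross-checked with simple arithmetic. The Python sketch below assumes true-negative counts that are back-calculated from the reported percentages; these counts are not stated in the text and are labeled as assumptions. The true-positive count of 52 follows from the reported 0% malignancy rate in categories 2 and 3, which places every malignant lesion in category 4 or 5.

```python
N_BENIGN = 166      # reported in the text
N_MALIGNANT = 52    # reported in the text

# All 52 malignant lesions fall in categories 4-5 (0% malignancy rate
# in categories 2 and 3), and sensitivity is reported as unchanged.
TRUE_POSITIVES = 52

# ASSUMPTION: true-negative counts back-calculated from the reported
# specificities (79.5%, 86.1%, 90.4%); not stated in the source text.
TRUE_NEGATIVES = {
    "O-RADS (v1)": 132,
    "O-RADS (v2022)": 143,
    "O-RADS (v1) + SRR": 150,
}


def specificity(true_neg: int) -> float:
    """Fraction of benign lesions correctly classified as benign."""
    return true_neg / N_BENIGN


def accuracy(true_pos: int, true_neg: int) -> float:
    """Fraction of all lesions correctly classified."""
    return (true_pos + true_neg) / (N_BENIGN + N_MALIGNANT)


if __name__ == "__main__":
    for model, tn in TRUE_NEGATIVES.items():
        print(f"{model}: specificity {specificity(tn):.1%}, "
              f"accuracy {accuracy(TRUE_POSITIVES, tn):.1%}")
```

Under these assumed counts, the difference of 11 true negatives between O-RADS (v1) and O-RADS (v2022) matches the 11 benign lesions reported as downgraded, and the computed specificities and the v1 and v2022 accuracies round to the values given in the text.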
A study by Moro et al. 27 mentioned that serous borderline ovarian tumors show an ultrasound appearance that overlaps with that of non-invasive low-grade serous ovarian carcinoma, both presenting as cysts with papillary projections. However, unlike ovarian cancer, the prognosis for borderline tumors is relatively good, and women of fertile age can be treated with fertility-sparing surgery 28. Therefore, it is extremely important to accurately distinguish borderline tumors from ovarian cancer before surgery. A total of 6 borderline tumors (4 serous and 2 mucinous ovarian borderline tumors) were enrolled in the present study. Because the patients were in Stage I and all were women of fertile age (range, 22-34 years), fertility-sparing surgery was used for this group of patients. Moro et al. 27 described serous borderline ovarian tumors as unilocular-solid or multilocular-solid with solid papillary projections. Meanwhile, another study by Moro et al. 29 suggested that a multilocular cyst with 2-10 locules is representative of a benign cystadenoma, whereas a multilocular cyst with > 10 locules is indicative of a gastrointestinal (GI)-type borderline tumor. The borderline tumors included in this study presented on ultrasound as a multilocular cyst (two cases, maximum diameter > 10 cm and > 10 locules) or as a multilocular cyst with a solid component. Such lesions were classified as O-RADS categories 4 and 5 in both the O-RADS (v1) and O-RADS (v2022) assessments, and lesions classified as category 4 were not downgraded by the combined SRR assessment.
Ludovisi et al. 30 described serous surface papillary borderline ovarian tumors (SSPBOTs), a rare morphologic variant of serous ovarian tumors that is typically confined to the ovarian surface, as irregular solid lesions surrounding normal ovarian parenchyma. There were no SSPBOTs among the cases included in this study, but according to the O-RADS classification guidelines, such lesions meet the classification criteria of O-RADS 5 in both O-RADS (v1) and O-RADS (v2022). Considering that the biological behavior of borderline tumors is intermediate between benign and malignant 31, giving them a higher assessment of malignancy risk can draw the attention of clinicians and avoid delaying patient treatment. However, the ability of the O-RADS classification system to identify borderline tumors is limited: determining which of the lesions assessed to be at moderate or high risk of malignancy are borderline tumors still requires subjective evaluation by experienced sonologists. This is a limitation of the O-RADS classification system that should be addressed in subsequent studies.
The main strength of this study is that the results of subjective assessment and O-RADS (v1) assessment were collected prospectively and pathological results were available for all lesions. However, the O-RADS (v2022) classification results in this study were obtained from retrospective analysis of lesions, and the small sample size and single-centre nature of this study may limit the wider application of the findings. In addition, all patients included in this study were those with obtainable pathology after surgery for AMs. Patients in the O-RADS 0 and O-RADS 1 categories were not included, which may result in selection bias and overestimation of PPV.
In summary, the O-RADS series models have good diagnostic performance for AMs. Among them, O-RADS (v2022) has higher diagnostic efficacy and diagnostic specificity than O-RADS (v1). However, when O-RADS (v1) is combined with SRR assessment, its diagnostic accuracy and specificity can be further improved.
Table 4. [Table body not recoverable from the source. Surviving fragments list O-RADS descriptors: non-simple cyst/bilocular, smooth cyst (< 10 cm); solid component, < 4 pps or solid component not considered a pp, any size; multilocular cyst with solid component, any size, CS 1; multilocular cyst with solid component, any size, CS 3.]

Figure 3. A 65-year-old woman with a fibroma of the ovary in the left adnexal region. (a) and (b) Longitudinal and transverse sections of the lesion; B-mode US showed a smooth solid mass with acoustic shadowing. (c) Small amount of blood flow signal within the lesion (Color Score = 2). (d) Results of SRR assessment of the lesion. The lesion was classified as category 4 by O-RADS (v1) and as category 3 by both O-RADS (v1) combined with SRR assessment and O-RADS (v2022).

Figure 4. A 38-year-old woman with a mature teratoma in the right adnexal region. (a) and (b) Longitudinal and transverse sections of the lesion; B-mode US showed a multilocular cyst with a solid component (maximal diameter 4.4 cm). (c) No clear blood flow signal was seen within the lesion (Color Score = 1). (d) Results of SRR assessment of the lesion. The lesion was category 4 by both O-RADS (v1) and O-RADS (v2022), and category 3 by O-RADS (v1) combined with SRR assessment.

Table 1. Clinical and image characteristics of the 218 lesions. O-RADS, Ovarian-Adnexal Reporting and Data System; v, version; B, Benign; M, Malignant; CS, Color Score; pps, papillary projections.

Table 2. Pathological types of the 218 lesions.

Table 3. Results of the subjective assessment and O-RADS (v1) assessment of the 218 lesions. O-RADS, Ovarian-Adnexal Reporting and Data System; v, version; SRR, Simple Rules Risk assessment model.