## Introduction

Uterine leiomyosarcoma accounts for ~50–70% of uterine malignant mesenchymal neoplasms and 1–2% of all uterine malignancies1,2,3,4,5, with an annual incidence of ~1 per 100,000 among women ≥35 years old in the United States2,5,6. Five-year survival is ~40–50% among all patients2,7,8,9,10, with more dismal outcomes in patients presenting at high stage2,7,8,11,12,13,14,15. Although 50–70% of patients present with FIGO stage I disease1,3,8,9,14,16,17, low stage does not equate to low-risk, as these women face a 40–75% risk of tumor recurrence12,18,19 and a 50% risk of death within 5 years3,9, despite hysterectomy. Unfortunately, there is currently no accepted risk stratification system for patients with stage I uterine leiomyosarcoma15, significantly impairing patient counseling and implementation of risk-targeted clinical trials. Prior work indicates that tumor morphology -- including cytologic atypia, number of mitoses, lymphovascular invasion, and serosal involvement -- may be significantly associated with survival across tumor stages1,3,7,9,11,12,13,14,16,19,20,21,22. However, results have been inconsistent23, and histopathologic studies specific to stage I uterine leiomyosarcoma have been limited by small cohorts18,24.

We hypothesized that routinely evaluated morphologic parameters could be combined in a robust yet mathematically transparent risk stratification model for stage I uterine leiomyosarcoma for routine application at hysterectomy.

## Methods

### Cohort

This study was approved by the institutional review board at Brigham and Women’s Hospital (BWH) (2017P001291). A retrospective search of the electronic pathology database identified 1177 gynecological leiomyosarcomas diagnosed between August 4, 1989 and June 1, 2020. Inclusion criteria were: (1) hysterectomy for leiomyosarcoma at BWH; or patient referred to Dana Farber Cancer Institute within 3 months of initial leiomyosarcoma diagnosis, with hysterectomy pathology reviewed at BWH, and with no clinical or radiographic recurrence at referral (i.e., only tumors referred for primary management, to limit referral bias and increase external validity); (2) FIGO (2018) stage IA or IB at diagnosis, as confirmed by multidisciplinary review, including radiologic staging; (3) uterine primary site; and (4) hematoxylin and eosin-stained slides available for review. Exclusion criteria included: (1) tumor seen for pathological diagnostic consultation only; (2) leiomyosarcoma diagnosis not confirmed on pathological review; or (3) neoadjuvant leiomyosarcoma-directed therapy. Twenty-four and 15 additional stage I uterine leiomyosarcoma meeting criteria were obtained from an Italian consortium and the University of Chicago, respectively. Case entry is diagrammed in Fig. 1.

Of 164 BWH tumors meeting criteria, 123 and 41 were assigned to training and validation cohorts, respectively, using a random number generator. All non-BWH tumors were assigned to the validation cohort, providing a measure of model transportability in addition to reproducibility25 and yielding a prespecified 60:40 ratio of training to validation cohort size.
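The allocation described above can be sketched in Python. This is an illustration only: the text states that a random number generator was used but not how, so the seed, the case identifiers, and the function name `split_cohort` are assumptions rather than the study's procedure.

```python
import random


def split_cohort(case_ids, n_training, seed=0):
    """Randomly assign cases to training and validation sets (hypothetical helper)."""
    rng = random.Random(seed)      # fixed seed so the illustration is repeatable
    shuffled = list(case_ids)
    rng.shuffle(shuffled)
    return shuffled[:n_training], shuffled[n_training:]


# Placeholder identifiers; only the counts come from the text.
bwh_cases = [f"BWH-{i:03d}" for i in range(1, 165)]      # 164 in-house tumors
external_cases = [f"EXT-{i:02d}" for i in range(1, 40)]  # 24 Italian + 15 Chicago

training, bwh_validation = split_cohort(bwh_cases, n_training=123)
validation = bwh_validation + external_cases             # all external cases go to validation

training_fraction = len(training) / (len(training) + len(validation))
```

With the counts given in the text, this gives 123 training and 80 validation tumors, a training fraction of about 0.61, which is close to the prespecified 60:40 ratio.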

### Clinical data

Clinical and outcomes data were collected from the electronic medical record. Clinical parameters included age at diagnosis, primary site, FIGO (2018) stage26, tumor morcellation, and adjuvant therapy. Outcome parameters included date of first pathologic diagnosis, date and site of first clinical or radiographic recurrence, date of last clinical or radiographic follow-up, and clinical status at last follow-up.

### Morphologic data

Gross pathologic descriptions were reviewed. Tumor size, pattern of myometrial invasion, cervical involvement, and relation to uterine serosa and surgical margins were documented. Original diagnostic pathology reports were reviewed to corroborate morphologic findings and extent of disease. All original diagnoses were rendered by subspecialty-trained gynecologic pathologists. All available hematoxylin and eosin-stained (range, 1–30 slides per case; median, 5 slides per case overall; median, 9 slides per in-house case; median, 2 slides per referral case) and immunostained slides were re-reviewed by a subspecialty-trained gynecologic pathologist (D.B.C.), and a subset was reviewed with two additional gynecologic pathologists (M.R.N., C.P.H.) to set morphologic thresholds and resolve ambiguous findings. Non-BWH tumors were reviewed by gynecologic pathologists at contributing institutions. Primary site, stage, and pathologic diagnosis (by WHO criteria27,28) were confirmed.

Morphologic parameters are listed in Table 1. For detailed definitions, see Supplementary Information 1. Briefly, predominant cytomorphology included conventional spindled, epithelioid, and myxoid (per WHO definitions28) as well as pleomorphic, defined by predominant (>50%) pleomorphic tumor cells. Tumor-myometrial interface was classified as circumscribed, microscopically invasive, or grossly invasive. Broders13 nuclear grade 1 was defined by mild cytologic atypia, abundant eosinophilic cytoplasm, and fascicular architecture; nuclear grade 2 by moderate nuclear atypia, with somewhat greater variation in nuclear shape and size, but without overt pleomorphism; nuclear grade 3 by areas of nuclear pleomorphism, but with retention of fascicular growth and eosinophilic cytoplasm characteristic of smooth muscle differentiation; and nuclear grade 4 by predominant pleomorphic nuclei with at most minor foci with morphologic features characteristic of smooth muscle differentiation. Mitotic count was expressed as mitoses per 2.4 mm² (i.e., 10 high-power fields under a 40× objective with field diameter 0.55 mm; see Supplementary Table 1) in the most mitotically active tumor focus. Atypical mitoses were defined by multipolar, bizarre, or ring spindles and were scored as “present” if at least 1 atypical mitosis was identified in 50 high-power fields. Coagulative necrosis was characterized by abrupt transition from viable to necrotic tumor, often with convoluted “geographic” borders. Lymphovascular invasion was defined by tumor entirely within or protruding into vessels outside of the main tumor mass. Serosal abutment was defined by tumor invasion through the myometrium to abut the underside of the uterine serosa, with no residual histologically unaltered intervening myometrial fibers recognizable on hematoxylin and eosin-stained slides. 
Tumor growth on the mesothelial-lined serosal surface (either via destructive invasion directly through the serosa or implantation on the serosa via peritoneal spread) was considered stage II disease and excluded from this study.
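The 2.4 mm² counting area follows from the stated field diameter. The short Python sketch below reproduces that arithmetic and shows how many fields would cover the same area on a microscope with a different field diameter. The function names are illustrative assumptions; Supplementary Table 1 remains the authoritative conversion.

```python
import math


def counted_area_mm2(field_diameter_mm, n_fields=10):
    """Total area of n circular high-power fields, in mm^2."""
    radius = field_diameter_mm / 2
    return n_fields * math.pi * radius ** 2


def fields_for_area(field_diameter_mm, target_area_mm2=2.4):
    """Number of fields (possibly fractional) needed to cover the target area."""
    single_field_area = math.pi * (field_diameter_mm / 2) ** 2
    return target_area_mm2 / single_field_area


area = counted_area_mm2(0.55)      # 10 fields of 0.55 mm diameter
fields_needed = fields_for_area(0.50)  # a narrower field needs more than 10 fields
```

For a 0.55 mm field, 10 fields cover about 2.38 mm², which is the 2.4 mm² quoted in the text after rounding.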

After initial morphologic review, additional diagnostic immunostains were performed on 46 tumors, and 6 were reclassified and excluded from further study (2 PEComa, 1 inflammatory myofibroblastic tumor, 1 adult granulosa cell tumor, 1 low-grade endometrial stromal sarcoma, and 1 low-grade fibromyxoid spindle cell tumor of uncertain type). All tissue slides were prepared from formalin-fixed paraffin-embedded tissues maintained in a climate-controlled anatomic pathology archive.

### Statistical analyses

All statistical analyses were performed in SAS 9.4 (SAS Institute, Cary, NC, USA). Analyses were performed in accordance with REMARK29 and AJCC30 criteria. All P values are two-tailed, with alpha (α) <0.05 for statistical significance; 95% confidence intervals are presented.

### Survival analyses

The primary outcomes endpoint was 5-year disease-specific survival, with 5-year disease-free survival considered a secondary endpoint. Disease-free survival was defined as the interval from initial pathologic diagnosis to first clinical or radiographic recurrence, death from disease, or last recurrence-free follow-up (censored). Disease-specific survival was defined as the interval from initial pathologic diagnosis to death from leiomyosarcoma, death from other cause (censored), or last follow-up (censored).

Univariate analyses of disease-free and disease-specific survival were performed by the log-rank test (Kaplan–Meier method), with multiple comparisons corrected by Tukey’s method. Parameters significantly associated with survival on univariate analyses were assessed in multivariate Cox proportional hazard regression models, with stepwise variable selection (P < 0.25 for entry; P > 0.15 for removal). Multivariate models were repeated with and without missing data points. Assumptions of proportional hazards regression were validated (Supplementary Information 2)31.
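For readers less familiar with censored survival data, the Kaplan–Meier product-limit estimate underlying these comparisons can be sketched in a few lines of Python. This is a generic textbook implementation on invented data, not the SAS procedure used in the study. An event flag of 1 marks the endpoint (e.g., death from leiomyosarcoma) and 0 marks censoring (e.g., death from other cause or last follow-up).

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct event time.

    times  : follow-up in months from initial pathologic diagnosis
    events : 1 if the endpoint occurred at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all subjects sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events:
            survival *= 1 - n_events / n_at_risk
            curve.append((t, survival))
        # Events and censored subjects both leave the risk set after time t.
        n_at_risk -= n_leaving
    return curve


# Invented example: five patients, two of them censored.
curve = kaplan_meier(times=[5, 10, 10, 20, 30], events=[1, 0, 1, 1, 0])
```

Censored patients contribute to the number at risk up to their last follow-up but never reduce the survival estimate, which is why death from other causes does not count against disease-specific survival.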

### Risk model selection and internal validation

Twelve algebraic risk models were created from the multivariate data, with individual parameters weighted by hazard ratios (Supplementary Information 3). The candidate risk models were tested by logistic regression, with 5-year disease-free and disease-specific survival as separate binary endpoints. Five patients with completely resected local single-site recurrence achieved >5-year disease-free survival after re-resection and were considered to have achieved 5-year disease-free survival32,33. Patients alive with no evidence of recurrence but less than 60 months follow-up were excluded from logistic regression analyses. Discrimination was assessed with a concordance statistic (represented as area under a receiver operator characteristic (ROC) curve (AUC)), and the risk model with the greatest AUC was selected and cross-validated34. This optimal risk model was internally validated on the training cohort by proportional hazards regression on 1000 leave-one-out bootstrap replicates with replacement. Time-dependent ROC curves for this risk model were generated, with integrated AUC calculated by the inverse probability of censoring weighting method.
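The concordance statistic used to rank candidate models can be illustrated with a rank-based calculation. For a logistic regression with a single predictor, the ROC AUC equals the concordance of the raw score with the binary outcome (assuming higher scores predict the event), so the sketch below works on scores directly. The data are invented, and this is not the SAS implementation used in the study.

```python
def concordance_auc(scores, outcomes):
    """Area under the ROC curve via pairwise concordance.

    scores   : risk scores (higher = predicted worse outcome)
    outcomes : 1 if the adverse event occurred within 5 years, else 0
    """
    with_event = [s for s, y in zip(scores, outcomes) if y == 1]
    without_event = [s for s, y in zip(scores, outcomes) if y == 0]
    if not with_event or not without_event:
        raise ValueError("both outcome classes are required")
    concordant = 0.0
    for s_event in with_event:
        for s_none in without_event:
            if s_event > s_none:
                concordant += 1.0   # correctly ordered pair
            elif s_event == s_none:
                concordant += 0.5   # ties count as half
    return concordant / (len(with_event) * len(without_event))


# Invented example: six patients.
auc = concordance_auc(scores=[0, 2, 3, 5, 8, 10], outcomes=[0, 0, 1, 0, 1, 1])
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation of patients with and without the event.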

### External validation

The novel risk model was then applied to the validation cohort, and proportional hazards and logistic regression analyses were repeated. Concordance statistics of the training and validation cohorts were compared by Pearson’s chi-squared test (categorical variables) or Mann–Whitney U test (continuous variables). Optimal risk score cutoffs were determined on the combined training and validation cohorts by Youden’s criteria using the SAS %rocplot macro. Variable selection and model validation were repeated on the combined (training + validation) cohort, as previously recommended34, and model calibration (predicted vs observed outcomes) was assessed with loess calibration curves.
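Youden's criterion selects the score threshold that maximizes sensitivity + specificity − 1. A minimal Python sketch on invented data follows. It finds a single cutoff only, and does not reproduce the SAS %rocplot macro or the way the study derived its cutoffs.

```python
def youden_cutoff(scores, outcomes):
    """Return (cutoff, J) maximizing Youden's J = sensitivity + specificity - 1.

    A patient is classified as test-positive when score >= cutoff.
    """
    n_event = sum(outcomes)
    n_no_event = len(outcomes) - n_event
    best_cutoff, best_j = None, float("-inf")
    for cutoff in sorted(set(scores)):
        true_pos = sum(1 for s, y in zip(scores, outcomes) if s >= cutoff and y == 1)
        false_pos = sum(1 for s, y in zip(scores, outcomes) if s >= cutoff and y == 0)
        sensitivity = true_pos / n_event
        specificity = 1 - false_pos / n_no_event
        j = sensitivity + specificity - 1
        if j > best_j:              # on ties, keep the lowest cutoff
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j


# Invented example: nine patients, five with the adverse event.
cutoff, j_statistic = youden_cutoff(
    scores=[0, 1, 2, 3, 4, 6, 7, 8, 10],
    outcomes=[0, 0, 0, 1, 0, 1, 1, 1, 1],
)
```

In this toy example the optimal rule is "score ≥ 6", with sensitivity 0.8 and specificity 1.0.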

### Comparison with existing sarcoma grading systems

Our novel risk model was compared to 4 alternative grading systems: subjective tumor grade assigned by a subspecialty-trained gynecologic pathologist at original diagnosis, the French (FNCLCC) sarcoma grading system35, and 2 distinct non-stage-specific uterine leiomyosarcoma grading systems from Mayo Clinic13 and Memorial Sloan-Kettering Cancer Center (MSKCC)9. Each comparator system was applied to our combined training and validation cohorts and compared to our novel risk model by logistic regression modeling of 5-year disease-free and disease-specific survival.

### Modified risk models

Because lymphovascular invasion and serosal abutment are not readily assessed in myomectomy and morcellated specimens, a “simplified” risk model excluding these terms was tested in the combined training and validation cohorts by logistic regression for 5-year disease-free and disease-specific survival. Performance of this modified risk model was compared to the complete novel risk model by chi-squared testing. Optimal score cutoffs were determined using Youden’s criteria, as described above. To ensure applicability in the intended clinical scenario, the subcohort of leiomyosarcomas diagnosed on myomectomy or morcellated specimens was stratified according to the simplified risk model, with low- and high-risk groups compared by log-rank testing.

To account for possible differences in interpretation of existing staging criteria and to promote continued validity in the event of future changes to staging criteria, we also tested a second modified risk model that excluded only the serosal abutment term, with comparison to the complete risk model as described above.

## Results

### Clinical and morphologic parameters

Clinical and morphologic data are detailed in Table 1. Representative photomicrographs are in Fig. 2 and Supplementary Figs. A1–A7. The training cohort included 123 women with stage I uterine leiomyosarcoma. All patients were clinically disease-free after primary surgery and staging. Seventy-three (59%) women had tumor recurrence, at median 10 (range, 2–188) months after diagnosis, including 24 women with primary local (abdominopelvic) recurrence and 35 recurring at distant sites, including lung (n = 25), liver (3), soft tissue (3), bone (3), and retroperitoneum (1). Fifty-three (43%) women died of disease, at median 34 (range, 2–188) months, and 12 (10%) were living with disease at last follow-up. Five (4%) women had died of other causes (1 at 9 months from breast cancer, 1 at 11 months from heart failure, 1 at 80 months from unrelated bowel perforation, 1 at 114 months from Crohn’s disease, and 1 at 143 months from unknown causes).

Lymphovascular invasion could not be assessed in 8 cases (2 morcellated; 6 consults with a single hematoxylin and eosin-stained slide). Tumor-myometrial interface could not be assessed in 8 consults with a single hematoxylin and eosin-stained slide. Serosal abutment could not be assessed in 10 cases (5 morcellated; 5 consults with a single hematoxylin and eosin-stained slide). Margin status could not be assessed in 12 cases (9 morcellated; 3 consults with a single hematoxylin and eosin-stained slide).

### Univariate and multivariate survival analyses

Univariate survival analyses for the training cohort are detailed in Table 2. Shorter disease-free and disease-specific survival were significantly associated with increased mitotic count, coagulative necrosis, lymphovascular invasion, and serosal abutment. Shorter disease-free survival was also associated with atypical mitoses. Positive surgical margins showed a non-significant trend toward shorter disease-free and disease-specific survival, and all patients with positive surgical margins died of disease.

Across 5 multivariate models (Supplementary Table 2), shorter disease-free and disease-specific survival were consistently associated with lymphovascular invasion and serosal abutment. Coagulative necrosis, mitotic count >25 per 2.4 mm², and atypical mitoses were associated with shorter survival in certain models. Due to collinearity with serosal abutment, positive surgical margins were paradoxically associated with significantly reduced hazard. Because serosal abutment was associated with substantially greater hazard in independent models, margin status was excluded from further modeling. Models with and without missing data points differed slightly but did not substantially affect parameter selection for risk formulas.

### Risk model selection and internal validation

The candidate risk model with the best performance for both 5-year disease-free survival and 5-year disease-specific survival was:

$$\begin{aligned}
\mathrm{Risk}\,\mathrm{score} ={}& \left(\mathrm{coagulative}\,\mathrm{necrosis}\right)\left(1\right) + \left(\mathrm{mitoses} > 25\,\mathrm{per}\,2.4\,\mathrm{mm}^2\right)\left(2\right) \\
&+ \left(\mathrm{atypical}\,\mathrm{mitoses}\right)\left(2\right) + \left(\mathrm{lymphovascular}\,\mathrm{invasion}\right)\left(3\right) \\
&+ \left(\mathrm{serosal}\,\mathrm{abutment}\right)\left(5\right)
\end{aligned}$$

As detailed in Supplementary Information 1, each morphologic parameter is treated as binary (“yes” or “no,” “present” or “absent,” scored as “1” or “0,” respectively), yielding a possible score range of 0–13. This model, hereafter termed the “novel risk model,” was subsequently validated for both disease-free and disease-specific survival.
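The formula can be expressed as a short Python function. The weights are taken directly from the equation above; the function and argument names are illustrative assumptions, and each argument is the binary finding defined in Supplementary Information 1.

```python
# Point weights from the risk score formula.
WEIGHTS = {
    "coagulative_necrosis": 1,
    "mitoses_gt_25_per_2_4_mm2": 2,
    "atypical_mitoses": 2,
    "lymphovascular_invasion": 3,
    "serosal_abutment": 5,
}


def risk_score(
    coagulative_necrosis,
    mitoses_gt_25_per_2_4_mm2,
    atypical_mitoses,
    lymphovascular_invasion,
    serosal_abutment,
):
    """Sum the weights of all parameters scored as present (True)."""
    findings = {
        "coagulative_necrosis": coagulative_necrosis,
        "mitoses_gt_25_per_2_4_mm2": mitoses_gt_25_per_2_4_mm2,
        "atypical_mitoses": atypical_mitoses,
        "lymphovascular_invasion": lymphovascular_invasion,
        "serosal_abutment": serosal_abutment,
    }
    return sum(WEIGHTS[name] for name, present in findings.items() if present)


# Hypothetical tumor: coagulative necrosis and lymphovascular invasion only.
example_score = risk_score(
    coagulative_necrosis=True,
    mitoses_gt_25_per_2_4_mm2=False,
    atypical_mitoses=False,
    lymphovascular_invasion=True,
    serosal_abutment=False,
)
```

The hypothetical tumor in the example scores 1 + 3 = 4 points; a tumor with all five findings scores the maximum of 13.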

Training cohort tumors spanned the risk score range (0–13) (Supplementary Fig. B). On logistic regression, AUC was 0.9270 (0.8770–0.9771) for 5-year disease-free survival and 0.8517 (0.7711–0.9322) for disease-specific survival (Supplementary Fig. C). Cross-validation ROC curves supported strong model concordance (AUC 0.9060 (0.8462–0.9659) for disease-free survival; 0.8215 (0.7316–0.9114) for disease-specific survival) (Supplementary Fig. D). On time-dependent ROC analyses, the novel risk model was significantly associated with disease-free and disease-specific survival at 12, 24, 60, 120, and 180 months after diagnosis (P < 0.0001 for all analyses) (Supplementary Figs. E1 and E2). The novel risk model supplanted all individual morphologic parameters in a proportional hazards regression model (Supplementary Table 2), and higher risk score was significantly associated with poorer outcomes on bootstrap analysis (HR 1.276 (1.272–1.279) for disease-free survival; 1.269 (1.266–1.272) for disease-specific survival).

### External validation

Clinical and pathologic data for the validation cohort are detailed in Table 1. The recurrence rate was significantly higher in the training (61%) than in the validation cohort (45%); no other parameters differed significantly.

The novel risk model was validated by logistic regression in the validation cohort. AUC for 5-year disease-free and disease-specific survival were 0.8263 (0.7101–0.9425) and 0.8018 (0.6766–0.9269), respectively (Supplementary Fig. F), which did not differ significantly from the training cohort (P = 0.13 & 0.51, respectively) (Supplementary Fig. G).

Of 12 candidate risk models (see Supplementary Information 3), the novel risk model also had the highest concordance statistic in combined training and validation cohorts (AUC 0.8875 (0.7749–0.9451) for disease-free survival, 0.8348 (0.7667–0.9029) for disease-specific survival). Model predictions approximated ideal performance on a calibration curve (Supplementary Fig. H), and supplanted other risk factors in a multivariate model with bootstrap validation (Table 3). In the combined cohort, there was no significant difference in AUC between BWH in-house and cancer center referral cases (P = 0.25 & 0.12 for disease-free and disease-specific survival, respectively), or between BWH and non-BWH cases (P = 0.82 & 0.52 for disease-free and disease-specific survival, respectively). There were too few deaths in the non-BWH subcohorts for individual comparison.

### Identification of optimal risk score cutpoints

Optimal ROC cutpoint analysis identified 3 risk score groups: low risk (0–2 points; n = 53), intermediate risk (3–5 points; n = 54), and high risk (6–13 points; n = 69). Low-, intermediate-, and high-risk tumors had 5-year disease-free survival of 88%, 44%, and 9%, respectively, and 5-year disease-specific survival of 95%, 64%, and 33%, respectively (Fig. 3). Repeat logistic regression analysis using this three-tiered system resulted in minimal, non-significant loss of concordance (AUC = 0.8767 (0.8222–0.9312) for disease-free survival; 0.8180 (0.7509–0.8852) for disease-specific survival), compared to the continuous (0–13) risk score (Supplementary Fig. I). Case examples for application of the risk model with these score cutoffs are in Supplementary Information 4.
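The three-tier stratification amounts to a simple lookup, sketched below in Python. The cutoffs and 5-year survival figures are those reported in this section for the combined cohort; the function and dictionary names are illustrative assumptions.

```python
# 5-year outcomes reported for each tier in the combined cohort (percent).
FIVE_YEAR_SURVIVAL = {
    "low": {"disease_free": 88, "disease_specific": 95},
    "intermediate": {"disease_free": 44, "disease_specific": 64},
    "high": {"disease_free": 9, "disease_specific": 33},
}


def risk_group(score):
    """Map a complete-model risk score (0-13) to its risk tier."""
    if not 0 <= score <= 13:
        raise ValueError("risk score must be between 0 and 13")
    if score <= 2:
        return "low"
    if score <= 5:
        return "intermediate"
    return "high"


group = risk_group(4)                    # e.g., necrosis (1) + lymphovascular invasion (3)
reported = FIVE_YEAR_SURVIVAL[group]     # cohort-level figures, not an individual prediction
```

The reported percentages describe groups in this cohort and should not be read as individualized probabilities.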

### Re-examination of tumors with unusual findings

After establishing the optimal risk model and stratification cutoffs, we revisited tumor subsets with unusual outcomes or morphologic findings to better characterize these rare scenarios.

1. Low-risk tumors with adverse outcomes: Seven patients experienced tumor recurrence despite low-risk tumor morphology. Re-examination of these cases did not reveal any parameters either within or outside of our risk model to consistently distinguish these 7 from non-recurring low-risk tumors. Detailed clinicopathologic parameters for these 7 tumors are provided in Supplementary Information 5. Five had died of disease at 39, 43, 77, 141, and 169 months, and 2 were alive with disease at 49 and 159 months. Among all patients with adverse outcome, median disease-free and disease-specific survival were longer for those with low-risk (141 and 36 months, respectively) tumors than for intermediate-risk (51 and 13 months, respectively) and high-risk (43 and 10 months, respectively) tumors, though the differences were not statistically significant given the low number of adverse outcomes among low-risk tumors.

2. Tumors with positive margins: Of 8 tumors with positive margins, 7 were classified as high risk and 1 as intermediate risk. High-risk tumors with versus without positive margins showed no significant difference in median disease-free (9 versus 12 months, respectively; P = 0.45) or disease-specific (45 versus 49 months, respectively; P = 0.97) survival, or in rates of 5-year disease-free (0% versus 6%, respectively; P = 0.52) or 5-year disease-specific (43% versus 26%, respectively; P = 0.35) survival. The sole patient with intermediate-risk morphology and positive margins recurred at 29 months and was alive with disease 108 months after initial diagnosis.

3. Tumors with non-spindled morphology: We performed subcohort survival analyses of our three-tier risk stratification on each of the four morphologic subtypes (spindled, epithelioid, myxoid, and pleomorphic), shown in Supplementary Fig. J. Among spindled tumors, low-, intermediate-, and high-risk tumors showed significant differences in disease-free and disease-specific survival. Statistical analyses of non-spindled tumors were limited by small numbers. However, Kaplan–Meier plots of disease-free survival show a trend toward separation of low-, intermediate-, and high-risk tumors among epithelioid, myxoid, and pleomorphic tumors, with significant differences between low- and high-risk tumors for all 3 groups. Plots of disease-specific survival show the same patterns for epithelioid and pleomorphic tumors. Although low-risk myxoid tumors appear to separate from intermediate- and high-risk myxoid tumors, only four disease-specific deaths occurred among all myxoid tumors, rendering subcohort analysis statistically moot.

### Comparisons with other risk systems

For each sarcoma grading system, distributions between prognostic groups and respective survival characteristics are summarized in Table 4.

By logistic regression modeling, the MSKCC risk nomogram was significantly associated with 5-year disease-free and disease-specific survival (AUC 0.8081 (0.7371–0.8790), P < 0.0001; AUC 0.7809 (0.7037–0.8582), P < 0.0001, respectively), as were FNCLCC grade (AUC 0.7392 (0.6687–0.8098), P < 0.0001; AUC 0.7232 (0.6555–0.7910), P < 0.0001, respectively), subjective grade assigned at original diagnosis (AUC 0.6539 (0.5546–0.7533), P = 0.0022; AUC 0.6706 (0.5883–0.7530), P = 0.0049, respectively), and Mayo risk group (AUC 0.5456 (0.5041–0.5870), P = 0.02; AUC 0.5344 (0.5004–0.5684), P = 0.047, respectively).

Our novel risk model performed significantly better than the MSKCC nomogram for predicting 5-year disease-free survival (P = 0.04) and showed a non-significant trend toward better performance for disease-specific survival (P = 0.11). Our novel risk model was better at predicting 5-year disease-free and disease-specific survival than FNCLCC grade (P < 0.0001 & P = 0.0031, respectively), Mayo risk group (P < 0.0001), and subjective grade assigned at original diagnosis (P = 0.0001 & P = 0.01) (Fig. 4).

### Modified risk models

A “simplified” version of the novel risk model excluding lymphovascular invasion and serosal abutment terms was significantly associated with disease-free survival (P < 0.0001, AUC 0.7812 (0.7088–0.8536)) and disease-specific survival (P < 0.0001, AUC 0.7386 (0.6629–0.8143)) across the entire cohort. However, the simplified model significantly underperformed the complete risk model in predicting 5-year disease-free (P = 0.01) and disease-specific survival (P = 0.01) (Supplementary Fig. K). Optimal ROC cutpoint analysis identified two risk groups: low risk (0–2 points) and high risk (3–5 points). Among only those tumors diagnosed on myomectomy or morcellated specimens (n = 25), low- and high-risk tumors showed significantly different disease-free survival (89% versus 42% at 5 years; P = 0.02) and a trend toward different disease-specific survival (90% versus 61% at 5 years; P = 0.08) (Supplementary Fig. L).

A second modified risk model excluding only the serosal abutment term was significantly associated with disease-free (P < 0.0001, AUC 0.8268 (0.7619–0.8917)) and disease-specific survival (P < 0.0001, AUC 0.8021 (0.7314–0.8728)) across the entire cohort. This modified model was inferior to the complete risk model in predicting 5-year disease-free survival (P = 0.025), with no significant difference in predicting 5-year disease-specific survival (P = 0.16) (Supplementary Fig. K).
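Both modified models can be sketched as the complete score with selected terms removed. The Python example below assumes that the remaining terms keep their original weights; this is consistent with the 0–5 range reported for the simplified model but is not stated explicitly for the second modified model, whose 0–8 range is therefore an inference. All names are illustrative.

```python
# Weights of the complete model.
WEIGHTS = {
    "coagulative_necrosis": 1,
    "mitoses_gt_25_per_2_4_mm2": 2,
    "atypical_mitoses": 2,
    "lymphovascular_invasion": 3,
    "serosal_abutment": 5,
}

# Terms dropped by each modified model.
SIMPLIFIED_EXCLUDED = {"lymphovascular_invasion", "serosal_abutment"}
NO_SEROSA_EXCLUDED = {"serosal_abutment"}


def modified_score(findings, excluded):
    """Score a tumor while ignoring the excluded terms.

    findings : dict mapping parameter name to True/False (missing = not scored)
    excluded : set of parameter names omitted from the model
    """
    return sum(
        weight
        for name, weight in WEIGHTS.items()
        if name not in excluded and findings.get(name, False)
    )


def max_score(excluded):
    """Highest attainable score once the excluded terms are removed."""
    return sum(w for name, w in WEIGHTS.items() if name not in excluded)


def simplified_risk_group(score):
    """Two-tier grouping reported for the simplified model (0-2 low, 3-5 high)."""
    if not 0 <= score <= 5:
        raise ValueError("simplified score must be between 0 and 5")
    return "low" if score <= 2 else "high"


# Hypothetical morcellated tumor: serosal status is ignored by the simplified model.
findings = {"coagulative_necrosis": True, "mitoses_gt_25_per_2_4_mm2": True}
simplified = modified_score(findings, SIMPLIFIED_EXCLUDED)
group = simplified_risk_group(simplified)
```

No risk-group cutoffs are reported for the model that excludes only serosal abutment, so the sketch deliberately provides none.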

## Discussion

We present a novel, accurate, mathematically intuitive risk model for stage I uterine leiomyosarcoma, developed in a large institutional cohort controlled for referral bias, and validated in a separate multi-institutional cohort including independent observers. This novel risk model is designed around routinely assessed and carefully defined histologic parameters and omits immunohistochemical and molecular studies. This design is intended to promote interobserver reproducibility; circumvent issues of cost, access, and diagnostic delay; and facilitate broad uptake, including in resource-disadvantaged settings. Our novel risk model optimally partitions patients into three, approximately equally sized groups with significantly different risks of recurrence and death from disease. This three-tier stratification is intuitive for patients and providers, provides a standardized basis for clinical trials, and offers sensitivity and specificity comparable to the underlying continuous risk score.

Prognostic models for uterine leiomyosarcoma have been proposed3,9,13,36. However, none of these has achieved widespread clinical uptake15, and none has been endorsed by the most recent editions of the American Joint Committee on Cancer staging and prognostic guidelines37 or the WHO Classification of Tumours of the Female Genital Tract28. Among published models, the MSKCC risk nomogram9 is the most rigorously validated and prognostically accurate10. However, it has certain shortcomings and has not been widely adopted. Clinically, the nomogram is somewhat cumbersome to apply, and it does not translate to discrete, intuitive risk groups for patient counseling or trial design. Morphologically, the nomogram relies on a subjective (and likely poorly reproducible35,38) assessment of tumor grade. Further, the nomogram is intended for application to all uterine leiomyosarcoma (i.e., not specific to stage I). One stage I-specific prognostic model has been published, but it is derived from a cohort not conforming to current diagnostic criteria3 and significantly underperformed our novel model (data not shown).

Management of stage I uterine leiomyosarcoma has not changed substantially in the last 3 decades. The core tenet is hysterectomy followed by close observation15, and there is no proven role for adjuvant radiation39 or chemotherapy17,21,40. An international randomized phase III trial of observation versus adjuvant chemotherapy in stage I uterine leiomyosarcoma closed in 2016 due to poor recruitment15, highlighting a need for risk-stratified clinical trials8. With this aim in mind, our novel risk model was validated on 5-year disease-free and disease-specific survival endpoints. This timeframe approximates median survival for uterine leiomyosarcoma patients and provides a rational basis for endpoints in subsequent trials. Further, because our model was validated for both disease-free and disease-specific survival, it can be flexibly applied in clinical trials with different outcomes. Finally, time-dependent analyses indicate that our model remains significantly prognostic 180 months after diagnosis (approximating the extreme of uterine leiomyosarcoma recurrence38), offering further flexibility for trials with different time endpoints.

This study is strengthened by rigorous pathology review, few missing data points, and external validation through a multi-institutional cohort involving review by pathologists not directly involved in model training. This study also has certain limitations. First, although our model was supported on both internal and external validation, there were relatively few recurrences or deaths from disease in the Italian and Chicago subcohorts, and further external validation is warranted. Second, morphologic review in the current study was principally performed by a single gynecologic pathologist (D.B.C.), precluding assessment of interobserver reproducibility in scoring of individual parameters (e.g., serosal abutment, atypical mitoses, and lymphovascular invasion41) or in summative scoring of individual tumors. This must be established in a subsequent study. Third, the novel risk model may not be readily applicable to every uterine leiomyosarcoma encountered in clinical practice, principally because lymphovascular invasion and serosal abutment are difficult to assess in myomectomies, morcellated hysterectomies, and referrals (where, not uncommonly, only select tumor slides undergo pathological review). Nonetheless, the complete risk model was applicable to 176/203 (87%) tumors in this cohort, likely representing a conservative estimate, given the retrospective nature of the study and inclusion of referral material. In practice, a diagnosis of uterine leiomyosarcoma on myomectomy will likely be followed by hysterectomy, in which all parameters required for our risk formula can be assessed. For referral cases, obtaining all tumor slides will help maximize the applicability of our model. A simplified risk model omitting the lymphovascular invasion and serosal abutment terms could be applied to 201/203 tumors (including 25 of 25 tumors diagnosed on myomectomy or morcellated specimens). 
Although this simplified model slightly underperformed the complete novel risk model in the entire cohort, we nonetheless advocate applying it in cases where missing information precludes applying the complete model, as it is significantly associated with outcome. (For comparison, the MSKCC nomogram was applicable to 184/203 (91%) tumors in this cohort.) Fourth, myxoid, epithelioid, and pleomorphic morphologies were relatively rare in this cohort. Nonetheless, these non-spindled histotypes were included in our training and validation cohorts, and post hoc subcohort survival analyses (see Supplementary Figure J) indicate that our model significantly separates low-risk from high-risk tumors across non-spindled morphotypes. These findings, together with probable interobserver variability in histotype assignment in some cases, suggest that our risk model can be applied to all stage I uterine leiomyosarcomas in routine practice. Validation in rarer morphologic subtypes should be an emphasis of subsequent study. Fifth, tumor morcellation was too rare in this cohort to detect its well-characterized association with adverse outcome42,43,44, and surgical margins were excluded due to high collinearity with serosal abutment in this cohort. Thus, while our assessment of routinely evaluated clinical and morphologic parameters is robust, our model should not be taken as a negation of rare but clinically important adverse indicators, and we consider it prudent to record the presence of a positive margin.

Meticulous attention was devoted to proper staging of all tumors in this cohort. One area of potential variability in routine diagnostic practice pertains to the classification and staging implications of different forms of serosal involvement. International Collaboration on Cancer Reporting guidelines45 advocate reporting of serosal involvement, though the manuscript cited in support of this recommendation did not provide a morphologic definition of serosal involvement or include histopathologic review, and nearly half of studied cases lacked data on serosal involvement22. FIGO 2018 criteria define stage I uterine leiomyosarcoma as tumor limited to the uterus, whereas stage II disease is defined by tumor beyond the uterus but confined to the pelvis26. Strictly interpreting these definitions, we consider tumor invasion through the myometrium to abut the underside of the uterine serosa (i.e., tumor covered by a thin capsule, with no residual intervening myometrial fibers on H&E-stained slides, termed here “serosal abutment”) to be still in keeping with stage I disease, whereas direct invasion through the serosa or serosal deposits via trans-peritoneal spread are best regarded as stage II disease. Our finding that serosal abutment in uterine-confined (i.e., stage I) leiomyosarcoma is associated with poorer prognosis is in keeping with previous data from a smaller cohort at a separate institution46. Together, these data raise the possibility that any form of serosal involvement, including serosal abutment, is associated with more aggressive behavior and should be regarded as stage II disease. However, until comparative studies are performed and existing staging criteria are updated, we believe the staging approach used in this study is most in keeping with current definitions and advocate this approach for consistency between institutions. Furthermore, our risk model remained significantly associated with both disease-free and disease-specific survival even after exclusion of the serosal abutment term, indicating that it could still be reliably applied under a modified staging rubric in which any form of serosal involvement were regarded as stage II disease.

Our morphology-based risk model is neither perfectly sensitive nor specific for individual patient outcomes: low-risk tumors carried a 12% risk of tumor recurrence within 5 years, whereas 9% of patients with high-risk tumors had no recurrence in this timeframe. Future studies could investigate whether our novel risk model is enhanced by molecular and/or immunohistochemical profiling of some or all tumors. Relevant molecular alterations could serve as predictive as well as prognostic markers, facilitating study of targeted therapies. Furthermore, although this study was not designed to revise diagnostic criteria for uterine leiomyosarcoma or examine diagnostic thresholds for leiomyosarcoma versus smooth muscle tumor of uncertain malignant potential, our description of a discrete low-risk group of uterine leiomyosarcomas with 5-year disease-free survival of 88% suggests that this uterine leiomyosarcoma subset (as diagnosed by conventional morphologic criteria27,28) has considerable clinical overlap with smooth muscle tumor of uncertain malignant potential47. Additional studies comparing these groups are warranted and could foreseeably lead to a refinement of diagnostic criteria and terminology. However, until such studies are performed, this novel risk model should only be applied to uterine leiomyosarcomas, as defined by the World Health Organization28. In diagnostic reporting, we advocate for use of the terms “uterine leiomyosarcoma, low risk,” “…intermediate risk,” and “…high risk,” as appropriate to an individual tumor. To avoid confusion or ambiguity, we strongly advocate against use of the terms “low grade,” “intermediate grade,” and “high grade” to describe the risk categories outlined by our model.

In summary, we have developed a three-tiered risk model with excellent prognostic power for 5-year disease-free and disease-specific survival in women with stage I uterine leiomyosarcoma. This novel risk model is 1) based on routine morphologic features, to promote routine application by practicing pathologists; 2) mathematically intuitive, to facilitate patient counseling; and 3) prognostically robust and validated across time and outcome endpoints, to promote use in much-needed clinical trials for women with stage I uterine leiomyosarcoma.