Histopathological whole slide image dataset for classification of treatment effectiveness to ovarian cancer

Ovarian cancer is the leading cause of gynecologic cancer death among women. Despite the progress made over the past two decades in surgery and chemotherapy for ovarian cancer, most patients with advanced-stage disease develop recurrent cancer and die. The conventional treatment for ovarian cancer is surgical removal of cancerous tissue followed by chemotherapy; however, patients treated this way remain at great risk of tumor recurrence and progressive resistance. New treatments with molecular-targeted agents have now become available. Bevacizumab as a monotherapy in combination with chemotherapy has recently been approved by the FDA for the treatment of epithelial ovarian cancer (EOC). Prediction of therapeutic effects and individualization of therapeutic strategies are critical, but to the authors' best knowledge, there are no effective biomarkers that can be used to predict patient response to bevacizumab treatment for EOC and peritoneal serous papillary carcinoma (PSPC). This dataset helps researchers to explore and develop methods to predict the therapeutic effect of bevacizumab in patients with EOC and PSPC.

Measurement(s): Therapeutic Effect. Technology Type(s): Artificial Intelligence. Factor Type(s): whole slide image. Sample Characteristic - Environment: pathologic primary tumor stage for ovary according to AJCC 7th edition.

effectiveness relies upon the disease stage and histology. Ninety percent of women can be cured if the disease is detected and treated at an early stage, even those with more aggressive histologic subtypes. At the time of diagnosis, however, most women have advanced-stage disease, which limits the effectiveness of debulking surgery, chemotherapy, and biologic therapy [10]. Over the last two decades, the conventional treatment for EOC has been optimal cytoreductive surgery and platinum-based chemotherapy. Although approximately 80% of patients respond to first-line chemotherapy, recurrence and chemoresistance rates remain high [11,12]. Current studies therefore aim to find new therapeutic targets. Various targeted therapies and biological drugs have gradually been introduced into clinical trials for the treatment of recurrent disease, first as single agents and then in combination treatments, bringing the promise of turning ovarian cancer into a controllable chronic disease [5,12].
For the treatment of recurrent disease, targeted agents have been introduced in clinical trials to evaluate their activity as single agents, followed by combination treatment. Over the past few years, since the introduction of concurrent bevacizumab, there has been considerable progress toward non-overlapping toxicity and improved activity [5]. Bevacizumab has recently been approved by the FDA as a monotherapy for advanced EOC in combination with chemotherapy. Bevacizumab is a recombinant humanized monoclonal antibody that binds to vascular endothelial growth factor (VEGF), neutralizes its biological activity, and inhibits tumor angiogenesis. In 2011, on the basis of the GOG0218 and ICON7 trial data, bevacizumab was approved by the European Commission for first-line treatment together with standard chemotherapy in women with advanced EOC [12]. Bevacizumab has been shown to extend progression-free survival by 2-4 months and, in some settings, overall survival as well [12]. Although it is important to optimize the antiangiogenic therapeutic effect, antiangiogenic agents can be very expensive and can cause serious side effects, such as delayed wound healing, hypertension, and intestinal perforation or fistula formation [5]. Therefore, considering the cost, the potential toxicity, and the finding that only a portion of patients benefit from these drugs, the identification of a new predictive method for the treatment of EOC remains an urgent unmet medical need.
Currently, the standard diagnosis of EOC is based on microscopic analysis of tissue sections from debulking surgery that are mounted on glass slides and stained with hematoxylin and eosin (H&E) [13]. Digital whole slide images (WSIs) are used to study an entire histology slide. Further, WSIs help pathologists to refine their decisions by performing computer-aided diagnostic (CAD) analysis.
CAD analysis can automatically produce diagnostic cues that increase diagnostic accuracy [14][15][16] while also saving time. To the authors' best knowledge, there are no effective biomarkers that can be used to predict the therapeutic effect of bevacizumab treatment in EOC and PSPC. Here we present a dataset of H&E WSIs with clinical information of EOC and PSPC patients, which will help researchers to explore and develop methods to predict the therapeutic effect of bevacizumab in patients with EOC and PSPC. This dataset is composed of 288 de-identified H&E-stained WSIs (162 effective and 126 invalid) with clinical information, collected from 78 EOC and PSPC patients at the Tri-Service General Hospital and the National Defense Medical Center, Taipei, Taiwan. Examples of both effective and invalid treatment are shown in Fig. 1.
Selection and preparation of specimen. Specimens of EOC and PSPC had been surgically removed by the treating gynecologist on clinical suspicion of primary ovarian tumor. The tissue had been routinely fixed in formalin and embedded in paraffin. For this study, tissue sections of the effective and invalid treatment groups were produced from the tissue blocks and stained with hematoxylin and eosin (Fig. 1) using an automated slide stainer (ST5010 Autostainer XL, Leica, Germany). Case selection was random, and specimens with acceptable tissue quality were included. All images were digitized using a linear whole slide scanner (Leica AT Turbo, Leica, Germany) at a resolution of 0.5 microns per pixel (20X). Furthermore, all the images were examined by a pathologist, who found them consistent, with no evidence of significant variation in intensity or color. Figure 2 shows the procedure followed for the generation of H&E-stained WSIs.
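The reported scan resolution of 0.5 microns per pixel lets pixel measurements on a WSI be converted to physical lengths. The following is a minimal sketch of that arithmetic only; the function names and the 512-pixel tile size are hypothetical choices for illustration and are not part of the dataset or its tooling.

```python
# Scan resolution reported for this dataset (20X); everything else is illustrative.
MICRONS_PER_PIXEL = 0.5


def pixels_to_microns(n_pixels: float, mpp: float = MICRONS_PER_PIXEL) -> float:
    """Convert a length measured in pixels to microns."""
    return n_pixels * mpp


def microns_to_pixels(length_um: float, mpp: float = MICRONS_PER_PIXEL) -> float:
    """Convert a physical length in microns to pixels at the given resolution."""
    return length_um / mpp


if __name__ == "__main__":
    tile_px = 512  # hypothetical tile edge length, not specified by the source
    print(pixels_to_microns(tile_px))  # 256.0 microns per tile edge
    print(microns_to_pixels(100.0))    # 200.0 pixels span 100 microns
```

At this resolution, a square tile of 512 by 512 pixels would cover 256 by 256 microns of tissue.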

Data Records
Data information. Data has been accepted for publication on The Cancer Imaging Archive (TCIA). The dataset presented here is publicly available free of charge from the TCIA [17]. The dataset consists of 288 H&E-stained WSIs (162 effective and 126 invalid), which were obtained from different tissue blocks of post-treatment specimens. Figure 3

Technical Validation
To validate the proposed dataset, de-identified, digitized WSIs of 70 EOC and 8 PSPC patients (n = 78), including HGSOC (n = 58), endometrioid carcinoma (n = 4), clear cell carcinoma (n = 7), mucinous carcinoma (n = 2) and unclassified adenocarcinoma (n = 7), were obtained from the tissue bank of the Department of Pathology, Tri-Service General Hospital, National Defense Medical Center, Taipei, Taiwan. The clinicopathologic characteristics of patients were recorded by the data managers of the Gynecologic Oncology Center. Age, pre- and post-treatment serum CA125 concentrations, histologic subtype, and recurrence status were recorded. These patients had received debulking surgery and chemotherapy with bevacizumab treatment. The regimen of chemotherapy with bevacizumab was based on the GOG0218, ICON7, and GOG213 trials. Patients with persistently high levels of CA125 during bevacizumab therapy, or who experienced tumor progression or recurrence (assessed by CT/PET imaging) within six months post-treatment, were classified as the bevacizumab-resistant group. Patients with normal levels of CA125 and no tumor progression or recurrence (based on imaging) during or within six months of bevacizumab treatment were classified as the bevacizumab-sensitive group. Of the 288 patient slides, bevacizumab treatment was effective for 162 slides (56.3%) and invalid for 126 slides (43.7%), as shown in Fig. 3(b).
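The grouping rule described above can be sketched as a small function. This is an illustration under stated assumptions, not the authors' code: the `FollowUp` record and its field names are hypothetical, and the sketch assumes that CA125 status and imaging findings have already been reduced to yes/no values by a clinician.

```python
from dataclasses import dataclass


@dataclass
class FollowUp:
    """Hypothetical per-patient follow-up record; field names are illustrative."""
    # CA125 remained persistently high during bevacizumab therapy
    ca125_persistently_high: bool
    # Progression or recurrence on CT/PET imaging during or within six months of treatment
    progression_or_recurrence: bool


def classify_response(record: FollowUp) -> str:
    """Assign the bevacizumab response group following the rule in the text.

    Resistant: persistently high CA125, or progression/recurrence on imaging.
    Sensitive: normal CA125 and no progression/recurrence.
    """
    if record.ca125_persistently_high or record.progression_or_recurrence:
        return "resistant"
    return "sensitive"


if __name__ == "__main__":
    print(classify_response(FollowUp(False, False)))  # sensitive
    print(classify_response(FollowUp(True, False)))   # resistant
    print(classify_response(FollowUp(False, True)))   # resistant
```

Treating the two criteria as booleans makes the groups complementary: a patient is sensitive only when both conditions are absent, which matches the wording of the two definitions.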

Code availability
No code was used in the generation of this data. No code is required to access or analyze this dataset.