Multimodality annotated hepatocellular carcinoma data set including pre- and post-TACE with imaging segmentation

Moawad, Ahmed W.; Morshid, Ali; Khalaf, Ahmed M.; Elmohr, Mohab M.; Hazle, John D.; Fuentes, David; Badawy, Mohamed; Kaseb, Ahmed O.; Hassan, Manal; Mahvash, Armeen; Szklaruk, Janio; Qayyum, Aliyya; Abusaif, Abdelrahman; Bennett, William C.; Nolan, Tracy S.; Camp, Brittney; Elsayes, Khaled M.

doi:10.1038/s41597-023-01928-3

Data Descriptor
Open access
Published: 18 January 2023

Ahmed W. Moawad ORCID: orcid.org/0000-0001-6860-1513^1,2,
Ali Morshid¹,
Ahmed M. Khalaf ORCID: orcid.org/0000-0002-0355-1516¹,
Mohab M. Elmohr^1,7,
John D. Hazle¹,
David Fuentes¹,
Mohamed Badawy¹,
Ahmed O. Kaseb³,
Manal Hassan³,
Armeen Mahvash⁴,
Janio Szklaruk⁵,
Aliyya Qayyum⁵,
Abdelrahman Abusaif⁵,
William C. Bennett⁶,
Tracy S. Nolan ORCID: orcid.org/0000-0002-7023-7586⁶,
Brittney Camp⁶ &
Khaled M. Elsayes⁵

Scientific Data volume 10, Article number: 33 (2023)

Abstract

Hepatocellular carcinoma (HCC) is the most common primary liver neoplasm, and its incidence has doubled over the past two decades owing to increasing risk factors. Despite surveillance, most HCC cases are diagnosed at advanced stages and can only be treated using transarterial chemo-embolization (TACE) or systemic therapy. TACE fails in up to 60% of cases, leaving patients with a financial and emotional burden. Radiomics has emerged as a new tool capable of predicting tumor response to TACE from pre-procedural computed tomography (CT) studies. This data descriptor describes the HCC-TACE data collection, which comprises patients with confirmed HCC who underwent TACE and have pre- and post-procedure CT imaging studies and available treatment outcomes (time-to-progression and overall survival). Clinically curated segmentation of pre-procedural CT studies was performed to support the training of algorithms for outcome prediction and automatic liver tumor segmentation.

Measurement(s)	Image Segmentation • hepatocellular carcinoma
Technology Type(s)	Neural Network • Multiphasic Computed Tomography of the Abdomen
Sample Characteristic - Organism	multiphasic CT of the abdomen
Sample Characteristic - Location	contiguous United States of America

Background & Summary

Hepatocellular carcinoma (HCC) is the most common primary liver neoplasm, with an incidence of 42,810 newly diagnosed cases in the United States in 2020¹. The rates of HCC have doubled over the past two decades and are anticipated to continue to increase owing to increasing risk factors such as liver cirrhosis, steatohepatitis, and obesity^2,3,4. Despite close HCC surveillance, 70%–80% of HCC cases are diagnosed at advanced stages when they are unresectable. Treatment of unresectable HCC includes transarterial chemo-embolization (TACE) and systemic sorafenib therapy^5,6,7.

TACE selectively delivers chemotherapy to targeted liver tumors, taking advantage of the fact that HCC primarily receives its blood supply from the hepatic artery, while the liver parenchyma primarily receives its blood supply from the portal vein. This technique spares the healthy liver parenchyma from being damaged by the chemotherapy⁵. The Barcelona Clinic Liver Cancer (BCLC) staging system is recommended for HCC patient stratification and treatment selection. It includes patient performance status, severity of the underlying liver disease assessed with clinical and laboratory markers of liver synthetic function (Child-Pugh grade [CPG]), and extent of tumor, including its size, number of tumor foci, metastasis, and vascular invasion^7,8. Candidates for TACE therapy include BCLC stage B patients (with intermediate HCC) and patients awaiting a liver transplant who undergo TACE as bridging therapy. The median survival duration of patients with HCC is 16 months from diagnosis for intermediate cases, which decreases to 6–8 months for advanced cases^8,9,10.

TACE is an invasive procedure with several potential adverse effects, including treatment failure, organ failure, and even death. In addition, studies have shown that up to 60% of patients with HCC who undergo TACE do not benefit from the procedure in spite of multiple sessions, causing financial and emotional burdens for the patient^11,12,13. Clinical models are being developed for the prediction of TACE outcomes and the selection of patients who will benefit from the procedure. The high variability in TACE response has encouraged researchers to develop complex predictive models using histological tumor markers like vascular endothelial growth factor, biological markers like alpha-fetoprotein, or a combination of both^13,14,15,16,17. However, larger patient sample sizes are needed to validate their prognostic value. In addition, the poor detection rate of alpha-fetoprotein in patients with small residual tumors after TACE may limit its prognostic significance^1,18.

Artificial intelligence (AI) and its sub-classes, deep learning, machine learning, and radiomics, have been used in imaging for various tasks including classification, segmentation, and detection. Prediction is an important task for which AI has shown particular promise. This includes prediction of disease severity, prediction of treatment response, and prediction of disease progression^19,20,21,22. AI-based prediction models are usually not included in reports from large multi-center randomized trials, limiting the development of widely used predictive models. Furthermore, most in-house-developed models lack generalizability to other institutions or even to different datasets. Therefore, external validation is an integral step before an AI model can be widely used in day-to-day clinical practice^23,24,25,26.

Although large imaging datasets have been made available to the public for use in further research or in reproducing original results, treatment-related datasets are scarce owing to the complexity of various treatments and the variability in tumor grading and treatment selection. Recent advances in patient de-identification and image registration have allowed for the creation of different imaging databases within The Cancer Imaging Archive (TCIA) that may offer a better opportunity for treatment response validation^27,28.

Here, we present the HCC-TACE collection, a single-institution collection of 105 patients with confirmed HCC treated at The University of Texas MD Anderson Cancer Center from 2002 to 2012. The HCC-TACE collection integrates de-identified, comprehensive clinical data with diagnostic imaging and its manual segmentation and makes these data publicly available to researchers. Unique to this dataset is the inclusion of TACE procedure details, imaging before and after the procedure, manual segmentation (liver parenchyma, viable and necrotic tumor tissue, intra-hepatic vessels, and aorta), radiological measures of treatment response by experts in abdominal imaging, as well as patient outcomes after the procedure in the form of overall survival (OS) and time-to-progression (TTP). These data were used in prior publications, which evaluated prediction of TACE procedure outcomes using pre-procedural imaging and proposed a convolutional neural network architecture for segmentation of HCC and automation of the prediction process^29,30. Open access to these data allows for inter-institutional comparisons of non-randomized patient, treatment, and outcome data, in addition to the development of new architectures and models with higher performance and the external validation of other developed models using our patient cohort.

Methods

Study cohort and patient selection

To develop this dataset, the MD Anderson Cancer Center institutional database was searched for patients with HCC treated from November 2002 to June 2012. Inclusion criteria were patients with HCC who (1) underwent TACE as the sole first-line therapy or initial bridging therapy, (2) had available multi-phasic contrast-enhanced CT images with liver protocol obtained prior to TACE (pre-procedural scans), (3) had available multi-phasic contrast-enhanced CT images with liver protocol obtained within 14 weeks of the TACE procedure (post-procedural scans), and (4) had CT images of acceptable quality with no obvious artifacts. Patients undergoing TACE who had more than one HCC focus were excluded so that OS and TTP depend on a single tumor, without confounding from additional lesions. This study was approved by our institutional review board; written informed consent was waived due to the retrospective nature of the study. Pre-procedural CT images were obtained 1–12 weeks prior to the first TACE session (average 3 weeks).

The final patient cohort identified in our institutional database (N = 105) was 68 male patients (average age, 66.4 years [range, 31–88 years]) and 37 female patients (average age, 69.6 years [range, 46–93 years]). Risk factors for development of HCC were reported for each patient, including hepatitis, smoking, alcohol use, diabetes, cirrhosis, and family history of either liver disease or cancer. Multiple grading systems were recorded for our cohort to enhance the usage of our dataset: CLIP score³¹, Okuda score³², TNM staging system³³, and BCLC staging system³⁴. In addition, performance status according to the Eastern Cooperative Oncology Group (ECOG) scale, CPG, alpha-fetoprotein level, and tumor extent (tumor size, vascular and lymph node invasion, distant metastasis) were reported.

Image acquisition

All patients underwent contrast-enhanced CT of the abdomen with a liver protocol on 16- or 64-detector row CT scanners (LightSpeed; GE Healthcare, Waukesha, WI, USA). A pre-contrast scan was obtained, followed by an arterial phase scan 17 seconds after peak enhancement (using bolus tracking) of the aorta after injection of contrast medium. The porto-venous phase was scanned at 60 seconds. Images were acquired with the following scanner parameters: tube voltage of 120–140 kVp; tube current of 150–630 mA; slice thickness of 0.63–5 mm; pitch of 0.9–0.98; revolution time of 0.40–0.80 seconds; table speed of 18.75–39.38 mm per gantry rotation; and field of view of 360–460 mm. The injection rate of contrast medium was 3–5 ml/sec. The standard image reconstruction algorithm was used in all cases. A total of 621 CT series (pre-procedural and post-procedural multi-phasic scans) from 105 patients were examined.

Assessment of tumor response

Tumor response to TACE was assessed using European Association for the Study of the Liver (EASL), Response Evaluation Criteria in Solid Tumors (RECIST) 1.1, and modified RECIST (mRECIST) guidelines. All pre- and post-procedural studies were reviewed by three different board-certified radiologists (K.M.E. [reader 1], J.S. [reader 2], and A.Q. [reader 3]), each with more than 20 years of experience in abdominal imaging. They independently measured tumors in both pre- and post-procedural studies, taking into consideration tumor viability and enhancement in the arterial phase. EASL measurements were recorded by reader 1 only; RECIST 1.1 and mRECIST measurements were recorded by all three readers.
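To make the diameter-based criteria concrete, the following is a minimal sketch of the RECIST 1.1 response categories for a single target lesion. The function name and arguments are hypothetical and are not taken from the dataset spreadsheet. The thresholds are the published RECIST 1.1 cut-offs (at least a 30% decrease from baseline for partial response; at least a 20% and 5 mm increase from the smallest recorded diameter for progression). For mRECIST, the same percentage logic is usually applied to the viable, arterially enhancing portion of the tumor; EASL uses different (bidimensional) rules and is not covered here.

```python
def classify_response(baseline_mm, nadir_mm, current_mm):
    """Classify response for one target lesion from longest diameters in mm.

    baseline_mm: diameter on the pre-procedural scan
    nadir_mm:    smallest diameter recorded so far (may equal baseline)
    current_mm:  diameter on the follow-up scan being assessed
    """
    if current_mm == 0:
        return "CR"  # complete response: lesion no longer measurable
    if nadir_mm == 0:
        return "PD"  # lesion reappeared after a complete response
    growth_mm = current_mm - nadir_mm
    if growth_mm >= 5 and growth_mm / nadir_mm >= 0.20:
        return "PD"  # progressive disease
    if (baseline_mm - current_mm) / baseline_mm >= 0.30:
        return "PR"  # partial response
    return "SD"      # stable disease


if __name__ == "__main__":
    # Illustrative values only, not measurements from the collection.
    for triple in [(50, 50, 0), (50, 50, 30), (50, 50, 45), (50, 30, 40)]:
        print(triple, classify_response(*triple))
```

Because three readers measured each lesion independently, a function like this would be applied per reader, and any disagreement in the resulting categories would need to be resolved separately.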

TACE procedure and study endpoints

Patients undergoing TACE were administered one of the following chemotherapy regimens: (a) doxorubicin in 20- to 100-mg drug-eluting beads (33 lesions; LC Beads, DEBDOX, BTG International, London, England) or (b) cisplatin, doxorubicin, and mitomycin C (100, 50, and 10 mg, respectively; 55 lesions). Details about the TACE procedure were missing for 17 patients. The patient cohort was monitored longitudinally with post-procedural CT. Each lesion was monitored for progression based on radiology reports. The TTP was defined as the number of weeks from the treatment (TACE) to the date of radiological evidence of progression according to mRECIST criteria. Lesions were considered censored if (a) there was no progression by the study date, (b) the patient was lost to follow-up, either because the patient did not return or died in the meantime, (c) the date of death was before the date of tumor progression, or (d) treatment was changed to something other than TACE. In our previous studies, we divided patients into TACE-susceptible and TACE-refractory groups with a cutoff TTP of 14 weeks. TACE-susceptible patients are those who do not show radiological progression at follow-up CT, while TACE-refractory patients show radiological progression of the tumor.
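The TTP definition and the 14-week grouping can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the function and argument names are hypothetical, the boundary handling (progression before 14 weeks counts as refractory) is one reading of the text, and lesions censored before the cutoff are left unassigned because the text does not say how they were grouped.

```python
from datetime import date


def ttp_weeks(tace_date, progression_date=None, last_followup_date=None):
    """Return (weeks, progressed) for one lesion.

    If no progression date is given, the lesion is treated as censored
    at the last follow-up date (which must then be provided).
    """
    if progression_date is not None:
        return (progression_date - tace_date).days / 7, True
    return (last_followup_date - tace_date).days / 7, False


def tace_group(weeks, progressed, cutoff_weeks=14):
    """Assign a response group using the 14-week cutoff (assumed boundary rule)."""
    if progressed and weeks < cutoff_weeks:
        return "TACE-refractory"
    if weeks >= cutoff_weeks:
        return "TACE-susceptible"
    return None  # censored before the cutoff: group cannot be determined


if __name__ == "__main__":
    # Illustrative dates only, not records from the collection.
    weeks, progressed = ttp_weeks(date(2010, 1, 1), progression_date=date(2010, 3, 12))
    print(weeks, progressed, tace_group(weeks, progressed))
```

For survival modelling, the pair returned by `ttp_weeks` corresponds to the usual duration and event-indicator inputs of a time-to-event analysis.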

Data processing and curation

The AI approach for building a neural network or machine learning model is to extract imaging features from sub-volumes in the study. Segmentation of CT studies has been the main obstacle in creating such models from publicly available datasets, as acquiring well-curated data is usually a time-consuming and labor-intensive process that requires dedicated personnel. In order to build our model, segmentation of the tumor (both viable and necrotic volumes) and background liver was done. The porto-venous phase of pre-procedural CT studies was used to simplify lesion assessment, and CT studies were exported in DICOM format and subsequently converted into Neuroimaging Informatics Technology Initiative (NIfTI) format, which preserves orientation information and pixel spacing with simpler headers than the standard DICOM format. For each patient study, pre-contrast, arterial, and porto-venous images of the pre-procedural scans were registered and re-sampled to the porto-venous phase images. Segmentation of intra-hepatic vessels and the abdominal aorta was done as well. Manual segmentation was done using semi-automated segmentation tools available in AMIRA software (FEI, Thermo Fisher Scientific, Hillsboro, OR, USA) by three different radiology residents (A.M., A.M.K., and M.E.) and reviewed by a body imaging radiologist with 20 years of experience (K.M.E.).
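Re-sampling one phase onto the grid of another amounts to interpolating intensities at new positions. The sketch below shows linear interpolation along a single axis (for example, through-plane slice positions) in plain Python. It is a one-dimensional illustration only, with a hypothetical function name; the actual processing was three-dimensional and also involved registration, which is not shown.

```python
def resample_linear(values, src_spacing, dst_spacing):
    """Resample a 1D intensity profile onto a grid with a new spacing.

    values:      intensities sampled every src_spacing millimetres
    dst_spacing: desired spacing in millimetres
    """
    if len(values) < 2:
        return list(values)
    extent = (len(values) - 1) * src_spacing
    n_out = int(extent / dst_spacing + 1e-9) + 1
    out = []
    for k in range(n_out):
        x = k * dst_spacing / src_spacing  # position in source index units
        i = min(int(x), len(values) - 2)   # left neighbour, kept in range
        t = x - i                          # fractional distance to right neighbour
        out.append((1 - t) * values[i] + t * values[i + 1])
    return out


if __name__ == "__main__":
    # A profile sampled every 5 mm, resampled to 2.5 mm spacing.
    print(resample_linear([0.0, 10.0, 20.0], 5.0, 2.5))
```

Linear interpolation is appropriate for CT intensities; label maps such as segmentations are normally resampled with nearest-neighbour interpolation so that label values are not blended.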

The three different segmentations were validated and combined using the STAPLE algorithm to produce a single ground truth segmentation. The STAPLE algorithm takes the different segmentations as input and generates a binary image in which each voxel is assigned according to the estimated “true” segmentation. This process is applied to each label separately³⁵. Image registration was done using affine transformation and linear interpolation. All image manipulation and STAPLE image production were performed using the Convert3D medical image processing tool available with the ITK-SNAP software package³⁶. To enhance the generalizability of the dataset, segmented NIfTI files were converted to DICOM-SEG using the DICOM for Quantitative Imaging (dcmqi) library from the Quantitative Image Informatics for Cancer Research (QIICR) project, as available in the 3D Slicer software package^37,38. The process of data curation and processing is demonstrated in Fig. 1.
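As an illustration of how STAPLE combines several raters, the following is a simplified, single-label sketch of its expectation-maximization loop in plain Python. It is not the Convert3D implementation used for the dataset. Assumptions made here: masks are flattened to lists of 0/1 values, the foreground prior is fixed to the overall fraction of foreground votes, sensitivities and specificities start at 0.9, and the final mask is obtained by thresholding the voxel-wise probability at 0.5.

```python
def staple_binary(ratings, n_iter=50, eps=1e-6):
    """Estimate a consensus mask from several binary segmentations.

    ratings: list of equal-length lists of 0/1 labels, one list per rater.
    Returns (consensus, sensitivities, specificities).
    """
    n_raters, n_vox = len(ratings), len(ratings[0])
    # Fixed prior probability of foreground (a simplifying assumption).
    prior = sum(map(sum, ratings)) / (n_raters * n_vox)
    p = [0.9] * n_raters  # per-rater sensitivity
    q = [0.9] * n_raters  # per-rater specificity
    w = [0.0] * n_vox     # probability that each voxel is truly foreground
    for _ in range(n_iter):
        # E-step: update voxel probabilities given rater performance.
        for i in range(n_vox):
            a, b = prior, 1.0 - prior
            for j in range(n_raters):
                if ratings[j][i]:
                    a *= p[j]
                    b *= 1.0 - q[j]
                else:
                    a *= 1.0 - p[j]
                    b *= q[j]
            w[i] = a / (a + b) if a + b > 0 else 0.5
        # M-step: update rater performance given voxel probabilities.
        sum_w = max(sum(w), eps)
        sum_not_w = max(n_vox - sum(w), eps)
        for j in range(n_raters):
            tp = sum(w[i] for i in range(n_vox) if ratings[j][i])
            tn = sum(1.0 - w[i] for i in range(n_vox) if not ratings[j][i])
            p[j] = min(max(tp / sum_w, eps), 1.0 - eps)
            q[j] = min(max(tn / sum_not_w, eps), 1.0 - eps)
    consensus = [1 if wi >= 0.5 else 0 for wi in w]
    return consensus, p, q


# Toy example: three raters labelling ten voxels (not data from the collection).
raters = [
    [1, 1, 1, 1, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 0, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 0, 0, 0, 0, 0],
]
consensus, sensitivity, specificity = staple_binary(raters)
```

Unlike a plain majority vote, the estimated sensitivities and specificities weight each rater by how consistently they agree with the evolving consensus, so a rater who systematically under- or over-segments has less influence on disputed voxels.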

Neural network architecture and training

Our original work²⁹ aimed to automate the treatment response prediction process by automating segmentation of the liver and of both viable and necrotic tumor. Two back-to-back convolutional neural networks (CNNs) were constructed for that purpose. The first CNN (CNN1) was built to segment liver tissue from the background using axial images of porto-venous phase CT scans; this network was trained on the axial CT images and corresponding liver segmentations from the Medical Image Computing and Computer Assisted Intervention Society Liver Tumor Segmentation (LiTS) challenge (130 manually labeled CT images; publicly available)³⁹. CNN2 was built to segment HCC from the output of CNN1, and it was trained separately on the manual segmentation of our cohort (105 manually labeled images; publicly available). Both CNNs follow the U-Net architecture^40,41. The output of CNN1 is a CT image with binary classification of the liver tissue; this image serves as input to CNN2, which segments the tumor from the liver mask. Each convolution operation uses a 3 × 3 kernel and is followed by batch normalization and a rectified linear unit activation function. Dropout (P = 0.5) was applied before each convolution in the up-sampling limb of the U-Net. Our CNN architecture and code can be found in our GitHub repository (https://github.com/fuentesdt/livermask).
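The data flow of the two-stage cascade can be sketched independently of any deep learning framework. In the sketch below the two networks are passed in as plain callables, and the toy threshold functions are placeholders that stand in for trained U-Nets; they are not part of the authors' repository, and all names and values are hypothetical. Only the masking logic between the stages is illustrated.

```python
def apply_mask(image, mask, background=0):
    """Keep pixels where mask is 1 and set all others to the background value."""
    return [
        [px if m else background for px, m in zip(img_row, mask_row)]
        for img_row, mask_row in zip(image, mask)
    ]


def cascade(ct_slice, liver_net, tumor_net):
    """Run liver segmentation, then tumor segmentation restricted to the liver."""
    liver_mask = liver_net(ct_slice)               # stage 1: liver vs background
    liver_only = apply_mask(ct_slice, liver_mask)  # suppress non-liver pixels
    tumor_mask = tumor_net(liver_only)             # stage 2: tumor within liver
    # Tumor labels are only meaningful inside the liver mask.
    tumor_mask = apply_mask(tumor_mask, liver_mask)
    return liver_mask, tumor_mask


# Placeholder "networks": simple intensity thresholds, for illustration only.
def toy_liver_net(image):
    return [[1 if 50 <= px <= 150 else 0 for px in row] for row in image]


def toy_tumor_net(image):
    return [[1 if px >= 100 else 0 for px in row] for row in image]


ct = [[0, 60, 120], [200, 10, 70]]
liver, tumor = cascade(ct, toy_liver_net, toy_tumor_net)
```

Training the two stages separately, as described above, lets the liver network use a larger public dataset while the tumor network is fitted to the HCC cohort.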

Data Records

Our dataset consists of (a) 51,832 DICOM files from 621 series and 210 studies collected from 105 patients; (b) 105 DICOM-SEG files, each containing segmentation of the liver, tumor (viable and necrotic), intra-hepatic vessels, and aorta of pre-procedural CT images; and (c) a single spreadsheet file including all of the demographic, clinical, and diagnostic data, as well as EASL, RECIST, and mRECIST readings for each pre- and post-procedural CT study.

There are a total of 203 series of pre-contrast phase, 204 series of arterial phase, 210 series of porto-venous phase, and 4 series of delayed phase. Of note, 48 series have combined phases in one series owing to technical errors during export of DICOM files from our PACS system to a separate research folder for de-identification. A detailed description of each series is found in Supplementary Table 1. Available corresponding clinical and survival information for the patients is provided in tabular format. Table 1 shows selected headers from the spreadsheet and their relevant descriptions. All of the CT images (stored as de-identified DICOM files), segmentations (stored as DICOM-SEG files), and spreadsheets containing relevant clinical and radiological information are available from the TCIA database⁴².
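Users who download the collection may want to confirm that their local copy matches the reported counts. The short sketch below only checks the internal consistency of the figures quoted above; the dictionary keys are informal labels, not DICOM series descriptions from the collection.

```python
# Series counts per contrast phase, as reported in the text above.
series_by_phase = {
    "pre-contrast": 203,
    "arterial": 204,
    "porto-venous": 210,
    "delayed": 4,
}

n_patients = 105
n_studies = 210
n_series_reported = 621

total_series = sum(series_by_phase.values())
studies_per_patient = n_studies / n_patients

print(total_series, studies_per_patient)
```

The per-phase counts sum to the reported 621 series, and 210 studies over 105 patients corresponds to one pre-procedural and one post-procedural study per patient.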

Table 1 Selected headers from the spreadsheet.

Technical Validation

Patient DICOM files were de-identified using the Clinical Trial Processor (CTP) developed by the Medical Imaging Resource Center (MIRC) of the Radiological Society of North America (RSNA). The program removes all protected health information (PHI) from the metadata of all DICOM files. It also replaces study, series, and image unique identifiers (UIDs) with hashed versions to ensure complete de-identification of the images in accordance with the Health Insurance Portability and Accountability Act (HIPAA)²⁸.
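The idea of replacing a UID with a hashed version can be illustrated as follows. This sketch is an assumption-laden example and does not reproduce the hashing scheme actually used by CTP: it derives a new identifier under the `2.25` UUID-based root from a truncated SHA-256 digest, and the `salt` parameter and function name are hypothetical. The property that matters is that the mapping is deterministic, so references between files (for example from a SEG object to its CT slices) remain consistent after de-identification.

```python
import hashlib


def hash_uid(uid, salt=""):
    """Map a DICOM UID to a new, deterministic UID that hides the original.

    The result uses the '2.25.' root followed by a 128-bit integer in
    decimal, which stays within the 64-character limit for DICOM UIDs.
    """
    digest = hashlib.sha256((salt + uid).encode("ascii")).digest()[:16]
    return "2.25." + str(int.from_bytes(digest, "big"))


if __name__ == "__main__":
    original = "1.2.840.113619.2.55.3.1234567890.123"  # made-up example UID
    print(hash_uid(original, salt="project-secret"))
```

Using a project-specific secret as the salt prevents someone who knows an original UID from recomputing its replacement.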

The POSDA Tools were used by TCIA for technical validation of this DICOM collection. These tools are openly available, and contributions from the research community are encouraged: https://github.com/UAMS-DBMI/PosdaTools. For further detail on the curation of DICOM CT and SEG files using POSDA Tools within TCIA, please refer to https://posda.com.

To support validation of this collection, the capabilities of POSDA Tools were extended as follows: each slice-wise segmentation is extracted from the 3D DICOM-SEG file, converted to a set of 2D contours, and displayed superimposed on the referenced CT slice so that a (non-radiologist) curator can visually confirm reasonable alignment and file linkage. Every CT file referenced by unique identifier in the SEG files was checked to confirm that it exists, ensuring data completeness; this capability was added in response to a missing CT file error, which affects one segmentation series for Patient ID HCC_001. The missing file corresponds to an image slice far from the liver, the tumor, and their masks, so clinical interpretation should be unaffected by this discrepancy.
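The completeness check described above reduces to a set comparison between the identifiers a segmentation refers to and the identifiers actually present among the CT files. The sketch below is a generic illustration, not POSDA code; the function name is hypothetical and the UIDs are made up. Reading the identifiers out of real DICOM files would require a DICOM library and is not shown.

```python
def find_missing_references(referenced_uids, available_uids):
    """Return the referenced UIDs that have no matching CT file, sorted."""
    return sorted(set(referenced_uids) - set(available_uids))


if __name__ == "__main__":
    # Made-up identifiers standing in for SOP Instance UIDs.
    seg_references = ["2.25.101", "2.25.102", "2.25.103"]
    ct_files = ["2.25.101", "2.25.103", "2.25.104"]
    print(find_missing_references(seg_references, ct_files))
```

An empty result means every referenced slice is present; for this collection, the text above notes that one segmentation series for HCC_001 references a CT file that is absent.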

Usage Notes

Several open-source software packages can be used to visualize the DICOM-SEG files; we highly recommend using the latest stable version of 3D Slicer for data visualization after installing the “Quantitative Reporting” extension. Step-by-step guidance for installation and use can be found at https://qiicr.gitbook.io/quantitativereporting-guide/. For the full list of available software, please see the dcmqi documentation at https://dicom4qi.readthedocs.io/en/latest/results/seg/.

Code availability

The POSDA Tools used by TCIA for technical validation of this DICOM collection are openly available, and contributions from the research community are encouraged: https://wiki.cancerimagingarchive.net/display/Public/Submission+and+De-identification+Overview. For further detail on the curation of DICOM CT and SEG files using POSDA Tools within TCIA, please refer to https://posda.com. The CNN architecture and code used in this manuscript can be found in our GitHub repository (https://github.com/fuentesdt/livermask).

References

1. Armstrong, S. A. & He, A. R. Immuno-oncology for Hepatocellular Carcinoma: The Present and the Future. Clinics in Liver Disease 24, 739–753 (2020).
2. Aly, A., Ronnebaum, S., Patel, D., Doleh, Y. & Benavente, F. Epidemiologic, humanistic and economic burden of hepatocellular carcinoma in the USA: a systematic literature review. Hepatic Oncology 7, HEP27, https://doi.org/10.2217/hep-2020-0024 (2020).
3. Kim, H.-s & El-Serag, H. B. The Epidemiology of Hepatocellular Carcinoma in the USA. Current Gastroenterology Reports 21, 17, https://doi.org/10.1007/s11894-019-0681-x (2019).
4. Moawad, A. W. et al. Angiogenesis in Hepatocellular Carcinoma; Pathophysiology, Targeted Therapy, and Role of Imaging. J Hepatocell Carcinoma 7, 77–89, https://doi.org/10.2147/JHC.S224471 (2020).
5. Pesapane, F., Nezami, N., Patella, F. & Geschwind, J. New concepts in embolotherapy of HCC. Medical Oncology 34, 58 (2017).
6. Xiang, X. et al. Distribution of tumor stage and initial treatment modality in patients with primary hepatocellular carcinoma. Clinical and Translational Oncology 19, 891–897 (2017).
7. Marrero, J. A. et al. Diagnosis, staging, and management of hepatocellular carcinoma: 2018 practice guidance by the American Association for the Study of Liver Diseases. Hepatology 68, 723–750 (2018).
8. European Association for the Study of the Liver. EASL clinical practice guidelines: management of hepatocellular carcinoma. Journal of Hepatology 69, 182–236 (2018).
9. Singal, A. G., Pillai, A. & Tiro, J. Early detection, curative treatment, and survival rates for hepatocellular carcinoma surveillance in patients with cirrhosis: a meta-analysis. PLoS Med 11, e1001624, https://doi.org/10.1371/journal.pmed.1001624 (2014).
10. Heimbach, J. K. et al. AASLD guidelines for the treatment of hepatocellular carcinoma. Hepatology 67, 358–380, https://doi.org/10.1002/hep.29086 (2018).
11. Llovet, J. M. et al. Arterial embolisation or chemoembolisation versus symptomatic treatment in patients with unresectable hepatocellular carcinoma: a randomised controlled trial. The Lancet 359, 1734–1739 (2002).
12. Garwood, E. R., Fidelman, N., Hoch, S. E., Kerlan, R. K. Jr & Yao, F. Y. Morbidity and mortality following transarterial liver chemoembolization in patients with hepatocellular carcinoma and synthetic hepatic dysfunction. Liver Transplantation 19, 164–173 (2013).
13. Sciarra, A. et al. TRIP: a pathological score for transarterial chemoembolization resistance individualized prediction in hepatocellular carcinoma. Liver International 35, 2466–2473 (2015).
14. Huang, G.-W., Yang, L.-Y. & Lu, W.-Q. Expression of hypoxia-inducible factor 1α and vascular endothelial growth factor in hepatocellular carcinoma: impact on neovascularization and survival. World Journal of Gastroenterology 11, 1705 (2005).
15. Yu, S. J. et al. Targeted proteomics predicts a sustained complete-response after transarterial chemoembolization and clinical outcomes in patients with hepatocellular carcinoma: a prospective cohort study. Journal of Proteome Research 16, 1239–1248 (2017).
16. Kim, B. K. et al. Risk prediction for patients with hepatocellular carcinoma undergoing chemoembolization: development of a prediction model. Liver International 36, 92–99, https://doi.org/10.1111/liv.12865 (2016).
17. Jeong, S. O. et al. Predictive factors for complete response and recurrence after transarterial chemoembolization in hepatocellular carcinoma. Gut and Liver 11, 409 (2017).
18. Guo, J.-h, Zhu, X., Li, X.-t & Yang, R.-j. Impact of serum vascular endothelial growth factor on prognosis in patients with unresectable hepatocellular carcinoma after transarterial chemoembolization. Chinese Journal of Cancer Research 24, 36–43 (2012).
19. Zhu, H. B. et al. Deep learning-assisted magnetic resonance imaging prediction of tumor response to chemotherapy in patients with colorectal liver metastases. Int J Cancer 148, 1717–1730, https://doi.org/10.1002/ijc.33427 (2021).
20. Wang, C. J. et al. Deep learning for liver tumor diagnosis part II: convolutional neural network interpretation using radiologic imaging features. Eur Radiol 29, 3348–3357, https://doi.org/10.1007/s00330-019-06214-8 (2019).
21. Shui, L. et al. The Era of Radiogenomics in Precision Medicine: An Emerging Approach to Support Diagnosis, Treatment Decisions, and Prognostication in Oncology. Front Oncol 10, 570465, https://doi.org/10.3389/fonc.2020.570465 (2020).
22. Jin, J. et al. Deep learning radiomics model accurately predicts hepatocellular carcinoma occurrence in chronic hepatitis B patients: a five-year follow-up. Am J Cancer Res 11, 576–589 (2021).
23. Liu, D. et al. Accurate prediction of responses to transarterial chemoembolization for patients with hepatocellular carcinoma by using artificial intelligence in contrast-enhanced ultrasound. European Radiology 30, 2365–2376, https://doi.org/10.1007/s00330-019-06553-6 (2020).
24. Shaish, H. et al. Radiomics of MRI for pretreatment prediction of pathologic complete response, tumor regression grade, and neoadjuvant rectal score in patients with locally advanced rectal cancer undergoing neoadjuvant chemoradiation: an international multicenter study. European Radiology 30, 6263–6273, https://doi.org/10.1007/s00330-020-06968-6 (2020).
25. Trebeschi, S. et al. Predicting response to cancer immunotherapy using noninvasive radiomic biomarkers. Annals of Oncology 30, 998–1004, https://doi.org/10.1093/annonc/mdz108 (2019).
26. Bi, W. L. et al. Artificial intelligence in cancer imaging: Clinical challenges and applications. CA: A Cancer Journal for Clinicians 69, 127–157, https://doi.org/10.3322/caac.21552 (2019).
27. Clark, K. et al. The Cancer Imaging Archive (TCIA): maintaining and operating a public information repository. Journal of Digital Imaging 26, 1045–1057 (2013).
28. Moore, S. M. et al. De-identification of medical images with retention of scientific research value. Radiographics 35, 727–735 (2015).
29. Morshid, A. et al. A machine learning model to predict hepatocellular carcinoma response to transcatheter arterial chemoembolization. Radiology: Artificial Intelligence 1, e180021 (2019).
30. Khalaf, A. et al. Hepatocellular carcinoma response to transcatheter arterial chemoembolisation using automatically generated pre-therapeutic tumour volumes by a random forest-based segmentation protocol. Clinical Radiology 74, 974.e913–974.e920 (2019).
31. The Cancer of the Liver Italian Program (CLIP) Investigators. A new prognostic system for hepatocellular carcinoma: a retrospective study of 435 patients. Hepatology 28, 751–755 (1998).
32. Okuda, K. et al. Natural history of hepatocellular carcinoma and prognosis in relation to treatment study of 850 patients. Cancer 56, 918–928 (1985).
33. Minagawa, M., Ikai, I., Matsuyama, Y., Yamaoka, Y. & Makuuchi, M. Staging of hepatocellular carcinoma: assessment of the Japanese TNM and AJCC/UICC TNM systems in a cohort of 13,772 patients in Japan. Annals of Surgery 245, 909 (2007).
34. Llovet, J. M., Brú, C. & Bruix, J. in Seminars in Liver Disease. 329–338 (Thieme Medical Publishers, Inc., 1999).
35. Warfield, S. K., Zou, K. H. & Wells, W. M. Simultaneous truth and performance level estimation (STAPLE): an algorithm for the validation of image segmentation. IEEE Trans Med Imaging 23, 903–921, https://doi.org/10.1109/tmi.2004.828354 (2004).
36. Yushkevich, P. A. et al. User-guided 3D active contour segmentation of anatomical structures: Significantly improved efficiency and reliability. NeuroImage 31, 1116–1128, https://doi.org/10.1016/j.neuroimage.2006.01.015 (2006).
37. Fedorov, A. et al. DICOM for quantitative imaging biomarker development: a standards based approach to sharing clinical data and structured PET/CT analysis results in head and neck cancer research. PeerJ 4, e2057 (2016).
38. Herz, C. et al. DCMQI: an open source library for standardized communication of quantitative image analysis results using DICOM. Cancer Research 77, e87–e90 (2017).
39. Bilic, P. et al. The liver tumor segmentation benchmark (LiTS). arXiv preprint arXiv:1901.04056 (2019).
40. Chlebus, G. et al. Automatic liver tumor segmentation in CT with fully convolutional neural networks and object-based postprocessing. Scientific Reports 8, 1–7 (2018).
41. Vorontsov, E. et al. Deep learning for automated segmentation of liver lesions at CT in patients with colorectal cancer liver metastases. Radiology: Artificial Intelligence 1, 180014 (2019).
42. Moawad, A. W. et al. Multimodality annotated HCC cases with and without advanced imaging segmentation (HCC-TACE-Seg). The Cancer Imaging Archive. https://doi.org/10.7937/TCIA.5FNA-0924 (2021).

Acknowledgements

This work was supported by the NIH/NCI under award numbers P30 CA016672 and U24CA215109 and also by the MD Anderson QIAC Partnership in Research Grants. The TCIA and POSDA Tools portions of this project have been funded in whole or in part with federal funds from the National Cancer Institute, National Institutes of Health under Contract No. HHSN261200800001E. The content of this publication does not necessarily reflect the views or policies of the Department of Health and Human Services, nor does mention of trade names, commercial products, or organizations imply endorsement by the U.S. Government. Under this contract the University of Arkansas is funded by Leidos Biomedical Research subcontract 16X011. Support was also provided by U24CA215109.

We thank Dawn Chalaire at the Scientific Publication Department at the University of Texas MD Anderson Cancer Center for her contribution to this article.

Author information

Authors and Affiliations

Departments of Imaging Physics, The University of Texas MD Anderson Cancer Center, Houston, TX, 77030, USA
Ahmed W. Moawad, Ali Morshid, Ahmed M. Khalaf, Mohab M. Elmohr, John D. Hazle, David Fuentes & Mohamed Badawy
Department of radiology, Mercy catholic medical center, Darby, PA, 19023, USA
Ahmed W. Moawad
Departments of Gastrointestinal Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX, 77030, USA
Ahmed O. Kaseb & Manal Hassan
Departments of Interventional Radiology, The University of Texas MD Anderson Cancer Center, Houston, TX, 77030, USA
Armeen Mahvash
Departments of Body Imaging, The University of Texas MD Anderson Cancer Center, Houston, TX, 77030, USA
Janio Szklaruk, Aliyya Qayyum, Abdelrahman Abusaif & Khaled M. Elsayes
Department of Biomedical Informatics, University of Arkansas for Medical Sciences, Little Rock, AR, 72205, USA
William C. Bennett, Tracy S. Nolan & Brittney Camp
Department of Radiology, Baylor College of Medicine, Houston, TX, 77030, USA
Mohab M. Elmohr

Authors

Ahmed W. Moawad
Ali Morshid
Ahmed M. Khalaf
Mohab M. Elmohr
John D. Hazle
David Fuentes
Mohamed Badawy
Ahmed O. Kaseb
Manal Hassan
Armeen Mahvash
Janio Szklaruk
Aliyya Qayyum
Abdelrahman Abusaif
William C. Bennett
Tracy S. Nolan
Brittney Camp
Khaled M. Elsayes

Contributions

All authors contributed extensively to the work presented in this paper. A.W.M. and K.M.E. designed and supervised all of the steps of this project. A.M., A.M.K., M.M.E., M.B. and A.A. collected the patients’ data, performed the data segmentation, performed quality checks of the CT images and uploaded the data to the TCIA repository. J.D.H. and D.F. designed, implemented and maintained the neural network used for segmentation and prediction of clinical data. They also supervised the data segmentation process. A.O.K., M.H. and A.M. recruited the patients and provided clinical and procedural information and support. J.S., A.Q. and K.M.E. performed extensive image quality checks, calculated response criteria and supervised the segmentation phase. W.B., T.S.N. and B.C. performed the final quality check for the processed data, hosted and maintained the TCIA portal used to upload the cases, and provided technical support for anonymization.

Corresponding authors

Correspondence to Ahmed W. Moawad, Ali Morshid, Mohab M. Elmohr, John D. Hazle, David Fuentes, Ahmed O. Kaseb, Manal Hassan, Janio Szklaruk, William C. Bennett, Tracy S. Nolan, Brittney Camp or Khaled M. Elsayes.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Detailed description of each CT series uploaded to TCIA

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Moawad, A.W., Morshid, A., Khalaf, A.M. et al. Multimodality annotated hepatocellular carcinoma data set including pre- and post-TACE with imaging segmentation. Sci Data 10, 33 (2023). https://doi.org/10.1038/s41597-023-01928-3

Received: 06 October 2021
Accepted: 03 January 2023
Published: 18 January 2023
DOI: https://doi.org/10.1038/s41597-023-01928-3
