Abstract
Breast cancer is one of the most pervasive forms of cancer and its inherent intra- and inter-tumor heterogeneity contributes towards its poor prognosis. Multiple studies have reported results from either private institutional data or publicly available datasets. However, current public datasets lack consistency in: a) data quality, b) quality of expert annotation of pathology, and c) availability of baseline results from computational algorithms. To address these limitations, here we propose the enhancement of the I-SPY1 data collection, with uniformly curated data, tumor annotations, and quantitative imaging features. Specifically, the proposed dataset includes a) uniformly processed scans that are harmonized to match intensity and spatial characteristics, facilitating immediate use in computational studies, b) computationally-generated and manually-revised expert annotations of tumor regions, as well as c) a comprehensive set of quantitative imaging (also known as radiomic) features corresponding to the tumor regions. This collection describes our contribution towards repeatable, reproducible, and comparative quantitative studies leading to new predictive, prognostic, and diagnostic assessments.
Background & Summary
The spatial manifestation of inter- and intra-tumor heterogeneity in breast cancer is well established1,2. Current breast cancer diagnosis and subsequent disease management primarily occur on the basis of histopathologic assessment and biomarkers, which are derived from the sampled tissue. Biopsies and conventional biomarkers cannot fully capture the intra-tumor heterogeneity, as they are limited by tissue sampling error, leading to over- or under-treatment. As such, there is a clinical need to characterize the intra-tumor heterogeneity to better understand this disease and its progression mechanisms.
The use of magnetic resonance imaging (MRI) in breast cancer screening, diagnosis, and treatment management allows for the non-invasive and longitudinal sampling of disease burden3,4. Beyond the conventional and qualitative uses of MRI in breast cancer disease management, the field of radiomics, broadly defined as the extraction of high-throughput visual and sub-visual cues derived from medical imaging5,6,7, has allowed for a quantitative characterization and assessment of the breast tumor disease burden. This has led to the development of prognostic and predictive radiomic biomarkers that capture breast intra-tumor heterogeneity, promoting personalized clinical decision making8.
Clinical and computational studies analyzing the radiologic presentations of breast tumor disease burden require ample and diverse data to ensure robust characterization. Publicly available datasets, such as those hosted through The Cancer Imaging Archive (TCIA, www.cancerimagingarchive.net)9, created by the National Cancer Institute (NCI) of the National Institutes of Health (NIH), provide large study cohorts for meaningful research development. Furthermore, such datasets10,11,12,13 allow for study reproducibility and comparison of analyses across institutions, promoting increasingly robust conclusions. However, publicly available radiographic scans require accompanying expert ground truth tumor annotations to ensure accurate study comparisons and reproducible analyses. Furthermore, any computational analyses, including radiomics-based pipelines, require standardized image normalization and feature parameter selections for consistent analyses6,7,14,15,16.
To address this limitation, this manuscript provides the ‘I-SPY1-Tumor-SEG-Radiomics’ collection, which extends the current TCIA collection ‘I-SPY1’ (https://wiki.cancerimagingarchive.net/display/Public/I-SPY1)17,18 with segmentation labels and a panel of radiomic features for the ACRIN 6657/I-SPY1 TRIAL cohort. The latter contains dynamic contrast enhanced (DCE) MRI images of women diagnosed with locally advanced breast cancer who underwent longitudinal neoadjuvant chemotherapy17,18. The primary goal is to provide standardized expert image annotations and radiomic features so that researchers can conduct reproducible analyses. To this end, annotations and radiomic features for the baseline (pre-treatment) images of n = 163 women have been provided. To support future studies wishing to explore dynamic assessments of breast tumor behavior and treatment response prediction, the selected cohort includes women with baseline (T1) DCE-MRI with at least two post-contrast images. For each patient visit, three MRI scans are provided over the duration of a single contrast administration: a pre-contrast image and two post-contrast images. All provided images are pre-operative and pre-treatment. Two sets of annotated labels are provided: i) structural tumor volume (STV) segmentations assessed by an expert board-certified breast radiologist, and ii) functional tumor volume (FTV) segmentations, as described in prior studies18,19. While FTV segmentations can provide an assessment of tumor vascularity and perfusion, they are limited in describing the entire structural tumor burden, as they only account for voxels of a region of interest (ROI) above a specific intensity threshold. In contrast, the provided STV segmentations annotate the entire structural region (i.e., the whole extent) of the primary lesion.
The STV segmentations have been used in prior studies in which radiomic features extracted from the STV region resulted in better prognostic performance than FTV values20. Preliminary evaluation of radiomic features extracted from STV-defined primary lesion volumes has demonstrated improved prognostic performance over established clinical covariates21.
Additionally, the data cohort includes a comprehensive panel of radiomic features characterizing breast tumor morphology, intensity, and texture. This panel of radiomic features is extracted in compliance with the Image Biomarker Standardization Initiative (IBSI)7, using the publicly available Cancer Imaging Phenomics Toolkit (CaPTk, https://www.cbica.upenn.edu/captk)22,23,24.
The availability of annotations characterizing the functionally active regions around the lesion’s ROI, the entire primary lesion structure, and the computed radiomic features can enable the development of prognostic and predictive biomarkers characterizing breast tumor heterogeneity. Importantly, it can also contribute to repeatable, reproducible, and comparative quantitative studies through direct utilization of the TCIA ACRIN 6657/I-SPY1 TRIAL data in clinical and computational studies.
Methods
Data collection
The ACRIN 6657/I-SPY1 TRIAL17,18 enrolled n = 237 women with their consent from May 2002 to March 2006. From this cohort, n = 230 women met the eligibility criteria of being diagnosed with locally advanced breast cancer with primary tumors of stage T3 measuring at least 3 cm in diameter18. The pre-operative DCE-MRI images of 222 women were publicly available via The Cancer Imaging Archive (TCIA)9. From this TCIA set, 15 women were excluded from our present study due to incomplete DCE acquisition scans. A further 44 women were excluded due to incomplete histopathologic data, incomplete recurrence-free survival (RFS) outcome, or missing pre-treatment DCE-MRI scans. This resulted in the inclusion of n = 163 women for this study, for whom at least two post-contrast scans from the baseline pre-treatment DCE-MRI scans were available. Women underwent neoadjuvant chemotherapy with an anthracycline-cyclophosphamide regimen alone or followed by taxane. All women underwent longitudinal DCE-MRI imaging on a 1.5 T field-strength system. Distributions of patient histopathologic characteristics and image scanner manufacturer details can be found in Tables 1 and 2. An exemplary illustration of the spatial intratumor heterogeneity is shown in Fig. 1. The complete clinical metadata is available in the Supplementary Table.
Preprocessing
The preprocessing procedures involved in preparing the data for further analyses were conducted using the Cancer Imaging Phenomics Toolkit (CaPTk)22,23,24, and they are outlined as follows:
1. Image format conversion: For each patient, baseline images were converted from the publicly available DICOM scans to the Neuroimaging Informatics Technology Initiative (NIfTI)25 file format. Unlike the DICOM headers, this format does not hold any identifiable information; it preserves only the actual imaging information and the information necessary to define the data in physical coordinates.
2. Bias field correction: All the converted NIfTI images were bias corrected to rectify any non-uniformity associated with the magnetic field of the MRI scanner26,27.
3. Data harmonization: This step is required to ensure consistency across the entire dataset, as described below.
   (a) Resampling: The raw I-SPY images have different voxel resolutions, preventing cohesive analysis across the entire dataset. To mitigate this, all the images were resampled to a standard isotropic resolution of 1 mm³ to ensure harmonized processing for computational algorithms. This resolution was chosen because it resizes all the images to a size that can fit in GPU memory (more details will be explained later).
   (b) Z-scoring: After the images are resampled, we Z-score them using instance-level statistics of mean and variance (i.e., computed over all timepoints of the given patient rather than over the entire dataset). Z-scoring is a widely accepted method, supported by extended observations28,29,30,31 that normalizing every single multi-timepoint scan (i.e., instance-level normalization) to zero mean and unit variance helps to improve algorithmic generalizability and to preserve the relative intensity differences between the pre- and post-contrast excitation scans.
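As an illustration of the instance-level Z-scoring in step 3(b), the following minimal Python sketch pools the voxels of all timepoints of one patient, computes a single mean and standard deviation, and applies them to every timepoint. The function name `zscore_instance` and the flat-list voxel representation are assumptions made for brevity; this is not the CaPTk implementation. The text does not state whether background voxels are masked out, so the sketch uses all voxels.

```python
from statistics import fmean, pstdev


def zscore_instance(timepoints):
    """Z-score all timepoints of ONE patient with shared statistics.

    timepoints: list of flat lists of voxel intensities, e.g.
                [pre_contrast, post_contrast_1, post_contrast_2].
    Returns a list of normalized timepoints in the same order.
    """
    # Pool voxels across timepoints so one mean/std describes the instance.
    pooled = [v for tp in timepoints for v in tp]
    mu = fmean(pooled)
    sigma = pstdev(pooled, mu)  # population standard deviation
    if sigma == 0:
        raise ValueError("constant intensities; Z-score is undefined")
    # Apply the SAME shift and scale to each timepoint, which keeps the
    # relative intensity differences between pre- and post-contrast scans.
    return [[(v - mu) / sigma for v in tp] for tp in timepoints]


# Toy example: three timepoints with two voxels each.
normalized = zscore_instance([[0.0, 2.0], [4.0, 6.0], [8.0, 10.0]])
```

Because the shift and scale are shared, a post-contrast timepoint that was brighter than the pre-contrast one before normalization remains brighter afterwards; normalizing each timepoint separately would remove that contrast.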
DCE-MRI NIfTI volumes
Three volumes have been provided for each patient from the pre-operative, pre-treatment visit. These images include the pre-contrast administration MRI scan (0000), first post-contrast image (0001), and second post-contrast image (0002).
Expert tumor annotations
From the NIfTI images, the functional tumor volume (FTV) segmentation was identified within the region of interest (ROI), provided through TCIA, from the signal enhancement ratio image, as previously described18,32. In order to generate the structural tumor volume (STV) segmentations, voxels outside of the largest contiguous volume region and voxels greater than 2 cm away from the largest contiguous volume region, within the FTV, were manually removed. Our expert board-certified breast radiologist then identified the primary lesions in each of the n = 163 baseline DCE-MRI images using the manually cleaned FTV segmentation as a guide. The first post-contrast image for each case was used by the radiologist to delineate the entire 3-D primary tumor segmentation for each patient. Satellite lesions were not considered in the primary tumor segmentations. ITK-SNAP (www.itksnap.org)33 was utilized to perform the manual delineations.
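To make the threshold-based nature of the FTV concrete, the sketch below builds an FTV-style mask from a pre-contrast and two post-contrast intensity lists. It assumes the commonly used definitions of percent enhancement, PE = 100 × (S1 − S0) / S0, and signal enhancement ratio, SER = (S1 − S0) / (S2 − S0). The default thresholds (`pe_threshold=70.0`, `ser_threshold=0.0`), the function names, and the handling of undefined ratios are assumptions for illustration only; the exact procedure is the one described in the cited prior studies. Raw (not Z-scored) intensities are assumed, since PE divides by the pre-contrast signal.

```python
def ftv_mask(pre, post_early, post_late, pe_threshold=70.0, ser_threshold=0.0):
    """Return a list of booleans marking voxels counted towards the FTV.

    pre, post_early, post_late: flat lists of raw voxel intensities
    restricted to the ROI (S0, S1, S2 respectively).
    """
    mask = []
    for s0, s1, s2 in zip(pre, post_early, post_late):
        denom = s2 - s0
        if s0 <= 0 or denom == 0:
            # PE or SER is undefined; this sketch excludes the voxel.
            mask.append(False)
            continue
        pe = 100.0 * (s1 - s0) / s0   # percent enhancement
        ser = (s1 - s0) / denom       # signal enhancement ratio
        mask.append(pe >= pe_threshold and ser >= ser_threshold)
    return mask


def volume_mm3(mask, voxel_volume_mm3=1.0):
    """Volume of a mask; 1.0 matches the 1 mm isotropic resampled images."""
    return sum(mask) * voxel_volume_mm3


example_mask = ftv_mask(
    pre=[100, 100, 100, 0],
    post_early=[200, 150, 180, 50],
    post_late=[180, 160, 100, 60],
)
```

Only voxels passing both thresholds are kept, which is why the FTV can miss non-enhancing parts of the lesion that the STV annotation includes.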
Computationally-generated annotations
A 3D convolutional neural network based on U-Net34, with residual connections35, was trained on all three preprocessed timepoints to perform automated segmentation of the STV, and the code has been made available for reproducibility. The models were trained using the multi-class Dice36 loss function37 with on-the-fly data augmentation techniques such as ghosting, blur, and Gaussian noise, each applied randomly with a given probability38. All experiments were done using nested k-fold cross-validation, and the median Dice score across the holdout folds was 0.74. An initial learning rate of 0.01 was used, which was varied in a linear triangular fashion with a minimum learning rate of 10⁻³ times the initial learning rate. We used the stochastic gradient descent optimizer to update the weights of our network.
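The linear triangular learning-rate schedule can be sketched as follows. The bounds follow the text (0.01 and 0.01 × 10⁻³ = 10⁻⁵), but the cycle length (`half_cycle`), the choice to start each cycle at the minimum, and the function name are assumptions; the configuration actually used is the one published with the GaNDLF code.

```python
def triangular_lr(step, base_lr=0.01, min_factor=1e-3, half_cycle=100):
    """Learning rate at a given optimizer step for a triangular schedule.

    The rate rises linearly from base_lr * min_factor to base_lr over
    `half_cycle` steps, then falls linearly back, and the cycle repeats.
    """
    min_lr = base_lr * min_factor
    pos = step % (2 * half_cycle)          # position inside the cycle
    if pos <= half_cycle:
        frac = pos / half_cycle            # rising edge: 0 -> 1
    else:
        frac = 2.0 - pos / half_cycle      # falling edge: 1 -> 0
    return min_lr + (base_lr - min_lr) * frac


# Rates at the start, middle, and peak of the first cycle.
schedule = [triangular_lr(s) for s in (0, 50, 100, 150, 200)]
```

In a training loop, the returned value would be assigned to the optimizer's learning rate before each update step.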
Radiomic features
A comprehensive array of 370 unique features was extracted. These are from 8 different feature families, based on intensity statistics (n = 20), morphology (n = 21), histograms (n = 285), gray-level co-occurrence matrix (GLCM) (n = 8), gray-level run-length matrix (GLRLM) (n = 12), gray-level size zone matrix (GLSZM) (n = 18), neighborhood gray tone difference matrix (NGTDM) (n = 5), and local binary patterns (LBP) (n = 1). We used the non-filtered first post-contrast images, which were bias-corrected, resampled, and z-score normalized. The radiomic features were then extracted from the region defined by the STV. The extraction was done using the Cancer Imaging Phenomics Toolkit (CaPTk, www.cbica.upenn.edu/captk)22,23,24. CaPTk is an open-source software toolkit that offers functionalities to extract a wide array of radiomic features compliant with the Image Biomarker Standardisation Initiative (IBSI)7 and the Quantitative Imaging Network6, and it has been extensively used in radiomic analysis studies39,40,41,42,43. The exact parameters used for the radiomic analysis are available through TCIA’s repository44, at https://doi.org/10.7937/TCIA.XC7A-QT20.
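As a quick consistency check, the per-family counts listed above can be tallied to confirm that they add up to the stated total of 370 features. The dictionary keys below are shorthand labels chosen for this sketch and are not the column names used in the released feature files.

```python
# Feature counts per family, as listed in the text.
FEATURE_FAMILIES = {
    "intensity statistics": 20,
    "morphology": 21,
    "histogram": 285,
    "GLCM": 8,
    "GLRLM": 12,
    "GLSZM": 18,
    "NGTDM": 5,
    "LBP": 1,
}


def total_features(families):
    """Sum the number of features over all families."""
    return sum(families.values())


n_families = len(FEATURE_FAMILIES)
n_features = total_features(FEATURE_FAMILIES)
```

The same tally can serve as a sanity check after loading the released feature table, by comparing the number of feature columns against the expected total.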
Data Records
We use the data17 published through the ACRIN 6657/I-SPY1 TRIAL study18. Specifically, we selected baseline subjects for whom at least two pre-operative post-contrast scans were available. The raw and generated data, which include the preprocessed images at an isotropic resolution of 1 mm³, the expert and computationally-generated annotations, and the extracted radiomic features, have been made available through TCIA’s Analysis Results Directory (www.cancerimagingarchive.net/tcia-analysis-results/)44, at https://doi.org/10.7937/TCIA.XC7A-QT20. The computationally generated annotations can stand as a benchmark for improving segmentation algorithms related to this data in future computational studies.
Technical Validation
Data collection
The dataset was directly downloaded from TCIA and quantitatively analyzed to ensure all images have a defined coordinate system and contain non-zero pixel values. Two cases, 1183 and 1187, had white image artifacts outside of the breast region. While these artifacts do not affect intensity distributions within the anatomical breast or the corresponding lesion segmentations, they may cause difficulties in image visualization and downstream analyses. These artifacts were present in images directly downloaded from TCIA (illustrated in Fig. 2). Additionally, qualitative assessment was performed to look for any visual data corruption.
Preprocessing
Each step of preprocessing was followed by manual qualitative assessment of the image to ensure data validity. In addition, quantitative assessment was performed following the data harmonization step to ensure that the entire dataset had the same parametric definition (i.e., same resolution and pixel intensity distribution).
Expert tumor annotations
The expert-annotated STV segmentations were qualitatively assessed, manually edited, and approved by a board-certified, fellowship-trained breast radiologist.
Computationally-generated annotations
The FTV annotations were quantitatively compared with the corresponding STV annotations using the Dice score in order to quantify the difference between the two annotations. Additionally, a qualitative analysis was performed for the best and worst performing cases (illustrated in Fig. 3).
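For reference, the Dice score between two binary annotations A and B is 2|A ∩ B| / (|A| + |B|). The sketch below computes it on flattened binary masks; the function name and the convention of returning 1.0 when both masks are empty are assumptions of this example.

```python
def dice_score(mask_a, mask_b):
    """Dice overlap between two binary masks given as flat sequences."""
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must have the same number of voxels")
    a = [bool(v) for v in mask_a]
    b = [bool(v) for v in mask_b]
    intersection = sum(x and y for x, y in zip(a, b))
    size_sum = sum(a) + sum(b)
    if size_sum == 0:
        # Both masks empty: treated here as perfect agreement (assumption).
        return 1.0
    return 2.0 * intersection / size_sum


# Toy masks: one voxel shared, one voxel unique to each annotation.
ftv_example = [1, 1, 0, 0]
stv_example = [1, 0, 1, 0]
overlap = dice_score(ftv_example, stv_example)
```

A score of 1 indicates identical annotations and 0 indicates no overlap, so lower FTV-versus-STV scores correspond to cases where the thresholded functional volume departs most from the full structural extent.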
Feature extraction
Considering the mathematical formulation of these features, it is possible for a division by zero to occur (e.g., because of a lack of heterogeneity or a very small number of voxels). In CaPTk, such features are reported as “not a number” (NaN), which flags them unambiguously so that users can handle them consistently in downstream analyses across the entire population. We acknowledge that these results could instead be reported as “inf”, but we report them as “NaN” to maintain parity between various programming languages and processing protocols.
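The following sketch illustrates the NaN convention and one way a user might handle it downstream. Both helper functions are hypothetical and written for this example; they are not part of CaPTk. Note that NaN never compares equal to itself, so `math.isnan` is used rather than `==`.

```python
import math


def safe_ratio(numerator, denominator):
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator != 0 else math.nan


def complete_features(table):
    """Keep only features that are defined (non-NaN) for every subject.

    table: dict mapping feature name -> list of per-subject values.
    """
    return {
        name: values
        for name, values in table.items()
        if not any(math.isnan(v) for v in values)
    }


# Toy table: the second feature is undefined for one subject.
features = {
    "feature_a": [0.4, 0.7, 0.9],
    "feature_b": [1.2, safe_ratio(3.0, 0.0), 0.8],
}
usable = complete_features(features)
```

Dropping incomplete features is only one option; depending on the analysis, a user might instead exclude the affected subjects or impute the missing values.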
Usage Notes
This collection of images (both normalized and resampled) and accompanying annotations can be analyzed using different tools or software. We provide all the annotations in a research-friendly NIfTI format to allow users to read the images and annotations through many programming languages such as C++, Python, R, or others. The data is accompanied by an XLSX file that provides additional information about each subject.
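As an example of programmatic access, the sketch below reads the image dimensions and voxel spacing from the 348-byte NIfTI-1 header using only the Python standard library, which can be used to confirm the 1 mm isotropic resolution of a downloaded volume. The function name is an assumption of this example, and the field offsets follow the NIfTI-1 header layout. In practice, a dedicated NIfTI reader library would typically be used to load the voxel data itself.

```python
import struct


def parse_nifti1_header(raw):
    """Extract shape and voxel spacing from the first 348 bytes of a NIfTI-1 file."""
    if len(raw) < 348:
        raise ValueError("a NIfTI-1 header is 348 bytes long")
    # sizeof_hdr (offset 0) must equal 348; use it to detect byte order.
    for endian in ("<", ">"):
        (sizeof_hdr,) = struct.unpack_from(endian + "i", raw, 0)
        if sizeof_hdr == 348:
            break
    else:
        raise ValueError("not a NIfTI-1 header")
    if raw[344:348] not in (b"n+1\x00", b"ni1\x00"):
        raise ValueError("missing NIfTI-1 magic string")
    dim = struct.unpack_from(endian + "8h", raw, 40)      # dim[0] = number of axes
    datatype, bitpix = struct.unpack_from(endian + "2h", raw, 70)
    pixdim = struct.unpack_from(endian + "8f", raw, 76)   # pixdim[1..3] = spacing
    (vox_offset,) = struct.unpack_from(endian + "f", raw, 108)
    ndim = dim[0]
    return {
        "shape": dim[1:1 + ndim],
        "spacing": pixdim[1:1 + ndim],
        "datatype": datatype,
        "bitpix": bitpix,
        "vox_offset": int(vox_offset),
    }


# For a compressed file, the header bytes could be obtained with, e.g.:
#   import gzip
#   with gzip.open("volume.nii.gz", "rb") as f:   # hypothetical file name
#       info = parse_nifti1_header(f.read(348))
```

For the harmonized images in this collection, the reported spacing is expected to be 1.0 along each of the three spatial axes.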
Code availability
In favor of transparency and reproducibility, and in line with the scientific data principles of Findability, Accessibility, Interoperability, and Reusability (FAIR)45, we have made the tools used to generate the data for this study publicly available38. Specifically, the CaPTk platform22,23,24, version 1.8.1, was used for all the preprocessing steps. CaPTk’s source code and binary executables are publicly available for multiple operating systems through its official GitHub repository (https://github.com/CBICA/CaPTk). The implementation and configuration of the U-Net with residual connections used in this study can be found on the GitHub page of the Generally Nuanced Deep Learning Framework (GaNDLF), version 0.0.14 (https://github.com/CBICA/GaNDLF). Finally, ITK-SNAP33 was used for all the manual annotation refinements.
References
Polyak, K. et al. Heterogeneity in breast cancer. The Journal of clinical investigation 121, 3786–3788 (2011).
Marusyk, A. & Polyak, K. Tumor heterogeneity: causes and consequences. Biochimica et Biophysica Acta (BBA)-Reviews on Cancer 1805, 105–117 (2010).
Gavenonis, S. C. & Roth, S. O. Role of magnetic resonance imaging in evaluating the extent of disease. Magnetic Resonance Imaging Clinics 18, 199–206 (2010).
Weinstein, S. & Rosen, M. Breast mr imaging: current indications and advanced imaging techniques. Radiologic Clinics 48, 1013–1042 (2010).
Gillies, R. J., Kinahan, P. E. & Hricak, H. Radiomics: images are more than pictures, they are data. Radiology 278, 563–577 (2016).
McNitt-Gray, M. et al. Standardization in quantitative imaging: a multicenter comparison of radiomic features from different software packages on digital reference objects and patient data sets. Tomography 6, 118–128 (2020).
Zwanenburg, A. et al. The image biomarker standardization initiative: standardized quantitative radiomics for high-throughput image-based phenotyping. Radiology 295, 328–338 (2020).
Valdora, F., Houssami, N., Rossi, F., Calabrese, M. & Tagliafico, A. S. Rapid review: radiomics and breast cancer. Breast cancer research and treatment 169, 217–229 (2018).
Clark, K. et al. The cancer imaging archive (tcia): maintaining and operating a public information repository. Journal of digital imaging 26, 1045–1057 (2013).
Saha, A. et al. A machine learning approach to radiogenomics of breast cancer: a study of 922 subjects and 529 dce-mri features. British journal of cancer 119, 508–516 (2018).
Saha, A. et al. Dynamic contrast-enhanced magnetic resonance images of breast cancer patients with tumor locations. The Cancer Imaging Archive. https://doi.org/10.7937/TCIA.e3sv-re93 (2021).
Lehman, C. et al. Acrin trial 6667 investigators group. mri evaluation of the contralateral breast in women with recently diagnosed breast cancer. N Engl J Med 356, 1295–303 (2007).
Kinahan, P., Muzi, M., Bialecki, B., Herman, B. & Coombs, L. Acrin-contralateral-breast-mr (acrin 6667). The Cancer Imaging Archive. https://doi.org/10.7937/Q1EE-J082 (2021).
Castaldo, R., Pane, K., Nicolai, E., Salvatore, M. & Franzese, M. The impact of normalization approaches to automatically detect radiogenomic phenotypes characterizing breast cancer receptors status. Cancers 12, 518 (2020).
Pati, S. et al. Reproducibility analysis of multi-institutional paired expert annotations and radiomic features of the ivy glioblastoma atlas project (ivy gap) dataset. Medical Physics 47, 6039–6052 (2020).
Saint Martin, M.-J. et al. A radiomics pipeline dedicated to breast mri: validation on a multi-scanner phantom study. Magnetic Resonance Materials in Physics, Biology and Medicine 34, 355–366 (2021).
Newitt, D. et al. Multi-center breast dce-mri data and segmentations from patients in the i-spy 1/acrin 6657 trials. The Cancer Imaging Archive. https://doi.org/10.7937/K9/TCIA.2016.HdHpgJLK (2016).
Hylton, N. M. et al. Neoadjuvant chemotherapy for breast cancer: functional tumor volume by mr imaging predicts recurrence-free survival—results from the acrin 6657/calgb 150007 i-spy 1 trial. Radiology 279, 44–55 (2016).
Hylton, N. M. Vascularity assessment of breast lesions with gadolinium-enhanced mr imaging. Magnetic resonance imaging clinics of North America 7, 411–20 (1999).
Chitalia, R. et al. Radiomic tumor phenotypes can augment molecular profiling in predicting survival after breast neoadjuvant chemotherapy: Results from acrin 6657/i-spy 1. Under review (2021).
Chitalia, R. D. et al. Imaging phenotypes of breast cancer heterogeneity in preoperative breast dynamic contrast enhanced magnetic resonance imaging (dce-mri) scans predict 10-year recurrence. Clinical Cancer Research 26, 862–869 (2020).
Davatzikos, C. et al. Cancer imaging phenomics toolkit: quantitative imaging analytics for precision diagnostics and predictive modeling of clinical outcome. Journal of medical imaging 5, 011018 (2018).
Pati, S. et al. The cancer imaging phenomics toolkit (captk): Technical overview. In International MICCAI Brainlesion Workshop, 380–394 (Springer, 2019).
Rathore, S. et al. Brain cancer imaging phenomics toolkit (brain-captk): an interactive platform for quantitative analysis of glioblastoma. In International MICCAI Brainlesion Workshop, 133–145 (Springer, 2017).
Cox, R. et al. A (sort of) new image data format standard: Nifti-1: We 150. Neuroimage 22 (2004).
Sled, J. G., Zijdenbos, A. P. & Evans, A. C. A nonparametric method for automatic correction of intensity nonuniformity in mri data. IEEE transactions on medical imaging 17, 87–97 (1998).
Tustison, N. J. et al. N4itk: improved n3 bias correction. IEEE transactions on medical imaging 29, 1310–1320 (2010).
Al Shalabi, L. & Shaaban, Z. Normalization as a preprocessing engine for data mining and the approach of preference matrix. In 2006 International conference on dependability of computer systems, 207–214 (IEEE, 2006).
Ribaric, S. & Fratric, I. Experimental evaluation of matching-score normalization techniques on different multimodal biometric systems. In MELECON 2006-2006 IEEE Mediterranean Electrotechnical Conference, 498–501 (IEEE, 2006).
Bakas, S. et al. Identifying the best machine learning algorithms for brain tumor segmentation, progression assessment, and overall survival prediction in the brats challenge. arXiv preprint arXiv:1811.02629 (2018).
Abdi, H., et al. Normalizing data. Encyclopedia of research design 1 (2010).
Jafri, N. F. et al. Optimized breast mri functional tumor volume as a biomarker of recurrence-free survival following neoadjuvant chemotherapy. Journal of Magnetic Resonance Imaging 40, 476–482 (2014).
Yushkevich, P. A. et al. User-guided 3D active contour segmentation of anatomical structures: Significantly improved efficiency and reliability. Neuroimage 31, 1116–1128 (2006).
Ronneberger, O., Fischer, P. & Brox, T. U-net: Convolutional networks for biomedical image segmentation. In International Conference on Medical image computing and computer-assisted intervention, 234–241 (Springer, 2015).
Thakur, S. et al. Brain extraction on mri scans in presence of diffuse glioma: Multi-institutional performance evaluation of deep learning methods and robust modality-agnostic training. NeuroImage 220, 117081 (2020).
Zijdenbos, A. P., Dawant, B. M., Margolin, R. A. & Palmer, A. C. Morphometric analysis of white matter lesions in mr images: method and validation. IEEE transactions on medical imaging 13, 716–724 (1994).
Sudre, C. H., Li, W., Vercauteren, T., Ourselin, S. & Cardoso, M. J. Generalised dice overlap as a deep learning loss function for highly unbalanced segmentations. In Deep learning in medical image analysis and multimodal learning for clinical decision support, 240–248 (Springer, 2017).
Pati, S. et al. Gandlf: A generally nuanced deep learning framework for scalable end-to-end clinical workflows in medical imaging. arXiv preprint arXiv:2103.01006 (2021).
Macyszyn, L. et al. Imaging patterns predict patient survival and molecular subtype in glioblastoma via machine learning techniques. Neuro-oncology 18, 417–425 (2015).
Bakas, S. et al. Advancing the cancer genome atlas glioma mri collections with expert segmentation labels and radiomic features. Scientific data 4, 170117 (2017).
Fathi Kazerooni, A. et al. Cancer imaging phenomics via captk: Multi-institutional prediction of progression-free survival and pattern of recurrence in glioblastoma. JCO Clinical Cancer Informatics 4, 234–244 (2020).
Bakas, S. et al. Integrative radiomic analysis for pre-surgical prognostic stratification of glioblastoma patients: from advanced to basic mri protocols. In Medical Imaging 2020: Image-Guided Procedures, Robotic Interventions, and Modeling, vol. 11315, 113151S (International Society for Optics and Photonics, 2020).
Thakur, S. P. et al. Skull-stripping of glioblastoma mri scans using 3d deep learning. In International MICCAI Brainlesion Workshop, 57–68 (Springer, 2019).
Chitalia, R. et al. Expert tumor annotations and radiomic features for the ispy1/acrin 6657 trial data collection. The Cancer Imaging Archive. https://doi.org/10.7937/TCIA.XC7A-QT20 (2022).
Wilkinson, M. D. et al. The fair guiding principles for scientific data management and stewardship. Scientific data 3, 1–9 (2016).
Acknowledgements
Research reported in this publication was partly supported by the National Cancer Institute (NCI) of the National Institutes of Health (NIH), under award numbers U01CA242871, U24CA189523, U01CA151235, R01CA197000, and R01CA132870. The content of this publication is solely the responsibility of the authors and does not represent the official views of the NIH.
Author information
Contributions
D.K. and S.B. conceived the experiment(s). R.C., S.P., M.B., S.T., N.J., V.B. and E.M. conducted the experiment(s). R.C., S.P. and M.B. analysed the results. R.C. and S.P. wrote the first version of the manuscript. J.G., D.N. and N.H. provided the data. All authors reviewed, edited, and approved the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Chitalia, R., Pati, S., Bhalerao, M. et al. Expert tumor annotations and radiomics for locally advanced breast cancer in DCE-MRI for ACRIN 6657/I-SPY1. Sci Data 9, 440 (2022). https://doi.org/10.1038/s41597-022-01555-4
DOI: https://doi.org/10.1038/s41597-022-01555-4