ISLES 2022: A multi-center magnetic resonance imaging stroke lesion segmentation dataset

Magnetic resonance imaging (MRI) is an important imaging modality in stroke. Computer based automated medical image processing is increasingly finding its way into clinical routine. The Ischemic Stroke Lesion Segmentation (ISLES) challenge is a continuous effort to develop and identify benchmark methods for acute and sub-acute ischemic stroke lesion segmentation. Here we introduce an expert-annotated, multicenter MRI dataset for segmentation of acute to subacute stroke lesions (10.5281/zenodo.7153326). This dataset comprises 400 multi-vendor MRI cases with high variability in stroke lesion size, quantity and location. It is split into a training dataset of n = 250 and a test dataset of n = 150. All training data is publicly available. The test dataset will be used for model validation only and will not be released to the public. This dataset serves as the foundation of the ISLES 2022 challenge (https://www.isles-challenge.org/) with the goal of finding algorithmic methods to enable the development and benchmarking of automatic, robust and accurate segmentation methods for ischemic stroke.


Background & Summary
Stroke is a leading cause of morbidity and mortality worldwide [23].Up to two thirds of stroke survivors suffer permanent disability [13].In the last decade, the advent of endovascular reperfusion therapy has significantly improved stroke outcome in patients with large vessel occlusions [5,15,19,34].Image-based guidance of revascularization treatment decisions has further improved patient outcome for computer tomography (CT) [1,6,25] and magnetic resonance imaging (MRI) [17,41].Computer aided image analysis, especially for CT perfusion data has already found entry into clinical routine in many centers and is recommended by national guidelines [32] to aid decision making regarding reperfusion therapy [30,44,28,31].Deep learning approaches have been shown to facilitate clinical interpretation of CT perfusion data [8,16,33,10,11].Segmentation based volumetric analyses of stroke lesions in magnetic resonance imaging (MRI) are often performed for research purposes and have been shown to predict ischemia outcome [27,47,14].However, stroke lesion segmentations are usually painstakingly performed by hand and the quality of annotations are heavily dependent on preexisting neuroimaging experience of the rater and the total time and effort invested.The time-consuming nature of this task prevents the regular use of segmentations during clinical routine, which is further impeded by the high inter-observer variability.Automated annotation of stroke lesions could be used in clinical routine to guide therapeutic decisions in an acute setting and to predict outcome at the subacute to chronic stage.Stroke lesion segmentation could also be used to automatically classify stroke etiology in post-stroke MRI.
The first Ischemic Stroke Lesion Segmentation (ISLES) challenge, which took place in 2015, was split into two sub-challenges: Sub-acute Stroke Lesion Segmentation (SISS) and Stroke Perfusion Estimation.The goal of SISS (with a total of 64 cases for training and testing) was to segment subacute stroke lesions using conventional post-stroke MRI sequences, including T2 and T1 weighted imaging, fluid attenuated inversion recover (FLAIR), and diffusion weighted imaging (DWI) [26].The ISLES challenge 2018, its second iteration, was set up to predict infarct core in DWI using CT perfusion data [16].Both ISLES events received major attention from the research community: there were 120 database downloads until the ISLES15 challenge day with 14 participating teams, and the number of participating teams was roughly duplicated in the latest ISLES'18 edition.The ISLES'15 and ISLES'18 challenges played a crucial role in identifying prominent methods for acute and sub-acute ischemic stroke lesion segmenta-tion.These datasets have since served as important benchmarks for the scientific community.
Based on the experience gained from these previous editions, ISLES '22 aims to benchmark acute and sub-acute ischemic stroke MRI segmentation using 400 cases.This MICCAI 2022 challenge edition is organized with the task of DWI infarct segmentation in acute and sub-acute stroke.ISLES'22 differs in several ways from the previous challenge editions in ischemic stroke by: 1) targeting the delineation of not only large infarct lesions, but also of multiple embolic and/or cortical infarcts (typically seen after mechanical recanalization), and 2) by evaluating both pre-and post-interventional MRI images.Altogether, ISLES'22 provides a considerably larger dataset than before (more than 5x the caseload than in the ISLES'15 edition).For detailed information about the ISLES'22 challenge event, readers are referred to [9].This challenge is held together with the ATLAS challenge (https://atlas.grand-challenge.org/[22] using the web-based platforms http://www.isles-challenge.org/and https://isles22.grand-challenge.org/. From a clinical perspective, ISLES'22 focuses on the clinical growing interest of acute embolic infarct patterns, both pre-intervention (i.e. at a very early disease state) and post-intervention (typical, post-interventional sub-acute infarct patterns not restricted to a single vessel territory).This clinical problem also challenges the participants from a technical perspective: teams will deal with a wider ischemic stroke disease spectrum, involving variable lesion size and burden, more complex infarct patterns and variable anatomically located lesions in data from multiple centers.The diversity of the ISLES'22 dataset will provide a unique challenge for participants.

Ethical Statement
This retrospective evaluation of imaging data was approved by the local ethics boards of all participating centers.

Subject selection
Inclusion criteria for the dataset: Subjects 18 years or older who had received MR imaging of the brain for previously diagnosed or suspected stroke were included in this study.The imaging protocol required at least a FLAIR and DWI sequence.DWI consists of a trace image at a b-value up to 1000 s/mm 2 as a well as its corresponding apparent diffusion coefficient (ADC) map.Image acquisition was performed on one of the following devices: 3T Philips MRI scanners (Achieva, Ingenia), 3T Siemens MRI scanner (Verio) or 1.5T Siemens MAGNE-TOM MRI scanners (Avanto, Aera).MRIs included were intentionally chosen to be heterogeneous to ensure the best possible training and generalization of the algorithms.See Table 1 for an overview of MRI acquisition parameters.Images were obtained by healthcare professionals as part of the clinical imaging routine for stroke patients at three different stroke centers.Care was taken to select a broad spectrum of infarct patterns.All vascular territories were included at a similar rate.A large subset of patients with posterior circulation ischemia (e.g.due to basilar artery occlusion) was included in this study.Due to a high degree of closely situated scull-base artefacts in DWI, infratentorial ischemias are difficult to segment for unexperienced raters and are frequently overlooked by segmentation algorithms with little training exposure to posterior circulation ischemia.These additional labeling challenges lead us to include more cases of posterior circulation ischemia than would be expected if case selection were truly random.Figure 1 shows sample cases portraying the spectrum of ischemia included in this dataset.
Segmentation difficulties also frequently arise in cases with large amounts of punctiform infarcts.For human raters, to adequately capture the entire ischemic territory, a time-consuming segmentation effort is required.Similarly, machine trained algorithms frequently fail to capture all the small affected regions.In n = 5 patients, where MR imaging was acquired for suspected stroke, no ischemia was found.We chose to include this small subset to diversify the dataset.Table 2 gives an overview of infarct volumes per scan and per lesion, as well as the number of unconnected ischemias per scan in the entire dataset.The number of unconnected ischemias has been calculated using the python library cc3d [40].In the hyper-acute phase of ischemic stroke, up to 4.5 hours post onset, restricted diffusion is present (high signal on DWI and low signal on ADC) while the FLAIR in the affected parenchyma remains without changes.This imaging phenomenon is called a FLAIR-DWI mismatch and is used in clinical practice to estimate the time window in patients where the time of onset is unknown.An accurate estimate of the time of onset is crucial to make decisions regarding revascularization treatment [41].In acute ischemic stroke, following the hyperacute phase and usually defined in literature as 0 to 7 days from onset, DWI and FLAIR show a high signal with reduced ADC values in the affected brain parenchyma.In the subacute stage, between 1 to 3 weeks post onset, high DWI signal begins to diminish while ADC first normalizes to values of healthy brain tissue, a phenomenon frequently referred to as pseudonormalization.FLAIR signal remains high.In the chronic stage, beginning 3 weeks after onset, DWI signal is variable but usually iso-to hypo-intense depending on underlying T2 signal, while ADC values are high [2,21,42,43,24,35,29,38,45,4].MRIs with late acquisition post stroke (> 1 week) often lead to a decreased DWI signal intensity for ischemic brain parenchyma.A lower signal intensity in DWI leads to lower MRI sensitivity for stroke and segmentation difficulty for both human and machine raters.In these cases, it is especially difficult to adequately annotate the border between ischemia and healthy brain tissue.This dataset includes cases of MRIs in various stage of sub-acute stroke from multiple previous studies to find machine learning solutions to this frequent issue in stroke lesion segmentation [36,37].

Ground truth stroke lesion segmentation
A hybrid human-algorithm annotation scheme was applied.First, the MR input data was anonymized by conversion to Neuroimaging Informatics Technology Initiative (NIfTI) format (https://nifti.nimh.nih.gov/nifti-1),according to the Brain Imaging Data Structure (BIDS) convention (https://bids.neuroimaging.io/).As part of pre-processing, DWI and its corresponding ADC map were resliced using ANTs [3] to an axial isotropic voxel size of 2×2 mm 2 .This was performed only for the cases of Center #1 (see below), as this represents the original acquisition voxel size.The slice thickness remained as acquired.FLAIR data remained as exported.A 3D UNet [7] was trained on DWI data that was pre-annotated for other research projects by using a single MRI modality (B=1000 DWI).This algorithm was trained in-house with labeled data from University Hospital of Munich.Later, this algorithm pre-segmented all scans intended for later release.All algorithmic segmentations were then checked and edited by medical students with special stroke lesion segmentation training.Their annotations were in turn revised by a neuroradiology resident.Due to inadequate performance of the primary pre-segmentation algorithm, largely due to the reasons mentioned above, the original training data was critically revised and a large part was deemed inadequate and discarded.A new pre-segmentation algorithm was trained on annotations deemed correct or edited by the neuroradiologist in training.This resulted in more accurate predictions and a lesser effort of correcting annotations by medical students.All medical-student edited annotations were critically revised by the neuroradiologist in training and a large part of the final data set were reviewed by one out of three attending neuroradiologists, all of them with more than 10 years of experience in stroke imaging.Manual stroke lesion segmentations were performed using ITK-Snap or 3D Slicer open-source tools for brain imaging visualization and segmentation [46,12].Figure 2 shows an overview of the annotation workflow followed in preparation of the release of the dataset.

Post-annotation Data Pre-processing
In preparation for data-set release and in accordance to the ethical approval obtained for this challenge, the imaging data was irreversibly anonymized.For this de-identification of patients, brain-extraction was performed mask-based using the HD-BET algorithm [18] completion of the annotation process.FLAIR to DWI rigid registration was performed using Elastix [20,39] and subsequent skull-stripping of DWI and ADC using the registered brain mask was performed.After skull stripping for de-identification, all imaging sequences were returned to their native space before data release.

Data Records
Data structure and file formats.All medical imaging files were exported from the Picture Archiving and Communication System in the NIfTI format.Segmentation masks are also saved in NIfTI format.All data in the ISLES ' The train set comprises data from centers #1 and #2.The test set comprises data from all the three centers in equal parts as follows: Acute to early sub-acute stroke data from centers #1 and #3 (MRIs acquired after revascularization therapy).Hyper-acute to acute stroke data from center #2 (MRIs acquired before revascularization therapy).Thus, in this ISLES'22 task we will evaluate the robustness and generalization capability of the proposed models over 1) new scans coming from two centers already used at training stage, 2) new scans coming from a new (unseen at training stage) center, and 3) new scans acquired before revascularization therapy, coming from a center already seen at training stage.The split between training and test data set will be done so that both sets will include a similar variance of stroke lesion patterns ranging from large territorial infarcts to small punctiform ischemia.

Technical Validation
The presented medical imaging data was derived from the picture archiving and communication system of the corresponding institution and therefore fully complies with the legal standards and quality controls for the acquisition of medical imaging in Germany, European Union and Switzerland, as well as the industrial standards of the scanner vendors.Segmentation masks were prepared and annotated at voxel-level by a human-machine hybrid algorithm with hierarchical manual checks and corrections first by specifically trained medical students and later a neuroradiology resident.Afterwards, the masks were reviewed, corrected, and finally approved by an expert neuroradiologist.

Data and Code Availability
The ISLES 2022 data can be downloaded by following the steps described at https://isles22.grand-challenge.org/dataset/.This data is published under the creative commons license CC BY-SA 4.0.Upon reasonable request, respecting patient confidentiality and ethical standards, special arrangements for use and re-distribution can be made directly with the authors.In order to facilitate future users of this dataset to get familiarized with the images, we have released the following ISLES 2022 Github repository: https://github.com/ezequieldlrosa/isles22.The repository contains scripts to read the images, visualize them, and to quantify the algorithmic results performance with the same metrics used in the challenge to rank participants.

Fig. 1 .
Fig. 1.Exemplary 3D Snapshots through the ischemic center of mass.Axial FLAIR and DWI images are displayed in the top leftmost columns.The 3rd column from the left shows the maximum intensity projection (MIP) of the mask (Msk).The three rightmost columns show DWI with mask overlay (without MIP) in the three anatomical planes.Top row: Larger left-sided infarct with bilateral punctiform, likely embolic satellite ischemias.2 nd row: Infarct of the entire right sided middle cerebral artery territory.3 rd row: Bilateral cerebellar and occipital infarcts after posterior circulation ischemia.Bottom row: Bilateral punctiform infarcts, likely resulting from multiple micro-embolic occlusions.

Fig. 2 .
Fig. 2. Workflow of the hybrid human-algorithm stroke lesion segmentation applied in this dataset.

Table 1 .
Overview of MRI imaging parameters.Range of values [Min-Max] are shown.TR: Repetition time; TE: Echo time; TI: Inversion time.

Table 2 .
Overview of MRI imaging parameters.Range of values are shown.TR: Repetition time; TE: Echo time; TI: Inversion time.
22 dataset was separated into a training dataset (250 subjects) and a test dataset (150 subjects).Corresponding scanner metadata from the Digital Imaging and Communications in Medicine (DICOM) header in the JSON file format is provided with the datasets if available.MRIs from the following centers were included: * Center #1: University Hospital of the Technical University Munich, Munich, Germany.* Center #2: University Hospital of Bern, Bern, Switzerland.* Center #3: University Medical Center Hamburg-Eppendorf, Hamburg, Germany.