Stability of radiomics features in apparent diffusion coefficient maps from a multi-centre test-retest trial

Quantitative radiomics features, extracted from medical images, characterize tumour-phenotypes and have been shown to provide prognostic value in predicting clinical outcomes. Stability of radiomics features extracted from apparent diffusion coefficient (ADC)-maps is essential for reliable correlation with the underlying pathology and its clinical applications. Within a multicentre, multi-vendor trial we established a method to analyse radiomics features from ADC-maps of ovarian (n = 12), lung (n = 19), and colorectal liver metastasis (n = 30) cancer patients who underwent repeated (<7 days) diffusion-weighted imaging at 1.5 T and 3 T. From these ADC-maps, 1322 features describing tumour shape, texture and intensity were retrospectively extracted and stable features were selected using the concordance correlation coefficient (CCC > 0.85). Although some features were tissue- and/or respiratory motion-specific, 122 features were stable for all tumour-entities. A large proportion of features were stable across different vendors and field strengths. By extracting stable phenotypic features, fitting-dimensionality is reduced and reliable prognostic models can be created, paving the way for clinical implementation of ADC-based radiomics.
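
The stability criterion above (CCC > 0.85) can be illustrated with a minimal sketch of Lin's concordance correlation coefficient for one feature measured at test and retest. The function names `concordance_correlation` and `is_stable` are hypothetical, and the use of population (1/n) moments is an assumption; the source does not state which estimator was used.

```python
from statistics import fmean


def concordance_correlation(test, retest):
    """Lin's concordance correlation coefficient for paired measurements."""
    if len(test) != len(retest) or len(test) < 2:
        raise ValueError("need two equally long series with at least 2 values")
    n = len(test)
    mean_t, mean_r = fmean(test), fmean(retest)
    # Population (1/n) moments: an assumption, see the lead-in.
    var_t = sum((t - mean_t) ** 2 for t in test) / n
    var_r = sum((r - mean_r) ** 2 for r in retest) / n
    cov = sum((t - mean_t) * (r - mean_r) for t, r in zip(test, retest)) / n
    return 2 * cov / (var_t + var_r + (mean_t - mean_r) ** 2)


def is_stable(test, retest, threshold=0.85):
    """Apply the CCC > threshold selection rule to one feature."""
    return concordance_correlation(test, retest) > threshold
```

Unlike the Pearson correlation, the CCC penalises a systematic offset between the two scans: a retest series that is perfectly correlated with the test series but shifted by a constant scores below 1.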

Given the FD processed image , with elements: Where is the mean of .

First-order Grey-level statistics
First-order grey-level statistics describe the distribution of grey values within the volume. Let $X$ denote the three-dimensional image matrix with $N$ voxels, $P$ the first-order histogram, $P(i)$ the fraction of voxels with intensity level $i$ and $N_l$ the number of discrete intensity levels.

Energy
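Energy is also known as the sum of squares: $\text{energy} = \sum_{i=1}^{N} X(i)^2$, where $X(i)$ is the grey value of voxel $i$ and $N$ the number of voxels.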

Entropy
Note: Defined by IBSI as Intensity Histogram Entropy.

Kurtosis
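$\text{kurtosis} = \frac{\frac{1}{N}\sum_{i=1}^{N}\left(X(i)-\bar{X}\right)^4}{\left(\frac{1}{N}\sum_{i=1}^{N}\left(X(i)-\bar{X}\right)^2\right)^2}$, where $\bar{X}$ is the mean of $X$. Note: The IBSI feature definition implements excess kurtosis, where kurtosis is corrected by -3, yielding 0 for normal distributions. The kurtosis presented above is not corrected, yielding a value 3 higher than the IBSI kurtosis.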
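
The relation between the uncorrected kurtosis and the IBSI excess kurtosis can be sketched as follows. This is an illustration only: the function name `kurtosis` is hypothetical, and the 1/N normalisation of the central moments is an assumption.

```python
def kurtosis(values, excess=False):
    """Fourth standardised moment; excess=True applies the -3 correction."""
    n = len(values)
    mean = sum(values) / n
    # Second and fourth central moments with 1/N normalisation (assumed).
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    uncorrected = m4 / m2 ** 2
    return uncorrected - 3.0 if excess else uncorrected
```

For any input, `kurtosis(v)` and `kurtosis(v, excess=True)` differ by exactly 3, which is the offset described in the note above.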

Maximum
The maximum grey value of $X$: $\text{maximum} = \max(X)$.

Mean
The mean grey value of $X$: $\bar{X} = \frac{1}{N}\sum_{i=1}^{N} X(i)$.
The mean ADC value of the entire cohort was calculated over the mean of each lesion.

Mean absolute deviation
The mean of the absolute deviations of all voxel intensities around the mean intensity value: $\text{MAD} = \frac{1}{N}\sum_{i=1}^{N}\left|X(i)-\bar{X}\right|$, where $\bar{X}$ is the mean of $X$.

Median
The sample median of $X$, or the 50th percentile of $X$.
The median ADC value of the entire cohort was calculated over the mean of each lesion.

Minimum
The minimum intensity value of $X$.

Range
The range of intensity values of $X$, i.e. the maximum minus the minimum grey value.

Root mean square (RMS)
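The quadratic mean, or the square root of the mean of squares of all voxel intensities: $\text{RMS} = \sqrt{\frac{1}{N}\sum_{i=1}^{N} X(i)^2}$.

Standard deviation *
The standard deviation measures the dispersion of the grey values around the mean $\bar{X}$ of $X$.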

Robust mean absolute deviation
The mean absolute deviation of only those voxels in $X$ with a grey value between the 10th and 90th percentile.

10th percentile
The 10th percentile of $X$, a robust alternative to the minimum grey value.
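
As an illustration of why the percentile-restricted statistic is robust, the following sketch computes the robust mean absolute deviation. The function name is hypothetical. Two choices are assumptions not stated in the source: percentiles are obtained by linear interpolation (`method="inclusive"`), and deviations are taken around the mean of the retained voxels.

```python
from statistics import fmean, quantiles


def robust_mean_absolute_deviation(values):
    """Mean absolute deviation of the values between the 10th and 90th percentile."""
    # Decile cut points; the first is the 10th and the last the 90th percentile.
    deciles = quantiles(values, n=10, method="inclusive")
    p10, p90 = deciles[0], deciles[-1]
    kept = [v for v in values if p10 <= v <= p90]
    centre = fmean(kept)  # mean of the retained voxels (assumed)
    return fmean([abs(v - centre) for v in kept])
```

A single extreme voxel falls outside the 10th to 90th percentile range and therefore does not change the result, whereas it would dominate the ordinary mean absolute deviation.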

90th percentile
The 90th percentile of $X$, a robust alternative to the maximum grey value.

Interquartile range
The interquartile range is defined as the 75th minus the 25th percentile of $X$.

Uniformity
Note: Defined by IBSI as Intensity Histogram Uniformity.

Variance
Variance measures the spread of the grey values around the mean $\bar{X}$ of $X$ and is the square of the standard deviation.

Intensity histogram features
Intensity histogram features describe the distribution of grey values within the volume after discretization into intensity level bins. Let $X_d$ denote the discretized intensity levels of the voxels in the volume.

Coefficient of variance (cov)
$\text{cov} = \frac{\text{standard deviation}}{\text{mean}}$

Energy
Energy is also known as the sum of squares.

Kurtosis
Kurtosis describes the peakedness of the distribution of $X_d$ about its mean $\bar{X}_d$.

Maximum
The maximum discretized intensity value of $X_d$.

Maximum histogram gradient intensity level (maxgradi)
The discretized intensity level corresponding to the maximum histogram gradient.

Mean
The mean discretized intensity value of $X_d$.

Mean absolute deviation (meand)
The mean of the absolute deviations of all discretized intensity levels around the mean of $X_d$: $\text{meand} = \frac{1}{N}\sum_{i=1}^{N}\left|X_d(i)-\bar{X}_d\right|$, where $\bar{X}_d$ is the mean of $X_d$ and $N$ the number of voxels.

Median
The sample median of $X_d$, or the 50th percentile of $X_d$.

Median absolute deviation (mediand)
The dispersion of the discretized intensity levels around the median of $X_d$.

Minimum
The minimum discretized intensity value of $X_d$: $\min(X_d)$.

Minimum histogram gradient intensity level (mingradi)
The discretized intensity level corresponding to the minimum histogram gradient.
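
The two histogram gradient features can be sketched as below. The source does not specify how the gradient is computed; this sketch assumes central differences for interior bins and one-sided differences at the two ends of the histogram, with intensity levels numbered from 1. The function name is hypothetical.

```python
def histogram_gradient_levels(counts):
    """Return (maxgradi, mingradi) for bin counts of levels 1..len(counts)."""
    n = len(counts)
    if n < 2:
        raise ValueError("need at least two bins")
    gradient = []
    for k in range(n):
        if k == 0:
            g = counts[1] - counts[0]  # forward difference at first bin
        elif k == n - 1:
            g = counts[-1] - counts[-2]  # backward difference at last bin
        else:
            g = (counts[k + 1] - counts[k - 1]) / 2  # central difference
        gradient.append(g)
    # list.index returns the first match, i.e. the lowest level on ties.
    max_level = gradient.index(max(gradient)) + 1
    min_level = gradient.index(min(gradient)) + 1
    return max_level, min_level
```

For a histogram that rises to a single peak and then falls, the maximum gradient level lies on the rising flank and the minimum gradient level on the falling flank.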

Mode
The mode of $X_d$ is the most frequently occurring discretized intensity level present. In case multiple bins have the highest count, the mode is the smallest of those values.
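
The tie-breaking rule for the mode can be written down in a few lines. The function name `histogram_mode` is hypothetical; the input is assumed to be a flat list of discretized intensity levels.

```python
from collections import Counter


def histogram_mode(levels):
    """Most frequent discretized level; ties resolve to the smallest level."""
    counts = Counter(levels)
    highest = max(counts.values())
    return min(level for level, count in counts.items() if count == highest)
```

Resolving ties explicitly makes the feature deterministic, which matters when the same lesion is compared between test and retest scans.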

Uniformity
Note: Defined by IBSI as Intensity Histogram Uniformity.

Range
The range of bins in the histogram, i.e. the width of the histogram.

Robust mean absolute deviation (rmeand)
Similar to mean absolute deviation, but in this case only considering the set of intensity levels in the range between the 10th and 90th percentile of $X_d$.
Here $X_{10\text{-}90}$ represents the set of $N_{10\text{-}90}$ voxels in $X_d$ whose discretized intensity levels fall within the range from the 10th to the 90th percentile of $X_d$.

Skewness
Skewness describes the asymmetry of the distribution of $X_d$ about its mean $\bar{X}_d$.

10th percentile
The 10th percentile of $X_d$.

90th percentile
The 90th percentile of $X_d$.

Quartile coefficient of dispersion (qcod)
The quartile coefficient of dispersion is a robust alternative to the coefficient of variance.
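
A minimal sketch contrasting the two dispersion measures follows. The formula $(Q_3 - Q_1)/(Q_3 + Q_1)$ is the usual definition of the quartile coefficient of dispersion and is assumed here, as are the population standard deviation and linearly interpolated quartiles. Function names are hypothetical.

```python
from statistics import fmean, pstdev, quantiles


def coefficient_of_variance(levels):
    """Standard deviation divided by the mean (population form, assumed)."""
    return pstdev(levels) / fmean(levels)


def quartile_coefficient_of_dispersion(levels):
    """(Q3 - Q1) / (Q3 + Q1), using linearly interpolated quartiles."""
    q1, _, q3 = quantiles(levels, n=4, method="inclusive")
    return (q3 - q1) / (q3 + q1)
```

Because it depends only on the quartiles, the second measure is unaffected by a small number of extreme intensity levels.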

Local Intensity features *
Local Intensity (LocInt) features are defined based on local intensity values around a center voxel (2).

Local intensity peak
Mean intensity level in a 1 cm³ spherical volume, centered on the voxel with the maximum intensity level in the volume of interest. In case multiple voxels contain the maximum intensity level, the highest mean intensity level of all spherical volumes is used.

Global intensity peak
Similar to local intensity peak, but in this case the mean intensity level in a 1 cm³ spherical volume is calculated for every voxel in the volume of interest. The highest mean intensity level of all spherical volumes is selected as the global intensity peak feature.

Geometric features
Geometric features describe the shape and size of the volume of interest. Let $V$ be the volume and $A$ the surface area of the volume of interest. Let $N$ be the total number of voxels, $R = \{\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_N\}$ the set of $N$ Cartesian coordinate vectors and $I = \{I_1, I_2, \ldots, I_N\}$ the corresponding intensity values.

Asphericity

Centroid distance
The centroid distance is the Euclidean distance between the geometric centroid ($\mathbf{C}_{geo}$) and the centroid weighting each voxel by its intensity value ($\mathbf{C}_{int}$). The centroid distance is a measure of how close the high intensity values are to the geometric center.
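
The centroid distance can be sketched directly from its definition. The function name and the input layout (a list of coordinate tuples with a parallel list of intensities) are assumptions made for illustration.

```python
import math


def centroid_distance(coords, intensities):
    """Distance between the geometric and the intensity-weighted centroid."""
    n = len(coords)
    dims = len(coords[0])
    total = sum(intensities)
    geometric = [sum(c[d] for c in coords) / n for d in range(dims)]
    weighted = [
        sum(c[d] * w for c, w in zip(coords, intensities)) / total
        for d in range(dims)
    ]
    return math.dist(geometric, weighted)
```

With uniform intensities the two centroids coincide and the distance is zero; the value grows as the high intensities concentrate away from the geometric center.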

Compactness 1
Compactness is a measure of how much the volume resembles a sphere, as described by Aerts et al.

Maximum diameter
The maximum diameter is the largest pairwise distance between voxels on the surface of the volume, in 3D and for each plane separately. The following diameters are calculated:
a. The maximum three-dimensional tumor diameter.
b. The maximum two-dimensional diameter of all transversal planes.
c. The maximum two-dimensional diameter of all sagittal planes.
d. The maximum two-dimensional diameter of all coronal planes.

Major axis length
Axis lengths are measures of the extent of the volume along its three principal axes. Principal component analysis (PCA) on the x, y and z coordinates of all voxels within the volume is used to determine the three orthogonal eigenvectors and corresponding eigenvalues ($\lambda_{max}$, $\lambda_{mid}$, $\lambda_{min}$). The major axis length is the largest eigenvalue ($\lambda_{max}$) as determined by PCA.

Sphericity (4)
Sphericity is a measure of how much the volume resembles a sphere.
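
The formula for sphericity did not survive in the text above. The sketch below assumes the commonly used definition $\pi^{1/3}(6V)^{2/3}/A$, which equals 1 for a perfect sphere and is smaller for any other shape. The function name is hypothetical.

```python
import math


def sphericity(volume, surface_area):
    """pi^(1/3) * (6V)^(2/3) / A; equals 1 for a sphere (assumed definition)."""
    return math.pi ** (1 / 3) * (6 * volume) ** (2 / 3) / surface_area
```

For example, a unit cube ($V = 1$, $A = 6$) yields $(\pi/6)^{1/3} \approx 0.81$.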

Surface area
The surface area is calculated by triangulation (i.e. dividing the surface into connected triangles, which define the isosurface enclosing the volume) and is defined as: $A = \sum_{i=1}^{N_t} \frac{1}{2}\left|\mathbf{e}_{i,1} \times \mathbf{e}_{i,2}\right|$, where $N_t$ is the total number of triangles covering the surface and $\mathbf{e}_{i,1}$ and $\mathbf{e}_{i,2}$ are edge vectors of triangle $i$.

Surface to volume ratio
$\text{surface to volume ratio} = \frac{A}{V}$

Volume
The volume is defined as the number of voxels within the volume multiplied by the voxel volume: $V = N \cdot V_{voxel}$, where $V_{voxel}$ is the volume of a single voxel. Note: In the IBSI feature definitions, a more precise approximation of the volume is used. That method uses tetrahedrons consisting of the origin and faces in the ROI. Although the method implemented here overestimates the volume, especially in small volumes, the difference will be negligible in large ROIs.

Grey-Level Co-Occurrence Matrix based features
Grey-level co-occurrence matrix (GLCM) based features, as originally described by Haralick et al. (5). A normalized GLCM is defined as $P(i,j;\delta,\alpha)$, a matrix of size $N_g \times N_g$ describing the second-order joint probability function of an image, where the $(i,j)$th element represents the number of times the combination of intensity levels $i$ and $j$ occurs in two pixels in the image that are separated by a distance of $\delta$ pixels in direction $\alpha$, and $N_g$ is the maximum discrete intensity level in the image. Let $P(i,j)$ be the normalized (i.e. $\sum_{i,j} P(i,j) = 1$) co-occurrence matrix, generalized for any $\delta$ and $\alpha$. Note that for a symmetrical GLCM, $\mu = \mu_x = \mu_y$, where $\mu_x$ and $\mu_y$ are the means of the row and column marginal distributions.
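
The construction of a normalized GLCM can be sketched for a small 2D image. This is an illustration under stated assumptions: intensity levels are integers from 1 to `n_levels`, the distance and direction are encoded together as a (row, column) `offset`, and the function name is hypothetical.

```python
def normalized_glcm(image, n_levels, offset=(0, 1), symmetric=True):
    """Normalized co-occurrence matrix of a 2D image with levels 1..n_levels."""
    counts = [[0] * n_levels for _ in range(n_levels)]
    rows, cols = len(image), len(image[0])
    d_row, d_col = offset
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + d_row, c + d_col
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c] - 1, image[r2][c2] - 1
                counts[i][j] += 1
                if symmetric:
                    counts[j][i] += 1  # count the pair in both directions
    total = sum(map(sum, counts))
    if total == 0:
        raise ValueError("offset leaves no pixel pairs inside the image")
    return [[value / total for value in row] for row in counts]
```

With `symmetric=True` each pair is counted in both directions, so the matrix equals its transpose and the row and column marginals coincide.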

Cluster Tendency

Energy
This feature is also called Angular Second Moment (ASM) and Uniformity (6).

Entropy (H)

Homogeneity 1
This feature is also called Inverse Difference (6).

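Maximal Correlation Coefficient
The maximal correlation coefficient is derived from the second largest eigenvalue of the matrix $Q$ as defined by Haralick et al. (5).

Sum average (SA)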

Grey-Level Run-Length matrix based features
Grey-level run-length matrix (GLRLM) based features, as described by Galloway et al. (8). Run length metrics quantify grey level runs in an image. A grey level run is defined as the length, in number of pixels, of consecutive pixels that have the same grey level value. In a grey level run length matrix $p(i,j|\theta)$, the $(i,j)$th element describes the number of times $j$ a grey level $i$ appears consecutively in the direction specified by $\theta$. Let $p(i,j)$ be the $(i,j)$th entry in the given run-length matrix $p$, generalized for any direction $\theta$.
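
A run-length matrix for a single direction can be sketched as follows. The example is restricted to horizontal runs in a 2D image with integer levels from 1 to `n_levels`; other directions and 3D volumes are not covered, and the function name is hypothetical.

```python
def run_length_matrix(image, n_levels):
    """p[i-1][j-1] = number of horizontal runs of level i with length j."""
    max_run = len(image[0])
    p = [[0] * max_run for _ in range(n_levels)]
    for row in image:
        run_level, run_length = row[0], 1
        for level in row[1:]:
            if level == run_level:
                run_length += 1
            else:
                # The run ended: record it and start a new one.
                p[run_level - 1][run_length - 1] += 1
                run_level, run_length = level, 1
        p[run_level - 1][run_length - 1] += 1  # last run of the row
    return p
```

Homogeneous images produce few, long runs (mass in the right-hand columns), while heterogeneous images produce many short runs.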

Grey-Level size-zone matrix based features
Grey-level size-zone matrix (GLSZM) based features, as described by Thibault et al. (10,11). A grey level size-zone matrix describes the number of homogeneous connected areas within the volume, of a certain size and intensity. The $(i,j)$th entry of the GLSZM $p(i,j)$ is the number of connected areas of grey level (i.e. intensity value) $i$ and size $j$. GLSZM features therefore describe homogeneous areas within the tumor volume, describing tumor heterogeneity at a regional scale (12). Let $p(i,j)$ be the $(i,j)$th entry in the given GLSZM $p$.
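
The zone counting behind the GLSZM can be sketched with a breadth-first search over connected pixels. The sketch works on a 2D image and assumes 8-connectedness, which is not stated in the text above; it returns the non-zero entries of the matrix as a mapping from (grey level, zone size) to the number of zones. The function name is hypothetical.

```python
from collections import Counter, deque

# All eight neighbours of a pixel (8-connectedness is an assumption).
NEIGHBOURS = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1) if (dr, dc) != (0, 0)]


def size_zone_counts(image):
    """Map (grey level, zone size) -> number of connected zones."""
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    zones = Counter()
    for r in range(rows):
        for c in range(cols):
            if seen[r][c]:
                continue
            level = image[r][c]
            seen[r][c] = True
            queue = deque([(r, c)])
            size = 0
            while queue:  # flood-fill one zone of equal grey level
                cr, cc = queue.popleft()
                size += 1
                for dr, dc in NEIGHBOURS:
                    nr, nc = cr + dr, cc + dc
                    if (0 <= nr < rows and 0 <= nc < cols
                            and not seen[nr][nc] and image[nr][nc] == level):
                        seen[nr][nc] = True
                        queue.append((nr, nc))
            zones[(level, size)] += 1
    return zones
```

Unlike the run-length matrix, the result does not depend on a direction, since zones are defined by connectivity alone.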

Low intensity small area Emphasis (LISAE)

High intensity small area Emphasis (HISAE)

Grey-Level distance-zone matrix based features
Grey-level distance-zone matrix (GLDZM) based features, as described by Thibault et al. (13). A grey level distance-zone matrix describes the number of homogeneous connected areas within the volume, of a certain intensity and distance to the shape border. The shape border is defined by 6-connectedness in 3D (i.e. a voxel is on the border if at least one face is exposed). In contrast to the original definition by Thibault et al. (13), the minimum distance to the border is 1, instead of 0 (i.e. voxels on the border have a distance of 1), to allow for correct feature calculations. The $(i,j)$th entry of the GLDZM $p(i,j)$ is the number of connected areas of grey level (i.e. intensity value) $i$ and minimum distance $j$ to the shape border. GLDZM features therefore describe the radial distribution of homogeneous areas within the tumor volume. Let $p(i,j)$ be the $(i,j)$th entry in the given GLDZM $p$.