Glandular Morphometrics for Objective Grading of Colorectal Adenocarcinoma Histology Images

Determining the grade of colon cancer from tissue slides is a routine part of the pathological analysis. In the case of colorectal adenocarcinoma (CRA), grading is partly determined by morphology and degree of formation of glandular structures. Achieving consistency between pathologists is difficult due to the subjective nature of grading assessment. An objective grading using computer algorithms will be more consistent, and will be able to analyse images in more detail. In this paper, we measure the shape of glands with a novel metric that we call the Best Alignment Metric (BAM). We show a strong correlation between a novel measure of glandular shape and grade of the tumour. We used shape specific parameters to perform a two-class classification of images into normal or cancerous tissue and a three-class classification into normal, low grade cancer, and high grade cancer. The task of detecting gland boundaries, which is a prerequisite of shape-based analysis, was carried out using a deep convolutional neural network designed for segmentation of glandular structures. A support vector machine (SVM) classifier was trained using shape features derived from BAM. Through cross-validation, we achieved an accuracy of 97% for the two-class and 91% for three-class classification.


Best Alignment Metric (BAM)
In this section we will explain what we mean by shape space and define and discuss the BAM metric on this space. Our theory works for rectifiable paths and in particular, for smooth paths. However we restrict our discussion to polygonal paths for the sake of simplicity. A boundary curve of a biological object in a micrograph will usually be given as a polygonal path, and a smooth path can be approximated by a polygonal path. Moreover complexity estimates for our algorithms present fewer technical difficulties if the inputs are polygonal.
Definition 1. A polygonal simple closed curve in the complex plane (abbreviation PSCC) is a map u : [0, L] → C together with a cyclic sequence (u 0 , u 1 , . . . , u P−1 , u P = u 0 ) of complex numbers, related to u as follows. The straight intervals [u i−1 , u i ] are traversed successively, for i = 1, 2, . . . , P, by u at unit speed. We have ∑ P i=1 |u i − u i−1 | = L, where L is the length of the curve. We assume that u has no self-intersections, except for the endpoints u(0) = u(L) = u 0 = u P . We also assume that u goes in an anti-clockwise direction around the topological open disk that it bounds. We set V (u) = P. For s ∈ R, we define p s : [0, L] → [0, L] by p s (t) = s + t mod L. This is a convenient notation for changing the starting point of a parametrization of a simple closed curve.
Definition 2. By an oriented shape we mean an equivalence class of PSCCs u, defined by the following equivalence. Let u and v be PSCCs of the same length L. We say that u is oriented shape equivalent to v and write u ∼ v if there is an orientation preserving isometry g, and an s ∈ R such that v = g • u • p s . (See Definition 1.) Our formal definition is that an oriented shape is an equivalence class of PSCCs under oriented shape equivalence. Since the lengths of two equivalent PSCCs are equal, it makes sense to talk of the length of (the boundary of) a shape. The reflection in a line of an oriented shape is also an oriented shape (possibly the same oriented shape, if it is symmetric under reflection), and we define an unoriented shape to be the union of an oriented shape and its reflection. Given a shape or an oriented shape, we can scale it to have length one.
For brevity, we will use the term "shape" in this paper to mean an oriented shape of length one. There is no need to consider unoriented shapes, because each of our shapes is compared with an associated ellipse, and an ellipse is invariant under reflection in either of its axes. Also we will not take scale into account.
Let σ 1 and σ 2 be two (oriented) shapes (of length one). We define One can prove that this definition does indeed satisfy the standard axioms for a metric. In order to have practical and fast computational methods, we approximate each length one PSCC by a cyclic sequence of complex numbers. The accuracy of the approximation is determined by N, the number of points in the cyclic sequence. We fix N = 2 n for some positive integer n. (Later we will invoke Fast Fourier Transform (FFT) for fast computation, and this works best for powers of two.) Definition 3. Given a PSCC f : [0, 1] → C of length one, we define its N-approximation A N ( f ) to be the cyclic sequence (a 0 , . . . , a N = a 0 ) of complex numbers given by a i = f (i/N). Note that this sequence is almost certainly not the same as the sequence of vertices used in the definition of the polygonal curve f . We have ∑ N i=1 |a i − a i−1 | ≤ 1, and, although the curve has length one, equality is rare. The union of the intervals [a i−1 , a i ] may have self-intersections (though it will be a PSCC if N is large enough, depending on f ). α(t) = a 0 = a N to the It is convenient to recast Definition 2 in the context of cyclic sequences.
Definition 4. By a sequence shape we mean an equivalence class of cyclic sequences of complex numbers, defined by the following equivalence. Let A = (a 0 , . . . , a N = a 0 ) and B = (b 0 , . . . , b N = b 0 ) be cyclic sequences of complex numbers of the same length N = 2 m . For k ∈ Z, we define B k = (b k , b k+1 . . . , b k+N = b k ), where the indices are taken mod N. We say that A is (oriented) shape equivalent to B and write A ∼ B if there is an orientation preserving isometry g, and a k ∈ Z such that A = g(B k ).
In order to use N-approximations to estimate d BAM between shapes, it is convenient to extend the definitions to cyclic sequences of complex numbers of the same length N.
The denominator N is usually omitted in defining the L 2 distance between two finite sequences. However, with this denominator, N-approximations converge as N tends to infinity.
We have the following result, which we state without proof.
Lemma 1. Let σ 1 and σ 2 be two shapes of length one. Let f 1 ∈ σ 1 and f 2 ∈ σ 2 . There are euclidean translations t 1 and t 2 such If α = (α 1 , . . . , α N ) and β = (β 1 , . . . , β N ) are N-dimensional complex vectors, we recall the notation α, β = ∑ N i=1 α i β i for the Hermitian inner product. We show how to minimize d 2 (A, e iθ B k ) in terms of the inner product. We write A, Equivalently we maximize C k Re(e i(θ +ψ k ) ). This is done by choosing k so as to maximize C k = | A, B k | and then choosing θ = −ψ k . (3)

Fast Computation of BAM
The value of C k as k varies can be computed by FFT with time complexity O(N log(N)) as follows: j=0 , the inverse fast Fourier transform of ( f j g j ) N−1 j=0 . 4. Compute k = argmax j |X j |. Then C k = |X k | by the convolution theorem.. For i = 1, 2, σ i is given by specifying a PSCC f i ∈ σ i . We can then determine A and B with time complexity O(V ( f 1 ) +V ( f 2 ) + N), where V ( f 1 ) and V ( f 2 ) are the number of vertices used to define f 1 and f 2 respectively, as discussed in Definition 1. It follows that, if σ 1 and σ 2 are shapes of length one, with representative PSCCs f 1 and f 2 , then d BAM (σ 1 , σ 2 ) is computed to within an error of 3/2N with time complexity O(V ( f 1 ) +V ( f 2 ) + N log(N)). A reasonable scheme is to fix N = 2 n greater than V ( f ) for any PSCC f under consideration. We also choose N sufficiently large so that the error resulting from using N-approximations is small enough for one's purposes. This error is independent of f , as shown in Lemma 1.
Note that the above formulae do not consider the reflection R(σ ) of a shape σ in a line to automatically be considered to be the same shape as σ . However, in most biomedical applications, it would be very reasonable for us to change the definition of "shape" so as to make R(σ ) = σ . It is a matter of chance which side of a tissue section will be uppermost when it is placed on a glass slide, and so, if σ is the shape of a gland in a micrograph, we would want R(σ ) = σ . If we wish our shapes to be invariant under reflection, we change the definition of the distance between shapes tõ Another issue is whether or not to scale the shapes being analyzed. In many applications, we want our shapes to be scale invariant. This can be achieved by scaling the PSCCs to all have length one. If, on the other hand, one wishes to take length into account, we can define the square of the distance between a shape σ 1 of length λ 1 and a shape σ 2 of length λ 2 by (d BAM (σ 1 , σ 2 )) 2 = (log(λ 1 /λ 2 )) 2 + (d BAM (σ 1 /λ 1 , σ 2 /λ 2 )) 2 .
One can prove that this extension satisfies the axioms for a metric on the space of unscaled oriented shapes.

2/7 BAM Comparisons Comparison with Symmetric Difference and Fourier Descriptors
For comparison purpose, we experimented with the contours of human retinal pigment epithelial (RPE) cells. In this section, we compare BAM to two other approaches namely Symmetric Difference and Fourier Descriptors. Comparison is presented in terms of time taken to compute the shape distance and in terms of ability to generate five simulated shapes which closely resembles the original shape. For time comparison, we measured the total time taken to compute the distances of 100 pairs, dividing that number by 100 to give a measure of the time to compute the distance once. We ran this process 100 times to create a mean and standard deviation.
The Symmetric Difference Metric, denoted by d SD , is defined on the measurable subsets of any measure space (X, µ), by the formula This gives us a metric on regions with polygonal boundary by integrating the area over the symmetric difference. In order to apply the notion to shapes, we have to minimize d SD (A, ρ(B)), where ρ varies over all orientation preserving isometries, that is, over all transformations resulting from translations and rotations. The speed and accuracy of this calculation will obviously depend on choices for the resolution of the binary images and the number of rotation angles considered. Even at this low resolution of 128 × 128 pixels and 10 rotation options, the computation time is already 1500 times slower than BAM. In the case of BAM, computations with different rotation angles is unnecessary, as shown by Equation 3. With Fourier descriptors, a commonly used shape descriptor, we found that it is approximately 4.5 times faster than BAM. Here the accuracy depends both on the number of Fourier coefficients we decide to use, and on the number of rotation angles we wish to try, each of which affects the speed of computation linearly. Neither of these is relevant for BAM (see Equation 3), and so the factor 4.5 that we present here is somewhat artificial, as we can vary it up or down at our pleasure.
The average time taken with each of these three measures is shown in Supplementary Table 1. On comparing the simulated shape generated by Fourier descriptors and BAM, BAM outperforms Fourier descriptors, as shown in Supplementary Figure  1. This emphasizes the fact that aspects of shape information are lost when the phase information is removed and the power spectrum alone is not enough to faithfully represent shape. Symmetric difference, with the parameters we used, seems to be as good as BAM for selecting similar shapes, while being much slower.

Best Alignment Metric Symmetric Difference Fourier Descriptors
Average Time 35 ± 1.6µs 53000 ± 9200µs 7.8 ± 5.4µs Supplementary Table 1. Time comparison of BAM with two shape difference measures: Symmetric difference and Fourier descriptors.
Supplementary Figure 1. Comparative performance of shape distance measures. Each row above shows a randomly selected target RPE cell outline, followed by 5 outlines identified as closest to the target outline (excluding outlines of the same cell as the target) according to 3 different shape distance measures.

Comparison with Square Root Elastic Measure
The Square Root Elastic (SRE) distance is highly regarded as a sensible measure of shape distance. SRE has a big advantage over BAM in the case of computer vision, because its elastic property enables it to follow changes in shape, as we might have

3/7
with a movie of a person running. When applied to images of tissue sections with no movement, this advantage disappears, and we are left only with the crippling disadvantage of slowness. When computing distances between shapes, the choice of number of points around the curve (N) affects both accuracy and speed. To evaluate the performance of BAM and SRE in terms of accuracy and speed, five pairs of RPE cell curves were chosen. The BAM and SRE distances were computed between each pair with different number of sampling points. For each pair of shapes, an absolute relative error was computed and is presented in Supplementary Figure 2. It can be seen that error of the SRE measurements are not reliably below 1% until N = 1024.
The time taken to compute each shape distance with different number of N is shown in Supplementary Table 2. There is an exponential increase in computational time with SRE on increasing the number of sampling points while in case of BAM, the increase in computational time is linear.

Network training
We made use of the open source library Tensorflow to train our segmentation network. The initial weights of the network were generated using a commonly followed approach, i.e., taking values randomly from a Gaussian distribution with standard deviation (2/N), where N is the total number of inputs to the computational unit. A batch size of 10 patches was used to train the network using a Dell T7610 computer with NVIDIA TitanX 12GB GPU. The network was trained with a dataset of more than 120,000 patches and the training was stopped after 1,44,000 iterations (12 epochs). We used weighted cross entropy as a cost function. This penalises at each pixel according to the deviation of its network-calculated probability from 1. A weighted map was generated, assigning a large weight at the dilated gland boundaries (slightly overlapped between background and gland around the boundaries), so that the network learned the boundary of each gland.