Introduction

The inner ear, also known as the labyrinth, is a complex structure located in the temporal bone. It consists roughly of the cochlea, the vestibule, and the semi-circular canals. Understanding changes and variations within these structures can help diagnose and predict a number of conditions1, such as inflammatory and neoplastic processes. Technological developments in imaging techniques have allowed (neuro)radiologists to evaluate the human labyrinth, with recent advances increasing the level of detail1. Moreover, applications of artificial intelligence and the quantitative assessment of medical images for the non-invasive exploration of anatomical structures and the classification of diseases have increased remarkably in recent years2.

The process of the automated extraction and analysis of large amounts of quantitative information from medical images is known as radiomics3,4. A recent study investigated the value of radiomics for the diagnosis of Meniere’s disease (MD), an inner ear disorder characterized by episodic vertigo spells, hearing loss, and tinnitus5. Other labyrinthine disorders, such as sensorineural hearing loss, might benefit from quantitative image analysis as well6.

Image segmentation is a critical step towards fully automated diagnostic tools for inner ear disorders. Manual segmentation requires experienced readers, is time-consuming, and is prone to intra- and inter-observer variability7,8,9. Over the past years, several automatic and semi-automatic inner ear segmentation methods have been proposed for both MRI and CT imaging10,11,12,13,14, including region-growing, thresholding and edge detection15, model-based10,14, atlas-based12,13 and machine-learning techniques11. The inner ear’s small and complex structure makes segmentation challenging, especially in MR imaging due to non-homogeneous image intensities10,11.

Recent work proposed a statistical shape model (SSM) for inner ear segmentation in MR images13. However, the proposed methodology presents a high computational burden, in terms of both time and cost. Another recently published segmentation model showed very good agreement between an atlas-based segmentation and the manual gold standard12, yet requires manual intervention. Additionally, the segmentation performance of atlas-based methods decreases for complex structures with variable shape and size16.

Recent studies have demonstrated the successful application of deep learning techniques for detection, segmentation, and classification tasks in the medical field17. Among these techniques, the U-Net architecture, a specific type of convolutional neural network (CNN), has been implemented successfully, especially for auto-segmentation in medical images18,19. Although U-Net based deep learning approaches do exist for segmentation of the inner ear18,20, they do not account for the anatomical variations, pathological situations or missing anatomical structures that are part of daily clinical practice. Hence, there is currently no fully automated, generic segmentation method for the inner ear to meet the growing demand for developments in 3D visualization and quantitative image assessment.

Therefore, the objective of this study was to develop a deep learning approach for the automatic segmentation of the inner ear in clinical MR images, focusing on the robustness of the method in varying clinical situations, and to evaluate its performance and generalizability against manual segmentation as the reference.

Materials and methods

Ethical considerations

This study was performed in accordance with the guidelines outlined by Dutch and Belgian legislation. MRI scans were collected and fully anonymized by the local investigators of four centers. The ethics committee of the University Hospital Antwerp approved the study (approval number 17/09/093) and written informed consent was obtained from the participants. The other centers waived ethics approval due to the retrospective nature and full anonymization of the data, in accordance with the Medical Research Involving Human Subjects Act (WMO).

Automatic segmentation workflow

The workflow applied in this study consisted of four steps and is illustrated in Fig. 1. Each step of the workflow is detailed in the following paragraphs. Additional details on the model architecture, training, validation and testing can be found in Supplementary Sect. 1 of the Supplementary Information.

Figure 1

The workflow of auto-segmentation of the inner ear in this study, graphically presented in four steps. (A) Image acquisition from four different centers, divided into training, validation and an independent test set. (B) Manual segmentation of the labyrinth and pre-processing steps consisting of isotropic voxel resampling, intensity rescaling and center cropping. (C) Extending the dataset (data augmentation) by flipping and rotating the input images, and training of the model. (D) Validation and testing of the model on an independent test cohort.

Training dataset

A total of 1203 images of patients who underwent an MRI scan of the cerebellopontine angle for diverse neuro-otological indications between December 2015 and April 2019 at Maastricht University Medical Center+ (center A) were collected and fully anonymized. All high-resolution T2-weighted images were acquired on 1.5 and 3 Tesla (T) MRI scanners from different vendors, with a variety of high-resolution T2-weighted sequences (3D cochlea, DRIVE, SPC_TRA_ISO) and locally optimized protocols. MRI scans of the cerebellopontine angle were included if they allowed labyrinth visualization, with at least a portion of the labyrinth recognizable and suitable for manual segmentation. MRI scans that did not allow a clear manual segmentation were excluded from this study. In total, 259 MRI images were excluded due to unsuitable sequences (DWI, T1, SURVEY MST), poor quality, or skewed MR images. The final training dataset included MRI scans of 944 cases (Table 1).

Table 1 Data set characteristics.

Validation and test dataset

The validation dataset included MRI scans of 99 cases collected from Maastricht University Medical Center+ (center A) in the period from 2005 to 2015 (Table 1). MRI scans collected from three different centers, University Hospital Antwerp (center B), Viecure Hospital Venlo (center C), and Apeldoorn dizziness center (center D), during the period 2005–2017 (Table 1) were used as an independent test dataset. Both the validation and test datasets consisted of T2-weighted MR images of the cerebellopontine angle of patients with uni- or bilateral definitive Meniere’s disease or idiopathic asymmetric sensorineural hearing loss.

Manual segmentation

A team of six readers was trained by the second author (MvdL), an experienced clinician and researcher in inner ear imaging, to manually segment the labyrinth on both sides in 3D Slicer 4.8.119. Manual segmentation was facilitated by intensity-based thresholding and region-growing algorithms. The original MRI scans and the manually segmented masks were visualized as 3D maximum intensity projections, as shown in Fig. 2. This provided an overview of the manually segmented results and allowed for thorough quality assessment. All segmentations were curated by the experienced reader (MvdL), who re-segmented any missing or incorrectly segmented masks. The final segmentation results served as the ground truth for training the CNN. The independent validation and test datasets were segmented and curated by the experienced reader (MvdL). The resulting manual segmentations on the test dataset were used as the ground truth (reference standard).
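To illustrate the kind of semi-automatic tooling involved, the following is a minimal sketch of seeded region growing bounded by an intensity threshold, assuming SimpleITK; this is not the authors’ 3D Slicer workflow, and the file name, seed coordinates and lower intensity bound are hypothetical and sequence-dependent.

```python
import SimpleITK as sitk

# Illustrative sketch only: grow a connected region from a seed voxel placed
# inside the fluid-filled (T2-hyperintense) labyrinth, keeping voxels whose
# intensity falls within [lower, upper].
image = sitk.ReadImage("t2_cpa.nii.gz", sitk.sitkFloat32)  # hypothetical path

seed = (120, 130, 30)    # (x, y, z) voxel index inside the labyrinth (hypothetical)
lower = 200.0            # lower intensity bound, scanner/sequence dependent
upper = float(sitk.GetArrayViewFromImage(image).max())

mask = sitk.ConnectedThreshold(image, seedList=[seed], lower=lower, upper=upper)
sitk.WriteImage(mask, "labyrinth_mask.nii.gz")
```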

Figure 2

Maximum intensity projection of a sample MR image in the axial, sagittal and coronal planes, showing a manual segmentation of the labyrinth in yellow. Left: axial plane; top right: coronal plane; bottom right: sagittal plane.

Pre-processing

In order to generate homogeneous MRI volumes as input for the model, the following pre-processing steps were performed. Firstly, all volumes were resampled by B-spline interpolation to an isotropic voxel size of 0.45 mm. Secondly, the intensities of the MRI volumes were normalized to the range [0, 1] using the minimum and maximum intensity of each volume. Lastly, since the model’s architecture required inputs of the same dimensions, a center crop of 256 × 256 × 64 voxels was obtained from the pre-processed volumes. This crop size was large enough to contain contextual information of the inner ear. Images smaller than 256 × 256 pixels in the transverse plane or 64 pixels in the slice direction were padded with zeros. More details can be found in Supplementary Information Sect. 2.
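These pre-processing steps could be sketched as follows in Python, assuming SimpleITK and NumPy; the function names and file path are our own, and the original pipeline’s interpolation settings are not specified beyond B-spline.

```python
import numpy as np
import SimpleITK as sitk

TARGET_SPACING = (0.45, 0.45, 0.45)   # isotropic voxel size in mm
TARGET_SHAPE = (64, 256, 256)         # (z, y, x) in numpy order

def resample_isotropic(image: sitk.Image) -> sitk.Image:
    # B-spline resampling to 0.45 mm isotropic voxels.
    old_spacing, old_size = image.GetSpacing(), image.GetSize()
    new_size = [int(round(sz * sp / ts))
                for sz, sp, ts in zip(old_size, old_spacing, TARGET_SPACING)]
    return sitk.Resample(image, new_size, sitk.Transform(), sitk.sitkBSpline,
                         image.GetOrigin(), TARGET_SPACING, image.GetDirection())

def normalize(volume: np.ndarray) -> np.ndarray:
    # Min-max rescaling of intensities to [0, 1] (assumes a non-constant volume).
    lo, hi = volume.min(), volume.max()
    return (volume - lo) / (hi - lo)

def center_crop_or_pad(volume: np.ndarray) -> np.ndarray:
    # Center crop to the target shape; smaller volumes are zero-padded.
    out = np.zeros(TARGET_SHAPE, dtype=volume.dtype)
    src, dst = [], []
    for v, t in zip(volume.shape, TARGET_SHAPE):
        if v >= t:
            start = (v - t) // 2
            src.append(slice(start, start + t)); dst.append(slice(0, t))
        else:
            start = (t - v) // 2
            src.append(slice(0, v)); dst.append(slice(start, start + v))
    out[tuple(dst)] = volume[tuple(src)]
    return out

image = resample_isotropic(sitk.ReadImage("t2_cpa.nii.gz", sitk.sitkFloat32))
volume = center_crop_or_pad(normalize(sitk.GetArrayFromImage(image)))
```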

Model architecture

The model’s architecture is based on a classical 3D U-Net21, as illustrated in Fig. 3a,b. It comprises an encoder block, a decoder block and skip connections. The encoder network is a contracting path with convolution layers, which extracts high-level features while decreasing the spatial resolution at each layer. The decoder network is an expanding path, which increases the spatial resolution by up-sampling and uses the feature information to segment the pixels corresponding to the region of interest. Skip connections between encoder and decoder allow retrieval of fine details, which might otherwise be lost during spatial down-sampling.

Figure 3

(a) The proposed 3D U-Net based architecture used in the study. MRI volumes, at multiple scales, were provided as input to the encoder network. The decoder network outputs a score to classify each voxel as inner ear or not. Notations in blue text (a × a × a × b) highlight the spatial resolution (a × a × a) and the feature map count (b). X block repetitions, IN instance normalization, Conv convolution kernel, ReLU rectified linear unit, 3 × 3 × 3 the size of the 3D CNN kernels. (b) Components of Attention Gating Block. The block receives as inputs, the up-sampled output feature map at each scale in the decoder and the feature map from each scale of the encoder. Attention coefficients generated, scale the input feature maps from the encoder.

The model’s architecture was adapted with attention gates, as the relevant features of the inner ear showed large shape variability and were very small compared to the surrounding structures22.

The attention gates highlight the regions that correspond to the inner ear and suppress those that correspond to the background. The highlighted features are propagated via the skip connections from the deep stages of the contracting path to the expanding path. More specifically, the attention gates propagate the important spatial information corresponding to the inner ear from the encoding to the decoding part of the model. As shown in Fig. 3b, the input feature maps from the encoder part of the network are scaled by the attention coefficients generated by the attention gates, thereby outputting the features relevant to the inner ear. The scaled features are then concatenated with the up-sampled output feature maps at each level in the decoder part of the network.
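A minimal PyTorch sketch of such an additive attention gate, in the spirit of Oktay et al.22, is given below; the channel counts, tensor shapes and exact wiring into the 3D U-Net are illustrative assumptions, not the study’s implementation.

```python
import torch
import torch.nn as nn

class AttentionGate3D(nn.Module):
    def __init__(self, enc_ch: int, dec_ch: int, inter_ch: int):
        super().__init__()
        # 1x1x1 projections of the encoder (skip) and decoder (gating) maps.
        self.w_x = nn.Conv3d(enc_ch, inter_ch, kernel_size=1)
        self.w_g = nn.Conv3d(dec_ch, inter_ch, kernel_size=1)
        self.psi = nn.Conv3d(inter_ch, 1, kernel_size=1)
        self.relu = nn.ReLU(inplace=True)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x: torch.Tensor, g: torch.Tensor) -> torch.Tensor:
        # x: encoder feature map, g: up-sampled decoder feature map
        # (assumed here to share spatial dimensions).
        alpha = self.sigmoid(self.psi(self.relu(self.w_x(x) + self.w_g(g))))
        return x * alpha  # attention coefficients scale the skip features

# The gated skip features are then concatenated with the decoder features:
gate = AttentionGate3D(enc_ch=32, dec_ch=32, inter_ch=16)
x = torch.randn(1, 32, 64, 64, 16)   # encoder map (batch, ch, z, y, x)
g = torch.randn(1, 32, 64, 64, 16)   # up-sampled decoder map
merged = torch.cat([gate(x, g), g], dim=1)
```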

Since different components of the inner ear are more easily accessible at different scales, we additionally input the same volume at three different scales along the encoder path, which has previously been described as an input image pyramid by Oktay et al.22.

Other network parameter changes included an increase in the number of convolutional filters from 16 to 128 along the encoder network. Each max-pooling layer reduced the spatial resolution by a factor of two. Along the decoder path, transposed convolutions were used for up-sampling, which increased the image size by a factor of two at each layer. All convolutional blocks included 3D convolutions23, ReLU activation24 and instance normalization25.
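As a sketch of these building blocks, again assuming PyTorch (the framework is not stated in the paper), the convolutional block and the down- and up-sampling operators could look as follows; the block repetitions per level shown in Fig. 3a are omitted here.

```python
import torch.nn as nn

def conv_block(in_ch: int, out_ch: int) -> nn.Sequential:
    # 3D convolution -> instance normalization -> ReLU, as described above.
    return nn.Sequential(
        nn.Conv3d(in_ch, out_ch, kernel_size=3, padding=1),
        nn.InstanceNorm3d(out_ch),
        nn.ReLU(inplace=True),
    )

filters = [16, 32, 64, 128]                 # encoder filter progression
down = nn.MaxPool3d(kernel_size=2)          # halves the spatial resolution
up = nn.ConvTranspose3d(128, 64, kernel_size=2, stride=2)  # doubles it
```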

Training, validation and testing

The model was trained with the pre-processed volumes and their corresponding ground truth labels of the training dataset. Randomly selected input volumes were augmented by vertical flipping or rotation during training. The network weights were initialized using the He-normal initialization method26 and updated using the Adam optimizer27 at an initial learning rate of 1e−4.
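A hedged sketch of this training setup: He-normal (Kaiming) initialization, Adam at a learning rate of 1e−4, and flip/rotation augmentation applied jointly to image and label. The stand-in model and the restriction of rotations to 90-degree in-plane turns are simplifications of ours, not the study’s exact scheme.

```python
import torch
import torch.nn as nn

model = nn.Conv3d(1, 1, kernel_size=3, padding=1)  # stand-in for the attention U-Net

def he_init(m: nn.Module) -> None:
    # He-normal (Kaiming) initialization for convolutional weights.
    if isinstance(m, (nn.Conv3d, nn.ConvTranspose3d)):
        nn.init.kaiming_normal_(m.weight, nonlinearity="relu")

model.apply(he_init)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

def augment(vol: torch.Tensor, mask: torch.Tensor):
    # vol, mask: (batch, channel, z, y, x); the same random transform is
    # applied to the volume and its ground-truth mask.
    if torch.rand(1).item() < 0.5:
        vol, mask = torch.flip(vol, dims=[3]), torch.flip(mask, dims=[3])
    if torch.rand(1).item() < 0.5:
        k = int(torch.randint(1, 4, (1,)).item())
        vol = torch.rot90(vol, k, dims=[3, 4])
        mask = torch.rot90(mask, k, dims=[3, 4])
    return vol, mask
```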

Since the numbers of positive voxels (i.e. part of the inner ear) and negative voxels were highly imbalanced, the Tversky loss28 was used as the objective function during training; it penalized false negatives more than false positives, with a false positive penalty (β) of 0.3 and a false negative penalty (α) of 0.7. This approach emphasizes learning features corresponding to the positive voxels. The loss was calculated in mini-batches of two images per iteration and, at the end of each epoch, the Tversky loss was calculated on the model’s predictions on the validation dataset to ensure validation loss convergence (i.e., a decrease in validation loss).
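With the weighting described above (α = 0.7 on false negatives, β = 0.3 on false positives), the Tversky loss can be written compactly as below; the smoothing term eps is an implementation detail we add for numerical stability.

```python
import torch

def tversky_loss(pred: torch.Tensor, target: torch.Tensor,
                 alpha: float = 0.7, beta: float = 0.3,
                 eps: float = 1e-6) -> torch.Tensor:
    # pred: voxelwise foreground probabilities; target: binary ground truth.
    tp = (pred * target).sum()          # true positive mass
    fn = ((1 - pred) * target).sum()    # false negative mass, weighted by alpha
    fp = (pred * (1 - target)).sum()    # false positive mass, weighted by beta
    return 1 - (tp + eps) / (tp + alpha * fn + beta * fp + eps)
```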

The final model’s performance was evaluated on the multicentric, independent test dataset.

Outcome measurements

The main outcomes of this study were the Dice similarity coefficient (DSC), true positive rate (TPR), false positive rate (FPR), false negative rate (FNR) and false discovery rate (FDR).
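For reference, these voxelwise metrics can be computed from a binary predicted mask and its ground truth as in the generic sketch below; this is not the study’s evaluation code and assumes non-empty masks.

```python
import numpy as np

def metrics(pred: np.ndarray, truth: np.ndarray) -> dict:
    # pred, truth: binary masks of identical shape (non-empty assumed).
    pred, truth = pred.astype(bool), truth.astype(bool)
    tp = np.sum(pred & truth)
    fp = np.sum(pred & ~truth)
    fn = np.sum(~pred & truth)
    tn = np.sum(~pred & ~truth)
    return {
        "DSC": 2 * tp / (2 * tp + fp + fn),  # Dice similarity coefficient
        "TPR": tp / (tp + fn),               # true positive rate (sensitivity)
        "FPR": fp / (fp + tn),               # false positive rate
        "FNR": fn / (tp + fn),               # false negative rate
        "FDR": fp / (tp + fp),               # false discovery rate
    }
```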

As a secondary outcome, a subjective clinical validation was performed by the second author (MvdL) in consensus with an experienced neuroradiologist (A.A. Postma). Towards clinical implementation, it is critical that a deep learning model is able to segment the inner ear under all conditions, including those that might alter the shape of the inner ear (e.g., by pathology). Therefore, eight MR images, with their corresponding masks, were selected by the second author (MvdL) in which the signal intensities of the inner ear were altered by either pathology or post-therapeutic changes. These scans were left out of the training dataset and were used for clinical validation of the model’s performance.

Qualitative assessment: in silico clinical study

An in silico clinical study was performed to qualitatively compare manual and model-generated segmentations for 50 MRI volumes randomly selected from the test cohort. In-house developed software was used to display pairs of segmentations (automated vs. manual) overlaid on the MR images, at screen positions (left or right) randomized and blinded to the participants, as shown in Fig. 4. The software allowed for scrolling through all image slices and adjustment of window level settings. We enrolled seven participants (three computer scientists working in the field of medical imaging and four radiologists with an average experience of 2.5 years). For each image, the participants were asked to select their preferred segmentation. For each participant, the qualitative preference score was defined as the percentage of cases in which the automated segmentation was preferred.
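As a sketch of how this preference analysis can be computed: per-participant preference scores are tested against the 50% chance level, which for a fixed reference reduces to a one-sided one-sample t-test (here via SciPy). The scores below are placeholders, not the study data.

```python
import numpy as np
from scipy import stats

# Placeholder per-participant preference scores (fraction of cases in which
# the automated segmentation was preferred); one value per participant.
scores = np.array([0.62, 0.70, 0.66, 0.68, 0.64, 0.72, 0.67])

# One-sided one-sample t-test against the 50% chance level.
result = stats.ttest_1samp(scores, popmean=0.5, alternative="greater")
print(f"t = {result.statistic:.2f}, one-sided p = {result.pvalue:.3g}")
```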

Figure 4

Example of an automated and a manual segmentation overlaid on an MRI volume, as displayed by the software.

Results

The final training dataset included MRI scans of 944 cases (489 men, 455 women, aged 41–74; mean age 57 ± 15). The final validation dataset included MRI scans of 99 cases from center A (69 men, 30 women, aged 43–69; mean age 56 ± 13), and the independent test dataset comprised 177 cases from centers B, C and D (79 men, 59 women, aged 45–74; mean age 59 ± 14).

Segmentation performance

The segmentation accuracy was evaluated against the ground truth by assessing the DSC, which measures the overlap between the reference and the model’s output. The overall average metrics of segmentation accuracy, DSC, TPR, FNR, FDR and FPR, are summarized in Table 2. Figure 5 shows a comparison between the ground truth volume and the predicted true positive volume on the validation and test datasets. Figure 6 shows the distribution of DSCs on the validation and test datasets. The correlation between the true positive volume and the ground truth volume was also investigated. In Fig. 7a,b, the agreement between ground truth volume and predicted volume is graphically displayed by Bland–Altman plots. Figure 8a,b shows examples of a well-predicted and a poorly predicted segmentation.

Table 2 Performance of the proposed 3D U-Net for the automatic segmentation of the inner ear.
Figure 5

The quantitative analysis showing linear correlations between the ground truth volume and the predicted true positive volume for the validation set (plot in blue) and the test sets (plots in orange). The plot of center D shows two clear outliers that do not fit the trendline, suggesting under-segmentation of the inner ear in two cases belonging to the test cohort from center D.

Figure 6

Distribution of DSC on the validation (blue curve) and the test dataset (orange curve). The distributions corresponding to centers C and D show outliers (DSC < 0.7), indicating less overlap between ground truth and predicted segmentation. The distributions also show that the majority of the predictions have a DSC between 0.8 and 1.0.

Figure 7

(a) Bland–Altman plot for inner ear volume of the entire test cohort showing the percentage difference between predicted volume (PV) and ground truth volume (GTV) as a function of the average of the ground truth and predicted volumes. The solid line shows the mean difference and the dotted lines show the limits of agreement. PV predicted volume of inner ear, GTV ground truth volume of inner ear. The plot shows five clear outliers (red dots): three cases that were under-segmented by 20%, 40% and 60%, and two cases that were over-segmented by 40% and 60%, respectively. The plot also shows the relationship between the DSC metric and the level of under-/over-segmentation; the outliers correspond to DSC ≤ 0.80. (b) The same Bland–Altman plot after excluding the outliers shown in panel (a) (DSC ≤ 0.80). The solid line shows the mean difference and the dotted lines show the limits of agreement. The plot shows that the model, on average, tends to over-segment by 9%.

Figure 8

(a) Example of a well-predicted segmentation. The first row shows the ground truth segmentation; the second row shows the model’s segmentation. (1a) Ground truth, axial plane. (1b) Ground truth, sagittal plane. (1c) Ground truth, coronal plane. (2a) Predicted mask, axial plane. (2b) Predicted mask, sagittal plane. (2c) Predicted mask, coronal plane. DSC: 0.92, ground truth volume: 465.37 mm3, true positive volume: 445.32 mm3, true positive rate: 95.69%, false negative rate: 4.3%, false discovery rate: 11.7%. (b) Example of a poor segmentation. The first row shows the ground truth segmentation; the second row shows the model’s segmentation. (1a) Ground truth, axial plane. (1b) Ground truth, sagittal plane. (1c) Ground truth, coronal plane. (2a) Predicted mask, axial plane. (2b) Predicted mask, sagittal plane. (2c) Predicted mask, coronal plane. DSC: 0.48, ground truth volume: 406.05 mm3, true positive volume: 137.96 mm3, true positive rate: 33.97%, false negative rate: 66.02%, false discovery rate: 1.5%.

Performance on clinical validation dataset

On the held-out clinical validation dataset, the model achieved an average DSC of 0.876, a TPR of 87.86%, an FDR of 15.2% and an FNR of 12.13%. The automated segmentations on this dataset are included in the Supplementary Information. The dataset includes labyrinths in which parts of the semi-circular canals, the vestibule or the cochlea were missing or not properly displayed. As an example, an MRI scan with a vestibular schwannoma (a tumorous process growing from the vestibular nerve) is shown in Fig. 9.

Figure 9

(a) Example of one of the clinical validation MRI scans in the axial and coronal planes. This case shows a vestibular schwannoma after a translabyrinthine resection on the right side; therefore, the right semi-circular canals and vestibule are not segmented. DSC: 0.8973, ground truth volume: 316.11 mm3, true positive volume: 294.69 mm3, true positive rate: 93.22%, false negative rate: 6.77%, false discovery rate: 7.3%. (b) The 3D volume rendering of the ground truth and the predicted mask. The semi-circular canals and the vestibule of the right inner ear were not displayed on MRI; the model correctly did not segment them. AD auriculum dextra, AS auriculum sinistra.

Qualitative assessment: in silico clinical study

On average, the participants preferred the automated segmentation in 67% of the cases. A paired one-sided t-test for the hypothesis that this average score is greater than 50% was significant (p = 9.82474e−17), indicating that expert users preferred the segmentations generated by the proposed model over the manual segmentations.

Discussion

In this work, a first proof-of-concept of an artificial intelligence based model for the fully automatic segmentation of the inner ear on MRI was demonstrated and validated.

The proposed model showed high performance, with a mean DSC of 0.87 between the manual and the automated segmentation, validated across images from three different centers. The mean TPR of 91.5% implies accurate segmentation of the inner ear without significant over- or under-segmentation, as indicated by the low FDR and FNR (14.8% and 8.49%, respectively).

The in silico qualitative analysis showed that, on average, the expert users (radiologists and computer scientists) are more likely to prefer model-generated segmentations over manual segmentations. The Bland–Altman plot (Fig. 7a) shows five outliers. The fact that the slice thicknesses of those scans were high (mean slice thickness of 1.1 mm) compared to the mean slice thickness of the training cohort (0.32 mm) could explain the mis-segmentations in these cases. Also, the three scans from center D contained either motion artifacts or tight margins around the labyrinth, which might explain the lower limits of agreement. All MRI scans from center C had noticeably more hyperintense areas at the apex of the pars petrosa compared to the other centers. Although this might have ‘challenged’ the model, it does not explain why only two out of 21 scans had lower DSCs.

A prior study that used deep learning for auto-segmentation of the inner ear compared the performance of a 3D fully convolutional network (FCN) to a 2D FCN20, reporting overall DSCs of 0.66 and 0.58, respectively. Another recent study reported a high DSC of 0.95 using an SSM-based level set10. However, that model was evaluated on a small dataset (10 of 23 cases were held out for testing) and no independent validation was performed. Directly comparing the present approach with previously published methods in terms of DSC is not possible due to differences in datasets. Nevertheless, it is worth noting that the presented method achieves state-of-the-art performance, which can be ascribed to the robust deep learning approach combined with a wide and varied dataset for both training and validation, an aspect often neglected in similar studies.

There are several important strengths of this study. First of all, the model was trained on a diverse set of MR images of the cerebellopontine region. Although all MR images of the training dataset were collected in one center, they were acquired over a wide time span (2015–2019) and include different acquisition and reconstruction protocols29. In addition, the training dataset was manually segmented by five independent readers. Therefore, the model learned to eliminate noise in the manually segmented labels caused by inter-reader variability. These methodological aspects resulted in a model that generalizes well, which is reflected in the high validation performance. Past studies have shown high inter-reader and intra-reader variability in medical image segmentation tasks30,31. Our method’s consistency (i.e., no segmentation variability) alleviates this issue. Additionally, the interaction time was approximately 10 min per case for manual segmentation by an experienced reader, compared to only 6.5 s for the automatic segmentation.

One of the most important strengths of this study is the evaluation on MR images containing deviant morphological shapes and decreased signal intensities of the labyrinth caused by cerebellopontine pathology. On this held-out clinical validation dataset, the model proved to generalize well, with an average DSC of 0.8768 and a TPR of 87.86%. So far, previous auto-segmentation studies have trained their models on normal ears or small datasets13,14,15. To the best of our knowledge, our study is the first to assess generalizability with respect to pathologies.

Limitations

Several limitations of this study should be noted. First of all, the most important limitation is the lack of a gold standard of manual segmentations from highly experienced neuroradiologists. Due to the extent of the segmentation process, manual segmentation of approximately 1500 labyrinths by one or more senior radiologists was not feasible. Therefore, the authors chose to work with independent readers who were trained and supervised by an experienced clinical researcher in inner ear imaging (MvdL) to generate a first proof of concept. This could have induced noise in the manual segmentations. Also, the intra- and inter-observer variability of the segmentation team was not evaluated. Although manual segmentation was performed under the strict supervision of the second author and a curation process was applied to detect incorrectly segmented masks, the quality of the manual segmentation could not be fully guaranteed. Since the manually segmented masks were considered the reference standard for the evaluation of the model, lower DSC scores might in some cases have indicated better automated segmentation compared to manual segmentation.

Nevertheless, efforts were made to contain this limitation by training a deep learning architecture with a large number of parameters and applying early stopping to prevent overfitting on the noise in the manual segmentations. Previous studies have shown that overparameterized networks are more robust against noisy labels when early stopping is applied32.

Given the very small area occupied by the inner ear in the whole MRI volume, the performance of our model might be further improved by applying bounding box detection33 or shape identification34 prior to automated segmentation, especially for abnormal cases.

Secondly, poor generalizability is one of the most common problems pertaining to deep learning models29. In this study, attempts were made to prevent overfitting by training the model on a large dataset from one center and testing its generalizability on three independent, held-out cohorts. Although the overall DSC scores were markedly high, the model performed poorly and failed to generalize in five out of 177 cases (three cases from center C and two cases from center D had a DSC < 0.70). This situation could have been mitigated by training the model on data from all centers, which would have made the training dataset more diverse (e.g., in terms of image acquisition and reconstruction); the model’s performance could then have been evaluated by cross-validation techniques (i.e., holding out 20–30% of the data from each center as test data). However, this would weaken the evidence for the generalizability of the model due to concerns regarding overfitting.

Lastly, the model was trained and evaluated on datasets that included only Dutch and Belgian populations. The generalizability of the model to MR images from international cohorts is currently unexplored.

Clinical implications and future perspectives

The future clinical advantages of automated 3D image segmentation of the inner ear are manifold. Image segmentation can be used for 3D visualization, allowing a better understanding of the spatial relations and morphological changes within the inner ear, assisting radiologists in the diagnostic process and providing tools for surgical planning35 or learning purposes36. Previous studies have proven the usability of auto-segmentation for pre-operative planning of cochlear implant surgery using CT imaging37 and for the diagnosis of adolescent idiopathic scoliosis using MR imaging11. Our model proved to be efficient on MR images; moreover, the proposed methodology can easily be leveraged for similar auto-segmentation applications on different imaging modalities.

Nowadays, quantitative analysis of the inner ear is gaining importance. Techniques like radiomics6, volumetric assessment of the fluid compartments in the labyrinth12,38 and analysis of the morphoanatomy of the vestibular system11 are used to aid the diagnosis of vestibular diseases. Radiomics refers to the process of the automated extraction and analysis of large amounts of quantitative features from medical images. These features are sometimes not perceptible to the human eye and might contain information that reflects underlying tissue heterogeneity and pathophysiology4,39. Quantitative image features include descriptors of shape, size, volume, intensity distributions and texture heterogeneity patterns39.

A histological feature strongly associated with Meniere’s disease is endolymphatic hydrops (EH), a distention of the endolymphatic compartment in the inner ear40. In conventional MRI, the endolymphatic compartment cannot be distinguished from the perilymphatic compartment, and thus EH is not depicted41. The differences found in radiomic features between MD patients and controls could hypothetically be explained by the different composition of the fluids in the labyrinth, causing a different distribution of signal intensities5. Possibly, EH is captured in the quantitative image features through damage to, or morphological changes of, the endolymphatic space. Since Meniere’s disease remains a clinical diagnostic challenge42, discovering distinctive image features might benefit the diagnostic trajectory of MD. Another study showed that cochlear CT image features can be useful biomarkers for predicting sensorineural hearing loss in patients with head and neck cancer who received chemoradiation therapy6. Different machine learning methods were used for feature selection, classification and prediction. The advantage of using machine learning in combination with radiomics is that the analysis of the labyrinth could be performed autonomously in the future5. However, both studies required setting a region of interest (ROI) by manual segmentation. Fully automated segmentation of the inner ear thus contributes to efficient research on quantitative image analysis of the inner ear.

Next to analyses of conventional MRI and CT imaging, the volumetric assessment of the fluid compartments in the labyrinth is also promising for vestibular research38. Contrast-enhanced MR imaging allows the in vivo confirmation and quantification of endolymphatic hydrops12,43.

Several studies investigated the value of 3D volumetric assessment of the endolymphatic space (ELS) to better monitor EH in vivo, for example in therapeutic trials in Meniere’s disease, and to better compare the ELS in patients with different otological diseases38,44,45. However, the 3D reconstructions were all rendered semi-automatically. Due to this time-consuming process, the applications of volumetric assessment are currently more scientifically than clinically relevant. A recent study proposed atlas-based segmentation for the volume-based quantification of the fluid spaces of the inner ear12, enabling fast, standardized (auto)segmentation. Further research is necessary to explore whether the proposed U-Net model can be leveraged for contrast-enhanced imaging as well, to facilitate volumetric assessment of the ELS in the clinic.

Auto-segmentation, in its current form, is a step towards fully automated diagnostic tools for inner ear disorders.

Conclusion

In this study, a working first proof of concept for the fully automatic segmentation of the inner ear using deep learning is demonstrated. Overall, the proposed auto-segmentation model is equivalent to manual segmentation and offers a reliable, consistent, and efficient method for inner ear segmentation that can be used in a variety of clinical applications, such as 3D visualization, surgical planning and quantitative image analysis. Auto-segmentation of the inner ear in its current form might open doors towards automated diagnostic tools for inner ear disorders.