Article | Published:

An integrated iterative annotation technique for easing neural network training in medical image analysis


Neural networks promise to bring robust, quantitative analysis to medical fields. However, their adoption is limited by the technicalities of training these networks and the required volume and quality of human-generated annotations. To address this gap in the field of pathology, we have created an intuitive interface for data annotation and the display of neural network predictions within a commonly used digital pathology whole-slide viewer. This strategy used a ‘human-in-the-loop’ to reduce the annotation burden. We demonstrate that segmentation of human and mouse renal micro compartments is repeatedly improved when humans interact with automatically generated annotations throughout the training process. Finally, to show the adaptability of this technique to other medical imaging fields, we demonstrate its ability to iteratively segment human prostate glands from radiology imaging data.

A preprint version of the article is available at arXiv.

Data availability

We have made the data used for analysing the performance of the H-AI-L method available at The folder contains a detailed note describing the data. Specifically, the folder contains pathology and radiology image data used for training and testing our H-AI-L method, ground-truth and predicted segmentations of the test image data, network corrections and respective annotations of the training image data for different iterations, and the network models trained at different iterations. We have made our code openly available online at



Acknowledgements

The project was supported by the faculty start-up funds from the Jacobs School of Medicine and Biomedical Sciences, University at Buffalo, the University at Buffalo IMPACT award, NIDDK Diabetic Complications Consortium grant DK076169 and NIDDK grant R01 DK114485. The prostate imaging data were collected with funds from the State of Wisconsin Tax Check-off Program for Prostate Cancer research. Percent efforts for P.S.L. and S.D.M. were provided by R01 CA218144, and the National Center for Advancing Translational Sciences NIH UL1TR001436 and TL1TR001437. We thank NVIDIA Corporation for the donation of the Titan X Pascal GPU used for this research.

Author information

B.L. conceived the H-AI-L method, analysed the data and wrote the manuscript. The code was written by B.L. and B.G. D.G. contributed to generating the results for Fig. 4. S.D.M. and P.S.L. provided the radiology data and annotations for the prostate MRI analysis, and edited the manuscript. R.Y. implemented the mouse model. S.J. provided human renal biopsy data. J.E.T. evaluated renal pathology segmentation as a domain expert. K.-Y.J. provided the IFTA annotation for Fig. 5. P.S. was responsible for the overall coordination of the project, for mentoring and for formalizing the image analysis concept, and oversaw manuscript preparation.

Competing interests

The authors declare no competing interests.

Correspondence to Pinaki Sarder.

Supplementary information

Supplementary Information

Supplementary figures and table

Reporting Summary

Supplementary Video 1

Supplementary video for Fig. 1

Fig. 1: Iterative H-AI-L pipeline overview.
Fig. 2: H-AI-L pipeline performance analysis for glomerular segmentation on holdout mouse WSIs.
Fig. 3: H-AI-L human annotation errors (mouse data).
Fig. 4: Multiclass nuclei prediction on a mouse WSI.
Fig. 5: Multiclass IFTA prediction on a holdout human renal WSI.
Fig. 6: H-AI-L method performance analysis for human prostate segmentation from T2 MRI slices.
Fig. 7: Annotation time-savings using the H-AI-L method while comparing to baseline segmentation speed.