Abstract
Automatic mitosis detection from video is an essential step in analyzing the proliferative behaviour of cells. In existing studies, a conventional object detector such as U-Net is combined with a link prediction algorithm to find correspondences between parent and daughter cells. However, these methods do not take into account the biological constraint that a cell in a frame can correspond to at most two cells in the next frame. Our model, called GNN-DOL, enables mitosis detection by complementing a graph neural network (GNN) with a differentiable optimization layer (DOL) that implements the constraint. In time-lapse microscopy sequences of cells cultured under four different conditions, we observed that the layer substantially improved detection performance in comparison with GNN-based link prediction. Our results illustrate the importance of incorporating biological knowledge explicitly into deep learning models.
Introduction
With recent advances in imaging techniques, tracking cell nuclei in time-lapse living cell microscopy images has become possible in studies of developmental biology^{1}. The cell tracking problem^{2,3} differs from general object tracking^{4,5} because of cell mitosis, i.e., a cell divides into two daughter cells (Fig. 1). Existing computer vision studies on mitosis detection can be classified into two categories: shape-based and link-based. In shape-based detection^{6,7}, each frame of the video is treated separately and cells with the characteristic shape, i.e., spherical and bright-bordered, are detected as parent cells by conventional deep learning detectors. Note that the characteristic shape is specific to certain types of medium and microscopy. In link-based detection^{8,9}, cells of all shapes are detected and the links between the cells in neighboring frames are predicted. Subsequently, cells with two outgoing links are identified as parents. An advantage of link-based methods over shape-based ones is that daughter cells can be identified as well. Jug et al.^{8} proposed to detect cells by random forest and construct links by mathematical programming. An advantage of mathematical programming is that biological constraints, e.g., that a cell divides into at most two cells, can be explicitly taken into account. More recently, Ben-Haim and Raviv^{9} proposed a graph neural network (GNN)^{10}-based method for inferring cell links including mitosis. While it outperformed other methods such as AGC^{11}, ST-GDA^{12}, BFP^{13} and MPM^{14}, the constraints are not imposed. In this paper, we develop a deep neural network that explicitly takes the constraint into account via differentiable optimization^{15}.
A deep neural network (DNN) is a function with multidimensional input and output vectors. It is a composition of multiple functions called layers, where the output vectors of upstream layers are fed to downstream layers. Layers have a number of parameters that are optimized to minimize a given loss function that depends on the training examples. Parameter optimization is done by computing the derivative of the loss function with respect to the parameters by back propagation. Namely, the derivatives of downstream layers are computed first, and those of upstream layers are then computed from the downstream derivatives using the chain rule. Layers are implemented by diverse kinds of computation including convolution^{16}, attention^{17} and message passing^{18}. DNNs have been applied to various domains^{19,20,21,22}. Amos and Kolter^{15} proposed to employ a mathematical program as a layer, that is, the coefficients and the solution of the program are designated as input and output, respectively. To enable back propagation, the derivative of the solution with respect to the coefficients is computed based on the Karush–Kuhn–Tucker (KKT) conditions and the implicit function theorem. This layer is called a differentiable optimization layer (DOL). An advantage of a DOL is that a hard constraint can be imposed on the output. In the following, we use a DOL to impose the biological constraint of mitosis.
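The KKT-based differentiation described above can be illustrated on a small example. The following is a minimal NumPy sketch, not the implementation used in this work: it assumes a quadratic program with equality constraints only (inequality constraints need additional complementarity terms), and the function names `solve_eq_qp` and `dz_dq` are hypothetical. It solves the KKT linear system, obtains the Jacobian of the solution with respect to the linear coefficient from the implicit function theorem, and compares it with finite differences.

```python
import numpy as np

def solve_eq_qp(P, q, A, b):
    """Solve min_z 0.5 z^T P z + q^T z  s.t.  A z = b through its KKT system."""
    n, p = P.shape[0], A.shape[0]
    # KKT conditions: P z + A^T nu = -q  and  A z = b.
    K = np.block([[P, A.T], [A, np.zeros((p, p))]])
    sol = np.linalg.solve(K, np.concatenate([-q, b]))
    return sol[:n], K

def dz_dq(K, n):
    """Jacobian of the optimal z with respect to q (implicit function theorem)."""
    # Differentiating K [z; nu] = [-q; b] w.r.t. q gives K [dz; dnu] = [-I; 0].
    return -np.linalg.inv(K)[:n, :n]

rng = np.random.default_rng(0)
n, p = 4, 2
M = rng.normal(size=(n, n))
P = M @ M.T + n * np.eye(n)   # positive definite quadratic term
q = rng.normal(size=n)
A = rng.normal(size=(p, n))
b = rng.normal(size=p)

z_star, K = solve_eq_qp(P, q, A, b)
J = dz_dq(K, n)

# Finite-difference check of the analytic Jacobian.
eps = 1e-6
J_fd = np.zeros((n, n))
for j in range(n):
    dq = np.zeros(n)
    dq[j] = eps
    z_pert, _ = solve_eq_qp(P, q + dq, A, b)
    J_fd[:, j] = (z_pert - z_star) / eps

max_err = float(np.max(np.abs(J - J_fd)))
print("max |J - J_fd| =", max_err)
```

In a DOL, a Jacobian of this kind is what back propagation multiplies with the downstream gradient, so that the parameters of the upstream layers producing the coefficients can be updated.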
Our method assumes that all cell positions at frames t and \(t+1\) have already been identified, e.g., by U-Net^{16}. Next, a cell graph is made for each frame by designating the identified cells as nodes and connecting neighboring nodes by edges. A node feature vector contains visual features and an edge feature vector contains positional features. Our DNN has two blocks as shown in Fig. 2. The upstream one, a GNN, converts the two cell graphs into a similarity matrix Q, and the downstream one, a DOL, takes Q as the coefficient of a quadratic program and produces the cell correspondence matrix across the frames. If a cell at frame t corresponds to two cells at frame \(t+1\), it is identified as a parent cell.
Our method is evaluated on the public C2C12 dataset, which contains time-lapse microscopy image sequences. In addition to the original annotations of cell identities, extra annotations by Su et al.^{7} were used. To measure the impact of the DOL, we implemented a naive GNN that predicts the correspondence matrix by a multilayer perceptron (MLP) layer that does not impose the constraint. In experiments, we observed that our method enhances the accuracy of mitosis detection in comparison to the naive GNN and the GNN by Ben-Haim and Raviv^{9}, demonstrating that the use of the DOL affected the upstream parameters of the GNN in a favorable manner. A general tendency of deep learning approaches is to avoid hard constraints that describe the rules and to let the model learn the rules from a large amount of data^{23}. In comparison to general image data, however, scientific image data are smaller in amount by orders of magnitude^{1}. Our results show that incorporating biological constraints via a DOL is effective in scientific domains, and our approach may be applicable to other scientific problems.
Method
Preprocessing by U-Net
As training data, the positions of all cells at all frames and the correspondence between cells at neighboring frames are given. Each cell corresponding to two cells in the next frame is labeled as a parent cell. First, we build a predictor of cell positions using a deep neural network called U-Net^{16}. The input of U-Net is an image and the output is a segmentation map, i.e., an image of the same size where all cell centroids are marked as bright pixels. U-Net has a network architecture that first gradually contracts an input image to a latent vector with multiple layers and then expands it to the output image. By applying U-Net to a test image after training, we can predict the positions of all cells. In addition, the shape feature vector of the image patch including each cell can be extracted from one of the contraction layers.
GNN-DOL
Using the information from the trained U-Net, we construct a parent cell predictor that judges whether a cell is a parent or not based on cell graphs. The ith cell at frame t is denoted as \((x_i,y_i,{\textbf{v}}_i), i \in [1,m]\), where \((x_i,y_i)\) denotes the position and \({\textbf{v}}_i\) is the shape feature vector. Similarly, those at frame \(t+1\) are denoted as \((x_i^\prime , y^\prime _i, {\textbf{v}}^\prime _i), i \in [1,m^\prime ]\). The cell graph for frame t is constructed by designating the m cells as nodes and connecting each cell to its two nearest neighbors among the cells by edges. Each node is labeled by \({\textbf{v}}_i\) and each edge is labeled by a positional feature vector, \({\textbf{e}}_{ij} = (x_s,y_s,x_t,y_t)\), where \(s,t (s<t)\) indicate the end nodes.
Our method GNN-DOL takes two cell graphs as input and provides the cell correspondence matrix as output. First, node features and edge features are updated via message passing^{18}. Let N(i) denote the adjacent nodes of i. The node feature vector is updated using a multilayer perceptron (MLP) as
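The cell graph construction described above can be sketched as follows. This is a minimal NumPy illustration under stated assumptions rather than the authors' code: the function name `build_cell_graph` is hypothetical, edges are treated as undirected and stored once with the smaller node index first, and Euclidean distance between centroids is assumed for the nearest-neighbor search.

```python
import numpy as np

def build_cell_graph(xy, k=2):
    """Connect each cell to its k nearest neighbours.

    Returns the sorted undirected edges (s, t) with s < t and, for each edge,
    the positional feature vector (x_s, y_s, x_t, y_t).
    """
    xy = np.asarray(xy, dtype=float)
    m = len(xy)
    k = min(k, m - 1)                      # a cell cannot be its own neighbour
    diff = xy[:, None, :] - xy[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)         # exclude self-distances
    edges = set()
    for i in range(m):
        for j in np.argsort(dist[i])[:k]:
            edges.add((min(i, int(j)), max(i, int(j))))
    edges = sorted(edges)
    feats = np.array([[*xy[s], *xy[t]] for s, t in edges], dtype=float)
    return edges, feats.reshape(-1, 4)

# Four hypothetical cell centroids (x, y) in one frame.
positions = [(0, 0), (1, 0), (0, 2), (5, 5)]
edges, edge_features = build_cell_graph(positions)
print(edges)
print(edge_features)
```

Because the neighbor relation is not symmetric, a node can end up with more than two incident edges, as happens for the isolated fourth cell in this example.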
where
The edge feature vector is updated as
An MLP with two fully connected layers is used for both updates. Let \({{\bar{{\textbf{v}}}}}_k^\prime\) and \({{\bar{{\textbf{e}}}}}_{kl}^\prime\) denote the updated feature vectors from frame \(t+1\). An \(m m^\prime \times m m^\prime\) pairwise similarity matrix Q is derived as follows. Let \(\alpha (i,j) = im+j\) and \(\beta (k,l) = k m^\prime +l\). Also let E and \(E^\prime\) denote the sets of edges in the two cell graphs. Then, Q is described as
Let Z denote an \(m \times m^\prime\) correspondence matrix, and \({\textbf{z}}= \textrm{vec}(Z)\). The quadratic program implemented in our DOL is described as
where \(A_1 \in {\mathbb {R}}^{m\times m m^\prime }\) and \(A_2 \in {\mathbb {R}}^{n \times m m^\prime }\) are defined as
The first constraint in (1) ensures that a cell at frame \(t+1\) corresponds to a cell at frame t. The second one is related to mitosis, i.e., a cell at frame t can correspond to at most two cells at frame \(t+1\). From the optimal solution \({\textbf{z}}^*\), the correspondence matrix is obtained by taking the column-wise maximum. If the cell of interest in frame t is connected to two cells in frame \(t+1\), it is predicted as a parent. Training of GNN-DOL is implemented with PyTorch and the neuralscs Python package^{24} on an NVIDIA Tesla V100 GPU (32GB). We adopted the weighted binary cross entropy loss and used the Adam optimizer with learning rate \(1.0 \times 10^{-4}\). In applying the trained network to an image sequence, the CVXPY package^{25} was used to solve the quadratic program. The detailed algorithm of GNN-DOL is shown in Supplementary Information.
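The two constraints and the decoding step can be made concrete with a short NumPy sketch. It is illustrative only and rests on assumptions that the text above does not fix: a row-major vectorization \(z_{i m^\prime + k} = Z_{ik}\), the first constraint read as an equality (each column of Z sums to one), and hypothetical names `constraint_matrices`, `A_col`, `A_row` and `decode`. No quadratic program is solved here; the sketch starts from a given relaxed solution.

```python
import numpy as np

def constraint_matrices(m, m_next):
    """Linear constraints on z = vec(Z), assuming z[i * m_next + k] = Z[i, k]."""
    # Column sums: every cell at frame t+1 is assigned to one cell at frame t.
    A_col = np.kron(np.ones((1, m)), np.eye(m_next))
    # Row sums: every cell at frame t has at most two successors.
    A_row = np.kron(np.eye(m), np.ones((1, m_next)))
    return A_col, A_row

def decode(z_star, m, m_next):
    """Take the column-wise maximum, then flag rows with two links as parents."""
    Z_soft = np.asarray(z_star, dtype=float).reshape(m, m_next)
    Z = np.zeros((m, m_next), dtype=int)
    Z[Z_soft.argmax(axis=0), np.arange(m_next)] = 1
    parents = np.flatnonzero(Z.sum(axis=1) == 2)
    return Z, parents

# Hypothetical relaxed solution for m = 2 cells at frame t and m' = 3 at t+1.
m, m_next = 2, 3
z_star = np.array([0.9, 0.8, 0.1,
                   0.1, 0.2, 0.9])
A_col, A_row = constraint_matrices(m, m_next)
Z, parents = decode(z_star, m, m_next)

print(Z)
print("parent cells at frame t:", parents.tolist())
print("column sums:", A_col @ Z.ravel())
print("row sums:", A_row @ Z.ravel())
```

The column-wise maximum guarantees one predecessor per cell at frame \(t+1\), whereas the bound of two successors per cell comes from the quadratic program that produces \({\textbf{z}}^*\), not from the decoding step.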
Naive GNN
To measure the effect of the DOL, we prepared another model called naive GNN, where the updated features are fed into an MLP whose output is the correspondence matrix \({\textbf{z}}\). Note that no constraints are imposed on \({\textbf{z}}\) here, and that the absence of the DOL affects the upstream GNN parameters via back propagation.
Results and discussion
We carried out all experiments on the public dataset C2C12^{26}, which contains time-lapse microscopy image sequences of cells cultured under four different media conditions: fibroblast growth factor 2 (FGF2), bone morphogenetic protein 2 (BMP2), FGF2 + BMP2, and control (no growth factor). Each image sequence is composed of 1013 frames with a size of 1392 \(\times\) 1040 pixels. The images were recorded by a Zeiss Axiovert T135V microscope with a resolution of 1.3 \(\upmu\)m/pixel. The cell displacement between two adjacent frames is about 6 pixels on average. The annotation of C2C12 consists of the coordinates of cell centroids and their corresponding cell identifiers. In the original distribution of C2C12, only one sequence, F0009, was annotated. We also used additional annotations covering all sequences, contributed by Su et al.^{7}. For each media condition, three sequences are used as training examples and the remaining sequence is used as a test example (Table 1). Fig. 3a shows the number of cells at each frame. Cells may enter or exit the field of view, but the number of cells increased consistently due to mitosis.
As preprocessing for GNN-DOL, U-Net is trained with the sequence F0009, and shape feature vectors are extracted from the contraction layer corresponding to 50 \(\times\) 50 image patches. To accelerate the training of GNN-DOL, the number of cells per frame is reduced to 30 as follows. For frame \(t+1\), all the daughter cells are included first. The rest is filled with randomly chosen cells. For frame t, we include the parent cells of the daughter cells and the predecessors of the randomly chosen cells. Finally, additional cells are chosen at frame t to adjust the number of cells to 30. Note that all cells are included in the cell graphs when applying the trained GNN-DOL to test examples.
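The cell subsampling procedure for training can be sketched in plain Python. This is an illustration under assumptions, not the released code: the function name `subsample` and the dictionary `pred` (mapping each cell at frame \(t+1\) to its predecessor at frame t, or `None` for a cell that entered the field of view) are hypothetical, and daughters are taken to be cells whose predecessor has exactly two successors.

```python
import random
from collections import Counter

def subsample(cells_t, cells_next, pred, n=30, seed=0):
    """Pick about n cells per frame, always keeping daughters and their parents."""
    rng = random.Random(seed)
    successors = Counter(p for p in pred.values() if p is not None)

    # Frame t+1: all daughter cells first, then random cells up to n.
    daughters = [c for c in cells_next if successors.get(pred.get(c), 0) == 2]
    daughter_set = set(daughters)
    others = [c for c in cells_next if c not in daughter_set]
    n_fill = max(0, min(n - len(daughters), len(others)))
    chosen_next = daughters + rng.sample(others, n_fill)

    # Frame t: parents of the daughters and predecessors of the random cells.
    chosen_t = []
    for c in chosen_next:
        p = pred.get(c)
        if p is not None and p not in chosen_t:
            chosen_t.append(p)

    # Frame t: pad with additional random cells up to n.
    rest = [c for c in cells_t if c not in chosen_t]
    n_extra = max(0, min(n - len(chosen_t), len(rest)))
    chosen_t += rng.sample(rest, n_extra)
    return chosen_t, chosen_next

# Hypothetical frame pair: cell 0 divides into cells 100 and 101.
cells_t = list(range(40))
cells_next = list(range(100, 141))
pred = {100: 0, 101: 0}
pred.update({100 + i: i - 1 for i in range(2, 41)})

chosen_t, chosen_next = subsample(cells_t, cells_next, pred)
print(len(chosen_t), len(chosen_next))
```

Keeping every mitosis event in the reduced graphs means that the rare positive examples are never discarded by the random selection.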
Mitosis detection
We compared GNN-DOL with a state-of-the-art GNN model by Ben-Haim and Raviv (GNN-B&R)^{9}. This model was trained only with F0009 in the original paper, but we trained it with all the annotations using their code^{27}. Note that GNN-B&R does not impose the constraints on cell correspondences. GNN-DOL was additionally compared with MPM^{14}, a popular mitosis detection method. We evaluated how accurately mitosis events (i.e., the parent cells) are identified in each test sequence (Table 2). GNN-DOL performs consistently better than GNN-B&R, MPM and Naive-GNN. GNN-B&R achieved high precision but relatively low recall, indicating that it is likely to miss hard-to-detect mitosis events. Our results suggest that imposing the constraint by the DOL is beneficial in identifying mitosis events correctly. A downside of using the DOL, however, is that it is computationally more demanding due to the use of quadratic programming. Figure 3b shows that the computational time of GNN-DOL for processing a pair of frames is about three times longer than that of Naive-GNN.
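For reference, precision, recall and F1-score over detected parent cells can be computed as in the following sketch. The matching rule is an assumption: a detection counts as correct only if its (frame, cell identifier) pair appears in the ground truth, whereas benchmark protocols may allow a spatial or temporal tolerance. The function name and the example identifiers are hypothetical.

```python
def precision_recall_f1(predicted, truth):
    """Set-based precision, recall and F1 for detected mitosis events."""
    predicted, truth = set(predicted), set(truth)
    tp = len(predicted & truth)                      # correctly detected parents
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(truth) if truth else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1

# Hypothetical (frame, cell id) pairs of parent cells.
pred_parents = {(12, 3), (12, 7), (40, 1)}
true_parents = {(12, 3), (40, 1), (55, 9), (60, 2)}

p, r, f1 = precision_recall_f1(pred_parents, true_parents)
print(f"precision={p:.3f} recall={r:.3f} F1={f1:.3f}")
```

A detector with high precision and low recall, as reported for GNN-B&R, makes few false detections but misses a sizeable share of the true events.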
Cell correspondence prediction
We also investigated the accuracy of cell correspondence prediction. Frames with at least one parent cell and their next frames are chosen for examination. GNN-DOL, GNN-B&R and Naive-GNN are applied to the pairs of cell graphs derived from the frame pairs. The predicted correspondence matrix Z is compared with the ground truth. In Fig. 4a,b, GNN-DOL is compared with Naive-GNN and GNN-B&R, respectively, in terms of F1-score. GNN-DOL performed consistently better than Naive-GNN and GNN-B&R, showing the significant contribution of the DOL. We also found that GNN-B&R tends to fail badly when more than one mitosis event occurs in a frame. Cases with multiple mitosis events are rare in the training data, so they are hard to learn with purely data-driven approaches.
Conclusion
In this paper, we proposed a differentiable model, GNN-DOL, to detect cell mitosis from video. GNN-DOL was particularly accurate when there were multiple mitosis events in an image. Our method should be useful for measuring mitosis frequency under the influence of a drug, which is an important step in drug development. While GNN-DOL is designed for predicting cell correspondences in two adjacent images and is amenable to online video processing, it may also contribute to cell lineage analysis^{28,29} over a long time frame. A drawback of our method is that U-Net and GNN-DOL are separate and not unified. The cell identification mistakes of U-Net are carried over to GNN-DOL and cannot be corrected there. If a unified network that can learn the mitosis detection task in an end-to-end fashion is constructed, this problem may be alleviated.
Deep learning models have shown that, given a large amount of data, rules and constraints behind the data can be automatically discovered and utilized for accurate prediction^{23}. However, in scientific applications such as mitosis detection, the amount of data is inherently limited, and incorporating known constraints as a DOL can be advantageous over purely data-driven approaches. In the future, we would like to explore the possibilities of DOLs further in various scientific domains.
Data availability
The source code of GNN-DOL can be found at https://github.com/95HaishanZHANG/GNNDOL. The C2C12 dataset is available at https://osf.io/ysaq2/. The additional annotations are available from Prof. An-An Liu (liuanantju@163.com) upon request.
References
Moen, E. et al. Deep learning for cellular image analysis. Nat. Methods 16, 1233–1246 (2019).
Hirose, T., Kotoku, J., Toki, F., Nishimura, E. K. & Nanba, D. Label-free quality control and identification of human keratinocyte stem cells by deep learning-based automated cell tracking. Stem Cells 39, 1091–1100 (2021).
Huang, L., McKay, G. N. & Durr, N. J. A deep learning bidirectional temporal tracking algorithm for automated blood cell counting from non-invasive capillaroscopy videos. In International Conference on Medical Image Computing and Computer-Assisted Intervention, 415–424 (Springer, 2021).
Chen, L.-C., Zhu, Y., Papandreou, G., Schroff, F. & Adam, H. Encoder-decoder with atrous separable convolution for semantic image segmentation. In Proceedings of the European Conference on Computer Vision (ECCV), 801–818 (2018).
Ren, S., He, K., Girshick, R. & Sun, J. Faster R-CNN: Towards real-time object detection with region proposal networks. Advances in Neural Information Processing Systems 28 (2015).
Nishimura, K. & Bise, R. Spatial-temporal mitosis detection in phase-contrast microscopy via likelihood map estimation by 3dcnn. In 2020 42nd Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC), 1811–1815 (IEEE, 2020).
Su, Y.-T., Lu, Y., Liu, J., Chen, M. & Liu, A.-A. Spatio-temporal mitosis detection in time-lapse phase-contrast microscopy image sequences: A benchmark. IEEE Trans. Med. Imaging 40, 1319–1328 (2021).
Jug, F., Levinkov, E., Blasse, C., Myers, E. W. & Andres, B. Moral lineage tracing. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 5926–5935 (2016).
Ben-Haim, T. & Raviv, T. R. Graph neural network for cell tracking in microscopy videos. In European Conference on Computer Vision, 610–626 (Springer, 2022).
Wang, Y., Kitani, K. & Weng, X. Joint object detection and multi-object tracking with graph neural networks. In 2021 IEEE International Conference on Robotics and Automation (ICRA), 13708–13715 (IEEE, 2021).
Bensch, R. & Ronneberger, O. Cell segmentation and tracking in phase contrast images using graph cut with asymmetric boundary costs. In 2015 IEEE 12th International Symposium on Biomedical Imaging (ISBI), 1220–1223 (IEEE, 2015).
Bise, R., Yin, Z. & Kanade, T. Reliable cell tracking by global data association. In 2011 IEEE international symposium on biomedical imaging: From nano to macro, 1004–1010 (IEEE, 2011).
Nishimura, K., Hayashida, J., Wang, C., Ker, D. F. E. & Bise, R. Weakly-supervised cell tracking via backward-and-forward propagation. In Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part XII 16, 104–121 (Springer, 2020).
Hayashida, J., Nishimura, K. & Bise, R. Mpm: Joint representation of motion and position map for cell tracking. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 3823–3832 (2020).
Amos, B. & Kolter, J. Z. Optnet: Differentiable optimization as a layer in neural networks. In International Conference on Machine Learning, 136–145 (PMLR, 2017).
Ronneberger, O., Fischer, P. & Brox, T. U-net: Convolutional networks for biomedical image segmentation. In International Conference on Medical Image Computing and Computer-Assisted Intervention, 234–241 (Springer, 2015).
Vaswani, A. et al. Attention is all you need. Advances in Neural Information Processing Systems 30 (2017).
Brasó, G. & Leal-Taixé, L. Learning a neural solver for multiple object tracking. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 6247–6257 (2020).
Tiwari, P., Lakhan, A., Jhaveri, R. H. & Gronli, T.-M. Consumer-centric internet of medical things for cyborg applications based on federated reinforcement learning. IEEE Trans. Consum. Electron. https://doi.org/10.1109/TCE.2023.3242375 (2023).
Chui, K. T. et al. Multi-round transfer learning and modified generative adversarial network for lung cancer detection. Int. J. Intell. Syst. 2023, 1–14 (2023).
Wang, Z. et al. CNN- and GAN-based classification of malicious code families: A code visualization approach. Int. J. Intell. Syst. 37, 12472–12489 (2022).
An, L., Yan, Z., Wang, W., Liu, J. K. & Yu, K. Enhancing visual coding through collaborative perception. IEEE Trans. Cogn. Dev. Syst. https://doi.org/10.1109/TCDS.2022.3203422 (2022).
LeCun, Y., Bengio, Y. & Hinton, G. Deep learning. Nature 521, 436–444 (2015).
Neuralscs package. https://github.com/facebookresearch/neuralscs.
CVXPY. https://www.cvxpy.org/.
Ker, D. F. E. et al. Phase contrast time-lapse microscopy datasets with automated and manual cell tracking annotations. Sci. Data 5, 1–12 (2018).
Celltrackergnn package. https://github.com/talbenha/celltrackergnn.
Malin-Mayor, C. et al. Automated reconstruction of whole-embryo cell lineages by learning from sparse annotations. Nat. Biotechnol. 41, 44 (2022).
Sugawara, K., Çevrim, Ç. & Averof, M. Tracking cell lineages in 3d by incremental deep learning. Elife 11, e69380 (2022).
Acknowledgements
This work is supported by AMED JP20nk0101111, JST ERATO JPMJER1903, and CREST JPMJCR21O2, JSPS Kakenhi 23K16939.
Author information
Contributions
H.Z. and K.T. conceived the research. H.Z. and D.H.N. designed and implemented the machine learning algorithms. H.Z. conducted computational experiments. H.Z. and K.T. wrote the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher's note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary Information
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Zhang, H., Nguyen, D.H. & Tsuda, K. Differentiable optimization layers enhance GNN-based mitosis detection. Sci Rep 13, 14306 (2023). https://doi.org/10.1038/s4159802341562y
DOI: https://doi.org/10.1038/s4159802341562y