Landmark detection in 2D bioimages for geometric morphometrics: a multi-resolution tree-based approach

The detection of anatomical landmarks in bioimages is a necessary but tedious step for geometric morphometrics studies in many research domains. We propose variants of a multi-resolution tree-based approach to speed up the detection of landmarks in bioimages. We extensively evaluate our method variants on three different datasets (cephalometric, zebrafish, and drosophila images). We identify the key method parameters (notably the multi-resolution) and report results with respect to human ground truths and existing methods. Our method achieves recognition performance competitive with existing approaches while being generic and fast. The algorithms are integrated in the open-source Cytomine software and we provide parameter configuration guidelines so that they can be easily exploited by end-users. Finally, the datasets are readily available through a Cytomine server to foster future research.


Datasets
Datasets (images and ground-truth landmarks) are available online on a Cytomine web server [12] at http://www.demo.cytomine.be. End-users can access the data through Cytomine-WebUI using:
• login: landmark
To retrieve the data automatically, a basic script located at https://github.com/cytomine/Cytomine-python-datamining/tree/master/cytomine-applications/landmark_model_builder/download_datasets.py can be used. After installation of the Cytomine Python client (see http://doc.cytomine.be/x/TQK8), this script uses the Cytomine RESTful API to retrieve the data with the following command:
$ python download_datasets.py <repository>
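When the download has to be triggered from another Python program rather than from a shell, the command above can be wrapped as in the following minimal sketch. The function names build_download_command and download_datasets are hypothetical helpers introduced here for illustration; the sketch assumes that download_datasets.py is reachable from the working directory and that it takes a single <repository> argument, as in the command above.

```python
import subprocess
import sys


def build_download_command(repository, script="download_datasets.py"):
    """Return the argument list for: python download_datasets.py <repository>.

    Hypothetical helper: `script` is assumed to be the path to the download
    script, and `repository` the value passed as its <repository> argument.
    """
    return [sys.executable, script, str(repository)]


def download_datasets(repository, script="download_datasets.py"):
    """Run the download script; raises CalledProcessError if it exits with an error."""
    subprocess.run(build_download_command(repository, script), check=True)
```

Using sys.executable runs the script with the same interpreter as the caller, which is presumably the one where the Cytomine Python client has been installed.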
The source code for building the different models is available in the landmark_model_builder repository: build_[generic,lc,dmbl]_model.py contains the source code for our algorithm, LC and DMBL. The test_[generic,lc,dmbl].sh files can be used to set the algorithm parameters and build the models. The source code for predicting landmark positions on new images using the built models is available in the landmark_predict repository: landmark_[generic,lc,dmbl]_predict.py contains the source code for our algorithm, LC and DMBL. The test_[generic,lc,dmbl].sh files can be used to set the algorithm parameters and perform the detection. Please note that the corresponding software must be added to your Cytomine instance using the add_software files.

Dataset Description
In this section, we give more details about the context and describe the materials and methods used to obtain our images.

CEPHA
Cephalometry aims at analyzing the human cranium for orthodontic diagnosis and treatment planning. This dataset has been previously described in [15,14]. 300 cephalometric X-ray images were collected from 300 patients aged 6 to 60 years. The images were acquired with a Soredex CRANEX Excel Ceph machine (Tuusula, Finland) and Soredex SorCom software (3.1.5, version 2.0). Image resolution is 1935 by 2400 pixels in TIFF format. Regarding the ground truth data for evaluation, 19 landmarks were manually marked and reviewed by two experienced medical doctors for each image. Ethical approval was obtained to conduct the study with IRB Number 1-102-05-017, which was approved by the research ethics committee of the Tri-Service General Hospital in Taipei, Taiwan. The data is available in Cytomine Project LANDMARKS-NTUST-CEPHA.

DROSO
Developmental homeostasis enables the constancy of the phenotype despite genetic, environmental and stochastic variations. Precise quantification of morphological traits is paramount to estimate perturbations and tackle the genetic and molecular bases of developmental homeostasis. The Drosophila wing, with its plane and stereotyped structure, is well suited to quantify subtle variations in size and shape in a population that would reveal inefficient homeostasis. Fifteen morphological landmarks corresponding to intersections between veins or between veins and the margin are used to describe wing size and shape with geometric morphometrics [8].
w1118 flies were raised on standard yeast-cornmeal medium at 25 °C. Crosses were performed between 5 females and 5 males and transferred every 48 h. Thirty females from the total offspring were sampled and their wings mounted on one slide, dorsal side up, in Hoyer's medium. Slides were scanned with a Hamamatsu Nanozoomer Digital Slide scanner, running the Nanozoomer software with a 20x objective and an 8-bit camera. Wing pictures were separately exported into TIFF format using NDP.view with the 5x lens and oriented with hinges to the left (images are 1440 by 900 pixels). The fifteen morphometric landmarks were manually acquired as described in [5]. The data is available in Cytomine Project LANDMARKS-UPMC-DROSO.

ZEBRA
The zebrafish is increasingly used for studying embryogenesis in vertebrates; its rapid development and the transparency of its embryos and larvae have led to the identification of several mutants deficient in skeletal morphogenesis [13]. In particular, the head skeleton is the first to undergo ossification, by first forming a cartilaginous matrix starting at 3 dpf which is later converted into bone structures through perichondral ossification. Other bone elements are formed without a pre-existing cartilage matrix by endomembraneous ossification. These processes, and the precise and reproducible positioning of the different elements at different stages of development under normal conditions, have been well studied and described [7,3]. Many of the identified mutations affecting skeletogenesis [13] cause either absence or severe malformations of the different cartilage [13,7,3,4,9] or bone elements [11]; however, recent studies focus increasingly on the signaling pathways regulating the precise positioning and shaping of the different elements [2,17]. For these studies, more precise, objective and quantitative methods for morphometric description of the head skeleton are required.
Here, zebrafish (Danio rerio) were maintained under standard conditions [16] in the GIGA zebrafish facility (licence LA2610359). Rearing and breeding were performed as previously described [1]; all protocols for experiments were evaluated by the Institutional Animal Care and Use Committee of the University of Liège and approved under the file numbers 568, 1074, and 1264 (licence LA 1610002).
The calcified bone structures in the head skeleton of zebrafish larvae were stained using Alizarin red S (Sigma-Aldrich, Diegem, Belgium) as previously described [1]. Briefly, the larvae were fixed in 4% PFA for 2 h at room temperature and rinsed several times with PBST (3.2 mM Na2HPO4, 0.5 mM KH2PO4, 1.3 mM KCl, 135 mM NaCl, 0.05% Tween® 20, pH 7.4), then pigmentation was bleached in a H2O2 solution (3% H2O2, 0.5% KOH) and finally the larvae were rinsed 3 times in solutions of 25% glycerol, 0.1% KOH and 50% glycerol, 0.1% KOH. After the bleaching, long rinses (at least 20 min each) in a 25% glycerol, 0.1% KOH solution are necessary to prevent fading of the staining. The larvae were stained in a 0.05% Alizarin red solution in water for 30 min in the dark on low agitation, rinsed in a 50% glycerol, 0.1% KOH solution to remove excess staining and kept at 4 °C in the same solution. Images of stained larvae (n=20-30 larvae) were obtained on a binocular microscope (Olympus, cell B software) by placing the larvae in glycerol in a white plastic plate, using the same illumination and acquisition parameters for each session. Bitmap images are 2576 by 1932 pixels.
The data is available in Cytomine Project LANDMARKS-ULG-ZEBRA.
Table 1 shows the parameter values that were tested during the comparison of the landmark detection algorithms. A full description of the LC and DMBL parameters can be found in their corresponding papers [6,10].

Robustness analysis
In this section, we analyze the influence of the deformations in the images on the accuracy of our method. We define the deformation d_i of an image i as the Euclidean distance between its landmarks and the mean shape (the mean position of the landmarks). This deformation is computed once the shapes have been centered, with L the number of landmarks and N the number of images. In order to keep the deformations comparable between the datasets, image heights and widths were set to 1, and the number of landmarks was fixed to L = 10. These L landmarks were selected randomly. The distribution of the deformations is given in Figure 1. From this figure, we can conclude that the deformations in the DROSO dataset are larger than in CEPHA and in ZEBRA. Figure 2 shows how the magnitude of the deformation affects the error when the distance to the mean shape criterion is used. As could be expected, RAW, SUB and GAUSSIAN features have more difficulty handling large deformations than HAAR and SURF features. DMBL also seems to encounter difficulties with larger deformations. Haar-like features seem to be the least affected by the deformations, along with LC.
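The deformation measure described above can be sketched as follows. This is only an illustration under stated assumptions, not the code used for the figures: it assumes that the Euclidean distance is taken over the stacked coordinates of the L landmarks, that centering means translating each shape so that its centroid is at the origin, and that the same randomly chosen landmark indices are used for all images of a dataset. The function names (normalize, subsample, center, mean_shape, deformations) are hypothetical.

```python
import math
import random


def normalize(landmarks, width, height):
    """Rescale pixel coordinates so that image width and height both equal 1."""
    return [(x / width, y / height) for x, y in landmarks]


def subsample(shapes, k, seed=0):
    """Keep the same k randomly selected landmarks in every shape (assumption)."""
    idx = sorted(random.Random(seed).sample(range(len(shapes[0])), k))
    return [[shape[i] for i in idx] for shape in shapes]


def center(shape):
    """Translate a shape so that its centroid lies at the origin."""
    cx = sum(x for x, _ in shape) / len(shape)
    cy = sum(y for _, y in shape) / len(shape)
    return [(x - cx, y - cy) for x, y in shape]


def mean_shape(shapes):
    """Mean position of each landmark over the N shapes."""
    n = len(shapes)
    return [
        (sum(s[l][0] for s in shapes) / n, sum(s[l][1] for s in shapes) / n)
        for l in range(len(shapes[0]))
    ]


def deformations(shapes):
    """Return d_i for every image: distance of the centered shape to the mean shape."""
    centered = [center(s) for s in shapes]
    mean = mean_shape(centered)
    return [
        math.sqrt(sum((x - mx) ** 2 + (y - my) ** 2
                      for (x, y), (mx, my) in zip(s, mean)))
        for s in centered
    ]
```

With this definition, two shapes that differ only by a translation both have a deformation of 0, since centering removes the offset before the comparison with the mean shape.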