idtracker.ai: tracking all individuals in small or large collectives of unmarked animals


Understanding of animal collectives is limited by the ability to track each individual. We describe an algorithm and software that extract all trajectories from video, with high identification accuracy for collectives of up to 100 individuals. idtracker.ai uses two convolutional networks: one that detects when animals touch or cross and another that identifies individual animals. The tool is trained with a protocol that adapts to video conditions and tracking difficulty.


Fig. 1: Tracking by identification in idtracker.ai.
Fig. 2: Using idtracker.ai to study small and large animal groups.

Code availability

idtracker.ai is open-source and free software (license GPL v.3). The source code and the instructions for its installation are available online, as are a quick-start user guide and a detailed explanation of the GUI. The software is also provided as Supplementary Software.

Data availability

Processed data that can be used to reproduce all figures and tables are available online, and lossless compressed videos can be downloaded from the same page. Raw videos are available from the corresponding author upon reasonable request. A library of single-individual zebrafish images for use in testing identification methods is also available online. Two example videos, one of 8 adult zebrafish and one of 100 juvenile zebrafish, are included as part of the quick-start user guide.




Acknowledgements

We thank A. Groneberg, A. Laan and A. Pérez-Escudero for discussions; J. Baúto, R. Ribeiro, P. Carriço, T. Cruz, J. Couceiro, L. Costa, A. Certal and I. Campos for assistance in software, arena design and animal husbandry; and A. Bruce (Monash University, Melbourne, Australia), N. Blüthgen (Technische Universität Darmstadt, Darmstadt, Germany), C. Ferreira, A. Laan and M. Iglesias-Julios (Champalimaud Foundation, Lisbon, Portugal) for videos of ants, flies and zebrafish fights. This study was supported by Congento LISBOA-01-0145-FEDER-022170, NVIDIA (M.G.B., F.H. and G.G.d.P.), PTDC/NEU-SCC/0948/2014 (G.G.d.P.) and Champalimaud Foundation (G.G.d.P.). F.R.-F. acknowledges an FCT PhD fellowship.

Author information


Contributions

F.R.-F., M.G.B. and G.G.d.P. devised the project and algorithms and analyzed data. F.R.-F. and M.G.B. wrote the code with help from F.H. M.G.B. managed the code architecture and GUI. F.R.-F. managed testing procedures. R.H. built setups and conducted experiments with help from F.R.-F. G.G.d.P. supervised the project. M.G.B. wrote the supplementary material with help from F.R.-F., R.H., F.H. and G.G.d.P., and G.G.d.P. wrote the main text with help from F.R.-F., M.G.B. and F.H.

Corresponding author

Correspondence to Gonzalo G. de Polavieja.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Integrated supplementary information

Supplementary Figure 1 Training dataset of individual images.

(a) Holding grid used to record 184 juvenile zebrafish (TU strain, 31 dpf) in separated chambers (60-mm-diameter Petri dishes). (b) Sample frame showing the individuals used to create the dataset and the individuals used as social context (n = 46 videos corresponding to n = 184 different individuals; ~18,000 frames per individual). (c) Summary of the individual-images dataset. The dataset is composed of a total of ~3,312,000 uncompressed, grayscale, labeled images (52 × 52 pixels).

Supplementary Figure 2 Single-image identification accuracy for different group sizes and different variations of the identification network.

Each network is trained from scratch using 3,000 temporally uncorrelated images per animal (90% for training and 10% for validation) and then tested with 300 new temporally uncorrelated images to compute the single-image identification accuracy (Supplementary Notes). We train and test each network five times. For every repetition, the individuals of the group and the images of each individual are selected randomly. Images are extracted from videos of 184 different animals recorded in isolation (Supplementary Fig. 1). Colored lines with markers represent single-image accuracies (mean ± s.d., n = 5) for network architectures with different numbers of convolutional layers (a; see Supplementary Table 2 for the architectures) and different sizes and numbers of fully connected layers (b; see Supplementary Table 3 for the architectures). The black solid line with diamond markers shows the accuracy for the network used to identify images in idtracker.ai (see Supplementary Table 1, identification convolutional neural network).
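The image selection and the mean ± s.d. summary described in this caption can be sketched as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, the use of the sample standard deviation is an assumption, and plain random sampling only stands in for the selection of temporally uncorrelated images.

```python
import random
import statistics


def split_images(image_ids, n_train_val=3000, val_fraction=0.1, n_test=300, rng=random):
    """Randomly pick images for one animal and split them into three disjoint sets.

    The counts follow the caption: 3,000 images split 90%/10% into training
    and validation, plus 300 new images for testing.
    """
    chosen = rng.sample(image_ids, n_train_val + n_test)
    n_val = round(n_train_val * val_fraction)
    n_train = n_train_val - n_val
    train = chosen[:n_train]
    validation = chosen[n_train:n_train_val]
    test = chosen[n_train_val:]
    return train, validation, test


def summarize(accuracies):
    """Mean and sample s.d. over repetitions (sample s.d. is an assumption)."""
    return statistics.mean(accuracies), statistics.stdev(accuracies)
```

In this sketch, one repetition would call `split_images` once per animal in the group, and `summarize` would be applied to the five single-image accuracies obtained for one architecture and group size.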

Supplementary Figure 3 Experimental setup for recording zebrafish videos.

(a) Front view of the experimental setup used to record zebrafish in groups and in isolation. (b) Side view of the same setup with the light diffuser rolled up. (c) Close-up view of the custom-made circular tank used to record the groups of 10, 60 and 100 juvenile zebrafish. (d) Sample frame from a video of 60 animals (n = 3 videos of 10 zebrafish, n = 3 videos of 60 zebrafish, and n = 3 videos of 100 zebrafish).

Supplementary Figure 4 Experimental setup used to record fruit fly videos.

(a) Exterior view of the setup used to record flies in groups. (b) Top view of the same setup with the diffuser rolled up. (c) Close-up view of one of the two arenas used (arena 1). (d) Sample frame from a video of 100 flies (n = 1 group of 38 flies, n = 2 groups of 60 flies, n = 1 group of 72 flies, n = 2 groups of 80 flies, and n = 3 groups of 100 flies; all animals were different for each group).

Supplementary Figure 5 Automatic estimation of identification accuracy.

Comparison between the accuracy estimated automatically by idtracker.ai and the accuracy computed by human validation of the videos (Supplementary Notes). The estimated accuracy is computed over the validated portion of the video. Blue dots represent the videos referenced in Supplementary Tables 5–7.

Supplementary Figure 6 Accuracy as a function of the minimum number of images in the first global fragment used for training.

To study the effect of the minimum number of images per individual in the first global fragment used to train the identification network, we created synthetic videos using images of 184 individuals recorded in isolation (Supplementary Fig. 1). Each synthetic video consists of 10,000 frames, where the number of images in every individual fragment was drawn from a gamma distribution, and the crossing fragments lasted for three frames (Supplementary Notes). The parameters were set as follows: θ = [2,000, 1,000, 500, 250, 100], k = [0.5, 0.35, 0.25, 0.15, 0.05], number of individuals = [10, 60, 100]. For every combination of these parameters we ran three repetitions. In total, we computed both the cascade of training and identification protocols and the residual identification for 225 synthetic videos. (a) Identification accuracy for simulated (empty markers) and real videos (color markers) as a function of the minimum number of images in the first global fragment. The number next to each color marker indicates the number of animals in the video. The accuracy of the real videos was obtained by manual validation (Supplementary Tables 5–7). In some videos, animals are almost immobile for long periods of time because of low-humidity conditions. Potentially, the individual fragments acquired during these periods encode less information that is useful for identifying the animals. To account for this, we corrected the number of images in the individual fragments by considering only frames in which the animals were moving with a speed of at least 0.75 BL/s. We observed that idtracker.ai was more likely to have higher accuracy when the minimum number of images in the first global fragment used for training was > 30. (b) Distributions of the number of images per individual fragment for real videos of zebrafish, and their fits to a gamma distribution. (c) Distributions of speeds of zebrafish and fruit fly videos.
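The parameter grid and the gamma-distributed fragment lengths described in this caption can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' generator: the function name is hypothetical, k is taken as the gamma shape and θ as the scale, and rounding lengths up to at least one image and truncating the last fragment at the end of the video are choices made here for the sketch.

```python
import itertools
import math
import random

THETAS = [2000, 1000, 500, 250, 100]   # gamma scale values listed in the caption
KS = [0.5, 0.35, 0.25, 0.15, 0.05]     # gamma shape values listed in the caption
GROUP_SIZES = [10, 60, 100]
REPETITIONS = 3
N_FRAMES = 10_000                      # frames per synthetic video
CROSSING_FRAMES = 3                    # duration of each crossing fragment


def sample_fragment_lengths(k, theta, n_frames=N_FRAMES,
                            crossing=CROSSING_FRAMES, rng=random):
    """Fill one individual's timeline with gamma-distributed fragments.

    Consecutive individual fragments are separated by a crossing of
    `crossing` frames. Lengths are rounded up so that every fragment
    holds at least one image (an assumption of this sketch).
    """
    lengths = []
    used = 0
    while used < n_frames:
        length = max(1, math.ceil(rng.gammavariate(k, theta)))
        length = min(length, n_frames - used)  # truncate at the end of the video
        lengths.append(length)
        used += length + crossing
    return lengths


# Every combination of theta, k and group size, repeated three times.
conditions = list(itertools.product(THETAS, KS, GROUP_SIZES, range(REPETITIONS)))
```

The grid has 5 × 5 × 3 × 3 = 225 entries, which matches the 225 synthetic videos reported in the caption.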

Supplementary Figure 7 Performance as a function of resolution.

Human-validated accuracy of tracking results obtained at six different resolutions. Pixels per animal are here indicated at the identification stage. There are fewer pixels per animal at the segmentation stage—approximately 25 and 300 pixels per animal, compared with 100 and 600 at the identification stage, respectively.

Supplementary Figure 8 Performance after application of Gaussian blurring.

Human-validated accuracy of tracking results obtained at seven different values of the s.d. of a Gaussian filtering of the video.

Supplementary Figure 9 Performance with inhomogeneous light conditions.

Background images for two different experiments with 60 zebrafish (n = 1 experiment for each condition). Left, our standard setup; right, the same setup after switching off the IR LEDs in two walls and covering the light diffuser on the same side with a black cloth. Human-validated accuracy of tracking results is given below the images. The background image is computed as the average of equally spaced frames along the video with a period of 100 frames.
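The background computation in the last sentence of this caption can be sketched as a pixel-wise mean over every 100th frame. This is a minimal illustration, not the idtracker.ai implementation: the function name is hypothetical, and frames are represented as nested lists of grayscale values to keep the sketch free of dependencies.

```python
def background_image(frames, period=100):
    """Pixel-wise mean of frames 0, period, 2 * period, and so on.

    `frames` is a sequence of equally sized grayscale images, each given
    as a list of rows of pixel values.
    """
    sampled = frames[::period]
    if not sampled:
        raise ValueError("at least one frame is required")
    height = len(sampled[0])
    width = len(sampled[0][0])
    n = len(sampled)
    return [
        [sum(frame[y][x] for frame in sampled) / n for x in range(width)]
        for y in range(height)
    ]
```

Because the animals move between the sampled frames, averaging tends to wash them out and leave the static arena, which is why uneven illumination shows up clearly in the background image.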

Supplementary Figure 10 Attack score over time for seven pairs of fish staged to fight.

Each colored line represents the attack score of an individual (see the Methods for the definition of ‘attack score’).

Supplementary Figure 11 Correlation between the average distance to the center of the tank and the average speed for two milling groups of 100 juvenile zebrafish.

(a) Probability density of the location in the tank of three representative individuals depicted in (b) as gray markers. (b) Average speed along the video as a function of the average distance to the center of the tank for all the fish in the group. Each black dot represents an individual; the gray markers are the individuals depicted in (a). The blue dashed line is the line of best fit to the data (R² = 0.5686, Pearson’s r and P = 10⁻¹⁹, two-sided P value using Wald test with t-distribution of the test statistic). (c) Same as in (a) for a different video. (d) Same as in (b) for a different video (R² = 0.6934, Pearson’s r and P = 7 × 10⁻²⁷, two-sided P value using Wald test with t-distribution of the test statistic).
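The line of best fit and the R² values reported in this caption follow from ordinary least squares. The sketch below is a minimal illustration with hypothetical function names, not the authors' analysis code. It returns the test statistic for the slope but not the P value, which would come from a t-distribution with n − 2 degrees of freedom and is left to a statistics library.

```python
import math


def fit_line(x, y):
    """Least-squares line through (x, y): slope, intercept, Pearson's r and R^2."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r, r * r


def slope_t_statistic(r, n):
    """t statistic for the null hypothesis of zero slope (n - 2 degrees of freedom)."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))
```

Here `x` would hold each fish's average distance to the center of the tank and `y` its average speed, one pair per individual.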

Supplementary information

Supplementary Text and Figures

Supplementary Figs. 1–11, Supplementary Tables 1–12 and Supplementary Note 1

Reporting Summary

Supplementary Software contains two folders: (1) idtrackerai-1.0.3-alpha, which is the code for the software at the time of publication (the latest version is available online), and (2) idtracker.ai_Figures_and_Tables_code, which includes the code to reproduce the panels in Figs. 1 and 2, as well as the Supplementary Figures and Supplementary Tables.


Cite this article

Romero-Ferrero, F., Bergomi, M.G., Hinz, R.C. et al. idtracker.ai: tracking all individuals in small or large collectives of unmarked animals. Nat Methods 16, 179–182 (2019).
