
Protocol

Using DeepLabCut for 3D markerless pose estimation across species and behaviors

Abstract

Noninvasive behavioral tracking of animals during experiments is critical to many scientific pursuits. Extracting the poses of animals without using markers is often essential to measuring behavioral effects in biomechanics, genetics, ethology, and neuroscience. However, extracting detailed poses without markers in dynamically changing backgrounds has been challenging. We recently introduced an open-source toolbox called DeepLabCut that builds on a state-of-the-art human pose-estimation algorithm to allow a user to train a deep neural network with limited training data to precisely track user-defined features, with accuracy that matches human labeling. Here, we provide an updated toolbox, developed as a Python package, that includes new features such as graphical user interfaces (GUIs), performance improvements, and active-learning-based network refinement. We provide a step-by-step procedure for using DeepLabCut that guides the user in creating a tailored, reusable analysis pipeline with a graphics processing unit (GPU) in 1–12 h (depending on frame size). Additionally, we provide Docker environments and Jupyter Notebooks that can be run on cloud resources such as Google Colaboratory.
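As a rough orientation only, the pipeline described in the protocol is driven from a small set of top-level calls in the DeepLabCut Python package. The sketch below is illustrative rather than normative: the project name, experimenter name, and video paths are placeholders, and exact function signatures may differ between package versions.

    # Minimal sketch of the DeepLabCut workflow; paths and names are placeholders.
    import deeplabcut

    # 1. Create a project and extract frames to label (k-means or uniform selection).
    config_path = deeplabcut.create_new_project(
        'reaching-task', 'researcher',
        ['/data/videos/session1.avi'], copy_videos=True)
    deeplabcut.extract_frames(config_path)

    # 2. Label the extracted frames in the GUI, then build a training dataset.
    deeplabcut.label_frames(config_path)
    deeplabcut.create_training_dataset(config_path)

    # 3. Train and evaluate the network (training benefits greatly from a GPU).
    deeplabcut.train_network(config_path)
    deeplabcut.evaluate_network(config_path)

    # 4. Apply the trained network to new videos and visualize the result.
    deeplabcut.analyze_videos(config_path, ['/data/videos/session2.avi'])
    deeplabcut.create_labeled_video(config_path, ['/data/videos/session2.avi'])

Poorly tracked frames can subsequently be extracted, relabeled, and merged back into the training set (the active-learning-based refinement mentioned above) before retraining.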

Fig. 1: Pose estimation with DeepLabCut.
Fig. 2: DeepLabCut workflow.
Fig. 3: Methods for frame selection.
Fig. 4: Labeling GUI.
Fig. 5: Evaluation of results.
Fig. 6: Refinement tools.
Fig. 7: 3D pose estimation of a cheetah.
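Reconstructing the 3D pose shown in Fig. 7 requires triangulating matched 2D detections from calibrated camera views. The following is a generic illustration of that triangulation step, sketched with OpenCV rather than the toolbox's own 3D routines; the projection matrices and image coordinates are made-up placeholders.

    # Sketch: triangulate one keypoint seen by two calibrated cameras into 3D.
    import numpy as np
    import cv2

    # Hypothetical 3x4 projection matrices from a prior stereo calibration.
    P1 = np.eye(3, 4)
    P2 = np.hstack([np.eye(3), np.array([[-0.1], [0.0], [0.0]])])

    # The same body part detected in each view (2xN arrays of image coordinates).
    pts_cam1 = np.array([[320.0], [240.0]])
    pts_cam2 = np.array([[300.0], [240.0]])

    # OpenCV returns 4xN homogeneous coordinates; divide by the last row.
    X_h = cv2.triangulatePoints(P1, P2, pts_cam1, pts_cam2)
    X = (X_h[:3] / X_h[3]).T    # Nx3 Euclidean 3D points
    print(X)

In practice, the projection matrices come from calibrating each camera pair, and the 2D inputs are the per-frame detections of the same body part in each view.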

Data and code availability

The code is fully available at https://github.com/AlexEMG/DeepLabCut. Other inquiries should be made to the corresponding author (M.W.M.).

Acknowledgements

DeepLabCut is an open-source tool on GitHub and has benefited from suggestions and edits by many individuals, including R. Eichler, J. Rauber, R. Warren, T. Abe, H. Wu, and J. Saunders. In particular, the authors thank R. Eichler for input on the modularized version. The authors thank the members of the Bethge Lab for providing the initial version of the Docker container. We also thank M. Li, J. Li, and D. Robson for use of the zebrafish image; B. Rogers for use of the horse images; and K. Cury for the fly images. The authors are grateful to E. Insafutdinov and C. Lassner for suggestions on how to best use the TensorFlow implementation of DeeperCut. We also thank A. Hoffmann, P. Mamidanna, and G. Kane for comments throughout this project. Last, the authors thank the Ann van Dyk Cheetah Centre (Pretoria, South Africa) for kindly providing access to their cheetahs. The authors thank NVIDIA Corporation for GPU grants to both M.W.M. and A.M. A.M. acknowledges a Marie Sklodowska-Curie International Fellowship within the 7th European Community Framework Program under grant agreement no. 622943. A.P. acknowledges an Oppenheimer Memorial Trust Fellowship and the National Research Foundation of South Africa (grant 99380). M.B. acknowledges funding from the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) via the Collaborative Research Center (Projektnummer 276693517–SFB 1233: Robust Vision) and by the German Federal Ministry of Education and Research through the Tübingen AI Center (FKZ 01IS18039A). M.W.M. acknowledges a Rowland Fellowship from the Rowland Institute at Harvard.

Author information

Contributions

Conceptualization: A.M., T.N., and M.W.M. A.M., T.N., and M.W.M. wrote the code. A.P. provided the cheetah data; A.C.C. labeled the cheetah data; A.C.C., A.M., and A.P. analyzed the cheetah data. M.W.M., A.M., and T.N. wrote the manuscript with input from all authors. M.W.M. and M.B. supervised the project.

Corresponding author

Correspondence to Mackenzie Weygandt Mathis.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Peer review information: Nature Protocols thanks Gonzalo G. de Polavieja and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Related link

Key reference using this protocol

Mathis, A. et al. Nat. Neurosci. 21, 1281–1289 (2018): https://www.nature.com/articles/s41593-018-0209-y

Supplementary information

Supplementary Video 1

A video of the sequence described in Fig. 7c. Although the network used for Fig. 7c was trained on 256 labeled images (99% training-set split, drawn from different sessions, cameras, and perspectives), the video was created after additional training with data from other videos (908 images, 95% training-set split). This network was also used for Fig. 7e.

Reporting Summary

About this article

Cite this article

Nath, T., Mathis, A., Chen, A.C. et al. Using DeepLabCut for 3D markerless pose estimation across species and behaviors. Nat Protoc 14, 2152–2176 (2019). https://doi.org/10.1038/s41596-019-0176-0
