Iterative human and automated identification of wildlife images

Miao, Zhongqi; Liu, Ziwei; Gaynor, Kaitlyn M.; Palmer, Meredith S.; Yu, Stella X.; Getz, Wayne M.

doi:10.1038/s42256-021-00393-0

Article
Published: 18 October 2021

Iterative human and automated identification of wildlife images

Nature Machine Intelligence volume 3, pages 885–895 (2021)Cite this article

1690 Accesses
20 Citations
33 Altmetric
Metrics details

Subjects

A preprint version of the article is available at arXiv.

Abstract

Camera trapping is increasingly being used to monitor wildlife, but this technology typically requires extensive data annotation. Recently, deep learning has substantially advanced automatic wildlife recognition. However, current methods are hampered by a dependence on large static datasets, whereas wildlife data are intrinsically dynamic and involve long-tailed distributions. These drawbacks can be overcome through a hybrid combination of machine learning and humans in the loop. Our proposed iterative human and automated identification approach is capable of learning from wildlife imagery data with a long-tailed distribution. Additionally, it includes self-updating learning, which facilitates capturing the community dynamics of rapidly changing natural systems. Extensive experiments show that our approach can achieve an ~90% accuracy employing only ~20% of the human annotations of existing approaches. Our synergistic collaboration of humans and machines transforms deep learning from a relatively inefficient post-annotation tool to a collaborative ongoing annotation tool that vastly reduces the burden of human annotation and enables efficient and constant model updates.

Access through your institution

Buy or subscribe

This is a preview of subscription content, access via your institution

Access options

Access through your institution

Buy this article

Purchase on Springer Link
Instant access to full article PDF

Buy now

Prices may be subject to local taxes which are calculated during checkout

**Fig. 1: Overview of a realistic animal classification system.**

**Fig. 2: Label efficiency comparison with transfer learning on the group 2 validation set (ordered with respect to training sample size).**

Highly accurate protein structure prediction with AlphaFold

Article Open access 15 July 2021

Scientific discovery in the age of artificial intelligence

Article 02 August 2023

ColabFold: making protein folding accessible to all

Article Open access 30 May 2022

References

Steenweg, R. et al. Scaling-up camera traps: monitoring the planet’s biodiversity with networks of remote sensors. Front. Ecol. Environ. 15, 26–34 (2017).
Article Google Scholar
Rich, L. N. et al. Assessing global patterns in mammalian carnivore occupancy and richness by integrating local camera trap surveys. Global Ecol. Biogeogr. 26, 918–929 (2017).
Article Google Scholar
Barnosky, A. D. et al. Has the Earth’s sixth mass extinction already arrived? Nature 471, 51–57 (2011).
Article Google Scholar
Ahumada, J. A. et al. Wildlife insights: a platform to maximize the potential of camera trap and other passive sensor wildlife data for the planet. Environ. Conserv. 47, 1–6 (2020).
Article MathSciNet Google Scholar
Norouzzadeh, M. S. et al. Automatically identifying, counting, and describing wild animals in camera-trap images with deep learning. Proc. Natl Acad. Sci. 115, E5716–E5725 (2018).
Article Google Scholar
Miao, Z. et al. Insights and approaches using deep learning to classify wildlife. Sci. Rep. 9, 8137 (2019).
Article Google Scholar
Liu, Z. et al. Large-scale long-tailed recognition in an open world. In Proc. IEEE Conference on Computer Vision and Pattern Recognition 2537–2546 (IEEE, 2019).
Liu, Z. et al. Open compound domain adaptation. In Proc. IEEE/CVF Conference on Computer Vision and Pattern Recognition 12406–12415 (IEEE, 2020).
Hautier, Y. et al. Anthropogenic environmental changes affect ecosystem stability via biodiversity. Science 348, 336–340 (2015).
Article Google Scholar
Barlow, J. et al. Anthropogenic disturbance in tropical forests can double biodiversity loss from deforestation. Nature 535, 144–147 (2016).
Article Google Scholar
Ripple, W. J. et al. Conserving the world’s megafauna and biodiversity: the fierce urgency of now. Bioscience 67, 197–200 (2017).
Google Scholar
Dirzo, R. et al. Defaunation in the Anthropocene. Science 345, 401–406 (2014).
Article Google Scholar
O’Connell, A. F., Nichols, J. D. & Karanth, K. U. Camera Traps in Animal Ecology: Methods and Analyses (Springer Science & Business Media, 2010).
Burton, A. C. et al. Wildlife camera trapping: a review and recommendations for linking surveys to ecological processes. J. Appl. Ecol. 52, 675–685 (2015).
Article Google Scholar
Kays, R., McShea, W. J. & Wikelski, M. Born-digital biodiversity data: millions and billions. Divers. Distrib. 26, 644–648 (2020).
Article Google Scholar
Swanson, A. et al. Snapshot Serengeti, high-frequency annotated camera trap images of 40 mammalian species in an African savanna. Sci. Data 2, 1–14 (2015).
Article Google Scholar
Ahumada, J. A. et al. Community structure and diversity of tropical forest mammals: data from a global camera trap network. Philos. Trans. R. Soc. B Biol. Sci. 366, 2703–2711 (2011).
Article Google Scholar
Pardo, L. E. et al. Snapshot Safari: a large-scale collaborative to monitor Africa’s remarkable biodiversity. South Africa J. Sci. https://doi.org/10.17159/sajs.2021/8134 (2021).
Anderson, T. M. et al. The spatial distribution of African savannah herbivores: species associations and habitat occupancy in a landscape context. Philos. Trans. R. Soc. B Biol. Sci. 371, 20150314 (2016).
Article Google Scholar
Palmer, M., Fieberg, J., Swanson, A., Kosmala, M. & Packer, C. A ‘dynamic’ landscape of fear: prey responses to spatiotemporal variations in predation risk across the lunar cycle. Ecol. Lett. 20, 1364–1373 (2017).
Article Google Scholar
Tabak, M. A. et al. Machine learning to classify animal species in camera trap images: applications in ecology. Methods Ecol. Evol. 10, 585–590 (2019).
Article Google Scholar
Whytock, R. C. et al. Robust ecological analysis of camera trap data labelled by a machine learning model. Methods Ecol. Evol 12, 1080–1092 (2021).
Article Google Scholar
Beery, S., Van Horn, G. & Perona, P. Recognition in terra incognita. In Proc. European Conference on Computer Vision (ECCV) 456–473 (IEEE, 2018).
Tabak, M. A. et al. Improving the accessibility and transferability of machine learning algorithms for identification of animals in camera trap images: MLWIC2. Ecol. Evol. 10, 10374–10383 (2020).
Article Google Scholar
Shahinfar, S., Meek, P. & Falzon, G. How many images do I need? Understanding how sample size per class affects deep learning model performance metrics for balanced designs in autonomous wildlife monitoring. Ecol. Inform. 57, 101085 (2020).
Article Google Scholar
Norouzzadeh, M. S. et al. A deep active learning system for species identification and counting in camera trap images. Methods Ecol. Evol. 12, 150–161 (2020).
Article Google Scholar
Willi, M. et al. Identifying animal species in camera trap images using deep learning and citizen science. Methods Ecol. Evol. 10, 80–91 (2019).
Article Google Scholar
Schneider, S., Greenberg, S., Taylor, G. W. & Kremer, S. C. Three critical factors affecting automated image species recognition performance for camera traps. Ecol. Evol. 10, 3503–3517 (2020).
Article Google Scholar
Kays, R. et al. An empirical evaluation of camera trap study design: how many, how long and when? Methods Ecol. Evol. 11, 700–713 (2020).
Article Google Scholar
Prach, K. & Walker, L. R. Four opportunities for studies of ecological succession. Trends Ecol. Evol. 26, 119–123 (2011).
Article Google Scholar
Mech, L. D., Isbell, F., Krueger, J. & Hart, J. Gray wolf (Canis lupus) recolonization failure: a Minnesota case study. Can. Field-Nat. 133, 60–65 (2019).
Article Google Scholar
Taylor, G. et al. Is reintroduction biology an effective applied science? Trends Ecol. Evol. 32, 873–880 (2017).
Article Google Scholar
Clavero, M. & Garcia-Berthou, E. Invasive species are a leading cause of animal extinctions. Trends Ecol. Evol. 20, 110 (2005).
Article Google Scholar
Caravaggi, A. et al. An invasive-native mammalian species replacement process captured by camera trap survey random encounter models. Remote Sens. Ecol. Conserv. 2, 45–58 (2016).
Article Google Scholar
Arjovsky, M., Bottou, L., Gulrajani, I. & Lopez-Paz, D. Invariant risk minimization. Preprint at https://arxiv.org/abs/1907.02893 (2019).
Yosinski, J., Clune, J., Bengio, Y. & Lipson, H. How transferable are features in deep neural networks? In Advances in Neural Information Processing Systems 3320–3328 (IEEE, 2014).
Deng, J. et al. ImageNet: a large-scale hierarchical image database. In Proc. 2009 IEEE Conference on Computer Vision and Pattern Recognition 248–255 (IEEE, 2009).
Pimm, S. L. et al. The biodiversity of species and their rates of extinction, distribution and protection. Science https://doi.org/10.1126/science.1246752 (2014).
Liu, W., Wang, X., Owens, J. & Li, Y. Energy-based out-of-distribution detection. In Advances in Neural Information Processing Systems (eds Larochelle, H. et al.) 21464–21475 (Curran Associates, 2020).
Lee, D.-H. Pseudo-label: the simple and efficient semi-supervised learning method for deep neural networks. In Workshop on Challenges in Representation Learning, ICML, Vol. 3 (2013).
He, K., Zhang, X., Ren, S. & Sun, J. Deep residual learning for image recognition. In Proc. IEEE Conference on Computer Vision and Pattern Recognition 770–778 (IEEE, 2016).
Hinton, G., Vinyals, O. & Dean, J. Distilling the knowledge in a neural network. Preprint at https://arxiv.org/abs/1503.02531 (2015).
Gaynor, K. M., Daskin, J. H., Rich, L. N. & Brashares, J. S. Postwar wildlife recovery in an African savanna: evaluating patterns and drivers of species occupancy and richness. Anim. Conserv. 24, 510–522 (2020).
Article Google Scholar
Paszke, A. et al. in Advances in Neural Information Processing Systems Vol. 32 (eds Wallach, H. et al.) 8024–8035 http://papers.neurips.cc/paper/9015-pytorch-an-imperative-style-high-performance-deep-learning-library.pdf (Curran Associates, 2019)
Chen, T., Kornblith, S., Norouzi, M. & Hinton, G. A simple framework for contrastive learning of visual representations. Preprint at https://arxiv.org/abs/2002.05709 (2020).
He, K., Fan, H., Wu, Y., Xie, S. & Girshick, R. Momentum contrast for unsupervised visual representation learning. In Proc. IEEE/CVF Conference on Computer Vision and Pattern Recognition 9729–9738 (IEEE, 2020).
Xiao, T., Wang, X., Efros, A. A. & Darrell, T. What should not be contrastive in contrastive learning. Preprint at https://arxiv.org/abs/2008.05659 (2020).

Download references

Acknowledgements

We thank T. Gu, A. Ke, H. Rosen, A. Wu, C. Jurgensen, E. Lai, M. Levy and E. Silverberg for annotating the images used in this study, as well as everyone else involved in this project. Data collection was supported by J. Brashares and through grants to K.M.G. from HHMI BioInteractive, the Rufford Foundation, Idea Wild, the Explorers Club and the UC Berkeley Center for African Studies. We are grateful for the support of Gorongosa National Park, especially M. Stalmans, in permitting and facilitating this research. Z.L. is supported by NTU NAP. K.M.G. is supported by Schmidt Science Fellows in partnership with the Rhodes Trust, and the National Center for Ecological Analysis and Synthesis Director’s Postdoctoral Fellowship. M.S.P. is funded by National Science Foundation grant no. PRFB #1810586.

Author information

These authors contributed equally: Zhongqi Miao, Ziwei Liu.

Authors and Affiliations

Department of Environmental Science, Policy, and Management, University of California, Berkeley, Berkeley, CA, USA
Zhongqi Miao, Stella X. Yu & Wayne M. Getz
International Computer Science Institute, University of California, Berkeley, Berkeley, CA, USA
Zhongqi Miao & Stella X. Yu
School of Computer Science and Engineering, Nanyang Technological University, Singapore, Singapore
Ziwei Liu
National Center for Ecological Analysis and Synthesis, University of California, Santa Barbara, Santa Barbara, CA, USA
Kaitlyn M. Gaynor
Department of Ecology and Evolutionary Biology, Princeton University, Princeton, NJ, USA
Meredith S. Palmer
School of Mathematics, Statistics and Computer Science, University of KwaZulu-Natal, Durban, South Africa
Wayne M. Getz

Authors

Zhongqi Miao
View author publications
You can also search for this author in PubMed Google Scholar
Ziwei Liu
View author publications
You can also search for this author in PubMed Google Scholar
Kaitlyn M. Gaynor
View author publications
You can also search for this author in PubMed Google Scholar
Meredith S. Palmer
View author publications
You can also search for this author in PubMed Google Scholar
Stella X. Yu
View author publications
You can also search for this author in PubMed Google Scholar
Wayne M. Getz
View author publications
You can also search for this author in PubMed Google Scholar

Contributions

This study was conceived by Z.M., Z.L., K.M.G. and M.S.P. The methods were designed by Z.M. and Z.L. Code was written by Z.M., and the computations were undertaken by Z.M. with help from Z.L. The main text was drafted by Z.M. and Z.L., with contributions, editing and comments from all authors. K.M.G. and M.S.P. collected all data and oversaw annotation. Z.M. created all figures and tables in consultation with W.M.G., Z.L. and S.X.Y.

Corresponding authors

Correspondence to Zhongqi Miao or Wayne M. Getz.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Peer review information Nature Machine Intelligence thanks Dan Morris and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 The distribution of images across species in the entire camera trap data set.

There are 55 categories in total. 14 categories were tagged as "unknown” (colored in orange) and used to improve and validate our model’s sensitivity to novel and difficult samples.

Extended Data Fig. 2 The distribution of species across the two groups of data.

We split the data set into two groups to mimic two sequential data collection seasons. In the first group, there are 26 categories (colored in blue). The second group has 41 categories. Group 1 is used in the first period experiment to train a baseline model, and Group 2 is used in the second period experiment to test and update the model.

Extended Data Fig. 3 The overall experimental workflow of our framework.

In the first time step, a baseline model is trained using group 1 training data with only 26 categories. Next, the classifier is fine-tuned using the 14 unknown categories and energy-based loss to increase the sensitivity to out-of-distribution categories. After the classifier is fine-tuned, the classifier is then used to predict classifications for group 2 training data. Here, high-confidence predictions are trusted while low-confidence predictions are flagged for human annotation. In the final step, both machine- and human-annotations are used to update the previous model with OLTR and semi-supervised techniques. Once the model is updated, the classifier is fine-tuned using energy-based loss again for out-of-distribution sensitivity.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Miao, Z., Liu, Z., Gaynor, K.M. et al. Iterative human and automated identification of wildlife images. Nat Mach Intell 3, 885–895 (2021). https://doi.org/10.1038/s42256-021-00393-0

Download citation

Received: 03 February 2021
Accepted: 22 August 2021
Published: 18 October 2021
Issue Date: October 2021
DOI: https://doi.org/10.1038/s42256-021-00393-0