Abstract
The ever-increasing human footprint, even in very remote places on Earth, has inspired vigorous efforts to document biodiversity in case organisms go extinct. However, the data commonly gathered come either from primary voucher specimens in a natural history collection or from direct field observations that are not traceable to tangible material in a museum or herbarium. Although both datasets are crucial for assessing how anthropogenic drivers affect biodiversity, they have widespread coverage gaps and biases that may render them inefficient in representing patterns of biodiversity. Using a large global dataset of around 1.9 billion occurrence records of terrestrial plants, butterflies, amphibians, birds, reptiles and mammals, we quantify how well voucher and observation records cover expected biodiversity patterns and how they are biased. We show that the mass production of observation records does not lead to higher coverage of expected biodiversity patterns but is disproportionately biased toward certain regions, clades, functional traits and time periods. Such coverage patterns are driven by the ease of accessibility to air and ground transportation, level of security and extent of human modification at each sampling site. Conversely, voucher records are vastly infrequent in occurrence data but, in the few places where they are sampled, show relative congruence with expected biodiversity patterns for all dimensions. The differences in coverage and bias between voucher and observation records have important implications for the utility of these records for research in ecology, evolution and conservation.
Data availability
The links to the species occurrence records downloaded from GBIF are available at Zenodo (https://doi.org/10.5281/zenodo.6834577). The datasets, data tables, grid cell vector polygons and R code are archived in the same Zenodo record.
Code availability
All scripts, code and data documentation necessary to repeat our analyses are available in the Zenodo database (https://doi.org/10.5281/zenodo.6834577) under the folder ‘SCRIPTS’.
Acknowledgements
We thank Stanford University and Texas A&M University-Corpus Christi for logistic support. B.H.D. was supported by the US National Science Foundation (awards 2031928 and 2113424). We are grateful to G. Nakamura, L. Ford and S. Pons for comments on earlier drafts of the paper. In addition, we are grateful to Holger Kreft for kindly sharing his data on the expected distribution of plants, which was instrumental in our analysis.
Author information
Contributions
The study was conceived and designed by B.H.D. Analyses were carried out by B.H.D. The paper was written by B.H.D. and revised by B.H.D. with help from J.R.
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Ecology & Evolution thanks James Speed and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.
Extended data
Extended Data Fig. 1 Patterns of expected species richness of terrestrial taxa.
The expected species richness of (a) Plants was derived from a co-kriging interpolation model of 1,032 regional floras worldwide, and that of (b) Butterflies from a co-kriging interpolation of 543 geographic units covering the known inventory of butterflies. The expected species richness of (c) Amphibians, (d) Birds, (e) Reptiles and (f) Mammals was generated by overlaying the expert-based extent-of-occurrence range map of each species on equal-area grid cells of 100 km × 100 km. The bamako colour palette is common to all panels, with dark green indicating high coverage and yellow indicating low coverage. The maps are in the Wagner IV projection.
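As a rough illustration of the range-map overlay described in this caption, the following Python sketch counts, for each grid cell, the species whose range touches it. It is a simplified stand-in and not the authors' workflow: ranges are axis-aligned rectangles in projected kilometres rather than expert polygons, and the function names and the `ranges` dictionary are hypothetical.

```python
import math
from collections import Counter

CELL_KM = 100  # grid cell edge length, matching the 100 km x 100 km grain in the caption


def cells_overlapping(bbox, cell_km=CELL_KM):
    """Return (col, row) indices of the grid cells overlapped by a rectangle.

    bbox is (xmin, ymin, xmax, ymax) in projected km and is assumed to have
    positive area. Rectangles stand in for real range polygons (assumption).
    """
    xmin, ymin, xmax, ymax = bbox
    cols = range(math.floor(xmin / cell_km), math.ceil(xmax / cell_km))
    rows = range(math.floor(ymin / cell_km), math.ceil(ymax / cell_km))
    return [(c, r) for c in cols for r in rows]


def expected_richness(ranges, cell_km=CELL_KM):
    """Count, per grid cell, how many species ranges overlap that cell."""
    richness = Counter()
    for bbox in ranges.values():
        # Each species contributes at most 1 to any given cell.
        richness.update(cells_overlapping(bbox, cell_km))
    return richness


# Hypothetical ranges for two species, in km.
ranges = {"sp1": (0, 0, 200, 100), "sp2": (150, 0, 250, 100)}
richness = expected_richness(ranges)
```

With real data the rectangle test would be replaced by a polygon-to-grid intersection on an equal-area projection; the counting step stays the same.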
Extended Data Fig. 2 Spatial composition of β-diversity across grid cells by voucher and observation records.
Maps of dissimilarity between record types for: (a, b) Plants (n = 240,377 species), (c, d) Butterflies (n = 9,809 species), (e, f) Amphibians (n = 4,862 species), (g, h) Birds (n = 9,380 species), (i, j) Reptiles (n = 7,259 species), and (k, l) Mammals (n = 4,508 species). Dissimilarity was assessed by generating pairwise distance matrices of Simpson’s β-diversity between all pairs of grid cells within major biogeographically defined areas recognized by the Biodiversity Information Standards (also known as the Taxonomic Databases Working Group (TDWG)). Values of β vary between 0 (species composition is identical between grid cells) and 1 (high dissimilarity, no shared taxa). Both voucher and observation records of most taxonomic groups showed high dissimilarity in less frequently sampled regions of South America, Africa and Southeast Asia, and lower dissimilarity in frequently sampled Europe and North America.
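For readers unfamiliar with Simpson’s β-diversity, the sketch below computes it for pairs of grid cells using the standard presence–absence form, β = min(b, c) / (a + min(b, c)), where a is the number of species shared by two cells and b and c are the numbers unique to each. This is a minimal Python illustration; the function names and the `cells` example are hypothetical, and it is not the code archived with the paper.

```python
from itertools import combinations


def simpson_beta(cell_a, cell_b):
    """Simpson's dissimilarity between two sets of species names."""
    a = len(cell_a & cell_b)   # species shared by both cells
    b = len(cell_a - cell_b)   # species found only in the first cell
    c = len(cell_b - cell_a)   # species found only in the second cell
    denom = a + min(b, c)
    if denom == 0:
        raise ValueError("beta is undefined when either cell has no species")
    return min(b, c) / denom


def pairwise_beta(cells):
    """Return {(id_i, id_j): beta} for every unordered pair of grid cells."""
    return {
        (i, j): simpson_beta(cells[i], cells[j])
        for i, j in combinations(sorted(cells), 2)
    }


# Hypothetical grid cells with their recorded species.
cells = {
    "cell1": {"sp_a", "sp_b", "sp_c"},
    "cell2": {"sp_a"},
    "cell3": {"sp_d", "sp_e"},
}
distances = pairwise_beta(cells)
```

Because the index uses the smaller of the two unique-species counts, a species-poor cell nested within a richer one scores 0, which makes the measure less sensitive to richness differences than to turnover.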
Extended Data Fig. 3 Relationship between sampling effort (measured as taxonomic coverage) and dissimilarity (measured as spatial composition of beta diversity) by voucher and observation records.
Indicated are the relationships between sampling effort and dissimilarity of record types for (a, b) plants, (c, d) butterflies, (e, f) amphibians, (g, h) birds, (i, j) reptiles, and (k, l) mammals. The trend line (in red) was computed by evaluating the loess smooth at equally spaced points covering the range of dissimilarity values for each sampling effort.
Extended Data Fig. 4 Patterns of geographic coverage of grid cells by voucher and observation records of plants across spatial grain (50 × 50, 100 × 100, 200 × 200, 400 × 400, 800 × 800 and 1600 km × 1600 km).
Geographic coverage of grid cells was calculated as the number of unique collection locales in each grid cell. Evenness or clustering of geographic coverage is indicated by Moran’s I (Monte Carlo test, 999 randomizations), with values of 1 indicating clustered/biased coverage and 0 corresponding to geographically even coverage. The bamako colour palette is common to all panels, with dark green indicating high coverage and yellow indicating low coverage. The maps are in the Wagner IV projection.
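The Moran’s I statistic and its Monte Carlo test, used in this and the following captions, can be sketched as follows. This is a plain-Python illustration under stated assumptions: the grid is a small rectangular lattice with rook (edge-sharing) neighbours and binary weights, the p value is one-sided, and all function names are hypothetical. The authors' own analysis is in the archived R scripts and may differ in its neighbour definition.

```python
import random


def rook_weights(nrow, ncol):
    """Binary weight matrix: 1 for cells sharing an edge, 0 otherwise."""
    n = nrow * ncol
    w = [[0.0] * n for _ in range(n)]
    for r in range(nrow):
        for c in range(ncol):
            i = r * ncol + c
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < nrow and 0 <= cc < ncol:
                    w[i][rr * ncol + cc] = 1.0
    return w


def morans_i(values, weights):
    """Moran's I = (n / S0) * sum_ij w_ij z_i z_j / sum_i z_i^2."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)
    num = sum(weights[i][j] * z[i] * z[j] for i in range(n) for j in range(n))
    den = sum(d * d for d in z)
    return (n / s0) * num / den


def monte_carlo_test(values, weights, nsim=999, seed=0):
    """Permute values across cells nsim times; return (I, one-sided p)."""
    rng = random.Random(seed)
    observed = morans_i(values, weights)
    shuffled = list(values)
    hits = 0
    for _ in range(nsim):
        rng.shuffle(shuffled)
        if morans_i(shuffled, weights) >= observed:
            hits += 1
    # The observed arrangement counts as one of the nsim + 1 outcomes.
    return observed, (hits + 1) / (nsim + 1)


# Hypothetical 4 x 4 grid where all sampling is concentrated in the western half.
w = rook_weights(4, 4)
coverage = [10, 10, 0, 0] * 4
i_obs, p_value = monte_carlo_test(coverage, w)
```

In this toy grid the clustered pattern gives a clearly positive I, whereas a checkerboard arrangement of the same values gives I = -1, the opposite extreme from clustering.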
Extended Data Fig. 5 Patterns of geographic coverage of grid cells by voucher and observation records of butterflies across spatial grain (50 × 50, 100 × 100, 200 × 200, 400 × 400, 800 × 800 and 1600 km × 1600 km).
Geographic coverage of grid cells was calculated as the number of unique collection locales in each grid cell. Evenness or clustering of geographic coverage is indicated by Moran’s I (Monte Carlo test, 999 randomizations), with values of 1 indicating clustered/biased coverage and 0 corresponding to geographically even coverage. The bamako colour palette is common to all panels, with dark green indicating high coverage and yellow indicating low coverage. The maps are in the Wagner IV projection.
Extended Data Fig. 6 Patterns of geographic coverage of grid cells by voucher and observation records of amphibians across spatial grain (50 × 50, 100 × 100, 200 × 200, 400 × 400, 800 × 800 and 1600 km × 1600 km).
Geographic coverage of grid cells was calculated as the number of unique collection locales in each grid cell. Evenness or clustering of geographic coverage is indicated by Moran’s I (Monte Carlo test, 999 randomizations), with values of 1 indicating clustered/biased coverage and 0 corresponding to geographically even coverage. The bamako colour palette is common to all panels, with dark green indicating high coverage and yellow indicating low coverage. The maps are in the Wagner IV projection.
Extended Data Fig. 7 Patterns of geographic coverage of grid cells by voucher and observation records of birds across spatial grain (50 × 50, 100 × 100, 200 × 200, 400 × 400, 800 × 800 and 1600 km × 1600 km).
Geographic coverage of grid cells was calculated as the number of unique collection locales in each grid cell. Evenness or clustering of geographic coverage is indicated by Moran’s I (Monte Carlo test, 999 randomizations), with values of 1 indicating clustered/biased coverage and 0 corresponding to geographically even coverage. The bamako colour palette is common to all panels, with dark green indicating high coverage and yellow indicating low coverage. The maps are in the Wagner IV projection.
Extended Data Fig. 8 Patterns of geographic coverage of grid cells by voucher and observation records of reptiles across spatial grain (50 × 50, 100 × 100, 200 × 200, 400 × 400, 800 × 800 and 1600 km × 1600 km).
Geographic coverage of grid cells was calculated as the number of unique collection locales in each grid cell. Evenness or clustering of geographic coverage is indicated by Moran’s I (Monte Carlo test, 999 randomizations), with values of 1 indicating clustered/biased coverage and 0 corresponding to geographically even coverage. The bamako colour palette is common to all panels, with dark green indicating high coverage and yellow indicating low coverage. The maps are in the Wagner IV projection.
Extended Data Fig. 9 Patterns of geographic coverage of grid cells by voucher and observation records of mammals across spatial grain (50 × 50, 100 × 100, 200 × 200, 400 × 400, 800 × 800 and 1600 km × 1600 km).
Geographic coverage of grid cells was calculated as the number of unique collection locales in each grid cell. Evenness or clustering of geographic coverage is indicated by Moran’s I (Monte Carlo test, 999 randomizations), with values of 1 indicating clustered/biased coverage and 0 corresponding to geographically even coverage. The bamako colour palette is common to all panels, with dark green indicating high coverage and yellow indicating low coverage. The maps are in the Wagner IV projection.
Extended Data Fig. 10 Pairwise relationships between six socioeconomic and ecological variables.
Relationships are shown as pairwise Spearman rank correlations between the variables at a spatial grain of 100 km. All variables were log-transformed before analysis. The statistical test used was two-sided. Exact p values are indicated below the correlation coefficients.
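A Spearman rank correlation of the kind reported in this figure can be sketched in a few lines of Python: rank each variable (averaging ranks for ties), then take the Pearson correlation of the ranks. The function names and the example values below are hypothetical and purely illustrative; they are not the paper's data or code.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical per-cell values of two variables (for example, travel time and record count).
travel_time = [5.0, 1.0, 20.0, 3.0, 8.0]
records = [2.0, 9.0, 1.0, 7.0, 4.0]
rho = spearman(travel_time, records)
rho_log = spearman([math.log(v) for v in travel_time], [math.log(v) for v in records])
```

Because ranks are unchanged by any strictly increasing transformation, `rho` and `rho_log` agree here; for strictly positive variables the log transform affects how the data are displayed rather than the value of the rank correlation.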
Supplementary information
Supplementary Tables 1–7 and Notes 1 and 2.
About this article
Cite this article
Daru, B.H., Rodriguez, J. Mass production of unvouchered records fails to represent global biodiversity patterns. Nat Ecol Evol (2023). https://doi.org/10.1038/s41559-023-02047-3
DOI: https://doi.org/10.1038/s41559-023-02047-3