Open Science principles for accelerating trait-based science across the Tree of Life

Abstract

Synthesizing trait observations and knowledge across the Tree of Life remains a grand challenge for biodiversity science. Species traits are widely used in ecological and evolutionary science, and new data and methods have proliferated rapidly. Yet accessing and integrating disparate data sources remains a considerable challenge, slowing progress toward a global synthesis to integrate trait data across organisms. Trait science needs a vision for achieving global integration across all organisms. Here, we outline how the adoption of key Open Science principles—open data, open source and open methods—is transforming trait science, increasing transparency, democratizing access and accelerating global synthesis. To enhance widespread adoption of these principles, we introduce the Open Traits Network (OTN), a global, decentralized community welcoming all researchers and institutions pursuing the collaborative goal of standardizing and integrating trait data across organisms. We demonstrate how adherence to Open Science principles is key to the OTN community and outline five activities that can accelerate the synthesis of trait data across the Tree of Life, thereby facilitating rapid advances to address scientific inquiries and environmental issues. Lessons learned along the path to a global synthesis of trait data will provide a framework for addressing similarly complex data science and informatics challenges.

Main

Traits, broadly speaking, are measurable attributes or characteristics of organisms. Traits related to function (for example, leaf size, body mass, tooth size or growth form) are often used to understand how organisms interact with their environment and other species via key vital rates such as survival, development and reproduction1,2,3,4,5.

Trait-based approaches have long been used in systematics and macroevolution to delineate taxa and reconstruct ancestral morphology and function6,7,8 and to link candidate genes to phenotypes9,10,11. The broad appeal of the trait concept is its ability to facilitate quantitative comparisons of biological form and function. Traits also allow us to mechanistically link organismal responses to abiotic and biotic factors with measurements that are, in principle, relatively easy to capture across large numbers of individuals. For example, appropriately chosen and defined traits can help identify lineages that share similar life-history strategies for a given environmental regime12,13. Documenting and understanding the diversity and composition of traits in ecosystems directly contributes to our understanding of organismal and ecosystem processes, functionality, productivity and resilience in the face of environmental change14,15,16,17,18,19.

In light of the multiple applications of trait data to address challenges of global significance (Box 1), a central question remains: How can we most effectively advance the synthesis of trait data within and across disciplines? In recent decades, the collection, compilation and availability of trait data for a variety of organisms has accelerated rapidly. Substantial trait databases now exist for plants20,21,22,23, reptiles24,25, invertebrates23,26,27,28,29, fish30,31, corals32, birds23,33,34, amphibians35, mammals23,36,37,38 and fungi23,39, and parallel efforts are no doubt underway for other taxa. Though considerable effort has been made to quantify traits for some groups (for example, Fig. 1), substantial work remains. To develop and test theory in biodiversity science, much greater effort is needed to fill in trait data across the Tree of Life by combining and integrating data and trait collection efforts.

Fig. 1: Mammal, bird and plant phylogenies coloured according to the number of traits for which we have data for each species and lineage.

The plant phylogeny is sparsely populated for traits but contains more taxa (n = 10,596) than the mammal and bird phylogenies (n = 5,747 and 9,993, respectively). Trait data were downloaded from refs. 25,34,87. We counted the number of traits present across these datasets for each species and mapped those onto phylogenies using posteriors37,88 and a random subset of plant species within a single phylogeny89. Terminal branches (representing species) and ancestral lineages (using ancestral state reconstruction90) were coloured according to the number of reconstructed traits. Note that this is an exploratory analysis conducted purely to show variation in the availability of trait data across taxonomic groups.
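The per-species trait count described in this caption can be sketched as follows. This is a minimal illustration only, assuming each dataset is a mapping from species name to a record of trait values in which missing values are stored as None; the function name and the toy records are hypothetical and are not drawn from the cited datasets.

```python
from collections import defaultdict


def count_traits_per_species(datasets):
    """Count the distinct traits with at least one non-missing value per species."""
    traits_seen = defaultdict(set)
    for dataset in datasets:
        for species, record in dataset.items():
            for trait, value in record.items():
                if value is not None:  # skip missing observations
                    traits_seen[species].add(trait)
    return {species: len(traits) for species, traits in traits_seen.items()}


# Hypothetical toy records, for illustration only.
dataset_a = {
    "Panthera leo": {"body_mass_g": 160000.0, "litter_size": 3.0},
    "Vulpes vulpes": {"body_mass_g": 5500.0, "litter_size": None},
}
dataset_b = {
    "Panthera leo": {"body_mass_g": 158000.0, "diet": "carnivore"},
}

trait_counts = count_traits_per_species([dataset_a, dataset_b])
```

A trait recorded in more than one dataset is counted once per species, so the count reflects trait coverage rather than the number of observations.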

Current barriers to global trait-based science

Despite the recognized importance of traits, several common research practices limit our capacity for meaningful synthesis across the Tree of Life. These practices include failure to publish usable datasets alongside new findings40, missing or inadequate metadata41, minimal descriptions of methods used to collate, clean and analyse trait datasets in published works42, and inadequate coordination between researchers and institutions with common goals, such as filling strategic spatial or taxonomic gaps in trait knowledge43,44. Our limited ability to access and redistribute trait data contributes to the widespread reproducibility crisis within science45. Any study relying on data that cannot easily be re-used introduces barriers to verifying its claims and thereby calls the reproducibility of the science into question46, an issue that is becoming of prime importance to many scientific journals. Such limitations have been common within trait-based science.

Access to data is not the only impediment to a global synthesis of trait knowledge. Barriers to synthesis exist because researchers and institutions are apprehensive that the time and resources they spend to create new observations or share legacy data (for example, observations from field guides, specimens, or publications without data supplements) will not be recognized. Identifying who should receive credit for contributing trait observations (whether via co-authorship or other formal recognition) is a complex issue, particularly where data involve a chain of expertise (for example, when trait data are extracted from taxonomic treatments involving specimen collectors, digitizers, taxonomists and curators). Funding bodies are often reluctant to support data management, limiting recognition of the sizeable effort expended on creating bespoke solutions to curating and harmonizing trait data from different sources46.

Opportunities exist for expanding the spatial and taxonomic coverage of trait observations, particularly by strengthening interdisciplinary connections across single organismic groups. Despite certain plant traits (for example, growth form, height and leaf size) being carefully catalogued in taxonomic species descriptions47, these data have only recently been exchanged with large-scale databases such as TRY21 or BIEN (http://bien.nceas.ucsb.edu/bien/). Although several informatics challenges in biodiversity science have now been overcome (for example, synthesizing global species occurrence information (https://www.gbif.org/) and sharing genetic data on individuals (https://www.ncbi.nlm.nih.gov/genbank/)), trait science lacks a vision for achieving global integration across all organisms. We argue that this is not simply a failure of the traits community to learn from existing successful networks. Instead, cataloguing traits is a more complex task that is highly context-dependent and therefore needs a more refined network model than that offered by a centralized repository.

We propose that widespread adoption of key Open Science principles (Box 2) could be transformative for trait science in achieving a global synthesis. These principles would lay a strong foundation for transparency, reproducibility and recognition, and encourage a culture of data sharing and collaboration beyond established networks. Openness reinforces the scientific process by allowing increased scrutiny of methods and results, resulting in the deeper exploration of findings and their significance42,48,49,50,51. The scope of trait science would increase if researchers and institutions: (1) made datasets available in machine-accessible formats under clear licensing arrangements; (2) created and adopted standardized protocols, handbooks or metadata formats for data collection, documentation and management (see refs. 48,49); and (3) created human-centred networks to reduce the complexity of integrating existing data from disparate sources (for example, specimens, published literature, citizen-science initiatives50,51 and large-scale digitization efforts). These different sources exhibit systematic differences in error rates, validation, context, reproducibility and objectivity relative to field-collected trait observations. Without a model of recognition that embraces transparency and fairness, much trait data will remain hidden from science.

Introducing the Open Traits Network

The Open Traits Network (OTN) is a collaborative initiative for accelerating trait data synthesis. Specifically, it is a global, decentralized community welcoming all researchers and institutions pursuing the collaborative goal of standardizing and integrating trait data across all organisms. We promote five main objectives built upon Open Science ideals that could transform trait science:

  1. Openly sharing data, methods, protocols, codes and workflows.
  2. Citing original data collectors and providing scholarly credit.
  3. Providing appropriate metadata together with trait observations.
  4. Collecting trait data following reproducible, standardized methods and protocols (when available), or committing to their development.
  5. Providing training resources in trait collection and database construction using Open Science principles.

We envision a future for trait research where protocols for data exchange and re-use are transparent, research findings are reproducible, and all trait data (either newly collected or from legacy sources) are openly available to the research community and broader public. While several network models exist in trait research (Fig. 2), the OTN adopts a decentralized but connected structure with an emphasis on bringing people together through data and expertise.

Fig. 2: Architectures of three alternative networks in which research groups (nodes) interact in collecting and organizing trait data.

Black nodes are individuals, groups or institutions conducting projects. Light-green nodes are those harmonizing data and developing protocols, where node size is proportional to available resources. Dark-green nodes are synthesis nodes that collect standardized trait data and knowledge. a, Groups are disconnected and decentralized, risking duplication of effort (often the status quo). b, Groups are linked to a centralized repository, potentially limiting innovation. c, The Open Traits Network, represented by orange lines. Nodes are linked within biological domains (for example, plants or marine) and include expertise from diverse disciplines (for example, systematics, palaeobiology, ecology and biomechanics) allowing for more efficient and specialized decisions about trait collection. Data synthesis across domains or disciplines is facilitated by joining nodes based on common workflows, theoretical frameworks and data-sharing protocols that adhere to the principles of the Open Traits Network.

Often, groups building smaller-scale databases do so in isolation, using their own tools and workflows tailored to their research question; they are decentralized and disconnected (Fig. 2a). Decentralization has certain advantages, including retaining the power to determine which traits are most useful in a study system and how they should be compiled. There is little formal support or interaction across this style of network, so researchers often collect redundant data and develop similar tools for data collection, cleaning and integration, which can lead to duplication of effort. There are many small, isolated and heterogeneous data sources of this sort, increasing the disconnect between pools of trait data52.

For some organisms, centralized hubs exist to aggregate and standardize trait data across disparate sources (see refs. 21,32,53,54,55,56,57) (Fig. 2b). These trait repositories have become the main access point for trait data on well-studied taxa such as plants and corals, but they remain mostly isolated, limiting the sharing of expertise and information across taxa. As these repositories continue to grow, difficulties with data integration and synthesis will also increase due to the momentum of entrenched workflows and exchange protocols that may not be interoperable.

Some successful large-scale initiatives have followed the centralized and connected network model (for example, the Global Biodiversity Information Facility (GBIF; https://www.gbif.org/) and GenBank (https://www.ncbi.nlm.nih.gov/genbank/)). These platforms mandate strict data exchange protocols to facilitate synthesis using standardized vocabularies (for example, the Darwin Core58 and Humboldt Core59). These protocols have been central to the explosive growth of biodiversity data as they facilitate the exchange of information using common data formats58,59,60. Ontologies that provide unified terms and concepts necessary to represent traits have been developed (for example, Uberon, the multispecies anatomy ontology for animals61, and TOP, the Thesaurus of Plant characteristics62). These provide integration with other data types (for example, genetic and environmental) and their corresponding ontologies (for example, Gene Ontology63 and Environmental Ontology64).

Despite these successes, we argue that a centralized and connected network structure will not facilitate trait data synthesis. Trait observations are highly nuanced and hierarchical. Describing multiple aspects of a phenotype for any organism with traits is not amenable to a simplified set of exchange fields that apply across the Tree of Life. While the centralized and connected model (Fig. 2b) does have benefits, it lacks the necessary flexibility to connect trait data where ontologies and exchange formats do not exist. The likely result is that established trait networks will remain isolated and disconnected.

The decentralized but connected model (orange connections in Fig. 2c) adopted by the OTN maintains the key advantages of a decentralized network (for example, taxon- or discipline-specific decision making) while enhancing the level of connectivity among groups, allowing for easier sharing of expertise, tools and data. These network characteristics also buffer against node loss (for example, due to lack of funding). Decentralized and connected networks are characterized by socially mediated improvements in learning65 as they capitalize on the aggregated judgement of many experts rather than singular opinions66. The OTN model capitalizes on existing connections within disciplines and links domains across the Tree of Life to disseminate knowledge about traits. By recognizing the importance of specialist taxon groups (light-green nodes in Fig. 2c) and accommodating their needs into the development of cross-domain tools for synthesis (dark-green nodes in Fig. 2c), the OTN model will be particularly beneficial for low-profile taxa that may not be accommodated by a centralized effort to synthesize data. The OTN’s open, decentralized network structure will allow researchers to retain agency and independence while also creating a collaborative effort to minimize the duplication of effort.

How (and why) to participate in the OTN

The OTN seeks to broaden its membership by lowering barriers to inclusion and advocating for approaches to trait science that benefit data custodians. New members can join the OTN via our website (www.opentraits.org) through two mechanisms: (1) adding a member profile (for example, name, location, expertise and collaboration statement); and/or (2) registering their open-source (or embargoed) trait datasets in the OTN Trait Dataset Registry (see Activity 1). The registry contains metadata for trait datasets and links users to the open dataset. New entries to the registry will be reviewed by OTN members before being added. This step will facilitate interaction between new and established OTN members and encourage deeper collaboration. Once registered, members will receive regular updates about the OTN, including newly registered trait datasets, notifications about upcoming chances for face-to-face meetings, and funding opportunities. Members will also benefit from the OTN through the sharing of resources, funding calls and workshops where appropriate.

OTN membership spans scientists (and institutions) with high-level expertise in trait data science and synthesis activities through to those with strong motivations to work with traits but little expertise. The OTN has already conducted an international workshop facilitated by an open call for participants, with more workshops planned. Following this initial communication process, we are currently sharing ideas and acting upon them within subgroups. Being a decentralized network, the OTN does not need to rely on funding and dedicated personnel to complete tasks, though larger goals will benefit from financial support. Instead, we will communicate the joint aims and gaps between network nodes (Fig. 2) and arrange workshops and activities where necessary.

We recognize that altruism is unlikely to offer enough motivation to ensure widespread participation in the OTN. The sharing of trait datasets is not merely a technical problem to be solved; it relies on custodians having the skills, incentives and motivation to contribute. The key incentives for individuals to join the OTN include increasing the findability of their data and expertise and having access to a ready-made network of trait scientists and institutions engaging in relevant initiatives. Data are a powerful asset for researchers, and release under open-license schemes accompanied by well-defined metadata offers great potential for new collaborations and increased visibility. A persistent concern is that scientists will lose control of their hard-earned data under open licensing, though this underestimates the potential for new collaborations and may unnecessarily increase distrust within the scientific community67. Access to scientific networks can provide valuable exposure and connection49, particularly for early-career researchers and those in developing nations, although it is important to understand the risks involved. By emphasizing the importance of community engagement and support, the OTN seeks to make trait-data sharing and synthesis an opportunity for all involved rather than simply a technical challenge to be solved.

Milestones toward an open approach to trait-based science

We highlight five OTN activities (several of which are already operational) that demonstrate the power of a decentralized and connected network to increase knowledge transfer in trait science. Trait scientists have made significant achievements in key areas, such as the synthesis of large numbers of observations within taxonomic groups20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39 and the development of theory and frameworks to use these data when testing ideas and large-scale empirical studies1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19. However, basic foundations are still lacking to quantify how and why traits vary across organisms.

Activity 1: Maintaining a global registry of trait-based initiatives

Several data gaps impede synthetic analyses across taxa, geographical locations and ontogeny. The heterogeneous ways in which trait data have been collected to date have resulted in a patchy and unrepresentative data landscape across trait types, taxa, regions and times of the year68,69. The OTN bridges these gaps by maintaining a Trait Dataset Registry that can be accessed at http://opentraits.org/datasets.html.

The OTN Registry contains information on existing open (or embargoed) datasets so that gaps can be identified and ultimately filled through collective effort. Core information for the registry includes Digital Object Identifier (DOI), taxonomic coverage, curator and format. The OTN Registry also provides the opportunity for contributors to identify if and where code to process and manipulate raw data is located (see Activity 2). As it develops, the OTN Registry will relate trait concepts to ontologies provided through the Open Biomedical Ontologies Foundry (http://www.obofoundry.org). The OTN Registry maps to several Open Science principles (for example, Open Source, Open Data and Open Access; Box 2) and is designed to support data retrieval and integration.
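The core registry information listed above could be represented along the following lines. This is a sketch under stated assumptions rather than the registry's actual schema: the class name, field names, validation rule and example values (including the placeholder DOI) are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RegistryEntry:
    """Hypothetical metadata record for one registered trait dataset."""
    doi: str
    taxonomic_coverage: str
    curator: str
    data_format: str
    code_url: Optional[str] = None  # optional location of processing code


# Core fields named in the text; the attribute names are assumptions.
REQUIRED_FIELDS = ("doi", "taxonomic_coverage", "curator", "data_format")


def missing_fields(entry):
    """Return the names of required fields that are empty or whitespace."""
    return [name for name in REQUIRED_FIELDS if not getattr(entry, name).strip()]


# Placeholder values, for illustration only.
complete_entry = RegistryEntry(
    doi="10.0000/example",
    taxonomic_coverage="Scleractinia",
    curator="A. Curator",
    data_format="csv",
)
incomplete_entry = RegistryEntry(
    doi="",
    taxonomic_coverage="Aves",
    curator="B. Curator",
    data_format="csv",
)
```

A check of this kind could support the member review step by flagging submissions that lack core information before they are added.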

The OTN does not place restrictions on what members may consider traits of importance to a taxonomic group. Most traits can be measured from individuals and fit into existing definitions, though this may not be appropriate for organisms where individual or taxonomic boundaries are unclear (for example, microbes70 and fungi71). It can be argued that traits encompass emergent properties of populations (for example, abundance and geographic range size) or represent interactions among species (for example, diet type). Within the OTN, we believe that more important than imposing strict definitions around traits is engaging the community in discussion about the utility of available data for answering novel ecological and evolutionary questions.

Activity 2: Sharing reproducible workflows and tools for aggregating trait data

The OTN leverages collaborative software development via platforms like GitHub (https://github.com/) to create modular, open-source software to access, harmonize and re-use data with seamless piping of data from one software tool to the next. OTN contributors have already developed several open-source tools such as the traitdataform package, which assists R users to format their data and harmonize units (http://ecologicaltraitdata.github.io/traitdataform). The code for the Coral Traits database32 (https://github.com/jmadin/traits) could be modified to guide the creation of databases on other organisms. The FENNEC project provides a tool for accessing and viewing community trait data as a self-hosted website service72 (https://github.com/molbiodiv/fennec). The OTN can act as a connector between developers and the broader community seeking to synthesize trait data, facilitating the training of scientists in all aspects of reproducible data management.
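Unit harmonization of the kind mentioned above can be illustrated in a few lines. The snippet below is not the traitdataform interface (which is an R package); it is a Python sketch of the general idea, and the conversion table, function name and toy observations are assumptions made for illustration.

```python
# Conversion factors to a common target unit (grams); an assumed, minimal table.
TO_GRAMS = {"mg": 0.001, "g": 1.0, "kg": 1000.0}


def harmonize_mass(value, unit):
    """Convert a mass measurement to grams, rejecting unknown units."""
    try:
        factor = TO_GRAMS[unit]
    except KeyError:
        raise ValueError(f"Unsupported mass unit: {unit!r}") from None
    return value * factor


# Hypothetical observations reported in mixed units.
observations = [
    ("Vulpes vulpes", 2.0, "kg"),
    ("Apis mellifera", 500.0, "mg"),
]

harmonized = [
    (species, harmonize_mass(value, unit), "g")
    for species, value, unit in observations
]
```

Raising an error for an unrecognized unit, rather than passing the value through unchanged, keeps silent unit mismatches out of an aggregated dataset.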

Activity 3: Advocating for a free flow of data and appropriate credit

One goal of the OTN is to increase the use of open datasets and to ensure due credit is given to researchers who collect or synthesize primary data. Without effective reward or motivation for collecting new trait observations or sharing legacy data, a trait synthesis across the Tree of Life will remain unattainable. Currently, motivation for collecting and sharing new primary data is not strong and direct funding for trait data management is scarce.

The OTN can strengthen the attribution of credit to data providers and promote new data collection via two paths. Firstly, the OTN will encourage citation back to the primary source via a permissive license model that secures authorship attribution (for example, Creative Commons Attribution 4.0 Int; CC BY 4.0) and the use of DOIs and Open Researcher and Contributor ID (ORCID) identifiers. Open-access datasets with a DOI can be tracked to understand patterns of re-use and to assess the impact of the author’s decision to share.

There is an important distinction between sharing data within a network and making data publicly available under an open license. Clear license arrangements increase visibility and promote fair attribution and citation (for example, using Creative Commons licenses such as CC-BY or CC0). CC-BY requires attribution (that is, citation) to the original creator whereas CC0 does not legally require users of the data to cite the source, though this does not affect ethical norms for attribution in research communities (https://creativecommons.org/share-your-work/public-domain/cc0/). Identifying who should be credited for prior work on legacy data is complicated by the involvement of many individuals. This issue could be solved, in part, by inviting organizations to be named as contributors or co-authors on outputs using their data or (looking forward) implementing new ways of documenting who should be credited for making specimens or datasets usable in trait science.

Secondly, incentives to collect new trait data can be linked to the Open Science practice of pre-registration. In pre-registration, authors archive a public proposal for research activities (for example, via the Centre for Open Science; https://cos.io/prereg/) which, if approved, may receive in-principle acceptance from participating journals. As of March 2019, 168 journals were willing to give in-principle acceptance following pre-review of the study design prior to conducting field or experimental work. Ten of these participating journals regularly feature papers on trait-based science (for example, BMC Ecology and Ecology and Evolution). We envision a situation where the OTN Trait Registry (Activity 1) could be used to identify spatial or taxonomic gaps in trait data that could be coupled to pre-registered hypotheses. Together, pre-registration and in-principle acceptance of findings could incentivize the collection of new data, circumventing the growing reliance on available data with known gaps.

Activity 4: Creating a trait core to facilitate synthesis and standardization

Trait science requires its own ‘core’ terminology or data standard that is flexible enough to capture the complexity of trait data. Building on efforts to standardize occurrence data (that is, Darwin Core58) and biological inventories (that is, Humboldt Core43,59), the OTN envisions a trait core offering a set of cross-domain metadata standards and controlled vocabularies that are (ideally) connected to trait ontologies via unambiguous identifiers. This standard terminology would be implemented across trait-data publications, unifying data in decentralized repositories as well as centralized data portals.

A trait core would allow trait data to be: (1) interpreted accurately within the context of their collection (that is, including information on associated data on factors such as environmental conditions at collection sites, taxa covered, data custodians or collection methods); and (2) known by compatible terms so that observations of similar phenomena can be grouped and compared (that is, what is meant by ‘generation time’ or ‘establishment’ across taxonomic groups73,74). Existing initiatives may provide logical cornerstones for referencing terms and concepts, including Ecological Metadata Language41. Several initiatives implement the Ecological Metadata Language (for example, The Knowledge Network for Biocomplexity75, Darwin Core58 and Humboldt Core59) and the use of referencing terms from anatomy or phenotype ontologies (for example, the Plant Ontology66 and the Vertebrate Trait Ontology67) to relate traits to publicly defined terms, allowing annotated data to be processed computationally (http://www.obofoundry.org).
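The second requirement, grouping observations under compatible terms, can be sketched as a lookup from dataset-specific trait names to shared identifiers. This is an illustration only: the identifiers (TRAITCORE:0001 and so on), labels and synonym lists are invented placeholders and do not correspond to any existing ontology or to a ratified trait core.

```python
# Hypothetical controlled vocabulary: identifier -> label and accepted synonyms.
VOCABULARY = {
    "TRAITCORE:0001": {
        "label": "body mass",
        "synonyms": {"body mass", "body_mass", "mass", "weight"},
    },
    "TRAITCORE:0002": {
        "label": "generation time",
        "synonyms": {"generation time", "generation_length"},
    },
}


def normalize(name):
    """Lower-case a trait name and collapse underscores and repeated spaces."""
    return " ".join(name.lower().replace("_", " ").split())


def build_lookup(vocabulary):
    """Map every normalized synonym to its term identifier."""
    lookup = {}
    for term_id, term in vocabulary.items():
        for synonym in term["synonyms"]:
            lookup[normalize(synonym)] = term_id
    return lookup


LOOKUP = build_lookup(VOCABULARY)


def resolve(name):
    """Return the shared identifier for a trait name, or None if unmapped."""
    return LOOKUP.get(normalize(name))
```

Returning None for unmapped names, instead of guessing a nearest match, leaves the decision about new or ambiguous terms with the relevant taxon or discipline experts.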

Progress towards a trait core is already being made through the development of a prototypal Ecological Trait Standard76 (Box 3). However, the development and adoption of a trait core requires consultation and coordination within the broader scientific community, a goal which the OTN is ideally placed to advance. The OTN can mobilize expertise for cross-domain workshops and advocate for funding, which allows not only meetings of experts but also the creation of cyber-infrastructure for synthesis nodes (dark-green nodes in Fig. 2c). Links to emerging initiatives for biodiversity data standardization (for example, Species Index of Knowledge57) will also be vital for success, as will ratification of the core through the Biodiversity Information Standards (TDWG, www.tdwg.org).

Activity 5: Facilitating consistent approaches to measuring traits within major groups

The OTN will share new developments towards protocols and handbooks for major clades that standardize approaches to capture trait observations. Protocols are necessary because downstream activities such as developing metadata standards (Activity 4) will be impossible if trait measurement protocols do not exist. Some research communities have adopted standardized terms56,62 and data collection protocols (for example, plants20,77,78,79,80, invertebrates29,81,82,83, mammals36 and aquatic life30,32,84), though these may not always fit the requirements of some studies (for example, where trait variability rather than the average trait of species is targeted85). Protocols and handbooks may not emerge rapidly and should have the flexibility to be open to innovation through a commitment to version control and updates as techniques evolve. Two versions of the plant trait measurement handbook have been published77,86 and several online resources exist that can be updated regularly (see http://prometheuswiki.org/tiki-custom_home.php).

Standardizing approaches to trait measurement across research communities will reduce ambiguity when aggregating data and improve the quality of resulting datasets. Integrating trait standardization and databasing into taxonomic workflows constitutes a challenge and an opportunity7 that holds the promise of bridging the long disconnect between structural and functional traits. The presence of a range of biodiversity collections personnel in the OTN and an open invitation for more to join is expected to catalyse the adoption of trait-based thinking into taxonomic practices.

Concluding remarks

This is the opportune time to push towards a new approach to sharing and synthesizing trait data across all organisms. Trait science has great potential to increase its taxonomic, phylogenetic and spatial scopes by leveraging data-science tools, embracing Open Science principles, and creating stronger connections between researchers, institutions, publishers and funding bodies. We hope that trait enthusiasts, regardless of field and research stage, will engage with the OTN via our website (www.opentraits.org) and help build new connections between disciplines, institutions and taxonomic domains. By adding metadata profiles for datasets to the OTN Trait Dataset Registry, trait collection efforts become more findable, as do the researchers who have compiled them. We envision that by connecting people with common goals, we can work collectively towards a synthesis of global trait data to preserve the nuances of taxon-specific expertise while also facilitating collaboration across domains. We urge scientists and institutions keen to commit to Open Science principles to make use of existing resources, including those offered by the Centre for Open Science (https://cos.io/), the Open Science Training Handbook (https://open-science-training-handbook.gitbook.io/book/), the Open Science Training Initiative (http://www.opensciencetraining.com/index.php) and FOSTER (https://www.fosteropenscience.eu/toolkit).

To support and expand the activities of the OTN, we will grow membership and develop communities around synthesis nodes to undertake key activities and secure funding support, in particular for the development of a trait core. Funding for international workshops, technical support and implementation meetings could drive a new era of trait-based synthesis that mirrors the achievement of similar initiatives such as GBIF, which now houses >1 billion occurrence records.

By supporting a reciprocal exchange of expertise and outputs using Open Science principles between researchers and institutions, we can mobilize data for a cross-taxa, worldwide, trait-based data resource to examine, understand and predict nature’s responses to global change. As a better-connected OTN emerges, data streams and coordination will improve, allowing us to deliver information to support globally important research agendas (Box 1) as well as specific data and knowledge to the public through integration with third-party portals. Lessons learned along the path to a global synthesis of trait data across all organisms will provide a framework for addressing similarly complex, context-dependent challenges in biodiversity informatics and beyond.

Change history

  • 09 March 2020

    An amendment to this paper has been published and can be accessed via a link at the top of the paper.

Acknowledgements

The ideas presented stem from initial discussions at three international meetings: the Australian National Climate Change Adaptation Research Facility Roundtable on Species Traits, the iDigBio ALA Traits workshop, and the preliminary Open Traits workshop held at the Ecological Society of America. R.V.G. is supported by an Australian Research Council DECRA Fellowship (DE170100208). D.S.F. is supported by an Australian Research Council Future Fellowship (FT160100113). R.S.-G. is supported by NERC R/142195-11-1. W.D.P. is supported by NSF ABI-1759965, NSF EF-1802605, and USDA Forest Service agreement 18-CS-11046000-041. A.K. received financial support for M.J.A. from the German Research Foundation (DFG KE1743/7-1). C.M.I. was supported by the Biological and Environmental Research program in the United States Department of Energy’s Office of Science. C.P. is supported by the DFG Priority Program 1374. M.J. was supported by the German Research Foundation within the framework of the Jena Experiment (FOR 1451) and by the Swiss National Science Foundation. S.P.-M. was supported by the Benson Fund from the Department of Paleobiology, National Museum of Natural History. S.T.M. is supported by SERDP project RC18-1346. B.J.E. was supported by NSF Grants DEB0133974, HDR1934790 and EF1065844, a Leverhulme Trust Visiting Professorship Grant, and an Oxford Martin School Fellowship.

Author information

R.V.G. wrote the manuscript with contributions from D.S.F., B.S.M., R.S.-G., V.V., W.D.P., F.D.S., J.K., J.H.P., J.S.M., M.J.A., C.P., X.F., V.M.A., J.A., S.C.A., M.A.B., L.M.B., B.L.B., C.H.B.-A., I.B., A.J.R.C., R.C., B.R.C., D.A.C., S.L.C., B.F., H.G., A.H.H., J.H., J.A.H., H.H., M.H., C.M.I., M.J., M.K., A.K., P.Mabee, P.Manning, L.M., S.T.M., D.S.P., T.M.P., S.P.-M., C.A.R., M.R., H.S., B.S., M.J.S., R.J.T., J.A.T., C.V., R.W., K.C.B.W., M.W., I.J.W. and B.J.E.

Correspondence to Rachael V. Gallagher.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Cite this article

Gallagher, R.V., Falster, D.S., Maitner, B.S. et al. Open Science principles for accelerating trait-based science across the Tree of Life. Nat Ecol Evol 4, 294–303 (2020). https://doi.org/10.1038/s41559-020-1109-6
