In the past decade, microbiome researchers have produced a wealth of sequence data and associated microbial outputs and activity1. Most microbiome studies to date have analysed microbiomes in order to answer a specific research question, with little emphasis placed on data reusability. However, increasing microbiome data reuse would result in a better return on investments and an increase in accessibility to microbiome science for researchers unable to generate their own microbiome data. It may also enable researchers to address scientific questions beyond the scope of the original study2. Recognizing these advantages, the scientific community has repeatedly advocated for findable, accessible, interoperable and reusable (FAIR) data3. We launched the National Microbiome Data Collaborative (NMDC) Ambassador Program in April 2021 to increase FAIR data adoption and metadata stewardship among microbiome researchers. Herein, we provide information and insights regarding the implementation of a community learning model to assist in the advancement of other iterations of this type of programme.

The campaign for FAIR data advocates for sufficient metadata (for example, geographic location, sampling date, host information, and so on) to be available for microbiome datasets, because the utility and reusability of data sharply decrease when crucial metadata fields are missing, insufficient or improperly formatted4,5. Training events, ideally with a practical or hands-on component, are essential to increase recognition and adoption of metadata standards.
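To make the point about missing or improperly formatted fields concrete, a minimal Python sketch of a per-sample metadata check is shown here. It is an illustration only, not an NMDC or Genomic Standards Consortium tool. The field names follow MIxS-style terms, but the choice of required fields, the expected formats (an ISO 8601 date and space-separated signed decimal degrees) and the function name `check_sample_metadata` are assumptions made for this example.

```python
from datetime import date

# Assumption: an illustrative minimal set of required fields. The names are
# MIxS-style terms, but which fields are mandatory depends on the checklist.
REQUIRED_FIELDS = ("collection_date", "geo_loc_name", "lat_lon")


def check_sample_metadata(record):
    """Return a list of human-readable problems found in one sample record."""
    problems = []

    # 1. Flag fields that are absent or contain only whitespace.
    for field in REQUIRED_FIELDS:
        value = record.get(field)
        if value is None or not str(value).strip():
            problems.append(f"{field}: missing")

    # 2. Flag dates that are present but not ISO 8601 (assumed format).
    raw_date = str(record.get("collection_date") or "").strip()
    if raw_date:
        try:
            date.fromisoformat(raw_date)
        except ValueError:
            problems.append("collection_date: not an ISO 8601 date")

    # 3. Flag coordinates that are present but malformed or out of range.
    #    Assumed format: "<latitude> <longitude>" in signed decimal degrees.
    raw_lat_lon = str(record.get("lat_lon") or "").strip()
    if raw_lat_lon:
        try:
            lat, lon = (float(part) for part in raw_lat_lon.split())
        except ValueError:
            problems.append("lat_lon: expected two decimal numbers")
        else:
            if not (-90 <= lat <= 90 and -180 <= lon <= 180):
                problems.append("lat_lon: out of range")

    return problems


if __name__ == "__main__":
    sample = {
        "collection_date": "04/15/2021",  # wrong format
        "geo_loc_name": "",               # empty
        "lat_lon": "37.87 -122.25",       # fine
    }
    for problem in check_sample_metadata(sample):
        print(problem)
```

A check of this kind only catches structural problems. Whether a value is scientifically meaningful still depends on the researcher and the relevant standard.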

The NMDC was founded in 2019 as a Department of Energy project with the intention of making multi-omics microbiome data FAIR and open to all6. Core to this mission is the implementation of standardized metadata fields linked to microbiome datasets, leveraging the minimum information about any (x) sequence (MIxS) standards developed by the Genomic Standards Consortium7. During the initial stages of the NMDC programme, it became clear that knowledge of the benefits of metadata standards, and of the metadata templates available, varied widely across the microbiome research community, with some researchers unaware of them altogether8. To address this knowledge gap, we set up the NMDC Ambassador Program as a community learning model.

Community learning models enable a cohort of researchers to receive targeted training and materials to disseminate and present to their own research communities, and have been shown to increase the overall reach of the training content9. The NMDC advertised the Ambassador Program through social media, mailing lists, the NMDC website and professional networks. The aim was to support a culture change within microbiome research towards increased metadata stewardship by recruiting, training, and supporting early career researchers to host metadata standards workshops.

From 56 total applications, 12 early career researchers, representing 11 institutions across the United States, were selected for the pilot scheme. Ambassador backgrounds, areas of expertise and microbiomes of interest were varied. Ambassadors received an honorarium to cover costs associated with workshop implementation. Six unique training sessions for the Ambassadors were hosted by the NMDC team in partnership with the Center for Scientific Collaboration and Community Engagement (CSCCE). These sessions covered best practices in data management, metadata standards, community engagement and effective practices to serve as community liaisons.

After completion of all six training sessions, each Ambassador was asked to adapt NMDC-created metadata training templates to include specifics regarding their microbiome of interest. They also prepared hands-on guides for MIxS metadata spreadsheets. Ambassadors were responsible for hosting two workshops, or training sessions, centred around metadata standards for their respective research communities. Table 1 outlines the full list of Ambassador Program expectations.

Table 1 Ambassador Program requirements and expectations

To measure the impact of these workshops, post-workshop surveys were conducted to assess how attendee knowledge of, and familiarity with, metadata standards increased after Ambassador events10. In less than a year, a total of 23 Ambassador-hosted workshops and presentations reached more than 800 researchers, and demonstrated improvement in participant recognition of, and practical experience with, metadata standards (Fig. 1).

Fig. 1: The selection of the pilot cohort of NMDC Ambassadors, the locations of their affiliated institutions and the quantitative impact of the Ambassador Program.

Eligibility criteria are described here. Publ. note: Springer Nature is neutral about jurisdictional claims in maps.

Due to the COVID-19 pandemic, the programme’s outreach strategy had to pivot from the envisioned in-person workshops to predominantly virtual options. CSCCE trained Ambassadors in virtual event organization so that they could switch to online-only formats. Ambassadors implemented several CSCCE-proposed strategies, including real-time polls, digital whiteboards and live chat, all of which helped to make virtual events more engaging and informative.

The NMDC continues to recruit 10–20 Ambassadors through this programme annually. Lessons from the first cohort gave the NMDC team valuable insights into how to improve the programme. Feedback and recommendations collected from the Ambassadors during post-programme surveys and debriefing calls were summarized in a final report and presentation by CSCCE to the NMDC team. The Ambassadors wanted more guidance on choosing venues for workshops. Therefore, more comprehensive language was drafted, and a list will be provided to future Ambassadors as a starting point for venue choices. Several Ambassadors noted the benefits of having an NMDC team member present at their events to answer questions, especially regarding how metadata standards fit into the overall NMDC mission, and how metadata requirements form the basis of the NMDC Data Portal11. This feedback led the NMDC team to create supplemental training materials for Ambassadors focused on how to talk about and answer questions relating to the NMDC. The Ambassadors also recommended having more variety in workshop content; the programme scope was therefore amended to include material on microbiome bioinformatics workflows and data reuse through the NMDC Data Portal, while continuing to include content on metadata standards.

The pilot cohort of the Ambassador Program demonstrated the advantages and practicality of a community learning model in microbiome research and biological data stewardship. Ambassadors received extensive training from industry experts, honed their presentation and hosting skills, and were given key opportunities to network and present at diverse venues. Equally, the Ambassadors advanced the NMDC mission and broadened the NMDC community network by reaching more researchers than NMDC team members could have reached alone.

The NMDC Ambassador Program will continue to evolve, and we hope that it will continue to increase awareness of, and proficiency with, data stewardship, leading to better use of microbiome datasets over time.