Science progresses through the acquisition, accumulation and dissemination of knowledge, which in turn is generated by gathering, curating and interpreting data. Data have been generated at increasing speed and volume over the past decades1, and are often the foundation of scientific publications. Yet, in the status quo, the data are very rarely findable, accessible, interoperable, reusable, and open (FAIR/O), either during the article reviewing process or after the article has been published2. Verifying, replicating, and reusing data, all processes fundamental to science, are therefore often unachievable. This significantly limits scientific understanding, rational and data-driven research, and machine-actionability2,3. Aside from the lack of accessibility, there is often a lack of standardized procedures to collect data, for example, to manufacture and test materials, making comparison and benchmarking unreliable4.

Databases, both general and domain-specific, that meet the FAIR/O principles could provide a solution to the limitations of the status quo (Table 1). The Open Membrane Database (OMD), launched in August 2021, is one such database, designed around FAIR/O data management in the field of water purification membranes5. It was developed by researchers, for researchers, as an open-source archive of the performance and physicochemical properties of reverse osmosis desalination membranes. The protocols by which the data are gathered are openly accessible and all datapoints have a unique and persistent identifier. The OMD falls under a Creative Commons license that allows users to copy and redistribute the data in any medium or format, and to remix, transform, and build upon the material (that is, CC BY-NC 4.0). This domain-specific database also contains interactive plots and filters to explore and manipulate the dataset in various ways, and it has the option to export the data as a plain text file. The OMD therefore meets the FAIR/O guidelines and can be used free of charge under the terms of this license.

Table 1 Comparison of data features between the current status quo and the aspiration of domain-specific databases

The initial OMD dataset was uploaded by the founding team. To make the OMD a sustainable, community-supported database, an online submission tool was implemented for external users to submit their own data originating from published and peer-reviewed papers. The tool guides the user through a series of input fields requiring information on the peer-reviewed manuscript, the membrane type and performance, the testing conditions, and the membrane physicochemical properties. Even though the OMD is getting substantial attention from the community through website visits, paper citations, and attendance of presentations at conferences and webinars, no external submissions have been recorded since its launch. The anticipated involvement of the community thus did not become reality. As a consequence, the sustainability of the OMD is in jeopardy, limiting the potential of big data to understand the fundamental factors governing reverse osmosis desalination, and to direct novel membrane design.

As a co-founder of the OMD, I have to admit it was a sobering experience to see no external submissions to the database since its launch over a year ago, despite our sustained efforts in this direction. It was a confronting reality check on the state of data sharing in academia, or at least in the field of water purification. The difficulties encountered with the crowd-sourced aspect of the OMD made me ponder what motivates researchers to share their data, and whether the low involvement so far could be symptomatic of a bigger problem related to the academic structures in place. It also made me think about how other stakeholders in the Open Science debate, for example, the universities, the funding agencies, and the journals, could encourage researchers to implement FAIR/O data practices (Fig. 1).

Fig. 1: Incentives of researchers to contribute to FAIR/O data management.

The intrinsic incentives consist of getting increased exposure from the community and of serving the common good. Extrinsic incentivization can come from the involved stakeholders (for example, journals, universities and funding agencies). The principles of the current academic system are not aligned with committing to the common good and do not directly incentivize researchers to contribute to FAIR/O data practices.

On an individual level, the intrinsic motivation of researchers to contribute to FAIR/O data could be to get increased exposure from the community, while at the same time serving it. In a way, the main incentive could be to commit to the ‘common good’: the more data, the better the predictive power of the applied algorithms, and therefore the higher the chance that the field achieves a scientific breakthrough. At the very minimum, larger datasets are likely to improve data-informed decision making for future research, making science as a whole more resource-efficient than the current status quo6. Although committing to the common good is an incentive of high ethical value, I feel that it has faded into the background and has taken a low priority in the work of individual scientists or research groups. I am convinced that researchers are still working towards the greater good and wish to contribute to it. However, the way in which the academic system operates does not prioritize involvement of the community in FAIR/O data practices.

When looking at the pillars of the current academic culture, it seems that quantitative, individual performance metrics dominate the system, as illustrated by the now commonly accepted ‘publish or perish’ principle7,8. In this competitive environment, there might just be no room left for ‘common good’ practices because they are mismatched with the drivers that are now in place. Indeed, researchers are currently not rewarded in any direct form for the time spent contributing to FAIR/O data, causing this task to fall through the meshes of a net made out of competition, publication count, citations, and h-indices. To prevent this and to simultaneously address the well-established mental health difficulties and high levels of stress in academia9, a systemic change of the current academic system and incentive structures would be required. Committing to the common good would indeed be more straightforward in a culture that is built on collaboration, inclusivity, long-term thinking and slow-paced but impactful science10,11.

A shift in academic culture from competition to collaboration, from quantity to quality, from short-term to long-term thinking, and from sugar-coating to frankly informing, will not take place overnight. It will require courage, willpower and tenacity. Perhaps most importantly, it will need the involvement of all stakeholders, not only academics. Fortunately, there are visible signs of a generally increased awareness surrounding the unhealthy logic of the current metric-driven system12,13. Hopefully, FAIR/O data management practices can benefit from this movement and surf along with it.

Beyond a cultural shift in academia, I believe that a number of external sources can help further incentivize and reward FAIR/O data practices. We could imagine, for example, special recognitions given to the largest contributors to a database, similar to the ‘Certificates of Recognition’ for reviewers now awarded by journals14,15,16. This type of recognition might become significant once a database is established, though the risk is that it would put us back in the current prestige-driven short-sighted mindset. Therefore, incentives and rewards coming from the existing overarching structures are likely to be more desirable in the long term.

The universities should encourage and support their researchers in making FAIR/O data an integral part of the research process2. Furthermore, the common perception of doing this task pro bono can be overcome by developing appropriate forms of recognition for the efforts made. These aspects are predominant in the recent ‘Sorbonne declaration on research data rights’, signed by leading research universities across the globe17. The emergence of declarations like this and others18,19 demonstrates the increased awareness of the difficulties encountered with FAIR/O data, but also indicates that the general trend is in favour of it. For their part, the funding agencies should provide funds to develop the digital infrastructure needed to make FAIR/O data sustainable, while simultaneously mandating researchers to share their data on trustworthy repositories. The latter is a key part of a so-called data management plan, which is slowly becoming an integral and mandatory part of grant and contract agreements20,21. Finally, scientific journals could also significantly contribute to making data FAIR/O and more discoverable, as exemplified by the journal Data in Brief, which describes and provides access to research data22. More generally, journals could become a key player in FAIR/O data practices by requiring the raw research data to be deposited in a repository, as advocated by FAIRsharing21 and as executed by, for example, Figshare23.

External stakeholders increasingly recommend transparency and sharing of data, but still in a rather suggestive way. For example, most journals nowadays recommend publishing and sharing data, but they have not yet made it mandatory in most fields of science. We are headed in the right direction, but there is still some reluctance. This hesitancy might simply be because FAIR/O data management has only relatively recently entered the scientific and public debate, suggesting more time is needed to achieve a mindset change. Similarly, FAIRsharing is not yet common practice amongst most researchers, including in the field of water treatment24. The growing demand for FAIR/O data will thus require a significant investment from all involved stakeholders to create awareness and educate researchers about the power and the future of this new way of doing science21,24. In addition, only very few domain-specific repositories exist, even though they are typically more effective in imposing structured FAIR/O data practices and standards than the more common generalist repositories24. In this regard, I envision journals playing an increasingly important role as a ‘data transfer hub’: they can mandate researchers to deposit the data in a repository, which is in turn linked to an external domain-specific database, such as the OMD. The database is then automatically fed with the newly peer-reviewed data, ensuring its sustainability. The database can simply store the data, or it can also provide tools for data exploration to increase its usefulness to the community. Obviously, these types of well-developed cyber infrastructures with a long-term vision will necessitate significant public investment. This investment has to be seen, however, as part of the necessary efforts to enable sustained scientific innovation.

The lack of external submissions to the recently launched OMD was employed as a starting point to discuss the incentives of researchers to contribute to FAIR/O data management. While external stakeholders can definitely help incentivize researchers to commit to FAIR/O data, advocating for a culture shift in the academic system away from performance metrics and towards more collaboration and long-term thinking is equally essential. If we, as academics, truly believe that FAIR/O science can accelerate discovery and direct innovation, then we need to take an honest look in the mirror and ask why we haven’t already contributed to the structures that currently exist, and what we would need to get involved in the future. It is worth asking ourselves these questions while we are still in the phase of getting all stakeholders on board and of setting up the required infrastructure.