The European Commission has its critics, but no one can doubt it has ambitious plans. For example, by the year 2020, the commission says, all European researchers will be able to log in to an enormous virtual repository that will eventually provide access to the collective data from all publicly funded research. This European Open Science Cloud would be a safe, cheap and reliable way to store and access data, and getting to it would be as easy as signing into a Netflix account. It would also be a massive boon — encouraging interdisciplinary research and data reuse, reducing duplication and promoting reproducibility. Sounds more like a dream than a plan? Some scientists think so. Few even know the project exists.
Given the enormous value of such a system to researchers, their obliviousness is a sign that some of the people tasked with bringing the vision to life do not yet believe it will happen. And even those with faith don’t know exactly what it would look like or how it would come about.
The vagueness is no great surprise. Rather than construct a single physical data repository, the commission wants to bring together and build on existing research data centres, both public and private. It would connect these using a single interface with common software and protocols. This is an efficient use of resources, but also a logistical and coordination nightmare.
Much of the commission’s work so far has gone into understanding why researchers do not already routinely share data, focusing on existing incentives and the need for expertise. It thinks, rightly, that the cloud is the way to push the underlying culture of science towards data sharing. But now, getting the project going is crucial.
To that end, the commission gathered data experts from Europe’s major laboratories, science funders and government representatives in Brussels on 12 June. The grand — if somewhat staged — aim of the event was to get all parties to endorse the project. But, revealingly, many attendees saw the project unfolding in a range of different ways. These conflicting views will have to be aligned somehow by the end of the year, when the commission intends to publish a formal plan.
Outstanding issues include the big one: how to pay. Major data repositories and shared cloud-computing facilities already exist, but some will need to grow. All will need to be made interoperable and connected by a high-bandwidth network. The commission has said that it expects to pay €2 billion (US$2.2 billion) of an overall €6.7 billion — the rest of which it hopes will come from national funders and private sources using “innovative” business models.
One group that was little represented at the invitation-only event was commercial companies. These will be essential to bringing together under one virtual roof the many petabytes (10¹⁵ bytes) of data that European institutions generate each month.
Meanwhile, there are more subtle challenges that will take time and money to solve. The envisioned software tools to search, browse and access data do not yet exist. And if the cloud is to become more than somewhere that research data go to die, the data must come annotated and formatted in such a way that other scientists can make sense of them.
Currently, the commission’s dream is so big and shapeless that many involved can’t see a path to achieving it. German and Dutch ministers warned last month that it risked getting bogged down in detail and funding disputes. Wary of failing to capitalize on the existing will to complete the project, they called for support for an initiative already under way in their countries. This aims to kick-start the wider science cloud by getting existing data infrastructures to agree on protocols that make at least some of their data findable, accessible, interoperable and reusable (FAIR). The project — called GO FAIR — intends to develop a template for linking up new partners, including cross-border collaborators, within a year.
Other European countries should not fear getting behind this initiative, or other pilot schemes that have grown organically in recent years. As long as communication lines remain open, finding out what works and what doesn’t, and taking some responsibility for tackling the huge range of questions, can only be valuable. It could also provide the necessary momentum. Although such an ambitious plan might never have emerged without European leadership, the size of the project has brought inertia. And like a PhD student faced with the looming task of writing up a thesis, those involved may find the project almost too daunting to start. Getting a range of players to agree to the cloud as a goal was a crucial first stage. But progress is now more likely to come from pulling their heads down from the clouds and getting stuck in.