From astronomy to genomics, scientists are increasingly storing and studying their data sets on shared remote ‘cloud’ computing servers, accessed through the Internet. Three of Europe’s biggest research labs now want to help academics by working with commercial firms to create a continent-wide cloud-computing portal — and they are hoping to get backing from the European Commission.
Many researchers find cloud computing to be more flexible and efficient than buying expensive hardware — they can rent servers from firms such as Amazon and Google when they need a burst of power for an intensive computation, for example (see Nature 522, 115–116; 2015). Despite the advantages, some academics are concerned about security and reliability when storing their data on outside servers, says Bob Jones, a computer scientist at CERN, Europe’s particle-physics lab near Geneva, Switzerland.
Jones thinks that a single portal combining offerings from commercial providers and publicly funded infrastructure could solve some of these problems, and ultimately increase access to key data sets. Since 2012, CERN — with the European Space Agency and the European Molecular Biology Laboratory (EMBL) in Heidelberg, Germany — has been developing a test-bed system called the Helix Nebula. Run for two years with funding from the European Commission, and coordinated by Jones, the initiative has since evolved into a portal involving 30 different cloud providers, known as the Helix Nebula Marketplace (HNX). CERN has simulated particle collisions on the platform, and EMBL has used it to analyse genetic sequences, including some moved from Amazon’s cloud, says Rupert Lück, EMBL’s head of IT services.
Ambitions to expand were bolstered when, in May, the European Commission announced plans to fund a Europe-wide ‘research cloud’. “The commission likes the idea of open science,” Jones said on 26 June at a meeting in Geneva to discuss a European Open Science Cloud. “What we have to do now is take that enthusiasm from the public sector, the private sector and European institutions, and put it in place.”
The commission is not specifically backing Jones’ plan: it will launch its call for proposals in 2016 and says there are “a range of possibilities for business models”. It wants a virtual platform to host data and encourage their analysis and reuse across disciplines and borders. Climate and satellite data, for instance, “represent a goldmine for research, innovation and new business opportunities”, says the commission.
A European cloud for researchers built around the HNX would be a single gateway through which users could access cloud services and open research data from existing public infrastructure (for example, through the European Grid Infrastructure Federated Cloud, a network of largely publicly funded cloud services such as the Supercomputing Centre of Galicia in Spain) and through companies such as Cloudwatt, a provider based in Paris. A pilot platform would start relatively small, with the computing equivalent of 100 million hours of processor time and some 10 petabytes of storage (1 petabyte is 10^15 bytes). The network would need to expand to 20 times this size to serve the whole of Europe, says Jones.
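To put the scale-up in concrete terms, the short Python sketch below multiplies the quoted pilot figures by Jones's factor of 20. It assumes, for illustration only, that the factor applies equally to processor time and to storage; the article does not say so explicitly, and the names used are hypothetical.

```python
# Pilot figures quoted in the article.
PILOT_CPU_HOURS = 100_000_000   # 100 million hours of processor time
PILOT_STORAGE_PB = 10           # some 10 petabytes of storage
BYTES_PER_PB = 10**15           # 1 petabyte is 10^15 bytes
SCALE_FACTOR = 20               # "20 times this size", according to Jones


def scale_pilot(factor=SCALE_FACTOR):
    """Scale the pilot figures by `factor`.

    Assumption (not stated in the article): the same factor applies
    to both processor time and storage.
    Returns (cpu_hours, storage_pb, storage_bytes).
    """
    cpu_hours = PILOT_CPU_HOURS * factor
    storage_pb = PILOT_STORAGE_PB * factor
    return cpu_hours, storage_pb, storage_pb * BYTES_PER_PB


cpu_hours, storage_pb, storage_bytes = scale_pilot()
print(f"{cpu_hours:,} processor hours, {storage_pb} PB of storage")
```

Under that assumption, a Europe-wide service would need roughly 2 billion hours of processor time and about 200 petabytes of storage.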
An advantage of such a system is that all data would be stored, protecting them if a provider were to stop operating, says Jones. And the system’s standard terms would make it quicker and easier for researchers to sign up to and access, he says. “The most valuable thing for researchers is their data. If we’re going to convince researchers to trust cloud services, we really do need this hybrid model.” A federated European cloud could also deal with restrictions that require sensitive data to be analysed in its country of origin, says Lück.
In the United States, researchers and funders are also thinking about how to increase access to data stored on clouds variously funded by the US National Science Foundation, individual institutions and companies, says David Lifka, director of the Cornell University Center for Advanced Computing in Ithaca, New York, which runs a service called Red Cloud. “Sharing cloud capacity is the next logical step,” he says. But creating a system that is fair and does not constrain users is not easy, he adds.
US computer giants Google, Amazon and Microsoft are notably absent from the HNX. Mark Skilton, who studies information systems at the University of Warwick, UK, suggests that the focus on European companies may reflect the commission’s desire to boost homegrown providers. “The issue is whether this will suffer for the lack of Amazon and Google scaling,” he says. Some researchers see the likes of Amazon and Google as a route to open data. Writing in Nature this week, genomics researchers call on funding agencies to expand access to major data sets by paying to place them in popular cloud services (see page 149).
The biggest barrier to cloud computing for small labs is the cost of accessing high-quality cloud resources, says Skilton. If the negotiating power of a European initiative can bring costs down, many could benefit, he says. But it is unclear whether commercial providers will play ball, says Lifka. Although firms often give trial periods for free, “from my experience, their price is their price”, he says. Getting everyone — especially commercial partners — to work under the same governance system and according to the same conditions will be an organizational challenge, says Skilton.
- See Editorial page 128