Nature | Comment

Technology: Sharing data in materials science

Two years on from the launch of the US Materials Genome Initiative, five experts highlight how materials scientists still need to work differently.


Sally Tinkle: Learn from other initiatives | David L. McDowell: Incentivize sharing | Amanda Barnard: Embrace uncertainty | Francois Gygi: Make simulations reproducible | Peter B. Littlewood: Probe the infinite variety

Sally Tinkle: Learn from other initiatives

Policy analyst at the Science and Technology Policy Institute, Washington DC

The US Materials Genome Initiative (MGI), launched in June 2011 by President Barack Obama, aims to halve the time and cost of developing advanced materials for applications such as energy, transport and security. Two years in, hundreds of millions of dollars have been invested in academic, industry and federal-agency projects.

Sharing data and developing computational tools are crucial to the MGI's success. Advanced materials have complex physical and chemical properties that can be manipulated for different applications, and these can change during synthesis, manufacture and use. The tracking of these properties is a formidable task, and the MGI includes efforts to standardize terminology, data-archiving formats and reporting guidelines.

Fortunately, much can be learned from existing collaborations in nanotechnology. The National Nanotechnology Initiative (NNI), established a decade earlier for materials in the 1–100-nanometre range, is a ready partner for the MGI, which encompasses scales from nanometres to micrometres.

The MGI could consider joining the NNI's Nanotechnology Knowledge Infrastructure initiative, which was launched in May 2012 to develop a digital data and information framework and to strengthen collaborations between the science and modelling communities. This initiative has already defined a set of Data Readiness Levels, modelled on NASA's Technology Readiness Levels, to provide a basis for communicating the quality and maturity of materials data.

The MGI could also join the partnership between the NNI and the European Commission to support a transatlantic dialogue on the nuts and bolts of data sharing: informatics, consensus-derived ontologies, data representation and archiving.

Data sharing is an inherently collaborative activity that has the potential to propel materials science forward more rapidly. The MGI can invigorate existing efforts and serve as a hub for sharing information on materials at all scales.

David L. McDowell: Incentivize sharing

Executive director of the Institute for Materials, Georgia Institute of Technology, Atlanta

The MGI must avoid a 'build it and they will come' attitude. Incentives are needed for scientists and engineers to collaborate and share their data and skills. There has to be something in it for everyone.

The data-sharing environment must invite collaboration as well as facilitate it. Stakeholders have broad interests that go beyond retrieving existing data — they want to discover materials and forecast enhanced products. An intuitive and robust online environment, and cyber-infrastructure growth that is distributed and organic, rather than centralized, will encourage contributions from diverse users.

Social-networking strategies can connect users with varied expertise to pursue common interests. Win–win approaches should be encouraged. For example, uploading experimental data sets in return for access to modelling tools drives further modelling. Clear agreements must govern credit attribution and the ethics of data use.

Maximizing the utility of information is a major attraction for investors in the MGI's infrastructure. Expensive data sets obtained, for example, from national synchrotron and neutron-diffraction facilities should be archived and leveraged to the greatest extent possible for searching and citation, as should data from massive supercomputer simulations.

Open-access rules are desirable, following the example of the National Science Foundation-funded nanoHUB for nanometre-scale modelling and simulation tools, as well as the LAMMPS molecular-dynamics code and the DREAM.3D software for meshing three-dimensional microstructures.

Amanda Barnard: Embrace uncertainty

Head of the Virtual Nanoscience Laboratory, Commonwealth Scientific and Industrial Research Organisation, Parkville, Australia

The MGI is opening up styles of collaborative working that raise technological and personal challenges. Materials scientists must become more comfortable with uncertainty. They must relinquish control, trust their fellow scientists, and resist the urge to redo everything 'just to be sure'.

Delivering new science from existing data requires the pooling of resources. Some insights and breakthroughs cannot be made any other way. One method may probe scales or achieve resolutions that others cannot. Electron microscopy can resolve atomic-scale features on surfaces, but optical microscopy shows how light reflects from them.

It is difficult to combine results from different sources. Errors arise from idiosyncrasies in experimental or computational techniques. Many experimentalists know the frustration of reproducing results that vary with laboratory conditions. Even theory-based computational methods can yield different answers.

Mixing data from different origins often introduces more uncertainty than a simple sum of the measurement or statistical errors stemming from the pure data sets. To benefit from data sharing, we must learn to live with that.
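The point about mixed data can be illustrated with a small numerical sketch. The two data sets below are invented for illustration and do not come from any real measurement; the variable names are likewise hypothetical. Each set is internally consistent, yet the gap between the two sources turns out to be much larger than the statistical errors alone would suggest.

```python
import math
import statistics

# Hypothetical measurements of the same property (arbitrary units) from two
# sources. The numbers are invented purely for illustration.
lab_a = [10.1, 10.2, 9.9, 10.0, 10.3]
lab_b = [10.9, 11.0, 11.2, 10.8, 11.1]


def standard_error(values):
    """Standard error of the mean, from the sample standard deviation."""
    return statistics.stdev(values) / math.sqrt(len(values))


# Naive expectation: the statistical errors of the two means added in quadrature.
naive_error = math.sqrt(standard_error(lab_a) ** 2 + standard_error(lab_b) ** 2)

# Actual disagreement between the sources: the gap between their means.
offset = abs(statistics.mean(lab_a) - statistics.mean(lab_b))

# Spread within each source compared with the spread of the pooled data.
within_spread = max(statistics.stdev(lab_a), statistics.stdev(lab_b))
pooled_spread = statistics.stdev(lab_a + lab_b)

print(f"naive combined error: {naive_error:.3f}")
print(f"gap between sources:  {offset:.3f}")
print(f"largest within-source spread: {within_spread:.3f}")
print(f"pooled spread:                {pooled_spread:.3f}")
```

In this invented case, the gap between the sources is several times larger than the naive combined error, and the pooled spread exceeds the spread within either source. That extra term is the kind of uncertainty that users of shared data would need to record and accept rather than ignore.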

The other sort of uncertainty that MGI users must embrace is the human element — our opinions of the people who created the original data and of their competence. Scientists are trained to be sceptical as well as objective. To move materials research forward quickly, we need to assume that each contributor is highly capable, and let the quality of the data speak for itself.

The MGI's value will only come if we can draw from it as easily and confidently as we give to it.

Francois Gygi: Make simulations reproducible

Professor of computer science, University of California, Davis

The most rapid rewards of the MGI could come from sharing simulations of materials structures.

Numerical simulations are not as reliable and reproducible as their theoretical and computational basis would suggest. They often give differing results owing to the complexity of approximations and the number of parameters used.

Overcoming these difficulties is essential for designing new materials. More robust predictions from simulations of the formation of defects in the lattice of a material, for example, improve our ability to optimize the material's strength or electronic properties.

Data are reliable only if they can be independently verified and reproduced by different research groups, ideally using different tools. Sharing data freely will make such cross-validation possible.

When disseminating simulation data, researchers must bear two points in mind. First, simulation software should be openly accessible, not just results. Software vendors must not forbid — as some currently do — publication of raw results or performance data out of fear that comparisons may show their product in an unfavourable light. The scientific community should fight this trend.

Second, universal data formats and centralized databases are not always necessary. The materials community could adopt existing frameworks for data sharing. For example, a vast amount of open-source software already supports the World Wide Web Consortium standards for publishing and exchanging data on the Internet, such as the Extensible Markup Language (XML).

With a modest investment, researchers can publish their own data on their own servers in ways that others can access readily. By encouraging the development of domain-specific web tools, we will lower the barriers to data cross-verification and validation.
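As a minimal sketch of what publishing in an existing open format might look like, the example below writes one simulation result to XML and reads it back using only Python's standard library. The element names (`simulation`, `total_energy`, `parameter`), the parameter names and all values are hypothetical and chosen for illustration; they do not represent an agreed community schema.

```python
import xml.etree.ElementTree as ET


def build_record(material, code, energy_ev, parameters):
    """Build an XML record for one simulation result (hypothetical layout)."""
    root = ET.Element("simulation")
    ET.SubElement(root, "material").text = material
    ET.SubElement(root, "code").text = code
    # Units are stored as an attribute so that readers need not guess them.
    energy = ET.SubElement(root, "total_energy", units="eV")
    energy.text = repr(energy_ev)
    # Record the input parameters so that others can attempt to reproduce the run.
    params = ET.SubElement(root, "parameters")
    for name, value in parameters.items():
        ET.SubElement(params, "parameter", name=name).text = str(value)
    return root


# Publisher side: serialize the record to text that a web server could host.
record = build_record(
    material="Si",
    code="example-code",  # placeholder, not a real package name
    energy_ev=-215.43,    # invented value
    parameters={"cutoff_ry": 40, "k_points": "4x4x4"},
)
xml_text = ET.tostring(record, encoding="unicode")

# Reader side: any XML-aware tool can recover the values and the inputs.
parsed = ET.fromstring(xml_text)
energy_read = float(parsed.find("total_energy").text)
units_read = parsed.find("total_energy").get("units")
params_read = {p.get("name"): p.text for p in parsed.iter("parameter")}

print(xml_text)
print(energy_read, units_read, params_read)
```

Because the record carries its own units and input parameters, a second group could rerun the calculation with a different tool and compare results without a centralized database.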

Peter B. Littlewood: Probe the infinite variety

Associate laboratory director for physical sciences and engineering, Argonne National Laboratory, Illinois

From synchrotrons to scanning-electron microscopes, nanotechnology tools have been honed in the information revolution. Now, through the MGI, we need to invent molecular manufacturing by expanding our vision to include the infinite variety of materials.

There are fundamental hurdles. Despite the initiative's ambitious name, atoms are not genes. The biological genome is both a theory and an algorithm for execution. In materials science, quantum mechanics can doom attempts to translate perfectly from code to function.

This theoretical impasse simply reflects the diversity of materials. Tiny variations in composition or structure can produce entirely new functions. The semiconductor industry depends on a delicate salting of silicon with minute concentrations of other atoms.

Yet chemistry can be systematic. Since Dmitri Mendeleev formulated the periodic table, we have exposed patterns of materials' structure and function, now sifted with the aid of powerful computers and high-throughput experiments. We are building, if not a single 'genome', a patchwork of tools matched to material type, property and function. The MGI will expand that.

But the brute-force approach of the modern electronics industry cannot be scaled up to make lightweight structural materials, batteries or solar cells. Here, production must be measured in megatonnes and square kilometres. The MGI has to help us beyond design and into synthesis — our goal being the engineering of programmable matter that builds itself.
