Two decades of fumigation data from the Soybean Free Air Concentration Enrichment facility

  • Elise Kole Aspray
  • Timothy A. Mies
  • Elizabeth A. Ainsworth
Data Descriptor


  • The ongoing debate on secondary use of health data for research has been renewed by the passage of comprehensive data privacy laws that shift control from institutions back to the individuals on whom the data was collected. Rights-based data privacy laws, while lauded by individuals, are viewed as problematic for the researcher due to the distributed nature of data control. Efforts such as the European Health Data Space initiative seek to build a new mechanism for secondary use that erodes individual control in favor of broader secondary use for beneficial health research. Health information sharing platforms do exist that embrace rights-based data privacy while simultaneously providing a rich research environment for secondary data use. Embracing rights-based data privacy promotes transparency of data use and gives individuals control over their participation, which builds the trust necessary for more inclusive, diverse, and representative clinical research.

    • Scott D. Kahn
    • Sharon F. Terry
    Comment (Open Access)
  • Data harmonization is an important method for combining or transforming data. To date, however, articles about data harmonization are field-specific and highly technical, making it difficult for researchers to derive general principles for how to engage in and contextualize data harmonization efforts. This commentary provides a primer on the tradeoffs inherent in data harmonization for researchers who are considering undertaking such efforts or who seek to evaluate the quality of existing ones. We derive this guidance from the extant literature and our own experience in harmonizing data for the emerging and important field of COVID-19 public health and safety measures (PHSM).

    • Cindy Cheng
    • Luca Messerschmidt
    • Joan Barceló
    Comment (Open Access)
  • Recent advances in radiomics and deep learning for computer-aided diagnosis, treatment response and prognosis challenge radiology with requirements for worldwide methodological standards for labeling, preprocessing and image acquisition protocols. The adoption of these standards in clinical workflows is a necessary step towards generalization and interoperability of radiomics and artificial intelligence algorithms in medical imaging.

    • Miriam Cobo
    • Pablo Menéndez Fernández-Miranda
    • Lara Lloret Iglesias
    Comment (Open Access)
  • Software and data citation are emerging best practices in scholarly communication. This article provides structured guidance to the academic publishing community on how to implement software and data citation in publishing workflows. These best practices support the verifiability and reproducibility of academic and scientific results, sharing and reuse of valuable data and software tools, and attribution to the creators of the software and data. While data citation is increasingly well-established, software citation is rapidly maturing. Software is now recognized as a key research result and resource, requiring the same level of transparency, accessibility, and disclosure as data. Software and data that support academic or scientific results should be preserved and shared in scientific repositories that support these digital object types for discovery, transparency, and use by other researchers. These goals can be supported by citing these products in the Reference Section of articles and effectively associating them with the software and data preserved in scientific repositories. Publishers need to mark up these references in a specific way to enable downstream processes.

    • Shelley Stall
    • Geoffrey Bilder
    • Timothy Clark
    Comment (Open Access)
  • The expansive production of data in materials science, together with their widespread sharing and repurposing, requires educated support and stewardship. In order to ensure that this need helps rather than hinders scientific work, the implementation of the FAIR-data principles (Findable, Accessible, Interoperable, and Reusable) must not be too narrow. In addition, the wider materials-science community ought to agree on strategies to tackle the challenges that are specific to its data, both from computations and experiments. In this paper, we present the result of the discussions held at the workshop on “Shared Metadata and Data Formats for Big-Data Driven Materials Science”. We start from an operative definition of metadata and the features that a FAIR-compliant metadata schema should have. We mainly focus on computational materials-science data and propose a constructive approach for the FAIRification of the (meta)data related to ground-state and excited-state calculations, potential-energy sampling, and generalized workflows. Finally, challenges with the FAIRification of experimental (meta)data and materials-science ontologies are presented together with an outlook on how to meet them.

    • Luca M. Ghiringhelli
    • Carsten Baldauf
    • Matthias Scheffler
    Comment (Open Access)