Independent verification of data is a fundamental principle of scientific research across the disciplines. The self-correcting mechanisms of the scientific method depend on the ability of researchers to reproduce the findings of published studies in order to strengthen evidence and build upon existing work. Stanford University medical researcher John Ioannidis, a prominent scholar on reproducibility in science, has pointed out that the importance of reproducibility lies not in ensuring the ‘correctness’ of results, but in ensuring transparency about exactly what was done in a given line of research1.
In theory, researchers should be able to re-create experiments, generate the same results, and arrive at the same conclusions, thus helping to validate and strengthen the original work. However, reality does not always meet these expectations. Too often, scientific findings in biomedical research cannot be reproduced2; consequently, resources and time are wasted, and the credibility of scientific findings is put at risk. Furthermore, despite recent heightened awareness, there remains a significant need to better educate students and research trainees about the lack of reproducibility in life science research and the actions that can be taken to improve it. Here, we review predominant factors affecting reproducibility and outline efforts to improve the situation.
What is reproducibility?
The phrase ‘lack of reproducibility’ is understood in the scientific community, but it is a rather broad expression that incorporates several aspects. Though a standardized definition has not been fully established, the American Society for Cell Biology® (ASCB®) has attempted a multi-tiered approach to defining the term reproducibility by identifying the subtle differences in how the term is perceived throughout the scientific community.
ASCB4 has discussed these differences with the following terms: direct replication, which is an effort to reproduce a previously observed result by using the same experimental design and conditions as the original study; analytic replication, which aims to reproduce a series of scientific findings through a reanalysis of the original data set; systematic replication, which is an attempt to reproduce a published finding under different experimental conditions (e.g., in a different culture system or animal model); and conceptual replication, where the validity of a phenomenon is evaluated using a different set of experimental conditions or methods.
It is generally thought that the improvement of direct replication and analytic replication is most readily addressed through training, policy modifications, and other interventions, while failures in systematic and conceptual replication are more difficult to connect to problems with how research was performed as there is more natural variability at play.
The reproducibility problem
Many studies claim a significant result, but their findings cannot be reproduced. This problem has attracted increased attention in recent years, with several studies providing evidence that research is often not reproducible. A 2016 Nature survey3, for example, revealed that in the field of biology alone, over 70% of researchers were unable to reproduce the findings of other scientists and approximately 60% of researchers could not reproduce their own findings.
The lack of reproducibility in scientific research harms health, lowers the efficiency of scientific output, slows scientific progress6,7, wastes time and money, and erodes the public’s trust in scientific research. Though many of these problems are difficult to quantify, there have been attempts to calculate financial losses. A 2015 meta-analysis5 of past studies regarding the cost of non-reproducible research estimated that $28 billion per year is spent on preclinical research that is not reproducible. Across biomedical research as a whole, it is estimated that as much as 85% of expenditure may be wasted due to factors that also contribute to non-reproducible research, such as inappropriate study design, failure to adequately address biases, non-publication of studies with disappointing results, and insufficient descriptions of interventions and methods.
Factors contributing to the lack of reproducibility
Failures of reproducibility cannot be traced to a single cause, but there are several categories of shortcomings that can explain many of the cases where research cannot be reproduced. Here are some of the most significant categories.
A lack of access to methodological details, raw data, and research materials
For scientists to be able to reproduce published work, they must be able to access the original data, protocols, and key research materials. Without these, reproduction is greatly hindered and researchers are forced to reinvent the wheel as they attempt to repeat previous work. The mechanisms and systems for sharing raw unpublished data and research materials, such as data repositories and biorepositories, need to be made robust so that sharing is not an impediment to reproducibility.
Use of misidentified, cross-contaminated, or over-passaged cell lines and microorganisms
Reproducibility can be complicated or invalidated by biological materials that cannot be traced back to their original source, are not thoroughly authenticated, or are not properly maintained. For example, if a cell line is not identified correctly, or is contaminated with mycoplasma or another cell type, results can be affected significantly and their likelihood of replication diminished. There are many cases of studies conducted with misidentified or cross-contaminated cell lines, so the results are rendered questionable and the conclusions drawn from them are potentially invalid8. Improper maintenance of biological materials via long-term serial passaging can also seriously affect genotype and phenotype, which can make reproducing data difficult. Several studies have demonstrated that serial passaging can lead to variations in gene expression, growth rate, spreading, and migration in cell lines9,10, and to changes in physiology, virulence factor production, and antibiotic resistance in microorganisms11,12,13.
Inability to manage complex datasets
Advancements in technology have enabled the generation of extensive, complex data sets; however, many researchers do not have the knowledge or tools needed for analyzing, interpreting and storing the data correctly. Further, new technologies or methodologies may not yet have established or standardized protocols, so variations and biases can be easily introduced, which in turn can affect the ability to analytically replicate the data.
Poor research practices and experimental design
Among the findings from scholarly efforts examining non-reproducibility is that, in a significant portion of cases, the cause could be traced to poor practices in reporting research results, and poor experimental design14,15. Poorly designed studies without a core set of experimental parameters, whose methodology is not reported clearly, are less likely to be reproducible. If a study is designed without a thorough review of existing evidence, or if the efforts to minimize biases are insufficient, reproducibility becomes more problematic.
Cognitive bias
Cognitive biases are the ways that judgement and decision-making are affected by the individual, subjective social context that each person builds around them. They are errors in cognitive processes that stem from personal beliefs or perceptions. Researchers strive for impartiality and try to avoid cognitive bias, but it is often difficult to completely shut out the subtle, subconscious ways that cognitive bias can affect the conduct of research16,17. Scientists have identified dozens of different types of cognitive biases, including confirmation bias, selection bias, the bandwagon effect, cluster illusion, and reporting bias17. Confirmation bias is the unconscious act of interpreting new evidence in ways that confirm one’s existing belief system or theories; this type of bias impacts how information is gathered, interpreted, and recalled. Selection bias occurs when researchers choose subjects or data for analysis without proper randomization; here, the sample obtained is not truly representative of the whole population. The bandwagon effect is the tendency to agree with a position too easily, without sufficient evaluation, in order to maintain group harmony; this form of bias may lead to the acceptance of unproven ideas that have gained popularity. Cluster illusion occurs when patterns are perceived in a pool of random data in which no actual pattern exists; it arises from the tendency of the brain to seek out patterns. Reporting bias occurs when study participants selectively reveal or suppress information in a study according to their own subconscious drivers; this form of bias may lead to underreporting of negative or undesirable experimental results.
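The cluster illusion can be made concrete with a small simulation. The sketch below is illustrative only: the function names, the choice of 100 coin flips, and the streak length of five are assumptions made for this example, not values taken from the cited literature. It counts how often a purely random sequence contains a streak long enough that an observer might read it as a pattern.

```python
import random


def longest_run(flips):
    """Return the length of the longest streak of identical consecutive values."""
    longest = current = 0
    previous = object()  # sentinel that never equals a real flip
    for flip in flips:
        current = current + 1 if flip == previous else 1
        longest = max(longest, current)
        previous = flip
    return longest


def fraction_with_streak(n_flips=100, min_streak=5, trials=2000, seed=0):
    """Estimate how often a fair-coin sequence contains a streak of at least min_streak.

    All defaults are illustrative assumptions; the seed makes the estimate repeatable.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        flips = [rng.random() < 0.5 for _ in range(n_flips)]
        if longest_run(flips) >= min_streak:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    print(f"Share of random sequences with a streak of 5+: {fraction_with_streak():.2f}")
```

Under these assumptions, a large share of random sequences contain such a streak, which is why an apparent cluster in a small data set is weak evidence of a real effect on its own.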
A competitive culture that rewards novel findings and undervalues negative results
The academic research system encourages the rapid publication of novel results. Researchers are rewarded for publishing novel findings rather than negative results (e.g., where a correlation was not found)15. Indeed, there are limited arenas for publishing negative results, even though such publications could help researchers focus their efforts and avoid repeating work that may be difficult to replicate. Overall, reproducibility in research is hindered by under-reporting of studies that yield results deemed disappointing or insignificant. University hiring and promotion criteria often emphasize publishing in high-impact journals and do not generally reward negative results. Also, a competitive environment for research grants may incentivize researchers to limit reporting of details learned through experience that make experiments work better.
Recommended best practices
A number of significant efforts have been aimed at addressing the lack of reproducibility in scientific research. Individual researchers, journal publishers, funding agencies, and universities have all made substantial efforts toward identifying potential policy changes aimed at improving reproducibility16,18,19,20,21. What has emerged from these efforts is a set of recommended practices and policy prescriptions that are expected to have a large impact.
Robust sharing of data, materials, software, and other tools
All of the raw data that underlie any published conclusions should be readily available to fellow researchers and reviewers of the published article. Depositing the raw data in a publicly available database would reduce the likelihood that researchers would select only those results that support a prevailing attitude or confirm previous work. Such sharing would accelerate scientific discoveries and enable scientists to interact and collaborate at a meaningful level.
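One practical aid to data sharing is to publish a checksum manifest alongside the deposited files, so that other researchers can confirm they are analyzing exactly the same raw data. The following is a minimal sketch of that idea using only the Python standard library; the function names and the manifest layout (relative path mapped to SHA-256 digest) are assumptions for illustration and are not prescribed by any repository mentioned here.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so large raw-data files need not fit in memory."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(data_dir):
    """Map each file's path (relative to data_dir) to its SHA-256 digest."""
    root = Path(data_dir)
    return {
        path.relative_to(root).as_posix(): sha256_of_file(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def changed_files(manifest, data_dir):
    """List files that were added, removed, or altered relative to the manifest."""
    current = build_manifest(data_dir)
    names = manifest.keys() | current.keys()
    return sorted(name for name in names if manifest.get(name) != current.get(name))
```

A depositor could save the result of `build_manifest` as JSON next to the data, and a reader could later call `changed_files` on a downloaded copy; an empty list indicates that the files match the manifest byte for byte.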
Use of authenticated biomaterials
Data integrity and assay reproducibility can be greatly improved by using authenticated, low-passage reference materials. Cell lines and microorganisms verified by a multifaceted approach that confirms phenotypic and genotypic traits, and a lack of contaminants, are essential tools for research. By starting a set of experiments with traceable and authenticated reference materials, and routinely evaluating biomaterials throughout the research workflow, the resulting data will be more reliable, and more likely to be reproducible.
Training on statistical methods and study design
Experimental reproducibility could be considerably improved if researchers were trained how to properly structure experiments and perform statistical analyses of results. By strictly adhering to a set of best practices in statistical methodology and experimental design, researchers could boost the validity and reproducibility of their work.
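Two routine design steps lend themselves to a short illustration: estimating how many samples each group needs, and assigning samples to groups at random in a way that can be repeated and audited. The sketch below uses the standard normal-approximation formula for comparing two group means, n = 2((z_{1-α/2} + z_{power}) / d)², where d is the standardized effect size. The function names, default values, and sample labels are assumptions chosen for this example; an exact calculation based on the t-distribution gives slightly larger numbers.

```python
import math
import random
from statistics import NormalDist


def samples_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate per-group sample size for a two-sided, two-group comparison.

    effect_size is the standardized difference between group means (Cohen's d).
    Uses the normal approximation, so treat the result as a lower bound.
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


def randomize_to_groups(sample_ids, groups, seed):
    """Shuffle samples with a recorded seed, then deal them out to groups in turn.

    Recording the seed in the methods lets others regenerate the same allocation.
    """
    rng = random.Random(seed)
    shuffled = list(sample_ids)
    rng.shuffle(shuffled)
    return {sample_id: groups[i % len(groups)] for i, sample_id in enumerate(shuffled)}


if __name__ == "__main__":
    print("Samples per group for d = 0.5:", samples_per_group(0.5))
    allocation = randomize_to_groups([f"S{i:02d}" for i in range(8)], ["control", "treated"], seed=2024)
    for sample_id in sorted(allocation):
        print(sample_id, allocation[sample_id])
```

Dealing shuffled samples out in turn keeps group sizes balanced, which simple coin-flip assignment does not guarantee for small experiments.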
Pre-registration of scientific studies
If scientists pre-register proposed scientific studies (including the approach) prior to initiation of the study, it would allow careful scrutiny of all parts of the research process and would discourage the suppression of negative results.
Publish negative data
‘Negative’ data that do not support a hypothesis often go unpublished because they are not considered high impact or innovative. Publishing negative data helps others interpret positive results from related studies and can help researchers adjust their experimental design so that further resources and funding are not wasted22.
Thorough description of methods
It is important that research methodology is thoroughly described to help improve reproducibility. Researchers should clearly report key experimental parameters, such as whether experiments were blinded, which standards and instruments were used, how many replicates were made, how the results were interpreted, how the statistical analysis was performed, how the randomization was done, and what criteria were used to include or exclude any data.
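The parameters listed above can be captured in a structured record that travels with the data, which makes omissions easy to spot before submission. The following is a minimal sketch; the class name, field names, and example values are hypothetical and do not correspond to any journal's official checklist.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List, Optional


@dataclass
class MethodsRecord:
    """Key experimental parameters; every field starts out as 'not yet reported'."""
    blinded: Optional[bool] = None
    instruments: List[str] = field(default_factory=list)
    reference_standards: List[str] = field(default_factory=list)
    replicate_count: Optional[int] = None
    statistical_analysis: str = ""
    randomization_procedure: str = ""
    inclusion_exclusion_criteria: str = ""
    interpretation_notes: str = ""


def unreported_parameters(record):
    """Return the names of fields that are still empty (None, empty text, or empty list)."""
    return [name for name, value in asdict(record).items() if value in (None, "", [])]


def to_json(record):
    """Serialize the record so it can be deposited alongside the raw data."""
    return json.dumps(asdict(record), indent=2, sort_keys=True)


if __name__ == "__main__":
    record = MethodsRecord(
        blinded=True,
        replicate_count=3,
        statistical_analysis="two-sided Welch t-test, alpha = 0.05",
    )
    print("Still to report:", unreported_parameters(record))
    print(to_json(record))
```

Note that `blinded=False` counts as reported: stating that an experiment was not blinded is itself useful information for anyone attempting a replication.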
Ongoing efforts to improve reproducibility
There is a varied and influential group of organizations that are already working to improve the reproducibility of scientific research. The following is a list of initiatives aimed at supporting one or more aspects of the research reproducibility issue.
American Society for Cell Biology (ASCB) - The ASCB Report on Reproducibility
ASCB continues to identify methods and best practices that would enhance reproducibility in basic research. From its original analysis, the ASCB task force identified and published several recommendations focused on supporting existing efforts and initiating new activities on better training, reducing competition, sharing data, improving peer reviews, and providing cell authentication guidelines.
American Type Culture Collection (ATCC) - Cell and Microbial Authentication Services and Programs
Biological resource centers, such as ATCC, provide the research community with standardized, traceable, fully authenticated cell lines and microorganisms to aid in assay reproducibility. At ATCC, microbial strains are authenticated and characterized through genotypic, phenotypic, and functional analyses to confirm identity, purity, virulence, and antibiotic resistance. ATCC has also taken a lead in cell line authentication by publishing the voluntary consensus standard, ANSI/ATCC ASN-0002: Authentication of Human Cell Lines: Standardization of STR Profiling, and by performing STR profiling on all human cell lines managed among its holdings.
Furthermore, ATCC offers online cell line authentication training in partnership with Global Biological Standards Institute, NIH (R25GM116155-03), and Susan G. Komen (SPP160007), which focuses on the best practices for receiving, managing, authenticating, culturing, and preserving cell cultures. To further support cell authentication and reproducibility in the life sciences, ATCC also provides STR profiling and mycoplasma detection testing as services to researchers.
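To illustrate how STR profiles are compared in practice, the sketch below computes a percent-match score between a query profile and a reference profile. It uses a formula commonly attributed to Tanabe and colleagues, 2 × shared alleles / (alleles in query + alleles in reference), and is offered as an assumption-laden example rather than as the procedure defined in ANSI/ATCC ASN-0002. The locus names are real STR markers, but the allele values are invented, and the choices to compare only loci present in both profiles and to count homozygous alleles once are simplifications made here. Match thresholds such as 80% are often quoted for declaring two samples related; the standard itself should be consulted for the applicable criteria.

```python
def tanabe_match_percent(query, reference):
    """Percent match between two STR profiles given as {locus: iterable of alleles}.

    Only loci present in both profiles are compared (an assumption of this sketch).
    """
    shared = total_query = total_reference = 0
    for locus in query.keys() & reference.keys():
        query_alleles = set(query[locus])
        reference_alleles = set(reference[locus])
        shared += len(query_alleles & reference_alleles)
        total_query += len(query_alleles)
        total_reference += len(reference_alleles)
    if total_query + total_reference == 0:
        raise ValueError("profiles have no alleles at any common locus")
    return 100.0 * 2 * shared / (total_query + total_reference)


if __name__ == "__main__":
    # Hypothetical profiles for illustration only.
    reference_profile = {"TH01": ["6", "9.3"], "D5S818": ["11", "12"], "TPOX": ["8"]}
    query_profile = {"TH01": ["6", "9.3"], "D5S818": ["11", "13"], "TPOX": ["8"]}
    score = tanabe_match_percent(query_profile, reference_profile)
    print(f"Match: {score:.1f}%")
```

Alleles are kept as strings because microvariants such as 9.3 are labels rather than quantities, and string comparison avoids floating-point surprises.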
National Institutes of Health (NIH) - Rigor and Reproducibility
To help improve rigor, reproducibility, and transparency in scientific research, the NIH issued a notice in 2015 that informed scientists of revised grant application instructions that focused on improving experimental design, authenticating biological and chemical resources, analyzing and interpreting results, and accurately reporting research findings. These efforts have led to the adoption of similar guidelines by journals across numerous scientific disciplines and have resulted in cell line authentication becoming a prerequisite for publication.
Science Exchange & the Center for Open Science - The Reproducibility Project: Cancer Biology
This initiative was designed to provide evidence of reproducibility in cancer research and to identify possible factors that may affect reproducibility. Here, selected results from high-profile articles are independently replicated by unbiased third parties to evaluate if data could be consistently reproduced. For each evaluated study, a registered report delineating the experimental workflow is reviewed and published before experimentation is initiated; after data collection and analysis, the results are published as a replication study.
Author Policies for Publication
Many peer-reviewed journals have updated their reporting requirements to help improve the reproducibility of published results. The Nature Research journals, for example, have implemented new editorial policies that help ensure the availability of data, key research materials, computer code and algorithms, and experimental protocols to other scientists. Researchers must now complete an editorial policy checklist to ensure compliance with these policies before their manuscript can be considered for review and publication.
Most people familiar with the issue of reproducibility agree that these efforts are gaining traction. However, progress will require sustained attention on the issue, as well as cooperation and involvement from stakeholders across various fields.
Accuracy and reproducibility are essential for fostering robust and credible research and for promoting scientific advancement. Several predominant factors have contributed to the lack of reproducibility in life science research. The issue has come to light in recent years, and a number of guidelines and recommendations for achieving reproducibility in the life sciences have emerged, but putting them into practice may be challenging. It is essential that members of the scientific community be objective when designing experiments, take responsibility for depicting their results accurately, and describe all methodologies thoroughly and precisely. Further, funders, publishers, and policy-makers should continue to raise awareness about the lack of reproducibility and use their position to promote better research practices throughout the life sciences. By taking action and seeking opportunities for improvement, researchers and key stakeholders can help improve research practices and the credibility of scientific data.