Negative and null results are routinely produced across all scientific disciplines, but are rarely reported. The key to combating the biases arising from this mismatch lies in disseminating all the details of a piece of work, rather than just the positive results.
A good metaphor for how science works is the game Battleship1. The players shoot cannons into a largely unexplored territory. Sometimes, eureka! The shot hits a ship, but most cannonballs land in the bleak and empty ocean. At first sight those shots seem wasted. But every player knows that the misses provide valuable information. Getting to know the terrain ultimately points you towards the targets.
Science — like Battleship — often is about learning from the research attempts that failed to confirm the original hypothesis. But a thorough look through the scientific literature reveals that these negative or null results are rarely reported. Published articles are usually polished narratives, which convey that everything went according to plan.
Occasional happy mistakes
When an article is written up, sent to the editor and goes through peer review, the studies that researchers or journals consider uninteresting are shelved. They disappear into the bottom drawer of the principal investigator and it is mostly the negative and null results that suffer this fate. This effect is well known and has been dubbed the 'file-drawer effect'2.
Daniele Fanelli, a scientometrician at the Université de Montréal, Canada, has performed large investigations into just how seldom negative or null results are published. The results are surprising: depending on the discipline, the portion of articles reporting negative or null results typically amounts to 10–15%. Only in the fields of space science and geoscience does this number exceed 25%. Materials science reports less than 10%, physics around 15%, and the percentages for chemistry, the biological sciences and medicine lie in between3. Ask a few PhD students about how many negative or null results they have already produced in their short careers and these percentages seem minuscule.
This positive-outcome bias might harm scientific progress as a whole. “If you are certain that your study was correctly conducted, and as long as you report exactly what you did, the publication of any result, whether positive or negative, will be valuable”, Fanelli says. “But if you only allow positive results to be reported, it might turn out that people base their future research on an occasional happy mistake.” This occasional happy mistake, which might be due to a little impurity in a chemical synthesis or a statistical fluctuation in a clinical trial, will lead fellow researchers down the wrong path, delay scientific progress and lead to irreproducible studies. The rectification of a false positive result or an exaggerated effect will waste a lot of time and money. Conversely, not reporting experiments that did not show the desired outcome might cause duplicated scientific effort and lead to a slower discovery of errors or scientific fraud.
Biases across the disciplines
In certain disciplines the positive-outcome bias has been discussed for decades and we have a fairly good picture of its effect on scientific progress. Psychology4 and evolutionary biology5, for example, have a long history of worrying about the consequences. One such consequence might be the infamous decline effect: it seems that scientific findings become less pronounced every time they are replicated6,7. The absence of negative and null results artificially inflates the magnitude of an effect in the early years after its discovery — and for the true magnitude to be determined a slow and laborious self-correction process is necessary.
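The inflation mechanism described above can be illustrated with a toy simulation. The following is only a minimal sketch under stated assumptions, not a reconstruction of any of the cited studies: the true effect size, the sample size per study, the number of studies and the significance threshold are all hypothetical values chosen for illustration, and the function names are invented here. Many small studies of the same weak effect are simulated, and the average over all studies is compared with the average over only those that reached a significantly positive result.

```python
import random
import statistics


def simulate_studies(true_effect, n_per_study, n_studies, seed=0):
    """Simulate independent studies of one effect (all parameters are illustrative).

    Each study draws n_per_study noisy measurements around true_effect and
    returns its estimated effect together with the standard error.
    """
    rng = random.Random(seed)
    results = []
    for _ in range(n_studies):
        sample = [rng.gauss(true_effect, 1.0) for _ in range(n_per_study)]
        estimate = statistics.fmean(sample)
        std_err = statistics.stdev(sample) / n_per_study ** 0.5
        results.append((estimate, std_err))
    return results


def published_only(results, z_crit=1.96):
    """Keep only the 'positive' studies: estimate significantly above zero."""
    return [(est, se) for est, se in results if est / se > z_crit]


def mean_effect(results):
    """Unweighted average of the estimated effects."""
    return statistics.fmean(est for est, _ in results)


# Hypothetical scenario: a weak true effect probed by many small studies.
studies = simulate_studies(true_effect=0.2, n_per_study=20, n_studies=2000, seed=42)
positive = published_only(studies)

all_mean = mean_effect(studies)
pub_mean = mean_effect(positive)

print(f"studies run: {len(studies)}, 'published' (positive): {len(positive)}")
print(f"mean effect over all studies:       {all_mean:.3f}")
print(f"mean effect over published studies: {pub_mean:.3f}")
```

In this sketch the average over all studies stays close to the assumed true effect, whereas the average over the positive subset is larger, because a small study can only cross the significance threshold if it happens to overestimate the effect. Later replications that are reported regardless of outcome would pull the estimate back down, which is one way a decline effect could arise.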
Another field for which the decline effect has been well documented and where researchers have been reiterating the importance of negative and null results is medicine8. In this case, it is not purely of academic interest to foster efficient communication of an effect's true magnitude. Concealing clinical trials, which show that a new medication is no better than the conventional product, might cost the public health sector billions. The fact that the positive-outcome bias has a considerable effect on the overall scientific opinion has been well documented for various medical fields9, 10, 11.
Gerhard Fröhlich, a philosopher of science from the Johannes Kepler University in Linz, Austria, has been criticizing biases and distortions in clinical trials for a long time. “We must not merely consider the published articles to get a balanced picture of a medical therapy, especially in meta studies. Rather the unpublished studies are just as important as they are much more likely to contain negative or null results”, he says. He goes on to mention an even more worrying finding: in medicine, industry-funded research is less likely to report negative or null results than publicly funded studies12.
For disciplines in the physical and chemical sciences, where statistics do not play such a central methodological role, publication bias and the file-drawer effect have been investigated much less. Possibly they are not a big problem in those disciplines. Fanelli, at least, has found that disciplines in the physical sciences are in general more likely to report negative results than the biological sciences, psychology and medicine3. For Fanelli, an insightful rationale for this discrepancy lies in a modernized version of Auguste Comte's 200-year-old concept of a 'Hierarchy of Sciences'. Sciences may be ranked according to the complexity of the studied objects, with physics at the bottom and sociology at the top. The hypothesis is that disciplines at the top are more susceptible to biases. “We often treat all fields in the same way, expecting that psychology has the same methodological rigor as physics”, Fanelli remarks. Nevertheless, he thinks that publication biases are present in physics as well. But based on the rationale of the hierarchy of sciences, he argues: “I suspect that in physics the decline effect is less of a problem than in softer sciences. Although some initially reported effects might be too large or even due to an artefact, the self-correction mechanism in physics seems much more rapid.” At present, Fanelli and his colleagues are collecting more evidence to back up this hypothesis13.
Omitting relevant details
It seems, however, that a few bad habits in reporting practices have undesirable consequences even in the hard sciences. If papers are polished a bit too much or if relevant details are left out, anomalies might arise that distort the scientific literature in the long term. One such anomaly has recently been uncovered by organic chemists Tomas Hudlicky and Martina Wernerova from Brock University in St. Catharines, Canada. They questioned the exceptionally high yields of more than 95% in organic synthesis methods that are routinely reported nowadays and set out to test whether the reported yields are in fact realistic. A look through the literature shows that such high yields were rarely reported before 1980 and that today they are still absent from the journal Organic Syntheses, where all submissions are independently reproduced before publication. Wernerova and Hudlicky carefully studied how much of the chemical product is lost in the typical work-up procedure and concluded that any yield of more than 94% is unrealistic14. Any higher reported yield seems to be an 'occasional happy mistake', just like the false positive results that find their way into psychological journals. This case is a prime example of how a bad reporting practice introduces a publication bias into the official literature, one that by now is almost impossible to correct.
In addition to those distortions, omitting small but relevant details can have direct consequences for other people's research. This fact is well demonstrated by the following anecdote: during his PhD, Mathias Kläui, now a physics professor at the Johannes Gutenberg University in Mainz, Germany, and active in the field of nanomagnetism, developed a method to determine all in-plane magnetization components in a magnetic film from a single measurement. For certain angles this worked very well, for others it did not. “As it was just a side project, we never really investigated why this was the case”, Kläui says. In the publication this measurement technique was merely used to validate a new fabrication method for magnetic films, so the group decided to mention only information on the successful measurement angles15. Years later, Kläui received an e-mail from a PhD student in the US. “Because of our paper, she had tried to repeat the same measurements with all angles and came across the same problems”, he says. “She carried on, trying to understand the result for two years. Naturally, she was quite frustrated when she heard that I had had the same problems but we never published the details.”
Reporting all the details of what you did and how you did it seems to be quite straightforward and the fact that this is not the norm certainly is puzzling. After all there are not many external barriers. In these times of online publication, preprint repositories and large supplementary information files, there are no space limitations and all necessary infrastructure is available. Moreover, hardly any journal explicitly excludes the publication of negative results. The barriers that keep us from disseminating the big 'failed' experiments as well as the little erroneous paths rather seem to be internal.
Researchers, especially those from physics and chemistry, often express the concern that their lack of a positive result is simply due to a trivial mistake. Or they think that their negative results are less important, will have little impact and will not get many citations. Commonly, research projects get terminated if preliminary results indicate that the study will not produce a positive outcome. Why waste time and money to finish and write them up if the research will be of low impact anyway? There is some truth to those prejudices: Fanelli showed that articles reporting negative results get cited less, but this finding was only statistically significant for the biological sciences16. In contrast to positive results, which tend to get cited by a fixed community of researchers from a specific discipline, negative results are cited by a broad range of scientists from different fields17. Moreover, a simulation study that was recently published in PLoS ONE by de Winter and Happee18 suggests that a certain amount of the file-drawer effect might actually be beneficial for science as a whole. The simulation showed that with a selective publication approach it took fewer published articles to arrive at a true meta-analytic estimate of the effect than with a publish-everything approach. Ranking results according to their scientific impact might make sense to a certain extent. However, the simulation also demonstrated that not reporting negative results leads to poorly reproducible and contradictory literature.
Fröhlich does not think that those arguments against publishing negative results are very strong. His first thought on how to explain the internal barrier is competition. “Not publishing negative results is a strategy to withhold information from scientific competitors because they are condemned to go down the same wrong paths and dead end streets again and to repeat prior errors”, he says19. In this context, it is a long-standing hypothesis that the competitive environment and publish-or-perish culture that scientists nowadays find themselves in contribute to the publication of fewer negative results. But only recently have the first firm pieces of evidence started to trickle in. Fanelli, for example, showed that in US states with a more competitive research environment the portion of positive-outcome studies is larger than in states with less competition20 (Fig. 1).
Combating publication biases
From what we know about the undesirable effects of publication biases, we need to overcome the barriers that keep us from publishing more negative results. Especially in medicine and psychology the matter is urgent and scientists, journals and policy makers have become aware of it. Plenty of proposals about how to combat publication biases have been made. Most are specific to certain disciplines, which reflects the variations in prerequisites and demands across the different fields.
For clinical studies, comprehensive clinical-trial registries can make it harder to draw a veil over negative or null outcomes; such measures are already being implemented in various countries. The publication of more raw data is another measure to facilitate reproducibility, especially, but not exclusively, for medical and biological studies. One of the proposals that immediately comes to Fanelli's mind is a reproducibility factor for journals. Rather than counting citations to measure quality, this factor would be a gauge of how well the published articles withstand a reproducibility test. Fröhlich has a related idea that would help with its implementation: “When funding a scientific project, I am in favour of dedicating a certain portion of the money for independently checking the reproducibility of the original results.” But the career-advancement system might compromise this idea. “Unfortunately it is hard to win laurels as an individual researcher by reproducing previous studies”, Fröhlich adds.
Fanelli also thinks the current evaluation practices are not optimal for preventing biases. “Changes in the funding and career-advancement system might make it less of a conflict of interest to share negative results. If you get at least some credit for having tried the wrong experiments first, this will help to overcome publication biases”, he says.
In physics and chemistry, although our general understanding of reporting biases in these disciplines is still very sketchy, we might not need the big measures to combat distortions. On the other hand, it is obvious that the hard sciences are not at all immune to biases. The general idea is clear: when researchers write up a scientific article and when journals select and finalize it, a greater focus on the experimental details rather than on the success story will lead to more transparency. In the example of organic yields, for instance, requesting the exact information on how the yield was obtained, and whether any kind of average over several experiments was taken, is plain and simple. Another easy measure to save PhD students and postdocs from frustration would be to designate some space in the supplementary information to specifically talk about the possible experimental pitfalls and erroneous paths taken.
The message to scientists and journals must be: report the awful truth! At the beginning it might seem slightly embarrassing to disseminate everything, including errors and failures. But in the end it will be for the benefit of science.
1. in Der unendliche Prozeß der Zivilisation (eds Kuzmics, H. & Mörth, I.) 95–111 (Campus Verlag, 1991).
2. Psychol. Bull. 86, 638–641 (1979).
3. PLoS ONE 5, e10068 (2010).
4. Rev. Gen. Psychol. 13, 146–166 (2009).
5. Biol. Rev. 77, 211–222 (2002).
6. Nature 470, 437 (2011).
7. Proc. R. Soc. Lond. 269, 43–48 (2002).
8. J. Am. Med. Assoc. 294, 218–228 (2005).
9. Health Technol. Assess. 14, 1–193 (2010).
10. CNS Drugs 27, 457–468 (2013).
11. PLoS ONE 8, e66844 (2013).
12. Can. J. Surg. 54, 321–326 (2011).
13. Proc. Natl Acad. Sci. USA 110, 15031–15036 (2013).
14. Synlett 18, 2701–2707 (2010).
15. J. Appl. Phys. 93, 7349–7351 (2003).
16. Scientometrics 94, 701–709 (2013).
17. Scientometrics 95, 277–297 (2013).
18. PLoS ONE 8, e66463 (2013).
19. in Knowledge Management und Kommunikationssysteme, Workflow Management, Multimedia, Knowledge Transfer Proc. 6th Int. Symp. for Information Sci. (eds Zimmermann, H. H. & Schramm, V.) 535–549 (UVK Verlagsgesellschaft mbH, 1998).
20. PLoS ONE 5, e10271 (2010).