Quality and value: Statistics in peer review

Nature (2006) | doi:10.1038/nature04989

Researchers need reviewers to check their stats.

Statistical methods are widely used in many areas of natural science, especially in my field of research, epidemiology. Although statistical procedures are often viewed as a black art, or as a black box, they are not limited to specialists. With today's computing power and software, researchers can and do use computationally intensive methods of great complexity, often leading to the use of techniques that are more sophisticated and powerful than necessary. Many researchers have trouble interpreting the results, or interpret them incorrectly. Clearly, this is a matter for peer review.

Yet an enduring problem for journal editors is obtaining the services of expert reviewers. It is conventional to have at least two subject-area reviewers for a submitted paper, and their expertise tends not to be in statistics (except for purely methodological papers).

A lack of experts

Even in observational sciences, such as epidemiology, which routinely make heavy use of statistical methods, expertise is focused on accounting for the subtle workings of different kinds of systematic error (bias) that can affect comparisons between groups of people. Only secondarily are epidemiologists concerned with random variability and noise, despite the importance of these factors in the interpretation of their results.

Epidemiologists customarily rely on biostatisticians when designing, analysing and interpreting their studies. They learn the fundamentals of handling random variation just as molecular biologists learn physical chemistry, by taking often difficult courses and by using statistical methods in their work. But they are not statisticians any more than molecular biologists are physical chemists.

Obtaining two reviewers with appropriate specialist expertise is difficult enough without requiring yet another reviewer to evaluate the use of statistics in a paper. Statisticians are in unusually high demand. For the many fully engaged in (and sometimes overwhelmed by) service and support of clinical trials or other studies, the unpaid labour of peer review is at the bottom of their priority list.

Some journals tackle this problem by adding a 'triage' checkbox for reviewers to indicate whether an additional statistical reviewer is needed. If a reviewer knows the statistical methods are beyond his or her expertise, this may be helpful, even though it lengthens the overall review process. But the reviewer may be unaware of some of the subtle pitfalls of the methods. She or he must know enough statistics to be aware of his or her own limited expertise; many reviewers are reluctant to confess they do not know whether standard methods have been properly applied.

Unusual techniques

There are also many papers reporting new methods, or novel applications of existing methods. The authors often include the person or people who pioneered the technique and are hence uniquely placed to judge it; and the technical issues may be difficult and specialized.

In such cases there is a strong temptation for peer reviewers, especially when confronted with a brief technical appendix dense with integral signs or linear algebra, to give the authors a 'pass'. But author expertise in an area is no reason to waive peer review. If it were, all review could be done by examining a curriculum vitae, skipping the paper itself. The only solution then is to seek the services of a specialist, which entails additional delay. As the number of journals and articles keeps increasing at a greater rate than the reviewer pool, the situation gets worse.

Some high-circulation journals, such as the American Heart Association's journal Circulation, the New England Journal of Medicine and the Journal of the American Medical Association, employ paid statistical consultants or editors. But this is beyond the reach of most. And it raises another question: how important is it that the statistical methods are 'correct' by conventional practice? This may seem a strange question and in a perfect world no one would ask it. But the world is far from perfect. Audits of the extent of errors present in statistical methodology in the literature have been done, but as far as we know none has evaluated the consequences (A. Vail & E. Gardner Hum. Reprod. 18, 1000–1004; 2003).

If the answer is that statistical errors are not much different from those in other methods (spectroscopy, bioinformatics, X-ray diffraction), the issue becomes a more general problem. But I believe we would find that the major cost is in researcher (reader) misinterpretation of statistical results. It is still distressingly common to read that effects 'not statistically significant' are due to chance. Scientists need better instruction in the interpretation of statistical results.

Like most other things, statistical peer review is a trade-off in time and expense versus some unknown practical pay-off. Perhaps, in the end, we will have to fall back on the observation of one of my colleagues. "Real peer review", he says, "begins after publication."





David Ozonoff is an environmental epidemiologist with a special interest in mathematical methods for use in small populations. He is co-editor of Environmental Health (www.ehjournal.net) and professor of environmental health at Boston University School of Public Health in Massachusetts.
