Statistics at the Bench: A Step-by-Step Handbook for Biologists

Martina Bremer & Rebecca W Doerge
Cold Spring Harbor Laboratory, 2010. 200 pages, hardcover, $59.00. ISBN: 0-879-69857-8

Statistics induces deep-seated feelings of insecurity in many scientists. Some have jumbled memories from a long-ago course that didn't seem to equip them to handle the real data analysis problems that they face today. Others escaped formal statistics coursework, only to realize that the omission created a gap in their training that they have had to address with self-study. Just as a person who has never studied the art of French cuisine may need the elementary skills to cook a quick dinner, so do these researchers need the skills to perform some basic statistics. Statistics at the Bench: A Step-by-Step Handbook for Biologists was written to help these scientists, and it partially succeeds.

After an initial caution that the book is not meant to be a comprehensive textbook, but is instead a reference, the authors briefly introduce common statistical tests and concepts and illustrate them with examples from the life sciences. Readers without access to statistical analysis software (or who prefer more user-friendly programs) will appreciate the detailed instructions about how to perform tests in Microsoft Excel. The authors also explain some methods that are beyond Excel's capabilities, such as permutation tests and statistical classification methods, for researchers who want to take the next step. Concepts are described in typical statistics jargon, which may present something of a barrier for the average scientist reader, but the case study data examples are convincingly realistic and grounded in daily scientific practice, involving data from fruit flies, plants, humans and other organisms. The book is published as a hardcover ring binder that lies flat on a desktop at any selected page, which is a nice touch that adds convenience, although it also makes Statistics at the Bench look a bit like a cookbook.

The book has limitations, however, that might restrict its audience. The primary problem is that the authors assume that the reader already has a fairly solid understanding of which statistical test is appropriate to his or her data and that the only step-by-step instructions needed are the ones on how to get Excel to crunch the numbers. This assumption will be valid for some readers, but not for the would-be data analyst who is facing a spreadsheet full of numbers and struggling to figure out what to do with them. These readers would benefit from a decision tree or flow diagram on how to select appropriate statistical tests.

A related issue is that Statistics at the Bench does little to guide readers who think they know the right statistical procedure, but are mistaken. There's definitely a need for gentle guidance on this topic, as I found when I worked with a colleague to conduct a statistical review for a scientific publisher. In papers published by biologists and other life scientists, we found that it was fairly common to select an inappropriate statistical test. One error was to analyze extremely small datasets with methods that are appropriate for large ones. Another was to compare the means of several different samples with multiple t tests instead of with a single analysis of variance. Some of the errors were trivial, but others may have led to erroneous conclusions about statistical significance, such as the conclusion that two samples were significantly different when they really were not. A statistics handbook could become extremely valuable if it not only explained how to perform each procedure, but also helped readers to quickly recognize situations in which it is not the best choice and directed them to other options.
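To make the multiple t test problem concrete, here is a minimal sketch of why the error rate inflates. It is my own illustration, not material from the book. It assumes every pairwise test is run at the same significance level and that the tests are independent, which is only an approximation because pairwise comparisons share samples. The function name and the group counts are hypothetical.

```python
from math import comb


def familywise_error_rate(num_groups: int, alpha: float = 0.05) -> float:
    """Approximate chance of at least one false positive when every pair
    of groups is compared with its own t test at level alpha.

    Assumption: the pairwise tests are treated as independent, which is a
    simplification; real pairwise comparisons reuse the same samples.
    """
    num_tests = comb(num_groups, 2)  # number of distinct pairs of groups
    return 1.0 - (1.0 - alpha) ** num_tests


if __name__ == "__main__":
    # Illustrative group counts only; not data from the book or the review.
    for k in (2, 3, 5, 10):
        rate = familywise_error_rate(k)
        print(f"{k} groups, {comb(k, 2)} t tests: "
              f"approximate familywise error rate {rate:.3f}")
```

Under these assumptions the chance of at least one spurious "significant" difference grows quickly with the number of groups, which is why a single analysis of variance (followed, if needed, by a corrected post hoc comparison) is the usual recommendation.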

An example that illustrates both the book's strengths and its weaknesses is its discussion of replication. Here, as in many other places, the authors do a very good job of explaining the underlying concepts, in this case technical replication (repeated analysis of the same sample material to quantify measurement or technical error) and biological replication (repeated analyses on different individuals to assess biological variation in the population). However, Bremer and Doerge stop just short of providing actionable guidance from the statistical perspective. A reader wondering how many technical replicates to perform would logically turn to the discussion of sample size in the next chapter. But he or she might be challenged to understand the connection between the concepts of sample size and of replication, as the later chapters no longer use the term 'replicate' or clarify whether it is biological replicates or technical replicates that are being discussed. Uniform terminology, plus a flow diagram or references to specific page numbers, would provide the necessary structure for the reader, and a few additional critical thinking exercises might close the conceptual gap. For example, many scientists perform experiments in triplicate; under what assumptions is a sample size of 3 likely to provide a reliable estimate of the population mean? How might inbreeding in laboratory populations of model organisms affect sample size calculations?
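The link between replication and sample size can be sketched with a short calculation. This is my own illustration under stated assumptions, not a method taken from the book: a balanced nested design in which each biological replicate is measured the same number of times, with independent biological and technical errors. The function name and the standard deviations used are hypothetical.

```python
from math import sqrt


def standard_error_of_mean(sd_biological: float, sd_technical: float,
                           n_biological: int, n_technical: int) -> float:
    """Standard error of the overall mean in a balanced nested design.

    Assumption: n_biological individuals, each measured n_technical times,
    with independent biological and technical variation.
    """
    variance = (sd_biological ** 2 / n_biological
                + sd_technical ** 2 / (n_biological * n_technical))
    return sqrt(variance)


if __name__ == "__main__":
    # Hypothetical standard deviations, chosen only for illustration.
    sd_bio, sd_tech = 2.0, 1.0
    designs = [(3, 1), (3, 3), (9, 1)]  # (biological, technical) replicates
    for n_bio, n_tech in designs:
        se = standard_error_of_mean(sd_bio, sd_tech, n_bio, n_tech)
        print(f"{n_bio} biological x {n_tech} technical replicates: "
              f"standard error {se:.3f}")
```

With these made-up numbers, nine measurements spread over nine individuals estimate the population mean more precisely than nine measurements taken as triplicates on three individuals, because extra technical replicates shrink only the technical term. That is the kind of connection between 'replicate' and 'sample size' that a reader would need spelled out.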

In summary, Statistics at the Bench will be quite useful for scientists who already know the basics of cooking with statistics, but it isn't a substitute for a good cooking class. Other options might be better for those who are just learning their way around the statistical kitchen.