How robust are your data?

doi:10.1038/ncb0609-667a

Editorial
Published: June 2009

Nature Cell Biology volume 11, page 667 (2009)

New rules for the presentation of statistics.

Thanks to advanced imaging technologies and better integration with molecular and systems approaches, cell biology is undergoing something of a renaissance as a quantitative science. Robust conclusions from quantitative data require a measure of their variability. Cell biology experiments are often intricate and measure complex processes. Consequently, the number of independent repeats of a measurement can be limited for practical reasons, yet the variability of the measurements can be rather high. Cell biologists have developed good intuition to guide their analysis of such constrained datasets. Biological complexity and the reliance on intuition can cause culture shock to physical scientists crossing over into cell biology (a kind of extension of the celebrated 'two cultures' concept of C. P. Snow).

With the arrival of quantitative information and '-omic' datasets, statistical analysis becomes a necessity to complement instinct. The problem is that statistical tools are built on basic assumptions such as the independence of replicate measurements and the normality of data distribution. Usually, sizeable datasets are a prerequisite for statistical analysis. Alas, these can be as hard to come by as a biostatistician (n is typically well below 5). The result is that all too often statistics (frequently undefined 'error bars') are applied to data where they are simply not warranted.

There are no easy solutions to rectify the prevalence of poor statistics in cell biology studies. However, an obvious recommendation is to consult a statistician when planning quantitative experiments. Consider whether n represents independent experiments (you may actually be publishing a measure of the quality of your pipette!) and whether it is large enough for the test applied. Avoid showing statistics when they are not justified; instead, show 'typical' data or, better still, all the measurements. Importantly, displaying unwarranted statistics attributes a misleading level of significance to the data. Always describe and justify any statistical analysis applied. We have updated our guidelines to reflect these recommendations (http://www.nature.com/ncb/pdf/gta.pdf). One key rule: if the number of independent repeats is fewer than the fingers of one hand, show the actual measurements rather than error bars. If you wish to present error bars, include the actual measurements alongside them.
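As a rough illustration of the rule above, the following Python sketch first collapses technical replicates into one value per independent experiment, so that n counts experiments rather than pipetting repeats, and then reports a mean and standard deviation only when n reaches a threshold. The threshold of 5 is one reading of 'the fingers of one hand'; the function names and the example numbers are hypothetical and are not taken from the journal's guidelines.

```python
from statistics import mean, stdev

# Assumption: "fewer than the fingers of one hand" is read here as n < 5.
MIN_N_FOR_ERROR_BARS = 5


def independent_values(experiments):
    """Collapse technical replicates to one mean per independent experiment."""
    return [mean(replicates) for replicates in experiments]


def summarise(experiments):
    """Return the individual measurements, plus mean and SD only if n is large enough."""
    values = independent_values(experiments)
    n = len(values)
    # The individual measurements are always reported, with or without error bars.
    summary = {"n": n, "measurements": values, "show_error_bars": False}
    if n >= MIN_N_FOR_ERROR_BARS:
        summary["show_error_bars"] = True
        summary["mean"] = mean(values)
        summary["sd"] = stdev(values)  # sample standard deviation
    return summary


# Invented data: three independent experiments, each with three technical replicates.
three_experiments = [[1.0, 1.2, 1.1], [1.9, 2.1, 2.0], [1.4, 1.6, 1.5]]
result = summarise(three_experiments)
print(result["n"], result["show_error_bars"])
```

Here the nine readings count as n = 3, not n = 9, so the sketch returns the three per-experiment values without a mean or standard deviation. Whether the standard deviation, the standard error or a confidence interval is the appropriate error bar is a separate choice that should be stated and justified in the figure legend.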

Finally, please remember that you are interrogating a complex system: be careful not to discard 'outlier' data points on a whim, as they may well be as relevant as clustered measurements. One is naturally inclined to ignore data that do not match the hypothesis tested, but biology is rarely as black and white as we would like. Do not let 'hypothesis-driven' research become 'hypothesis-forced'!

How robust are your data? Nat Cell Biol 11, 667 (2009). https://doi.org/10.1038/ncb0609-667a

Issue Date: June 2009
DOI: https://doi.org/10.1038/ncb0609-667a

This article is cited by

Natural variation of root exudates in Arabidopsis thaliana-linking metabolomic and genomic data
- Susann Mönchgesang
- Nadine Strehmel
- Dierk Scheel
Scientific Reports (2016)
