BOOK REVIEW

Statistical dark arts endanger democracy — and life

US President Donald Trump displays a chart of COVID-19 statistics in May 2020. Credit: Tom Brenner/Reuters

Calling Bullshit: The Art of Scepticism in a Data-Driven World. Carl T. Bergstrom & Jevin D. West. Allen Lane (2020)

“The world is awash with bullshit, and we’re drowning in it … this book is our attempt to fight back.” So begins a passionate exposition of how the language of science can be weaponized to mislead both researchers and the public. Its authors are two scourges of the current ‘infodemic’, Carl Bergstrom and Jevin West.

Both are fascinated by contagion — of ideas, diseases, norms, information and misinformation. Bergstrom is an evolutionary biologist, West a data scientist. In 2007, they founded the Eigenfactor Project to map the influence of journals, papers and authors. A decade later, they developed a course in spotting quantitative chicanery. Calling Bullshit — penned before the coronavirus pandemic — is a version of that course, landing just when it has never been more important to know how to navigate data.

Their target is statistical shenanigans: “language, statistical figures, data graphics, and other forms of presentation intended to persuade or impress an audience by distracting, overwhelming, or intimidating them with a blatant disregard for truth, logical coherence, or what information is actually being conveyed.” Informative and never boring, this labour of love lays bare a cornucopia of selection biases, misleading data visualizations, machine-learning mishaps and more.

Examples include: bluffing by mantis shrimp; physician Andrew Wakefield’s infamous fraud concerning a non-existent link between vaccination and autism; the worrying phenomenon of ‘deepfake’ videos; the spurious relationship between facial features and criminality; the unsubstantiated validity of the marshmallow test, a supposed measure of willpower; and the dubious effectiveness of wellness programmes. Even experienced researchers will have ‘aha’ moments. Well over 100 figures underscore the authors’ concrete, no-nonsense approach.

Yet there are missed opportunities. Most importantly, the referencing is below par. The authors assert that, to verify a claim, one must “dig to the source”. Why, then, does Calling Bullshit not use citation footnotes? Instead, it presents a chapter-specific alphabetized literature list and the unappealing prospect of guessing which references are relevant to what. A claim such as “Most people think they’re pretty good at spotting bullshit” might not be supported by any empirical research; it is difficult to tell (I could find no references for it in the list). Neither is there a figure listing. So how can we evaluate a graph suggesting that, around 2001, television channels Fox News and CNN had roughly similar ideological orientations — could this be balderdash? In that case, the source paper is listed at the back of the book, but I wonder how many will dig for it.

Another missed opportunity concerns data visualization. The authors stress the potential to mislead by inverting an axis, zooming in too much, zooming out too much or ignoring base rates. This is demonstrated, for instance, with a plot showing that more 20–24-year-old drivers die in car crashes than do 16–19-year-old drivers. They point out that this ignores the fact that the first group drives much more than does the second. The younger group has about twice as many fatal crashes per mile driven.
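The base-rate point can be made concrete with a few lines of arithmetic. The sketch below uses hypothetical crash counts and mileage figures, chosen only so that the per-mile ratio matches the roughly twofold difference described above; they are not data from the book.

```python
# Hypothetical figures for illustration only (not taken from the book).
groups = {
    "16-19": {"fatal_crashes": 2_000, "miles_driven": 50e9},
    "20-24": {"fatal_crashes": 3_000, "miles_driven": 150e9},
}


def rate_per_100m_miles(fatal_crashes, miles_driven):
    """Fatal crashes per 100 million miles driven."""
    return fatal_crashes / miles_driven * 1e8


# Normalize each group's raw count by its exposure (miles driven).
rates = {age: rate_per_100m_miles(**g) for age, g in groups.items()}

for age, g in groups.items():
    print(f"{age}: {g['fatal_crashes']} crashes, "
          f"{rates[age]:.1f} per 100 million miles")
```

With these assumed numbers, the older group has more fatal crashes in total, yet the younger group's rate per mile is twice as high. The ranking flips once exposure is taken into account.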

What Bergstrom and West fail to show is that, executed properly, visualization can also be an excellent tool for avoiding being misled. For instance, when researchers claim an association between two variables, it is good practice to show the scatter plot of data points. Otherwise, it is almost impossible to assess whether the claimed relation might be nonlinear, or the result of outliers, or due to unexpected clusters. To paraphrase statistician Frederick Mosteller: although it is easy to lie with data visualization, it is even easier to lie without it.
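A small numerical sketch shows why a reported association is hard to judge without seeing the points. The data here are invented for illustration: eight points with no linear relation, plus one extreme observation.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Invented data: y shows no linear trend in x.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [5, 3, 6, 4, 5, 3, 6, 4]
r_without = pearson_r(xs, ys)

# Add a single extreme point far from the rest.
r_with = pearson_r(xs + [20], ys + [20])

print(f"r without outlier: {r_without:.2f}")
print(f"r with outlier:    {r_with:.2f}")
```

Reported on its own, the second coefficient would suggest a strong relationship. A scatter plot of the same nine points would show at once that a single observation is responsible.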

Finally, a pet peeve. Bergstrom and West bemoan that “scientists are stuck using p-values because they don’t have a good way to calculate the probability of the alternative hypothesis”. This ignores an alternative statistical approach: Bayesian model comparison. Challenges remain, but at least Bayesians attempt to find an approximate answer to the right question, instead of struggling to interpret an exact answer to the wrong question.
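As a minimal sketch of what Bayesian model comparison looks like in the simplest case, consider k successes in n binomial trials. The setup is an assumption made for illustration, not an example from the book: the null hypothesis fixes the success probability at 0.5, the alternative places a uniform prior on it, and both hypotheses are given equal prior probability.

```python
from math import comb


def bayes_factor_10(k, n):
    """Bayes factor for H1 (uniform prior on theta) over H0 (theta = 0.5)."""
    # Marginal likelihood under H1: the binomial likelihood integrated
    # over a Uniform(0, 1) prior equals 1 / (n + 1) for any k.
    marginal_h1 = 1 / (n + 1)
    # Likelihood under the point null theta = 0.5.
    likelihood_h0 = comb(n, k) * 0.5 ** n
    return marginal_h1 / likelihood_h0


def posterior_prob_h1(k, n, prior_h1=0.5):
    """Posterior probability of H1, given its prior probability."""
    prior_odds = prior_h1 / (1 - prior_h1)
    posterior_odds = bayes_factor_10(k, n) * prior_odds
    return posterior_odds / (1 + posterior_odds)


bf = bayes_factor_10(15, 20)
post = posterior_prob_h1(15, 20)
print(f"BF10 = {bf:.2f}, P(H1 | data) = {post:.2f}")
```

The output is a direct statement about the probability of the alternative hypothesis, which a p-value cannot provide. The result depends on the chosen priors, which is among the challenges that remain.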

All that said, this book will train readers to be statistically savvy at a time when immunity to misinformation is essential: not just for the survival of liberal democracy, as the authors assert, but for survival itself. Perhaps a crash course on bullshit detection should be a mandatory part of the school curriculum.

Nature 584, 36 (2020)

doi: https://doi.org/10.1038/d41586-020-02280-x
