How can you tell if scientists in a certain field are publishing only positive results and throwing away dull findings? Many meta-researchers — who analyse rafts of studies to try to come up with reliable conclusions — use a graphical tool called a funnel plot to sift through the studies, checking for publication bias towards interesting findings.

But statistically minded academics took to Twitter last week to debate the utility of these plots, with some saying that they should have no place in the meta-researcher’s toolbox. The prompt was a 21 March blogpost in which Uri Simonsohn, who studies decision-making and methodology at the University of Pennsylvania in Philadelphia, argued that the funnel plot test is flawed because of a key assumption it makes.

“I was expecting it to be a low-key post, but it got tons of attention,” says Simonsohn, who has developed a method called p-curve that also looks for publication bias.

Making a funnel plot involves plotting the sample size of a published scientific study against the size of the effect that the study measures. In many cases, bigger studies will yield a more precise estimate of the effect size, so the graph should — all things being equal — look like an inverted funnel, because the larger studies will cluster closer to the ‘true’ effect size and the smaller studies will be spread more widely.
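To make the shape concrete, here is a minimal, purely illustrative simulation in Python (the ‘true’ effect of 0.4, the range of sample sizes and the number of studies are arbitrary assumptions, not figures from any real meta-analysis):

```python
# Illustrative funnel plot: many simulated studies of one true effect,
# each estimated with sampling noise that shrinks as the study gets bigger.
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
true_effect = 0.4                        # assumed 'true' effect size
n = rng.integers(20, 500, size=200)      # sample size of each simulated study
se = 1 / np.sqrt(n)                      # rough standard error: shrinks as n grows
effect = rng.normal(true_effect, se)     # each study's estimated effect

plt.scatter(effect, n, s=10)
plt.axvline(true_effect, linestyle="--")
plt.xlabel("estimated effect size")
plt.ylabel("sample size")
plt.title("Funnel plot: big studies cluster near the true effect")
plt.show()
```

With every simulated study ‘published’, the points fan out symmetrically around the dashed line as sample size falls.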

Violating assumptions

However, if researchers have been selectively publishing positive results, the funnel plot will be asymmetrical, because negative results, perceived as ‘less interesting’, will be missing from one side of the graph. This asymmetry is sometimes claimed as evidence of publication bias in a given area.

But Simonsohn’s blogpost points out that the funnel plot relies on the assumption that there is no relationship between the effect size and the sample size. Using simulated data, he shows that you can generate an asymmetrical funnel plot by violating this assumption, and therefore that an asymmetrical funnel plot does not necessarily stem from publication bias.
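A toy simulation along the same lines (a sketch of the general idea, not Simonsohn’s own code) shows how this can happen: if researchers run bigger studies precisely when the true effect they are chasing is smaller, for example because each study is powered for its expected effect, then effect size and sample size become correlated and the plot turns lopsided even though every result is ‘published’:

```python
# Asymmetry without publication bias: smaller true effects get bigger studies.
# All numbers here are arbitrary, chosen only to illustrate the mechanism.
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(1)
true_effect = rng.uniform(0.1, 0.8, size=200)    # each study chases a different true effect
n = np.round(100 / true_effect**2).astype(int)   # rough power logic: smaller effect -> bigger study
se = 1 / np.sqrt(n)
effect = rng.normal(true_effect, se)             # nothing is thrown away

plt.scatter(effect, n, s=10)
plt.xlabel("estimated effect size")
plt.ylabel("sample size")
plt.title("Asymmetric funnel with no publication bias")
plt.show()
```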

Hard numbers on the use of funnel plots are difficult to find, but Simonsohn thinks that some people are missing this point and so are misusing funnel plots to diagnose publication bias.

His advice to someone thinking of using one is to do so only if the studies being analysed actually compare like with like on crucial variables such as population size and the hypothesis being tested.

“I don’t think that happens in psychology. Maybe in medicine it does,” he says.

Although funnel plots are well established in medical meta-analysis — they were formalized and popularized by a 1997 paper in the British Medical Journal (BMJ)1 — they took longer to catch on in other fields, such as psychology, so academics in these areas could be less aware of the underlying assumptions.

Eyeballing asymmetry

Many researchers also note that even judging whether a funnel plot is asymmetrical needs to be done correctly. Eyeballing the shape isn’t enough: formal statistical tests should be applied to measure the plot’s degree of asymmetry.
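One widely used follow-up test is Egger’s regression, introduced in the same 1997 BMJ paper: regress each study’s standardized effect (the effect divided by its standard error) on its precision (one over the standard error) and ask whether the intercept differs from zero. A simplified sketch, assuming `effects` and `ses` are arrays of the studies’ effect estimates and standard errors:

```python
# Simplified Egger's regression test for funnel-plot asymmetry.
# `effects` and `ses` are hypothetical per-study estimates and standard errors.
import numpy as np
from scipy import stats

def egger_test(effects, ses):
    z = np.asarray(effects) / np.asarray(ses)   # standardized effect of each study
    precision = 1.0 / np.asarray(ses)           # bigger studies -> higher precision
    res = stats.linregress(precision, z)
    # A symmetric funnel implies an intercept near zero; test it with a t-statistic.
    t = res.intercept / res.intercept_stderr
    p = 2 * stats.t.sf(abs(t), df=len(z) - 2)
    return res.intercept, p
```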

But not everyone agrees with Simonsohn’s stark conclusions. Joe Hilgard, who researches video games at the University of Pennsylvania and has an interest in meta-analysis, wrote a response entitled “Funnel plots, done correctly, are extremely useful”.

Hilgard told Nature that the tool can be “tremendously informative and tremendously useful … but you really have to pay careful attention to the data you put into it”.

The problem identified by Simonsohn can be mitigated, he suggests, by splitting data into what he calls “homogeneous subgroups”. For example, studies that examine a possible link between violent video games and aggression could be split into those that look at behavioural outcomes, those that look at affective outcomes and those that look at cognitive outcomes.
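In code, that amounts to little more than splitting the dataset before any funnel plot is drawn. A sketch with pandas, where the file and column names are hypothetical:

```python
# Split a meta-analytic dataset into subgroups before funnel-plotting each one.
# The CSV file and its 'outcome' column are hypothetical stand-ins.
import pandas as pd

studies = pd.read_csv("video_game_studies.csv")
for outcome, subgroup in studies.groupby("outcome"):   # behavioural, affective, cognitive
    print(f"{outcome}: {len(subgroup)} studies")
    # ...draw one funnel plot per subgroup, as in the earlier sketch
```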

By categorizing studies in this way, the assumption questioned by Simonsohn becomes more reasonable, Hilgard says. Simonsohn says that he agrees in theory, but that, in practice, he thinks “it is very difficult to generate, at least in psychology, lists of studies that are truly homogeneous”.

Marcus Munafò, who researches addictive behaviour at the University of Bristol, UK, points out that what Simonsohn calls a “crazy assumption” — that effect size and sample size are unrelated — is actually acknowledged as a possible flaw in the BMJ paper that presented the funnel plot to the world.

Munafò, who has worked with one of the authors of that original paper, says: “My hunch is that probably most people haven’t read the original paper [and] don’t carefully consider the assumptions underlying the test. But I’m not sure funnel plots are particularly unique in that.”

Responding to criticism that his blogpost merely highlights what is already in the original BMJ article, Simonsohn says that what he sees as a “fatal flaw” in the funnel plot is listed there as just one of many possible defects. “If … you mention it among many other superficial flaws, you are kind of hiding the flaw,” he says.

Richard Morey, a psychological researcher at Cardiff University, UK, who has previously written about how funnel plots can be asymmetrical without there being any publication bias, has another reason to avoid the tool: everyone already knows that publication bias exists.

“In the ideal case, they often tell you what you already know,” he says. “We know there’s publication bias. We don’t need a test to tell that.”