Social psychologist Ap Dijksterhuis. Credit: Flip Franssen/Hollandse Hoogte/eyevine
Thinking about a professor just before you take an intelligence test makes you perform better than if you think about football hooligans. Or does it? An influential theory that certain behaviour can be modified by unconscious cues is under serious attack.
A paper published in PLoS ONE last week (ref. 1) reports that nine different experiments failed to replicate this example of ‘intelligence priming’, first described in 1998 (ref. 2) by Ap Dijksterhuis, a social psychologist at Radboud University Nijmegen in the Netherlands, and now included in textbooks.
David Shanks, a cognitive psychologist at University College London, UK, and first author of the paper in PLoS ONE, is among sceptical scientists calling for Dijksterhuis to design a detailed experimental protocol to be carried out in different laboratories to pin down the effect. Dijksterhuis has rejected the request, saying that he “stands by the general effect” and blames the failure to replicate on “poor experiments”.
An acrimonious e-mail debate on the subject has been dividing psychologists, who are already jittery about other recent exposures of irreproducible results (see Nature 485, 298–300; 2012). “It’s about more than just replicating results from one paper,” says Shanks, who circulated a draft of his study in October; the failed replications call into question the underpinnings of ‘unconscious-thought theory’.
Dijksterhuis published that theory in 2006 (ref. 3). It fleshed out more general, long-held claims about a ‘smart unconscious’ that had been proposed over the past couple of decades — exemplified in writer Malcolm Gladwell’s best-selling book Blink (Penguin, 2005). The theory holds that behaviour can be influenced, or ‘primed’, by thoughts or motives triggered unconsciously — in the case of intelligence priming, by the stereotype of a clever professor or a stupid hooligan. Most psychologists accept that such priming can occur consciously, but many, including Shanks, are unconvinced by claims of unconscious effects.
In their paper, Shanks and his colleagues tried to obtain an intelligence-priming effect, following protocols in Dijksterhuis’s papers or refining them to amplify any theoretical effect (for example, by using a test of analytical thinking instead of general knowledge). They also repeated intelligence-priming studies from independent labs. They failed to find any of the described priming effects in their experiments.
The e-mail debate that Shanks joined was kicked off last September, when Daniel Kahneman, a Nobel-prizewinning psychologist from Princeton University in New Jersey who thinks that unconscious social priming is likely to be real, circulated an open letter warning of a “train wreck looming” (see Nature http://doi.org/mdr; 2012) because of a growing number of failures to replicate results. Social psychology “is now the poster child for doubts about the integrity of psychological research”, he told psychologists, “and it is your responsibility” to deal with it.
Other high-profile social psychologists whose papers have been disputed in the past two years include John Bargh from Yale University in New Haven, Connecticut. His claims include that people walk more slowly if they are primed with age-related words.
Bargh, Dijksterhuis and their supporters argue that social-priming results are hard to replicate because the slightest change in conditions can affect the outcome. “There are moderators that we are unaware of,” says Dijksterhuis.
But Hal Pashler, a cognitive psychologist at the University of California, San Diego — a long-time critic of social priming — notes that the effects reported in the original papers were huge. “If effects were that strong, it is unlikely they would abruptly disappear with subtle changes in procedure,” he says.
No one is suggesting that there is anything fraudulent about the results, but the charge that some of Dijksterhuis’s key papers may report false positives is a particular embarrassment for the Netherlands. It comes close on the heels of exposures of scientific misconduct by two other Dutch social psychologists: in 2011, Diederik Stapel of Tilburg University admitted to inventing data, and in June 2012, an investigation committee concluded that Dirk Smeesters of Erasmus University Rotterdam had cherry-picked data in some papers.
Shanks’s replication failures cannot be dismissed, says Eric-Jan Wagenmakers, a mathematical psychologist at the University of Amsterdam who last year published a series of studies that failed to lend support (ref. 4) to unconscious-thought theory. He is disappointed that Dijksterhuis has declined “repeated requests” to help to generate a definitive answer.
Dijksterhuis says that “focusing on a single phenomenon is not that helpful and won’t solve the problem”. He adds that social psychology needs to get more rigorous, but that the rigour should be applied to future, not historical, experiments. The social-priming debate will rumble on, he says, because “there is an ideology out there that doesn’t want to believe that our behaviour can be cued by the environment”.
Others remain concerned. Kahneman wrote in the e-mail debate on 4 February that this “refusal to engage in a legitimate scientific conversation … invites the interpretation that the believers are afraid of the outcome”.
Nature 497, 16; doi:10.1038/497016a
See Correspondence: ‘Reproducibility: Priming-effect author responds’
The evidence marshaled above against these four hypotheses suggests that they fail to provide a compelling explanation of the difference in the outcomes of the previous studies and those reported here. A fifth and final possibility is that some or all of the published results on intelligence priming were false positives. Is this a more plausible explanation? One notable feature of the published studies is the number of experiments whose results are statistically non-significant at the conventional p = .05 level. For example, in Dijksterhuis et al.'s [18] Study 1, described previously, there were four different primes: the stereotypes professors and supermodels, and the exemplars Albert Einstein and Claudia Schiffer. Although there was a reliable difference in general knowledge test scores between groups primed with the exemplars, the difference between groups primed with professors and supermodels was not significant. Rather than interpreting this as a failure to replicate the basic intelligence priming effect, Dijksterhuis et al. [18] noted that the effect was in the expected direction and concluded that stereotype priming can indeed influence test scores. Similarly, Schubert and Häfner [23] obtained a non-significant difference between groups expected to show the standard assimilation effect, but again did not interpret this as casting doubt on the existence of intelligence priming.
Can behavior be unconsciously primed via the activation of attitudes, stereotypes, or other concepts? A number of studies have suggested that such priming effects can occur, and a prominent illustration is the claim that individuals' accuracy in answering general knowledge questions can be influenced by activating intelligence-related concepts such as professor or soccer hooligan. In 9 experiments with 475 participants we employed the procedures used in these studies, as well as a number of variants of those procedures, in an attempt to obtain this intelligence priming effect. None of the experiments obtained the effect, although financial incentives did boost performance. A Bayesian analysis reveals considerable evidential support for the null hypothesis. The results conform to the pattern typically obtained in word priming experiments in which priming is very narrow in its generalization and unconscious (subliminal) influences, if they occur at all, are extremely short-lived. We encourage others to explore the circumstances in which this phenomenon might be obtained.
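For readers curious about the "Bayesian analysis" mentioned in the abstract, the sketch below illustrates one standard way such evidential support for the null hypothesis can be quantified: a default JZS Bayes factor for an independent-samples t-test (Rouder et al., 2009). This is only an illustration with invented numbers, not the authors' actual analysis; the function name and the example t statistic are hypothetical.

```python
# Minimal sketch (an assumed approach, not the paper's actual code): a default
# JZS Bayes factor for an independent-samples t-test, following Rouder et al.
# (2009). BF01 > 1 means the data favour the null hypothesis of no priming effect.
import numpy as np
from scipy import integrate, stats

def jzs_bayes_factor_01(t, nx, ny, r=0.707):
    """BF01 for an observed t statistic from two groups of sizes nx and ny."""
    nu = nx + ny - 2                 # degrees of freedom
    n_eff = nx * ny / (nx + ny)      # effective sample size

    # Marginal likelihood of t under H0 (effect size = 0), up to a shared constant.
    m0 = (1 + t**2 / nu) ** (-(nu + 1) / 2)

    # Under H1 the effect size has a Cauchy(0, r) prior, i.e. g ~ inverse-gamma(1/2, r^2/2);
    # integrate the likelihood over this prior on g.
    def integrand(g):
        prior = (r / np.sqrt(2 * np.pi)) * g ** (-1.5) * np.exp(-r**2 / (2 * g))
        lik = (1 + n_eff * g) ** (-0.5) * (
            1 + t**2 / ((1 + n_eff * g) * nu)) ** (-(nu + 1) / 2)
        return lik * prior

    m1, _ = integrate.quad(integrand, 0, np.inf)
    return m0 / m1

# Hypothetical example: a small, non-significant group difference
# (t = 0.5, 40 participants per group).
t_obs, nx, ny = 0.5, 40, 40
p_two_sided = 2 * stats.t.sf(abs(t_obs), df=nx + ny - 2)   # roughly 0.6, not significant
bf01 = jzs_bayes_factor_01(t_obs, nx, ny)                  # > 1, i.e. evidence for H0
print(f"p = {p_two_sided:.2f}, BF01 = {bf01:.1f}")
```

Unlike a non-significant p-value, which merely fails to reject the null, a Bayes factor of this kind can express positive evidence for the absence of an effect, which is what the abstract refers to.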
I have been following and participating in this thread. With respect to the above entry --WOE!
Is it possible to say simply what you mean, so I don't have to attempt some form of high-level textual analysis? Sorry, but a dream conjoined with a parable is just too much for my limited abilities.
Is Dijksterhuis a Hero of Science or not?
From the article:
"In 2006, Dijksterhuis published a theory that fleshed out more general, long-held claims about a "smart unconscious" that had been proposed over the past couple of decades - exemplified in writer Malcolm Gladwellâ's best-selling book <em>Blink</em> (Penguin, 2005).
"The theory holds that behavior can be influenced, or 'primed', by thoughts or motives triggered unconsciously - in the case of intelligence priming, by the stereotype of a clever professor or a stupid hooligan. Most psychologists accept that such priming can occur consciously, but many, including Shanks, are unconvinced by claims of unconscious effects."
Last Thursday morning, just before I woke up, I had a nightmare. I rarely remember my dreams, mainly because (even if they are quite lucid), they usually dissolve into gibberish if I try to craft a verbal narrative of them. But last week's dream did survive the process of translation to a coherent narrative.
In my dream, I found myself on the grounds of the Boston Museum of Science, where, until late last year, I had been a weekend volunteer science educator for 25 years, supervising a unique Puzzle Activity of my own making.
In last week's dream, I found myself wandering into the Children's Discovery Center (where I had begun my volunteer stint some 25 years ago), except that the room was much larger than it is in real life. The space was more like a melange of several locations in the building, including the lobby. There was a disturbing commotion outside on the street, and a SWAT Team was arriving and swarming into the building. I ventured outside to observe the pandemonium, whereupon some authoritative voice yelled, "Get back inside."
There was also pandemonium inside, including police with their guns drawn. One of the officers pointed his gun at me and barked some orders that I couldn't understand. In my dream, it was clear from his body language and tone of voice that he was commanding me to obey his orders, which I couldn't make out in the cacophonous din. And I was terrified he was going to shoot me for failing to obey his superior authority.
Typical of any nightmare, this is where I woke up.
So why do I recount this unnatural dream in this <em>Nature</em> comment thread?
Because in that dream, I was the "clever professor" while the adrenalized cop with the gun barking orders at me was the "stoopid hooligan."
And therein lies the problem with the replication of Dijksterhuis' experiment. Different people have different perceptions of "clever professor" vs "stupid hooligan." In my dream, who is characterized as the revered authority figure and who is the miscreant hooligan who doesn't belong there?
I'm sure the cop character in my nightmare would have considered me the suspicious hooligan, perhaps returning to the scene with some mischief in mind after having been summarily dismissed from volunteer services for having the temerity to operate beyond the ever tightening constraints on volunteer educators imposed by the museum's increasingly regressive bureaucratic administration.
In other words, stereotyped characters can reverse, much like Charlie Brown in Peanuts. One minute he's the Hero, the next minute he's the Goat who stumbled, screwed up, and ignominiously lost the game.
Who's the Hero and who's the Goat?
In your classic Greek Tragedy the protagonist plays both Hero and Goat, with that fateful reversal being central to the plot. At the end of the story, the tragic Hero-Goat character sings the quixotic Dithyramb (<em>e.g.</em> "What Kind of Fool Am I").
The word "tragedy" comes from the Greek, "<em>tragoidia</em>" which means "Goat-Song."
Failure to replicate priming-to-behavior effects is not new. A review by Wheeler and Petty (2001) reported that 20% of studies on priming-to-behavior found no effect of the priming on behavior. It should be noted that 80% of studies found the effect. Shanks et al. (2013) may simply be one of the 20% that fail to replicate.
Wheeler, S. C. & Petty, R. E. (2001). The effect of stereotype activation on behavior: A review of possible mechanisms. Psychological Bulletin, 127, 797-826.
Ap Dijksterhuis wrote:
<em>You are absolutely right, and I am actually working on such a protocol to test effects of social priming. Shanks knew this, and the Nature journalist knows this too. I am not at all against replication.</em>
That is excellent news. So the <em>Nature</em> article is simply incorrect when it says "Dijksterhuis has rejected the request, saying that he 'stands by the general effect' and blames the failure to replicate on 'poor experiments'"?
<em>But what Shanks is doing - publishing very poor replications in a journal with a very rudimentary review process - contributes to the problem, not to the solution.</em>
It worries me to see this, because it suggests a Bargh-like conflation of medium with message. As everyone should by now know, PLOS ONE's peer-review system is just as strict about scientific quality as that of any other journal. The <em>only</em> difference between its approach and that of more traditional journals is that it makes no attempt to guess in advance how influential an article will be, or to refuse publication to articles that it guesses will not be widely found exciting. PLOS ONE's stated review goal is to publish everything that is good science, and nothing that is not.
[Now of course it's possible that bad science will sometimes slip through that net - just as it does at any journal (see: Arsenic Life). But that's not something inherent in PLOS ONE's review criteria, it's just the fallibility inherent in any system depending on humans.]
Replication and reproduction are what make (social) science interesting! It should not cause outrage when someone tries - and fails - to reproduce existing work; such attempts should be encouraged.
In my field, political science, authors whose work has been replicated and criticized sometimes write a new paper as an 'answer' to the replication. I call that a replication chain.
Even though cross-checks advance knowledge, and everyone agrees that they do, when it comes to their own work most authors defend their earlier paper by claiming that the replication was:
* fundamentally flawed,
* contains statistical and reporting errors,
* is of trivial nature,
* less realistic, or
* of limited utility.
These are direct quotes from replication chains I collected in political science.
Comments so far have characterized the replication paper as worth "applauding" and the journal that published it as showing "integrity," but also as a "terrible" paper in a journal with "minimal critical review." As Ran Hassin notes, it's time to go to the paper itself to read and perhaps participate in the interesting exchange between the replication and original authors.
Whatever one thinks of the news writing, the scientific staff of Nature has proposed some interesting guidelines for improving research reporting, found via the link above labeled "reducing our irreproducibility." Parallel efforts have recently been announced by the Association for Psychological Science, the Association for Consumer Psychology, and the Psychonomic Bulletin, to name a few, and a new initiative is also under way at the Society for Personality and Social Psychology. All of these efforts are in the spirit of, as Ap Dijksterhuis says above, "do[ing] (even) better research and replicat[ing] our findings more often before we publish our work" -- and, I might add, conducting and reporting well-designed and adequately powered replication studies after publication.
While I agree that the article here completely misfires (as Ran Hassin writes, it's not even clear what it aims for), I was a bit disappointed at both protagonists' commentaries on the PLoS ONE page (and also, by the way, at Ap's dissing PLoS ONE's reviewing process in a way similar to Bargh last year). Personally I can't think of any finding that would falsify unconscious behavioural priming, simply because it's not a hypothesis but an observation, so all this yes it exists, no it doesn't is not really very helpful in any way.
So ok, failure to replicate does not simply prove an effect does not exist (Ap's point, supported by the many replications), and also they're not exact replications (Ap's point; indeed David's "conditions very close to theirs" is not really satisfying), but then David claims it's not easy to exactly replicate because not enough information is given (plus, "Dijksterhuis also chastises us for not contacting him for his expert advice during our research project. It must have slipped his mind that we did so twice but on neither occasion was he able to provide the information we requested."). Also, Ap's line "They could have told the Shanks team that behavioral priming is ideally done in cubicles" is, besides condescending, beside the point, because the real question is why it works in some tightly controlled settings, but not in others. Simply calling on various forms of added noise, as Ap does, seems to be rather unconstructive - and also not really contributing to the relevance of behavioural priming, because what do I care if my performance is aided by assimilation with a professor if this only happens in a cubicle. This is not an elementary low-level visual perception process here that I try to single out. If the effect is supposed to be on overt, complex behaviour influenced by semantically meaningful stimuli at a conceptually meaningful level, then surely something must remain outside the cubicle, irrespective of whether it is conscious or not?
But like I said, it's really beside the point. Some saw the apple fall (25, says Ap), some say it doesn't fall, and until now nobody's seen it drift upward. Only it seems to fall exclusively in precisely controlled circumstances. That's hardly a law of physics, that's a question about what the hell is happening here. But neither Ap nor David seem, at least in their PLoS discussion, interested at all in the why of the divergent findings. The latter seems simply bent on casting uniform doubt on UC behavioural priming based on 9 fairly divergent experiments, while the former seems bent on dismissing the results on methodological grounds, with as major justification various unknown factors that can influence the outcome outside of a cubicle or otherwise poorly controlled circumstances. So essentially we have a phenomenon in which a main variable is a cubicle and instead of discussing why this is so it's just a yes/no game on whether the apple fell.
The lively discussion of the replicability issue is truly important, and we all stand to benefit from it. It already taught us that our Methods sections should be much more detailed, that there's much more theory that goes into what we do than we had imagined, and that replication attempts should be much more powerful than they tend to be. It also raised important issues about what it means to replicate a finding, and the role of theories in replications. PPS has devoted dozens of pages to the issue, if not more, and may continue to do so. As the intellectual discussion continues, I expect that we will learn much more. We will also gain from the systematic and huge replication project spearheaded by Brian Nosek.
But none of these things are mentioned in the piece, not even in passing. I really wanted to ignore the shallowness of the report, and the embarrassing style, but I simply have to join Jeff and the others: Seriously? In Nature?? The fact that Stapel and Smeesters are invoked feels more like a smear campaign (against Social Psychology? The Netherlands?) than anything else (and forgive me for not being impressed by explicit qualifiers).
Replicability is a serious issue - in the sciences at large, in psychology in general, and in social psychology in particular - and we should treat it as such. I suggest that interested readers seriously read the PLoS ONE paper, then seriously read Ap's reply (which appears on the website too), and Shanks's reply to the reply, etc. -- and then form an informed impression.
Dear Mike,
You are absolutely right, and I am actually working on such a protocol to test effects of social priming. Shanks knew this, and the Nature journalist knows this too. I am not at all against replication.
However, it is mystifying to me why Nature did not also say that our effect had already been replicated in 25 experiments in 10 different labs. Shanks is among the very few people who are not able to replicate it. Indeed, I think this is largely because of poor experimentation.
I'm also not denying that science has a problem. Many findings in all kinds of fields are difficult to replicate. This means we should learn to do (even) better research and replicate our findings more often before we publish our work. But what Shanks is doing - publishing very poor replications in a journal with a very rudimentary review process - contributes to the problem, not to the solution.
<em>David Shanks [...] is among sceptical scientists calling for Dijksterhuis to design a detailed experimental protocol to be carried out in different laboratories to pin down the effect. Dijksterhuis has rejected the request, saying that he 'stands by the general effect' and blames the failure to replicate on 'poor experiments'.</em>
This makes no sense to me. If Shanks's and others' experiments are poor, then the way to fix that is surely to design a better experimental protocol. For someone outside the field of Social Psychology, such as myself, it's difficult to construe Dijksterhuis's refusal to be involved in a way that doesn't suggest fear of the likely outcome.
If Shanks's failed attempts to replicate legitimate results are due to poor experimental design, then any questions those attempts raise can be emphatically dispelled by giving him the necessary protocol to achieve successful replication. I don't see what the downside would be.
BTW: I agree that the flavor of the main text is somewhat sensational.
Not sure why Nature chose to go this way, save for the obvious -- i.e., the issues Social Psychology has been publicly suffering. (Not to say other areas in psychology [and other disciplines] do not suffer similar issues, but Social Psychological research "can" be more media-attention-grabbing than, say, changes in amount recalled under conditions of environmental alterations, etc.)
Dear Ap: Not accusing you of "devil-pacts" with media or concern with personal standing. Just saying that certain forms of response invite such interpretations even when invalid.
Jeff: No disagreements here. I just think that "circling the wagons" ultimately is counterproductive and invites the wrong conclusions (from perhaps the wrong subset(s) of readers, but subset(s) nonetheless). Your concerns, like Ap's, are valid, but I am not sure that pointing out issues with the paper in PLOS, etc., will get the desired results. But, as with anything and everything else, I certainly could be very wrong.
Stan, my response is in reference to the reporting of the failed replications. It reads like a breathless tabloid account. I don't recall Nature trumpeting other failed replications in this way or as a reflection on an entire field.
As you note, failed replications are to be expected, and there is nothing damning about the normal progress of science. Why Nature felt the need to report these effects at all and certainly in the manner they chose is beyond me.
Drawing comparisons between fraud and failed replications is simply unacceptable. It is disgusting. If Ap were American, he'd probably be contacting a lawyer to initiate a suit.
Dear Stan, I agree with almost everything you said, but I do not simply place blame on the messenger. I wrote an extensive commentary on the Shanks studies explaining why I think most of them are not very informative (see their Plos One paper). Moreover, the finding that one can prime intelligence has been obtained in 25 studies in 10 different labs. I'm also fine with people trying to replicate my findings, it's just that I will not join them in their efforts. I told the writer of this letter that I don't think more replications will be all that enlightening, as there are already many replications out there. Finally, in our Nijmegen lab we are actually designing a social priming paradigm especially for replication purposes. The person writing this letter knew all this, but I suppose she figured it wasn't relevant. And as for personal standing, I wasn't worried about it until I read this letter.
Despite my colleagues' sentiments seemingly to the contrary (see above), this newest failure to replicate is a big deal for psychology.
Of course replications are contingent events - influenced by a host of extraneous factors. But the failure to replicate after N (where N is some relatively substantial number) attempts is news-worthy, and one would think (hope?) the field would find such data important (cf., JPSP's refusal to publish approximately 3 failures to replicate Bem's ESP research on the shaky grounds - communicated to me by the editor - that JPSP is not in the business of publishing replications. Despite raising serious questions about the nature of JPSP's "business", the fact is that a failure to replicate is not a replication!).
To the case at hand. I do not know the methodological details. Nor do I much care. What I am concerned with is the author's response (above), placing blame on the messenger (Shanks's studies), opining that we need to look to the future not to the past, etc. Such deflection invites suspicion, helping create the impression (whether true or not) that the individual is more concerned with personal standing than with the scientific merit of his findings.
All experiments are subject to failures of replication. Even studies guided by fundamental physical laws cannot assume a completely closed system and an exact replication of circumstances surrounding the original finding(s). A significance value is assigned to results - but this is not a stamp of "truth". What we hope is that the result will generally be replicable, so that a cause-effect statement can be made about the major factors, while allowing for the possibility that under some (perhaps many) circumstances extraneous variables will dilute the finding, thus rendering it non-significant.
This is a long-winded attempt to say that the defensiveness evidenced in the "first responses" to this newest failure to replicate is unbecoming of a field that wishes to be taken seriously as a scientific endeavor. I do not know if such a desire, taken as an exclusive metaphysical commitment, is correct for psychology (I have my doubts), nor do I doubt that other areas of scientific endeavor suffer similar failures to replicate (check "retraction watch"). But psychology has come under the media microscope (in part due to the "devil's pact" some colleagues have made with media-based venues); given this alliance, a scientifically respectable, not self-protective, response is in the best interests of everyone associated with the field.
This is a letter of profound stupidity. The observation that the Shanks data on priming call into question unconscious thought theory is ridiculous. It's not just the case that the Shanks experiments are extremely unprofessional, and that they can actually easily be construed in such a way that they support our findings, and that our unconscious thought work is replicated very often these days (papers in press at Psych Science, JESP, SCAN), but priming and unconscious thought pertain to two completely different lines of research. They have absolutely nothing to do with each other. The fact that I'm named with Stapel and Smeesters in the same paragraph is not just stupid, it's disgusting. Thank you Nature. Keep up the good work!
But seriously, I don't need an apology, but please take this daft letter away as quickly as possible.
Truly shocked that this is published by Nature and not some blowhard blog or tabloid. Really?! Really?!! It's a blow to a whole field that someone performed some terrible replications and published them in an open-access journal with minimal critical review? And what Dave Nussbaum said: OMG, a failed replication! That's not supposed to happen in science! Right.
The titling and reporting of this content is an embarrassment to Nature.
The headline of this article is truly disappointing, especially coming from an institution like Nature. Ideally, science is self-correcting, so why would it be a blow to an entire field when it engages in self-correction? Find me one scientist who could stand up and say that everything published in their field of research is definitively true and will never be revised or updated.
Rather than applauding the efforts to replicate results, or the integrity of journals like PLOS for publishing them, somehow Nature has jumped into the misguided journalistic pursuit of making everything into a dramatic controversy.
Were Newton's discoveries a blow to physics? They too caused the field to have to ultimately revise its beliefs, but I have a feeling that everything turned out ok. At least until Einstein came along and ruined everything.