Replication, replication, replication
Squeezing a rubber ball in one's left hand increases creativity. Writing a brief paragraph on important values improves academic achievement over the subsequent year. Chocolate cookies help you to persist with an unsolvable problem.
What have these findings got in common? They are all in a ‘top 20’ of findings that people would most like to see replicated (see www.psychfiledrawer.org/top-20). Replication is described by many as the cornerstone of scientific progress, and the issue has been discussed extensively in the blogosphere of late.
The debate built following failures to replicate studies on priming by John Bargh, and on ‘anomalous retroactive influences’ by Daryl Bem. Some have even suggested that replication is psychology’s ‘house of cards’.
In an attempt to shed some light on this perennial issue, I invited psychologists to share their views on replication, and constructive ways forward. I hope you will find the results to be a timely and important collection, building on The Psychologist’s role as a forum for discussion and debate.
Here, we present:
Replication, replication, replication
Stuart J. Ritchie, Richard Wiseman and Christopher C. French with the opening contribution to a special on one of the cornerstones of scientific progress
Replication: where do we go from here?
A variety of perspectives on replication and possible ways forward, from: Daniel Simons; Dave Nussbaum; Henry L. Roediger, III; Gregory Mitchell; Daryl Bem; Claudia Hammond; Daniel Bor; Sam Gilbert; Joshua Hartshorne and Adena Schachner; Alex Holcombe and Hal Pashler; Jelte Wicherts; and Stephen Pilling
…Please download the PDF or see http://tinyurl.com/psycho0512
Dr Jon Sutton
Last year, Cornell social psychologist Daryl Bem had a paper published in the prestigious Journal of Personality and Social Psychology (JPSP) entitled ‘Feeling the future’ (Bem, 2011b).
According to the nine studies described in the paper, participants could reliably – though unconsciously – predict future events using extrasensory perception. The findings proved eye-catching, with many major media outlets covering the story; Bem even discussed his work on the popular American TV show The Colbert Report.
The wide-ranging discussion of Bem’s paper has raised questions regarding the limits of science, our current statistical paradigm, the policies of academic journal publishing, and what exactly a scientist needs to do to convince the world that a surprising finding is true. In this article we outline the ‘Feeling the future’ controversy and our part in it, and highlight the important questions it raises about scientific psychology.
Some recent parapsychological research projects have taken a somewhat idiosyncratic approach to extrasensory perception by examining, for example, whether zebra finches can see into the future (Alvarez, 2010). In contrast, Bem adopted a more back-to-basics approach, taking well-worn psychological phenomena and ‘time-reversing’ them to place the causes after the effects. By far the largest effect size was obtained in Bem’s final experiment, which investigated the ‘retroactive facilitation of recall’. In this procedure, participants were shown a serial list of words, which they then had to type into a computer from memory in a surprise free recall test. After the test, the computer randomly selected half of the words from the list and showed them again to the participants. Bem’s results appeared to show that this post-test practice had worked backwards in time to help his participants to remember the selected words – in the recall test they had remembered more of the words they were about to (randomly) see again.
If these results are true, the implications for psychology – and society – are huge. In principle, experimental results could be confounded by participants obtaining information from the future, and studying for an exam after it has finished could improve your grade!
As several commentators have pointed out, Bem’s (2011b) experiments were far from watertight – for instance, Alcock (2011) and Yarkoni (2011) have outlined numerous experimental flaws in the design. We won’t describe these various issues here as they have been widely discussed in the blogosphere and elsewhere (see, for instance, Bem’s, 2011a, response to Alcock). While many of these methodological problems are worrying, we don’t think any of them completely undermine what appears to be an impressive dataset.
The ‘Feeling the future’ study has become a test case for proponents of Bayesian theory in psychology, with some commentators (e.g. Rouder & Morey, 2011) suggesting that Bem’s seemingly extraordinary results are an inevitable consequence of psychology’s love for null-hypothesis significance testing. Indeed, Wagenmakers et al. (2011a) suggest that had Bayesian analyses been employed, with appropriate priors, most of Bem’s effects would have been reduced to a credibility level no higher than anecdotal evidence. Given that casinos are not going bankrupt across the world, argued the authors, our prior level of scepticism about the existence of precognitive psychic powers should be high.
Bem and colleagues (2011) responded by suggesting a selection of priors which were in their view more reasonable, and which were in our view illustrative of the problem with Bayesian analyses, especially in a controversial area like parapsychology: your Bayesian prior will depend on where you stand on the previous evidence. Do you, unlike most scientists, take seriously the positive results that are regularly published in parapsychology journals like the Journal of the Society for Psychical Research, or the Journal of Parapsychology? Or do you only accept those that occasionally appear in orthodox journals, like the recent meta-analysis of ‘ganzfeld’ telepathy studies in Psychological Bulletin (Storm et al., 2010)? Do you consider the real world – full as it is of the aforementioned successful casinos – as automatic evidence against the existence of any psychic powers? Your answers to these questions will inform your priors and, consequently, the results of your Bayesian analyses (see Wagenmakers et al., 2011b, for a response to Bem et al., 2011).
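The dependence on priors can be made concrete with the odds form of Bayes’ theorem: posterior odds equal the Bayes factor multiplied by the prior odds. The sketch below is purely illustrative. The Bayes factor of 10 and the two prior probabilities are hypothetical numbers chosen for the example, not values taken from Bem (2011b) or from any of the commentaries, and the function name is likewise an assumption.

```python
def posterior_probability(prior_probability: float, bayes_factor: float) -> float:
    """Return P(H1 | data) given a prior P(H1) and a Bayes factor BF10.

    Uses the odds form of Bayes' theorem:
        posterior odds = Bayes factor * prior odds
    """
    prior_odds = prior_probability / (1.0 - prior_probability)
    posterior_odds = bayes_factor * prior_odds
    return posterior_odds / (1.0 + posterior_odds)


# Hypothetical evidence strength, assumed for illustration only.
BAYES_FACTOR = 10.0

# Hypothetical priors for the existence of precognition.
priors = {
    "sceptic": 1e-6,   # assumes psi is extremely unlikely a priori
    "agnostic": 0.5,   # assumes psi is as likely as not a priori
}

for label, prior in priors.items():
    posterior = posterior_probability(prior, BAYES_FACTOR)
    print(f"{label}: prior={prior:g}, posterior={posterior:.6f}")
```

With these assumed inputs, the same evidence leaves the sceptic’s posterior probability at roughly one in 100,000, while the agnostic’s rises to about 0.91. Identical data can therefore support very different conclusions depending on the starting point.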
We reasoned that the first step towards discovering whether Bem’s alleged effects were genuine was to see if they would replicate. As one of us has pointed out previously (Wiseman, 2010), the only definitive way of doing this is to carry out exact replications of the procedure in the original experiment. Otherwise, any experimental differences muddy the waters and – if the replications fail – allow for alternative interpretations and ‘get-outs’ from the original proponents. Recently, this argument was taken up with direct reference to Bem’s experiment by LeBel and Peters (2011), who strongly argued in favour of more exact replications.
Admittedly, carrying out exact replications of someone else’s work is hardly the most glamorous way to spend your time as a scientist. But we are often reminded – most recently by an excellent article in the APS Observer (Roediger, 2012, and this issue) – that replication is one of the cornerstones of scientific progress. Keeping this in mind, the three of us each repeated the procedure for Bem’s ‘retroactive facilitation of recall’ experiment in our respective psychology departments, using Bem’s instructions, involving the same number of undergraduate participants (50) as he used, and – crucially – using Bem’s computer program (with only some minor modifications, such as anglicising a few of the words). Either surprisingly or unsurprisingly, depending on your priors, all three replication attempts were abject failures. Our participants were no better at remembering the words they were about to see again than the words they would not, and thus none of our three studies yielded evidence for psychic powers.
We duly wrote up our findings and sent them off to the JPSP. The editor’s response came very quickly, and was friendly, but negative. The journal’s policy, the editor wrote, is not to publish replication attempts, either successful or unsuccessful. Add something new to the study (a ‘replication-and-extension’ study), he told us, and we may consider it. We replied, arguing that, since Bem’s precognitive effect would be of such clear importance for psychology, it would surely be critical to check whether it exists in the first place, before going on to look at it in different contexts. The editor politely declined once more, as described by Aldhous (2011) and Goldacre (2011).
While exact replications are useful for science, they’re clearly not very interesting for top journals like JPSP, which will only publish findings that make new theoretical or empirical contributions. We are not arguing that our paper should automatically have been published; for all we knew at this point, it may have suffered from some unidentified flaw. We would, however, like to raise the question of whether journals should be so fast to reject without review exact replications of work they have previously published, especially in the age of online publishing, where saving paper is no longer a priority (Srivastava, 2011).
- Stuart J. Ritchie is at the University of Edinburgh
- Richard Wiseman is at the University of Hertfordshire
- Christopher C. French is at Goldsmiths, University of London
- To quote the title of Bem et al.’s (2011) response to Wagenmakers et al. (2011a): ‘Must psychologists change the way they analyse their data?’ Do we need to consider becoming Bayesians?
- How do we deal with experimenter effects in psychology laboratories?
- Should journals accept papers reporting replication attempts, successful or failed, when they themselves have published the original effect?
- Where should journals publish replication attempts? Internet-only, with article abstracts in the paper copy?
- Who should carry out replication studies? Should scientists be required to replicate their own findings?
- If a scientist chose to carry out many replications of other people’s work, how would this impact his or her career?
- Should more outstanding and controversial scientific questions be subject to prospective meta-analyses?
[Remember, for numerous contributions on this issue please download the PDF version or see http://tinyurl.com/psycho0512]
Alcock, J.E. (2011, 6 January). Back from the future: Parapsychology and the Bem affair. Skeptical Inquirer. Retrieved 6 March 2012 from http://tinyurl.com/5wtrh9q
Aldhous, P. (2011, 5 May). Journal rejects studies contradicting precognition. New Scientist. Retrieved 6 March 2012 from http://tinyurl.com/3rsb8hs
Alvarez, F. (2010). Higher anticipatory response at 13.5 ± 1 H local sidereal time in zebra finches. Journal of Parapsychology, 74(2), 323–334.
Bem, D.J. (2011a, 6 January). Response to Alcock’s ‘Back from the future: comments on Bem’. Skeptical Inquirer. Retrieved 6 March 2012 from http://tinyurl.com/chhtgpm
Bem, D.J. (2011b). Feeling the future: Experimental evidence for anomalous retroactive influences on cognition and affect. Journal of Personality and Social Psychology, 100(3), 407–425. doi: 10.1037/a0021524
Bem, D.J., Utts, J. & Johnson, W.O. (2011). Must psychologists change the way they analyse their data? A response to Wagenmakers, Wetzels, Borsboom & van der Maas (2011). Journal of Personality and Social Psychology, 101(4), 716–719.
Goldacre, B. (2011, 23 April). Backwards step on looking into the future. The Guardian. Retrieved 16 March 2012 from http://tinyurl.com/3d9o65e
LeBel, E.P., & Peters, K.R. (2011). Fearing the future of empirical psychology: Bem’s (2011) evidence of psi as a case study of deficiencies in modal research practice. Review of General Psychology, 15(4), 371–379. doi: 10.1037/a0025172
Ritchie, S.J., Wiseman, R., & French, C.C. (2012). Failing the future: Three unsuccessful replications of Bem’s ‘retroactive facilitation of recall’ effect. PLoS ONE, 7(3), e33423.
Roediger, H.L., III (2012). Psychology’s woes and a partial cure: The value of replication. Observer. Retrieved 16 March 2012 from http://tinyurl.com/d4lfnwu
Rosenthal, R. (1966). Experimenter effects in behavioural research. New York: Appleton-Century-Crofts.
Rosenthal, R. (1979). The file drawer problem and tolerance for null results. Psychological Bulletin, 86(3), 638–641. doi: 10.1037/0033-2909.86.3.638
Rouder, J.N. & Morey, R.D. (2011). A Bayes-factor meta-analysis of Bem’s ESP claim. Psychonomic Bulletin & Review, 18(4), 682–689.
Schlitz, M., Wiseman, R., Watt, C., & Radin, D. (2006). Of two minds: Sceptic-proponent collaboration within parapsychology. British Journal of Psychology, 97, 313–322. doi: 10.1348/000712605X80704
Srivastava, S. (2011, 10 May). How should journals handle replication studies? [Web log post]. Retrieved 6 March 2012 from http://tinyurl.com/crb24a8
Storm, L., Tressoldi, P. & Di Risio, L. (2010). Meta-analysis of free response studies, 1992–2008: Assessing the noise reduction model in parapsychology. Psychological Bulletin, 136(4), 471–485. doi: 10.1037/a001945
Wagenmakers, E.-J., Wetzels, R., Borsboom, D. & van der Maas, H.L.J. (2011a). Why psychologists must change the way they analyse their data: The case of psi: Comment on Bem (2011). Journal of Personality and Social Psychology, 100, 426–432. doi: 10.1037/a0022790
Wagenmakers, E.-J., Wetzels, R., Borsboom, D. & van der Maas, H.L.J. (2011b). Yes, psychologists must change the way they analyse their data: Clarifications for Bem, Utts, and Johnson (2011). Unpublished manuscript.
Wiseman, R. (2010). ‘Heads I win, tails you lose’: How parapsychologists nullify null results. Skeptical Inquirer, 34(1), 36–39.
Wiseman, R. & Schlitz, M. (1998). Experimenter effects and the remote detection of staring. Journal of Parapsychology, 61(3), 197–208.
Yarkoni, T. (2011, 10 January). The psychology of parapsychology, or why good researchers publishing good articles in good journals can still get it totally wrong [Web log post]. Retrieved 6 March 2012 from http://tinyurl.com/694ycam