
Embrace the unknown

Chris Ferguson washes his hands of ‘science laundering’: cleaning up messy data for public consumption.

05 February 2019

Consider the basic premise ‘Does X cause Y?’ It’s at the root of almost any question of interest to the general public or policy makers. Does cognitive-behavioural therapy treat PTSD? Does the TV show 13 Reasons Why cause suicide in teens? Can implicit racism be tested for, and does training reduce racism in society? Generally speaking, people outside of psychological science (and arguably many people within it) want answers to such simple questions. And it is often in the interest of professional guilds – the advocacy organisations that represent psychology and other sciences – to give simple answers. The result is the communication of quasi-scientific nostrums that are, at best, partially true and, at worst, absolute rubbish.

Science laundering is the washing away of inconvenient data, methodological weaknesses, failed replications, weak effect sizes, and between-study inconsistencies. The cleaned-up results of a research field are then presented as more solid, consistent and generalisable to real-world concerns than they are. Individual studies can be irresponsibly promoted by press release, or entire research fields summarised in policy statements in ways that cherry-pick data to support a particular narrative. Such promotions are undoubtedly satisfying and easier to digest in the short term, but they are fundamentally deceitful, and they cast psychology as a dishonest science.

Accusations of science laundering have been levelled at professional guilds such as the American Psychological Association (APA) for many years (Ferguson, 2015; O’Donohue & Dyslin, 1996). The formula appears to be to take an issue of great interest to the general public or policy makers and boil it down to simplistic truisms using science language. In most cases, these quasi-scientific truisms are either politically palatable for the members of the organisation, which creates the illusion that social science tends to support liberal causes (Redding, 2001), or appear to make psychological science indispensable to a policy decision when, in fact, it is not.

My own field of video game violence presents a case study in this phenomenon. Twice, in 2004 and 2013, the APA convened a taskforce to study the issue. Both were composed of a majority of individuals with strong, public, anti-game views, unbalanced by sceptical voices (Copenhaver & Ferguson, in press). This was particularly puzzling given that no fewer than 230 scholars had written to the APA expressing concerns about the quality of the APA’s public stances on the issue (Consortium of Scholars, 2013). It’s hard to shake the sense that ‘science by committee’ may be an ineffective way to reach objective conclusions, and that a taskforce report has little to do with the true state of a science; in this case, an area that has suffered a series of retractions, corrections, failed replications (e.g. Przybylski et al., 2014), failed re-analyses and null results using preregistered designs (e.g. McCarthy et al., 2016). Video game science was repudiated by the US Supreme Court in the 2011 case Brown v. EMA, and some scholars have expressed the view that the APA’s continued public stance on this particular issue has damaged the credibility of psychological science in the eyes of the courts (Hall et al., 2011).

Why do this? Why not change course and release honest statements for research fields that are messy, inconsistent, have systematic methodological weaknesses or that may be outright unreproducible? Incentive structures. Individual scholars are likely seduced by their own hypotheses for a multitude of reasons, both good and bad. Big claims get grants, headlines, book sales and personal prestige. I note this not to imply wrongdoing, but to acknowledge we are all human and respond to incentives.

These incentive structures have been well documented in science more widely, and psychology specifically, in recent years. Unfortunately, the public remains largely unaware of such debates, and ill-equipped to critically evaluate research. As one recent example, Jean Twenge and colleagues (2018) released a study, covered widely in the press, linking screen use to youth suicides. However, another scholar with access to the same dataset noted in an interview that the magnitude of the effect is about the same as that of eating potatoes on suicide (see Gonzalez, 2018: effect sizes ranged from r = .01 to .11 depending on outcome). Such correlations are likely within Meehl’s ‘crud factor’ for psychological science, wherein everything tends to correlate with everything else, to a small but meaningless degree.
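To give a rough sense of scale (a back-of-the-envelope illustration of my own, not a calculation from the study itself), squaring a correlation gives the proportion of variance in the outcome it accounts for:

r = .01 → r² = .0001 (about 0.01 per cent of variance explained)
r = .11 → r² ≈ .012 (about 1.2 per cent of variance explained)

Even at the top of the reported range, then, more than 98 per cent of the variation in the outcome is left unaccounted for.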

In some cases, the meaningfulness of a hypothesis (such as saving children from suicide) can seem critical, even if the effects are trivial. And I can understand why professional guilds, which in many ways function as businesses with psychology as the product they must market, are driven to ‘get it out there’. Perhaps they lament the perception of psychology as a ‘soft’ science (e.g. Breckler, 2005). Psychologists are often pushed to be more assertive in marketing or branding psychology (e.g. Bray, 2009; Weir, 2014, although also see Koocher, 2006 for a different approach), and professional bodies actively advocate for psychological science (Bersoff, 2013). This isn’t necessarily a bad thing, but such calls may inadvertently communicate that accuracy is of secondary importance. For instance, Weir (2014) quoted one scholar as saying that ‘it’s more important to put the science out there, even if a news story misses some of the subtleties’.

To be clear, I am not suggesting anything remotely like bad faith: merely that the understandable zeal to promote psychological science may have backfired, insofar as promotional efforts often overlook psychology’s weaknesses.

The issue of poor communication can spill over into the clinical realm. For instance, a recent treatment guideline for post-traumatic stress disorder focused on recommending cognitive-behavioural therapy (APA, 2017). Perhaps not surprisingly, these recommendations proved controversial among practitioners of other modalities. A 2018 meta-analysis led by Joseph Carpenter found fairly modest results for CBT with PTSD (better results were found for other anxiety disorders), which raises the possibility that the clinical guidelines may be overselling CBT’s value.

Some readers may be thinking, ‘Isn’t it better to attempt to apply psychology to important societal issues even if the evidence available falls short of being conclusive? How certain do we really need to be before we stop fretting about overselling the value of our science?’ I take an unapologetically hard line on this: honesty must be a fundamental facet of scientific communication. We cannot and should not sweep under the rug inconvenient data, methodological weaknesses or tiny effect sizes for the sake of an appealing narrative, no matter how heartfelt that narrative may be. To do so simply isn’t scientific and, inevitably, will do more harm than good to our field.

In some cases, a ‘messy’ policy statement can still have important policy implications. Such honest statements are woefully difficult to find among professional guilds, but government reviews sometimes provide them. For instance, a 2010 review of violent video game research correctly characterised the findings as inconsistent and limited by methodological flaws (Australian Government, Attorney General’s Department, 2010). Despite the messiness, this review paved the way for Australia to rate more violent games, which previously had been effectively censored and were unavailable even to adults.

Ultimately, we should be looking to educate the public about data. People are complex; behaviour is messy. Often psychological science doesn’t have the answer, and we should be comfortable with a response that is murky, convoluted, difficult to parse, controversial, politically incorrect or simply ‘We don’t know’. It’s time for psychological science to embrace the unknown and become more honest about our debates, methodological weaknesses and inconsistencies.

Our brave pioneers

After climbing down from my high horse on science laundering, it is only fair to recognise that our field has seen some pioneers push toward better, more transparent and open methods. This ‘open science’ movement has often been fraught with controversy and even acrimony, but it deserves to take hold as a way forward to clearer scientific values.

The incentive structures in science developed such that ‘publish or perish’ and publication bias created an environment of widespread Type I error. It appears that only a minority of findings in psychology are replicable (Open Science Collaboration, 2015), although, in fairness, the same seems to hold for other sciences, such as medicine. This concern has sometimes been passionately challenged (e.g. Gilbert et al., 2016). But if we reflect upon researcher biases, expectancy effects, fluidity of methods and the pressure to produce positive findings, it seems clear that the problems with reproducibility are real. Or, put more bluntly, a fair percentage of things we’ve been telling introductory psychology students for decades are rubbish.

Finding a way out of this state of affairs will require cultural change within psychology. In part this involves the adoption of more transparent methods. Foremost among these is pre-registration. With pre-registration, scholars publicly register, in advance of data collection, their hypotheses, measures, methods and analysis plan. Pre-registration increases confidence that results were not manipulated after data collection to produce publishable but false-positive findings (Nosek et al., 2018). Other aspects of open science, such as transparency about measures and the sharing of data, can also increase the rigour of our field.
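To make that concrete, here is a hypothetical mock-up of the kind of detail a pre-registration might pin down before any data are collected (my own illustration, not the template of any particular registry):

Hypothesis: playing a violent (versus non-violent) video game for 20 minutes will increase scores on a specified aggression measure by no more than d = 0.20.
Sample: 300 undergraduates, recruited until a pre-set stopping rule is met; no optional stopping.
Measures: the aggression measure scored exactly as specified here, with no alternative scorings explored afterwards.
Analysis: a two-sample t-test (two-tailed, alpha = .05), with the effect size and confidence interval reported whatever the outcome; anything beyond this labelled as exploratory.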

It’s interesting that concerns over the methods of psychological science aren’t new – some research fields, such as media violence, began to crumble in the late 2000s, and similar observations were soon generalised across psychological science (e.g. Simmons et al., 2011). Way back in 1999, Wilkinson and the Task Force on Statistical Inference highlighted the importance of interpreting effect sizes. Yet most articles ignored this suggestion, and those that reported effect sizes happily defended the most trivial of effects so long as the p < .05 threshold had been reached, rendering the entire effort pointless. By contrast, this recent wave has continued to gather momentum, ushering in what some have called a psychological science ‘renaissance’ (Nelson et al., 2018) or, in Brian Nosek’s words, a ‘reformation, not a crisis’.

Others, to be sure, are less enthusiastic. In an infamous early draft of a 2016 column, Susan Fiske referred to data replicators as ‘self-appointed data police’ and to their activities as ‘methodological terrorism’. Fiske’s detractors tended to view her comments as defending a status quo of elite scholars, restricting peer commentary and sheltering bad science. Her defenders worried over the proliferation of harsh peer comments online (comments that themselves did not go through peer review). In fairness, Fiske had a kernel of a fair point – the replication process did sometimes savage individual scholars in a way that amounted to kicking them when they were already down. For instance, Amy Cuddy appeared to be singled out as a sacrifice for the replication cause (see Susan Dominus’s 2017 New York Times piece). Although discrediting the power pose hypothesis is fair game, was it right for Cuddy to be humiliated repeatedly in the public eye? Did her own self-promotion, including a TED talk that remains the #1 Google search result for ‘power poses’, open her up to particularly harsh criticism? Do we feel less sympathy for Phil Zimbardo over new analyses of the Stanford Prison Experiment (see ‘Time to change the story’) because he spent decades promoting it?

These are hard questions to answer. Yet it’s clear we can’t go back. We can’t return to the false-positive results of the past, nor continue to reify them because they’re part of a comforting narrative of how wonderful psychological science is. Only by embracing change, openness and transparency can psychological science progress. Sure, let’s have a conversation about the most civil way to make these changes happen. But ultimately science is about data, not people, and we should worry less about personalities and more about methods that produce the best data. Those who have pushed for open science and this renaissance in psychology deserve great credit.

- Chris Ferguson is Professor of Psychology at Stetson University
[email protected]

Key sources

Copenhaver, A. & Ferguson, C.J. (in press). Selling violent video game solutions: A look inside the APA’s internal notes leading to the creation of the APA’s 2005 resolution on violence in video games and interactive media. International Journal of Law and Psychiatry.
Ferguson, C.J. (2015). ‘Everybody knows psychology is not a real science’: Public perceptions of psychology and how we can improve our relationship with policymakers, the scientific community, and the general public. American Psychologist, 70, 527–542.
Fiske, S. (2016). Mob rule or wisdom of crowds [Draft of article for APS Observer]. Available at http://datacolada.org/wp-content/uploads/2016/09/Fiske-presidential-guest-column_APS-Observer_copy-edited.pdf
Gilbert, D.T., King, G., Pettigrew, S. & Wilson, T.D. (2016). Comment on ‘Estimating the reproducibility of psychological science’. Science, 351(6277), 1037.
Nelson, L.D., Simmons, J. & Simonsohn, U. (2018). Psychology’s renaissance. Annual Review of Psychology, 69, 511–534.
Nosek, B.A., Ebersole, C.R., DeHaven, A.C. & Mellor, D.T. (2018). The preregistration revolution. Proceedings of the National Academy of Sciences of the United States of America, 115(11), 2600–2606.
Open Science Collaboration (2015). Estimating the reproducibility of psychological science. Science, 349(6251), 1–8.
Simmons, J.P., Nelson, L.D. & Simonsohn, U. (2011). False-positive psychology: Undisclosed flexibility in data collection and analysis allows presenting anything as significant. Psychological Science, 22(11), 1359–1366.
Weir, K. (2014). Translating psychological science. APA Monitor, 45(9), 32. Available at www.apa.org/monitor/2014/10/translating-science.aspx