The Dunning-Kruger effect and its discontents

David Dunning responds to our March cover feature.

The Dunning-Kruger effect suggests that unknowledgeable people lack the very expertise they need to recognise their lack of expertise. They thus overrate their knowledge and performance. Put more technically, deficient cognition (i.e., expertise) leads to faulty metacognition (i.e., self-evaluation of expertise). In contrast, highly expert people underrate their skills socially because they overestimate the knowledge level of their peers (Kruger & Dunning, 1999).

The effect has been uncovered in many settings involving medical residents, gun owners, tournament chess players, debate teams, beginning aviators, and hospital lab techs (Dunning, 2011). In a 2021 study, people who were the least able to separate fake from real news demonstrated little awareness of their failures but the most willingness to trust false news and spread it to others (Lyons et al., 2021).

In a recent Psychologist article (March issue), Robert D. McIntosh and Sergio Della Sala reviewed research arguing that the Dunning-Kruger effect has no real existence but is a mere statistical artefact. They argued that the typical study ‘double-dips’ in its measure of expertise, in that the same task performance measure is used a) to classify people into expert groups and b) to benchmark the accuracy of self-appraisals of expertise. This opens the effect to the statistical artefact of regression to the mean.
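The regression argument can be made concrete with a small simulation. What follows is a minimal sketch and is not taken from any of the cited studies. It assumes a toy world in which there is no metacognitive deficit at all: every person's test score and self-estimate are simply true skill plus independent noise. The function names, sample size and noise level are illustrative assumptions.

```python
import random
import statistics


def percentile_ranks(values):
    """Return each value's percentile rank (0 to 100) within the list."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    for position, index in enumerate(order):
        ranks[index] = 100.0 * position / (len(values) - 1)
    return ranks


def simulate_double_dipping(n=10_000, noise_sd=1.0, seed=0):
    """Mean (self-estimate percentile - test percentile) per test-score quartile.

    Hypothetical toy model: skill is standard normal, and both the test score
    and the self-estimate are skill plus independent Gaussian noise.
    """
    rng = random.Random(seed)
    skill = [rng.gauss(0, 1) for _ in range(n)]
    test = [s + rng.gauss(0, noise_sd) for s in skill]
    self_estimate = [s + rng.gauss(0, noise_sd) for s in skill]

    test_pct = percentile_ranks(test)
    self_pct = percentile_ranks(self_estimate)

    # 'Double-dipping': the same test score both sorts people into quartiles
    # and serves as the benchmark against which self-estimates are judged.
    groups = {1: [], 2: [], 3: [], 4: []}
    for i in range(n):
        quartile = min(3, int(test_pct[i] // 25)) + 1
        groups[quartile].append(self_pct[i] - test_pct[i])
    return {q: statistics.mean(diffs) for q, diffs in groups.items()}


gaps = simulate_double_dipping()
for quartile, gap in gaps.items():
    # A positive gap means the group overestimates its standing on average.
    print(f"Quartile {quartile}: mean gap = {gap:+.1f} percentile points")
```

In this toy model the bottom quartile appears to overestimate and the top quartile to underestimate, purely because of noise. Lowering `noise_sd` towards zero shrinks the simulated gaps, so a pattern that persists once unreliability is reduced cannot be explained by this mechanism alone.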

However, their discussion of the effect focuses on a slim sliver of work amid more than 20 years of active and varied research on the topic (for an ageing review discussing possible statistical artefacts, see Dunning, 2011). They also oddly ignore scholarship that directly contradicts their assertions. For example, they cite the original statistical complaint about Dunning-Kruger (Krueger & Mueller, 2002) but not the response that immediately follows it in the same journal (Kruger & Dunning, 2002).

In this limited space, let me present that contrary research. First, regarding double-dipping, the authors fail to mention research that avoids it (Feld et al., 2017). Researchers asked economics students to estimate their performance on a course exam. However, they sorted students on their objective expertise separately, based on grade-point average in previous courses. Although somewhat reduced, the Dunning-Kruger effect survived quite well. We have adopted Feld et al.’s smart technique whenever we can in our ongoing research.

Second, the authors fail to mention that we have examined regression to the mean effects directly across several studies (Ehrlinger et al., 2008; Kruger & Dunning, 2002). The key test is to see whether the effect survives when noise (or measurement unreliability) is reduced or eliminated. The effect remains quite robust even after we do so.

Third, the authors fail to mention research that reveals that something more than statistical artefact is going on and affirms the original metacognitive account for the effect. Jansen et al. (2021) asked respondents, in two separate studies that each involved over 3500 respondents, to gauge their performance on logic or grammar quizzes. They then carefully asked which computational model, based on item-response theory, best anticipated the pattern of misjudgment seen among poor and top performers. The model that worked best included the assumption that poor performers were more prone to judge the accuracy of their answers poorly, as anticipated by our theoretical 1999 framework.

Similarly, Engeler and Häubl (2021) found that people who were worse than average but thought of themselves as better than average tended to commit the error of overestimating their performance. Their error was about the self. In contrast, those who performed better than average but thought of themselves as worse tended to overestimate others – that is, their predominant error was one in social perception. These findings are consistent with our original theoretical framework and subsequent statistical investigations (Ehrlinger et al., 2008).

Fourth, the authors conflate the idea behind the Dunning-Kruger effect with its concrete measurement, as though research on the effect stopped after Study 2 of our original 1999 paper. There are many other ways to assess whether low performers lack insight into their deficits that avoid regression artefacts. In a class of 95 medical students learning cardio-pulmonary resuscitation (CPR), 36 failed. But only three recognised that failure before seeing a video of their performance (and only 17 after) (Vnuk et al., 2006). In a national survey asking 1300 Americans their opinions about vaccines, roughly a third implausibly claimed to be as knowledgeable about the causes of autism as doctors and scientists. In reality, this overconfident group, who affirmed many myths about autism, demonstrated the least amount of knowledge (Motta et al., 2018).

I apologise for the selectivity in this review. Science communication should humbly acknowledge when the situation is complicated, and there are some fascinating twists and turns in the story of the Dunning-Kruger effect, outlined by over two decades of research, that I hope would make a good read. I have also avoided a long-winded critique of the research featured in the original Psychologist article. For example, Nuhfer et al. (2017) may have asked people for nuanced and granular self-assessments, but then they lumped participants into crude categories (e.g., people more or less than 20 points overconfident) that obscured much of the information contained in those fine-grained measures and limited what they could find.

And, again, there is other on-point research that tells a different story. We have used similar fine-grained self-assessments and replicated the original Dunning-Kruger effect. But more than that, we found psychological conditions that exacerbate or lessen the effect. In one such study, we asked participants financial questions like, ‘If you invested £100 in a bond with a 5% annual interest rate, compounded every year, how much money would you have after 20 years?’ We asked participants to guess the chance that each of their answers was right.

In the example above, the correct answer is £265, which surprises some people, usually because they are not aware of the power of compound interest. Instead, they have a different theory in mind in which money grows along a rigid straight line, and so think the answer is £200. They are quite confident in their answers because they apply a systematic misbelief to their reasoning. Others without such a strong theory are more aware that they do not know. Thus, the Dunning-Kruger effect is most pronounced when people harbour systematic misconceptions about a task (Williams et al., 2013).
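The arithmetic behind the two answers can be checked in a few lines. This is a minimal sketch; the function names are illustrative and not drawn from the study itself.

```python
def compound_growth(principal, rate, years):
    """Value when interest is added to the balance and itself earns interest."""
    return principal * (1 + rate) ** years


def linear_growth(principal, rate, years):
    """Value under the mistaken 'straight line' theory: interest on the principal only."""
    return principal * (1 + rate * years)


print(round(compound_growth(100, 0.05, 20)))  # 265
print(round(linear_growth(100, 0.05, 20)))    # 200
```

The gap between the two results comes entirely from interest earned on earlier interest, which the straight-line theory leaves out.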

And, of course, such a psychological circumstance would not matter if the Dunning-Kruger effect were a mere statistical artefact.

But one last thought. Often, scholars cite statistical artefacts to argue that the Dunning-Kruger effect is not real. But they fail to notice that the pattern of self-misjudgements remains regardless of what may be producing it. Thus, the effect is still real; the quarrel is merely over what produces it. Are self-misjudgements due to psychological circumstances (such as metacognitive deficits among the unknowledgeable) or are they due to statistical principles, revealing self-judgement to be a noisy, unreliable, and messy business?

Either way, there are potential consequences. Among over 1100 medical students in an obstetrics/gynaecology rotation, those receiving a grade of D+ or lower on their final exam thought on average they would get a much higher B-. Those getting an A only slightly underestimated their grade as a B+ (Edwards et al., 2003). No matter how these patterns of mistaken self-perception arise, I’d be interested in how they play out as students move on to their medical practice.

David Dunning
University of Michigan

Illustration: Tim Sanders

References
Dunning, D. (2011). The Dunning-Kruger effect: On being ignorant of one’s own ignorance. In J. Olson & M. P. Zanna (Eds.), Advances in Experimental Social Psychology (vol. 44, pp. 247-296). Elsevier.
Edwards, R.K., Kellner, K.R., Sistrom, C.L. & Magyari, E.J. (2003). Medical student self-assessment of performance on an obstetrics and gynecology clerkship. American Journal of Obstetrics and Gynecology, 188(4), 1078-1082.
Ehrlinger, J., Johnson, K., Banner, M. et al. (2008). Why the unskilled are unaware: Further explorations of (lack of) self-insight among the incompetent. Organizational Behavior and Human Decision Processes, 105(1), 98-121.
Engeler, I. & Häubl, G. (2021). Miscalibration in predicting one’s performance: Disentangling misplacement and misestimation. Journal of Personality and Social Psychology, 120(4), 940–955.
Feld, J., Sauermann, J. & de Grip, A. (2017). Estimating the relationship between skill and overconfidence. Journal of Behavioral and Experimental Economics, 68, 18-24.
Jansen, R.A., Rafferty, A.N. & Griffiths, T.L. (2021). A rational model of the Dunning-Kruger effect supports insensitivity to evidence in low performers. Nature Human Behaviour, 5(6), 756-763.
Kruger, J.M. & Dunning, D. (1999). Unskilled and unaware of it: How difficulties in recognizing one’s own incompetence lead to inflated self-assessments. Journal of Personality and Social Psychology, 77(6), 1121-1134.
Kruger, J., & Dunning, D. (2002). Unskilled and unaware—But why? A reply to Krueger and Mueller. Journal of Personality and Social Psychology, 82(2), 189-192.
Lyons, B.A., Montgomery, J.M., Guess, A.M. et al. (2021). Overconfidence in news judgments is associated with false news susceptibility. Proceedings of the National Academy of Sciences, 118(23), e2019527118.
Motta, M., Callaghan, T. & Sylvester, S. (2018). Knowing less but presuming more: Dunning-Kruger effects and the endorsement of anti-vaccine policy attitudes. Social Science & Medicine, 211(C), 274-281.
Nuhfer, E., Fleisher, S., Cogan, C. et al. (2017). How random noise and a graphical convention subverted behavioral scientists’ explanations of self-assessment data: Numeracy underlies better alternatives. Numeracy: Advancing Education in Quantitative Literacy, 10(1).
Vnuk, A., Owen, H. & Plummer, J. (2006). Assessing proficiency in adult basic life support: Student and expert assessment and the impact of video recording. Medical Teacher, 28, 429–434.
Williams, E.F., Dunning, D. & Kruger, J. (2013). The hobgoblin of consistency: Algorithmic judgment strategies underlie inflated self-assessments of performance. Journal of Personality and Social Psychology, 104(6), 976–994.
