The persistent irony of the Dunning-Kruger Effect

Robert D. McIntosh and Sergio Della Sala explore issues around overconfidence and expertise.

When Donald Trump suggested that we could treat Covid-19 by injecting disinfectant, it was hard to know whether to be more astonished by his ignorance in forming the idea or his arrogance in voicing it. And one could only sympathise with Dr Deborah Birx, the pandemic task force coordinator and viral immunologist, squirming in her seat a few feet away. There could hardly be a more vivid illustration of the idea that those with the least expertise in a domain are the most overconfident, whilst true experts are more modest.

Variations on this theme have been expressed by eminent thinkers throughout the ages, for example Charles Bukowski in the illustration above (and for more, see box below). It has also featured in the psychological literature, starting from a 1999 paper by Justin Kruger and David Dunning, from whom it takes its popular name: the Dunning-Kruger Effect (DKE).

Unskilled and unaware

Kruger and Dunning’s now-famous paper was entitled ‘Unskilled and unaware of it’. It reported four studies, in which people performed tests of intellectual or social reasoning, and then estimated how well they had done, as a percentile relative to others, or by guessing their actual score. The numbers were analysed by a peculiar method. People were divided into four tiers (quartiles) of ability based on ranked performance, and then actual and self-estimated performance were plotted for each quartile, resulting in the graph on the left of Figure 1. For all four studies in the paper, and for many studies since, the same general pattern emerged. Poor performers overestimated themselves much more than did those who were truly competent. This is the signature pattern of the DKE.

Figure 1. The true DKE figure (A) shows results from Kruger & Dunning’s (1999, p.1124) Study 1, a humour judgement task. Participants performing in the bottom quartile grossly overestimated their ability. Top quartile performers underestimated themselves, but to a lesser degree. Note that people in general ranked themselves as better than average (above the 50th percentile).

Online explanations (B) of the DKE are more often accompanied by this fictional figure, not from real data, which plots confidence against competence. Mt. Stupid is an artistic invention. The message seems to be that ‘a little learning is a dangerous thing’, a well-known line from a poem by Alexander Pope (and subsequently sung by Frank Sinatra amongst others).

Kruger and Dunning (1999, p.1130) suggested this effect to be ‘a psychological analogue to anosognosia’, referring to a profound lack of insight that can accompany stroke or dementia, which leaves people unaware of their physical or cognitive impairments (see Mograbi & Morris, 2018). They argued that, for many intellectual activities, the skills that we need to judge our own performance are exactly the same as the skills that we need to perform the task. For example, the skills needed to judge whether a sentence is grammatical are the same as those needed to compose grammatical sentences. Unskilled people suffer a ‘dual burden’ because they lack ability for a task, and this also robs them of the ability to recognise it. Accordingly, people of low ability have a ‘metacognitive’ deficit, a blind spot for their own errors, which makes them overconfident. The former President’s ignorance and his arrogance would be two sides of the same dual burden.

The DKE in psychology and beyond

In 2000, Dunning and Kruger were awarded the Ig Nobel Prize – honouring achievements that ‘first make people laugh, and then make them think’ – for ‘their modest report’. (The 2017 Ig Nobel Prize ceremony further honoured the DKE by including the premiere of a mini-opera, ‘The Incompetence Opera: A musical encounter with the Peter Principle and the Dunning-Kruger Effect. And the word so’.) The original paper now has nearly 7000 citations in Google Scholar. The classic pattern, of gross overestimation amongst the least skilled people, has been replicated for diverse domains, from laboratory participants performing logical reasoning tasks, to students sitting real exams (see Dunning, 2011, 2017). The DKE has been extended to the domain of political beliefs (Hall & Raimi, 2018), and applied to explain the confident endorsement of medical misinformation amongst vaccine sceptics (Motta et al., 2018).

It has also permeated the wider culture. Web searches on the ‘Dunning-Kruger effect’ return a bewildering array of hits, from popular science sources to online encyclopaedias, business blogs and political punditry. The DKE is often portrayed in terms of general intelligence, rather than domain-specific expertise, as epitomised by a YouTube clip of John Cleese explaining that stupid people are too stupid to know they are stupid. On social media, #DunningKruger is a pithy way to disparage the intellect of those we disagree with.

Double-dipping and regression to the mean

At the same time, in the academic literature, it has been suggested that the signature pattern of the DKE (Figure 1A) might be nothing more than a statistical artefact. In a typical study, people’s tendencies to under- or overestimation are analysed as a function of their ability for the task. This involves a ‘double dipping’ into the data because the task performance score is used once to rank people for ability, and then again to determine whether the self-estimate is an under- or over-estimate. This dubious double-dipping makes the analysis prone to a slippery statistical phenomenon called ‘regression to the mean’.

Regression to the mean is most often described for situations where we measure the same thing twice over time. Extreme scores on the first occasion ‘regress’ towards the mean or average level on the second occasion, not due to any active process, but just because of the dumb mechanics of chance. Consider that, if you throw one die and roll a six (or a one), then you will probably have a less extreme score when you roll again, just by chance. The same general principle applies for any two measures in which random factors play a role.
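The die example can be checked by exact counting rather than intuition. The short Python sketch below is an illustration only: the function names are invented for this purpose, and ‘less extreme’ is taken to mean strictly closer to the die’s average of 3.5.

```python
from fractions import Fraction


def expected_roll(sides=6):
    """Exact average of one roll of a fair die."""
    return Fraction(sum(range(1, sides + 1)), sides)


def prob_less_extreme(first, sides=6):
    """Exact chance that a second fair roll is strictly closer to the average than `first`."""
    mean = expected_roll(sides)
    closer = [face for face in range(1, sides + 1)
              if abs(face - mean) < abs(first - mean)]
    return Fraction(len(closer), sides)


if __name__ == "__main__":
    for first in (1, 3, 6):
        # The second roll does not depend on the first, so its average never changes.
        print(first, prob_less_extreme(first), expected_roll())
```

After a six (or a one), the next roll is less extreme two times in three and is never more extreme, even though the die has no memory of the first roll.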

There is a superstitious fear amongst US athletes that being featured on the cover of Sports Illustrated magazine causes a subsequent slump in form (Smith, 2002). We do not need to believe in malediction to explain this ‘Sports Illustrated jinx’; we need only realise that sports people get selected for the cover at the very peak of their form, from which the only way is down. Imagine an athlete running 100 metres in 10.45 seconds. She would set a new world record, and she might feature on the cover, but it is unlikely that she would beat or even match her record next time. Her performance would seem to take a turn for the worse, giving credence to the notorious jinx. The Nobel Prize-winning psychologist Daniel Kahneman and his colleague Amos Tversky (1973) highlighted these sorts of regression artefacts as amongst the most perilous of pitfalls for causal reasoning.

The problem is pervasive, because regression to the mean does not only happen over time; it occurs between any two measures that are less than perfectly related (e.g. Chen & Chen, 2021). Imagine that we measure the height and weight of 100 people. Taller people tend to be heavier than shorter people, but the relationship is far from perfect, so the 25 shortest people will not also be the 25 lightest. This means that the shortest people will rank on average higher for weight than they do for height, and the tallest people will rank lower for weight than they do for height. If we notice this pattern, we may come up with causal explanations for why short people are relatively overweight, and tall people relatively underweight. But these explanations would be as fallacious as the Sports Illustrated jinx; there is really nothing to explain beyond the fact that height and weight are imperfectly related.
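The height and weight example can be simulated in a few lines. This is a minimal sketch under stated assumptions, not an analysis of real data: both measures are treated as standardised, normally distributed scores, the correlation of 0.5 is an arbitrary illustrative value, and the function names are invented here.

```python
import random


def percentile_ranks(values):
    """Percentile rank (0-100) of each value within its own list."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    for position, index in enumerate(order):
        ranks[index] = 100.0 * (position + 0.5) / len(values)
    return ranks


def mean_rank_by_height_quartile(n_people=100, correlation=0.5, seed=1):
    """Simulate two imperfectly related measures.

    Returns (mean height rank, mean weight rank) for the shortest and tallest quarters.
    """
    if n_people < 4:
        raise ValueError("need at least four people to form quarters")
    rng = random.Random(seed)
    heights = [rng.gauss(0, 1) for _ in range(n_people)]
    # Weight shares only part of its variation with height; the rest is unrelated.
    leftover = (1 - correlation ** 2) ** 0.5
    weights = [correlation * h + leftover * rng.gauss(0, 1) for h in heights]
    height_rank = percentile_ranks(heights)
    weight_rank = percentile_ranks(weights)
    by_height = sorted(range(n_people), key=lambda i: heights[i])
    quarter = n_people // 4

    def mean(indices, ranks):
        return sum(ranks[i] for i in indices) / len(indices)

    shortest, tallest = by_height[:quarter], by_height[-quarter:]
    return {
        "shortest": (mean(shortest, height_rank), mean(shortest, weight_rank)),
        "tallest": (mean(tallest, height_rank), mean(tallest, weight_rank)),
    }


if __name__ == "__main__":
    print(mean_rank_by_height_quartile())
```

With 100 people, the shortest quarter has a mean height rank of 12.5 by construction, which is the lowest value any group of 25 can have. Their mean weight rank can therefore only be equal or higher, and it will be higher unless they also happen to be exactly the 25 lightest. No causal story about short people is built into the simulation.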

A similar origin has been proposed for the DKE (Krueger & Mueller, 2002). Self-estimates are often uncertain, so they relate quite weakly to actual performance, giving plenty of room for regression to play. It is even worse if the task is on a bounded scale, like a percentile, because anyone scoring at either extreme would be able to misestimate in one direction only. If we focus on the people with the poorest performance, we will inevitably find that they rank higher in terms of their self-estimated scores, and we may say that they over-estimate themselves. We may even be tempted to give a causal explanation such as the dual-burden account, which proposes that unskilled people have a metacognitive deficit. But a simpler explanation would be dumb old regression to the mean.

The ‘better-than-average’ effect, and the DKE

One objection to the idea that the DKE is a regression artefact is that poor performers over-rate themselves by a lot, whilst top performers may under-rate themselves by a little, but still have reasonably accurate self-estimation (Figure 1A). If the DKE were due to regression of self-estimates towards the mean, then surely it should be symmetrical, with overestimation at the bottom end matched by an equal and opposite underestimation at the top?

Well, not necessarily, because it depends on what mean level the self-estimates are regressing towards. If the overall mean self-estimate is the same as the mean actual performance then we would expect a symmetrical pattern, but if they are different, then we would not. In Figure 1A, the mean actual percentile is 50 per cent, but the mean self-estimate is more like 65 per cent. Regression of self-estimates towards this high mean level would produce the asymmetrical pattern of mis-estimation that we see.
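A worked example may make the role of the mean clearer. This is a simplified sketch, assuming that the expected self-estimated percentile S rises linearly with actual percentile A, with a slope b between 0 and 1. The slope of 0.3 is purely illustrative, the mean self-estimate of 65 is the approximate value read from Figure 1A, and 12.5 and 87.5 are the average actual percentiles of the bottom and top quartiles.

```latex
\begin{align*}
\mathbb{E}[S \mid A = a] &= \mu_S + b\,(a - 50), \qquad 0 < b < 1 \\[4pt]
\text{If } \mu_S = 50,\ b = 0.3:\quad
  \text{bottom quartile error} &= 50 + 0.3\,(12.5 - 50) - 12.5 = +26.25 \\
  \text{top quartile error} &= 50 + 0.3\,(87.5 - 50) - 87.5 = -26.25 \\[4pt]
\text{If } \mu_S = 65,\ b = 0.3:\quad
  \text{bottom quartile error} &= 65 + 0.3\,(12.5 - 50) - 12.5 = +41.25 \\
  \text{top quartile error} &= 65 + 0.3\,(87.5 - 50) - 87.5 = -11.25
\end{align*}
```

The slope is identical in both cases, so nobody has been given more or less insight. Raising the mean self-estimate from 50 to 65 is enough to turn a symmetrical pattern into a large overestimate at the bottom and a small underestimate at the top.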

This figure provides one instance of a curious phenomenon: that people in general rate themselves as better-than-average, across quite a wide range of tasks. It’s an effect sometimes known as illusory superiority. A well-known example is that most drivers think they are better than average, though only half of the driving population could truly be in the top 50 per cent. In a DKE analysis, a general tendency for people to over-rate themselves would ensure that self-estimates regress towards a high number, so that top performers’ self-estimates would seem close to the truth, whilst the bottom performers would seem overconfident.

This idea was ingeniously tested by Katherine Burson and colleagues, in 2006. They noted that people tend to think they are better than average at tasks (like driving) that seem fairly easy and non-specialist, but rate themselves more pessimistically for tasks that seem difficult. Burson and colleagues ran a study in which they stepped up the difficulty of quiz questions by requiring very precise answers. People generally rated themselves pessimistically for this hard quiz, so the worst performers had fairly accurate self-estimates, whilst the top performers greatly underestimated themselves. This is a reversal of the classic DKE, because it now looked like the most skilled people had the metacognitive deficit. The dual-burden account cannot explain this reversal, but regression to the mean easily can.
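The quartile analysis, and its reversal for a hard task, can be reproduced with simulated people who all have exactly the same degree of insight. The sketch below is not Burson and colleagues’ data or analysis. The mean self-estimates of 65 (easy task) and 35 (hard task), the slope of 0.3 and the noise level are illustrative assumptions, and the function name is invented here.

```python
import random


def quartile_errors(mean_self_estimate, slope=0.3, noise_sd=15.0,
                    n_people=20000, seed=1):
    """Mean (self-estimate minus actual percentile) for each quartile of actual performance.

    Every simulated person has the same insight: one slope and one noise level for all.
    """
    rng = random.Random(seed)
    totals, counts = [0.0] * 4, [0] * 4
    for _ in range(n_people):
        actual = rng.uniform(0, 100)  # actual percentile
        guess = mean_self_estimate + slope * (actual - 50) + rng.gauss(0, noise_sd)
        guess = min(100.0, max(0.0, guess))  # a percentile scale is bounded
        quartile = min(3, int(actual // 25))  # 0 = bottom quartile, 3 = top quartile
        totals[quartile] += guess - actual
        counts[quartile] += 1
    return [round(total / count, 1) for total, count in zip(totals, counts)]


if __name__ == "__main__":
    print("easy task:", quartile_errors(mean_self_estimate=65))
    print("hard task:", quartile_errors(mean_self_estimate=35))
```

In the easy-task run, the bottom quartile shows a large overestimate and the top quartile a small underestimate, which is the classic DKE pattern. In the hard-task run the pattern flips, with the bottom quartile close to accurate and the top quartile underestimating heavily. Only the overall mean self-estimate differs between the two runs.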

Measuring metacognition

The dual-burden theory says that our ability to think, and to think about our thinking, are tightly coupled, so metacognition marches in lockstep with cognition.

If our cognition is faulty, then our metacognition will be too, and we may be the last person to know it. Unfortunately, the DKE pattern itself provides no good evidence for this theory, not only because of double-dipping, but because typical self-estimation studies do not really measure metacognition. Asking someone to give one number to say how well they did at a task, or to rank themselves relative to others, may involve some degree of self-awareness, but it also involves guesswork and approximation, and there is no easy way to unpick these factors.

A better approach is to use more granular methods of self-estimation. Nuhfer and colleagues (2017) tested more than a thousand people in science faculties, from novice undergraduates through graduate students to professors. Everyone did a detailed self-assessment of their competence for multiple areas in the practice of science and sat an equally diverse skills test. Given this more detailed self-assessment, people were generally pretty accurate, although the professors had a tighter range of estimation errors than the naïve undergraduates. Extreme estimation errors were very rare, and they were just as often underestimates as overestimates. This study concluded that experts may develop a more precise picture of their competencies, but there was no systematic overconfidence amongst their less skilled counterparts.

Another promising procedure is to get people to give multiple confidence ratings whilst they do a task, as a kind of metacognitive commentary. By analysing how confidence relates to performance over many trials, we can get better measures of metacognition. We have applied this strategy to some very simple tasks, in which people pointed at dots flashed on a touchscreen (McIntosh et al., 2019). We collected overall self-estimates of accuracy, and we replicated the classic DKE pattern. We also had people rate their performance after every trial so that we could extract measures of metacognitive insight. Low-accuracy people did indeed have poorer metacognitive insight than the high-accuracy people, but these metacognitive differences were not responsible for the DKE.

One way that we showed this was through a further experiment where we first measured people’s ability in pointing to dots, and we then adjusted the dot sizes, giving bigger dots to low-accuracy people and smaller dots to high-accuracy people, so that everyone ended up hitting the dot around half the time. Once we had equated the hit rates in this way, the low-accuracy people did not overestimate their success any more than the high-accuracy people, yet the metacognitive differences between them persisted. This study suggests that low ability may indeed be associated with poorer self-monitoring, but that this is not the source of the famous DKE, which is just a regression artefact.

We next need to ask whether these conclusions extend to the higher-level intellectual skills for which the DKE is usually studied. Future studies can take advantage of rigorous methods developed recently for the analysis of metacognitive ability. The study of metacognition is a growing area in cognitive neuroscience. Theoretical developments have led to formal computational models of metacognition, as a ‘second-order’ process that allows the brain to monitor its own cognitive (or perceptual) performance (Maniscalco & Lau, 2012; Fleming & Daw, 2017). The essence of good metacognition is that a person will be more confident when they are performing well than when they are performing poorly, showing that they are sensitive to the quality of their performance. This ‘second-order’ model is different from the simple idea proposed in the dual-burden theory of the DKE, in which metacognition is said to depend on exactly the same processes as cognition (this would be a ‘first-order’ model of metacognition). The emerging field of metacognition has recently been summarised in Stephen Fleming’s very accessible 2021 book, Know Thyself.
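The idea that good metacognition means higher confidence when performing well can be made concrete with a simple trial-by-trial summary. The sketch below computes the probability that a randomly chosen correct trial received a higher confidence rating than a randomly chosen incorrect one, often called the area under the type 2 ROC curve. It is offered only as an illustration: it is not the model-based approach of Maniscalco and Lau (2012) or Fleming and Daw (2017), it is not necessarily the measure used in any study cited here, and the function name and toy ratings are invented.

```python
def type2_auroc(confidence, correct):
    """Probability that a correct trial has higher confidence than an incorrect one.

    Ties count as half. A value of 0.5 means confidence carries no information
    about accuracy; 1.0 means every correct trial was rated above every incorrect one.
    """
    hits = [c for c, ok in zip(confidence, correct) if ok]
    misses = [c for c, ok in zip(confidence, correct) if not ok]
    if not hits or not misses:
        raise ValueError("need at least one correct and one incorrect trial")
    wins = sum((h > m) + 0.5 * (h == m) for h in hits for m in misses)
    return wins / (len(hits) * len(misses))


if __name__ == "__main__":
    correct = [1, 1, 1, 0, 0, 1, 0, 1]        # 1 = correct trial, 0 = error
    insightful = [4, 3, 4, 1, 2, 3, 1, 4]     # confidence tracks accuracy
    uninformative = [3, 3, 3, 3, 3, 3, 3, 3]  # same confidence on every trial
    print(type2_auroc(insightful, correct))
    print(type2_auroc(uninformative, correct))
```

A summary of this kind depends on how confidence varies across trials, not on how high it is overall. Someone can therefore be overconfident on average and still show good trial-by-trial sensitivity, which a single overall self-estimate cannot reveal.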

The persistence of the DKE

The persistent popularity of the DKE meme may say more about our psychological foibles than the DKE itself does. Excessive self-confidence is more noticeable (and objectionable) than excessive modesty, so it may appear to require a special explanation. Our pattern-seeking minds find it easier to embrace a causal explanation with a satisfying human story (‘stupid people are too stupid to know they are stupid’) than an impersonal statistical force (regression to the mean). The causal story is then protected by a confirmation bias, whereby we count supporting examples in its favour, whilst explaining away the counter-examples. If we see an expert wracked by self-doubt, we may attribute this to a different phenomenon (‘imposter syndrome’), failing to notice that this contradicts the idea that competence begets self-knowledge.

There is a self-satisfied air of superiority when the DKE is ‘weaponised’ as a way to dismiss the views of others, or to impugn their intelligence. Calling someone out as a victim of the DKE seems watertight, because any denials can be taken to confirm the lack of awareness that the insult implies. But the irony is that to namecheck the DKE in this way shows a lack of awareness of the evidence, so any air of superiority is only an illusion.

BOX: Expressions of the DKE

Peter Baskerville: ‘The ignorant are ignorant of their ignorance.’

Daniel J. Boorstin: ‘The greatest enemy of knowledge is not ignorance, it’s the illusion of knowledge.’

Charles Bukowski: ‘The problem with the world is that intelligent people are full of doubts, while the stupid ones are full of confidence.’

Confucius: ‘Real knowledge is to know the extent of one’s ignorance.’

Charles Darwin: ‘Ignorance more frequently begets confidence than does knowledge.’

Mehmet Murat İldan: ‘The self-confidence of the ignorant is one of the biggest disasters of the humanity!’

Bertrand Russell: ‘… those who feel certainty are stupid, and those with any imagination and understanding are filled with doubt and indecision.’

Shakespeare: ‘The fool doth think he is wise, but the wise man knows himself to be a fool.’

George Bernard Shaw: ‘He knows nothing; and he thinks he knows everything. That points clearly to a political career.’

Lao Tzu: ‘To know that you do not know is the best. To think you know when you do not is a disease. Recognizing this disease as a disease is to be free of it.’

Voltaire: ‘He must be very ignorant for he answers every question he is asked.’

BOX: The pandemic and the DKE

Meanwhile, the pandemic has provided food for thought concerning the relationship between competence and confidence. Just as Donald Trump opined on the benefits of internal disinfectant, so the Chechen leader Ramzan Kadyrov has advocated garlic-chewing to ward off the virus, and Boris Johnson boasted about shaking hands ‘with everybody’. Such displays of confidence are less than inspiring; we would presumably like our leaders to be able to judge when it is appropriate to be confident and when it is not.

Endorsement of unproven information about Covid-19, and sceptical attitudes to the virus and to vaccination, have inevitably been branded with the stamp of the DKE. A recent study confirmed that the people with the least knowledge about the virus over-rate their knowledge the most, suggesting a ‘self-illusion’ of medical expertise (Claessens et al., 2021). As is typical of such studies, the data look a lot like regression to the mean, and may tell us very little about cognitive or metacognitive ability. People might be uninformed, misinformed, or suspicious of mainstream messaging, but this does not mean they are too stupid to know they are stupid or suffering illusions of superiority. Casting such complex issues in terms of the DKE is likely to create divisive debates, rather than to illuminate the factors involved.

 - Robert D. McIntosh and Sergio Della Sala, Human Cognitive Neuroscience, Psychology, University of Edinburgh, UK.

Illustration: Marcelina Amelia

See also 'The trick to getting confidence right'

Key sources 

Burson, K., Larrick, R. & Klayman, J. (2006). Skilled or Unskilled, but Still Unaware of It: How Perceptions of Difficulty Drive Miscalibration in Relative Comparisons. Journal of Personality and Social Psychology, 90, 60-77.

Chen, H. & Chen, S. (2021). Regression to the mean. Encyclopedia Britannica. https://www.britannica.com/topic/regression-to-the-mean. Accessed 23 August 2021.

Claessens, A., Keita-Perse, O., Berthier, F., Raude, J., Chironi, G., Faraggi, M., Rousseau, G., Chaillou-Opitz, S., Renard, H., Aubin, V., Mercier, B., Pathak, A., Perrin, C., & Claessens, Y-E. (2021). Self-illusion and medical expertise in the era of COVID-19. Open Forum Infectious Diseases, 8(4), ofab058. https://doi.org/10.1093/ofid/ofab058

Dunning, D. (2017). We are all confident idiots. Pacific Standard, updated June 14th. https://psmag.com/social-justice/confident-idiots-92793

Dunning, D. (2011). The Dunning-Kruger effect: On being ignorant of one’s own ignorance. Advances in Experimental Social Psychology (1st ed., Vol. 44). Elsevier Inc.

Fleming, S. M. (2021). Know Thyself: The Science of Self-Awareness. Basic Books.

Fleming, S. M., & Daw, N. D. (2017). Self-evaluation of decision-making: A general Bayesian framework for metacognitive computation. Psychological Review, 124(1), 91.

Hall, M.P. & Raimi, K. (2018). Is belief superiority justified by superior knowledge? Journal of Experimental Social Psychology, 76, 290-306.

Kahneman, D., & Tversky, A. (1973). On the psychology of prediction. Psychological Review, 80(4), 237.

Krueger, J., & Mueller, R. A. (2002). Unskilled, unaware, or both? The better-than-average heuristic and statistical regression predict errors in estimates of own performance. Journal of Personality and Social Psychology, 82(2), 180–188.

Kruger, J., & Dunning, D. (1999). Unskilled and unaware of it: how difficulties in recognizing one’s own incompetence lead to inflated self-assessments. Journal of Personality and Social Psychology, 77(6), 1121–1134.

Maniscalco, B., & Lau, H. (2012). A signal detection theoretic approach for estimating metacognitive sensitivity from confidence ratings. Consciousness and Cognition, 21(1), 422–430.

McIntosh, R. D., Fowler, E. A., Lyu, T., & Della Sala, S. (2019). Wise up: Clarifying the role of metacognition in the Dunning-Kruger effect. Journal of Experimental Psychology: General, 148(11), 1882–1897.

McIntosh, R.D., Moore, A., & Della Sala, S. (In press). Skill and self-knowledge: do unskilled people really lack insight? Registered Report. Royal Society Open Science.

Mograbi, D. & Morris, R. (2018). Definitions: Anosognosia. Cortex, 103, 385-386.

Motta, M., Sylvester, S. & Callaghan, T. (2018). Knowing less but presuming more: Dunning-Kruger effects and the endorsement of anti-vaccine policy attitudes. Social Science & Medicine, 211, 274–281.

Nuhfer, E., Fleisher, S., Cogan, C., Wirth, K., & Gaze, E. (2017). How Random Noise and a Graphical Convention Subverted Behavioral Scientists' Explanations of Self-Assessment Data: Numeracy Underlies Better Alternatives. Numeracy: Advancing Education in Quantitative Literacy, 10(1).

Smith, T. (2002). That old black magic millions of superstitious readers-and many athletes-believe that an appearance on Sports Illustrated's cover is the kiss of death. But is there really such a thing as the SI jinx? Vault, Jan 21st.
