
Looking back: The making of an (in)famous experiment

Nestar Russell explores the early evolution of Stanley Milgram’s first official obedience to authority experiment.

26 September 2010

In the early 1960s Stanley Milgram (1963) showed that 65 per cent of a sample of ordinary Americans were willing to inflict potentially lethal shocks on an innocent other. Based on documents obtained from Milgram’s personal archive at Yale University, I was able to retrace some of the important and unmentioned steps that led to this ‘best-known result’ (Miller, 1986, p.9).

We all build on the shoulders of those who came before us, and with respect to Solomon Asch, Milgram’s obedience experiments were no exception. In his renowned 1951 experiment, Asch demonstrated that a third of all participants would conform to a group of confederates in their provision of obviously incorrect answers on a perceptual line judgement task.

When Milgram describes how his basic experimental procedure evolved, the influence of Asch is clear:

I was working for Asch in Princeton, New Jersey, in 1959 and 1960. I was thinking about his group-pressure experiment. One of the criticisms…is that they lack a surface significance, because after all, an experiment with people making judgments of lines has a manifestly trivial content. So the question I asked myself is, How can this be made into a more humanly significant experiment? And it seemed to me that if, instead of having a group exerting pressure on judgments about lines, the group could somehow induce something more significant from the person, then that might be a step in giving a greater face significance to the behavior induced by the group. Could a group, I asked myself, induce a person to act with severity against another person? (Evans, 1980, p.188).

Milgram said he wanted to use group pressure to coerce participants into ‘behaving aggressively toward another person’ (Tavris, 1974, p.80). Later Milgram (1974, p.148) termed such coercive sources of pressure ‘binding factors’: powerful bonds that can entrap a person into doing something they might otherwise prefer not to do. He then imagined a situation like Asch’s experiment, where a naive participant was placed among a group of actors:
…instead of confronting the lines on a card, each one of them would have a shock generator. In other words, I transformed Asch’s experiment into one in which the group would administer increasingly higher levels of shock to a person, and the question would be to what degree an individual would follow along with the group (Evans, 1980, pp.188–189).

But there was a problem: ‘to study the group effect you would also need an experimental control; you’d have to know how the subject performed without any group pressure’ (Tavris, 1974, p.80). Asch resolved the problem of requiring an experimental control by running the line-judgement exercise on participants in the absence of the group. However, Milgram was this time unable to draw from Asch’s legacy because ‘it was not obvious what the inducement would be for a solitary individual to administer shocks in increasing intensities to another person’ (Miller, 1986, p.18). According to Milgram, he started ‘zeroing in on this experimental control [problem].

Just how far would a person go under the experimenter’s orders? It was an incandescent moment… Within a few minutes, dozens of ideas on relevant variables emerged, and the only problem was to get them all down on paper’ (Tavris, 1974, p.80).

Milgram had his control experiment but, perhaps unwittingly, adding ‘orders’ also introduced a new binding factor: a higher-status person trying to impose their will on someone below them in a hierarchical chain of command.

In support of Milgram’s account is an undated archival document (circa 1960) titled ‘Studies in Obedience’ which describes a rudimentary idea to use a shock device with a ‘dial that reads from …light-to-fatal’. He then discusses an initial goal and the main Asch-like coercive technique he intended to deploy to achieve it: ‘In order to create the strongest obedience situation use findings of group dynamics’ (see also Russell, in press). It seems Milgram was aware that to make his mark and capture the attention of academia, he had to develop an experiment that produced an eye-catching result in the first official publication (after which he could pursue numerous variations in an attempt to unravel why so many obeyed). But what was missing was a rationale as to why the group might agree to hurt an innocent person. Another document also titled ‘Studies in Obedience’ (circa 1960) with a sketch of a shock ‘Panel’ (see Figure 1 in PDF version) attempted to address this problem. That is, ‘Because of certain possible hazards’ the group were to undergo a ‘pledge to obey’.

But there is much in this document that Milgram’s post-hoc account has failed to mention: a ‘War Situation’ where one is to adhere to a ‘pledge to obey’ and all are given a Himmler-like ‘Waver [sic] of responsibility’, all ‘For Germa[n]y’. That Milgram’s concerns about the Holocaust – where ordinary Germans later frequently argued they were just following orders – provided the inspiration to invent the obedience experiments has been established (Miller, 1986, p.17). However, the above document illustrates that early in the formulation of his idea he was also attempting to ‘cut and paste’ into the controlled laboratory setting many of the Nazis’ tried and tested techniques of coercion. But in order for Milgram to achieve his unofficial initial goal to ‘create the strongest obedience situation’, were participants likely to accept a transparently Nazi-sounding ‘pledge to obey’ orders to inflict severe shocks on an innocent person? The changes that followed would suggest not.

Milgram knew that deceiving participants into thinking they were inflicting shocks on another person was likely to generate in them what he termed strain: intense feelings of tension. He also understood such feelings might detract from his initial goal to create ‘the strongest obedience situation’. Milgram countered such feelings by introducing what he would later term strain-resolving mechanisms: measures intended to reduce the tension normally associated with inflicting harm (Milgram, 1974, pp.153–164). For example, instead of a ‘pledge to obey’, Milgram revealed in his first research proposal (dated October 1960) a new idea: ‘Obviously some acceptable rationale must be provided’ for inflicting shocks and this was now to be ‘achieved by setting the experiment in a context of “social learning”.’ By presenting the shocks as contributing to some greater good, Milgram had transformed the infliction of harm from ‘something evil’ (shocking an innocent person) into something ‘good’ (advancing human learning) – a strain-resolving conversion process Adams and Balfour (1998, p.xx) termed moral inversion. In this proposal Milgram presented a sketch of the proposed shock device (see Figure 2 in PDF version).

The proposal also mentioned that participants were to be run through the procedure as one of several members of a group or alone. He presumed the group variation ‘will cause the critical subject to comply with the experimental commands to a far higher degree than in the “alone” situation’ and that, although they might be interesting, the latter’s primary purpose was to ‘serve as necessary controls for the group experiments’. Building more on Asch’s than his own legacy, Milgram’s ‘Obedience and Group Process’ experiments constituted at this time ‘the major concern of the present research’.

To assess the idea’s viability, Milgram soon after had some of his students build a shock machine (Figure 3 in PDF version), hone the experimental procedure and run the first pilot studies.

The only ‘group’ pilot Milgram later discussed confirmed his earlier prediction that ‘certain persons will follow the group’ to the end of the shock board. However, the first test runs of the alone control left him ‘astonished’ (as cited in Blass, 2004, p.68). Something about the experimenter’s commands seemed to render them a far stronger binding factor than he had anticipated.

Perhaps to advance his own legacy, rather than contributing to Asch’s, from this point onwards the group force variations were relegated from dominating the research programme to consisting of a couple of minor variations. The ‘alone’ variations were now to be the main focus.

But there was ‘something’ about the student-run pilots that Milgram ‘was never conviced [sic] of’. He suspected a general lack of professionalism might have tipped some participants off that it was all a ruse. It was of crucial importance that in the official series all participants were convinced the learner was being shocked.

However, one would expect that the more believable the experiments, the more resistant to obeying participants would become. This potential obstacle could defeat Milgram’s initial goal to produce a strikingly high completion rate. His solution to this potential problem seems to have been to bombard participants with an array of binding factors and strain-resolving mechanisms that might increase their probability of completing (see Russell, in press).

For example, in a document dated December 1960, Milgram noted that some participants in the pilot mentioned it was the learner’s prerogative to ‘leave whenever he wants to’, and this belief may have emboldened them to stop. Drawing upon his earlier Himmler-like ‘Waver [sic] of responsibility’, Milgram attempted to reduce the participants’ tension regarding their continued participation by proposing:
…the following change should be made; … Possible conversation: … EXPERIMENTER: I Have responsibility…go on with the experiment.

On 25 January 1961 Milgram completed a second research proposal. Having observed the first pilot studies, he presented several potentially fruitful variations on the basic experimental procedure that he suspected might shed light on why so many participants completed it. The variation mentioned first was stimulated by the observation that some participants looked away from the learner, whom they could see dimly through a window (Milgram, 1974, pp.33–34), yet continued inflicting shocks. It seemed: ‘…the salience of the victim may in some degree regulate their performance. This can be tested by varying the “immediacy” of the victim’.

After receiving funding in May 1961, Milgram prepared for a second series of pilot studies and soon after informed his research assistant that the new and improved ‘[shock] apparatus is almost done and looks thoroughly professional’.

In the second research proposal, while alluding to his initial goal, Milgram asked: ‘if one is trying to maximize obedience, is it better to inform a person of the worst of what he may be asked to do at the outset, or is compliance best extracted piecemeal?’ Going by the increasing number of switches in Milgram’s successive envisioned and actual shock machines from 9 (Figure 1) to 10 (Figure 2) to 12 (Figure 3) to 30 (Figure 4), it would appear Milgram saw merit in the latter. It could be argued these changes represented the inclusion (and extension) of another binding factor that would later become known as the foot-in-the-door technique (Freedman & Fraser, 1966). This is where persons are more likely to agree to a significant request if it is preceded by a comparatively insignificant request.

In late July 1961 Milgram embarked on a second series of pilots that aimed both to stop participants penetrating ‘the cover story’ and to trial his recent idea to vary the ‘“immediacy” of the victim’. The two variations piloted were the ‘voice feedback’ condition (the learner could only be heard shouting) and the ‘no feed back’ condition (after being strapped into the electric chair, the learner could not be seen or heard at all). In the latter condition: ‘virtually all subjects, once commanded, went blithely to the end of the board’ (Milgram, 1965, p.61).

Thus, by the final pilot Milgram had discovered how he could achieve his initial goal of maximising the completion rate. But near-total obedience ‘deprived us of an adequate basis for scaling obedient tendencies. A force had to be introduced that would strengthen the subject’s resistance to the experimenter’s commands’ (Milgram, 1965, p.61).

In the first official experiment Milgram decided participants were to experience a little perceptual feedback – auditory stimulation in the form of wall-banging at the 300- and 315-volt shocks. The intention of this procedural adaptation was to slightly increase the intensity of strain (instead of his usual approach of reducing tension).

On 7 August 1961 Milgram felt confident enough to run the first official ‘remote’ condition, generating his ‘best-known’ 65 per cent completion rate. In light of the subtle changes, he probably expected a slightly higher completion rate. Nonetheless, with most participants inflicting every shock, he had still achieved his initial goal of maximising the completion rate. And this result became the centrepiece of Milgram’s (1963) (in)famous publication ‘Behavioral study of obedience’ and had its intended effect.

What can Milgram’s study tell us about experimental psychology? Milgram was engaging in the ad hoc trial-and-error exploratory method of discovery, where ‘a scientist has no very clear idea what will happen, and aims to find out. He [sic] has a feeling for the “direction” in which to go (increase the pressure and see what happens) but no clear expectations of what to expect’ (Harré & Secord, 1972, p.69). This is how many major discoveries occur in the pure sciences: often more by accident than design. Dynamite was very unlikely to have come about from hypothesis testing! As Milgram (cited in Evans, 1980, p.191) said: ‘Many of the most interesting things we find out in experimentation you don’t learn until you carry it out.’

There was nothing underhanded about this approach; as Miller (1986, p.45) has pointed out: ‘Given that there was virtually no previous systematic research on obedience, it was understandable that Milgram’s focus was essentially in a context of discovery or exploration rather than confirming or disconfirming specific hypotheses’.

As the pictures in this article show, Milgram’s indisputably creative idea emerged gradually. Initially it was weak but over time it became a more viable, engaging and truly fascinating project. When reading Milgram’s publications one would be forgiven for thinking that he must have woken up one morning with the complete procedure in his head and then run it later that day. Milgram was clever, but not that clever!

His piloting studies were the seldom mentioned tool that clearly led Milgram to his most fascinating results. Finally, it is important to reiterate that although Milgram may have played an active role in maximising the completion rate in the Remote condition, the official series of experiments were still methodologically very tight. As mentioned, Milgram did not find the student-run pilots totally convincing and it was very important to him that the participants in the official research programme really believed the learner was being shocked. Methodologically, it would seem to me that the obedience research is a very robust series of experiments, and in part that is perhaps why their influence is still felt almost half a century later.

- Nestar Russell is at Victoria University of Wellington