Finding the magic number

Paul Wilson and Colin Cooper investigate methods used to extract the number of factors in a factor analysis
Factor analysis is the most widely used (and abused) family of data-structuring techniques to be found in psychological research. One of the most important considerations when performing factor analysis is determining how many factors are present. This matters because if either too few or too many factors are extracted, the rotated solution may make little sense, or even be misleading. A number of procedures are currently available to help make this decision; however, those most commonly used are inappropriate and not based on sound statistical theory. We will review four common methods, discuss the risks and benefits associated with each, and consider the practicalities of performing them. All are based on an initial principal components analysis of the data.

‘Greater than one’
Recently, Costello and Osborne (2005) found no fewer than 1700 studies that used some form of exploratory factor analysis in a two-year review of PsycINFO. The majority of these used the Kaiser–Guttman ‘eigenvalues greater than one’ criterion (Guttman, 1954; Kaiser, 1960, 1970) to determine the number of factors. Such a revelation would raise many psychometricians’ eyebrows.

In the latest edition of the psychometrician’s bible Psychometric Theory, Nunnally and Bernstein (1994) reason that the Kaiser–Guttman rule is simply ‘that a factor must account for at least as much variance as an individual variable’. In other words, the average of all eigenvalues is one, and factor analysis should extract those factors with an eigenvalue greater than this average value. For simplicity, it may be useful to think of eigenvalues as indicators of the variance explained by a factor. The Kaiser–Guttman rule, therefore, is arbitrarily based on the assumption that factors with ‘better than average’ variance explanation are significant, and those with ‘below average’ variation explanation are not.

Indeed, we should expect any dataset (importantly, this includes random datasets) to contain some factors that explain ‘greater than average’ variance, and similarly some that explain ‘below average’ variance. Therefore, with random data, the number of factors with eigenvalues greater than one will be roughly half the number of items making up that dataset. In other words, with random data comprising x items, Kaiser–Guttman will find around x/2 factors… It’s time to worry about this method! The technique has also been shown to be sensitive to properties of a dataset other than the number of factors, with a tendency to consistently overestimate the number of factors. Nunnally and Bernstein highlighted that the more variables a dataset contains, the weaker the Kaiser–Guttman threshold becomes: a factor whose eigenvalue equals 1.0 accounts for 10 per cent of the variance in a dataset of 10 variables, but only 5 per cent in a dataset of 20. Given such vulnerability, it is no wonder that this method is not recommended for use (see Cooper, 2002, p.124; Nunnally & Bernstein, 1994, p.482; and Pallant, 2005, p.183).
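
The point above is easy to demonstrate for yourself. The sketch below (our own illustration, using numpy; the sample sizes are arbitrary) applies the Kaiser–Guttman rule to purely random data with no factors in it at all:

```python
import numpy as np

# Illustration (not an endorsement!): with purely random data, roughly half
# the eigenvalues of the correlation matrix exceed 1, so the Kaiser-Guttman
# rule "finds" factors where none exist. Sizes here are arbitrary.
rng = np.random.default_rng(0)
n_people, n_items = 200, 20
data = rng.standard_normal((n_people, n_items))  # no true factors

corr = np.corrcoef(data, rowvar=False)
eigenvalues = np.linalg.eigvalsh(corr)[::-1]     # sorted, largest first

n_retained = int(np.sum(eigenvalues > 1.0))
print(n_retained)  # close to n_items / 2, despite zero real factors
```

Run it a few times with different seeds: the count hovers near half the number of items, exactly as the argument above predicts.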

Finding the elbow
Cattell’s (1966) scree test is another factor-estimation technique reliant on eigenvalues. Also known as the ‘above the elbow’ approach, it uses relative, rather than absolute, eigenvalues. The factors’ eigenvalues are plotted (on the y-axis) in descending order of magnitude. Insignificant factors explaining little variance (and therefore with low eigenvalues) will form a near-straight line towards the right of the graph. Factors explaining large amounts of variance will appear above the line to the left of the graph (see the figure in the PDF version of this article).

The number of factors contained within the data is indicated by the number of points ‘above the elbow’ of the straight line. The obvious criticism of this method is its subjectivity, which is all too often frowned upon by the dogma of modern-day psychology. Nonetheless, it can be useful, as it allows a visual examination of a data structure, but only in accompaniment to a more statistically robust technique to provide that magic number of factors.
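
Producing the plot itself takes only a few lines. Here is a minimal sketch (our own, assuming a people-by-items numpy array; the function name is illustrative) — the ‘elbow’ judgement, of course, remains yours:

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # render off-screen; use an interactive backend to view
import matplotlib.pyplot as plt

def scree_plot(data):
    # Eigenvalues of the correlation matrix, largest first
    corr = np.corrcoef(data, rowvar=False)
    eigenvalues = np.sort(np.linalg.eigvalsh(corr))[::-1]

    components = np.arange(1, len(eigenvalues) + 1)
    plt.plot(components, eigenvalues, marker="o")
    plt.xlabel("Component number")
    plt.ylabel("Eigenvalue")
    plt.title("Scree plot")
    return eigenvalues
```

Counting the points sitting clearly above the near-straight line gives the scree estimate of the number of factors.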

Mapping the right direction?
Velicer’s (1976) minimum average partial, or MAP, method differs from the methods mentioned so far in that it has a much sounder theoretical rationale and is consequently more complex to compute. MAP produces a one-factor solution to a dataset and calculates an associated index based on the average squared residual correlations of that one-factor solution. A residual correlation can best be thought of as a correlation indicating ‘left-over’ variance that could not be explained by the single-factor solution. The higher this index, the more variance is left unexplained by the factor. The process is then repeated for a two-factor extraction, then a three-factor extraction, and so on, with the index of residual correlations indicating the amount of variance that goes unaccounted for in an extraction of x factors. The index will show the number of factors (x) that can be extracted to account for the maximum amount of variance within the dataset (i.e. the lowest residual correlation index). This is a primary objective of factor analysis: to account for and appropriately structure as much of the variation within a dataset as possible. So should we all use Velicer’s MAP then? Unfortunately, MAP has been shown to underestimate the true number of factors (Hayton et al., 2004), though it may be more accurate than the Kaiser–Guttman rule or Cattell’s scree test (Zwick & Velicer, 1986, p.440).
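
For the curious, the procedure described above can be sketched in a few lines of numpy. This is our own minimal rendering of Velicer’s method (variable names are illustrative, and a production analysis should use an established implementation such as those mentioned later in this article):

```python
import numpy as np

def velicer_map(data):
    """Minimal sketch of Velicer's (1976) MAP: returns the m that
    minimises the average squared partial correlation after
    partialling out the first m principal components."""
    r = np.corrcoef(data, rowvar=False)
    p = r.shape[0]
    vals, vecs = np.linalg.eigh(r)
    order = np.argsort(vals)[::-1]            # largest eigenvalue first
    vals, vecs = vals[order], vecs[:, order]
    loadings = vecs * np.sqrt(vals)           # component loadings

    off_diag = ~np.eye(p, dtype=bool)
    avg_sq = []
    for m in range(p - 1):
        if m == 0:
            partial = r                       # nothing partialled out yet
        else:
            a = loadings[:, :m]
            resid = r - a @ a.T               # 'left-over' covariance
            d = np.sqrt(np.diag(resid))
            partial = resid / np.outer(d, d)  # residual correlations
        avg_sq.append(np.mean(partial[off_diag] ** 2))
    return int(np.argmin(avg_sq))             # suggested number of factors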

The crème de la crème
Finally, we look at what seems to be the crème de la crème of tests for the number of factors: Horn’s (1965) parallel analysis. This generates many, many sets of random data of the same appearance as the experimental data (same number of participants, same number of variables, etc.). It then factor analyses each set of random data and collates the resulting eigenvalues. This shows how big the first, second, third, etc. eigenvalues typically are when the null hypothesis is actually true (i.e. it shows how large a first eigenvalue one can expect to find by chance, when in reality there are no factors present in the data). If the eigenvalue for the first factor is larger for the experimental dataset than for the random data, one can conclude that there is at least one factor present in the experimental dataset. If so, one considers whether the second eigenvalue from the experimental dataset is greater than its simulated counterpart, and so on.

Rather than just checking whether the eigenvalue from the experimental dataset is larger than the average of the simulated eigenvalues, it is becoming more common to scrutinise the sampling distribution of the simulated eigenvalues. This allows one to determine whether there is less than a 5 per cent chance that the first eigenvalue from the dataset could have occurred if, in reality, there are no factors in the data. If there appears to be one factor present, the real and simulated eigenvalues for the second, third, etc. factors are compared, until sooner or later the real dataset produces an eigenvalue that is no larger than one would expect by chance. Thus the number of eigenvalues before this point is indicative of the number of ‘true’ factors contained within the experimental data – three factors in the example shown in the Table above.
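
The whole procedure described above fits comfortably in a short function. The sketch below is our own illustration (assuming a people-by-items numpy array; the function name and defaults are ours), using the 95th-percentile criterion rather than the simple average:

```python
import numpy as np

def parallel_analysis(data, n_sims=500, percentile=95, seed=0):
    """Minimal sketch of Horn's (1965) parallel analysis: count how many
    leading eigenvalues of the real data exceed the chosen percentile of
    eigenvalues from random data of the same shape."""
    rng = np.random.default_rng(seed)
    n, p = data.shape
    real = np.sort(np.linalg.eigvalsh(np.corrcoef(data, rowvar=False)))[::-1]

    # Eigenvalues from many random datasets of the same dimensions
    sim = np.empty((n_sims, p))
    for i in range(n_sims):
        noise = rng.standard_normal((n, p))
        sim[i] = np.sort(
            np.linalg.eigvalsh(np.corrcoef(noise, rowvar=False)))[::-1]
    threshold = np.percentile(sim, percentile, axis=0)

    # Stop at the first real eigenvalue no larger than chance would produce
    n_factors = 0
    for real_val, sim_val in zip(real, threshold):
        if real_val > sim_val:
            n_factors += 1
        else:
            break
    return n_factors
```

Given a dataset with genuine factor structure, the first few real eigenvalues clear their simulated counterparts and the count stops at the first one that does not.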

Time to go beyond the default
The conclusions of this review are far from new. Zwick and Velicer (1986) compared these four methods using simulated data of various dataset properties and found MAP and parallel analysis to be the most accurate methods, with Kaiser–Guttman being least accurate, consistently overestimating. So why have two theoretically grounded methods of estimating the number of factors more often than not been tossed by the wayside in favour of lesser methods such as Kaiser–Guttman or Cattell’s scree?

Maybe it is because these lesser methods are commonly defaults within statistics software packages. For example, SPSS by itself can only offer Cattell’s scree plot and Kaiser–Guttman methods. So ‘if SPSS can’t do it, I can’t do it’? Think again! There are numerous psychologist-friendly factor analysis programs out there, for free! ‘FACTOR’ (Lorenzo-Seva & Ferrando, 2006) is a very straightforward freeware program that computes MAP and parallel analysis at the tick of a box. You can even do parallel analysis with SPSS by downloading a macro from the internet. If macros aren’t your thing, you may consider a freeware program by Watkins (2000) that will calculate random eigenvalues to compare with SPSS’s output. The program simply requires the number of participants and variables in your experimental data and how many random datasets you want calculated before averaging. Those who use R for their statistics will find parallel analysis packages such as Dinno’s (2008) very recent ‘paran’ package on the CRAN website.

Journal editorial policies are coming up to speed with factor analytic theory, with many now not accepting papers using Kaiser–Guttman and Cattell’s scree methods alone. Our hope for this article is to encourage those not yet sure about MAP and parallel analysis methods to give them a try – they are not as daunting as they first seem.
