
Exploratory factor analysis (EFA)

A statistical approach that aims to simplify multivariate data, i.e. data where we have values on many variables (typically 5 to 500 or more) from a set of independent observations. In our fields the values are almost always scores on the items of a questionnaire (or occasionally a rating system), so the items are the variables and the independent observations come from different people each providing information once (if someone provides it more than once the observations are no longer independent).

Details #

The big idea behind EFA is that the statistical methods look at the correlations of the scores on the k items from the n people and aim to separate the variance, i.e. the differences between the scores, fitting a model in which items have shared variance on “common factors” plus uncorrelated noise (“specific factors”, i.e. variance specific to each item and not shared with any other item). Suppose you have a measure with 14 items, seven supposed to be tapping into “anxiety” and seven supposed to be tapping into “depression”. (Yes, that’s the Hospital Anxiety and Depression Scale (HADS) questionnaire.) Here the common factor idea is that each of the n participants has a “latent” (unmeasurable) level of anxiety and a separate (but perhaps correlated) level of depression. Those unmeasurable levels are the two supposed “common factors” common to all n people. However, the model is that no item is a perfect reflection of either factor: whatever else contributes to the variance across the n participants, quirks of individual language usage, the impact of other experiences, anything that does not come from either supposed “common factor” or “latent variable”, makes up the specific factor for that item.
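To make that concrete, here is a minimal R sketch (mine, not from any published analysis) simulating data of exactly that kind: two correlated latent variables each driving seven items, with item-specific noise on top. The item names, the loading of .6, the factor correlation of .7 and the n of 400 are all just illustrative assumptions, not HADS values.

```r
## A minimal simulation of the two-factor common factor model described above:
## 14 items, 7 driven by a latent "anxiety" factor and 7 by a latent
## "depression" factor, the factors correlated, each item with its own noise.
library(MASS)   # for mvrnorm()

set.seed(1)
n <- 400                                        # number of people
phi <- matrix(c(1, .7, .7, 1), nrow = 2)        # assumed factor correlation of .7
latent <- mvrnorm(n, mu = c(0, 0), Sigma = phi) # columns: anxiety, depression
lambda <- .6                                    # assumed loading of every item on its factor

anx_items <- sapply(1:7, function(i) lambda * latent[, 1] + rnorm(n, sd = .8))
dep_items <- sapply(1:7, function(i) lambda * latent[, 2] + rnorm(n, sd = .8))
items <- as.data.frame(cbind(anx_items, dep_items))
names(items) <- c(paste0("anx", 1:7), paste0("dep", 1:7))
```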

If n is large enough, a computer can do the statistical analysis and give you values for the loadings of each item on each of the common factors and the variances of the specific factors.
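As a sketch of what that looks like in practice, base R’s factanal() will fit a two common factor model to the simulated items above and report exactly those two sets of numbers (it labels the specific-factor variances “uniquenesses”):

```r
## Two-factor EFA of the simulated items with base R's factanal();
## the "uniquenesses" are the specific-factor variances described above.
efa2 <- factanal(items, factors = 2)
efa2$loadings       # loading of each item on each common factor
efa2$uniquenesses   # variance specific to each item, not shared with the factors
```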

“If n is big enough” is a real issue but I like a very rough guide that it should be at least 50 and at least 40 times the number of common factors you think are contributing to responses to the measure, so if we are looking at two common factors here your n should be at least 80. You still see papers saying that n should be a multiple of the number of items, often 5 or 10 times. That has been demonstrated to be completely statistically wrong since the 1980s but it’s one of those myths that seems hard to kill off!
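That rough guide as a scrap of arithmetic (the little helper function is just mine for illustration):

```r
## The rough guide above: n at least 50 and at least 40 times the number
## of common factors you expect to be contributing.
min_n <- function(n_factors) max(50, 40 * n_factors)
min_n(2)   # 80, as for the two-factor HADS example
```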

What are these loadings? They indicate how strongly, in the model applied, each item seems to be reflecting input from each of the common factors (a.k.a. latent variables). For the HADS you would expect to see quite large values (there’s another story here but say bigger than .4 assuming typical statistical methods): you’d expect the anxiety items to have strong loadings on one of the factors and low loadings on the other, and vice versa for the depression items.
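Continuing the sketch, suppressing loadings below that illustrative .4 threshold makes the expected pattern easy to see:

```r
## View the loadings with values below an illustrative .4 cutoff suppressed:
## the anx items should load strongly on one factor, the dep items on the other.
print(efa2$loadings, cutoff = 0.4, sort = TRUE)
```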

There are issues about how many common factors should be fitted to the data and ways to decide the best number to consider. There are issues about the actual statistical method to use (but generally that choice of method has very little impact on the loadings and interpretation). All EFA methods start out finding a fit in which the common factors are uncorrelated, so-called “orthogonal” factors, and there are then ways that a computer can rotate things to maximise ease of interpretation. “Orthogonal rotations” keep the factors uncorrelated but “oblique rotations” allow that the common factors might be correlated. Oblique rotations of a two-factor EFA of the HADS generally fit rather well to data from it and, in datasets of item responses from either non-help-seeking or help-seeking people, tend to find high (.6 to .8) correlations between the common factors.
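Here is a sketch of an oblique solution on the simulated items, using the psych package (my choice of tooling for the example, not a requirement; the oblimin rotation also needs the GPArotation package installed):

```r
## Oblique (correlated-factor) EFA using the psych package; Phi is the
## estimated correlation between the two common factors.
library(psych)
efa_oblique <- fa(items, nfactors = 2, rotate = "oblimin", fm = "ml")
efa_oblique$loadings   # pattern loadings after the oblique rotation
efa_oblique$Phi        # estimated factor correlation matrix
```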

Principal component analysis (PCA) is a simpler mathematical approach that makes no statistical assumptions but can be followed by the same rotations as EFA, and you often (used to) see papers reporting that they had conducted an EFA of a questionnaire when they had actually done a PCA. Fortunately, though that muddle is, probably very rightly, offensive to strict psychometricians and statisticians, the two methods rarely produce very different findings for our measures.
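To see that for yourself on the simulated items, a rotated PCA (again assuming the psych package) can be put next to the EFA loadings:

```r
## Rotated PCA of the same items (psych::principal) next to the EFA loadings;
## for measures like this the two usually tell much the same story.
library(psych)
pca2 <- principal(items, nfactors = 2, rotate = "varimax")
print(pca2$loadings, cutoff = 0.4, sort = TRUE)
print(efa2$loadings, cutoff = 0.4, sort = TRUE)
```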

Both EFA and PCA have largely been replaced by “confirmatory factor analysis” (CFA), which involves a very similar statistical model to the one underneath EFA, but CFA differs in that the program only allows loadings where we predicted them. So where EFA will start by extracting up to 13 common factors from the HADS (always one fewer than the number of items, provided n is bigger than the number of items, as it should be) and will allow all 14 items to load on all 13 factors, CFA will only allow two common factors, will only allow the anxiety items to load on one of them and not on the other, and will have the depression items do the same but for the other common factor.
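Here is a sketch of that restriction written in lavaan model syntax (my choice of package; the variable names are the simulated ones from above, not real HADS item labels): each item is allowed to load only on its predicted factor, and the two factors are left free to correlate.

```r
## CFA of the predicted two-factor structure using the lavaan package:
## each item may load only on its own factor; the factors may correlate.
library(lavaan)
hads_model <- '
  anxiety    =~ anx1 + anx2 + anx3 + anx4 + anx5 + anx6 + anx7
  depression =~ dep1 + dep2 + dep3 + dep4 + dep5 + dep6 + dep7
'
cfa_fit <- cfa(hads_model, data = items)
summary(cfa_fit, fit.measures = TRUE, standardized = TRUE)
```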

One important historical note: when factor analysis was first used on questionnaires, say before the 1960s, the “computers” were humans tabulating things and computing all the correlations and the fit to the model by hand, well, helped by slide rules. A single analysis could take weeks. Any modern laptop can do even a complex factor analysis of even huge datasets with many items in a few seconds or less.

Try also #

Chapters #

Not covered in the OMbook.

Online resources #

This probably needs some Rblog examples and perhaps a shiny app that would let you simulate such things. Don’t hold your breath but I will try!

Dates #

First created 13.i.25, links tweaked 20.i.25.
