20  Factor Analysis

20.1 The Latent-Factor Idea

Factor analysis treats the observed measurements (for example, ten survey items about customer satisfaction) as noisy reflections of a smaller set of unobserved constructs that generate them. Spearman (1904) introduced the common-factor model to study intelligence, and Thurstone (1947) generalised it to multiple correlated factors. In the common-factor form, each observed variable \(x_j\) decomposes into a weighted sum of shared factors plus a unique part:

\[ x_j = \lambda_{j1} F_1 + \lambda_{j2} F_2 + \cdots + \lambda_{jm} F_m + u_j \]

The \(\lambda_{jk}\) are the factor loadings, \(F_k\) are the latent factors (mean 0, variance 1), and \(u_j\) is the unique part of item \(j\) (item-specific variance plus measurement error). The idea is that a 10-by-10 item correlation matrix can often be reproduced by a much smaller 10-by-\(m\) loading matrix with \(m = 2\) or \(m = 3\), which is both more parsimonious and easier to interpret.
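The decomposition above can be simulated directly. This is a minimal sketch in R; the loading values, sample size, and item names are illustrative assumptions, not the chapter's example:

```r
# Simulate x_j = lambda_j1*F_1 + lambda_j2*F_2 + u_j for six items:
# items 1-3 load on factor 1, items 4-6 on factor 2 (assumed values).
set.seed(42)
n      <- 1000
Lambda <- matrix(c(0.8, 0.7, 0.6, 0,   0,   0,
                   0,   0,   0,   0.8, 0.7, 0.6), nrow = 6, ncol = 2)
Fmat <- matrix(rnorm(n * 2), n, 2)            # latent factors: mean 0, variance 1
U    <- matrix(rnorm(n * 6), n, 6) %*%
        diag(sqrt(1 - rowSums(Lambda^2)))     # unique parts, scaled so var(x_j) = 1
X <- Fmat %*% t(Lambda) + U
colnames(X) <- paste0("x", 1:6)

# The model-implied correlation of x1 and x2 is lambda_11 * lambda_21 = 0.56;
# the sample estimate should be close, and cross-block correlations near 0.
round(cor(X)[c("x1", "x4"), c("x2", "x5")], 2)
```

Because the uniquenesses are scaled to \(1 - h^2\), each simulated item has unit variance, so the sample correlation matrix can be compared directly to \(\Lambda \Lambda^\top\) off the diagonal.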

```mermaid
flowchart LR
    F1((Factor 1)) --> x1[Item 1]
    F1 --> x2[Item 2]
    F1 --> x3[Item 3]
    F2((Factor 2)) --> x4[Item 4]
    F2 --> x5[Item 5]
    F2 --> x6[Item 6]
    u1[u1] --> x1
    u2[u2] --> x2
    u3[u3] --> x3
    u4[u4] --> x4
    u5[u5] --> x5
    u6[u6] --> x6
```

Note: Common variance vs unique variance

The common-factor model explains only the part of the item variance that is shared with other items. The remainder, the uniqueness \(u_j\), is reserved for item-specific content and measurement error. This is the key difference from principal component analysis (see §4).

20.2 Correlation Suitability

Factor analysis needs the items to actually covary. If the correlation matrix is close to the identity matrix, there is no shared variance to extract. Two standard checks come before any extraction: the Kaiser-Meyer-Olkin (KMO) measure of sampling adequacy (Kaiser 1970) and Bartlett’s test of sphericity (Bartlett 1950). KMO above 0.70 is acceptable and above 0.80 is meritorious; Bartlett’s test rejecting the identity-matrix null is the minimum bar.

The within-block correlations (x1-x3 and x4-x6) are noticeably larger than the across-block correlations, KMO is comfortably above 0.70, and Bartlett’s chi-square is highly significant: the matrix is factorable.
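Both checks are available in the psych package. The dataset below is a simulated stand-in for the chapter's six-item example; its loadings and sample size are assumptions made for illustration:

```r
library(psych)

# Simulated two-factor, six-item data (assumed structure)
set.seed(42)
n <- 500
L <- matrix(c(0.8, 0.7, 0.6, 0, 0, 0,
              0, 0, 0, 0.8, 0.7, 0.6), 6, 2)
X <- matrix(rnorm(n * 2), n, 2) %*% t(L) +
  matrix(rnorm(n * 6), n, 6) %*% diag(sqrt(1 - rowSums(L^2)))
colnames(X) <- paste0("x", 1:6)

KMO(X)$MSA                        # overall sampling adequacy; want > 0.70
cortest.bartlett(cor(X), n = n)   # chi-square test of the identity-matrix null
```

Note that `cortest.bartlett()` needs the sample size `n` when it is handed a correlation matrix rather than raw data.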

20.3 Eigenvalues and Factor Count

The eigenvalues of the correlation matrix quantify how much variance each potential component captures. The Kaiser rule (Kaiser 1960) retains components with eigenvalues greater than 1. The scree plot (Cattell 1966) looks for an elbow where successive eigenvalues drop off sharply. Parallel analysis (Horn 1965) compares observed eigenvalues to those from random data of the same shape and is now considered the most defensible rule.

Two eigenvalues exceed 1 and the scree plot bends after the second point: the data are consistent with two common factors, which matches how the sample was built.
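All three rules can be applied to the same data with psych; the six-item dataset below is a simulated stand-in whose two-factor structure is an assumption for illustration:

```r
library(psych)

# Simulated two-factor, six-item data (assumed structure)
set.seed(42)
n <- 500
L <- matrix(c(0.8, 0.7, 0.6, 0, 0, 0,
              0, 0, 0, 0.8, 0.7, 0.6), 6, 2)
X <- matrix(rnorm(n * 2), n, 2) %*% t(L) +
  matrix(rnorm(n * 6), n, 6) %*% diag(sqrt(1 - rowSums(L^2)))
colnames(X) <- paste0("x", 1:6)

ev <- eigen(cor(X))$values
round(ev, 2)          # scree values; Kaiser rule retains eigenvalues > 1
sum(ev > 1)           # factor count by the Kaiser rule

# Parallel analysis: compare to eigenvalues from random data of the same shape
pa <- fa.parallel(X, fa = "fa", plot = FALSE)
pa$nfact              # suggested number of factors
```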

20.4 PCA vs Factor Analysis

Principal component analysis (PCA) and exploratory factor analysis (EFA) are often confused. PCA extracts components that maximise the total variance of the observed items, so each component is a linear combination of the items. EFA extracts factors assumed to cause the items, so unique variance is deliberately excluded. In practice, when the goal is a latent construct (satisfaction, brand trust, ability), EFA is the right tool; when the goal is pure data reduction without a latent claim, PCA is.

The component loadings and the factor loadings both separate the items into two blocks, but the factor solution shrinks the loadings slightly because it works on the common variance only, and it reports communalities (\(h^2\)), which are bounded above by 1.
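The contrast can be seen side by side with psych. The data here are simulated (loadings assumed), `principal()` stands in for PCA, and `fa()` for EFA:

```r
library(psych)

# Simulated two-factor, six-item data (assumed structure)
set.seed(42)
n <- 500
L <- matrix(c(0.8, 0.7, 0.6, 0, 0, 0,
              0, 0, 0, 0.8, 0.7, 0.6), 6, 2)
X <- matrix(rnorm(n * 2), n, 2) %*% t(L) +
  matrix(rnorm(n * 6), n, 6) %*% diag(sqrt(1 - rowSums(L^2)))
colnames(X) <- paste0("x", 1:6)

pc <- principal(X, nfactors = 2, rotate = "varimax")     # PCA: total variance
ef <- fa(X, nfactors = 2, fm = "pa", rotate = "varimax") # EFA: common variance

print(pc$loadings, cutoff = 0.3)   # component loadings: slightly larger
print(ef$loadings, cutoff = 0.3)   # factor loadings: shrunk toward common variance
round(ef$communality, 2)           # h^2, each below 1
```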

20.5 Extraction by Principal Axis Factoring

Principal axis factoring (PAF) replaces the ones on the diagonal of the correlation matrix with initial communality estimates (typically squared multiple correlations), eigendecomposes the reduced matrix, updates communalities from the solution, and iterates until they stabilise (Fabrigar et al. 1999). It is robust to mild non-normality. Maximum likelihood (ML) extraction is a parametric alternative that supports fit tests but assumes multivariate normality of the items.

The communalities sum to the common variance explained; the uniquenesses (1 minus \(h^2\)) are the item-specific parts the model cannot reproduce.
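A principal-axis extraction with `psych::fa` makes the communality/uniqueness split concrete; the data are simulated with assumed loadings:

```r
library(psych)

# Simulated two-factor, six-item data (assumed structure)
set.seed(42)
n <- 500
L <- matrix(c(0.8, 0.7, 0.6, 0, 0, 0,
              0, 0, 0, 0.8, 0.7, 0.6), 6, 2)
X <- matrix(rnorm(n * 2), n, 2) %*% t(L) +
  matrix(rnorm(n * 6), n, 6) %*% diag(sqrt(1 - rowSums(L^2)))
colnames(X) <- paste0("x", 1:6)

paf <- fa(X, nfactors = 2, fm = "pa", rotate = "none")  # principal axis factoring
round(paf$communality, 2)     # h^2: common variance per item
round(paf$uniquenesses, 2)    # 1 - h^2: item-specific variance plus error
sum(paf$communality)          # total common variance explained
```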

20.6 Rotation for Interpretability

Unrotated factors often spread loadings across several columns, making labels hard to assign. Rotation preserves the model’s fit while redistributing loadings to approximate simple structure (each item loads high on one factor and near zero on the others). Varimax (Kaiser 1958) keeps factors orthogonal and is easy to interpret; oblimin lets factors correlate, which is more realistic when the constructs are expected to overlap (Fabrigar et al. 1999).

Varimax produces two cleanly separated columns because the simulated factors were uncorrelated. Oblimin’s Phi (factor correlation) is near zero, confirming that the orthogonal rotation was appropriate for this data; when Phi is 0.30 or higher, an oblique rotation is usually preferred.
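Both rotations are available through `psych::fa` (oblimin additionally requires the GPArotation package to be installed). The data below simulate uncorrelated factors, so Phi should come out near zero:

```r
library(psych)   # oblimin rotation also needs GPArotation installed

# Simulated six-item data with two uncorrelated factors (assumed structure)
set.seed(42)
n <- 500
L <- matrix(c(0.8, 0.7, 0.6, 0, 0, 0,
              0, 0, 0, 0.8, 0.7, 0.6), 6, 2)
X <- matrix(rnorm(n * 2), n, 2) %*% t(L) +
  matrix(rnorm(n * 6), n, 6) %*% diag(sqrt(1 - rowSums(L^2)))
colnames(X) <- paste0("x", 1:6)

vm <- fa(X, nfactors = 2, fm = "pa", rotate = "varimax")
ob <- fa(X, nfactors = 2, fm = "pa", rotate = "oblimin")

print(vm$loadings, cutoff = 0.3)   # two clean columns under varimax
round(ob$Phi, 2)                   # factor correlation; near zero here
```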

20.7 Loadings, Communalities, Cross-Loadings

Loadings are read column by column to label each factor, and row by row to decide which factor each item belongs to. A common rule of thumb treats \(|\lambda| \geq 0.40\) as salient (Hair et al. 2019). An item with two salient loadings of similar magnitude is a cross-loading: it is either a poorly written item that straddles two constructs, or evidence that the two factors overlap substantively.

Every item has one salient loading and no cross-loading above 0.4, so each item is safely assigned to its dominant factor. The column pattern (items 1-3 on PA1, items 4-6 on PA2) is the basis for factor labels in the write-up.
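The salience rule can be applied mechanically to the rotated loading matrix, using the 0.40 cut described above (simulated data, assumed structure):

```r
library(psych)

# Simulated two-factor, six-item data (assumed structure)
set.seed(42)
n <- 500
L <- matrix(c(0.8, 0.7, 0.6, 0, 0, 0,
              0, 0, 0, 0.8, 0.7, 0.6), 6, 2)
X <- matrix(rnorm(n * 2), n, 2) %*% t(L) +
  matrix(rnorm(n * 6), n, 6) %*% diag(sqrt(1 - rowSums(L^2)))
colnames(X) <- paste0("x", 1:6)

sol <- fa(X, nfactors = 2, fm = "pa", rotate = "varimax")
lam <- unclass(sol$loadings)        # strip the loadings print method
salient <- abs(lam) >= 0.40
rowSums(salient)                    # 1 per item = clean; 2 = cross-loading
print(sol$loadings, cutoff = 0.40)  # suppress sub-salient entries
```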

20.8 Factor Scores

Once a model has been fit, each respondent needs a score on each factor for downstream analysis. Thurstone’s regression method (implemented by psych::fa) uses the factor-by-item regression weights from the solution. A simpler alternative is to sum or average the items that load on a given factor: unit-weighted scale scores are more transparent to stakeholders and often correlate above 0.95 with regression scores (Grice 2001). Cronbach’s alpha (Cronbach 1951) reports the internal consistency of such a unit-weighted scale.

The regression scores and the unit-weighted composites are correlated above 0.95 on each factor, and both scales show alpha above 0.70, meeting Nunnally’s (1978) acceptable-reliability threshold.
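Both score types, and alpha for a unit-weighted scale, can be computed with psych. The data are simulated, and the `1:3` grouping is the assumed factor-1 item block:

```r
library(psych)

# Simulated two-factor, six-item data (assumed structure)
set.seed(42)
n <- 500
L <- matrix(c(0.8, 0.7, 0.6, 0, 0, 0,
              0, 0, 0, 0.8, 0.7, 0.6), 6, 2)
X <- matrix(rnorm(n * 2), n, 2) %*% t(L) +
  matrix(rnorm(n * 6), n, 6) %*% diag(sqrt(1 - rowSums(L^2)))
colnames(X) <- paste0("x", 1:6)

sol <- fa(X, nfactors = 2, fm = "pa", rotate = "varimax",
          scores = "regression")      # Thurstone regression scores
comp1 <- rowMeans(X[, 1:3])           # unit-weighted composite for factor 1

# Match the composite to whichever factor column it tracks
# (column order and sign are arbitrary in EFA)
max(abs(cor(sol$scores, comp1)))      # typically above 0.95

alpha(X[, 1:3])$total$raw_alpha       # Cronbach's alpha for the scale
```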

20.9 Reliability of Extracted Constructs

Cronbach’s alpha is widely reported but is strictly a lower bound on reliability and assumes tau-equivalence (equal true-score loadings across items). When loadings differ noticeably, McDonald’s omega (McDonald 1999), computed from the factor-analytic solution itself, is a more accurate composite-reliability estimate. Either way, reliability answers “do the items move together enough to be treated as one scale?” and is complementary to EFA, which answers “how many underlying constructs are there?”.

Tip: Rule of thumb

Report alpha for each extracted scale, flag any below 0.70, and include omega (via psych::omega()) when item loadings are visibly unequal. Reliability below 0.70 usually means the scale has too few items, a cross-loading, or a reverse-keyed item that was not recoded.
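Both coefficients can be obtained from psych for a single scale. With only three items and one factor, `omega()` will note that the hierarchical omega is not meaningful and only omega total is of interest here (simulated data, assumed loadings):

```r
library(psych)

# Simulated two-factor, six-item data (assumed structure)
set.seed(42)
n <- 500
L <- matrix(c(0.8, 0.7, 0.6, 0, 0, 0,
              0, 0, 0, 0.8, 0.7, 0.6), 6, 2)
X <- matrix(rnorm(n * 2), n, 2) %*% t(L) +
  matrix(rnorm(n * 6), n, 6) %*% diag(sqrt(1 - rowSums(L^2)))
colnames(X) <- paste0("x", 1:6)

scale1 <- X[, 1:3]
alpha(scale1)$total$raw_alpha            # alpha: lower-bound estimate

# McDonald's omega total from a one-factor solution for this scale
om <- omega(scale1, nfactors = 1, plot = FALSE)
om$omega.tot                             # composite reliability
```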

20.10 From EFA to Confirmatory Factor Analysis

Exploratory factor analysis lets the data pick which items load where. Confirmatory factor analysis (CFA) fixes the assignment in advance (a measurement model) and tests how well it reproduces the covariance matrix. lavaan (Rosseel 2012) is the standard R implementation. Fit is judged by several indices: CFI and TLI should be 0.95 or higher, RMSEA should be 0.06 or lower, and SRMR should be 0.08 or lower (Hu and Bentler 1999).

The standardised loadings land near the simulated values, the factor correlation is near zero, and CFI, TLI, RMSEA and SRMR all meet the Hu and Bentler (1999) cut-offs. CFA is the right tool when a theory, earlier EFA, or a validated instrument already specifies the item-to-factor assignment.
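A minimal lavaan check of a two-factor measurement model against these cut-offs might look as follows. The data are simulated, and the item-to-factor assignment is fixed in advance, as CFA requires:

```r
library(lavaan)

# Simulated two-factor, six-item data (assumed structure)
set.seed(42)
n <- 500
L <- matrix(c(0.8, 0.7, 0.6, 0, 0, 0,
              0, 0, 0, 0.8, 0.7, 0.6), 6, 2)
X <- matrix(rnorm(n * 2), n, 2) %*% t(L) +
  matrix(rnorm(n * 6), n, 6) %*% diag(sqrt(1 - rowSums(L^2)))
colnames(X) <- paste0("x", 1:6)

# Measurement model: items 1-3 indicate F1, items 4-6 indicate F2
model <- '
  F1 =~ x1 + x2 + x3
  F2 =~ x4 + x5 + x6
'
fit <- cfa(model, data = as.data.frame(X), std.lv = TRUE)

fitMeasures(fit, c("cfi", "tli", "rmsea", "srmr"))
inspect(fit, "std")$lambda        # standardised loadings
```

`std.lv = TRUE` fixes the factor variances to 1 so every loading is freely estimated, which matches the standardised EFA convention.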

20.11 Reporting Factor-Analytic Work

A complete report includes: the list of items with their wording, the sample size, the suitability checks (KMO and Bartlett), the decision rule for the number of factors (Kaiser, scree, parallel analysis), the extraction method and rotation, the rotated loading matrix with communalities, the reliability of each retained scale, and, for CFA, the fit indices against their conventional cut-offs. If factor scores are used downstream, state which method (regression, Bartlett, or unit-weighted composite) produced them. Fabrigar et al. (1999) and Brown (2015) are the standard methodological references reviewers expect.

20.12 Summary

Summary of factor-analytic concepts introduced in this chapter

| Concept | Description |
|---------|-------------|
| **Foundations** | |
| Common-factor model | \(x = \Lambda F + u\); items reflect latent factors plus unique variance |
| Observed vs latent distinction | Factors are unobserved; items are the observed indicators |
| **Suitability and Count** | |
| KMO sampling adequacy | Partial-vs-simple correlation ratio; 0.70 acceptable, 0.80 meritorious |
| Bartlett's sphericity | Null of identity correlation matrix must be rejected |
| Eigenvalue rule | Retain components with eigenvalue greater than 1 |
| Scree plot and parallel analysis | Elbow on scree; parallel analysis is the most defensible rule |
| **Extraction and Rotation** | |
| PCA vs EFA | PCA explains total variance; EFA explains common variance only |
| Principal axis extraction | Iterative update of communalities on the reduced correlation matrix |
| Varimax rotation | Orthogonal rotation to simple structure; factors remain uncorrelated |
| Oblimin rotation | Oblique rotation allowing factors to correlate via the \(\Phi\) matrix |
| **Interpretation** | |
| Salient loading and cross-loading | \(\lvert\lambda\rvert \geq 0.40\) is salient; two salient loadings is a cross-loading |
| Factor scores and unit-weighted composites | Regression-method scores or row means of the items in the scale |
| Cronbach's alpha and McDonald's omega | Internal-consistency estimates for a retained scale |
| **Confirmatory** | |
| Confirmatory factor analysis with lavaan | Measurement model tested via CFI, TLI, RMSEA, SRMR |