QUESTIONS AND ANSWERS

 

My advisor tells me to choose a VARIMAX rotation when doing factor analysis, while you choose DIRECT OBLIMIN. What is the difference and why should I believe you?

 

OBLIMIN is SPSS’s option for oblique rotation (‘scheve rotatie’); VARIMAX (and the other options) refers to orthogonal rotation (‘rechte rotatie’). Forget about the word rotation and what it means (it refers to representing factor analysis in a vector space), and try to understand the difference by seeing factor analysis as a bunch of regression equations. OBLIMIN allows for correlation between the latent factors (indeed, it estimates this correlation); VARIMAX constrains this correlation to be 0.00. So the choice should follow from your understanding of the set of items that you are analyzing: if you really think they represent multiple latent variables and that these latent variables cannot be correlated, choose VARIMAX. But why would you be analyzing a set of items in one analysis if you think the underlying dimensions are independent? I find it hard to think of such research situations.

 

OBLIMIN rotations produce more (and more complicated) output than VARIMAX. Apart from the factor correlations, the factor loadings matrix (= relationships between latent factors and observed indicators) is decomposed into the factor pattern matrix and the factor structure matrix, which represent the direct and the total effects of the latent factors, respectively. [Try to understand why these two matrices coincide in orthogonal rotations.] You want to interpret the pattern matrix, in particular when you are looking for ‘simple structure’ (!).
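The relation between the two matrices can be sketched numerically. The example below is a minimal illustration, not SPSS output: it assumes the standard relation structure = pattern × factor correlation matrix, and the loadings and the factor correlation of 0.5 are made-up values. The names `matmul` and `structure_matrix` are hypothetical helpers.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def structure_matrix(pattern, phi):
    """Structure loadings = pattern loadings x factor correlation matrix."""
    return matmul(pattern, phi)


# Made-up pattern loadings for four items on two factors (illustration only).
pattern = [[0.8, 0.0], [0.7, 0.1], [0.1, 0.7], [0.0, 0.8]]

# Oblique solution: the factors are allowed to correlate (0.5 is an assumed value).
phi_oblique = [[1.0, 0.5], [0.5, 1.0]]
# Orthogonal solution: the factor correlation is fixed at zero.
phi_orthogonal = [[1.0, 0.0], [0.0, 1.0]]

structure_oblique = structure_matrix(pattern, phi_oblique)
structure_orthogonal = structure_matrix(pattern, phi_orthogonal)
```

With the zero-correlation matrix the structure loadings equal the pattern loadings, whereas with correlated factors an item picks up an indirect loading on the other factor through the factor correlation.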

 

Why do you construct the index variable using COMPUTE index = MEAN(indicators) instead of using factor scores?

 

The main reason for not using factor scores is that SPSS produces these only for complete cases and leaves out any case that has some missing value. The COMPUTE / MEAN construction uses all available information and also produces a score for cases that have missing values in the indicators (‘item non-response’). So this is a great way to preserve data.

 

(You can instruct SPSS FACTOR to do MEAN SUBSTITUTION to get a factor score for all cases, but this is a weaker procedure for treating missing values.)
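The logic of the COMPUTE / MEAN construction can be sketched outside SPSS as follows. This is a minimal illustration, assuming missing values are coded as `None`; the function name `mean_of_valid` and the `min_valid` parameter are hypothetical and only mimic the idea of averaging over the valid indicators.

```python
def mean_of_valid(values, min_valid=1):
    """Average the non-missing values; return None if too few are valid."""
    valid = [v for v in values if v is not None]
    if len(valid) < min_valid:
        return None
    return sum(valid) / len(valid)


# Made-up cases with three indicators each; None marks item non-response.
cases = [
    [4, 5, 3],           # complete case
    [4, None, 2],        # one indicator missing: still gets a score
    [None, None, None],  # nothing observed: no score possible
]

index = [mean_of_valid(case) for case in cases]
```

A procedure that only scores complete cases would keep the first case alone; the mean over valid values also scores the second. Requiring a minimum number of valid indicators (here `min_valid`) is a design choice that guards against scores resting on a single item.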

 

Other differences between factor score scales and mean value scales are:

·        Factor scores use different weights for the indicators and are in this sense ‘optimal’.

·        (Cronbach’s) alpha reliability refers to mean value scales. For factor score scales you should refer to theta or omega reliability (see appendix in Carmines and Zeller).

 

If the indicators are relatively homogeneous, there is a clear simple structure, and there are few missing values, the two alternatives will be very close.
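For reference, Cronbach's alpha for a mean value scale can be computed from the item variances and the variance of the sum score. The sketch below uses the textbook formula alpha = k/(k-1) × (1 - sum of item variances / variance of the total); it assumes complete cases only, and the function name `cronbach_alpha` and the data are made up for illustration.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha; `items` is a list of item columns (complete cases only)."""
    k = len(items)
    # Sum score per case: add up the k item values of each respondent.
    totals = [sum(case) for case in zip(*items)]
    item_variance_sum = sum(variance(column) for column in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Made-up data: three items that order four respondents identically.
identical_items = [[1, 2, 3, 4], [1, 2, 3, 4], [1, 2, 3, 4]]
alpha_identical = cronbach_alpha(identical_items)
```

Items that are perfectly consistent with each other give an alpha of 1 (up to rounding); less homogeneous items give lower values.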

 

What is the difference between factor analysis (extraction = PAF) and component analysis (extraction = PC)? Which one should I use?

 

The difference between PAF and PC is very large conceptually, but often makes little difference in practice. Briefly: in PAF the observed indicators are the consequences of the latent factors (this is the LISREL measurement model); in PC the components are the result (consequences) of the indicators. Understanding these different (causal) logics is crucial in understanding LISREL models, so I will say much more about this in the future.
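The two causal logics can be written as equations. This is a sketch for a single factor or component with k indicators; the symbols (λ for loadings, ξ for the latent factor, δ for the unique part, w for component weights) follow common LISREL-style notation and are assumptions, not notation taken from this text.

```latex
% Factor model (PAF): each observed indicator is a consequence of the latent factor
x_i = \lambda_i \, \xi + \delta_i , \qquad i = 1, \dots, k

% Component model (PC): the component is a consequence (weighted sum) of the indicators
c = \sum_{i=1}^{k} w_i \, x_i
```

In the first equation the arrow runs from the factor to the indicators and each indicator keeps its own unique part; in the second the arrow runs from the indicators to the component and there is no residual term.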

 

Note that for the time being I use PC analysis, but explain the results as if it were PAF analysis. I will introduce the difference next week.