This classic blog post attempts a critical analysis of the general intelligence factor 'g'. This factor is what emerges from a factor analysis of the correlations between various tests of mental ability, and is purported to explain a large fraction of the variance in performance across tests. Using Thomson's ability-sampling model, Cosma Shalizi created simulated test data and performed factor analysis on it. With 11 tests drawing from 500 shared and 500 unique 'abilities', all of them independent random variables, he shows that a single factor emerges which explains roughly 30% of the variance in test performance. This is taken to undermine one of the core ideas of 'g' theory: it shows that the 'single factor' needn't correspond to any single variable of interest. While I in fact affirm this conclusion about 'g', I don't think it makes for a particularly strong argument about the reality or meaningfulness of g.
We must first make sure we don't get confused by the fact that, in the simulation, all of the 'abilities' are uncorrelated; this is irrelevant. Summing random variables creates structure in the data, and factor analysis captures that structure. For example, if we have 'tests' T1 = A + B and T2 = A + C, where A, B, and C are independent abilities with equal variance, the two test scores will correlate at 0.5, even though no two abilities are correlated.
To break down the simulations in the blog, let's consider a simple worked example. We'll suppose there is a set of 'shared abilities' that can potentially influence multiple tests (think 'eyesight', which could influence many kinds of test you might subject a person to: reading, driving, archery, and so on), and, for each test, a set of 'unique abilities' specific to it (for example, 'plant knowledge' might exclusively influence a 'gardening skill' test).
"Number of tests": 5
"Shared abilities": {A, B, C, D, E, F, G, H, I, J, K}
"Unique Abilities": {
"Test 1": {a1, b1, c1},
"Test 2": {a2, b2, c2, d2},
"Test 3": {a3, b3},
"Test 4": {a4, b4, c4, d4},
"Test 5": {a5, b5, c5, d5, e5}
}
Now, for each test, we draw abilities at random from each relevant category (anywhere from 1 to all 11 of the shared abilities, plus some of that test's unique abilities). One such draw is listed below; a code sketch of the drawing procedure follows it.
"Test 1": {C, H, J, I, K, E, D, a1, b1, c1},
"Test 2": {B, E, K, G, C, H, A, J, d2},
"Test 3": {K, H, G, A, D, I, C, F, a3, b3},
"Test 4": {H, K, G, C, I, d4, c4, a4, b4},
"Test 5": {H, K, G, D, I, B, C, E, J, e5}
In the draw listed above, it so happens that a trio of abilities (C, H and K) appears in every test. Others, e.g. G, appear in most of them. For the purposes of our simulation, the data-generating process is random and uninteresting, so there is no special significance to attach to an ability, or combination of abilities, appearing many times. However, suppose we believe our battery of tests captures something we care about, like athletic performance; we might then wonder what exactly these variables with predictive value across a range of tests are, and whether there's anything significant in their coincidence. I'll have more to say on this point later.
What matters for now are the simulations that Shalizi carried out. As I mentioned, he used a large number of shared and unique variables. The first point to make is that, having run the code, I found quite a bit of run-to-run variation in the variance explained by the leading factor -- between 20% and 50%. The main driver of this variation is the degree of overlap between the abilities the tests sample: in runs where the tests happen to look more similar, the leading factor explains more of the variance.
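To make the setup concrete, here is a minimal sketch of a Thomson-style simulation. This is my own reconstruction, not Shalizi's code: the subject count and the per-test sampling fractions are assumptions, chosen so that the sketch also exhibits the run-to-run variation just described.

```python
import numpy as np

def leading_factor_share(seed, n_subjects=1000, n_tests=11,
                         n_shared=500, n_unique=500):
    # One run of a Thomson-style ability-sampling simulation.  Every
    # ability is an independent standard-normal variable; each test sums
    # a random subset of the shared abilities plus its own unique ones.
    rng = np.random.default_rng(seed)
    shared = rng.standard_normal((n_subjects, n_shared))
    scores = np.empty((n_subjects, n_tests))
    for t in range(n_tests):
        p = rng.uniform(0.25, 0.75)      # assumed sampling fraction for this test
        mask = rng.random(n_shared) < p  # which shared abilities this test taps
        unique = rng.standard_normal((n_subjects, n_unique))
        scores[:, t] = shared[:, mask].sum(axis=1) + unique.sum(axis=1)
    # Share of total variance captured by the leading factor: the largest
    # eigenvalue of the correlation matrix over the number of tests.
    corr = np.corrcoef(scores, rowvar=False)
    return np.linalg.eigvalsh(corr)[-1] / n_tests

# The share moves from run to run as the sampled overlap changes.
print([f"{leading_factor_share(seed):.0%}" for seed in range(10)])
```

The leading factor's share moves around from seed to seed precisely because the randomly drawn sampling fractions control how much the tests overlap.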
At this point, we should first ask ourselves what a 'real', 'meaningful', or 'monolithic' variable in science looks like. Are there any clear examples? What is it that we want to contrast the g factor with? A 'proper' physiological measurement like height might come to mind. Few would doubt that height is real, not outside of a philosophy seminar at any rate. But it's trivial to show that height can be decomposed into a linear combination of arbitrarily many underlying variables: leg height, torso height, head height, and so on; we can be as granular as we like. If we conduct a battery of basketball-related tests and measure height, eye colour and hair colour, we will probably recover a single dominant factor mostly loaded with height. Will that change if, instead of measuring height, we measure leg, torso, and head length individually? Of course it won't. Neither the statistical properties nor the underlying reality of the situation will have changed, but we'll now be able to point to a factor loaded with multiple component variables. Even a bona fide 'IQ organ' in the brain, whose mass solely determined intelligence, would be consistent with Shalizi's simulations: all that's needed is to split the measurements up. If this point goes through, then we've established that the mere fact that a factor is composed of multiple component variables doesn't imply anything in particular, negative or positive, about the ontological status of the factor.
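To make this concrete, here is a toy version of the basketball battery (my own construction; all the quantities and the noise model are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000  # assumed number of simulated players

# Independent body segments; 'height' is nothing but their sum.
leg, torso, head = rng.standard_normal((3, n))
height = leg + torso + head

# Three basketball tests driven by height (plus noise), and two
# irrelevant traits coded numerically.
tests = np.column_stack([height + rng.standard_normal(n) for _ in range(3)])
eye, hair = rng.standard_normal((2, n))

data = np.column_stack([tests, leg, torso, head, eye, hair])
corr = np.corrcoef(data, rowvar=False)
vals, vecs = np.linalg.eigh(corr)

labels = ["test1", "test2", "test3", "leg", "torso", "head", "eye", "hair"]
print(f"leading factor explains {vals[-1] / len(labels):.0%} of the variance")
for name, loading in zip(labels, vecs[:, -1]):  # sign of loadings is arbitrary
    print(f"{name:>6}: {loading:+.2f}")
# The dominant factor loads on the three tests and on leg, torso and
# head, but not on eye or hair colour: one factor, many component
# variables, and nothing about the underlying reality has changed.
```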
Indeed, though, on the 'positive' end of this (where I've been focused on the negative end), it remains the case that 'g realism' can't really follow from the existence of a single dominant factor in a factor analysis. My point hitherto has been narrow: that many variables load onto a single factor tells us nothing, by itself, about whether that factor is real.