
Statistics Tips and Enigmas

Chuck Schultz


Statistics Tip:
Cardinal Motifs and Elemental Sins

It is customary to interpret the square of a correlation coefficient as a percentage of variance accounted for. Marple gave a vocabulary test to a number of children and found a gigantic spread of scores. She compared test scores with the children's ages and found they correlated .82. She squared .82, which gave .67, and concluded that age accounted for about two-thirds of the variance in vocabulary scores. And in this case, this seemed like a reasonable interpretation.
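The arithmetic behind that interpretation can be sketched in a few lines of Python. This is only an illustration: the ages and scores below are invented, not Marple's data, and the function names are hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def variance_accounted_for(r):
    """The customary reading: the squared correlation as a proportion of variance."""
    return r ** 2

# Invented ages (in months) and vocabulary scores, for illustration only.
ages_months = [60, 72, 84, 96, 108, 120]
vocab_scores = [12, 20, 19, 31, 30, 41]

r = pearson_r(ages_months, vocab_scores)
print(f"invented data: r = {r:.2f}, r squared = {variance_accounted_for(r):.2f}")
print(f"Marple's r of .82 squared = {variance_accounted_for(0.82):.2f}")
```

The squaring is the same whatever the variables are; whether "accounted for" is a sensible description depends on what lies behind the correlation.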

Hawkshaw had each of seven art critics rate the quality of 164 paintings submitted to the Nelson/Rovzar Art Contest. He correlated the ratings of the seven judges. Harry's and Jane's ratings correlated highest, .82. He concluded that Harry's ratings accounted for 67 percent of Jane's ratings.

Since Jane completed her ratings before Harry started, would it be better to say that Jane's ratings accounted for Harry's? Another customary way to describe such a result is to say that Jane's and Harry's ratings had two-thirds of their variance in common. You might assume that Jane and Harry are sensitive to and appreciate the same aspects of a painting.

The implication in both Marple's and Hawkshaw's examples is that common factors (motifs) influence the variables that are correlated, or that correlation implies causation. Oh, but that reminds us that our introductory psych instructor, as well as our introductory stat instructor, warned us that such a conclusion is naïve. But doesn't it seem that they both persisted in drawing that naïve conclusion from time to time?

It turns out that a high school art teacher had largely influenced Jane's preferences, which reveal a partiality to prominent structures. Harry's preferences are affected by his color blindness, which leads him to prefer paintings in which different hues are also different shades. In this sample of paintings, these factors led to similar ratings by Jane and Harry.

In addition, we need to remember that the selectivity of the sample affects the size of the correlation. Age will have a higher correlation with vocabulary scores if age varies from 60 months to 180 months than if age varies from 60 to 72 months. In the wider age range there will be more variance in vocabulary scores, and age will "account for" a higher percentage of it. Still, we know that age doesn't cause a larger vocabulary; rather, it is correlated with a lot of things that do.
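A small simulation can make the effect of sample selectivity concrete. The sketch below is built entirely on assumptions: it invents children whose vocabulary score rises one point per month of age plus random noise, then compares the correlation across the full 60-to-180-month range with the correlation among 60-to-72-month-olds only. The growth rate, noise level, and function names are hypothetical.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def simulate_children(n, seed=1997):
    """Invent (age, vocabulary score) pairs; the model is an assumption."""
    rng = random.Random(seed)
    children = []
    for _ in range(n):
        age = rng.uniform(60, 180)       # age in months
        vocab = age + rng.gauss(0, 20)   # assumed: one point per month, plus noise
        children.append((age, vocab))
    return children

def correlation_in_age_range(children, low, high):
    """Correlate age with vocabulary using only children aged low..high months."""
    subset = [(a, v) for a, v in children if low <= a <= high]
    ages = [a for a, _ in subset]
    scores = [v for _, v in subset]
    return pearson_r(ages, scores)

children = simulate_children(5000)
r_wide = correlation_in_age_range(children, 60, 180)
r_narrow = correlation_in_age_range(children, 60, 72)
print(f"ages 60-180 months: r = {r_wide:.2f}, r squared = {r_wide ** 2:.2f}")
print(f"ages 60-72 months:  r = {r_narrow:.2f}, r squared = {r_narrow ** 2:.2f}")
```

The relationship between age and vocabulary is identical for every simulated child; only the spread of ages in the sample changes, and the correlation changes with it.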

An applicant sample is usually more variable than a promotional sample, and that greater variability generally leads to higher validity coefficients. That is, the applicant sample will have greater variability and a higher proportion of that greater variability will be "accounted for" by the selection test.

We are likely to say that a validity coefficient of .50 shows that 25 percent of the variance of job performance is accounted for by our selection procedure. We may temper our statement by replacing "job performance" with "criterion measure," acknowledging that our coefficient is only one piece of evidence for the validity of the test. Still, we should recognize that the factors measured by the test may act as a surrogate for the ability to do the job, and that the surrogate shares a certain amount of common variance with the criterion measure under the particular circumstances in which we have observed the relationship.

Statistics Enigma:
Quest of the Inner Self through Factor Analysis

While behavior is complex, we often try to capture that complexity in a simple marker -- such as a test score. We know that many variables affect the substance and reliability of candidates' test responses. We consider the effect of aptitudes, knowledge, convictions, discernment of the context, experience with the content, modes of expression, and so on. We look at correlations among measures to see persistent ways of acting, knowing that errors of measurement and sample selection affect the correlations.

Factor analysis provides a way to summarize the information latent in correlations. We decide what issues are common to a group of tests. The first factor extracted (as we say) can be pictured as a new variable, defined so that it has as much variance as possible in common with all the tests. The next factor has as much as possible in common with all the remaining variance. Commonly these initial factors are themselves complex and must be recombined (rotated) in order to yield dimensions that analysts can get a handle on.

Analysts use both geometry and mathematics. They plot tests as vectors (arrows -- directional forces) in multi-dimensional space. Then they find a new vector, which is the least-squares best fit to all the other vectors. The next factor is the best fit to the residuals. (It is all very simple! Or so a mathematician told me.)
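For readers who would like to see the "best fit" step as arithmetic, here is a minimal sketch. It assumes the simplest variant, a principal-components extraction that leaves 1.0 in the diagonal of the correlation matrix rather than communality estimates, and it finds the first axis by power iteration. The correlation matrix is invented and the function names are hypothetical.

```python
import math

def first_factor_loadings(corr, iterations=200):
    """Loadings of each test on the first principal axis of a correlation matrix.

    Power iteration: repeatedly multiply a trial vector by the matrix and
    rescale it; it converges to the direction that best fits all the tests.
    """
    n = len(corr)
    vec = [1.0] * n
    eigenvalue = 0.0
    for _ in range(iterations):
        product = [sum(corr[i][j] * vec[j] for j in range(n)) for i in range(n)]
        eigenvalue = math.sqrt(sum(p * p for p in product))
        vec = [p / eigenvalue for p in product]
    # A loading is the correlation between a test and the factor.
    return [math.sqrt(eigenvalue) * v for v in vec]

def residual_matrix(corr, loadings):
    """What is left of the correlations after removing the first factor."""
    n = len(corr)
    return [[corr[i][j] - loadings[i] * loadings[j] for j in range(n)]
            for i in range(n)]

# Invented correlations among four tests, for illustration only.
corr = [
    [1.0, 0.6, 0.5, 0.4],
    [0.6, 1.0, 0.5, 0.4],
    [0.5, 0.5, 1.0, 0.3],
    [0.4, 0.4, 0.3, 1.0],
]

loadings = first_factor_loadings(corr)
residuals = residual_matrix(corr, loadings)
print("first-factor loadings:", [round(x, 2) for x in loadings])
print("residual correlation, tests 1 and 2:", round(residuals[0][1], 2))
```

The next factor would be fitted to the residual matrix in the same way. Statistical packages use more robust routines than this, but the idea is the same.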

We could obtain ninety-nine factors from a hundred tests, but most of these would seem essentially random. Ordinarily, a very small number of factors will account for all of the reliable variance. That number depends upon the reliability and diversity of the tests. Usually three to ten factors exhaust the reliable, independent dimensions.

A factor common to disparate influences may be impossible to interpret. But adjustment (rotation) of the first several factors will provide information that is more useful. The analyst rotates the factors while retaining essential relationships. The test vectors lying close to the new position of the factor may identify a meaningful dimension. For example, analysts identify combinations of variables that make up verbal, spatial, and numerical abilities; or the big five personality variables.
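A two-factor example shows what rotation does to the numbers. The sketch below assumes invented loadings for four hypothetical tests and an angle chosen by hand; real analyses choose the angle with a criterion such as varimax. Each test's communality (the sum of its squared loadings) is unchanged by the rotation, which is one sense in which the essential relationships are retained.

```python
import math

def rotate_loadings(loadings, degrees):
    """Rotate two-factor loadings by an angle, keeping the factors at right angles."""
    theta = math.radians(degrees)
    c, s = math.cos(theta), math.sin(theta)
    return [(a * c + b * s, -a * s + b * c) for a, b in loadings]

def communalities(loadings):
    """Variance each test shares with the two factors together."""
    return [a * a + b * b for a, b in loadings]

# Invented unrotated loadings: every test loads on factor 1, and factor 2
# separates the verbal tests from the numerical tests.
tests = ["vocabulary", "reading", "arithmetic", "number series"]
unrotated = [(0.60, 0.50), (0.55, 0.45), (0.60, -0.50), (0.50, -0.45)]

# Hand-picked angle (an assumption, not the output of a rotation criterion).
rotated = rotate_loadings(unrotated, -45)

for name, before, after in zip(tests, unrotated, rotated):
    print(f"{name:14s} before: {before[0]:5.2f} {before[1]:5.2f}"
          f"   after: {after[0]:5.2f} {after[1]:5.2f}")
```

After rotation the verbal tests load mainly on one factor and the numerical tests mainly on the other, which is the kind of pattern an analyst can name.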

In some instances, factor naming seems more like a parlor game than a scientific endeavor. When, among easily identifiable factors, Guilford found a factor represented by miscellaneous cognitive tests, he would name it General Reasoning. In factoring a new test battery, he would find different tests defining a General Reasoning factor while the tests originally on General Reasoning would help to define different factors.

Charles Spearman invented factor analysis to summarize information in school examinations. When he analyzed these tests, he noticed that his first factor seemed interpretable. He saw it as a general ability to do well on school tests, and he called it g. In myriad factor analyses, factors appear that the analyst chooses to call general ability. Investigators develop tests in an attempt to measure this construct as something common to many groups of measures. They also conjecture that the same thing explains relationships among a variety of non-test activities.

In most cases the g factor, or first principal axis, can be combined with other factors obtained in the analysis to provide new ways to group the common elements. New factors, linking other aspects of the measures, often provide alternative interesting ways to look at the information. The new perspective may provide insights that seem relevant to the purpose of testing.

We act, sometimes, as if once we have found g, we have all the interesting information contained in cognitive tests. The richness of human behavior ceases to amaze only when we quit looking.

Chuck Schultz may be reached at (360) 923-5340, 2941 B Firwood Loop SE, Olympia, WA 98501-4844.


© Copyright 1997 by the IPMA Assessment Council. All rights reserved.