Statistics: Tips and Enigmas

Chuck Schultz


Editor's Note: Chuck's contributions for this issue are a little different from those that have appeared in previous issues of the ACN. They are more in the line of editorials on the use (and misuse) of numbers and psychometrics. This is especially true of the last article, which has been directly labeled as a guest editorial. The topics discussed have caused some controversy in other discussion forums and may do so here. However, the topic is important to what we do, and the fact that it may be controversial does not justify ignoring it. If you wish to agree or disagree with the topics discussed, the ACN will be pleased to receive other guest editorials for consideration in publication.

Ask What It Means

A certain toothpaste is unsurpassed at fighting cavities. What does that mean? It sounds to me as if it means that the researchers failed to reject the null hypothesis. What does that mean? If there was any difference among the toothpastes tested, the difference was not statistically significant, regardless of whether the certain toothpaste had a slightly better or slightly poorer record. Shall we assume the research was well conducted and that the samples were large enough to reveal significance if the difference had been sizable? Or should we ask?
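To see why sample size is worth asking about, here is a minimal Python sketch of a pooled two-proportion z-test. The cavity counts and the function name are hypothetical, invented only for illustration, and the normal approximation is just one simple way such a comparison might be tested. The same five-point difference in cavity rates fails to reach significance with 100 users per brand and reaches it easily with 2,000.

```python
import math

def two_proportion_p_value(x1, n1, x2, n2):
    """Two-sided p-value for H0: both groups share one underlying rate.

    Uses the pooled z-test with a normal approximation.
    x1, x2 are counts of people with cavities; n1, n2 are group sizes.
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # 2 * (1 - Phi(|z|)) written with the complementary error function
    return math.erfc(abs(z) / math.sqrt(2))

# Hypothetical data: 30% vs. 25% cavity rates at two sample sizes.
p_small = two_proportion_p_value(30, 100, 25, 100)
p_large = two_proportion_p_value(600, 2000, 500, 2000)

print(f"n = 100 per brand:  p = {p_small:.3f}")
print(f"n = 2000 per brand: p = {p_large:.5f}")
```

With the small samples, every brand in the study could honestly call itself unsurpassed.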

A speaker encouraging skiers to dress warmly enunciated a percentage: X percent of a person's heat is lost through the head. I don't remember the percentage, but it was high. Forty or sixty percent, something like that. I usually remember numbers if they are meaningful, but that one I considered completely useless, and I didn't ask what it means because I was sure the speaker hadn't the foggiest idea.

How can I be so sure? Because I know the answer to how much heat you lose through your head is, "It depends." It depends upon the percentage of heat the rest of your body is ready to lose. It's going to lose a lot less heat through thermal underwear and an insulated snowsuit than through tee-shirt and jeans. Other factors such as ambient temperature, humidity, and wind chill will also have an effect. And what about the amount of hair on your head and how bouffant your hairstyle?

I'm just as curious about influential factors when someone reports a validity coefficient. Is a validity of .45 high? Perhaps. Find out what it means.

Actuarial Prediction

Actuaries calculate probabilities associated with risk for insurers or lenders. It doesn't matter whether the association is intrinsic to the risk or whether other variables mediate the relationship. If they find that good students have fewer accidents, they solicit business from students with B averages. If they find that persons with good addresses pay their bills more regularly, they offer lower interest to persons who live in Laurelhurst.

We sometimes find a variable that is highly related to job performance: perhaps golfers sell more insurance. We hesitate to exclude candidates because they don't play golf. We select instead on variables intrinsic to job performance.

Perhaps Asian men do better at engineering. We hesitate to exclude engineering candidates because they are not Asian men. We select instead on variables intrinsic to engineering.

Perhaps people who score high on a job-related auto mechanics test make good auto mechanics. We hesitate to exclude mechanic candidates because they do not score high on our job-related auto mechanics test. We select instead on variables intrinsic to auto mechanics.

Perhaps people who score high on an IQ test are better secretaries. We hesitate to exclude secretary candidates because they do not score high on an IQ test. We select instead on variables intrinsic to secretarial work.

Perhaps some of you were with me on some of the examples but left me along the way. I used the same logic in each example. If you parted company, it is because you believed that some of the examples were more job-related than others.

I believe that in every case predictions can become more accurate, and more equitable, if we look beyond conventional measures to find variables intrinsically related to job performance. Then we can hire the candidates with low IQ test scores who will be better secretaries than other candidates with high scores. When do test analyzers make judgments based on what they already know, or what they think they know, rather than based on the data?

Guest Editorial:
The White Guys Get the Lion's Share and Rationalize That That's OK

Folks do a lot of things in the name of affirmative action that are poor psychometrics. Race norming is based on some erroneous assumptions, including that the samples compared are selected in the same way (randomly?) from well-defined populations. Banding throws away valid test data. Sometimes managers use sensitive, valid measures to select from within a band. Even then, more defensible procedures exist for combining valid test data with the latter sensitive, valid measures.

On the other hand, folks use sound psychometrics to justify objective measures that result in the perpetuation of inequities. Seldom does a validity coefficient reach .71. Which is another way of saying, seldom does a selection procedure account for half of the variance in the criterion. Using utility formulas, we can show that test validities of .20 often make money for the company. And we justify using the tests because they are the best evidence we have of candidate qualification.
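The arithmetic behind these figures can be sketched in a few lines of Python. The share of criterion variance accounted for is the square of the validity coefficient, so .71 squared is about .50 while .20 squared is only .04. The utility function below follows one common textbook form, usually credited to Brogden, Cronbach, and Gleser; it is not necessarily the formula any particular employer uses. The function names and every dollar figure are hypothetical, chosen only for illustration.

```python
import math

def variance_explained(r):
    """Share of criterion variance accounted for by validity coefficient r."""
    return r ** 2

def validity_for_variance(share):
    """Validity coefficient needed to account for a given share of variance."""
    return math.sqrt(share)

def utility_gain_per_hire(r, sd_y, mean_z_selected, cost_per_applicant,
                          selection_ratio):
    """Estimated yearly dollar gain per hire over random selection.

    gain = r * SD_y * (mean standard test score of those hired)
           - testing cost per hire
    Testing cost per hire is cost per applicant / selection ratio.
    """
    return r * sd_y * mean_z_selected - cost_per_applicant / selection_ratio

for r in (0.20, 0.45, 0.71):
    print(f"validity {r:.2f} accounts for {variance_explained(r):.0%} of variance")

# Hypothetical inputs: performance SD worth $10,000 a year, the top half of
# applicants hired (mean standard score roughly 0.80), $25 to test each one.
gain = utility_gain_per_hire(0.20, 10_000, 0.80, 25, 0.5)
print(f"estimated gain per hire per year: ${gain:,.0f}")
```

Under these assumed inputs a validity of .20 shows a positive dollar gain, even though it leaves 96 percent of the variance in job performance unaccounted for.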

Usually, we do well to come anywhere near to predicting half the variance in job performance. Generally, much of the unaccounted-for variance is error. In addition, the criterion measure does not precisely represent functioning expected in all of the positions in question. Furthermore, a wide variety of factors that we can't appropriately measure affect job performance, such as whether the candidate will have domestic disruption or a lousy supervisor. So maybe rxy, the correlation between test and criterion, isn't so bad.

Nevertheless, even when we do the best we can, our selection processes leave a lot of unaccounted-for variance. While we justify psychometrically what we have done, we have no call to justify it arrogantly. Scientists have a long history of arrogance, which we look back on disdainfully in the light of greater understanding. (Liebig thought the idea ludicrous that fermentation could be caused by living beings.) So what do we do if evidence tells us that women can't succeed in business or that Blacks are naturally dumb? Can we be humble enough to believe that greater understanding might teach us different?

Colleagues challenge me to come up with better answers than the more intelligent people in the field have. I don't have evidence to prove that Blacks aren't dumb. That proposition doesn't interest me. I often notice Blacks who are more competent than I at a lot of things. Many of them I would outscore on a vocabulary test, which may have a significant correlation with the activity at which they are most competent. Something besides, or in addition to, the vocabulary test would provide a more equitable selection process.

When I use the abstraction something, critics can rightfully say that I haven't come any closer to an answer. But at least I have come closer to the question. Rather than look for a general answer to the proposition that doesn't interest me, I propose looking first for specific answers. For a given activity, why do some actors perform better than others?

Typically, subject-matter experts approach job analysis with some preconceived opinions: the best way to do a job is the way I do it; the best people for the job are people like me; people with certain training and experience will do better. It helps to look at particular workers and determine why they do well or poorly. In doing a job analysis, hold some variables constant, such as race, cognitive test scores, age, and past experience.

Until we learn to make tests that focus on the variables inherently related to job performance, our best psychometrics will perpetuate the status quo. We need some type of affirmative action to make up for our lack of understanding.

I wish we did not need to consider race in making selections. But as long as results demonstrate that selections are implicitly influenced by race, we need to consider it. I prefer that it be considered openly, and that we explicitly consider the value of a balanced workforce. If we don't, the conventional factors will continue historical inequities.

 

Chuck Schultz may be reached at (360) 923-5340, 2941 B Firwood Loop, Olympia, WA 98501-4844.


© Copyright 1997 by the IPMA Assessment Council. All rights reserved.