
The Nassau County Police Case: Impressions

Craig J. Russell, Ph.D.
J.C. Penney Chair of Business Leadership
University of Oklahoma

July 12, 1996


I have one major impression about the Nassau County technical report and related materials. Before stating it, however, I would note that a recent Journal of Applied Psychology article authored by Rich Arvey and others shows that a criterion-valid test in the presence of group mean differences will always be expected to demonstrate disparate impact. Hence, given widely known black-white differences on various tests of cognitive ability, any test that captures that ability will contribute to adverse impact. Petersen and Novick (1976) provided the most thorough integration of psychometric and political-social considerations influencing selection decisions, generating a summary equation in which social and psychometric considerations are made separately and then combined into final selection decisions. The current effort in Nassau County seems to be a search for a test that does both: exhibits criterion validity and minimizes adverse impact. Unfortunately, this means that whatever procedure emerges, the test scores it yields must almost certainly be uncorrelated with cognitive ability. Hence, we see the authors bending over backwards to eliminate cognitive test remnants from the predictor domain in the concurrent validity portion of the study (i.e., controlling for tenure when tenure is correlated with race and cognitive test scores, setting a 1% cut-off for the reading test, etc.). This is not necessarily bad if alternate, criterion-valid noncognitive procedures can be identified. As noted in the comments of others on this study, however, this does not appear to be the case here.
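The arithmetic behind the two points above (a group mean difference produces disparate impact at any meaningful cut-off, and a 1% cut-off effectively removes the test from the screen) can be sketched as follows. This is only an illustration under stated assumptions, not an analysis of the Nassau County data: scores are assumed normal with equal variance in both groups, the one-standard-deviation mean difference is a hypothetical value, and the function names are invented for this example.

```python
from statistics import NormalDist


def pass_rates(d, cutoff_percentile):
    """Return (higher-group, lower-group) pass rates for a single cut-off.

    Assumptions (not from the technical report): scores in both groups are
    normal with unit variance, group means differ by d standard deviations,
    and the cut-off sits at the given percentile (0-1) of the
    higher-scoring group's distribution.
    """
    higher = NormalDist(mu=0.0, sigma=1.0)
    lower = NormalDist(mu=-d, sigma=1.0)
    cutoff = higher.inv_cdf(cutoff_percentile)
    return 1.0 - higher.cdf(cutoff), 1.0 - lower.cdf(cutoff)


def impact_ratio(d, cutoff_percentile):
    """Lower-scoring group's pass rate divided by the higher-scoring group's."""
    rate_higher, rate_lower = pass_rates(d, cutoff_percentile)
    return rate_lower / rate_higher


if __name__ == "__main__":
    hypothetical_d = 1.0  # illustrative mean difference in SD units
    for pct in (0.01, 0.25, 0.50, 0.75):
        ratio = impact_ratio(hypothetical_d, pct)
        print(f"cut-off at percentile {pct:.2f}: impact ratio = {ratio:.2f}")
```

Under these assumptions the ratio moves toward 1.0 as the cut-off drops, because almost everyone in both groups passes; the same property that removes the impact also removes the test's ability to screen on what it measures.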

Now, as to my major impression, it appears that all decisions in the Nassau study were driven by impact adjustments. Specifically, the technical report is driven by a goal of achieving the Justice Department's "impact" goals while simultaneously doing what Frank Landy (1986, American Psychologist) called "stamp collecting," i.e., filling the squares laid out in the study's Exhibit 1. The "theme" running throughout the report is the clearly selective presentation of information, ranging from the descriptive (e.g., the description of security procedures) to basic analyses (zero-order correlations). Clearly, a great deal of "craftsmanship" was exerted to achieve their no-impact results. Equally clearly, most of the judgment calls that went into this "craftsmanship" are not adequately explained or documented. For example, given the investigators involved, I cannot believe that there is not some rational explanation for the 1% cognitive test cut-off (though I can't imagine what it might be other than to effectively eliminate cognitive ability as part of the screening procedure). I coauthored a meta-analysis in which we examined all criterion-related validity studies published in Journal of Applied Psychology and Personnel Psychology between 1965 and 1992 to determine whether the purpose for which the research was conducted moderated the results obtained. Studies initiated to defend an organization against a charge of employment discrimination consistently yielded the highest validities, while studies conducted by academics to test some theory or model of performance prediction yielded the lowest validities (Russell et al., 1992 Journal of Applied Psychology). Our interpretation of these results was that primary researchers make "judgment calls" at multiple points in a research project that never get reported (see examples in Russell et al., 1992). Apparently, the motivating force that leads a researcher to conduct a study shapes these judgment calls in ways that influence the validity results obtained. In the case of the Nassau technical report, it is clear that these judgment calls were made in such a way as to minimize the presence of cognitive ability in any part of the selection procedure. It is equally clear that the information needed to understand these judgment calls is missing from the report. If this were a journal article, no subsequent investigator could replicate or extend this research, due to the lack of information about what was done and why. Unless the Justice Department's representative and the other advisors were actively involved in the day-to-day efforts of the research team, they would be hard pressed to evaluate what was going on using any criterion other than reducing bottom-line adverse impact.

I am not as concerned with the complaint that the authors ignored prior cumulative knowledge captured in meta-analytic results. It seems clear that the authors did use prior cumulative knowledge in deciding to minimize the presence of cognitive ability in the predictor domain. Others have also spent much more time than I have examining the study's various post hoc "corrections" for statistical artifacts, so I must bow to their expertise in critiquing the technical report authors' correction formulae; I would not immediately embrace any one position.

Finally, I have not read the technical report with the same level of attention I apply as a referee of manuscripts submitted to scholarly journals, which means that there may be additional problems that would come to light under such scrutiny.