TESTIMONY OF LINDA S. GOTTFREDSON ON THE DEPARTMENT OF JUSTICE'S (DOJ) INVOLVEMENT WITH THE 1994 NASSAU COUNTY POLICE ENTRANCE EXAMINATION
BEFORE THE CONSTITUTION SUBCOMMITTEE, JUDICIARY COMMITTEE, HOUSE OF REPRESENTATIVES
TUESDAY, MAY 20, 1997, 1:00 P.M.
Mr. Chairman, members of the subcommittee, my name is Linda Gottfredson. I appreciate the opportunity to be here today to share my concerns with you. I am a professor at the University of Delaware, and have spent most of my career researching the aptitude demands of work and the social dilemmas involved in employment testing. I am best known in personnel selection psychology for detailing why those testing dilemmas exist and how they can sometimes lead to the misuse of science for political ends. I conduct basic research on the nature of human talent and analyze its policy implications. One example is the set of articles I have written, both in scientific journals and in the Wall Street Journal, on race-norming and its use by the federal government. That practice was subsequently banned by the Civil Rights Act of 1991.
For the past 12 months I have been investigating a new form of quota-hiring that the Justice Department developed in Nassau County, NY, and is now pressing on police departments nationwide. I have analyzed the technical details of the 1994 Nassau County exam, read pertinent court documents, and interviewed test developers nationwide. I have reported my findings in the Wall Street Journal, in scientific meetings, and in a scientific article that is being published later this year (Gottfredson, 1996a,b, 1997b, in press). I myself play no role in developing employment tests, and I have no financial stake in producing or evaluating any of them.
Before describing my findings, I will first note that although I have dealt with controversial topics in my career, I have never encountered so much fear--visceral fear--as I have while investigating the Justice Department's involvement in police testing. Most test developers and the public agencies they serve feel too vulnerable to DOJ retribution to make public their complaints about improper actions by its Civil Rights Division. These people may have to be assured of confidentiality or protection if the full story is to be told about how DOJ has been abusing its power to promote quota hiring.
Like many other experts in personnel selection psychology, I believe that DOJ's Civil Rights Division is operating outside the law. It does so in two ways: in the ends it seeks and in the means by which it pursues those ends. The ends are quota-driven hiring, and the means include various forms of intimidation and harassment.
DOJ'S NEW FORM OF QUOTA-HIRING
DOJ has recently revolutionized its decades-old pursuit of quota-hiring. It has done so by developing a new form of race-driven testing. DOJ developed it several years ago in Nassau County for police hiring and is pressing it on police departments nationwide. The new procedure involves picking and choosing test content according to how well different races tend to score on that content rather than according to the importance of the skills being measured. Content on which all groups do about equally well (such as sections asking candidates to report their personality and interests) is retained. However, sections on which whites and Asians tend to do better (for example, reading, reasoning, and problem solving) are eliminated or given very little weight. The result is the virtual elimination of mental standards, no matter how critical they are. Only with such racial gerrymandering of test content can it be guaranteed that all races will score about equally well despite having unequal levels of crucial skills.
Racial gaps in job-related skills are well-known in the testing field. The Department of Education, for example, routinely alerts us to the still substantial racial-ethnic disparities in literacy and higher-order thinking skills among both adolescents and adults (for example, Kirsch, Jungeblut, Jenkins, & Kolstad, 1993). Those skills gaps have created havoc in employers' efforts to hire equal proportions of all races while retaining race-neutral merit hiring. Indeed, test developers have struggled unsuccessfully for decades to develop tests of essential cognitive skills that have little or no disparate impact. Lacking such solutions to their testing dilemma, many organizations used to race-norm their tests. Race-norming is the practice of ranking candidates only against others of their own race. This allowed employers to obtain the same distribution of test scores for all races despite sometimes large racial gaps in the skills being measured. The practice is explicitly race-conscious and was banned by the 1991 Civil Rights Act.
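The difference between ranking candidates against the whole applicant pool and ranking them only within their own group can be sketched in a few lines of Python. This is a minimal illustration only: the group labels "X" and "Y", the candidate names, the scores, and the function names are all hypothetical and are not drawn from any actual exam.

```python
from collections import defaultdict

def percentile_rank(score, pool):
    """Percent of scores in `pool` that fall at or below `score`."""
    return 100.0 * sum(s <= score for s in pool) / len(pool)

def pooled_ranks(candidates):
    """Rank every candidate against the entire applicant pool."""
    pool = [score for _, _, score in candidates]
    return {name: percentile_rank(score, pool) for name, _, score in candidates}

def within_group_ranks(candidates):
    """Rank every candidate only against members of the same group."""
    pools = defaultdict(list)
    for _, group, score in candidates:
        pools[group].append(score)
    return {name: percentile_rank(score, pools[group])
            for name, group, score in candidates}

# Hypothetical applicants: (name, group, raw score). Invented for illustration.
candidates = [
    ("c1", "X", 60), ("c2", "X", 70), ("c3", "X", 80), ("c4", "X", 90),
    ("c5", "Y", 50), ("c6", "Y", 60), ("c7", "Y", 70), ("c8", "Y", 80),
]

pooled = pooled_ranks(candidates)
within = within_group_ranks(candidates)

for name, group, score in candidates:
    print(f"{name} group={group} raw={score} "
          f"pooled={pooled[name]:.1f} within-group={within[name]:.1f}")
```

In this made-up data the top scorer in each group receives the same within-group percentile even though their raw scores differ, which is the sense in which within-group ranking yields the same distribution of reported scores for every group.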
The racial gerrymandering of test content differs from race-norming by building in quotas at the front end so that scores do not have to be adjusted to obtain equal results by race. It is an especially insidious form of race-driven testing because the quotas are covert, because they are achieved by drastically lowering hiring standards for all candidates, and because both the quotas and the destruction of standards are disguised under layers of complicated but dishonest science. By essentially eliminating standards for candidates of all races, it promises to devastate the quality of policing nationwide. As Frank Schmidt, a leading industrial psychologist, wrote last fall in the Wall Street Journal (Schmidt, 1996a), Nassau's test will "be a disaster wherever it is used."
DOJ claims, however, that the new test nearly eliminates disparate impact while improving merit hiring. This claim for improved validity is false. Many decades of research have shown that test batteries can only be degraded by eliminating mental standards when the jobs in question are at least moderately complex, which certainly includes police work. However, the pretense that the Nassau test is highly valid (job-related) is absolutely crucial to DOJ's new strategy for forcing quota-hiring on police departments. DOJ claims that its Nassau tests are as valid as others (which is not true) but that they have little impact against blacks (which is true). Under civil rights law and regulation, when two selection devices serve an employer's needs equally well, the employer must use the one that screens out fewer minorities. Thus does DOJ transform the Nassau test into its legal trump card for enforcing quota-hiring in other jurisdictions. As will be discussed, the Nassau test does not actually meet the equal validity test, but DOJ treats it as if it does.
Race-motivated selection of test content violates professional testing standards, of course, and probably the 1991 Civil Rights Act as well. However, it can yield the equal racial results that DOJ seems to insist on. It also is clever because there is no need to go through the intermediate and now illegal step of adjusting the scores by race (race-norming) in order to produce racial equality in test results despite racial gaps in essential skills.
The performance of police officers is crucial to their communities' safety and faith in government. As in medicine and piloting, mistakes in police work can have very severe and very public consequences: the loss of life, property, and public trust. That DOJ would pursue quotas at the expense of merit hiring in such sensitive work as policing illustrates how detached from law and common sense the Civil Rights Division has become.
DOJ DEVELOPS A RACIALLY GERRYMANDERED TEST IN NASSAU COUNTY
Let me distinguish what is old from what is new concerning DOJ's activities in Nassau County. What is old is the way in which DOJ has pressured Nassau County into 20 years of consent decrees. Nassau County recently entered its third decade of the Justice Department investigating and reshaping its police hiring. The entrance exams developed pursuant to the county's first two consent decrees both had disparate impact against blacks. In both cases, the test developers provided evidence that their tests were job related and therefore lawful despite their disparate impact. But in both cases, court records suggest that Justice allowed the supporting research data to be manipulated in order to find some pretext by which plaintiffs could claim that the tests were not job-related after all. Such opportunistic fishing expeditions in the validation data are not scientifically justified, but they allowed plaintiffs to argue that the tests should be rescored, as both were, in a manner that increased minority scores.
This is the second half of Justice's old one-two punch. The first punch, as described in Mr. Flick's testimony concerning the City of Torrance, is to intimidate departments with threats and crushing requests for information. The second is to strip them of their legitimate defenses. As already described, DOJ's new race-driven testing procedure greatly increases the power of that second punch.
What is unprecedented about the Nassau case is that under Nassau's third consent decree, in 1990, DOJ itself became a partner in creating the county's next police entrance examination, which was administered in 1994. As DOJ special counsel John M. Gadzichowski explained to the court:
"[M]y department made a decision to break ground....We thought that rather than coming in and challenging an exam every two and three years, so to speak, knocking it out, then coming back three years hence to look at another exam, we would participate in a joint test development project" (U.S. v. Nassau, 1995, p. 20).
The implications of this unprecedented step are hard to overstate. DOJ has entered the test development business, with reverberations across the country. And its product, which it would soon promote nationwide, was the new quota-hiring mechanism I have already described.
In Nassau County the race-based test construction effort took the form of a four-year, multimillion-dollar effort by 10 experts, 5 of them DOJ's, to strip Nassau County's experimental test battery of all crucial cognitive demands. The county's experimental test battery, which was administered to 25,000 candidates in 1994, had consisted of 25 tests, five of them mental tests. The test scores that candidates received in late 1995, however, were based on only nine of those 25 tests. Eight of the nine retained tests were fakable personality questionnaires (for example, "Achievement Motivation," "Openness to Experience," and "Emotional Stability"). The ninth, a reading test, had been rescored pass-fail with the passing score set at the lowest level possible: to pass, candidates had only to read as well as the worst one percent of readers on the police force. Moreover, candidates had been given up to a month (but the police officers only a week) to study the reading material on which they were tested. Candidates who failed this rock-bottom first-percentile mental standard did not necessarily score low on the exam, however, because the reading score contributed relatively little to their total score on the exam.
Denuding police exams of all meaningful mental demands is comparable to urging universities to admit students according to their high school grades in non-academic subjects like gym and art, but to ignore their grades in math and science. To avoid the appearance of having totally ignored academic performance, however, a little credit would be given for getting at least a D in social studies.
With such standards, it is not surprising that many highly qualified Nassau candidates--for example, lawyers, experienced police and probation officers--earned very low or failing scores on the new test. Or, worse yet, that the test's top scorers included an unusual number of people who were semi-literate, had outstanding arrest warrants, or refused to take the drug or lie-detector tests.
The new test actually turns out to work little better than simply picking applicants at random, but the DOJ-Nassau test development team produced a long technical report in July 1995 extolling the new exam's supposedly high validity but minuscule disparate impact. The special counsels for DOJ and the county both lauded the test in seeking approval for its use from the U.S. District Court. Mr. Gadzichowski, DOJ's representative in the original 1977 lawsuit and subsequent consent decrees, testified that "it's beyond question that the examination...is valid" and that "it's the closest ['to a perfect exam, vis-a-vis the adverse impact'] that I've seen in my years of practice" (U.S. v. Nassau, 1995, pp. 22-24, 26). Possibly reflecting what he had been led to believe, William H. Pauley III, the county's special counsel to the police department over the many years of Justice Department litigation, was even more fulsome in his praise:
"The 1994 Examination is now recognized by DoJ and industrial psychologists as the finest selection instrument for police officers in the United States" (Hayden v. Nassau, 1996, pp. 15-16).
Mr. Pauley was correct about DOJ's opinion of the test, but he would soon see his county's new test become the object of unprecedented outrage and ridicule within the field of industrial psychology. More importantly for Nassau County, its rigorous training academy would soon be receiving less able recruits, posing for it the unpleasant choice between failing many more cadets or lowering academy standards. Nassau County police unions would also become very concerned and wonder if they and the county, all signatories to the consent decree, had been misled about the value of the new test.
DOJ USES NASSAU TEST AS TRUMP CARD FOR QUOTAS NATIONWIDE
The Justice Department routinely denies that it promotes any particular test or test developer, but the facts prove otherwise. Indeed, DOJ has a history of doing just that (for example, see O'Connell & O'Connell, 1988, on how the Department of Justice pressured the City of Las Vegas to use the firm of Richardson, Bellows, and Henry).
Soon after the new exam received the District Court's approval in Fall 1995, the Justice Department began encouraging other police departments around the nation to adopt some version of the Nassau test. The consulting firm which had led development of the Nassau test (at that time named HRStrategies) simultaneously issued a widely-circulated invitation in Spring 1996 (Aon Consulting, undated) urging other police departments to join a test validation consortium. It stated that the project's objective "is to produce yet additional refinements to the Nassau County-specific test, and to reduce even further the level of adverse impact among minority candidates" (p. 6). The announcement concluded by stressing the legal advantages of joining the consortium: "Ongoing review of the project by Department of Justice experts will provide a device that satisfies federal law" (p. 7).
David Jones' Detroit-based Aon Consulting, together with two of the other firms involved in developing the Nassau test (Leaetta Hough's The Dunnette Group of Minneapolis and Erich P. Prien's Performance Management Associates of Memphis) developed a second generation Nassau test for the Louisiana State Police during 1996. That test was developed with the assistance and "oversight" of the same two long-time DOJ consultants involved in the Nassau project (Irwin Goldstein of the University of Maryland, College Park, and Bernard Siskin of the Center for Forensic Economic Studies, Philadelphia). DOJ has urged other state police agencies, cities, and counties from coast to coast, north to south, to switch to the Nassau test or its Louisiana progeny. DOJ persuasion has included threatening some jurisdictions with serious consequences (for example, a DOJ pattern and practice lawsuit or refusal to end consent decrees) if the jurisdictions hire particular testing firms named by DOJ, but offering benign outcomes if they use Aon's services.
Some jurisdictions have already felt the consequences of refusing DOJ's advice. Just months after the court approved the Nassau County test in 1995, the NAACP threatened to sue the New Jersey State Police for discrimination, but suggested that litigation might be prevented if the State Police would consider switching to the Nassau County test (letter from Joshua Rose, of the law firm representing the NAACP, to Katrina Wright, NJ Deputy Attorney General, February, 1996, p. 2). Although the test the New Jersey State Police currently uses had itself been developed and adopted several years earlier at the urging of the Justice Department, then represented by David Rose (father and now law partner of Joshua Rose), it screened out more minority applicants than did the Nassau test. The New Jersey State Police refused to change its test and was sued on June 24, 1996 (NAACP v. State of New Jersey, 1996).
Suffolk County, NY, uses the same test as does the New Jersey State Police, also at DOJ's insistence for the purpose of increasing minority hiring. DOJ is now attempting to cripple that county's ability to hire any police with the test it administered in June, 1996, to over 34,000 candidates.
DOJ has apparently been trying to moot the growing controversy over the poor quality of the Nassau test by simply switching its advocacy to Nassau's progeny, the Louisiana test. What matters, however, is that both tests are racially gerrymandered and both are technically flawed (Wollack, 1997). It is the race-driven mode of test construction, not any particular test developed with it, that provides DOJ its legal trump card.
EVALUATION OF NASSAU TEST BY INDEPENDENT SCHOLARS
It is useful to look briefly at the Nassau test to see the racial gerrymandering process at work and how science is misused to disguise it.
The new Nassau test might never have come to the attention of personnel selection scientists had DOJ not been promoting it so vigorously and pressuring police departments to switch their business to Aon Consulting. Professionals in the test development community quickly became concerned about DOJ's new interference in test development. The first few who were able to obtain technical information about the Nassau test and how it worked in practice were stunned. Afraid to speak publicly, they called upon selected academics in June 1996 to independently evaluate the long technical report describing Nassau County's new test.
I was one of the academics called. We all read the report independently of one another, without prior knowledge of who the project consultants were, without prior information about the report's contents or origins, and without compensation offered, expected, or received. After reading the report, I obtained court records and interviewed a variety of people in Nassau County and test developers nationwide. In the following months three university researchers wrote critiques of the new test: myself (Gottfredson, 1996a, b, c; see also Gottfredson, 1997a, 1997b, in press), Craig J. Russell, who is J. C. Penney Chair of Business Leadership at the University of Oklahoma (Russell, 1996), and Frank L. Schmidt, who is Ralph L. Sheets Professor of Human Resources at the University of Iowa (Schmidt, 1996a, b; see also Schmidt, 1997).
Those evaluations were all highly critical of the report and the test it described. The unanimous opinion was that the concern for hiring more minorities had overridden any concern with measuring essential skills. As Craig Russell (1996) wrote, the "major impression...[is that] all decisions in the Nassau study were driven by impact adjustments." The three commentators' suspicion that the test had been shaped more by Justice's expectations than professional considerations was confirmed by one of Aon's own vice presidents (quoted in Zelnick, 1996, pp. 110-111):
"Through 18 years and four presidents the message from the Justice Department [in its litigation with Nassau County] was clearly that there was no way in Hell they would ever sign onto an exam that had an adverse impact on blacks and Hispanics. What we finally came up with was more than satisfactory if you assume a cop will never have to write a coherent sentence or interpret what someone else has written. But I don't think anyone who lives in Washington [DC] could ever make that assumption" (pp. 110-111).
In referring to the aftermath of Washington DC's many years of lax hiring, Aon's representative was echoing Frank Schmidt's prediction of disaster for Nassau County. Among other problems, Washington DC had developed a "notorious record for seeing felony charges dismissed because of police incompetence in filling out arrest reports and related records" (Zelnick, 1996, p. 111).
The first and most obvious sign that the Nassau test had been racially rigged was that it excluded precisely what both decades of scientific research and the DOJ-Nassau team's job analysis indicated the test must include--good measurement of cognitive skills. A second troubling sign was that the team's 1995 technical report failed to report the research results that are essential for independently evaluating a test. The report was extraordinarily long and complicated (with hundreds of pages of text, tables, and appendices), but that served only to divert attention from its many omissions. Its omissions were massive, and violated many requirements in the DOJ's own Uniform Guidelines and both sets of professional test standards (see Table 1). The DOJ-Nassau consultants were well aware of those standards, many of them having helped to write the standards.
Although the missing data preclude definitive judgments about certain technical matters, close study of the report with a technically trained eye illuminates how the project had been pressed toward a political purpose.
The project began well enough. It started with a detailed analysis of the duties and skills required of Nassau County police officers (see Table 2). That analysis revealed that a broad range of skills is important. However, it showed something else that such studies invariably do--that higher-order thinking skills (what the team labelled "reasoning, judgment, and inferential thinking") are especially critical. This is one reason why most police departments test for cognitive skills. Another reason is that general mental ability is the best predictor of how quickly and how well recruits learn the job. The DOJ-Nassau project, however, had conspicuously omitted any mention of aptitude for training and for keeping up with new laws and police technologies.
STRIPPING THE TEST OF COGNITIVE CONTENT. The test managers next made a long series of technical decisions about how to test for the skills its job analysis showed were critical to good police work. Many of those decisions were odd, but all had the same effect: to minimize the presence of mental demands in the test battery. In fact, most were explicitly justified as ways to lower disparate impact. Five deserve special mention.
1. Despite demonstrating the critical importance of mental skills, the project chose not to include in its experimental battery any of the many good tests of mental ability known to be job-related. The reason given was that "traditional paper-and-pencil" tests have disparate impact.
2. The DOJ-Nassau project instead developed its own, weaker measures of mental ability, which it simplified even further by administering them in "innovative" formats (for example, by creating video tests that required no reading or writing and by making the reading test's material available to candidates for study up to 30 days before the exam).
3. The project packed the experimental battery mostly with (paper-and-pencil) personality questionnaires that have uncertain job-relatedness but little or no disparate impact, extolling their "innovativeness."
4. The DOJ-Nassau project then calculated the job-relatedness of the 25 tests in a non-standard, improper way that was guaranteed to boost the apparent validity of the personality tests but depress the apparent validity of the cognitive tests. This statistical sleight-of-hand allowed the project to justify eliminating virtually all cognitive content for lack of useful validity.
5. The project managers already knew how much disparate impact each test had when they winnowed the tests, because they had taken the "unique" step of putting the cart before the horse: they gave the experimental 25-test exam to the 25,000 candidates before determining in the research sample of 508 police officers whether any of those tests were worth keeping (virtually none were, as the team recently revealed [Dunnette et al., 1997, Table 1]). The only reason the report gives for this reversal of accepted practice is that the project did not want to inadvertently keep tests with much impact and discard ones with little impact.
It is important to point out that the racial gerrymandering of test content was accomplished here, as it usually will be, by a long series of decisions that may individually seem unremarkable. What is remarkable about them is the clear pattern they reflect: a consistent bias toward minimizing cognitive content for supposedly strictly scientific reasons.
Frank Schmidt (1996b) complained that "the biggest and most glaring conceptual problem [with the study] is the complete failure to draw on the cumulative scientific literature in any way." Craig Russell (1996) was less charitable: "It seems clear that the authors did use prior cumulative knowledge [but] in deciding to minimize the presence of cognitive ability in the predictor domain."
INFLATING THE APPARENT VALUE OF THE COGNITIVELY DENUDED EXAM. Denuding a test battery of crucial cognitive content is guaranteed to suppress its validity and perhaps remove it as a contender for the status of an "equally valid alternative with less disparate impact." This would clearly create a problem for DOJ. Unless it qualifies as an "equally valid alternative," DOJ cannot use a racially gerrymandered test as its legal trump card. That potential problem was avoided when DOJ and the test development team conveniently made a string of statistical errors which grossly overstated the value of the new exam, and which misled the judge who approved it.
1. The Nassau-DOJ test development project made three major errors in deciding how to calculate the validity of its preferred test battery (the final battery which consisted of 8 personality questionnaires and the low reading standard). All three errors had the effect of inflating the test battery's apparent value. Frank Schmidt (1996b, 1997) has detailed those errors. His bottom line conclusion is that, although the team claimed that its battery had a validity level of .35 (on a scale from 0 to 1.0), .14 is a more accurate estimate. That small value is not useless, but close to it. Moreover, .14 is itself surely an overestimate because, recall, the project had already improperly boosted the apparent value of the individual personality tests that dominated the final battery. In short, the test development team had overstated the validity of its exam by a factor of 2 to 3.
The team has since admitted one of its three errors (Dunnette et al., 1997), the one that is hardest to deny in view of the fact that one of its members had previously written a journal article about avoiding that very error.
2. DOJ's John Gadzichowski (U.S. v. Nassau, 1995, p. 23) then committed his own statistical error which understated by a factor of 2 the value of the county's two previous exams. In testimony based on this mistake, he told the U.S. District Court that the new Nassau test is over twice as valid as the previous ones. In truth, however, it is probably much less valid. Moreover, the new DOJ-Nassau battery is less valid (.14) than the typical "traditional paper-and-pencil" cognitive test (about .25 in police work) that the DOJ-Nassau team said it did not even consider using for reasons of disparate impact and which it later denigrated as having "very poor" validity for police work.
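The arithmetic behind these comparisons can be checked directly. The short Python sketch below uses only the three validity figures quoted above (.35, .14, and about .25). The variable and function names are illustrative, and the squared-correlation reading (share of variance in job performance accounted for) is a standard statistical interpretation added here for context rather than a figure taken from the reports cited.

```python
# Validity coefficients quoted in the testimony (correlations, 0 to 1.0 scale).
claimed_validity = 0.35      # value claimed by the test development team
reestimated_validity = 0.14  # Schmidt's re-estimate for the same battery
typical_cognitive = 0.25     # typical traditional cognitive test, police work

def overstatement_factor(claimed, reestimated):
    """How many times larger the claimed validity is than the re-estimate."""
    return claimed / reestimated

def variance_explained(r):
    """Squared correlation: share of performance variance accounted for."""
    return r ** 2

factor = overstatement_factor(claimed_validity, reestimated_validity)
print(f"Claimed validity is {factor:.1f} times the re-estimate")  # 2.5

for label, r in [("claimed battery", claimed_validity),
                 ("re-estimated battery", reestimated_validity),
                 ("typical cognitive test", typical_cognitive)]:
    print(f"{label}: r = {r:.2f}, r^2 = {variance_explained(r):.4f}")
```

The ratio of .35 to .14 is 2.5, which falls within the "factor of 2 to 3" stated above, and the re-estimated battery validity of .14 is lower than the roughly .25 cited for a typical cognitive test.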
FAILING TO PROVIDE RESULTS REQUIRED BY FEDERAL REGULATIONS AND REQUESTED BY SCIENTIFIC PEERS. The first four pages of the technical report repeatedly stress that it was written to allow a "detailed technical review of the project" (p. 2) and even be "understandable to readers not thoroughly familiar with the technology" (p. 3). As already noted, the report fails to provide even the most basic data that DOJ's own Uniform Guidelines require. The report also fails to explain decisions (such as setting the passing score for the reading test at the first percentile) for which the Guidelines also require justification. As Craig Russell (1996) wrote, there is "a clear selective presentation of information."
During 1996, independent test developers reported being unable to get copies of the 1995 technical report despite DOJ promoting the Nassau test to their clients. It seems that DOJ and Aon Consulting were both telling inquirers to request information from the other. Only after complaints about the test appeared in the Wall Street Journal (October 24, 1996) did Aon finally make copies of the report available.
After finally obtaining the report, independent researchers and test developers began asking for the missing research results. In response, DOJ and Aon officials both stated that Aon would supply the missing information upon request. Aon has not done so. In fact, when pressed publicly for some of the data at an April scientific meeting, Aon's president, David Jones, replied that he would not release any further results without DOJ's permission. DOJ itself has yet to respond to requests that Aon release the results that DOJ's own Uniform Guidelines required it to report in the first place.
SCIENTIFIC COVER AND GRAVY TRAINS
DOJ's new strategy for requiring quota-hiring differs from the old also by requiring the complicity of employment testing professionals. DOJ's trump card crumbles unless DOJ can claim that its non-impacting tests are at least as valid as others. But any claim to validity rests on the testimony of professional experts, because establishing a test's job-relatedness is a highly technical matter. That testimony is also more credible if it comes from experts not routinely employed by DOJ.
DOJ has attempted to forge such an alliance--its overwhelming power alloyed with the authority of scientific professionalism--for at least ten years. DOJ's first attempt in the late 1980s failed when the test developer in question (Arlington-based Richardson, Bellows, and Henry [RBH]) refused to accede to DOJ's pressure to dramatically reduce the cognitive content of its test for strictly race-based reasons. According to RBH president Frank Erwin, that disagreement resulted in RBH getting off what would have been "a gravy train."
With the 1990 consent decree in Nassau County, DOJ seems to have found a more accommodating test developer. Aon was well acquainted with DOJ's insistence on quota-hiring. It is the very same firm that had evaluated or created Nassau County's ill-fated 1983 and 1987 tests. (At that time Aon Consulting was named, respectively, Personnel Designs and HRStrategies.) Its president knew very well that, as his vice president put it, "there was no way in Hell that [DOJ] would ever sign onto an exam that had an adverse impact on blacks and Hispanics." DOJ is now providing access to police departments around the country for Aon and its two companion firms from the DOJ-Nassau County project. In return, DOJ gets scientific cover for insisting on race-driven testing.
But that scientific cover holds only as long as the alliance can continue working beyond public scrutiny. The DOJ-Nassau team's scientific peers have been rebuffed in their attempts to obtain the information necessary for verifying the team's claims. A team confident of its work would have been anxious to make that information available. As for DOJ, rather than answering legitimate questions, its officials imply, in official communications, that the tests' critics are driven by greed and bigotry.
PRESSURE, POLITICS, AND PROFITS
In summary, DOJ's Civil Rights Division has been spending federal tax dollars to strong-arm police departments into quota-hiring, in the process lowering their hiring standards and costing their jurisdictions millions of dollars.
DOJ has no business entering the test development business and coercively marketing the tests that it helps develop. Its interference in that market is comparable to the FDA working jointly with a major drug company to manufacture new drugs, ignoring its own regulations for assessing their quality, creating new rules that only that company's drugs can pass, punishing whistle-blowers who expose disguised hazards and inflated claims about the drugs' efficacy, and then coercively marketing the flawed products to hospitals coast to coast.
DOJ's political agenda is also extralegal. Federal employment discrimination law and regulation do not allow, let alone require, the elimination of disparate impact at any price; under the law, validity always trumps disparate impact. Not so for DOJ, which always seems to find a way to manipulate the appearance of validity up or down depending on the test's disparate impact. Cognitive tests have disparate impact, but decades of research show that they are often highly valid. The research also indicates that their elimination can wreak havoc on the employing organization and those it serves.
The problem of disparate impact arises not from flawed tests, but from racial gaps in job-related skills that the tests reveal. As already noted, the U.S. Department of Education and educational reformers continue to shower us with evidence of still distressingly large racial gaps in literacy and higher-order cognitive skills. Our efforts should go into reducing these skills gaps, not hiding them with phony tests.
DOJ, however, steadfastly refuses to recognize this challenging reality and the dilemma that skills gaps create for employers. For DOJ, "race-neutral testing" has nothing to do with whether a test measures relevant skills equally accurately in all races (the technical definition of a race-neutral or unbiased test); rather, for the Civil Rights Division, "race-neutral" simply means that a test yields equal scores for all races, whatever the relevant skill levels (for example, see DOJ lawyer John Gadzichowski's March 21, 1997, letter to Suffolk County's County Attorney Robert J. Cimino). DOJ accordingly is making it next to impossible to defend any meaningful mental standard, no matter how critical.
While Congress and the Clinton Administration try to improve public safety by putting 100,000 more police on the streets, DOJ's Civil Rights Division is working to eliminate mental standards in police hiring. Congress and the Administration have adopted the Police Corps program, which pays for college scholarships if the recipients agree to work four years afterwards as police officers. DOJ, in contrast, refuses to accept any educational standards for police beyond a high school diploma if such standards have disparate impact, which they usually do.
Police work is becoming increasingly complex. Progressive new forms of policing, such as community-based or problem-solving policing, require even higher levels of cognitive skill. Police hiring standards should be going up, not down. However, police departments will not be able to maintain even the standards they have today unless DOJ is called to account for its preference-driven misuse of authority, funding, and public trust.
REFERENCES
AERA/APA/NCME (1985). Standards for educational and psychological testing. Washington, DC: American Psychological Association.
Aon Consulting (Undated). HRStrategies entry-level law enforcement selection procedure design and validation project. Detroit, MI: Aon Consulting, Human Resources Consulting Group (formerly HRStrategies).
Dunnette, M., Goldstein, I., Hough, L., Jones, D., Outtz, J., Prien, E., Schmitt, N., Siskin, B., & Zedeck, S. (1997). Response to criticisms of Nassau County test construction and validation project (Draft). Unpublished manuscript. Available at www.ipmaac.org/nassau/
Equal Employment Opportunity Commission, Civil Service Commission, Department of Labor, & Department of Justice (1978, August 25). Uniform guidelines on employee selection procedures (1978). Federal Register, 43, No. 166.
Gottfredson, L. S. (1996a, December 10). New police test will be a disaster [Letter to the editor]. Wall Street Journal, p. A23.
Gottfredson, L. S. (1996b, October 24). Racially gerrymandered police tests. Wall Street Journal, p. A18.
Gottfredson, L. S. (1996c, September). The hollow shell of a test: Comment on the 1995 technical report describing the new Nassau County police entrance examination. Unpublished manuscript, University of Delaware. Available at www.ipmaac.org/nassau/
Gottfredson, L. S. (1997a). TDAC's defense of its Nassau County police exam makes my point. Unpublished manuscript, University of Delaware. Available at www.ipmaac.org/nassau
Gottfredson, L. S. (1997b). Vacuous defense of a hollow test: Commentary on the Nassau County police exam. Paper presented at the annual meeting of the Society for Industrial and Organizational Psychology, St. Louis, MO, April 12, 1997. Available at www.ipmaac.org/nassau/
Gottfredson, L. S. (in press). Racially gerrymandering the content of police tests to satisfy the U.S. Justice Department: A case study. Psychology, Public Policy, and Law.
Hayden et al. v. Nassau (1996, June 6). NY Supreme Court, Trial/I.A.S. Part 13 (Justice Goldstein), Index No. 14699/96. Affirmation in opposition by William Pauley.
HRStrategies (1995, July). Nassau County, New York: Design, validation and implementation of the 1994 police officer entrance examination. Project technical report. Detroit, MI: HRStrategies (now a division of Aon Consulting).
Kirsch, I. S., Jungeblut, A., Jenkins, L., & Kolstad, A. (1993, September). Adult literacy in America: A first look at the results of the National Adult Literacy Survey. Washington, DC: U.S. Department of Education, Office of Educational Research and Improvement.
NAACP v. New Jersey State Police, 1996. EEOC Charge No. 171-94-0124.
O'Connell, R. J., & O'Connell, R. (1988, December 5). Las Vegas officials charge Justice Department with coercion in consent decrees. Crime Control Digest, 22(49).
Russell, C. J. (1996, July). The Nassau County police case: Impressions. Unpublished manuscript, University of Oklahoma. Available at www.ipmaac.org/nassau/
Schmidt, F. L. (1996a, December 10). New police test will be a disaster [Letter to the editor]. Wall Street Journal, p. A23.
Schmidt, F. L. (1996b, July). Some comments on the Nassau County police validity case. Unpublished manuscript, University of Iowa. Available at www.ipmaac.org/nassau/
Schmidt, F. L. (1997). Comments on the 1997 SIOP symposium on the Nassau County police test. Unpublished manuscript, University of Iowa. Available soon on www.ipmaac.org/nassau/
Society for Industrial and Organizational Psychology, Inc. (1987). Principles for the validation and use of personnel selection procedures. (Third Edition) College Park, MD: Author.
U. S. v. Nassau County (1995, September 22). CV 77 1881, U.S. District Court, Eastern District of New York. (Transcript of hearing).
Wollack, S. (1997). The Nassau test: The Justice Department's latest weapon in the war against merit systems. Paper presented to the International Personnel Management Association's Southern Regional Conference, Corpus Christi, TX, April 28, 1997.
Zelnick, R. (1996). Back fire: A reporter's look at affirmative action. Washington, DC: Regnery Publishing.
Table 1
Major Test Development and Documentation Standards Not Met by Technical Report for Nassau County Exam
-----------------------------------------------------------------
Information required by the federal government's Uniform
Guidelines (Equal Employment Opportunity Commission et al., 1978)
-----------------------------------------------------------------
15.B.2 description of existing selection procedures
    No comparisons of new procedure with old. Tech report
    refers readers to 1988 report that is not attached.
15.B.8 means and standard deviations
    Not reported for 16 tests winnowed out of experimental
    battery or by race for any test.
    Not reported for any of the trial batteries tested or used.
15.B.8 intercorrelations among predictors and with criteria
    Not reported for either applicants or incumbents.
15.B.8 unadjusted correlation coefficients
    Not reported for any of the 25 tests.
15.B.8 basis for categorization of continuous data
    No basis given for 1st percentile reading cutoff.
15.B.10 weights for different parts of selection procedure
    Regression weights not reported.
----------------------------------------------------------------
Procedures/data/explanations recommended by professional testing
standards
----------------------------------------------------------------
APA Test Standards (AERA/APA/NCME, 1985)
Primary:
1.11 For criterion-related studies, provide basic statistics
     including measures of central tendency and variability,
     relationship, and a description of any marked nonnormality
     of distributions
1.17 When statistical adjustments made, report both the
     unadjusted and adjusted results
6.2  Revalidate test when conditions of test administration
     changed
10.9 Give clear technical basis for any cut score
Secondary:
3.12 Provide evidence from research to justify novel item or
     test formats
3.15 Provide evidence on susceptibility of personality
     measures to faking
_______________________________________________________________
SIOP Principles (Society for Industrial
and Organizational Psychology, 1987)
Procedures in Criterion-Related Study:
4c Test administration procedures in validation research
   must be consistent with those utilized in practice (p. 14)
5d Regression equations should be adjusted using the
   appropriate shrinkage formula (p. 17)
5e Criterion-related studies should be evaluated against
   background of relevant research literature (p. 17)
Research reports:
2  Deficiencies in previous selection procedures (p. 29)
9  Summary statistics including means, standard deviations,
   intercorrelations of all variables measured, with
   unadjusted results reported if statistical adjustments made
   (pp. 29-30)
(Summary) Provide enough detail in technical report to allow
   others to evaluate and replicate the study (p. 31)
Use of Research Results:
12 Take particular care to prevent advantages (such as
   coaching) that were not present during validation
   effort. If present, evaluate their effect on validity (p. 34)
------------------------------------------------------------------------
Table 2
Description of Job Duties of Nassau County Patrol Officer (Source: HRStrategies, 1995, pp. 14-15)
_________________________________________________________________
[P]atrol officers have primary responsibility for detecting
and preventing criminal activity...and for enforcement of
vehicle and traffic laws...Patrol officers also are charged
with responsibility for rendering medical assistance to ill
or injured citizens...[including] severely injured, mentally
ill, intoxicated, violent or suicidal individuals....[They]
must pursue ['and take into custody'] individuals suspected
of criminal activity....[and] have knowledge of the laws and
regulations governing powers of arrest and the use of force
so as to avoid endangering the public, or infringing upon
individuals' rights....Patrol officers...[must] carry out a
variety of responsibilities to manage the [crime]
scene...includ[ing] the identification and protection of
physical evidence, identification and initial questioning of
witnesses or victims....[and] often communicate information
they obtain...to detectives...and others. [They] are
regularly assigned to deal with a wide variety of complex
emergency situations requiring specialized knowledge and
training....In some cases, an immediate, decisive
action...may be required to protect life or property, or to
thwart criminal activity. Patrol officers...document
extensively their observations and actions...and provide
statements and court testimony in criminal matters.
