The U.S. Equal Employment Opportunity Commission

Meeting of May 16, 2007 - Employment Testing and Screening


Employers seeking to make hiring and promotion decisions based on merit frequently use tests to establish a list of eligible candidates. The recent spate of litigation over these tests underscores that they are prone to disparate impact challenges. The Supreme Court first described the disparate impact theory in Griggs v. Duke Power Co., 401 U.S. 424, 431-2 (1971) (Title VII “proscribes not only overt discrimination but also practices that are fair in form, but discriminatory in operation. The touchstone is business necessity. . . . [G]ood intent or absence of discriminatory intent does not redeem employment procedures or testing mechanisms that operate as ‘built-in headwinds’ for minority groups and are unrelated to measuring job capability.”). The allocation of proof in a disparate impact case is as follows:

  1. Prima facie case: The plaintiff must prove, generally through statistical comparisons, that the challenged practice or selection device has a substantial adverse impact on a protected group. See 42 U.S.C. § 2000e-2(k)(1)(A)(i). The defendant can criticize the statistical analysis or offer different statistics.
  2. Business necessity: If the plaintiff establishes disparate impact, the employer must prove that the challenged practice is “job-related for the position in question and consistent with business necessity.” 42 U.S.C. § 2000e-2(k)(1)(A)(i).
  3. Alternative practice with lesser impact: Even if the employer proves business necessity, the plaintiff may still prevail by showing that the employer has refused to adopt an alternative employment practice which would satisfy the employer’s legitimate interests without having a disparate impact on a protected class. 42 U.S.C. § 2000e-2(k)(1)(A)(ii).

There are several methods of measuring adverse impact. One method is the EEOC’s Uniform Guidelines on Employee Selection Procedures, 29 C.F.R. § 1607 et seq. (“Uniform Guidelines”), which finds an adverse impact if members of a protected class are selected at a rate less than four-fifths (80 percent) of that of another group. For example, if 50 percent of white applicants receive a passing score on a test, but only 30 percent of African-Americans pass, the relevant ratio would be 30/50, or 60 percent, which would violate the 80 percent rule. 29 C.F.R. §§ 1607.4(D) and 1607.16(R). The 80 percent rule is more of a rule of thumb for administrative convenience, and has been criticized by courts. See 1 LINDEMANN AND GROSSMAN, EMPLOYMENT DISCRIMINATION LAW, at 92-94.
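The four-fifths comparison can be sketched in a few lines of Python. This is only an illustration of the arithmetic, using the pass rates from the example in the text; the function names and the `threshold` parameter are hypothetical choices made for this sketch and do not come from the Uniform Guidelines.

```python
def impact_ratio(protected_rate: float, comparison_rate: float) -> float:
    """Ratio of the protected group's selection rate to the comparison group's rate."""
    if comparison_rate <= 0:
        raise ValueError("comparison_rate must be positive")
    return protected_rate / comparison_rate


def falls_below_four_fifths(protected_rate: float, comparison_rate: float,
                            threshold: float = 0.8) -> bool:
    """True when the ratio is below the four-fifths (80 percent) rule of thumb."""
    return impact_ratio(protected_rate, comparison_rate) < threshold


# Pass rates from the example in the text: 30 percent versus 50 percent.
ratio = impact_ratio(0.30, 0.50)
print(f"impact ratio: {ratio:.0%}")
print("below four-fifths:", falls_below_four_fifths(0.30, 0.50))
```

Because the 80 percent figure is a rule of thumb, a result on either side of the threshold is a starting point for analysis rather than a legal conclusion.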

The courts more often find an adverse impact if the difference between the number of members of the protected class selected and the number that would be anticipated in a random selection system is more than two or three standard deviations. See 1 LINDEMANN AND GROSSMAN, at 90-91. The defendant may then rebut the prima facie case by demonstrating that the scored test is job-related and consistent with business necessity by showing that the test is “validated,” although a formal validation study is not necessarily required. 29 C.F.R. § 1607.5(B); see also Watson v. Fort Worth Bank & Trust Co., 487 U.S. 977, 998 (1988); Albemarle Paper Co. v. Moody, 422 U.S. 405, 431 (1975).
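The standard deviation comparison can be illustrated with a short Python sketch. It assumes a simple binomial model of random selection, which is one common way to frame the analysis and is not necessarily the method any particular court or expert used. The figures in the usage lines are hypothetical and chosen only to show the arithmetic.

```python
import math


def standard_deviations_from_expected(selected_protected: int,
                                      total_selected: int,
                                      protected_share_of_pool: float) -> float:
    """Binomial standard deviations between observed and expected selections.

    Assumes each selection is an independent draw in which a protected-group
    member is chosen with probability equal to the group's share of the pool.
    """
    if not 0 < protected_share_of_pool < 1:
        raise ValueError("protected_share_of_pool must be between 0 and 1")
    expected = total_selected * protected_share_of_pool
    std_dev = math.sqrt(
        total_selected * protected_share_of_pool * (1 - protected_share_of_pool)
    )
    return (selected_protected - expected) / std_dev


# Hypothetical figures: the protected group is 40 percent of the pool,
# 100 people are selected, and 25 of them are from the protected group.
z = standard_deviations_from_expected(25, 100, 0.40)
print(f"{z:.2f} standard deviations from the expected number")
```

A negative result means fewer protected-group members were selected than random selection would predict. With a small pool, an exact hypergeometric calculation may be more appropriate than this approximation.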

This memorandum summarizes significant, relatively recent cases from January 2000 through April 5, 2007, involving the use of scored tests in the employment context. It excludes polygraphs or so-called “honesty tests,” but includes physical strength and agility tests. The cases were located through a Westlaw ALLFEDS search, a LEXIS-NEXIS search in the GENFED Library, CURRENT File, and a search of Fair Employment Practice Cases (BNA), also on LEXIS-NEXIS. All cases have been checked for subsequent history, which is included in the citation. The cases are listed alphabetically and divided between courts of appeals cases and district court cases.


Adams v. Chicago,
469 F.3d 609 (7th Cir. 2006).

Minority Chicago police officers sued the City, claiming that an exam used for promotions to sergeant in 1997 had a disparate impact on minority candidates.

The challenged exam was created by an outside consultant and consisted of three parts: (1) multiple-choice questions covering the law, department procedures, and other regulations sergeants needed to know; (2) multiple-choice questions testing the administrative functions performed by sergeants; and (3) an oral exam based on a written briefing. To qualify for the third part of the exam, candidates had to perform well on the first two parts. Each part of the exam was weighted equally, and candidates were ranked by scores to create a promotion list, with the highest scorer entitled to the first promotion. The exam was first used in 1994. Because the court had previously denied an injunction requested by the plaintiff-officers to prevent continued use of the exam, it was used again in 1996 and 1997. About one month before the 1997 exam, a task force appointed by the Mayor recommended that the City change its promotions so that “thirty percent of promotions to sergeant be based upon merit [i.e., on-the-job performance as rated by supervisors], with the promotional tests used to assure a ‘minimum level of competence.’” The plaintiff-officers here challenge the exam’s use for 1997 promotions, because the City did not implement the task force’s recommendations.

The City conceded that the promotional exam had a disparate impact on minority officers who sought promotion to sergeant. The plaintiff-officers conceded that the promotional examination was job-related and consistent with business necessity, based on a previous court ruling from litigation involving a similar test used to promote sergeants to lieutenants. Therefore, the remaining issue is whether the plaintiff-officers proved that there was another available method of evaluation which was at least equally valid and less adverse that the City refused to use.

At trial, the plaintiff-officers offered two pieces of evidence to show that the City failed to use an alternative, available method of evaluation: (1) beginning in 1989, the City used merit to fill twenty percent of D-2 positions, which consist of police officers functioning as detectives, youth officers and gang crimes specialists, and (2) the City implemented the task force’s recommendation to base thirty percent of its officer-to-sergeant promotions on merit starting in 1998. The trial court found that promotion processes used after 1997 were inadmissible as subsequent remedial measures under Rule 407 and granted summary judgment for the City, because without such evidence, the plaintiff-officers could not demonstrate that considering merit was available or would result in at least equally job-related, less adverse promotions.

On appeal, the Seventh Circuit ruled that the trial court erroneously excluded the evidence as a subsequent remedial measure, holding that Rule 407 did not apply to disparate impact situations and that such evidence qualified as an exception to Rule 407’s exclusion by being admitted for another purpose, namely determining the availability of an alternative promotional method. The Seventh Circuit nonetheless upheld the ultimate decision below, because even while using the 1998 examination as evidence, the plaintiff-officers still could not show that the City had available to it an at least equally valid and less adverse exam by the 1997 promotions. The plaintiff-officers would have to show that the last officer promoted in the proposed merit-based selection process would be roughly as qualified as the officer last selected under the 1997 method. Even if the plaintiff-officers could show this, they could not show that the process for evaluating the officers based on merit for sergeant promotions was actually available for use during the 1997 promotions or that the City refused to adopt this alternative. The task force’s recommendation only one month prior to the 1997 promotional process was only prospective, and development of the merit-based process used in 1998 took approximately nineteen months. The Court further determined that, because the D-2 merit-based process, which did exist in 1997, involved promotions into non-supervisory positions, it could not be considered an available, at least equally valid method for promoting officers into the supervisory position of sergeant.

The dissent disagreed with the weighty burden the majority placed on the plaintiff-officers, reasoning that they only had to, and did, present evidence from which a reasonable jury could conclude that, at the time of the 1997 promotions, the City could have used a thirty-percent merit promotion process. The dissent disagreed that a reasonable alternative is not available merely because the defendant has not completed its own inquiry into the viability of the alternative—and that at least a material dispute of fact existed as to whether the City took proper steps to implement the task force’s recommendation promptly.

Allen v. Chicago,
351 F.3d 306 (7th Cir. 2003).

African-American and Hispanic police officers brought suit against the City, alleging it engaged in race-based discriminatory promotions in violation of Title VII. This case arose from a challenge to the 1998 sergeant’s promotional process by minority officers who were not selected for promotion from officer to sergeant. Two classes of plaintiffs challenged the process. Subclass A plaintiffs were minority officers who failed a written qualifying test and were not eligible for promotions. Subclass B plaintiffs were minority officers who passed the written qualifying test, but were not promoted because of the City’s thirty-percent ceiling on the number of merit-based promotions. Although both the written qualifying test and the thirty-percent ceiling on merit promotions had a disparate impact on minority officers, the officers conceded that each was job related and consistent with business necessity. Therefore, Subclass A and B plaintiffs each attempted to argue that other tests or selection devices, without a similarly undesirable racial effect, would also serve the employer’s legitimate interest in efficient and trustworthy workmanship.

Subclass A Plaintiffs

Subclass A officers challenged the written qualifying test prerequisite to merit promotions. Subclass A officers proposed that merit promotions be made without requiring a passing score on the written qualifying test. The court found that the officers lacked evidence to establish either that (1) merit promotions with no qualifying score prerequisite would be substantially equally valid as merit promotions with the qualifying score prerequisite, or (2) such promotions would be less adverse in their impact.

The written qualifying test was designed to measure the skills, knowledge and abilities required by a minimally qualified sergeant on the first day of the job. The merit promotions were validated with the qualifying test prerequisite. The City therefore argued that there was no evidence that merit promotions alone would be equally valid. Without an initial assessment of job knowledge, skills and abilities, an officer who lacked the minimum level of competence might be promoted.

The officers argued that the City could eliminate the test and train nominators, who were already trained to assess meritorious traits, also to assess job knowledge, skills and abilities. The officers, however, provided no evidence to demonstrate that the nominators were capable of assessing these prerequisites. The court found the officers’ “bare assertion” about the ability of nominators to assess job knowledge was insufficient to show that this procedure was valid in comparison with the written qualifying test. The court further stated that, even if the officers had presented evidence of validity, they still presented insufficient evidence that their alternative would be less discriminatory. There was simply no way of knowing who would be promoted if the nominators assessed job knowledge, skills and abilities, without regard to passage of the written qualifying test. Thus, the court affirmed summary judgment to the City on the claim of Subclass A.

Subclass B

The Subclass B officers challenged the city’s thirty-percent ceiling on merit promotions as discriminatory. They argued that merit promotions should be made at a greater percentage of the total number of promotions made.

The court of appeals affirmed the district court’s determination that Subclass B had not established an equally valid, less discriminatory alternative, and thus the City was entitled to summary judgment. On appeal the officers argued that they “should not be required to come forward with evidence to show a correlation between their alternative procedure and job performance.” The court of appeals held that this argument directly contradicted their burden under the framework of 42 U.S.C. § 2000e-2(k)(1)(A)(ii) and Albemarle Paper Co. v. Moody, 422 U.S. 405, 425 (1975). Thus, without any evidence that the officers’ alternative of increasing merit promotions would lead to a workforce substantially equally qualified, the court could not accept the officers’ alternative as substantially equally valid.

Allen v. Rumsfeld,
No. 03-1496, 2003 WL 21739000 (8th Cir. July 9, 2003).

Federal employee brought action alleging that she was denied promotion because of her race and in retaliation for prior complaints of discrimination, both in violation of Title VII. The court found that plaintiff’s low scores on the skills narrative evaluation precluded her from establishing a prima facie case, because they indicated that she was not as qualified as the fifty-six employees who were promoted. Plaintiff contended that her low scores were the result of discrimination, either because the panelists who rated her skill narratives determined her race due to the activities discussed in them or because the defendant practiced discrimination by limiting racial minorities’ participation in activities that lead to achievement of higher scores. The record, however, demonstrated that participation in relevant activities was not race-specific and that the individuals rating the skill narratives had no information regarding the race, color, or national origin of the applicants. Furthermore, there was no evidence to support plaintiff’s claim that the persons who rated her application were able to determine her identity and then to retaliate against her for prior complaints of discrimination. Finally, the Court found unavailing plaintiff’s argument that the declining numbers of African-Americans receiving promotions was evidence that she was discriminated against in this case and affirmed summary judgment for the defendant.

Anderson v. Westinghouse Savannah River Co.,
406 F.3d 248 (4th Cir. 2005), cert. denied, 126 S. Ct. 1431 (2006).

A black, female administrative assistant sued her employer alleging that its use of its Competency Based Posting System (CBPS) and Ranked Performance Pay Process (RP3) created a disparate impact on African-Americans. The hiring and promotion process under the CBPS contains 19 steps, three of which were challenged by the plaintiff on account of their subjectivity. The three challenged steps related to the applicants being selected for an interview and the interview panel choosing the candidate for the position after an interview. During the interviews, the panel considers six core competencies: teamwork, leadership, communications, business results, self-management, and employee development. The panel also considers functional competencies, which are selected to be specific to the position sought—such as being proficient in heating and a/c design.

The RP3 is designed to provide merit-based increases based on job performance. Managers must use the RP3 electronic evaluation worksheet to rate employees, and employees can be rated in various combinations, such as total ranking of all employees, ranking per work group, ranking per salary grade, etc. Merit increases are awarded based on each division’s budget and rankings. Plaintiff offered one expert’s opinion testimony on the disparate impact of the RP3 system. The Fourth Circuit affirmed the district court’s decision to exclude this testimony for lack of proper controls—his statistical analysis compared white and African-American employees without taking into account any differences in their job titles or position and, therefore, failed to compare similarly-situated workers. Without this expert testimony, Plaintiff’s RP3 disparate impact claim failed.

Plaintiff’s CBPS disparate impact claim also failed. Plaintiff failed to provide sufficient statistical evidence to show causation, because her expert’s comparison of the percentage of African-American employees who actually succeeded to the percentage he expected to succeed based on the total percentage of African-American employees failed to account for any other variables, such as presentation in interviews, education, and experience. Additionally, the discretion of the interview panel was not as subjective or unfettered as the plaintiff alleged, because the panel was required to rely on core and functional competencies when evaluating the candidates.

Banks v. East Baton Rouge Parish School Bd.,
320 F.3d 570 (5th Cir. 2003), cert. denied, 124 S. Ct. 82 (2003).

Female former and current school janitors brought suit against school board, alleging retaliation and disparate impact discrimination in violation of Title VII and § 1983. According to the employees, the Board thwarted the employees’ immediate promotion, when the Board, acting pursuant to a consent decree, implemented a reading requirement and new salary structure for its janitors.

Prior to the Consent Decree, the Board employed three levels of janitors, Janitor I, Janitor II and Janitor III. Janitor I employees (all of whom were female) worked part-time for six hours per day, nine months per year and were responsible for performing basic tasks. Janitor II employees (most of whom were male) worked eight hours per day for the entire year, performed essentially the same tasks as Janitor I, with the addition of some duties such as lawn care. Janitor III employees (all of whom were male) worked full-time, performed the same tasks as Janitor I and II employees, with the addition of some supervisor tasks, such as locking up buildings and supervising cleaning crews.

After the Board eliminated medical benefits and reduced the hours of Janitor I employees, the Employees sued, alleging that the Board’s actions had a disparate impact on female employees, since all Janitor I employees were female. While this suit was pending, the DOJ commenced an investigation and ultimately filed suit against the Board alleging that the Board discriminated against women by reserving the Janitor II and Janitor III positions for males. After an evaluation of all school system positions, the Board decided to phase out the three-tiered janitor position and replace it with two new positions: “Janitor” and “Lead Janitor.” The Board implemented new testing procedures to select applicants for the new Janitor position. Applicants were required to take and pass a “practical” test involving the use of maintenance equipment, as well as a reading test, which tested an applicant’s ability to read at an eighth-grade level. Applicants for the Lead Janitor position were not required to take either test. The Board’s justification for the reading test was safety concerns based upon OSHA safety regulations, which are written on an eighth-grade reading level, as were the majority of chemical workplace safety sheets. The Board then entered into the consent decree with the DOJ, which incorporated the Board’s plan for the new Janitor position.

Thereafter, when the plaintiffs applied for the new Janitor position, they took the required tests. All the plaintiffs passed the practical test, but only one passed the reading test. The plaintiffs who failed the reading test were given the option of either remaining in their old Janitor I jobs or taking the new Janitor position on a probationary basis, regardless of their current reading ability. The plaintiffs who took the probationary Janitor position were paid at the lowest pay step in the new pay scheme. Once a probationary employee demonstrated an eighth grade reading level, she would be moved up the pay scale to a level that corresponded with the old Janitor I pay scale.

The instant lawsuit followed. The plaintiffs alleged that the Board’s implementation of the new Janitor position was in retaliation for the plaintiffs’ previous lawsuit against the Board. When asked why the reading test was required, one school employee allegedly stated, “[t]hat’s what you get for filing a lawsuit.” The employees also alleged that the reading test had a disparate impact on female janitorial employees.

The district court granted the Board’s Motion for Summary Judgment, finding that plaintiffs failed to establish a prima facie case of retaliation. The Fifth Circuit affirmed, reasoning that the Board’s implementation of the reading test was not an “adverse employment action” under either Title VII or § 1983.

In addition, the court of appeals affirmed the district court’s finding that the plaintiffs failed to establish a prima facie case of disparate impact discrimination under Title VII. The plaintiffs argued that the female Janitor I employees were the only employees adversely impacted by the reading requirement, since the mostly-male Janitor II employees could place into the Lead Janitor positions without taking a reading test. The court of appeals, however, looked to the entire pool of potential applicants for the position, not just the actual applicants. The court found that the entire pool was open to both male and female applicants. Also, the plaintiffs failed to produce any statistical evidence tending to show that the reading requirement operated in a way that selected females in a pattern “markedly disproportionate” from the entire pool of applicants for the new Janitor position. In addition, the plaintiffs failed to produce any non-statistical evidence that the reading requirement selected female applicants in a significantly discriminatory pattern.

The court found that the Board, on the other hand, provided statistical evidence showing that the selection of the protected group, females, actually exceeded the selection of the comparison group, males. The Board explained that, since the position was created and the reading tests had been used, a total of 548 females and 471 males had applied for employment. Of those applicants, 87 females (or 15.9% of the total female applicants) and 56 males (or 11.9% of the total male applicants) were selected for employment. The Board also maintained that, “the figures show that, in reality, more women have been selected for employment for the new janitor position than men despite the testing requirements.” Thus, the court held that the district court correctly concluded that the plaintiffs failed to establish a prima facie case of disparate impact discrimination.
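The Board’s figures can be checked with a short Python sketch. The applicant and selection counts are those reported above; the `selection_rate` helper is a hypothetical name used only for this illustration, and the ratio computed at the end is an added comparison, not a figure the court reported.

```python
def selection_rate(selected: int, applicants: int) -> float:
    """Share of applicants in a group who were selected."""
    if applicants <= 0:
        raise ValueError("applicants must be positive")
    return selected / applicants


# Counts reported by the Board in Banks.
female_rate = selection_rate(87, 548)
male_rate = selection_rate(56, 471)

print(f"female selection rate: {female_rate:.1%}")
print(f"male selection rate:   {male_rate:.1%}")

# A ratio above 1 means the protected group (females) was selected at a
# higher rate than the comparison group (males).
print(f"female-to-male ratio:  {female_rate / male_rate:.2f}")
```

On these numbers the female selection rate exceeds the male rate, which is consistent with the court’s conclusion that no prima facie showing of disparate impact was made.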

Banos v. Chicago,
398 F.3d 889 (7th Cir. 2005).

Minority police sergeants in the Chicago Police Department (CPD) sued the City of Chicago, alleging racially discriminatory disparate impact in the 1998 CPD promotional process in violation of Title VII.

The Promotional Process: Sergeants applying for lieutenant positions were put through a three-part evaluation: (1) a written qualifying test, (2) an assessment exercise, and (3) a merit selection process. Applicants were required to pass the written qualifying test before they could be considered for lieutenant under the latter two parts. Following the written test, the City made 70 percent of its promotions from a rank-order list of candidates based on their score on the assessment exercise, which was composed of written responses to questions. The remaining 30 percent of promotions were merit-based.

In their original complaint, the Plaintiffs alleged that both the written qualifying test and the assessment exercise unlawfully discriminated against them based on race. The district court certified two subclasses of Plaintiffs: Subclass A was composed of minority officers who failed the written qualifying test; Subclass B was composed of minority officers who passed the written qualifying test but did not achieve a high enough score on the assessment exercise to warrant promotion.

The outcome of this case turned on holdings in two other pertinent cases. In July of 2000, the Plaintiffs requested and were granted a stay of discovery pending resolution of a writ of certiorari filed in Bryant v. City of Chicago, 200 F.3d 1092 (7th Cir. 2000). In Bryant, the Seventh Circuit Court of Appeals held that the 1994 CPD promotional process was content-valid and thus not violative of Title VII. The Supreme Court denied the writ of certiorari in October of 2000.

Following this denial, the Plaintiffs amended their complaint, alleging that merit promotions were an equally valid, less discriminatory alternative to rank-order promotions, and that the City of Chicago violated Title VII by limiting the use of merit based promotions to 30 percent. Upon request, the Plaintiffs admitted, pursuant to Fed. R. Civ. P. 36(a), that the written qualifying test and the assessment exercise were valid under Title VII.

However, in the aftermath of Allen v. City of Chicago, No. 98-C7673, 2002 WL 31176003 (N.D. Ill. Sept. 30, 2002) (ruling that plaintiffs had failed to establish that merit-based promotions were an equally valid and less discriminatory alternative to rank-based promotions), the Plaintiffs sought to resurrect their original claims, acknowledging that Allen was fatal to their amended complaint. Once again, the Plaintiffs attempted to assert, under Fed. R. Civ. P. 36(b), that the written qualifying test and the assessment exercise violated Title VII, asking the district court to withdraw their earlier admissions. This request was denied and the district court entered summary judgment for the City of Chicago. The Seventh Circuit Court of Appeals affirmed, finding no abuse of discretion. Expressly endorsing the district court’s refusal to allow Plaintiffs to withdraw their admissions, the Court of Appeals declared, “Admissions, in some ways, are like sworn testimony. Once one is made, there is no need to revisit the point.”

Baptist v. Kankakee,
-- F.3d --, 2007 WL 789583 (7th Cir. 2007).

African-American police officers sued the City of Kankakee, alleging racial discrimination in the police department’s promotional policies and a promotional test. Defendant’s promotional exam was given for the purpose of creating lists for promotions to Sergeant or Lieutenant. The promotion exam consisted of a written exam, an oral exam for Sergeant candidates or Oral Assessment Center for Lieutenant candidates, merit (“Chief”) points, longevity points, and time-in-grade points. If an individual’s total score was 70 or more, that person would be entitled to additional “veteran’s points,” based on military service. Individuals were placed on the promotion eligibility list in rank-order based on total score.

Plaintiffs settled in open court and then moved to vacate the settlement. The Seventh Circuit rejected the plaintiffs’ arguments and affirmed the settlement and the order dismissing their disparate treatment claims.

Relevant settlement terms: The City agreed to engage in practices to safeguard against discrimination, such as: (1) establish a “Blue Ribbon Committee” to review recruiting, testing, and promotional policies; (2) employ an independent testing company for any hiring and promotion testing, and (3) conduct annual cultural diversity training.

Bell v. Potter,
No. 02-2732, 2002 WL 31641561 (7th Cir. 2002), cert. denied, 123 S. Ct. 1922 (2003).

An African-American applicant sued the Postal Service, alleging that denial of his second application for employment was retaliatory because he filed a discrimination complaint related to his first application. The court found that the plaintiff’s comparison of potential postal employees and their test scores was irrelevant, because test scores were not among the reasons that the Postal Service articulated for not hiring the applicant. The court held that the applicant failed to establish a prima facie case of retaliation against the Postal Service.

Bew v. Chicago,
252 F.3d 891 (7th Cir. 2001), cert. denied, 534 U.S. 1020 (2001).

The plaintiffs in this case were African-American probationary police officers who were discharged by the City of Chicago for failing the Illinois Law Enforcement Officers’ Certification Examination. They sued, alleging a disparate impact on African-American and Hispanic officers. This is an unusual case in which the overall pass rate for African-Americans was 98.24% as compared to 99.96% for whites. The district court held that, even where there was no violation of the 80% guideline, there still might be disparate impact under a Z-score analysis, and in this case the Z-score was more than five standard deviations from the norm.

The City’s motion for summary judgment was denied, and the case proceeded to a bench trial. In that trial the district court held that plaintiffs had established a prima facie case of disparate impact discrimination, but that the City had proved that the exam was related to the job, the cut-off score was reasonable and consistent with professional standards, and that defendants had shown a business necessity for the test. In this opinion the Court of Appeals affirmed that finding.

Plaintiffs had objected to the cut-off score, but the court found it to be consistent with normal expectations of the knowledge necessary to be a police officer. In addition, the score satisfied the City’s desire not only to certify well-trained officers, but also to provide adequate numbers of officers for staffing purposes. Given the high pass rates, the Court held that to reduce a cut-off score to the point where all test-takers pass would be to make the test a futile exercise, because it would no longer be a measuring device. It held that the cut-off score met both the business necessity and job-relatedness standards. Plaintiffs also objected to the cut-off score as arbitrary and as failing to differentiate between those officers who would perform adequately and those who would not. Plaintiffs offered as proof the fact that certain probationary officers who were performing adequately did not meet the cut-off score. The Court held, however, that cut-off scores need not be set so that they would select all good job performers and reject all bad performers.

Plaintiffs had also objected to the rule that officers could take the certification exam only three times. When responding to this argument, the Appeals Court noted that, in light of its finding that the exam was valid and the cut-off score was appropriate, it is reasonable to expect plaintiffs to pass the exam. It held that a policy of allowing three opportunities for test-takers was generous.
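The contrast in Bew between the 80% guideline and a Z-score analysis can be illustrated with a short Python sketch. The pass rates are those stated above. The group sizes are hypothetical, because this summary does not report the number of test-takers, and the pooled two-proportion formula is an assumption made for illustration rather than the method the district court or the experts necessarily used.

```python
import math


def two_proportion_z(rate_a: float, n_a: int, rate_b: float, n_b: int) -> float:
    """Pooled two-proportion z statistic for the difference rate_a - rate_b."""
    pooled = (rate_a * n_a + rate_b * n_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return (rate_a - rate_b) / std_err


# Pass rates from the summary of Bew.
protected_rate, comparison_rate = 0.9824, 0.9996

# The ratio is far above four-fifths, so the 80% guideline is not violated.
print(f"impact ratio: {protected_rate / comparison_rate:.3f}")

# HYPOTHETICAL group sizes: the same two rates give a small z statistic for
# small groups and a large one for large groups.
for group_size in (100, 2500):
    z = two_proportion_z(protected_rate, group_size, comparison_rate, group_size)
    print(f"group size {group_size}: z = {z:.2f}")
```

The sketch shows why the two measures can diverge: the ratio depends only on the rates, while the z statistic also grows with the number of test-takers.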

Biondo v. Chicago,
382 F.3d 680 (7th Cir. 2004), cert. denied, 543 U.S. 1152 (2005).

White applicants whose promotions within the Chicago Fire Department (CFD) were denied or delayed as a result of the City’s five-year use of standardized lists, established in response to disparate impact of competitive examination on minorities, filed suit under § 1983 and Title VII. After an advisory jury determination and two jury trials, the district court directed entry of judgment in favor of the plaintiffs. The City of Chicago appealed.

This case arises from a test implemented and developed in 1986 for the CFD position of lieutenant. CFD has five ranks: firefighter, engineer, lieutenant, captain, and battalion chief. CFD took care to ensure that the exam was both non-discriminatory and a valid test of skills. Yet although 29% of those who took the exam were either black or Hispanic, only 12% of those who received the highest 300 scores were in these groups.

The Department concluded that this disparate impact could be justified under the EEOC’s Uniform Guidelines on Employee Selection Procedures, 29 C.F.R. § 1607.4, only if the exam were valid for rank-order use--that is, if someone who scores higher on the test is “bound” to perform better than the person next in line. According to the Department’s expert, this examination had a standard error of measurement of 3.5, which is to say that a person who scored 80 and took a similar test again could score as high as 83.5 or as low as 76.5 without implying that his skills and probability of success in the higher position had changed.
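
The expert's point about the standard error of measurement can be illustrated with a short Python sketch. The function names are hypothetical, and treating two scores as indistinguishable when they differ by no more than one standard error is a simplifying assumption made for illustration, not a rule stated in the opinion.

```python
def plausible_range(score, sem):
    """Scores a retest could plausibly produce, one SEM either side of the original."""
    return (score - sem, score + sem)

def distinguishable(score_x, score_y, sem):
    """Assumed rule: two scores differ meaningfully only if more than one SEM apart."""
    return abs(score_x - score_y) > sem

low, high = plausible_range(80, 3.5)
print(low, high)                      # 76.5 83.5
print(distinguishable(80, 82, 3.5))   # False
print(distinguishable(80, 85, 3.5))   # True
```

On this reading, a strict rank-order list separates candidates whose scores the test itself cannot reliably tell apart.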

Convinced that it could not make promotions from the 1986 list in rank-order fashion, the Department established what it called “standardized” lists and what most people would call racially segregated lists: it drew up one list for whites and another for blacks and Hispanics, and then made 29% of all promotions from the minorities-only list. The Department used these lists until 1991, promoting a total of 209 lieutenants from the 1986 exam. This process meant that the promotion of some white candidates was delayed, and others were not promoted even though minority candidates with lower scores became lieutenants.

The Department acknowledged that its approach could be sustained only if a compelling interest supported its use of race and ethnicity. However, it did not argue that either past discrimination or a quest for diversity supported its approach. It instead maintained that it had a compelling need to comply with 29 C.F.R. § 1607.4.

In a scathing opinion, the court held that compliance with a governmental regulation is not automatically a compelling interest. The court reasoned that if that were so, Congress or any federal agency could direct employers to adopt racial quotas, and the direction would be self-justifying: the need to comply with the law or regulation would be the compelling interest. Furthermore, if avoiding disparate impact were a compelling governmental interest, “then racial quotas in public employment would be the norm.” As the court further explained, “[s]uch a circular process would drain the equal protection clause of its meaning.” The court also criticized the City of Chicago’s two-list rank-order procedure, finding that the Civil Rights Act of 1991 explicitly forbids the City’s response to disparate impact. And while the 1991 Act did not apply retrospectively, the 1991 Act and 1964 Act jointly revealed that standardization cannot be an indispensable response to disparate impact.

The court then concluded that the City could have used bands reflecting the standard error of measurement. For example, the Department could have treated all scores in the range of 96-100 as functionally identical and made promotions by lot from that band. According to the court, such a procedure would have respected the limits of the exam’s accuracy while avoiding any resort to race or ethnicity. Given that the City had options of this kind, the court found that the City’s two-list procedure was not compelled. The court further reasoned that the Department’s assertion that it viewed rank-order promotions as unsupportable was undermined by the fact that, after it created each list, the Department promoted in rank-order sequence from each list.
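
The banding alternative the court described can be sketched in Python as follows. This is a minimal illustration, not the procedure any party proposed: the five-point band width mirrors the court's 96-100 example, and the function names, candidate labels, and scores are hypothetical.

```python
import random
from collections import defaultdict

def band_index(score, width=5, top=100):
    """Band 0 covers 96-100, band 1 covers 91-95, and so on (for width=5)."""
    return int((top - score) // width)

def promote_by_lot(scores, n_promotions, width=5, top=100, seed=None):
    """Fill promotions band by band; draw lots only within a band that cannot be taken whole."""
    rng = random.Random(seed)
    bands = defaultdict(list)
    for name, score in scores.items():
        bands[band_index(score, width, top)].append(name)

    promoted = []
    for idx in sorted(bands):              # highest-scoring band first
        remaining = n_promotions - len(promoted)
        if remaining <= 0:
            break
        members = sorted(bands[idx])
        if len(members) <= remaining:
            promoted.extend(members)       # whole band fits
        else:
            promoted.extend(rng.sample(members, remaining))  # selection by lot
    return promoted

# Hypothetical candidates and scores.
scores = {"A": 100, "B": 97, "C": 96, "D": 95, "E": 91, "F": 88}
print(promote_by_lot(scores, 4, seed=1))
```

Scores within a band are treated as identical, so race and ethnicity play no part in the selection; the only thing deciding among candidates in a partially filled band is the draw.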

Although the court of appeals affirmed the district court’s decision on the merits, it vacated and remanded the judgment with respect to damages and equitable relief, finding that the substantial awards of compensatory damages and lengthy front pay awards were not supported by the record.

Bishop v. New Jersey,
144 Fed. Appx. 236 (3d Cir. 2005).

African-American plaintiffs challenged the 2000 version of a firefighter promotional exam that: (1) included 75 multiple choice questions in addition to the traditional essay format; (2) made passing the written portion a prerequisite to taking the oral portion of the exam; (3) utilized a numerical score instead of a pass/fail score for the written and oral portions of the exam; and (4) factored seniority into the exam scores only if the candidate passed the written and oral portions of the exam (the “Z formula”). Only 129 out of 287 applicants passed the exam in 2002, which included 29.5% of the African-American, 33.3% of the Hispanic, and 55.8% of the Caucasian candidates. The municipal defendants did not create or implement the exam, but did make selections from the list of passing applicants.
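
For reference, the reported pass rates can be compared under the four-fifths rule of thumb with a short Python sketch. The calculation is offered only as an illustration of the arithmetic; the summary does not indicate that the court performed it, and the case was decided on other grounds. The function name is hypothetical.

```python
def impact_ratios(pass_rates):
    """Each group's pass rate divided by the highest group's pass rate."""
    highest = max(pass_rates.values())
    return {group: rate / highest for group, rate in pass_rates.items()}

# Pass rates (percent) as reported in the summary above.
rates = {"African-American": 29.5, "Hispanic": 33.3, "Caucasian": 55.8}

for group, ratio in impact_ratios(rates).items():
    flag = "below four-fifths" if ratio < 0.8 else "at or above four-fifths"
    print(f"{group}: {ratio:.2f} ({flag})")
```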

The district court held, and the Third Circuit affirmed, that the municipal defendants were not liable for the alleged disparate impact of the exam. Prior decisions had held that the use of an eligibility list containing the results of an allegedly tainted exam was “merely the neutral effect of a prior act of discrimination, but does not constitute a separate discriminatory action.” Because Newark had no choice but to use the eligibility list and exercised limited discretion in choosing whom to hire from the list, the City’s use of the list was a “neutral, ministerial action, rather than a separate discriminatory act.”

Chicago Firefighters Local 2 v. Chicago,
249 F.3d 649 (7th Cir. 2001), cert. denied, 534 U.S. 95 (2001).

This opinion relates to three consolidated cases, one of them dating back to 1987, brought by white firefighters who complained that their equal protection rights were infringed by the affirmative action promotions of black and Hispanic firefighters made by the Chicago Fire Department. In this action the white firefighters made a number of arguments related to the ongoing efforts of the city to remedy what was acknowledged to be previous discrimination against minorities in selections for firefighter positions. These arguments related to the appropriate level for minority participation in the fire department, and whether the population of the city and surrounding area should be compared with the population of the fire department. There is lengthy discussion of the population issues, but the claims of the plaintiffs in this area were dismissed. The one claim of interest is the question of whether the city’s practice of banding test scores amounted to prohibited race norming. In reviewing other cases in this area, the court noted that, although banding has been upheld as a valid method of affirmative action, no court had previously been asked to consider its consistency with the prohibition on race norming of scores. In this opinion the court holds that if banding were adopted in order to make lower black scores seem higher, it would be a form of race norming and forbidden. The court holds, however, that banding is a universal and normally unquestioned method of simplifying scoring by eliminating meaningless gradations. It analogizes banding to schools that change from number grades to letter grades. In rejecting the objections to the city’s banding process, it notes that banding is most attractive when the range of abilities in a group being tested is relatively narrow, such as the skill difference between someone who answers 200 questions correctly on an exam and someone who answers 199 questions correctly.

Cotter v. Boston,
323 F.3d 160 (1st Cir. 2003).

Seven Caucasian police officers sued the city under § 1983, alleging equal protection violations by the Boston Police Department (“BPD”) when promoting three African-American police officers. The BPD was operating under a consent decree governing sergeant promotions. The decree expired in 1995. This case arises from a reverse discrimination action challenging the subsequent 1997 promotional process. In 1997, the police department sought to promote thirty police officers to sergeant. Each candidate took a 1996 sergeant promotion examination given by the Human Resources Division. If promotions had been made in strict rank order of these scores, twenty-nine non-African-American officers and one African-American officer would have been promoted. The Department determined that this promotional decision would violate the “four-fifths rule,” indicating possible adverse impact on minority candidates. In addition, because of the disparity in the promotion of officers to sergeant, current racial tensions within the Department, and the documented history of past discrimination within the BPD, the BPD sought greater African-American representation among its sergeant ranks. Therefore, the Department promoted the top 26 officers in strict rank order which included all officers scoring 86 and above, with the exception of one officer who was bypassed for cause.

Only two of the seven non-African-American officers who achieved a score of 85 were promoted. The Department then promoted three African-American officers who achieved a score of 84 (the “African-American Officers”), while passing over the five remaining officers who achieved an 85 and all other officers who achieved a score of 84.

The BPD was required by Massachusetts law to provide a statement to the HRD explaining the reasons for its departure from strict rank order selections. See Mass. Gen. Laws ch. 31, § 27 (2002). The BPD sent a letter to the HRD stating that the departure from strict rank order to promote the African-American Officers was done to “ensure compliance with current EEOC guidelines, and applicable federal and state discrimination laws.” The HRD rejected this explanation, contending that the BPD was erroneously acting under a terminated consent decree.

In response to the HRD’s rejection, the Department promoted six additional officers (one formerly bypassed for cause and the five with a score of 85). The end result was that all 33 officers scoring 85 and higher were promoted and the three African-American Officers scoring 84 were promoted. However, ten non-African-American officers scoring 84, including the seven Caucasian plaintiffs, were not promoted. Of the 36 officers promoted to sergeant, four were African-American.

On May 21, 1999, plaintiffs filed suit against the City alleging that the Department violated plaintiffs’ civil rights under 42 U.S.C. § 1983 by failing to promote plaintiffs to sergeant because of their race. After discovery, the City moved for summary judgment on the grounds that the plaintiffs lacked standing and that the promotions of the African-American Officers were a narrowly-tailored means of meeting several compelling governmental interests. On March 22, 2002, the district court granted the City’s motion, dismissed plaintiffs’ claims, and entered judgment in favor of the City. The district court held that the City’s actions were a narrowly-tailored means of remedying the continuing effects of past discrimination.

On appeal, the First Circuit held that the plaintiffs did not have standing because they failed to identify a cognizable injury warranting relief under § 1983. Even if BPD had not used race-conscious criteria, it nonetheless would have only promoted candidates with a score of 85 or higher. Thus, because appellants only scored 84, and were otherwise not eligible for promotion even absent the use of the BPD’s race-conscious criteria, they could not establish standing for damages. Appellants also argued that they had standing to seek immediate promotion. The court found that two of the appellants, who had since been promoted to sergeant, lacked standing to seek immediate promotion. However, the court agreed that the remaining appellants could make a colorable claim for standing to seek immediate promotion.

The court then turned to the merits of the case. The City asserted that the race-conscious action was necessary to ameliorate “vestiges” of past discrimination by the Department against African-American applicants and officers. The court found that the Department’s history of discrimination was well-documented by past litigation and records. As recently as October 1996, African-Americans comprised 25.02% of the BPD’s 1,547 officers, but only 16.49% of the BPD’s sergeants. The court concluded that this difference was statistically significant, not reasonably attributed to chance, and concrete evidence that discrimination existed at the time the African-American Officers were promoted.

Next, the court determined that the action taken by the City in its effort to remedy past discrimination was narrowly tailored to rectify the specific harm in question. In doing so, the court considered several factors, including the extent to which: (i) the beneficiaries of the order are specially advantaged; (ii) the legitimate expectancies of others are frustrated or encumbered; (iii) the order interferes with other valid state or local policies and (iv) the order contains (or fails to contain) built-in mechanisms which will, if time and events warrant, shrink its scope and limit its duration.

Applying these factors, the court concluded that the Department would have had to promote twenty African-American officers to create a situation whereby the percentage of African-American officers and African-American sergeants was approximately equal. The necessity for relief was great, but the means chosen by the Department were modest – only three African-American officers were promoted out of rank – indicating narrow tailoring. Only qualified minorities were promoted; they were therefore not specially or unfairly advantaged by their promotions. All officers were competing for a limited number of spots. Because of this competition, the City’s promotion of the African-American officers did not disturb any legitimate, firmly rooted expectations of the appellants. Had the City not departed from strict rank order, no additional Caucasian officers would have been hired. Finally, finding that the means were narrowly tailored, the court approvingly cited the fact that there were no quotas or long-term guidelines established.

Denney v. Albany,
247 F.3d 1172 (11th Cir. 2001).

White firefighters sued the city following their failure to be promoted to lieutenant. In this decision, the court of appeals reviewed the district court’s findings and affirmed, holding that there was insufficient evidence of discriminatory intent in the selections for lieutenant. The city’s promotional process for lieutenant was challenged previously in connection with a 1994 promotion process, and the plaintiffs in that action were successful in proving discrimination. This action was brought following a later qualification exercise. Twenty-three applicants completed the process, and 21 were placed in the eligibility pool. The qualification exercise included a written examination, weighted at 30% of the overall score; a skills assessment center, weighted at 50%; and an oral interview with the fire chief, weighted at 20%. Those who scored 70 out of a possible 100 on the three-step qualification exercise were considered qualified and placed in the pool of candidates from which lieutenant selections were made by the fire chief. In making final selections from the pool, the fire chief, who is black, did not refer to the scores on the qualification exercise. Instead he evaluated the candidates as to their demonstrated leadership, maturity, interpersonal skills, and willingness to support management in its policies. By the time of this action an equal number of white and black firefighters had been selected for promotion to lieutenant. The plaintiffs in this case alleged that they were more qualified than two of the black firefighters who were selected.
The court rejected plaintiffs’ argument that the qualification exercise process had a disparate impact on whites, holding that, since the plaintiffs successfully completed the qualification exercise, and since the qualification exercise scores were not used in the fire chief’s selection of the candidates who would receive a promotion, the plaintiffs could not challenge that initial qualification exercise process. This agreed with the district court’s finding. The court looked closely at the written and oral components of the process, because the plaintiffs argued that the fire chief engaged in a process that amounted to race norming, by ranking blacks higher on the interview than whites, thus increasing the combined scores of blacks relative to those of whites. The district court rejected this argument, since there was no statistical disparity between blacks and whites in the total score. It declined to find discrimination in the subjective aspects of the oral examination, holding that the oral examination analyzed different skills from the written examination and it was therefore not unusual that scores of candidates might differ in these two parts of the process. There was statistical evidence that overall, qualified black applicants were selected for promotion at a rate that was not statistically different from the rate at which qualified white applicants were promoted. Furthermore, the statistical anomaly in the scoring of blacks and whites on the oral exam was found to occur in another year in which the oral examination process had not involved the fire chief, lessening the likelihood that he was engaged in race norming. Finally, the court rejected the plaintiffs’ claim that the final selection process was discriminatory because it was purely subjective.
It found no discrimination in the fact that the department used the qualification exercise scores solely to determine the pool of qualified candidates, and then relied on other criteria for the selection of lieutenants. The qualification exercise gauged some of the skills necessary for a lieutenant, but it was not intended to gauge a candidate’s ability in all of the skill areas necessary for a lieutenant. Although the plaintiffs might arguably be the best candidates on paper, the court held that the department was not required to select the best qualified candidate, and that it would not reexamine the department’s business decisions.
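
The weighting of the qualification exercise described above can be sketched in Python. The 30/50/20 weights and the qualifying score of 70 come from the summary; the assumption that each component was scored on a 0-100 scale, the names used, and the sample candidate are hypothetical.

```python
# Component weights and qualifying score as described in the summary.
WEIGHTS = {"written": 0.30, "assessment_center": 0.50, "interview": 0.20}
QUALIFYING_SCORE = 70

def composite_score(components):
    """Weighted total, assuming each component is scored from 0 to 100."""
    return sum(WEIGHTS[name] * components[name] for name in WEIGHTS)

def in_eligibility_pool(components):
    """A candidate enters the pool on reaching the qualifying composite score."""
    return composite_score(components) >= QUALIFYING_SCORE

# Hypothetical candidate.
candidate = {"written": 60, "assessment_center": 75, "interview": 80}
print(composite_score(candidate), in_eligibility_pool(candidate))
```

Because the composite served only as a threshold for entry into the pool, two candidates with very different totals stood on the same footing at the final selection stage, which is why the court treated the exercise and the final selection as separate steps.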

Donahue v. Boston,
371 F.3d 7 (1st Cir. 2004), cert. denied, 543 U.S. 987 (2004).

For a summary of the facts, see Donahue v. Boston, 304 F.3d 110 (1st Cir. 2002), summarized below. On remand, the district court found that when the plaintiff last took the qualifying civil service exam in April 2001, he was no longer eligible for hire by the Boston Police Department (“BPD”) due to the age restriction set by a Massachusetts statute. Accordingly, the district court held that the plaintiff lacked standing to pursue his claim for prospective injunctive relief. The First Circuit affirmed the district court’s holding. The First Circuit similarly reasoned that the plaintiff did not satisfy a key requirement of standing to seek prospective relief because Massachusetts’s constitutional age limitation precluded plaintiff from being “able and ready” to apply for appointment to the BPD in December 2001, the date of the district court’s original opinion, as well as in 2004.

Donahue v. Boston,
304 F.3d 110 (1st Cir. 2002), aff’d, 371 F.3d 7 (1st Cir. 2004).

A challenge to the selection process for new police officers in the Boston Police Department remains alive after an appeals court reversed the district court and allowed an unsuccessful white candidate’s claim for prospective relief under the Equal Protection Clause to proceed to trial. The current selection process is the result of a consent decree entered in 1973. Under that decree Boston uses a state-administered examination for appointments to the police academy, with a passing score of 70. When requested by the Boston Police Department, the Massachusetts Human Resources Division certifies an eligibility list of those who received a passing score on the most recent test; however, the eligibility list alternates minority and nonminority candidates. By definition, all individuals on the eligibility list are considered qualified for appointment. The list is further separated by splitting residents and nonresidents and giving priority to residents of Boston. Other preferences are also applied. Candidates who receive a passing score and qualify for one of four statutory preferences are moved to the top of the list. Those preferences are for children of firefighters or police officers killed in the course of their duties; disabled veterans; veterans; and widows or widowed mothers of veterans who were killed in action or died from a service-connected disability. In addition, special certification lists are created for those with three language skills: Spanish, Vietnamese, and French-Creole. Approximately one of every three hires is an individual from one of the language preference lists.

The plaintiff sat for the examination in April 1997, May 1999, and April 2001. In each case he received a high score – 96 in 1999 – but given the number of individuals taking the exam, and the various preferences, the process left him always remote from the “lowest” candidate selected. The district court rejected his several claims for relief, holding that under the Equal Protection Clause, he had no standing to claim either retrospective or prospective relief, because he could not show that he would have been hired under criteria that were race neutral. The court of appeals agreed with the district court that the plaintiff did not have standing to assert a claim for retrospective damages relief because he was unable to establish that he would have received the benefit he sought under a race-neutral policy. However, the court of appeals reversed and remanded the district court’s denial of prospective injunctive relief on standing grounds, because there was not enough evidence in the record to determine whether the plaintiff had standing to pursue the prospective relief sought.

Dyke v. O’Neal Steel, Inc.,
327 F.3d 628 (7th Cir. 2003).

A temporary employee alleged that his employer violated the ADA by terminating him from his temporary position. The defendant, a warehouse operator, required permanent and temporary employees who had worked for the defendant for over thirty days to pass vision and physical abilities tests. Plaintiff had worked for two weeks at the defendant’s warehouse as a temporary employee when, at the request of his supervisor, he submitted an application for a full-time position. The defendant’s personnel assistant, after receiving plaintiff’s application, and without administering the vision test, requested that the plaintiff’s temporary agency transfer the plaintiff to another assignment. Her stated reason for the transfer request was her subjective belief that plaintiff, because he had only one eye, could not pass the defendant’s mandatory vision test. The plaintiff claimed that the defendant “regarded him” as disabled based upon the company’s vision standards. The court found that the company’s vision standards alone were not sufficient evidence to support a finding that the defendant regarded the plaintiff as disabled. It further found, however, that the vision standards, together with the company’s failure to administer the vision test to the plaintiff and its failure to inquire into the specifics of his condition, were sufficient evidence to show that the defendant regarded him as disabled.

EEOC v. Dial Corp.,
469 F.3d 735 (8th Cir. 2006).

The EEOC brought a Title VII disparate treatment and disparate impact action against Dial on behalf of female applicants rejected for entry level positions when they failed a strength test. Entry level employees at Dial’s plant are required to carry approximately 35 pounds of sausage at a time, and must lift and load the sausage to heights between 30 and 60 inches high. These employees suffer a large number of work-related injuries. To help reduce the injury rate, Dial implemented several safety measures starting in 1996. In 2000, Dial implemented a strength test called the Work Tolerance Screen (WTS). In this test, job applicants had to carry a 35 pound bar and lift and load it onto frames approximately 30 and 60 inches off the floor. Prior to the test, forty-six percent of new hires were women. After the test, female hires dropped to fifteen percent. The overall female test passing rate was thirty-eight percent, while the men’s pass rate was ninety-seven percent. Although injuries did decrease after test implementation, the decrease had begun in 1998, after other safety procedures were instituted.

At trial, the EEOC presented evidence that the strength test was significantly more difficult than the actual job workers performed at the plant: the test required 6 lifts per minute without breaks, while the job averaged only 1.25 lifts per minute with breaks. The EEOC’s expert also testified that, before the test, the women’s injury rate was actually lower than that of males. Dial’s experts testified that the test effectively tested skills representative of the actual job and that the injury rate decrease could be attributable to the test. The trial court concluded that Dial’s use of the strength test had an unlawful disparate impact on the female applicants, because Dial could not demonstrate business necessity or show either content or criterion-related validity and failed to control for other variables potentially causing the injury rate decline. Dial appealed, claiming that the strength test was a business necessity because it decreased the number of injuries in the plant.

The Eighth Circuit affirmed the decision below. Once a plaintiff establishes a prima facie case of disparate impact, the employer must show the challenged practice is “related to safe and efficient job performance and is consistent with business necessity.” To establish the business necessity defense, the employer must prove that the practice was related to the specific job, the required skills, and physical requirements of the position. A validity study is not necessary if the employer can show the challenged procedure is sufficiently related to safe and efficient job performance. The Eighth Circuit rejected Dial’s arguments that the WTS was content valid, because its physiology expert’s testimony was discredited by the EEOC’s industrial organization expert, who testified that the WTS was more difficult than the actual job and created a “testing environment” where applicants tend to work harder during the test in order to outperform the competition. The Court also rejected Dial’s argument that the WTS was criterion-related valid due to the overall injury decline. The decline started prior to implementation of the exam, and women had lower injury rates than men, who had higher passing rates. Therefore, the test did not predict which applicants could safely handle the job, as Dial contended. Last, the Eighth Circuit approved placing the burden on Dial to establish that there was no acceptable less adverse alternative to the WTS. The Court explained that, as part of Dial’s burden of showing business necessity, it had to demonstrate the need for the challenged procedure. Here, Dial could not show that its other safety measures could not produce the same declining injury results. Because Dial failed to show business necessity, the burden never shifted to plaintiffs to show existence of a nondiscriminatory alternative.

Firefighters’ Inst. for Racial Equality v. City of St. Louis,
220 F.3d 898 (8th Cir. 2000), cert. denied, 532 U.S. 921 (2001).

An association of African-American firefighters (“FIRE”) and 22 individual plaintiffs sued the City of St. Louis and the firefighters’ union alleging discrimination in the promotional exam used for battalion fire chiefs. In this decision, the Eighth Circuit affirmed a grant of summary judgment to the City and union. The promotional exam in question was developed by Barrett & Associates to test job knowledge and supervisory and managerial skills. It included a written, multiple choice exam, fire scene simulation, and an oral briefing exercise. The test was administered in 1997 to 80 fire captains, of whom 53 were Caucasian and 25 were African-American. That resulted in 12 captains being placed on the eligibility list: ten Caucasians and two African-Americans. FIRE sued, alleging disparate impact where 19% of the Caucasian candidates but only 8% of the African-American candidates were eligible for promotion based on their performance on the test. FIRE alleged, first, that the use of multiple choice questions violated a 1980 decision of the Circuit that found an earlier fire captain exam not job-related because the skills for a fire captain could not be adequately measured by a multiple choice test. In this case, the City had presented evidence demonstrating that multiple choice questions could adequately measure the skills of a battalion chief. FIRE also alleged that the fire scene questions, referred to as “first-responder” questions, were not job-related because battalion chiefs do not have first-responder duties. However, the City presented portions of the standard operating procedures manual from which the first-responder questions were drawn. FIRE also objected to the use of two management books as resources for the exams, alleging that they were hard to find and out-of-date.
The City presented evidence that these were available in adequate supply, not difficult to obtain, and were management publications by the International Society of Fire Service Instructors, directly related to supervision of firefighters. The Court found the exam was related to safe and efficient job performance, consistent with business necessity, and noted the failure by FIRE to present any evidence of a less discriminatory procedure.

Garrett v. Hewlett Packard Co.,
305 F.3d 1210 (10th Cir. 2002).

A former employee brought an action against his employer, alleging that the employer subjected him to race and age discrimination. The defendant did not contest the plaintiff’s allegations that the ranking and evaluation system was wholly subjective. Absent evidence that defendant’s system of ranking and evaluation relied on objective criteria, the court held that plaintiff satisfied his burden to demonstrate pretext under the third prong of McDonnell Douglas for the purposes of avoiding summary judgment.

Garrison v. Gambro,
428 F.3d 933 (10th Cir. 2005).

Plaintiffs, all women over forty, sued Gambro alleging that the skills assessment test at issue had a disparate impact on women and employees over forty. As part of a reorganization, Gambro divided the equipment assembly department into two categories: EQ-1 (with only 6 positions), and EQ-2. Gambro required anyone who wanted to work in those positions to pass an industry-approved, standardized assessment examination that measured the skills of: assembly, inspection, mechanical comprehension, and mechanical dexterity. Those who did not pass the assessment could either accept a severance package or apply for other jobs within Gambro. All of the plaintiffs applied for the EQ-1 and -2 positions and failed the exam. However, of the eight people who passed the exam, half were women and half were over forty. The gender ratio of those hired for the EQ-1 position was four females to two males.

The Tenth Circuit affirmed the district court’s finding that the plaintiffs could not make out a prima facie case of disparate impact. Plaintiffs attempted to draw comparisons regarding all employees in the entire equipment-assembly area. However, the plaintiffs only applied for the six positions available in EQ-1 and did not apply for the positions available in EQ-2 or -3. The Court, therefore, held that plaintiffs’ broad comparison was irrelevant, because only the at-issue jobs formed the proper basis of a disparate impact inquiry. With respect to these six positions, it was undisputed that four out of the six positions were given to women (hiring 19% of women applicants as opposed to 7% of male applicants). As an aside, the Court also noted that the hiring rate for women was greater than for men during the entire equipment manufacturing reorganization.

Gulino v. New York State Educ. Dep’t,
460 F.3d 361 (2d Cir. 2006).

The Tests: Plaintiff African-American and Latino educators in the New York City public school system challenged the New York City Board of Education’s use of two different tests for permanent teacher certification. The National Teachers Examination Core Battery (“Core Battery Test”) was a general knowledge test that tested “a range of knowledge, skills and abilities that were deemed necessary to being a competent teacher.” The City eventually replaced the Core Battery Test with the Liberal Arts and Sciences Test (“LAST”) developed jointly by the State Education Department and National Evaluation Systems. LAST tests only an applicant’s knowledge of liberal arts and sciences and is only one component of a teacher’s certification. Also, LAST is a pass-fail exam that covers four basic subject areas by multiple choice and writing by use of an essay section. To pass LAST, a test-taker must score an average of 200 points on each section, but a low score in one section can be off-set by a high score in a different section.
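
The compensatory scoring described for the LAST, in which a low score on one section can be offset by a high score on another, can be sketched in Python. This is one reading of the summary's description, not the test-maker's actual scoring method: the 200-point average comes from the summary, while the number of sections, the score values, and the function name are hypothetical.

```python
PASSING_AVERAGE = 200  # average required across sections, per the summary

def passes_compensatory(section_scores):
    """Pass if the average across sections meets the threshold, so sections offset each other."""
    return sum(section_scores) / len(section_scores) >= PASSING_AVERAGE

# Hypothetical section scores: four multiple-choice areas plus the essay.
print(passes_compensatory([180, 230, 205, 210, 195]))  # True: one low section is offset
print(passes_compensatory([180, 190, 205, 210, 195]))  # False: average falls below 200
```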

The district court found that although both tests had a disparate impact on the plaintiff-minorities, they did not violate Title VII because they were job related. In so holding, the district court reasoned that although the defendants could not demonstrate formal validity for the LAST, the test satisfied job relatedness in light of the coincidence of three factors the court thought applied under a 1988 U.S. Supreme Court case: (1) the importance given to the ability to write an essay by those education professionals surveyed by the test-maker; (2) the weight of the essay writing portion of the test; and (3) the fact that the majority of plaintiffs would have passed the LAST but for the essay writing portion. Plaintiffs only appealed the district court’s holding with respect to the LAST.

Held: The Second Circuit found that the district court applied the wrong legal standard for job relatedness because the 1988 U.S. Supreme Court case it relied on did not, as the district court thought, “lower the bar” for test validation and had only persuasive force. Instead, the district court was bound by the U.S. Supreme Court’s and Second Circuit’s longstanding content-validation requirements explained in Guardians Ass’n v. Civil Serv. Comm’n of New York, 630 F.2d 79 (2d Cir. 1980): “(1) [T]he test-makers must have conducted a suitable job analysis[;] (2) they must have used reasonable competence in constructing the test itself[;] (3) the content of the test must be related to the content of the job . . .[;] (4) the content of the test must be representative of the content of the job[; and] [there must] be (5) a scoring system that usefully selects from among the applicants those who can better perform the job.” (alteration in original). Because the district court failed to elicit sufficient facts under the Guardians standard, the Second Circuit vacated the district court’s judgment and remanded the case for determination consistent with the appropriate legal standard.

Heath v. Ohio Turnpike Comm’n,
No. 02-3392, 2004 WL 68526 (6th Cir. Jan. 8, 2004).

Plaintiff, an African-American part-time toll collector, brought an action against Defendant, his employer, the Ohio Turnpike Commission, alleging race-based employment discrimination for failure to promote him to a full-time position. In opposing Defendant’s motion for summary judgment, Plaintiff argued generally that the Defendant’s testing and interview processes disproportionately blocked African-American part-time toll collectors from reclassification as full-time collectors. In support of his contention, Plaintiff presented evidence of a statistical disparity: the reclassification of twenty-six white and zero black part-time toll collectors between 1997 and early 2000. The court concluded, however, that Plaintiff’s statistical evidence was not “sufficiently substantial” to raise “an inference of causation.” In reaching this conclusion, the court found persuasive the fact that Plaintiff’s own expert witness preliminarily concluded that, given the racial composition of the applicant pool, the number of black applicants selected for reclassification did not differ significantly from what would be expected absent improper discrimination. No contrary opinion was offered. Thus, because Plaintiff failed to make out a prima facie “disparate impact” case, summary judgment was appropriate for Defendant.

International Bhd. of Elec. Workers v. Mississippi Power & Light Co.,
442 F.3d 313 (5th Cir. 2006).

This disparate impact lawsuit by African-American employees and their unions against MP&L stemmed from a Reduction-in-Force plan which enabled laid-off employees with a certain measure of seniority to “bump” into junior positions, provided that the more senior employees could qualify for the new positions. To qualify for the clerical positions of Storekeeper and Plant Storekeeper, the plaintiffs had to pass a validated aptitude test called the Clerical Aptitude Battery (“CAB”). Neither of the plaintiffs met the cutoff score. Consequently, they were not allowed to bump into these positions. CAB is a test developed by EEI, which is responsible for validating the test by establishing the statistical correlation between success on the test and success in relevant jobs. EEI also provides suggested scores and ranges to employers and requires employers to be certified to conduct the test, whereafter the employer may set and vary its own cutoff scores.

The plaintiffs did not challenge the exam’s validity. Instead, they challenged the method of setting the cutoff scores for the Storekeeper positions. The score cutoff was originally 178 for this position, as recommended by EEI. After MP&L notified EEI of the high turnover and low passage rates, EEI recommended use of 150 as the cutoff score, which was so used from 1989 to 1993. In 1993, MP&L was acquired by Entergy, which raised the cutoff score to 180, at least in part for uniformity purposes. The plaintiffs could not meet the 180 cutoff score.

The defendants conceded the plaintiffs’ prima facie case of disparate impact but demonstrated that increasing the score significantly increases the likelihood that successful applicants for the position will develop into proficient employees. (Defendants’ expert showed that the 180 cutoff score created a 50% chance that an applicant would develop into an above-average worker and only a 31% chance of being in the bottom third. The 150 cutoff score made it equally likely—39%—that the candidate would become above average or end up at the bottom.) Despite MP&L’s showing of business necessity, the district court found in favor of the plaintiffs by imposing on defendants the burden of demonstrating the absence of acceptable alternative employment practices and finding that they failed to meet their burden. The Fifth Circuit reversed, finding that the district court had inappropriately placed the burden on the defendants to show that raising the cutoff score to 180 was the only means to achieve its legitimate business purpose. After reviewing the “direct and unambiguous statutory language” that Congress used to explain the disparate impact framework as well as the Supreme Court’s prior disparate impact decisions, the Fifth Circuit ruled that it is the plaintiff’s burden to show that there exists an acceptable alternative employment practice to the one at issue. Because the plaintiffs provided no meaningful alternative to the challenged testing practices in this case, they failed to meet their burden.

Isabel v. City of Memphis,
404 F.3d 404 (6th Cir. 2005).

African-American police officers alleged that a facially-neutral written test administered by the City of Memphis had a disparate impact on minority promotions within the police department, and brought this action against the City.

In this case, the Sixth Circuit Court of Appeals repudiated the notion that the EEOC’s four-fifths rule is dispositive of disparate impact. Despite the fact that the test administered by the City of Memphis did not result in a violation of the four-fifths rule, the court allowed admission of two other statistical analyses (the “T” and “Z” tests) to support a finding of disparate impact.

Background: In July of 2000, 120 sergeants applied for lieutenant positions; of those, 63 were African-American, and 57 Caucasian. The promotional process consisted of four parts: (1) a written test, (2) a practical exercise test, (3) performance evaluations, and (4) seniority points. The four components would account for 20%, 50%, 20%, and 10%, respectively, of each applicant’s score. However, only those who passed the written test, which had a cut-off score of 70, would be allowed to continue in the promotion process. Finding that the 70 cut-off score violated the four-fifths rule, the City of Memphis reduced the cut-off score to 66.

The test was developed and managed by Dr. Mark Jones, who created the exam in cooperation with Memphis police officers. Dr. Jones and expert officers constructed questions aimed at identifying the knowledge necessary for performing identified lieutenant tasks. Originally, Dr. Jones advised against implementing a cut-off score; he testified at trial that he was forced to do so by the union and admitted that the cut-off score “was incapable of distinguishing between candidates who can and cannot perform the job of lieutenant.”

98 candidates passed the test under the new 66 cutoff score; 51 were white and 47 were black. All parties agreed that, according to the numbers, the test did not violate the four-fifths rule established by the EEOC.

The City asserted on appeal that this fact was evidence of no disparate impact and that the four-fifths test was the only statistical analysis that the court should consider. The court disagreed, upholding the district court’s use of the “T-test” which evaluated the difference between the mean scores of whites and blacks on the test, and the “Z-test” which measured the passage rate across groups, showing a white passage rate of 90% and a minority passage rate of 74.6%. Plaintiff’s expert, Dr. Richard Deshon, testified that the difference of 15.4% was statistically significant. The four plaintiffs in this case scored below 66 and thus were not allowed to proceed to subsequent promotional assessments.
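
The arithmetic behind these two measures can be illustrated with the counts reported above (57 white and 63 African-American candidates, of whom 51 and 47, respectively, passed at the 66 cutoff). The following Python sketch is illustrative only. It assumes a standard pooled two-proportion z-test, which may differ from the method Dr. Deshon actually used, and the function names are hypothetical.

```python
import math


def selection_rate(passed, applicants):
    """Fraction of a group's applicants who passed."""
    return passed / applicants


def impact_ratio(focal_rate, reference_rate):
    """Ratio compared against 0.80 under the four-fifths rule of thumb."""
    return focal_rate / reference_rate


def two_proportion_z(pass_a, n_a, pass_b, n_b):
    """Pooled two-proportion z-test (an assumed, textbook formulation).

    Returns the z statistic and a two-sided p-value from the normal curve.
    """
    p_a = pass_a / n_a
    p_b = pass_b / n_b
    pooled = (pass_a + pass_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / std_err
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Counts taken from the case summary: 98 of 120 candidates passed at cutoff 66.
white_rate = selection_rate(51, 57)
black_rate = selection_rate(47, 63)
ratio = impact_ratio(black_rate, white_rate)
z, p_value = two_proportion_z(51, 57, 47, 63)

print(f"white pass rate: {white_rate:.3f}")
print(f"black pass rate: {black_rate:.3f}")
print(f"impact ratio:    {ratio:.3f} (four-fifths threshold is 0.800)")
print(f"z statistic:     {z:.2f}, two-sided p-value: {p_value:.3f}")
```

On these counts the impact ratio is roughly 0.83, above the four-fifths threshold, while the z statistic is roughly 2.1, beyond the conventional 1.96 cutoff for significance at the 5% level. This shows how the same data can satisfy the four-fifths rule yet still show a statistically significant disparity.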

Based primarily on the T- and Z-test analyses, the district court found that the written test, as applied, unlawfully discriminated against minority applicants, holding that the Plaintiffs established a prima facie case by: (1) identifying a specific employment practice to be challenged; and (2) through relevant statistical analysis proving that the challenged practice had an adverse impact on African-Americans. Once Plaintiffs set forth this prima facie case, the burden shifted to the City to demonstrate that the challenged practice was job-related for the position in question and consistent with business necessity. The City attempted to meet its burden by presenting proof of content-related validity; it tried to show that the contents of the test had a direct relationship to the contents of the lieutenant job, and thus prove the test was necessary to assess practical performance ability. In its decision, the Sixth Circuit Court of Appeals established a rule for cut-off score validation: “In order to validate a cut-off score, it must be shown that the cutoff score establishes a point of minimal qualification.” In other words, the cutoff score must be able to separate those who are qualified to do the job from those who are not. Here, the district court found that the written test “could not be trusted to be related to actual job performance.” The Court of Appeals found strong evidence in support of the district court’s assessment of content validity, finding especially compelling the fact that a non-minority applicant who scored 66 (and thus was initially eliminated by the 70 cut-off score) ended up being the second-rated candidate overall after the entire promotional procedure.

The Sixth Circuit Court of Appeals affirmed the district court’s ruling on the grounds that the Plaintiffs established a prima facie case of disparate impact and on the grounds that the City failed to demonstrate that the challenged practice had content-related validity and was an approximation of a candidate’s potential job performance such that the cut-off score was a point of minimal qualification.

Note: the Court of Appeals also upheld the district court’s remedy, ordering the promotion of all Plaintiffs to lieutenant and awarding Plaintiffs attorney’s fees.

Kohlbek v. Omaha,
447 F.3d 552 (8th Cir. 2006).

White firefighters brought a reverse discrimination action against Omaha alleging they were not promoted due to the City’s implementation of Omaha’s most recent affirmative action plan (“Plan”), established as part of its mission to integrate the fire department. Omaha’s position was that the Plan was consistent with the OFCCP’s Guidelines.

Plan: In accordance with the Guidelines, the City determined an “availability” percentage in a given job group by considering the percentage of qualified minorities and the percentage of minorities among all those who are “promotable, transferable and trainable” in the organization. While the City calculated its internal availability rate for African-Americans to be 11.4% for fire captain and 5.8% for battalion chief, the actual percentages were 7.7% and 3.1%, respectively. The City used the term “underutilization” when the minority representation in a position was not within half of a person of the goal. The City utilized both written and practical tests to determine those who would be placed on a promotion eligibility list in rank order of their scores. When a promotion position opened, the fire department’s personnel director would send the names of the top five candidates to the fire chief, who conducted individual reviews based on test scores, ranking, seniority, education, discipline record, job performance and attendance. However, when there was underutilization for a position, the fire chief also considered race when making promotion decisions.

The plaintiffs: Caucasian Plaintiff 1 passed the August 2000 promotion exam for battalion chief and was ranked eleventh on the eligibility list, while an African-American was ranked twentieth. Following various promotions of higher ranked candidates, Plaintiff 1 was ranked second and the African-American was ranked below him. Two positions opened within six days of each other. The African-American, although ranked lower, was appointed to the first open position of battalion chief. The fire chief made this decision, at least in part, because African-Americans were underutilized in that position. The first-ranked candidate was appointed to the next open position. Had the fire chief followed the rank ordering, Plaintiff 1 would have received the second open position. Following various promotions, Caucasian Plaintiff 2 was ranked second and two African-Americans were ranked thirty-second and thirty-third. Because these two African-Americans were thereafter promoted out of rank order, which the chief testified probably would not have happened absent the Plan, Plaintiff 2 was not, but would have been, promoted.

The issue presented was whether the racial classifications used in making promotional decisions under the 2002 Plan were constitutional. The Court analyzed the Plan under the strict scrutiny standard of the Equal Protection Clause of the Fourteenth Amendment and reversed the district court’s holding that favored the defendant, finding that the Plan was not narrowly tailored to remedy specifically identified past discrimination. In so holding, the Court determined that the City’s method of determining underutilization by using the half person rule did not truly coincide with situations where discrimination could be legally inferred. Rather, the half person rule did not require a statistically-significant showing of discrimination before the chief began considering race when making promotion decisions. Therefore, the Court held that the 2002 Plan made racial classifications beyond its interest in remedying identifiable racial discrimination, both in general and with respect to the specific promotion decisions challenged by the plaintiffs.

Lanning v. Southeastern Pennsylvania Transp. Auth.,
181 F.3d 478 (3d Cir. 1999), cert. denied, 528 U.S. 1131 (2000), same holding on remand, 84 Fair Empl. Prac. Cas. (BNA) 1012 (E.D. Pa. 2000).

Five women sued SEPTA under Title VII, in a class action on behalf of all female applicants who applied for positions as police officers with the transportation authority during a two-year period. All applicants for these positions were subject to a physical fitness test, the first element of which was a 1.5 mile run which had to be completed in 12 minutes. Failure to do so disqualified the applicant. The district court ruled in favor of the Authority following a bench trial. On review the Third Circuit vacated that judgment and remanded the case.

In 1991 SEPTA hired an expert to develop a physical fitness test as a means of enhancing the level of fitness, physical vigor, and general productivity of its police force. The expert used twenty experienced officers as subject-matter experts, and, following a job study, he concluded that running, jogging and walking were important transit officer tasks and that they jogged on an almost daily basis. The subject-matter experts concluded further that it was reasonable to expect transit officers to run one mile in uniform, with all of their gear, in 11.78 minutes. The expert rejected this conclusion and determined that the correct standard instead should be for them to run 1.5 miles within 12 minutes. He calculated the aerobic capacity which this exertion represented and concluded that it was the level necessary to perform the job of a transit officer.

During the test’s use, female pass rates were significantly different from male pass rates. In one year the female pass rate was 12% while the male rate was nearly 60%. In other years the female pass rate was 6.7%, compared to a 55.6% pass rate for men. The Transit Authority conceded that this part of the fitness test had a disparate impact on women. The Transit Authority had also begun testing incumbent police officers for their aerobic capacity. Following a protest by the union, however, it discontinued disciplining officers based on poor performance on the test, and instead began an incentive reward program to encourage officers to meet fitness goals. Internal documents from the Authority showed that 86 percent of incumbent officers met the fitness standards, but also that the Authority had never taken any steps to determine whether officers who failed the fitness test performed poorly or hindered the Authority in its ability to meet its goals. Included among the plaintiffs’ evidence was the experience of a female officer who failed the 1.5 mile run, and was hired due to a clerical error. That officer during several years of service was decorated and nominated repeatedly for annual or quarterly awards. She was commended for her outstanding performance and served as one of two defensive tactics instructors. There was also evidence showing that there was an extremely low number of women in the transit police force: most recently 16 out of 234.

The district court found that the Authority had established that the level of aerobic capacity it targeted as required was job-related and consistent with business necessity. It also accepted the expert’s study as demonstrating “the manifest relationship of aerobic capacity to the critical and important duties” of a transit police officer. In a lengthy opinion discussing previous significant cases on business necessity, the appeals court concluded that the district court did not apply the correct legal standard, and accepted on the Authority’s expert’s qualifications alone the justification for the targeted aerobic capacity cutoff. The district court nowhere made its own independent analysis of whether the aerobic capacity which was required for the 1.5 mile run actually reflected the minimum aerobic capacity which was necessary to perform successfully the job of a police officer. The appeals court rejected the conclusion that more is better when it comes to fitness or aerobic capacity and said that concept has no bearing on the appropriate cutoff time, which should reflect minimum qualifications necessary to perform successfully the job in question.

Lanning v. Southeastern Pennsylvania Transp. Auth.,
308 F.3d 286 (3d Cir. 2002).

On remand, the district court entered judgment in favor of the defendants. Plaintiffs appealed the district court’s decision. The sole issue on remand had been whether or not SEPTA proved that its 42.5 mL/kg/min aerobic capacity standard measured the minimum qualifications necessary for the successful performance of the job of SEPTA transit officers. The plaintiffs argued that it had not, because a significant number of individuals who failed the run test could perform at least certain critical job tasks. SEPTA argued that the run test measured the “minimum qualifications necessary” in terms of aerobic capacity to successfully perform as a SEPTA transit officer because the relevant studies indicated that individuals who failed the test would be less likely to successfully execute critical policing tasks. The appeals court affirmed the district court’s grant of judgment in favor of the defendant. In doing so, the appeals court concluded that the defendant produced more than sufficient competent evidence to support the finding that a pre-hire, pre-academy training aerobic capacity of 42.5 mL/kg/min measured the minimum qualifications necessary for successful performance as a SEPTA transit officer, and thus, justified the conceded disparate impact on female candidates as a business necessity.

In determining whether the run test did indeed measure the minimum qualifications necessary for the job, the district court had credited a study that evaluated the correlations between a successful run time and performance on 12 job standards. The study found that individuals who passed the run test had a success rate on the job standards ranging from 70% to 90%. The success rate of the individuals who failed the run test ranged from 5% to 20%. The district court found that such a low rate of success was unacceptable for employees who are regularly called upon to protect the public. The court of appeals stated that, in doing so, the district court implicitly defined the “minimum qualifications necessary” as meaning “likely to be able to do the job.” Other studies cited by the district court offered similar results and showed that the defendant’s experts set the run cut off time at 12 minutes for objective reasons. For example, in one study, 80% of those passing the defendant’s run test met minimum job standards, while only 33% of those failing did. Another study showed that 84% of those passing the test could carry out an emergency assist, while only 14% of the failing group were able to do so.

The lengthy dissent stated that the district court’s findings of fact were erroneous. According to the dissent, it was clear from the record that no real attempt was made to establish either criterion-related or construct validity for SEPTA’s test, because no empirical data was submitted to show the required correlation between tested running times and ultimate job success. The only attempt to establish a correlation to actual job performance was an arrest analysis. The dissent stated that this analysis neither encompassed a representative spectrum of SEPTA transit officer job duties nor evidenced any unsatisfactory performance by those who failed to pass the test.

Mems v. City of St. Paul,
224 F.3d 735 (8th Cir. 2000).

Six African-American firefighters sued the St. Paul Fire Department alleging racial harassment of black firefighters who worked the night shift and disparate impact in the written examination given for promotion to captain. The district court granted summary judgment to the Fire Department on both claims. On this appeal the Court of Appeals reversed summary judgment for the Fire Department on the hostile working environment claim, holding that genuine issues of material fact existed with respect to the claim and summary judgment was inappropriate. The Appeals Court affirmed, however, summary judgment to the Fire Department on the claims of disparate impact in the promotional exam. The Fire Department used a written examination as a prerequisite to being considered for promotion to captain. Both sides agreed that over the four testing years in question, white applicants had a higher pass rate than black applicants when analyzed under the “Four-Fifths Rule.” The Appeals Court looked generally at the statistical record in the case and held it to be not well developed, and not supportive of claims that the Fire Department’s promotional practices in the past have had a general disparate impact on African-Americans. Therefore the Court held the question to be a narrow one of whether the written portion of the examination disparately impacted African-Americans. In reviewing the statistical evidence, the Appeals Court held that the sample size, which ranged from three to seven African-Americans in the several years, was too small to be statistically significant. There was agreement on this point from the statistical experts for both plaintiffs and defendants. In an attempt to overcome this deficiency, the plaintiffs combined all minorities and compared their exam results to that of white firefighters.

This provided a larger sample size than if African-Americans were analyzed alone. The Appeals Court rejected this combined analysis, holding that the plaintiffs’ claims were based exclusively on the impact on African-Americans, and they had not anywhere in their case produced evidence that other minorities were similarly situated or affected by the examination. It held the evidence to be insufficiently statistically significant and affirmed the district court’s grant of summary judgment.

Montemayor v. City of San Antonio,
276 F.3d 687 (5th Cir. 2001).

A jury verdict of nearly $900,000 in favor of a female cadet at the fire department training academy was reversed on appeal, based on her performance on a written examination and skills performance tests. The cadet had been admitted to the training academy pursuant to a previous state court order, after she argued that she was initially denied admission to the academy as retaliation for her claim of harassment during her admission interview because of the sexual content of some questions.

As candidates progressed through the training academy, they were given numerous written examinations as well as skills tests. The fire department’s policy was that cadets could retake written tests if they failed. The policy permitted a cadet to retake only two examinations; if a cadet failed a third written examination, he or she was dismissed from the academy, and was not permitted to retake the third examination. The plaintiff argued that her dismissal from the academy was based on further retaliation for her complaints about the conduct of her original interview. However, there was evidence in addition to her performance on the written examinations that she was a sub-standard cadet. In particular she failed to meet a minimum standard on a test of her ability to connect hoses and operate fire equipment, including power saws. The plaintiff agreed that the tests that she failed were for operations that would be critical to her performance as a firefighter. In addition, there was evidence that the policy about retests and dismissal after failing a third test was consistently applied, and that no cadet had ever been accepted as a firefighter who failed a third examination. This led the court to grant judgment as a matter of law to the city and reverse a jury verdict of $877,000. A modest jury verdict of $23,000 was upheld for her original claim that she had initially been denied admission to the academy based on retaliation for her objection to the sexually harassing questions at her interview. The content of the written examinations is not discussed and was not an element in this action.

Paige v. California Highway Patrol,
291 F.3d 1141 (9th Cir. 2002), cert. denied, 123 S. Ct. 1256 (2003).

A class of minority California Highway Patrol officers challenged the promotion process as having a disparate impact on minorities. In this decision the Ninth Circuit reversed and remanded the district court’s decision on statistical and validation issues. The challenged promotional process applied to all positions above entry level patrol officer, and included various written and oral examinations. The Highway Patrol argued that the appropriate statistical analysis of the outcome of the process was to look at each separate supervisory position. The plaintiffs, however, argued that aggregation of the data on all supervisory positions would be more probative than subdivided data. The appeals court agreed. Although there were different questions and different exam topics for each supervisory rank, the court found sufficient commonality among the duties and skills required by the various positions to justify aggregation. The plaintiffs also argued that the appropriate statistical analysis should compare the percentage of minorities who successfully completed the promotion process to external census data on all law enforcement officers in the geographic area. They further argued that the analysis should consider separate minority subgroups. The district court had agreed that use of external data for the statistical analysis was appropriate, but had held that the analysis should compare white and nonwhite pools, rather than analyzing minority subgroups. The appeals court agreed with the latter finding, holding that where employment practices have identical discriminatory effects on members of all minority groups, and benefit only members of the white majority, there is no basis for evaluating individual minority groups. However, it rejected the use of external census data in analyzing the disparate impact of the promotional process. It noted that the California Highway Patrol promotional process was a closed process, and the only eligible candidates were officers who were already employed by the CHP. Therefore, the appropriate pool on which to base the statistical analysis of the promotion process was the pool of candidates eligible to apply for promotion, and that was the population of nonsupervisory officers within the CHP, not the external population of all area law enforcement officers. It did permit the use of data on examinations and eligibility lists from years prior to the start of liability, holding that the process operated as an ongoing discriminatory policy and practice. Finally, it found that the CHP had failed to show appropriate validation of the process as a whole, and of its individual parts, and there was no evidence that the examination tested for skills that were critical to performing well in the supervisory ranks. This opinion provides no discussion on validation efforts, if any, that had occurred.

Patterson v. Illinois Dept. of Corrections,
No. 01-3456, 2002 WL 1352462 (7th Cir. June 13, 2002).

A former state correctional officer brought an action alleging that his termination for refusing to undergo a mandatory tuberculosis test violated the Rehabilitation Act of 1973 and the ADA. Plaintiff, as part of an annual screening, took a tuberculosis skin test (the “Mantoux test”). Plaintiff’s results were negative, but he subsequently suffered an allergic reaction to the test and was taken to the emergency room. Plaintiff’s personal physician advised him not to take the Mantoux test again. For the next two years, plaintiff took a chest X-ray in lieu of undergoing the Mantoux test. In the third year, defendant required all of its employees who had not previously tested positive to submit to the Mantoux test. Plaintiff refused and was suspended from work for ten days. An independent physician examined plaintiff and concluded that plaintiff would most likely not suffer another adverse reaction to future Mantoux tests. The defendant then demanded that the plaintiff take the Mantoux test or be fired. Plaintiff refused and submitted a request for accommodation asking that defendant accept a chest X-ray in lieu of the Mantoux test. Defendant subsequently dismissed plaintiff for refusing to submit to the Mantoux test. Plaintiff argued that the defendant’s tuberculosis testing requirement was an impermissible blanket policy that violated the Rehabilitation Act and the ADA. The crux of plaintiff’s argument was that the defendant’s policy required all employees to take the Mantoux test and did not allow the alternative of chest X-rays for individuals like him who reacted adversely to the Mantoux test in the past. The court disagreed, finding instead that the tuberculosis testing requirement to which plaintiff objected applied to all correctional officers, not only those identified as disabled, and therefore did not constitute an impermissible blanket policy.

Peace v. Wellington,
No. 05-4441, 2006 WL 3017118 (6th Cir. Oct. 23, 2006).

Two former corporals in the Mahoning County, Ohio Sheriff’s Department sued the Sheriff for disparate impact discrimination when they failed a promotion test.

All corporals were required to take and pass a Sergeant’s test or else be returned to the rank of Deputy, but with no loss of pay or departmental seniority. Those who passed with a composite score of at least seventy percent would be promoted to Sergeant. Of forty-nine deputies and eight corporals who took the exam, forty deputies and four corporals passed with a score of at least seventy percent. Plaintiffs failed the exam. Plaintiffs’ disparate impact claim was dismissed on summary judgment by the trial court for failure to provide evidence establishing a prima facie case.

The Sixth Circuit affirmed the decision below, holding that the Plaintiffs presented “virtually no statistical substantiation for their disparate impact claim.” Although the Supreme Court has not specified the type(s) of statistical evidence upon which courts hearing disparate impact claims should or may rely, a plaintiff who presents statistical evidence need not rule out all other potential variables nor prove discrimination with scientific certainty. A plaintiff must, however, prove relevant adverse impact by a preponderance of the evidence. Here, the Sixth Circuit found that the plaintiffs’ conclusory assertion that “75% of those demoted to deputy were African-American”— with no substantiation of this figure and with no other evidence, such as data regarding the other corporals and deputies who took the exam or the total pool of test-takers—does not establish a prima facie case of disparate impact.

Price v. M&H Valve Co.,
No. 03-02785-CV-PT-M, 2006 WL 897231 (11th Cir. Apr. 7, 2006).

An African-American current employee of M&H Valve brought a Title VII cause of action alleging, inter alia, disparate impact in training based on race. M&H Valve had instituted a supervisor-training program as part of its conciliation agreement with the EEOC due to prior charges involving disparate promotions. The program was intended to provide all qualified employees interested in a supervisory position with the means to improve on skills necessary for such a promotion. To qualify for the program, applicants had to demonstrate at least a tenth-grade equivalency on the TABE exam, a threshold M&H Valve management determined its supervisor positions required. The TABE exam is a “norm-based” exam that tests grade levels in math, spelling, language, and reading. TABE is also one of only three tests used by Alabama for adult applicants seeking to obtain a GED.

The plaintiff performed at only the 4.5, 1.3, and 5.7 grade levels in reading, language and spelling, and math, respectively, and therefore was not eligible for the supervisor-training program. Plaintiff, in turn, alleged that M&H Valve failed to provide him with training opportunities that the Company provided to Caucasian employees.

The plaintiff provided evidence that there were no African-Americans in the current supervisor-training program class. M&H Valve responded on two fronts. To the extent the plaintiff alleged the TABE exam had a disparate impact on screening applicants, M&H Valve argued that the plaintiff could not produce sufficient statistical evidence to buttress his claim, nor could he show that M&H Valve’s legitimate business reasons for the test were pretextual. M&H also argued that the plaintiff failed to show that the actual training program, as opposed to the testing, had a disparate impact.

The Eleventh Circuit affirmed the district court’s grant of summary judgment in favor of M&H Valve, ostensibly because the plaintiff failed to exhaust his administrative remedies for his disparate impact claim by failing to provide adequate notice in his EEOC Charge. The Court nonetheless stated, in dicta, that even assuming that the plaintiff had identified a particular, facially-neutral employment practice, he could not establish a prima facie case of disparate impact because “he failed to demonstrate a statistical disparity in the racial composition of employees benefiting from the practice and those qualified to benefit from the practice.” In reaching that conclusion, the Court relied on precedent that “statistics based on an applicant pool containing individuals [such as the plaintiff] lacking minimal qualifications for the job would be of little probative value.”

Rutherford v. City of Cleveland,
No. 04-3904, 2006 WL 1526091 (6th Cir. June 1, 2006).

Background: Nonminority police officers challenged the City’s continued compliance with a consent decree’s race-based hiring plan issued to remedy the discriminatory impact that resulted from a police entrance exam administered in the 1960s and 1970s. In a prior case, the district court found that the entrance exam had a disparate impact on minorities and was not validated for job performance. As a result, the district court granted a consent decree, which, among other things, required the hiring of a specific percentage of minorities in order to remedy the effect of other disparate impact screening practices. In the 1980s, the district court amended the consent decree to include the following terms: (1) The consent decree would be enforced until 33% of the officers were minorities or until December 31, 1992, whichever came first. (2) However, if the City failed to hire at least 70 officers in any given year, the decree would continue an additional year for each year the City failed to hire 70 officers, unless the 33% was achieved. (3) Last, the City would hire three minority officers for every four non-minority officers and maintain separate eligibility lists for minority and non-minority candidates from which one out of three qualified candidates would be chosen.

In years following the decree, the City never reached the 33% target and failed to hire at least 70 officers in two different years. Therefore, the City continued to comply with the decree through December 31, 1994. Plaintiffs challenged the constitutionality of applying the consent decree from December 31, 1992 through December 31, 1994, arguing that the decree was no longer justified by a compelling governmental interest, nor narrowly tailored to achieve such an interest.
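The extension arithmetic described above can be sketched in a few lines. This is only an illustration of the rule as summarized here, not the decree's actual text; the function name and parameters are hypothetical, and the sketch simplifies by treating a met 33% target as ending the decree no later than the default date.

```python
BASE_END_YEAR = 1992  # default end date in the summary: December 31, 1992


def decree_end_year(shortfall_years: int, target_reached: bool = False) -> int:
    """Return the year at whose end the decree lapses, per the summarized terms.

    shortfall_years: number of years in which the City hired fewer than 70 officers.
    target_reached: whether the 33% minority-representation target was met.
        Simplifying assumption: a met target ends the decree with no extension.
    """
    if shortfall_years < 0:
        raise ValueError("shortfall_years cannot be negative")
    if target_reached:
        return BASE_END_YEAR
    # One additional year of enforcement for each shortfall year.
    return BASE_END_YEAR + shortfall_years


# Two shortfall years and the target never reached, as described in the summary.
print(decree_end_year(shortfall_years=2))
```

With two shortfall years and the target unmet, the sketch yields 1994, matching the December 31, 1994 date through which the City continued to comply.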

Held: The Sixth Circuit reasoned that the decree’s constitutionality should be analyzed by using the facts and circumstances existing at the time the court enacted the race-based remedy, as opposed to the time period being challenged. According to the Court, this is especially true when the practice of discrimination has presumably halted but its effects are still felt. Therefore, the Sixth Circuit found that because the vestiges of racial discrimination had not been sufficiently relieved by 1992, and because the decree continued to be narrowly tailored to remedy the same, the City continued to be constitutionally bound by the decree’s terms. The Court did not agree that the City’s use of validated examinations for years prior to the 1993-1994 time period made the decree’s race-based relief unnecessary, as the facts showed that such exams were not effectively remedying the effects of past discrimination.

Sledge v. Goodyear Dunlop Tires N. Am., Ltd.,
275 F.3d 1014 (11th Cir. 2001).

The timing of the development of a test, the process by which it was administered, and the use of its results led a federal appeals court to vacate summary judgment in favor of an employer. The plaintiff was a black employee at a tire factory employed as a tire “builder,” who received significant experience as a maintenance mechanic in the course of his duties. He indicated to the Human Resources Department that he would like to be moved into a maintenance position, which would be a promotion. The collective bargaining agreement required posting of all open positions. Pursuant to the process, employees placed their name on the posting and the Human Resources Department selected those who would be interviewed. Plaintiff indicated his interest in three mechanic positions. At the time, there were 107 mechanics, of whom all but one were white. The plaintiff was not interviewed for any of three openings, and in order to bolster the possibility of promotion, he asked his supervisors in the tire building department to sign a letter stating that he was qualified to be a maintenance mechanic, which they did. In that same month, the plant engineer developed a two-part test for selecting those to be interviewed for maintenance mechanic positions. The first part was a written examination including mathematic problems and requiring identification of various tools. The second, “practical” part was administered by the maintenance supervisors and required the applicant to repair certain pieces of machinery and demonstrate welding skills. A score of 75% was required on the written examination, and, to be certified for an interview, supervisors needed to agree that the candidate had passed the practical part of the test.

Immediately following the implementation of the test, the plaintiff posted for two mechanic positions. Several applicants were tested, but he was not. He asked to take the exam but was denied that opportunity. Both positions were filled by whites who had not taken the test. He protested the awarding of the positions to two individuals who had not taken the written examination, and reiterated his request that Human Resources give him a chance to take the test. Before he received a response, another opening was posted, and he applied again. Several applicants were tested, but he was not. The job went to a white applicant who failed the examination. The plaintiff continued to protest, and the company changed the first part of the test, expanding it to include 169 possible points, compared to the previous 99 points. Of the 70 additional questions, 31 were on drawing and reading and 40 were word problems. A white applicant took the new test for a mechanic opening, and he was told to disregard the word problems. He passed on that basis and also passed the practical test. Two days later, following a 12-hour shift that had begun the previous evening, Human Resources told the plaintiff that if he wanted to take the test he would have to take it immediately. He agreed and took the new expanded test, including the word problems. He failed the written exam but passed the practical section, which he took immediately following the written test. He asked to review his written examination, but was denied permission. Evidence before the court showed that he had passed the original, 99-point test, but with the word problem scores included, he did not achieve a passing grade. He was not retested. Although he posted on subsequent openings, he was denied an interview based on his failing score. A white applicant who filed a grievance with the union about his failure on the longer written test was permitted to retake the test.
To avoid a claim of race discrimination, the Human Resources Department permitted the plaintiff to retake the test as well. The test they were given, however, was a new test designed by a member of the engineering staff. Neither of them passed the test, but the white candidate was given the job. In light of this history, the appeals court held that a reasonable jury could find that the plaintiff was qualified for a maintenance mechanic position and that the written examinations developed over a six-month period were nothing more than a pretext for racial discrimination. The test was not given to all applicants, and a passing score was not required of all individuals selected for maintenance mechanic positions. Furthermore, the administration of the test to the plaintiff at the end of a 12-hour shift, after previously denying him the opportunity to take the test, was also evidence of a lack of fairness.

Stout v. Potter,
276 F.3d 1118 (9th Cir. 2002).

Plaintiff female postal inspectors brought a Title VII action against the postmaster general, alleging denial of promotion on the basis of sex. The plaintiffs alleged both disparate treatment and disparate impact on the basis of sex in violation of Title VII. The district court granted summary judgment for the postmaster general, and the court of appeals affirmed. The court of appeals held that: (1) the female inspectors failed to establish a prima facie case of disparate impact under Title VII and (2) the female inspectors failed to establish that the facially neutral screening process excluded female applicants.

This case arose when the plaintiffs, along with thirty-four other postal inspectors, applied for promotion to Assistant Inspector in Charge (“AIC”). There were five open positions. Six of the thirty-eight applicants for the position were women. A review panel initially screened all applicants on the strength of their applications and their supervisors’ evaluations. The panel identified the most qualified candidates and forwarded their names as potential interviewees to a separate selection committee that made the final selection decisions. From the original pool of thirty-eight, the screening panel identified ten applicants as the most qualified. None of the six female applicants was included. Due to unexpected circumstances, two of the six female applicants were granted interviews in a second screening round. One of those applicants was ultimately promoted.

Finding that Plaintiffs failed to establish a prima facie case of disparate impact, the district court focused on the final results of the promotion process. The district court noted that one out of six female applicants was promoted, whereas three out of 32 male applicants received a promotion to AIC. This meant that female applicants were promoted at a rate of more than 16 percent, compared to a promotion rate for male applicants of less than 10 percent.

The court of appeals focused not on the bottom line of the promotion process but, rather, on the intermediate stage of the promotion process, the interview by the selection committee. The court found that the interview functioned as a pass/fail barrier to further consideration. The court stated that the non-adverse result of the ultimate promotion decisions cannot refute a prima facie case of disparate impact at the dispositive interview selection stage. But because those who were not selected in the first round were again considered in the second round, the court found that they could not be analytically separated for purposes of disparate impact analysis. After noting that the probative value of any statistical evidence was limited by the small available sample, the court found that the evidence did not indicate a substantial statistical disparity.

The court noted that the female applicants comprised 13.3 percent (2 of 15) of all those interviewed and 15.8 percent (6 of 38) of the original applicant pool. The percentage of interviewees who were female was nearly proportional to the percentage of applicants who were female. The 2.5 percentage point difference was not a substantial or significant statistical disparity. In addition, the court stated that, as a “rule of thumb,” courts have also considered the “four-fifths rule” found in the Uniform Guidelines. Applying this rule, the court observed that the selection rate for female applicants to be interviewed was 33 percent (2 of 6) and the rate for male applicants was 41 percent (13 of 32). This meant that the rate of selection for women was 81 percent of the rate of selection for men, above the four-fifths threshold, again demonstrating that no disparate impact was shown.
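The interview-stage figures can be checked with a short sketch. The function and variable names below are illustrative only and do not come from the opinion or the Uniform Guidelines; the counts are those reported in this summary. Computed from the raw counts, the impact ratio comes out near 82 percent, slightly above the 81 percent reported, presumably because of rounding in the intermediate rates; either figure is above the four-fifths threshold.

```python
from fractions import Fraction

FOUR_FIFTHS = Fraction(4, 5)  # rule-of-thumb threshold from the Uniform Guidelines


def selection_rate(selected: int, applicants: int) -> Fraction:
    """Share of a group's applicants who were selected."""
    if applicants <= 0:
        raise ValueError("applicants must be positive")
    return Fraction(selected, applicants)


def impact_ratio(focal_rate: Fraction, comparison_rate: Fraction) -> Fraction:
    """Focal group's selection rate as a fraction of the comparison group's rate."""
    if comparison_rate == 0:
        raise ValueError("comparison rate must be nonzero")
    return focal_rate / comparison_rate


# Interview stage, using the counts reported in the summary.
women_rate = selection_rate(2, 6)   # about 33 percent
men_rate = selection_rate(13, 32)   # about 41 percent
ratio = impact_ratio(women_rate, men_rate)

# Share comparison: women among interviewees versus women among applicants.
share_interviewed = Fraction(2, 15)  # about 13.3 percent
share_applicants = Fraction(6, 38)   # about 15.8 percent
gap_points = (share_applicants - share_interviewed) * 100

print(f"women: {float(women_rate):.1%}, men: {float(men_rate):.1%}")
print(f"impact ratio: {float(ratio):.1%}")
print(f"meets four-fifths rule of thumb: {ratio >= FOUR_FIFTHS}")
print(f"share gap: {float(gap_points):.1f} percentage points")
```

Exact fractions are used so that the threshold comparison is not affected by floating-point rounding. With samples this small, the court's caution about limited probative value applies to any such calculation.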

Sutherland v. Norfolk Southern Ry. Co.,
No. 02-3321, 2003 WL 1870723 (7th Cir. Apr. 11, 2003).

Female railway employee sued her employer for alleged gender-based failure to promote in violation of Title VII. The district court granted summary judgment for employer. On appeal, the Court of Appeals affirmed the district court and held that the plaintiff failed to establish a prima facie case of discrimination.

The defendant required union employees like Plaintiff who desired to move to non-union positions to score at least 24 on the Wesman Personnel Classification Test (“PCT”); a score below 24 disqualified a candidate from consideration. In 1999, plaintiff scored 17 on the PCT. Under defendant’s policy, candidates were not retested unless they had received additional relevant education. Plaintiff had not received such education; thus she was not retested and not promoted. Between January 2000 and October 2000, seven male Norfolk employees moved from union positions to trainmaster positions in Chicago; all scored above 26 on the PCT.

Plaintiff filed suit. She did not, however, challenge the validity of the test. Rather, Plaintiff proceeded under a disparate treatment theory. The district court concluded: (1) Plaintiff failed to establish a prima facie case of disparate treatment based on Norfolk’s refusal to promote her, because she had not demonstrated that she was qualified for the promotion or that similarly-situated men were promoted; and (2) even if she had, plaintiff failed to demonstrate that Norfolk’s reason for denying her promotion (her low PCT score) was pretextual. Plaintiff appealed. The court of appeals affirmed the district court’s grant of summary judgment for the defendant, reasoning that plaintiff did not score high enough on the PCT to be qualified, and that she had provided no evidence of a male yardmaster with a score below 24 who was promoted. In short, she had not established a prima facie case.

Vaughn v. Watkins Motor Lines, Inc.,
291 F.3d 900 (6th Cir. 2002).

Former employees sued employer-motor carrier for race discrimination. Plaintiffs alleged that defendant discharged them because of their race. Plaintiff Vaughn alleged that he took and passed the required test to be placed in a management position, but was informed that he was not eligible for a promotion in light of his poor attendance record. Vaughn did not dispute his poor attendance, but contested the accuracy of Defendant’s records and, in addition, claimed that several Caucasian employees failed the promotion test but were nevertheless elevated to management positions. The court found that, although Vaughn’s deposition testimony that Caucasian employees had failed the test yet were promoted to supervisory positions might be relevant in a failure-to-promote racial discrimination claim, here neither plaintiff asserted such a claim. Plaintiffs instead alleged that defendant discriminated against them on the basis of their race by terminating their employment. Thus, the defendant’s refusal to promote Vaughn had no bearing on the question of whether plaintiffs established a prima facie case for the particular employment discrimination claim that they asserted. The court of appeals concluded that the district court did not err in granting summary judgment in favor of Watkins on the plaintiffs’ racial discrimination claims.

Zottola v. City of Oakland,
No. 01-15238, 2002 WL 463695 (9th Cir. Mar. 4, 2002).

Plaintiff job applicant brought an action against the city regarding an entrance exam for a firefighter position that allegedly discriminated against Caucasian males. The plaintiff contended that direct evidence of discrimination was established by the fact that eight African-American candidates, interviewed by a three-member all African-American panel, scored significantly higher than the thirty white candidates interviewed by the same panel. The court found that statistical evidence, although probative of motive in a disparate impact case, was not relevant in plaintiff’s case because the sample size was too small, standing alone, to establish a prima facie case of intentional discrimination. Thus, the court concluded that the racial composition of the panel was not sufficient, in combination with the insubstantial statistical evidence, to show intentional discrimination. The plaintiff further contended that the defendant failed to provide sufficient evidence that the oral interview selection process had been properly validated and therefore could not establish business necessity. Plaintiff argued that defendant’s job analysis failed to demonstrate that its interview selection process was job-related. In support of his argument, plaintiff contended that the defendant failed to demonstrate by professionally accepted methods that its oral interview process was predictive of or significantly correlated with the knowledge, skills, and abilities identified in its job analysis as important characteristics for entry-level firefighters. The defendant argued that it relied on content validation to demonstrate that its oral interview was predictive of the knowledge, skills, and abilities for an entry-level firefighter.
The evidence that the defendant presented to validate its oral interview included pre-testing results collected as part of the job analysis; statistical inter-rater reliability studies that were conducted to ensure that the panel scores were reflective of the candidate rather than the rater; anecdotal evidence that the Fire Department “raved” about the candidates who had been hired and that they were performing well in the academy; and expert testimony that these “behavioral consistency orals yield the most reliable and valid results,” and that the use of open-ended oral interview questions was appropriate, professionally accepted, and prevalent throughout the country. Based upon the defendant’s evidence of content validation, the court held that the defendant produced sufficient evidence of validation to send the question to the jury and survive plaintiff’s motion for judgment as a matter of law.


Association of Mexican-American Educators v. California,
937 F. Supp. 1397 (N.D. Cal. 1996), aff’d en banc, 231 F.3d 572 (9th Cir. 2000).

Several diverse groups of minority teachers challenged the California Basic Educational Skills Test (“CBEST”), passage of which is required to teach K-12, and to serve in some administrative and staff positions, in the public schools in California. In a lengthy decision focusing on the various job analyses, validation studies and scoring methods through which the test had been developed and had evolved, the test was found to be “objective, cost-effective, and a valid way to assure that teachers and others employed in the public schools possess basic skills.”

The CBEST is a pass/fail examination composed of three sections: reading, writing and mathematics. The reading and mathematics sections are composed of multiple-choice questions, while the writing portion consists of two essay questions. Passing scores were required for elementary school teachers, secondary school teachers, and numerous non-teaching positions, such as administrators, school counselors and nurses. The test was administered five to six times per year, and candidates could take the test an unlimited number of times. A candidate who failed one or more sections need only retake the failed sections. Limited compensatory scoring was permitted. In evaluating the need for such a test, the court took note of the crucial role that teachers occupy in society, and the fact that the education of all children has a profound effect on the future of the state and the country. It noted that teachers are role models and that students learn not only what they are taught directly but also what they observe. Those teachers and administrators who use improper grammar or who make mistakes in simple calculations model that behavior to the students, to the detriment of their education. The court described the real issue as being whether teachers in California’s public schools, all of whom must already have achieved a college degree, should be required to pass a test of precollege-level skills before they are allowed to teach. In an opinion that noted the substantial adverse impact of the test, the court held that the state was entitled to assure that teachers and others who work in the public schools possess a minimum level of competency in basic reading, writing and math skills before being entrusted with the education of children.

The court used the 80% Rule to evaluate the impact of the test battery, and called the evidence undisputed that an adverse impact existed with respect to first-time CBEST takers who are Latinos, African-Americans, and Asian-Americans. It was not persuaded by defendants’ arguments that cumulative pass rates should be used, rather than first-time pass rates. It further dismissed defendants’ arguments that each subpart should be analyzed for pass rates, rather than the CBEST as a whole. In reviewing the validation of the test, the court found that defendants had shown that the basic skills were important elements in the jobs for which the CBEST was required, and further that CBEST actually tested those skills. It also found that content validation was adequate for the test, where it was not designed to predict a teacher candidate’s performance on the job. It found that content validity would not be appropriate where the test purported to measure a hypothetical construct or trait, unlike the instant case where the CBEST was designed to measure specific well-defined skills. The test had been validated three times. The first was a validity study conducted contemporaneously with the 1982-83 development of the test, then a 1985 validation study, and finally a job analysis and content validity study conducted during 1994 and 1995. The 1982 validation included 289 subject-matter expert (SME) participants who conducted an item content review of the items on the reading and math subtests. They evaluated the relevance of each item to the job of teaching in California on a four-point scale. That study found overall relevance ratings of 76% “critical” or “important” for the reading examination, and 65% for mathematics. The 1985 content validity study was a practitioners’ review, which involved 234 teachers and other school professional employees. That population was 36% minority.
Those panelists rated the CBEST skills as very relevant or moderately relevant at a percentage ranging from 87% to 99%. All the skills tested by CBEST were rated either very or moderately relevant.

The 1994 validation study began with a detailed job analysis, which used a job analysis survey form that included reading skills, writing skills and math skills, as well as various teacher job activities. The survey was completed by approximately 1100 teachers and 230 administrators from a distribution sample of 6000 teachers and 1100 administrators. The study found no meaningful differences among ethnic groups in the skills that were deemed important for teachers, a fact that the court took note of several times in the opinion. The expert who conducted this last validation study (Dr. Kathleen Lundquist) made conservative adjustments in the math and reading skills included in the revised version of the test, which was first administered in August 1995. In summarizing the various efforts at validating the CBEST, the court found that the 1982 validation effort was appropriate under the professional standards of the time. It further noted that it was professionally acceptable to conduct job analysis surveys after the test had been developed. Although plaintiffs suggested that the validation study should have considered separately requirements for beginning rather than experienced teachers, the court noted that the test was one for basic skills and not skills that teachers learn on the job. It further disposed of plaintiffs’ arguments that the time limits had an adverse impact on the plaintiff class. The final portion of the opinion dealt with the setting of passing scores. These had been set by the California Department of Education, but inasmuch as the passing scores which were selected were lower than those recommended by subsequent studies, the court found the levels to be acceptable.

Plaintiffs’ final arguments related to a possible alternative to the CBEST. Suggested alternatives included a bachelor’s degree, completion of an accredited teacher preparation program, subject-matter certification through an examination, or completion of a state-approved coursework program. However, the court found that these could not substitute for the CBEST, and in fact that individuals who had satisfied these requirements still might not pass the CBEST, as experience had shown. Plaintiffs’ suggestion of a GPA requirement or a coursework alternative similarly failed, based in part on the court’s view that there was little control over individuals who had completed course requirements outside the State of California, where information about whether the institution was accredited, or the content of its program, was not known and could not be regulated by the State.

On appeal to a three-judge panel, the Ninth Circuit affirmed the district court’s judgment on all issues except the denial of costs to the prevailing defendants, remanding the case for a determination of the proper costs to be taxed against plaintiffs. However, on a rehearing en banc, the Ninth Circuit withdrew its earlier opinion and affirmed in full. First, the court held that Title VII applies to the California Commission on Teacher Credentialing. Defendants had argued to the contrary, because the examination administered by California to potential teachers is not an employment examination, since the State is not the employer of teachers. The State argued that in requiring the teacher examination, the State is exercising police powers, not proprietary powers, exempting it from Title VII. The Ninth Circuit disagreed, holding that the State is exercising both its police powers and proprietary powers and is therefore subject to Title VII.

The Ninth Circuit also affirmed on alternative grounds, finding that the district court’s factual findings were not clearly in error. In particular, the court held that the job analysis on which the test was based had been completed under professional standards. It found that the use of experts in interviews and observations of teachers to develop a list of activities and skills, which were then ranked to differentiate those that were important or critical, was not flawed. The appeals court agreed that the test had been shown to be predictive of or significantly correlated with elements of work behavior, through professionally accepted methods. It held that the passing scores were reasonable, where the State Superintendent of Education had made the determination of a logical breaking point based on professional estimates. The appeals court also upheld the use by the district court of a technical advisor who advised the court, but did not supply an expert’s report or testify. It noted the absence of any evidence of impropriety in the interaction between the district court and the expert. Finally, the Ninth Circuit held that the district court did not abuse its discretion in declining to award costs to the defendants. Plaintiffs’ lead expert, Dr. Joel Leftowitz, was expressly discredited. The testimony of Plaintiffs’ other expert was not persuasive to the Court. Dr. Kathleen Lundquist was the successful expert in industrial psychology for the defendants.

Barbera v. Metro-Dade County Fire Dept.,
117 F. Supp. 2d 1331 (S.D. Fla. 2000).

A group of white males sued the fire department alleging that they were discriminated against based on a modification of the physical ability test. This opinion grants summary judgment to the fire department, holding that the modifications of the physical ability test were nondiscriminatory. The fire department had adopted an affirmative action program in 1984 with a long-term goal of increasing the representation of minorities and women in the fire department. The affirmative action plan established hiring goals for black, white and Hispanic males and females, in an effort to achieve a demographic make-up that more closely paralleled that of the county. The hiring process for firefighters included an initial screening, a written examination, physical ability test and an oral interview process. Those who successfully completed the oral interview were ranked and placed on an eligibility list. The department’s policies provided for hiring from anywhere on the eligibility list, without regard to rank-order. Out-of-rank-order hiring was used to fulfill the department’s hiring goals. Female and minority candidates were on occasion hired ahead of white male applicants who ranked higher than they did on the eligibility list. After its initial use of the physical ability test, the department modified the station used for the hose hoist element, by creating more room for the applicant to work, and modified the hose to make it easier to carry. In addition, the department ordered smaller sized gloves to accommodate applicants who needed those. In granting summary judgment to the fire department, the Court noted that the modifications to the physical ability test were made for the legitimate purpose of maximizing the number of qualified female applicants in the hiring pool. It found that the modifications the department had made eliminated non-essential obstacles which adversely affected females, and that those modifications did not constitute discrimination.

Bert v. AK Steel Corp.,
2007 WL 184746 (S.D. Ohio 2007).

Plaintiffs were sixteen unsuccessful African-American applicants for employment with the defendant. Plaintiffs sought class certification and alleged that defendant’s written pre-employment screening test had a disparate impact on African-Americans. The court concluded that Rule 23(b)(2) certification was appropriate for the question of whether the challenged test had a statistically significant disparate impact on the African-American test takers. The Court conditionally certified two subclasses, one for each group of African-American applicants at the Middletown, Ohio and Ashland, Kentucky locations who took and failed the test within the 300-day period preceding the relevant charges.

Bert v. AK Steel Corp.,
No. 1:02-CV-00467, 2006 WL 1071872 (S.D. Ohio Apr. 24, 2006)

Nine of the sixteen original unsuccessful applicant plaintiffs sought class certification for African-Americans who failed a written, pre-employment test, alleging that the test has a disparate impact on African-American applicants. After a state agency screens applicants for minimum requirements (such as age and education), but before applicants are invited for an interview, applicants are invited to take the subject screening exam. The exam is graded by an outside consultant on a pass/fail basis, and only those who pass receive an interview invitation.

With respect to numerosity, the Court concluded that although the numbers are not large (30-60), other factors such as the low likelihood that all failing African-American applicants would pursue a separate claim initially satisfied the requirement. The Court also found that the issues of whether the Company’s qualifying exam has a statistically significant impact on African-American applicants, and if so whether it violates Title VII, are “clearly common to the proposed Applicant Class.” The Company challenged the plaintiffs’ expert by arguing his opinion was so fatally flawed that no common questions existed. The Court, however, found that he satisfied the lenient standard utilized at the certification stage, but divided the class into subclasses to account for the disparate hiring documentation kept at the two facilities at issue.

The Court next established the time period to use for determining typicality: the 300-day period preceding the date when purported class representatives timely filed their charges. The Court requested additional information from counsel regarding the adequacy of representation requirement. Plaintiffs also sought to be certified under Fed. R. Civ. P. 23(b)(2) by arguing that injunctive relief was appropriate as to the entire class. The Court agreed that injunctive relief against further use of the test was appropriate despite the Company’s argument that it had ceased hiring for the tested positions and had no plans to resume test administration, because such unilateral workforce hiring decisions did not preclude injunctive relief. However, due to, inter alia, the fact that only 42% of all passing applicants are hired by the Company, the plaintiffs were required to submit a more detailed description of a structure for proving their claim for back pay. The plaintiffs’ additional Rule 23(b)(3) argument, that a class action was the superior method because the disparate impact testing claim predominated over any individualized issues, failed in part due to differently situated plaintiffs who raised disparate treatment claims.

The Court suspended its decision on class certification until the plaintiffs filed a supplemental brief addressing open issues such as back pay.

Biondo v. Chicago,
No. 88 C 3773 (N.D. Ill. Dec. 16, 2005).

In 1988, white firefighters challenged a 1986 exam designed to create a list of firefighters fit for promotion to lieutenant due to a “race norming” effort to increase the percentage of African-Americans and Hispanics in higher positions in the fire department. In 2000, the Northern District of Illinois found in favor of the plaintiffs in a trial on the merits, finding that the exam was essentially fair and, therefore, that the race norming efforts were unnecessary and worked to punish the plaintiffs. In 2002, a jury awarded damages to the plaintiffs, but the Seventh Circuit found the damages awards problematic and ordered new trials specifying various limitations on the calculations of certain damages. For more background, see Biondo v. Chicago, 382 F.3d 680 (7th Cir. 2004), cited above.

Ironically, the second jury trial awarded the plaintiffs even more money than they previously received, especially in light of the plaintiffs’ increased pension losses due to the passage of time.

Bradley v. City of Lynn,
443 F. Supp. 2d 145 (D. Mass. 2006).

Plaintiffs in this class action lawsuit allege that the written civil service cognitive ability exam used in 2002 and 2004 to qualify and rank applicants had a disparate impact on the selection and consideration of African-American and Hispanic candidates for entry-level firefighter positions in violation of Title VII. The defendants, the Human Resources Division of the Commonwealth of Massachusetts and the City of Lynn, claim that because of the statutory veterans’ preference, residency requirements, and other selection factors, the exam had no disparate impact on the bottom-line hiring of plaintiffs.

The Test (and other procedures): Both the 2002 and 2004 tests used multiple choice questions to test only cognitive ability. Defendants arbitrarily used the passing point of 70. Passing candidates were placed on eligible lists in order of test ranking, but disabled veterans, veterans, and widows of veterans whose deaths were service-related were ranked, in order of their respective standings, above all other passing candidates. Ranking was also affected by a candidate’s residence and special skills. Appointments were made from the first 2n + 1 candidates named in the list, where n is the number of vacancies. Candidates within the 2n + 1 pool were further screened by use of interviews, drug tests, and background checks. After conditional offers were made, candidates underwent a pass/fail physical abilities test, but that test did not affect a candidate’s ranking. Past Massachusetts civil service exam validation studies and reports indicated that its past exams, which had cognitive and physical agility components, were valid. The 2002 and 2004 exams were designed to be equivalent or comparable to the past exams, but no outside consultants or industrial psychologists assisted, and the 2002 and 2004 exams were not analyzed to ensure, for example, that the new reading level did not exceed the reading level of the job.
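The 2n + 1 rule described above reduces to simple arithmetic on a ranked list. The following is a minimal sketch only: the function name certification_pool and the candidate labels are hypothetical, and the list is assumed to be already ordered after any veterans’ preference, residency, or special-skill adjustments.

```python
def certification_pool(ranked_candidates, vacancies):
    """Return the first 2n + 1 names on a ranked list, where n is the number of vacancies.

    Assumes ranked_candidates is already in final rank order (hypothetical input).
    """
    if vacancies < 1:
        raise ValueError("vacancies must be at least 1")
    pool_size = 2 * vacancies + 1
    # Slicing never runs past the end of the list, so a short list is returned whole.
    return ranked_candidates[:pool_size]


# Hypothetical eligible list, highest-ranked candidate first.
eligible = ["A", "B", "C", "D", "E", "F", "G", "H"]
pool = certification_pool(eligible, 3)
print(len(pool), pool)
```

With three vacancies the pool holds seven names, so a candidate ranked eighth is never reached for consideration regardless of how narrowly that candidate trailed the seventh.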

Analysis: The court ruled against the defendants, finding that analyzing the disparate impact at the “bottom-line” hiring level was inappropriate because the focus should be on “employment and promotion requirements that create a discriminatory bar to opportunities. . . . [as opposed to] the overall number of minority or female applicants actually hired or promoted.” (emphasis in original). Because the exam in this case was used as a ranking mechanism that dictated whether and when a passing candidate was reached for consideration, the court moved on to determine whether it had a disparate impact on hiring. The court utilized the four-fifths rule and chi-square analysis and found that the 2002 and 2004 exams created a disparate impact on minority scores in general and on minority hiring regardless of veteran status, statewide and by city. Based on expert analysis, the court disagreed with the defendants’ arguments that the veteran statutory preference and the “drop out” of minority candidates (drop-outs included those unwilling to serve as well as those screened out based on drug testing, background checks, and physical abilities) caused the disparate hiring statistics.
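The two measures the court relied on can be illustrated with a short calculation. The sketch below is not the experts’ analysis in this case: the counts are hypothetical (30 of 100 in one group passing versus 50 of 100 in another), the function names are invented for illustration, and the chi-square statistic is the plain Pearson form for a 2x2 table without a continuity correction.

```python
def four_fifths_ratio(group_pass, group_total, ref_pass, ref_total):
    """Ratio of one group's selection rate to the reference group's rate."""
    return (group_pass / group_total) / (ref_pass / ref_total)


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic for the table [[a, b], [c, d]] (1 degree of freedom)."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column must have a nonzero total")
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)


# Hypothetical counts: (passed, failed) for each group.
minority_pass, minority_fail = 30, 70
reference_pass, reference_fail = 50, 50

ratio = four_fifths_ratio(minority_pass, 100, reference_pass, 100)
chi2 = chi_square_2x2(minority_pass, minority_fail, reference_pass, reference_fail)

print(f"impact ratio = {ratio:.2f} (below 0.80 indicates adverse impact under the rule)")
print(f"chi-square   = {chi2:.2f} (3.841 is the 5% critical value for 1 degree of freedom)")
```

The ratio is a rule of thumb that ignores sample size, while the chi-square statistic grows with the number of test takers, which is one reason the two measures are often reported together.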

With respect to job relatedness, the defendants attempted to rely on the validation studies of past exams by proffering evidence that job duties had not since changed. The court, however, found that the defendants failed to demonstrate that the exams were job related and consistent with business necessity. Defendants failed to meet this burden because they relied on outdated validation testing, the 2002 and 2004 exams were not professionally created and validated (by failing to use experts in industrial psychology and test development), and the 2002 and 2004 tests differed significantly from the past validated exams as they ranked candidates solely on cognitive scores (which have a relatively low correlation with overall job performance of firefighters) and not on a valid combination of cognitive and physical scores. Additionally, the 2002 and 2004 cognitive test could not be used reliably to predict the overall entry-level firefighter job performance of candidates whose scores fell within eight points of each other. This eight-point margin of error further made the test invalid for ranking passing candidates.

Lastly, the plaintiffs demonstrated the availability of alternative selection devices with less discriminatory effects that would validly serve the defendants’ legitimate interest. One such alternative was to band the test scores of candidates falling within eight points of each other. Adding a physical agility test to the cognitive exam for ranking purposes was another suggestion. Plaintiffs’ general suggested alternatives, without a specific platform, satisfied this third prong. With all three steps of the analysis falling against the defendants, the court found defendants liable under Title VII.
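Banding treats scores that differ by no more than the margin of error as equivalent for ranking purposes. The summary does not say how the bands would have been constructed, so the sketch below assumes one simple approach: fixed bands anchored at the highest remaining score. The function name band_scores and the sample scores are hypothetical.

```python
def band_scores(scores, width=8):
    """Group scores into bands; each band spans at most `width` points below its top score."""
    bands = []
    for score in sorted(scores, reverse=True):
        # bands[-1][0] is the top score of the current band.
        if bands and bands[-1][0] - score <= width:
            bands[-1].append(score)
        else:
            bands.append([score])
    return bands


# Hypothetical passing scores (all at or above a passing point of 70).
sample = [84, 98, 70, 91, 89, 95, 80]
for number, band in enumerate(band_scores(sample), start=1):
    print(f"band {number}: {band}")
```

Candidates within a band would be treated as tied on the cognitive exam, so other selection factors, rather than a difference of a few points, would decide the order in which they are reached.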

Brown v. Chicago,
8 F. Supp. 2d 1095 (N.D. Ill. 1998), aff’d, 200 F.3d 1092 (7th Cir. 2000), cert. denied, 531 U.S. 821 (2000).

Forty-four minority police sergeants challenged the 1994 lieutenant selection examination following their failure to receive a promotion, based on their scores on the test. They alleged that the examination was not job-related or content valid and that, even if it were, there was an equally valid, less discriminatory, selection process which the City refused to use. This opinion found the examination content valid, but agreed that there was a less discriminatory, equally valid method of selecting lieutenants which the City should have used: the merit selection process, combined with the rank-ordered listing of examination scores.

The lieutenant examination had three components: a written examination, an oral examination and an in-basket test. The examination was professionally developed by a consulting firm which specialized in developing selection and promotion testing devices. It was based on a job analysis, observations of the job, and a review of all relevant documentation. Subject-matter experts were used throughout to review, comment and criticize. The oral briefing exercise simulated a situation that might be encountered on the job by a police lieutenant and tested the ability to brief subordinates on a problem or situation. After reviewing a briefing package of documents, the candidate presented an oral briefing which was tape recorded. The recording was evaluated later, in a process which prevented raters from knowing the race or identity of the individuals. The raters were given diversity training, as well as training in how to rate the briefing tapes. The written test was a 150-question multiple-choice examination, taken over a period of two and one-half hours. The test was developed using a reading level which was actually lower than that in the source materials used by a police lieutenant, and had been pilot-tested and reviewed by subject-matter experts. The in-basket simulation used a packet of information, again similar to what a police lieutenant might see on the job. It attempted to present routine job tasks such as reviewing reports, investigating or disciplining subordinates, and assigning tasks, in a simulation. Following a two and one-half hour review of the materials, the candidates were given 90 minutes to answer 60 multiple-choice questions based on the materials. This simulation was pilot-tested as well. The court concluded that the exam was content valid.

After the City received the test results, it determined to promote 13 sergeants, based not on the test results, but instead on merit selection, and a white candidate sued in state court. The state court issued an injunction forbidding the City from using anything other than test rank order. The City dropped its plan to use merit promotions in combination with the test, heeding the injunction of the state court.

In this opinion the federal court held that the City’s problem was of its own making, and that it should have filed an action in federal court to ensure its right to use the combined process. The Court scheduled a later hearing to determine what remedy might be available to the forty-four plaintiffs in this case.

Carrabus v. Schneider,
119 F. Supp. 2d 221 (E.D.N.Y. 2000), aff’d, 2001 WL 699137 (2d Cir. June 20, 2001).

This action follows years of litigation over the selection of police officers for Suffolk County, including a 1986 consent decree which required criterion-related validation of the entry level examination then in use. That examination was administered in 1988, 1992, and 1996 and was developed by the consulting firm of Richardson, Henry, Bellows & Co. In 1999, following a review by the Department of Justice, the county administered another examination developed by the consulting firm of SHL Landy, Jacobs, and this lawsuit ensued. This is a reverse discrimination action brought by white candidates who alleged that their scores were the result of manipulation of raw scores, weighting of certain portions of the test and lowering of the importance of certain cognitive portions of the test at their expense. In granting the County’s motion to dismiss, the court held that, while the exam followed litigation that was aimed at correcting a hiring scheme that disadvantaged minority applicants, there was no indication that the adoption of the new exam was done with an intent to discriminate. In fact all applicants who took the exam were treated and scored in an identical fashion. In addition, the County’s desire to design a hiring exam that would lessen a discriminatory impact on minorities was not in effect a quota system in which candidates were treated differently. In this case all police officer applicants were treated identically. In reviewing the disparate impact claim, the court found insufficient evidence that plaintiffs were disadvantaged by the minimization of cognitive skills in a facially neutral exam.

Castillo v. American Board of Surgery,
221 F. Supp. 2d 564 (E.D. Pa. 2002).

The plaintiff received a medical degree in Peru and was certified to practice general medicine in the United States. He subsequently applied for certification as a surgeon and successfully passed one of the two required Board examinations, the written one. Certification required additionally the passage of an oral exam, and plaintiff three times failed to pass this component. The oral exam included three 30-minute oral sessions conducted by different teams of two examiners. Unsuccessful candidates could be retested twice, but then were required to spend one year in a surgery training program before applying to be retested. Plaintiff twice received failing scores from four of the six examiners, and in his third attempt was given a failing score by five of the six examiners. Candidates could receive a critique of their performance, and were instructed to report any unusual or offensive conduct by the examiners during the testing. After failing the exam for the third time the plaintiff filed suit alleging disparate treatment because of his national origin. The court permitted his claims only as to the last testing process, and his statistics showed that for that test, the Hispanic failure rate was 30% while the Caucasian failure rate was 12.24%. He asserted that this difference established a prima facie case. The court disagreed, however, holding that the numbers were too small to support any inference, where 197 applicants took the exam, and only 10 of those applicants were Hispanic. It further noted that he failed to account for other reasons that applicants failed the exam. In particular he did not consider the number of re-takes, and defendant presented credible evidence that re-examinees, regardless of their race or national origin, failed at a greater rate than first time test takers. The court accepted the Board’s assertion that it had a legitimate nondiscriminatory purpose in failing applicants whom it deemed not qualified to be certified as surgeons. The plaintiff was unable to present any evidence of pretext, particularly where he made no complaints at the time of the exams about offensive or discriminatory behavior by the examiners.
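The small-sample problem the court identified can be made concrete: with only 10 Hispanic test takers, a single additional pass or failure moves the group’s failure rate by 10 percentage points. The sketch below applies Fisher’s exact test, one common small-sample check; the summary does not say which method, if any, the parties or the court applied. The Hispanic counts (3 failures out of 10) follow from the reported 30% rate, but the Caucasian counts (12 failures out of 98) are an assumption chosen only because they reproduce the reported 12.24% rate.

```python
from math import comb


def fisher_exact_one_sided(fail_a, n_a, fail_b, n_b):
    """One-sided Fisher exact p-value: probability that group A shows at least
    fail_a failures if failures were distributed without regard to group."""
    total = n_a + n_b
    total_fail = fail_a + fail_b
    denominator = comb(total, total_fail)
    upper = min(n_a, total_fail)
    # Sum hypergeometric probabilities for fail_a, fail_a + 1, ..., upper failures in group A.
    numerator = sum(
        comb(n_a, k) * comb(n_b, total_fail - k) for k in range(fail_a, upper + 1)
    )
    return numerator / denominator


# 3 of 10 is implied by the summary; 12 of 98 is a hypothetical stand-in.
p = fisher_exact_one_sided(fail_a=3, n_a=10, fail_b=12, n_b=98)
print(f"one-sided p-value = {p:.3f}")

# Sensitivity of the observed rate to a single outcome in a group of 10.
for failures in (2, 3, 4):
    print(f"{failures} of 10 failing -> {failures / 10:.0%}")
```

The calculation ignores the re-take effect the court also noted, so it should be read only as an illustration of how little ten observations can establish.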

Coger v. Connecticut,
309 F. Supp. 2d 274 (D. Conn. 2004), aff’d, Coger v. Conn. Dept. of Public Safety, No. 04-1886-CV, 2005 WL 1800627 (2d Cir. July 27, 2005).

An African-American applicant brought an action against the State, the Department of Public Safety, and the Department of Administrative Services, Bureau of Selection and Training, pursuant to § 1981, Title VII, and the Connecticut Fair Employment Practices Act, alleging discrimination based on race for failure to hire him to the position of Connecticut State Police Officer Trainee.

This case arose after Plaintiff failed the oral portion of a 1995 exam, earning a decile score of three out of ten. Plaintiff claimed that, because he received a decile score of nine on the oral exam he previously took in 1993, his low score on the 1995 exam must be attributable to the record of his previous application and to the fact that he was Black. The only evidence plaintiff cited as support for his claim of discrimination was a letter written to the plaintiff from the Director of Personnel Assessment and Employment Services for the State of Connecticut, in which the Director stated that it was extremely rare for an applicant to have such a sizable difference between two oral examination scores. However, in the same letter, the Director also explained that he reviewed the records from the 1995 examination and saw no errors or problems in the scores Plaintiff was given. The court concluded, therefore, that Plaintiff failed to show that his test was graded in a discriminatory manner or that it did not accurately reflect his performance on the examination or qualifications for the job.

The court further found that the plaintiff failed to offer any evidence from which a rational juror could infer a discriminatory motive on the part of the State employees who administered the exam. Defendant presented evidence, including affidavits regarding the procedures employed by the Department of Administrative Services in administering the oral exam portion of the selection process, which showed that it was administered in full compliance with EEOC Guidelines. For example, the examination did not test knowledge of police procedure or laws, because of their potential discriminatory impact on protected classes. Rather, the examination tested what the department had identified as important skill sets for a police officer, such as visual observation and ability to analyze situations. Furthermore, Defendant’s evidence showed that, in an attempt to eliminate any potential bias against protected classes by exam administrators, Defendant standardized the test and trained monitors and examiners in standardized grading.

In addition, the Director of Personnel Assessment and Staffing testified regarding the administration of plaintiff’s examination. The Board grading plaintiff’s oral exam was comprised of two employees of the Department of Public Safety, an African-American male Sergeant, and the commanding officer of the Polygraph Unit of the Department of Public Safety, a white male Trooper. A Department of Administrative Services Monitor, a white female, was also present to oversee the test administration. The Sergeant’s affidavit stated that, during the exam, plaintiff was shown a videotape of a series of four situations and asked questions concerning what he observed while watching the tape. The questions asked were prepared questions asked of each candidate. Model answers for each question had previously been prepared, and the Sergeant used these model answers to determine whether the plaintiff’s response was correct. According to the Sergeant, when plaintiff responded with an answer which identified a fact or observation listed as a correct response, the Sergeant gave him credit for that correct answer. The Sergeant asserted that he recorded his corresponding scores as plaintiff gave responses to the questions, but that the plaintiff’s responses to the questions did not have the elements which were being sought on the score sheet, and therefore he received a low score. The examination team then compared scores at the end of the examination to ensure that the Sergeant and the Trooper did not deviate by more than one point from each other on the scores given. Therefore, there was no evidence before the court to suggest that the normal examination procedures were not followed with respect to plaintiff’s examination, or that the standardization mechanisms were not in place to ensure plaintiff a bias-free evaluation.

Finally, the court noted that Board Four, the same Board that examined plaintiff, passed eight of the twelve black candidates who took the exam. The court then relied upon the Supreme Court’s recognition in St. Mary’s Honor Ctr. v. Hicks, 509 U.S. 502, 513-14 (1993), that in determining whether discrimination was a motivating factor for an adverse employment decision, evidence of an employer’s non-discrimination can be considered as well. In sum, because Plaintiff offered no evidence, other than his own conclusory affidavit, that he was scored unfairly on the test, the court concluded that he could not survive defendant’s motion for summary judgment.

Davidson v. Citizens Gas & Coke Utility,
470 F. Supp. 2d 934 (S.D. Ind. 2007).

Eight hourly employees of Citizens Gas & Coke Utility (“Gas & Coke” or “defendant”) and two unsuccessful applicants for employment, all African-American, sued Gas & Coke for, inter alia, disparate impact under Title VII. Plaintiffs alleged that the written test called work competency assessment (“WCA” or “test”), which was used to screen interdivision transfers, promotions, and applicants, had a disparate impact on African-Americans. The two applicants that were not hired conceded to the defendant’s motion for summary judgment. The trial court ruled against Gas & Coke and in favor of the remaining plaintiffs on summary judgment.

The WCA was developed by Roland Guay, a Purdue University professor in Organizational Leadership and Supervision. It was intended to allow Gas & Coke to measure a person’s ability to take on certain jobs. A test taker was required to score at a certain level, depending on the position, to be deemed eligible for hire, promotion, or transfer. Plaintiffs provided statistical evidence that African-Americans scored disproportionately lower than others taking the WCA, thereby reducing their chances of promotion. Plaintiffs also offered expert testimony that the WCA was not properly validated. Defendant made two arguments. Defendant argued that there were no statistically significant differences in selection rates based on race for all but one of the job postings for which any plaintiff applied. Defendant also argued that the plaintiffs lacked standing, because they were not otherwise qualified for the positions each sought.

The Court found that Plaintiffs satisfied their prima facie case because the test results undisputably fell within the guidelines established for disparate impact when a minority group pass-rate is less than 80% of the pass-rate of nonminorities, otherwise known as the EEOC’s “four-fifths” rule. The Court rejected the defendant’s first argument, holding that it was essentially a “bottom-line” defense, which the Supreme Court has declared unavailable. Defendant’s argument that there was no significant difference in “selection rates based on race” is the argument that defendant had a favorable bottom-line. “Where a challenged practice is an examination that is being used as a bright-line barrier to promotion . . ., it does not suffice as a defense that the employer might be able to demonstrate statistically acceptable bottom-line results from the promotion process.” Title VII is not analyzed at the bottom line, and such an argument does not defeat Plaintiffs’ prima facie showing, where the plaintiffs failed the test, the test excluded African-Americans disproportionately, and plaintiffs could not be considered for promotion if they did not pass the test.
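The distinction between impact at the test stage and impact at the bottom line can be shown with hypothetical numbers. In the sketch below every count is invented for illustration and none comes from the record; the point is only that the same process can show a ratio below four-fifths at the test stage and a balanced ratio in final promotions.

```python
def impact_ratio(group_count, group_total, ref_count, ref_total):
    """Protected group's rate divided by the reference group's rate."""
    return (group_count / group_total) / (ref_count / ref_total)


# Hypothetical counts for illustration only.
applicants = {"protected": 50, "reference": 100}
passed_test = {"protected": 20, "reference": 60}   # 40% vs. 60% pass the test
promoted = {"protected": 10, "reference": 20}      # 20% vs. 20% are promoted

test_stage = impact_ratio(
    passed_test["protected"], applicants["protected"],
    passed_test["reference"], applicants["reference"],
)
bottom_line = impact_ratio(
    promoted["protected"], applicants["protected"],
    promoted["reference"], applicants["reference"],
)

print(f"test-stage ratio  = {test_stage:.2f}")   # measured at the barrier itself
print(f"bottom-line ratio = {bottom_line:.2f}")  # measured on final outcomes
```

In this illustration the thirty protected-group applicants who failed the test were barred from further consideration even though the final promotion rates were equal, which is the injury the bottom-line defense overlooks.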

As to standing, the Court held that its reading of the Seventh Circuit’s case law interpreting Connecticut v. Teal, 457 U.S. 440 (1982), was that, where direct proof exists establishing that failure of the test at issue caused a plaintiff to receive no further consideration for hiring or promotion, he has standing to sue. Based on this, the Court rejected Gas & Coke’s standing argument and found that, once plaintiffs’ failing scores on the discriminatory test eliminated them from the opportunity to compete further, they incurred a cognizable injury and had standing, regardless of Gas & Coke’s position that they were unqualified for the job anyway. Gas & Coke failed to argue that the WCA was justified by business necessity.

Davidson v. Citizens Gas & Coke Util.,
No. 1:03-CV-18820SEB-JPG, 2006 WL 694291 (S.D. Ind. Mar. 10, 2006).

Plaintiffs sought class certification in a lawsuit against the defendant for requiring them to take a Work Competency Assessment (WCA) test in order to qualify for promotion, transfer, and/or hiring. The Court denied the plaintiffs’ motion for certification due to shortcomings with respect to the named representative litigants and their counsel. Two of the named representatives had repeat felony convictions, which not only discredited them but also constituted a stand-alone reason for not being hired. The adequacy of other named representatives was also questionable because their claims implicated issues of seniority, collective bargaining contract interpretation, attendance records, and disciplinary records, any of which could be factors differentiating them from other applicant class members. The Court also found that counsel for the plaintiffs had demonstrated a lack of diligence and case management skills, thereby casting doubt on her ability to properly pursue the proposed class action. Although failure to establish any one of the four certification prerequisites bars certification, the Court also added that alleging both disparate impact and disparate treatment claims caused a lack of commonality, further rendering class action treatment inappropriate.

EEOC v. Aon Consulting, Inc.,
149 F. Supp. 2d 601 (S.D. Ind. 2001)

In this action, the Equal Employment Opportunity Commission sought an order requiring full compliance with subpoenas duces tecum that it served on Delphi Automotive Systems and Aon Consulting, Inc., for employment tests and validation studies. Aon Consulting developed and administers tests that Delphi Automotive uses to screen job applicants. These tests include written tests, job simulation exercises, and a structured interview form. The EEOC alleged that the testing procedures are discriminatory against non-white job applicants for hourly positions. Delphi and Aon Consulting objected to the subpoenas unless the EEOC was required to hold these documents confidential from the original charging parties. The district court held that the confidentiality of these documents was important, that they would not be disclosed to the complaining parties, and that all documents and copies be returned within 180 days after either the conclusion of the EEOC’s investigation or the issuance of a right-to-sue letter, unless litigation followed. In its holding the court described the “unusual need to maintain confidentiality of the tests and validation studies.” It declined to accept the EEOC’s argument that its ordinary confidentiality procedures would be sufficient. The court referred to the Supreme Court’s holding in Detroit Edison Co. v. NLRB, 440 U.S. 301 (1979), in which it recognized the importance of maintaining the confidentiality of tests used for employment decisions. The court rejected the EEOC’s argument that limiting its use of the tests and related materials might interfere with its investigative power. It held that there is “an extraordinarily compelling case for confidentiality” of employment tests and validation studies, because disclosure could destroy the value of the tests, which is based on secrecy of the items. The EEOC argued that after it concluded its investigations, retaining the information in its files, subject to the provisions of the Freedom of Information Act, would be appropriate. In this opinion, however, the court accepted FOIA’s protections as sufficient only during the period of the open investigation, and required that the materials be returned within 180 days of the conclusion of those activities.

Gulino v. Bd. of Educ. of City School Dist. of City of New York,
236 F. Supp. 2d 314 (S.D.N.Y. 2002), rev’d in part on other grounds, 2002 WL 31887733 (S.D.N.Y. Dec 26, 2002), rev’d, vacated, remanded, 460 F.3d 361 (2d Cir. 2006).

In this case, the plaintiff class was comprised of African-American and Latino teachers in the New York City public school system. The case arose when a group of teachers lost their licenses because they failed to pass the National Teacher Core Battery Exam (“NTE”). Plaintiffs sued the Board of Education and the State Education Department under Title VII and state laws alleging that their rights were violated through the imposition of a requirement that they pass either the NTE or the Liberal Arts and Sciences Test of the New York State Teacher Certification Examination (“LAST”) to receive or retain their teaching license. The plaintiffs alleged that the use of the tests as a requirement for obtaining permanent teaching certificates had a disparate impact on African-American and Latino teachers. In support of their disparate impact theory, the plaintiffs alleged that white test-takers passed both tests at a rate that was statistically significantly higher than that of African-American and Latino test-takers and that the tests were misused and did not measure whether the test-takers were qualified to be teachers. Plaintiffs’ statistical expert showed that African-American and Latino test-takers’ pass rate was roughly 45% and white test-takers’ pass rate was roughly 85%. Plaintiffs contended that these results met the required showing for disparate impact under either the 80% rule or the standard deviations required to show statistical significance. The defendants attacked the statistical results on three grounds. Defendants’ first ground for attack was that the plaintiffs’ expert failed to consider other variables, such as quality of schools attended, grade point average, facility in English and socioeconomic factors, which rendered the report of little or no probative worth. The court found this ground to be inappropriate in a disparate impact case. In making this finding, the court stated that in a disparate impact case plaintiffs need not show specific racial motivation on the part of the employer where the causal connection between the challenged practice and the adverse employment decision was clear. For that reason, the court found that there was no requirement that plaintiffs control for variables other than race and ethnicity in their statistical proof.
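The two yardsticks the plaintiffs invoked, the 80% rule and a standard-deviation measure, can be computed side by side. The sketch below uses the approximate pass rates reported above (45% and 85%), but the group sizes of 200 are hypothetical because the summary does not give them, and the pooled two-proportion z-statistic is offered as one conventional way to express a difference in standard deviations, not as the expert’s actual method.

```python
from math import sqrt


def two_proportion_z(pass_a, n_a, pass_b, n_b):
    """Pooled two-proportion z-statistic: difference in pass rates in standard-error units."""
    rate_a, rate_b = pass_a / n_a, pass_b / n_b
    pooled = (pass_a + pass_b) / (n_a + n_b)
    standard_error = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return (rate_a - rate_b) / standard_error


# Pass rates follow the summary; the sample sizes are hypothetical.
minority_pass, minority_n = 90, 200    # 45%
white_pass, white_n = 170, 200         # 85%

ratio = (minority_pass / minority_n) / (white_pass / white_n)
z = two_proportion_z(minority_pass, minority_n, white_pass, white_n)

print(f"80% rule ratio = {ratio:.3f}")
print(f"z-statistic    = {z:.2f} (|z| > 1.96 is significant at the 5% level, two-sided)")
```

The ratio depends only on the two rates, whereas the z-statistic would shrink toward zero with smaller groups, so the hypothetical sample sizes drive the second figure.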

Defendants’ second ground for attack was that plaintiffs’ expert failed to compare the pass rates of comparable groups. The defendants, citing the Supreme Court’s decision in Hazelwood School Dist. v. United States, 433 U.S. 299 (1977), argued that the mere recitation of how many minority candidates versus white candidates pass or fail was meaningless unless the court is assured that those groups were comparably qualified. The court found that Hazelwood was not controlling because it was a failure-to-hire case. Thus, the appropriate comparison in Hazelwood was between the racial composition of the school’s teaching staff and the racial composition of the qualified public school teachers in the relevant labor pool. In this case, which was not a failure-to-hire case, the court stated that the appropriate comparison was between those employees in the protected group and those not in the protected group. Therefore, the court found that the plaintiffs’ expert did not study the wrong population.

Defendants’ third ground for attack was that plaintiffs’ expert incorrectly analyzed only the pass rates for first-time test takers. Since the issue in this case was whether plaintiffs were able to pass the tests over the course of the five years that they were allowed under their provisional licenses, the defendants argued that the first-time pass rate was not important. Rather, the overall pass rate was what really mattered. Further, defendants presented at least some evidence that the pass rates of white and minority test takers over the course of five years were not necessarily indicative of adverse impact. Plaintiffs asserted two arguments in response. First, plaintiffs argued that the overall pass rate was not important. Plaintiffs contended that each failure deprived the test-taker of an employment opportunity, because plaintiffs were not allowed to apply for full city licenses until they passed the test. Second, plaintiffs alleged that the defendants’ expert’s study, which analyzed results for five years, showed a disparate impact when analyzed by the plaintiffs’ expert. The court concluded that this sort of “battle of the experts” is the reason “we have trials.” Accordingly, the court denied that part of the plaintiffs’ motion for summary judgment seeking a declaration of a prima facie case of disparate impact.

Likewise, the court denied both the plaintiffs’ and defendants’ motions for summary judgment with regard to whether the test was appropriately used to demote the plaintiffs. Plaintiffs argued that, while there may be some validity to using the test for evaluating first-time teaching applicants, the defendants misused the test to demote plaintiffs who had worked as teachers for many years. In response, defendants argued that the tests were “manifestly related” to the job of teaching and thus justified by business necessity. The court found that there remained an issue of fact with regard to both contentions and denied the relevant parts of both parties’ motions.

Gulino v. Bd. of Educ. of City School Dist. of City of New York,
201 F.R.D. 326 (S.D.N.Y. 2001).

This action was brought by teachers in the New York City Public School System who either lost their teaching licenses or were unable to obtain a teaching license because of their failure to receive a satisfactory score on either the National Teacher Core Battery or the Liberal Arts and Sciences Test of the New York State Teacher Certification Examination, which replaced the earlier National Teacher Core Battery. The plaintiffs allege that the tests have a disparate impact on African-American and Latino teachers. In its opinion, the court grants class certification to the plaintiffs, but in doing so provides some information about the tests. Plaintiffs allege that the most successful test takers are college students, because the tests measure what is learned in general college courses, and that persons with Masters and Ph.D. degrees fail at a substantially higher rate than those with less education, regardless of race. They assert that the developer of the test, Educational Testing Service, validated it to measure the preparation of applicants for initial teaching positions or licenses, and not for decisions regarding retention or termination. Plaintiffs claim that the use of the tests to make retention and termination decisions violates relevant professional standards. According to the plaintiffs’ evidence, white test-takers pass the NTE at an average rate of 83.7%, while African-American and Latino test-takers pass at rates between 40.3% and 43.9%. The later test allegedly has a pass rate of 93% for white test-takers and pass rates of between 50% and 56.4% for African-American and Latino test-takers. For current public school teachers, 78% of the white teachers who took the earlier test passed, compared with 42% of African-Americans and 34.9% of Latinos. On the later test, the white pass rate is 79.1%, compared with 37.5% for African-Americans and 28.3% for Latinos. The court held that the plaintiffs met the requirements for certification of a class and granted class certification.
Note: In an unpublished decision, the court, after a bench trial, ruled for the defendants. See Gulino v. Bd. of Educ. of City Sch. Dist. of City of New York, No. Civ. 8414, (September 4, 2003). The court concluded that plaintiffs established a prima facie case of disparate impact; however, the court granted judgment in favor of defendants because it found that they met their burden of proof in showing that the NTE and LAST were job-related. Plaintiffs argued that the Court should follow a disparate impact proof standard enunciated by the Second Circuit in the 1970s and 1980s, namely formal validity. Under this theory, Plaintiffs would have prevailed with respect to the LAST, but not the NTE. The court refused to follow this standard. In doing so, the Court cited Watson v. Fort Worth Bank & Trust, 487 U.S. 977 (1988), and held that formal validation studies are not required to show that a test is job-related. The court reasoned that, despite defendants’ failure to demonstrate formal validity, the LAST was job-related because of the coincidence of three factors: (1) the importance given to the ability to write an essay by those education professionals surveyed by NES; (2) the weight of the essay writing portion of the test; and (3) the fact that the majority of plaintiffs would have passed the LAST but for the essay writing section. The Court further found that Plaintiffs failed to offer a cost-effective, practical alternative to the tests used by defendants in certifying teachers. In light of the foregoing, the Court found for defendants. But see Gulino v. New York State Educ. Dep’t, 460 F.3d 361 (2d Cir. 2006), supra, for the Second Circuit’s ruling on appeal.

Hands v. DaimlerChrysler Corp.,
282 F. Supp. 2d 645 (N.D. Ohio 2003).

Employee brought race discrimination and retaliation claims against the employer under Title VII and Ohio law, and a hybrid claim against the employer and union under the Labor Management Relations Act (LMRA). The employee challenged the employer’s apprenticeship examination system as having a disparate impact on African-Americans. In support, Plaintiff argued that the comprehensive point assessment tallying system, the criteria for further accumulation of qualifying points, and test scores of overall point assessment were never made available to her or, to her knowledge, to any other African-American who took the test. Plaintiff further argued that the employer’s failure to make this information available, combined with the undisputed fact that all but one of the employees admitted to the Skilled Trades Apprenticeship Program were white and that no African-American was admitted during the entire 18 years plaintiff was employed by Defendant, established her prima facie case that the employer’s testing practices, as administered by the employer and the union, disparately impacted African-Americans. The court found the foregoing insufficient to establish that the testing program caused any racial disparity. Plaintiff failed to present any evidence, particularly the required statistical evidence, to demonstrate that the disparity complained of was the result of one or more of the contested employment practices. Therefore, the court dismissed plaintiff’s claim.

Hawkins v. Home Depot USA, Inc.,
294 F. Supp. 2d 1119 (N.D. Cal. 2003), aff’d, No. 03-17268, 2005 WL 752247 (9th Cir. Apr. 4, 2005).

Employee, an African-American male, sued his employer, a retail store, after his employer eliminated plaintiff’s night shift position due to reorganization and then terminated plaintiff after he failed to pass a sales associate test that defendant required for the position plaintiff had applied for, which was a newly created day-shift freight team position.

Defendant presented evidence that the new position involved customer contact and, therefore, one of the qualifications for the position was passing the customer service oriented sales-associate test. According to defendant, plaintiff failed this test. Plaintiff questioned whether he in fact failed the test, noting that he was never given a copy of the test he took and that defendant did not produce the test result records for him as it did for the other night crew employees. Defendant submitted testimony that it was unable to provide plaintiff with a copy of the test and his answers because the test was administered electronically. In any event, the court found that plaintiff’s doubts did not constitute affirmative evidence that he in fact passed the test. Furthermore, even ignoring the issue of the sales-associate test, plaintiff admitted that he was unwilling to work during the day and the job description for freight-team associate identified the ability to work a “flexible schedule” as one of the minimum qualifications of the job. Thus, the court concluded, even if plaintiff had passed the sales test, the record showed that he nonetheless would not have been qualified to work as a freight-team associate.

Jeffrey v. Ashcroft,
285 F. Supp. 2d 583 (M.D. Pa. 2003).

Plaintiff, Father Jeffrey, a former chaplain for the Federal Bureau of Prisons (“BOP”), brought action under the Rehabilitation Act alleging he was discriminated against because of his disabilities, asthma and chronic obstructive pulmonary disease, following his discharge for failing a physical abilities test (“PAT”). The district court held, inter alia, that a genuine issue of material fact existed as to whether satisfactory completion of a physical abilities test was essential in assessing the fitness for employment as a BOP chaplain.

Father Jeffrey had served in the prison ministry for several years. In 1998, the BOP hired him as a chaplain for a one-year probationary period. Father Jeffrey’s retention by the BOP, after the one-year period, was contingent, in part, upon the successful completion of the PAT, which consisted of five timed tests intended to measure the physical abilities required for the performance of correctional work.

The PAT was developed by industrial psychologists based upon a review of physical tasks required for a variety of job activities expected of correctional workers. The PAT was intended to measure dynamic strength, gross body equilibrium and coordination, stamina, and explosive strength. The components of the PAT were: (1) a dummy drag, requiring an individual to drag a 75-pound dummy over a distance of at least 694 feet within three minutes, intended to replicate an emergency scenario in which a victim is dragged to safety; (2) a ladder climb, during which a person is to climb an eight foot, ten inch ladder and retrieve an item of contraband, intended to test the ability to search for contraband concealed in high places using the assistance of a ladder; (3) an obstacle course, requiring a person to open locked doors, re-lock the doors, and proceed over, under and around tables, desks, etc., within 58 seconds, intended to replicate an emergency situation within an institution; (4) a quarter mile run and handcuffing of an individual within two minutes and 35 seconds, measuring stamina; and (5) a stair climb, in which the participant wears a twenty-pound weight belt and ascends and descends two flights of stairs three times, completing the test within 45 seconds.

Father Jeffrey was given two opportunities to take the PAT. He did not complete all components of the PAT on the first try; on the second attempt, he failed to meet the time limits in four of the five tests. Father Jeffrey was terminated due to his failure to pass the PAT. Father Jeffrey attributed his inability to successfully complete the PAT to his impaired breathing capacity, due to a combination of asthma and chronic obstructive pulmonary disease (“COPD”).

At the outset, the court noted that neither a disparate impact nor a disparate treatment paradigm was pertinent in this case. Rather, because it was evident that the BOP relied upon Father Jeffrey’s disability in making the decision to terminate his employment, the key issue was whether the PAT was essential in assessing fitness for employment as a BOP chaplain.

The court analyzed whether the PAT measured the physical ability to perform functions essential to the position of prison Chaplain under the regulations implementing the ADA, which employs the same standards as Rehabilitation Act employment discrimination claims. In its analysis, the Court noted that under 29 CFR § 1630.2(n)(2)(i), a function may be regarded as essential “because the reason the position exists is to perform that function.” Because the PAT measured the ability to perform correctional work, the court found that the Chaplain position did not exist to perform the function of responding to disturbances and maintaining discipline at a penal institution.

Other factors the court looked to in determining whether the PAT was essential in assessing fitness for employment as a BOP chaplain were: (1) the BOP’s business judgment; (2) the number of persons available to respond to prison disturbances; (3) whether Father Jeffrey was hired for his expertise in responding to prison disturbances; (4) the written job description; (5) the amount of time devoted to prison disturbances; (6) whether Father Jeffrey posed a risk to himself or others; and (7) the treatment of other employees in the same position. Examining the evidence under these factors, the court found that the BOP’s judgment and job description, which included responding to emergencies and providing security, were not conclusive evidence. The court next examined the BOP’s evidence that Chaplains, on occasion, have responded to prison disturbances and other emergency situations, but concluded that the evidence did not show that the number of employees available to respond to a prison disturbance was so small that a prison Chaplain must be expected to be a first responder, and that chaplains did not often respond to emergency situations. The court analyzed evidence proffered by Father Jeffrey that the BOP employs Chaplains who have not passed the test and determined that that evidence weighed against a finding that Father Jeffrey’s inability to do so presented a risk of harm to himself or that satisfactory completion of the PAT was an essential criterion of the position of chaplain.

The court concluded that Father Jeffrey tendered sufficient evidence to submit to the jury the issue of whether satisfactory completion of the PAT was an essential criterion of the Chaplain position. Thus, the court denied the government’s motion for summary judgment.

Johnson v. City of Memphis,
2006 WL 3827481 (W.D. Tenn. 2006).

Over the course of decades, the City of Memphis’s employment practices have been challenged frequently as having disparate impact on African Americans. In fact, the City was subject to a consent decree requiring remediation of prior discrimination and prevention of future disadvantages to protected groups. Despite this, the City continued to administer various employment tests with adverse impact. More recently, Plaintiffs challenged the 2000 and 2002 processes used by the City and won.

Plaintiffs previously won summary judgment on the 2000 process for promoting police sergeants, which consisted of a two-part test (a written exam plus a practical application exercise, together weighted at 70%) as well as performance evaluations (weighted at 20%) and seniority (weighted at 10%). Candidates had to score a 70 on the written exam to proceed to the practical test. Because the written exam had a disparate impact, the City reduced the cut-off score to 66, which satisfied the EEOC’s four-fifths rule. However, because the contents of some of the practical test leaked, the City canceled that portion. Instead, the City increased the weight of the written exam and performance evaluations. Plaintiffs challenged these changes, and the court found plaintiffs proved a prima facie case, and that defendants could not prove the written test was job-related or a business necessity. The court had also previously determined that plaintiffs showed a prima facie case with respect to the 2002 process, which consisted of an investigative logic test, a job knowledge test, an application of knowledge test, a grammar and clarity test, and an oral response test. The 2002 process was based on a comprehensive job analysis that focused on the knowledge, skills, abilities, and personal characteristics (“KSAPS”) needed for successful performance as a sergeant. The court rejected the plaintiffs’ challenges to the validity and reliability of the exam. Based on expert testimony, the court determined that the investigative logic, oral, and job knowledge tests were properly content-valid, as they reflected critical job duties, real materials, and realistic scenarios encountered by sergeants. The court also found the City’s bona fide seniority system perfectly acceptable as an add-on to the test score totals. Also based on expert testimony, the court found that the 2002 process was sufficiently reliable.
Further, because the 2002 process was reliable and valid and the scores had substantial variance, the court rejected plaintiffs’ third argument that rank-ordering of candidates based on the results was improper.

Nonetheless, the court accepted the plaintiffs’ argument that alternative testing modalities with less adverse racial effect existed and were capable of serving the City’s interests. The plaintiffs pointed to the City’s 1996 process, which went unchallenged and used a practical exercise. Plaintiffs also pointed to a merit promotion process, which the City’s expert had successfully used in a different city. Because the plaintiffs identified these alternative devices, the court ruled in their favor.

Johnson v. Memphis,
355 F. Supp. 2d 911 (W.D. Tenn. 2005).

African-American police officers who were denied promotions to sergeant based in part on tests administered by the City of Memphis filed suit alleging discrimination pursuant to §§ 1981 and 1983, Tennessee statute, the Fourteenth Amendment’s equal protection clause, city ordinances, and Title VII. The officers moved for partial summary judgment on limited issues with respect to their Title VII claim: whether the written test component of the sergeant promotions process for the year 2000 resulted in disparate impact on African-Americans, whether the City was able to offer evidence that the test was job-related or consistent with business necessity, and whether the promotional process employed by the City in sergeant promotions for the year 2003 resulted in disparate impact for African-American police officers.

2000 Promotional Process

In scoring the 2000 written test, Defendant first applied a “cut score” of 70. Those making a score of 70 or above were considered to have passed the exam; candidates with scores below 70 were not permitted to move to the next step of the promotional process. However, when defendant applied the cut score of 70, it resulted in disparate impact to African-American candidates. Therefore, Defendant adjusted the cut score to 66. As such, anyone scoring 66 or higher was considered to have passed the written test and was allowed to continue in the promotional process by taking the practical test. The 66 and above score was determined to be the score needed to ensure that the results satisfied the Equal Employment Opportunity Commission’s (“EEOC”) four-fifths rule. The Court, however, refused to consider only the four-fifths rule and considered the evidence of plaintiffs’ expert witness. Plaintiffs’ expert, Dr. Richard DeShon, contended that the statistical evidence supported a finding that the test resulted in “substantial” disparate impact which was significant both statistically and practically. In fact, comparing the mean differences in scores both before and after the cut score was lowered to 66 showed a significant difference. A T-test analysis comparing the two groups’ mean scores showed differences of between 2 and 3.5 standard deviations, depending on which comparison was being made. Dr. DeShon asserted that this was a “substantial and statistically significant difference” which was extremely unlikely to occur due to chance. Furthermore, when applying the Z-test to compare the difference in pass rates between African Americans and Caucasians, Dr. DeShon determined that the difference of 2.74 standard deviations was statistically significant and that the results were highly unlikely to occur by chance. Additionally, using the d score as a measure of effect size to determine practical significance, Dr. DeShon noted that there was a practical difference in test performance between black and white officers. Therefore, the court concluded, Plaintiffs made a prima facie showing that the differences in selection rate were significant in both statistical and practical terms.
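
For readers unfamiliar with the statistics named above, the following Python sketch shows one common way to compute a Z statistic for a difference in pass rates and a d score for a difference in mean scores. It is not Dr. DeShon's analysis: the opinion summary reports only the resulting figures, so the counts and scores below are hypothetical, and the function names `two_proportion_z` and `cohens_d` are assumptions introduced for illustration. The pooled two-proportion formula and the pooled-standard-deviation form of d are assumed here; the expert may have used other variants.

```python
# Illustrative sketch only. The counts and scores below are HYPOTHETICAL;
# the case summary reports the resulting statistics, not the raw data.
from math import sqrt
from statistics import mean, variance


def two_proportion_z(pass_a: int, n_a: int, pass_b: int, n_b: int) -> float:
    """Pooled two-proportion Z statistic for the difference in pass rates."""
    p_a, p_b = pass_a / n_a, pass_b / n_b
    pooled = (pass_a + pass_b) / (n_a + n_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return (p_a - p_b) / std_err


def cohens_d(scores_a: list[float], scores_b: list[float]) -> float:
    """Standardized mean difference using the pooled sample standard deviation."""
    n_a, n_b = len(scores_a), len(scores_b)
    pooled_var = (
        (n_a - 1) * variance(scores_a) + (n_b - 1) * variance(scores_b)
    ) / (n_a + n_b - 2)
    return (mean(scores_a) - mean(scores_b)) / sqrt(pooled_var)


# Hypothetical example: 80 of 100 pass in one group, 60 of 100 in the other.
z = two_proportion_z(80, 100, 60, 100)
# Hypothetical score lists, chosen only to keep the arithmetic easy to follow.
d = cohens_d([2, 4, 6], [1, 3, 5])
print(f"z = {z:.2f}, d = {d:.2f}")
```

The Z statistic expresses the gap in pass rates in standard deviations, which is the unit in which the 2.74 figure above is reported. The d score addresses a different question, namely how large the gap in mean scores is relative to the spread of scores, and is therefore used as a measure of practical rather than statistical significance.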

Accordingly, because plaintiffs established a prima facie case of disparate impact, the burden of production shifted to defendant to show that the test was job-related and a business necessity. However, defendant did not offer any proof that the written test was job-related or a business necessity. Therefore, the court granted plaintiffs’ motion for summary judgment for the Title VII claim of disparate impact as to the 2000 promotional test.

2003 Promotional Process

Because of the long history of difficulties with discriminatory promotional practices, the defendant took great pains to ensure that the 2003 test it employed was job-specific and tested the skills required for the position of sergeant. Defendant hired a consultant, Dr. P. Richard Jeanneret of Jeanneret and Associates, to create such a test. Following the test, the defendant learned that the test had a significant disparate impact on African-Americans seeking promotion to sergeant, in spite of the efforts made to avoid such a result. However, defendant relied on the test and promoted accordingly, thus disproportionately promoting whites to sergeant positions. Because there was no genuine issue of fact concerning this issue, in the interest of narrowing the issues for trial, the court granted plaintiffs’ motion for partial summary judgment as to the issue of disparate impact regarding the 2003 promotion test.

Lewis v. Chicago,
No. 98 C 5596, 2005 WL 693618 (N.D. Ill. Mar. 22, 2005).

African-American plaintiffs who applied for entry-level firefighter jobs scored between 65 and 88 on the entrance exam and sued the City of Chicago alleging that the exam created a disparate impact on minorities. Plaintiffs argued that the City’s decision to set the cutoff score at 89 had an unjustified, adverse impact on African-American applicants. The City argued that the test validly measured some of the cognitive skills needed for training and for performing the job and that the 89-point cutoff was justified by administrative convenience to help manage the number of applicants.

The Test: The test was developed using a “content-oriented” test validation strategy, meaning that the content of the test was aimed to reflect important aspects of performance on the job. Job analysis by the City’s expert revealed a list of 46 skills deemed critical to the job. Eight of the 18 most essential skills were physical, and seven more were “cognitive.” Four of the cognitive skills were allegedly tested by the exam: the ability to (1) comprehend written information; (2) understand oral instructions; (3) take notes; and (4) learn from demonstration. The test consisted of two parts, a multiple choice section and a video demonstration, during which applicants watched an instructional video, took notes, and then answered technical questions based on the video, their notes, and other written materials. Sixty-five was set as the passing score, because such a score demonstrated the minimum level of cognitive ability needed to master the Academy curriculum and perform the firefighter job. 94.5% of Caucasian test-takers and 72.3% of African-American test-takers passed the test. Those passing applicants were considered “qualified.”

After receiving the results, City officials set the cut-off score at 89. Those who scored at least 89 were considered “well qualified.” This change had a strong effect on racial makeup, as there were approximately 5.4 times more “well qualified” Caucasians than African-Americans. However, evidence adduced at trial indicated that the expert who designed the exam was aware, and informed the City, that there was no statistical difference between any two scores that are within 13 points of each other. Therefore, there was no statistically significant difference in ability between those who scored 87 or 88 and those who scored 89 through 98. The expert suggested that the City randomly select applicants who passed the test with a score of 65, but the City disregarded this suggestion and continued with the 89 cutoff score, aware of its disparate impact. However, after the City ran out of “well qualified” applicants, it began offering positions to “qualified” applicants at random.
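
The expert's point about score differences can be made concrete with a short Python sketch. It simply encodes the 13-point figure described above; the constant `BAND_WIDTH` and the helper `statistically_distinguishable` are hypothetical names introduced for illustration, and nothing here reproduces the expert's underlying calculation.

```python
# Illustrative sketch only: `BAND_WIDTH` reflects the 13-point figure the
# City's expert reportedly gave; the helper name is hypothetical.

BAND_WIDTH = 13  # scores within this many points were not statistically different


def statistically_distinguishable(score_a: int, score_b: int, band: int = BAND_WIDTH) -> bool:
    """Return True only if the two scores differ by more than the band width."""
    return abs(score_a - score_b) > band


print(statistically_distinguishable(88, 89))  # just below vs. at the cutoff
print(statistically_distinguishable(87, 98))  # 11 points apart
print(statistically_distinguishable(65, 89))  # 24 points apart
```

Under this reading, an applicant who scored 88 could not be reliably distinguished from one who scored 89, which is why the 89-point cutoff drew a line the test itself could not support.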

The parties did not dispute the disparate impact of the test in general, or the severe disparate impact of the 89-point cutoff. The City’s primary arguments, that the test was valid and consistent with business necessity, failed for the following reasons:

  1. The Court doubted that the test could reliably measure the four cognitive skills it was designed to measure. Evidence at trial demonstrated that the video portion of the exam, which debuted with this test, hinged almost exclusively on the single, least important firefighting skill: a candidate’s ability to take notes. The Court held that this undermined the test’s utility as a valid measure of the candidates’ relative cognitive skills.
  2. Uncontroverted trial evidence established that the City’s cutoff score of 89 could not, and was never intended to, make the distinction between qualified and unqualified candidates. Therefore, the City could not establish its “business necessity” defense. In fact, the “cut off score of 89 was a statistically meaningless bench mark [that] provided no information regarding the relative abilities of the test-takers.” The Court consequently held that the City was not justified in its decision to set the cutoff score at 89.
  3. Further, the City was unable to prove the test’s ability to predict firefighter performance or its validity as a measure of the “trainability” of candidate firefighters.
  4. Last, even if the City had successfully proven that the disparate impact of its test was justified by business necessity, the Court held that the plaintiffs could still show that an equally valid and less discriminatory alternative was available. In fact, the City utilized this alternative when it ran out of candidates by randomly selecting from those who passed the exam with a score of 65 or better.

Kielczynski v. Village of LaGrange,
122 F. Supp. 2d 932 (N.D. Ill. 2000).

Plaintiff was a female police officer who participated in two selection processes for promotion to sergeant. She filed her original complaint after the 1996 selection process, but amended her complaint for sex discrimination following her failure to be selected for promotion in 1999. In this opinion, the district court denied summary judgment to the police department. The sergeant promotion process included a written examination, an oral interview, and a series of merit and efficiency ratings completed by police supervisors on the candidates. Each of these three components comprised 30% of a candidate’s total score, with additional points added for seniority and military service. The Board of Fire and Police Commissioners made the selection for sergeants. While the guidelines permitted any of the top three candidates to be selected, no one had ever been promoted to sergeant who had been other than the top-rated individual on the sergeants’ eligibility list. In each of these two processes, the plaintiff scored highest on the written examination. In the 1996 process, she tied with another candidate for the highest score on the oral interview, and in 1999 she was actually highest ranked on the oral interview as well. However, on the merit and efficiency ratings, she scored the lowest of all candidates in both years. The supervisors who provided these ratings were all males, and the plaintiff alleged that they deliberately awarded low scores to her to keep her from being one of the top three candidates. Among the evidence presented by the plaintiff was the inconsistency of her ratings. Her scores varied more than the scores given to other candidates, and showed a range of 63 points (from a high score of 80 to a low score of 17).
Plaintiff also presented evidence that the supervisors were unable to provide adequate explanations for their ratings, and in some cases their merit and efficiency ratings for the plaintiff were inconsistent with other separate evidence about their evaluation of her performance. In light of the plaintiff’s excellent performance on the objective part of the examination, the court held that there was sufficient evidence from which a rational fact finder could infer that the defendant’s reasons for her ranking on the merit and efficiency component were a pretext for discrimination.

Lander v. Montgomery County Bd. of Comm’rs,
159 F. Supp. 2d 1044 (S.D. Ohio 2001), aff’d, 2003 WL 1819692 (6th Cir. Apr. 3, 2003).

The plaintiffs in this case are four public works employees of Montgomery County who challenged the assessment tests used to establish eligibility lists for promotions in the public works department. In this opinion, the district court denied their request for a preliminary injunction enjoining the continued use of its assessment test scores and the resulting eligibility list. Plaintiffs alleged that the assessment testing process violated Title VII and had a disproportionate adverse impact on African-Americans. They further alleged that the pass rate for Black test takers was sufficiently lower than the pass rate for white test takers to constitute a prima facie case.

The test process the County used was developed in response to complaints from the NAACP and the plaintiffs about the lack of promotional opportunities for minorities in the department. They complained that, despite their qualifications, they were passed over for promotions and were further not given an opportunity to demonstrate their ability to perform work at a higher level. As a result, the County contracted with the Miami Valley Career Technology Center (“CTC”) for the creation and administration of an assessment test. The CTC met with County representatives to discuss job classifications and position requirements. They also used examples of old test questions and reviewed written job descriptions. Ultimately they developed five assessment tests for the three job classifications for which promotional selections were made. The basic level test assessed mechanical reasoning and information locating skills. For the upper two positions, the tests were skills tests which assessed competence in electrical, plumbing, carpentry and HVAC skills. The tests included both a hands-on and a written component. Those four skills tests were used to evaluate candidates for promotion to both the second and third level positions. While the positions in question might be categorized as journeymen level positions, the County had instructed the test developer to create tests that did not require true journeymen level knowledge. The tests as designed required only a high school level of knowledge in order to obtain a good score. The tests were used in combination, and candidates for a level 2 or 3 position might take several skills tests, but only the scores on the best one or two tests (one test for level 2 and two tests for level 3) were used in evaluating and ranking the candidates. The promotion eligibility list was based on those individuals who scored in at least the fiftieth percentile for the basic test, and 70% on the one or two tests used for the upper level positions.
Ranking was by strict score order of those who received a passing score. The pass rates for African-Americans and Caucasians are not described in this opinion, but it is clear from the numbers that some African-Americans did receive passing scores at a level that placed them on the eligibility list for promotions. The district court held that none of the plaintiffs was entitled to injunctive relief, based on the likelihood of their success on the merits. Among the four plaintiffs, one had not taken the test, because he was not a current employee of the public works department; another plaintiff took the tests and was placed on the promotion eligibility list, and received a promotion during the pendency of the litigation. The third plaintiff declined to participate in the testing process on the advice of his union representative. The fourth plaintiff had failed the level 1 mechanical reasoning test, but in this case the court found that the sample size of eight Caucasians and 12 African-Americans was too small to provide statistically meaningful results. While it might be adequate in a trial on the merits, it was inadequate to demonstrate the likelihood of irreparable harm if an injunction is not issued. Among the other arguments the plaintiffs made which were rejected by the court were, first, that the County had violated Title VII by lowering the bar and making its assessment tests so easy that less qualified Caucasians were able to pass them. By this “dumbing down” process, the plaintiffs argued that the County enabled less-qualified Caucasians to leap-frog over African-American candidates and receive promotions. Plaintiffs also argued unsuccessfully that they were disadvantaged by the new promotional system, and that the County should be required to return to the old promotional system which was more advantageous to them. 
The court held that the crucial question in this case was whether the African-Americans fared worse on the new assessment test than Caucasians who participated in the same testing process, and that the old assessment process was irrelevant.

Lans v. Florida Highway Patrol,
83 Fair Empl. Prac. Cas. (BNA) 747 (S.D. Fla. 2000).

A Black state trooper who was conditionally hired was rejected after he received an unacceptable rating on the psychological evaluation instrument that the Florida Highway Patrol (“FHP”) used for screening. Conditional hires as state troopers needed to pass an intensive screening process, including a written examination, a physical fitness test, a physical examination, a polygraph examination, a background investigation and a psychological test. Failure in any part of the screening process would disqualify a conditional hire from the opportunity for permanent employment. Those who passed the screening process were placed on a list of qualified hires from which final selections for the training academy were made. The plaintiff received an unacceptable rating on the psychological evaluation, which included five standard psychological components, as well as a questionnaire and a clinical interview. In granting summary judgment to the FHP, the court rejected the plaintiff’s evidence. First, it rejected his evidence of another psychological test, which he had taken and passed seven years earlier. The Court noted that the plaintiff was not treated any differently than any other conditional hire. It also rejected the plaintiff’s expert affidavit stating that psychological tests may be biased against African-Americans. In addition, there was evidence that African-American males passed the FHP’s psychological evaluation in 1998 and 1999 at a rate of 22 out of 27 candidates, and the Court cited this fact in holding that the plaintiff failed to establish a prima facie case of racial discrimination.

Marable v. District Hospital Partners, L.P.,
2006 WL 2547992 (D.D.C. 2006).

Plaintiffs, six African-American former employees of defendant, sought class certification on behalf of all African-American applicants who were not hired as a result of failing defendant’s 3-part screening examination, consisting of math, reading comprehension and writing ability components. Defendant administered this test battery to internal candidates (current employees) as part of a restructuring. Internal candidates who passed participated in a training program, which also involved tests that had to be passed in order to continue employment. Defendant also administered the challenged exam to external candidates as a means of selecting them for employment. Plaintiffs sought to include both groups of candidates in their proposed class.

The court rejected the plaintiffs’ proposed class in light of the defendant’s argument that, although the content of the exams was the same, the purpose was different, depending on whether the candidate was internal or external. For internal candidates, the defendant explained that the exam was used to assess ability to succeed in the training program. For external candidates, the defendant explained that the exam was intended to determine whether the candidates had minimum proficiencies required for the job. The court determined that the different intended goals defeated the class’s commonality, as defendants would have to prove the test was valid based on two different sets of criteria. The court also determined that the class lacked typicality, because external and internal candidates failing the exam suffered distinct injuries—being refused consideration for a position versus being refused admittance into a training program, respectively.

The court considered, but rejected, plaintiffs’ argument for subclass certification. A subclass of internal candidates failed to satisfy numerosity requirements, and plaintiffs did not have a representative for the proposed external applicant subclass.

Menoken v. Blair,
No. 03-01775(HHK), 2006 WL 1102809 (D.D.C. Apr. 26, 2006).

A black female attorney filed suit against the United States Office of Personnel Management (“OPM”) alleging that the federal Administrative Law Judge (ALJ) exam had a disparate impact on African-Americans and females. OPM designed, administered, and scored an exam that all individuals interested in becoming an ALJ must first pass to be eligible. The exam consisted of four parts: (1) a Supplemental Qualifications Statement (“SQS”); (2) a written demonstration; (3) a Personal Reference Inquiry (“PRI”); and (4) a panel interview. The plaintiff did not believe her final score on the exam reflected her qualifications and filed a number of appeals, a formal Charge, and then a lawsuit. The plaintiff alleged the test was invalid because: (1) the OPM had a mistaken understanding of how validity is measured; (2) the OPM’s process in developing the exam ran counter to methods generally accepted by professionals in the field; and (3) the exam did not actually measure the qualifications needed to fulfill ALJ duties.

The Court agreed with OPM and denied the plaintiff’s motion for partial summary judgment, because the motion inappropriately obligated OPM to demonstrate the exam’s business necessity before the plaintiff had satisfied her initial burden of proffering evidence that the exam had a disparate impact. The Court held that the “expressions of concern over the ethnic and gender makeup of the ALJ ranks” which permeated the record were, standing alone, insufficient evidence of disparate impact. The plaintiff’s argument regarding the validity of the exam, that the OPM utilized an incorrect relatedness standard, also failed at summary judgment because the evidence was in dispute as to whether the OPM utilized the appropriate standard for business necessity, especially in light of the OPM’s validation study performed subsequent to the exam’s creation. The Court disregarded the plaintiff’s more technical arguments that some aspects of the administration and scoring of the exam violated the Uniform Guidelines, reasoning that “it is not clear that strict adherence to the ‘intricate details’ of the Guidelines is mandatory under Title VII.” Therefore, the deviations that the plaintiff emphasized would be insufficient to declare the exam invalid as a matter of law.

M.O.C.H.A. Soc., Inc. v. City of Buffalo,
No. 98-CV-99C, 2005 WL 589834 (W.D.N.Y. February 28, 2005).

Plaintiffs Men of Color Helping All Society, Inc. (“M.O.C.H.A.”) filed suit against the City of Buffalo alleging that the City of Buffalo’s promotion of firefighters to the rank of lieutenant based on an examination given on March 14, 1998 (referred to herein as the “Lieutenant’s Exam” or the “Exam”) had a disparate impact on African-American firefighters, and that the City has failed to prove the use of the Exam was “job-related for the position in question and consistent with business necessity.”

In January 1998 the City of Buffalo posted notice of a March 14, 1998 administration of a Lieutenant’s Exam. The notice advised that the Exam would be given 100 percent relative weight in ranking candidates on the eligible list resulting from the Exam, with points to be added to successful candidates’ scores based on seniority. A total of one hundred seventy-nine White firefighters and eighty-nine African-American firefighters took the Exam. One hundred thirty-three White candidates passed and forty-six failed, for a pass rate of 74.3 percent. Thirty-eight African-American candidates passed and fifty-one failed, for a pass rate of 42.6 percent. The exam was designed by the Department of Civil Service of the State of New York.
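As a rough check on these figures, the pass rates and the four-fifths ratio described in the Uniform Guidelines can be recomputed from the counts above. The sketch below is illustrative only: the function names are hypothetical, and nothing here reflects the court's or any expert's own method.

```python
def pass_rate(passed: int, failed: int) -> float:
    """Share of test takers who passed."""
    return passed / (passed + failed)


def impact_ratio(group_rate: float, comparison_rate: float) -> float:
    """Ratio of one group's selection rate to the comparison group's rate."""
    return group_rate / comparison_rate


# Counts reported for the March 14, 1998 Lieutenant's Exam.
white_rate = pass_rate(passed=133, failed=46)
black_rate = pass_rate(passed=38, failed=51)
ratio = impact_ratio(black_rate, white_rate)

print(f"White pass rate: {white_rate:.1%}")
print(f"African-American pass rate: {black_rate:.1%}")
print(f"Impact ratio: {ratio:.2f} (four-fifths threshold: 0.80)")
```

On these counts the ratio comes to roughly 0.57, well under 0.80. The African-American rate computes to about 42.7 percent when rounded, against the 42.6 percent quoted above; the difference appears to be truncation rather than rounding.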

The court did not address the merits of the parties’ case in this ruling. Rather, the court denied the Plaintiffs’ motion for summary judgment and granted the defendant’s Motion for relief pursuant to Rule 56(f).

Plaintiffs’ summary judgment argument relied exclusively upon the opinions set forth in the expert report of Kevin R. Murphy, Ph.D., disclosed for the first time in connection with the present motion. Dr. Murphy related that he was hired by plaintiffs as an expert on employment testing “to provide an evaluation of the validity of the [Lieutenant’s Exam]” (id. at p. 1). Based on his review of Plaintiffs’ experts’ deposition transcripts and exhibits, other pertinent documentary evidence and publications, and his own statistical analysis of the Exam results, Dr. Murphy concluded: (a) the use of this exam to make employment decisions (promotion to Fire Lieutenant) had an adverse impact on African-American examinees, and (b) neither the City of Buffalo nor the Department of Civil Service of the State of New York had presented credible evidence of the validity of this test for the purpose of making decisions about promotion into the position of Fire Lieutenant. According to plaintiffs, this was all the evidence the court needed in order to enter judgment in their favor as a matter of law finding the City liable for employment discrimination under Title VII. The Court, however, disagreed, finding that the most efficient way to bring the matter to resolution was to encourage expeditious completion of all reasonable and necessary fact and expert discovery in order to fully and fairly address the claims and defenses on their merits.

Moore v. Ware,
87 Fair Empl. Prac. Cas. (BNA) 474 (La. Ct. App. 2001), reversed on other grounds, 91 Fair Empl. Prac. Cas. (BNA) 346 (La. 2003).

The imposition of a requirement that a candidate for promotion to police sergeant pass all parts of a physical fitness test, coinciding with the eligibility of the first Black candidate in two years, was found to be a violation of the Equal Protection Clause of the Constitution. The Black officer passed the written promotional test and was placed at the top of the promotion list based on his seniority. He had been acting sergeant in the department for six months while the current sergeant was on medical leave. The state’s civil service plan included a provision for candidates to demonstrate “good health and physical fitness sufficient to perform the essential duties of the position with or without accommodations.” That civil service system provision was adopted in 1995, but until the plaintiff became eligible for promotion, no police officer had been required to pass a physical fitness test. The plaintiff became eligible for promotion in June 1997 and in mid-July the police chief issued a policy that required passage of the fitness test. When the sergeant position became open, the plaintiff was given the physical test, but due to his obesity and high blood pressure he was unable to pass the obstacle course despite several attempts. He was passed over for sergeant and the job was offered to two white officers who were lower on the seniority list. Plaintiff filed suit, and the district court found in favor of the city. On appeal, this court held that the police department’s application of the test requirement was arbitrary and capricious. It noted that none of the white officers who came up for promotion during the first two years after the fitness requirement was implemented was required to pass the physical fitness test. It took note of the police chief’s explanation that, “you have to start somewhere,” and called it a random application of the law. 
It further noted that plaintiff’s physical fitness was of little concern during the six months when he capably filled the position during the absence of his superior. The strong language of the court includes its description of the random application of the test as a “blatant violation” of the plaintiff’s rights and “a naked attempt to subvert the civil service system of promotion by seniority.” There is no detail offered about the test, except the identification of the obstacle course and the indication that elements of the test were timed.

Nelson v. Flint,
136 F. Supp. 2d 703 (E.D. Mich. 2001).

This is a reverse discrimination action brought by white candidates for sergeant positions in the City of Flint Police Department. The plaintiffs took a written examination for promotion to sergeant in 1994. This written examination was a requirement of the collective bargaining agreement with the Police Officer’s Association. Following the administration of the written examination, the City developed a rank-order eligibility list for sergeant selections. The top three candidates on the eligibility list were available for selection to sergeant, in addition to any other candidates who were within three percentage points of the highest certified score on the exam. Plaintiffs participated in additional promotional processes in 1996, 1997 and 1998, and challenged the selection in those processes of women or minority candidates who fell within the permitted three-percentage-point band but did not score higher on the written examination than the two plaintiffs. The Chief of Police had the authority to make the final selections for sergeants from the eligibility list. Once the pool of those within three percentage points of the highest certified score was identified, the Chief asked for input from his captains as to who would be the best sergeant selections. There was substantial evidence of negative input from captains on both plaintiffs, specifically with respect to outbursts and behavioral or rules infractions. The evidence presented showed that none of the women or minorities selected had similar episodes or infractions. The court granted the summary judgment motion of the city, because it found no evidence of discriminatory intent on the part of the city. In an effort to bolster their case, the plaintiffs attempted to introduce into evidence the city’s affirmative action plan, but the court found no evidence that this plan had ever been adopted by the city, or that it formed a basis for the selection of women and minorities.

Pasco v. Potter,
214 F. Supp. 2d 183 (D. Mass. 2002).

Plaintiff was accepted into an associate supervisor program for the United States Postal Service, but was terminated after three weeks in the program following his failure to achieve a passing score on the tests administered at the end of weeks 1 and 2. He sued, alleging that age was a factor in his termination. The court found no direct evidence of age discrimination or statistical evidence of disparate impact. Of those who had been previously dismissed for failure to achieve passing scores, five were under the age of 40 and three over the age of 40. The only other support for plaintiff’s claim was his assertion that he “couldn’t think of any ... reason why they would let me go” other than his age. The court rejected plaintiff’s claims and granted summary judgment to the Postal Service.

Rhodes v. Cracker Barrel Old Country Store, Inc.,
213 F.R.D. 619 (N.D. Ga. 2003).

African-American employees, former employees, and applicants sued a restaurant chain alleging continuing systemic racial discrimination in employment through discriminatory selection and compensation procedures. The decision rendered in this opinion was a denial of class certification. Among the employment practices that plaintiffs challenged was defendant’s Personal Achievement Responsibility Program (“PAR” Program). Five PAR levels existed for each of defendant’s nine hourly store positions. An employee could progress from one PAR level to the next PAR level by working a specific time in his or her existing PAR level, receiving a passing score on his or her evaluation, and passing a written test. The passing score required for evaluations and written tests increased at each level. The court noted, without reaching the merits of plaintiffs’ claim, that to prove disparate impact, plaintiffs would have to present evidence concerning an individual’s PAR level, whether the individual passed a performance evaluation (and, if the individual did not pass a performance evaluation, whether his manager evaluated him in a non-discriminatory manner), whether the employee studied sufficiently to pass the PAR test, and whether he attained the next PAR level. Denying class certification, the court concluded that common issues were not raised by the class as required by Federal Rule of Civil Procedure 23(a). Thus, none of plaintiffs’ claims, which attacked all of defendant’s employment practices, were deemed suitable for class treatment.

Ricci v. DeStefano,
2006 WL 2828419 (D. Conn. 2006).

Seventeen white candidates and one Hispanic candidate challenged the New Haven Civil Service Board’s (“CSB”) decision not to use certain test results during its promotion process. Although this is more a disparate treatment claim than a disparate impact claim, the analysis is interesting.

The exam consisted of written and oral components counting for 60% and 40% of an applicant’s score, respectively. A severe disparity resulted from the test’s administration in 2003. The CSB utilized a “Rule of Three,” whereby an open position would be filled from among the three individuals with the highest scores on the exam. In this case, seven positions were open, so the top nine candidates were considered. No African-Americans made it into the top nine. After conducting multiple hearings and weighing various expert and lay opinion testimony, the CSB decided not to use the test results, but did so without conducting a validation study of this test or any alternatives.

The court was not persuaded by plaintiffs’ argument that defendants violated Title VII by refusing to conduct a validity study before rejecting the test results. Although an employer may use test validity as a defense to disparate impact allegations, the court held that neither the Uniform Guidelines nor case law requires a validity study where an employer decides against using a certain selection procedure that produces a disparate impact. The court further held that the law does not require use of a test where the employer cannot “pinpoint its deficiency explaining its disparate impact under the four-fifths rule simply because they have not yet formulated a better selection method.”

Satchell v. FedEx Corp.,
No. C 03-02659 SI, C 03-02878 SI, 2005 WL 2397522 (N.D. Cal. Sept. 28, 2005).

Plaintiffs sought class certification to challenge various promotion policies within Federal Express that they alleged caused a disparate impact on minorities, including Federal Express’s Basic Skills Test (“BST”). A permanent employee must pass the BST to be promoted from Handler to Courier or another driving position. The plaintiffs alleged that the BST had a statistically significant adverse impact on minorities, disqualifying almost two-thirds of the African-Americans who applied. The Court granted the plaintiffs’ motion to certify the class.

First, the Court easily found that plaintiffs satisfied the numerosity requirement and also found that representation was adequate.

The Court also found that the plaintiffs showed commonality. With respect to the BST specifically, the Court noted that even if some employees only take certain portions of the BST for certain jobs, the validity of the entire exam was still a common issue, and the statistical evidence on which the plaintiffs relied was common to the class. Consequently, inferences drawn therefrom would also be common to the class.

Respecting typicality, the Court noted that all of the named plaintiffs either took and passed the BST or were exempt from taking it, so that none of the class representatives had taken and failed it. The Court agreed with the defendant that if the validity of the BST was to be challenged, separately or otherwise, at least one representative plaintiff must have taken and failed some part of the BST. The Court then granted the plaintiffs leave to amend their Complaint to add such a plaintiff.

Plaintiffs sought to be certified under Fed. R. Civ. P. 23(b)(2) by seeking injunctive relief with respect to the class as a whole. Although the plaintiffs sought monetary damages as well, the Court concluded that the plaintiffs’ prayer for injunctive relief, such as eliminating the passage of the BST as a requirement for promotions into Courier positions, was the primary relief sought.

Speller v. City of Roanoke,
No. 7:99 cv 00904, 2001 U.S. Dist. LEXIS 14153 (W.D. Va. Sept. 5, 2001).

The plaintiff was an African-American unsuccessful applicant for a position in the Roanoke City Fire Department. He alleged race discrimination in his failure to be hired, or alternatively that the psychological tests used by the City to evaluate potential firefighters had a disparate impact on African-American applicants. The battery of psychological tests used by the Fire Department included the TIAS Attentional and Interpersonal Style Inventory, the PRF Personality Research Form, and the IS5 Inwald Survey 5. The plaintiff applied for a Fire Department position in 1998, and was not among the 15 applicants selected. He was advised that he was not hired because he placed in Category II on the psychological examination and because observations of his behavior during the hiring process led him to be considered a maverick. In this opinion, the district court granted summary judgment on both claims for the defendants. In rejecting the plaintiff’s claim of employment discrimination, the court held that the Department’s requirement that an applicant score above a Category II was a decision for the City to make, and that there was no evidence the Department had ever hired applicants who received Category II scores. The plaintiff based his disparate impact claim on the fact that, in his year, the test eliminated the only minority applicant, the plaintiff himself. He argued that the test therefore eliminated 100% of minority applicants, which violated the “Four-Fifths Rule.” The court rejected this argument in the absence of evidence of the Caucasian pass rate, which the plaintiff did not present.

Sutherland v. Norfolk Southern Ry.,
No. 01 C 2337, 2002 U.S. Dist. LEXIS 14658 (N.D. Ill. 2002), aff’d, 63 Fed. Appx. 904 (7th Cir. 2003).

The plaintiff was a yardmaster who applied for promotion to trainmaster. Her original position was a switchman, and to qualify for promotion to yardmaster she took three written tests: SRA verbal, SRA non-verbal, and the personnel classification test (“PCT”). She achieved passing scores on the two SRA tests (51 on each), which was all that was required to be eligible for promotion to a yardmaster position, and she was promoted. The PCT score was only considered when a score on either of the SRA tests was below the required score. Her PCT score was 17, which ranked her in the bottom 10% of all those who took that test.

After three years as a yardmaster, she applied for promotion to trainmaster, for which the PCT score determined eligibility. Candidates were classified as “recommended,” “recommended with reservations,” or “not recommended,” based on whether they scored 27 or higher (recommended), 25 or 26 (recommended with reservations) or lower than 25 (not recommended). Based on her score, she was not recommended as an eligible candidate.
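The three-tier classification described above amounts to a simple threshold rule. A minimal sketch follows, using the score bands stated in the summary; the function name is hypothetical and is not drawn from the railroad's materials.

```python
def classify_pct(score: int) -> str:
    """Map a PCT score to the eligibility category described in the summary."""
    if score >= 27:
        return "recommended"
    if score >= 25:  # scores of 25 or 26
        return "recommended with reservations"
    return "not recommended"


# The plaintiff's reported PCT score was 17.
print(classify_pct(17))
```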

The PCT measures verbal and numerical problem-solving skills, vocabulary and ability to communicate. The railroad’s position was that a candidate scoring too low on the PCT would have problems communicating with other managers and difficulty solving problems that arose daily. The test was required of all candidates, both internal and external, and that requirement had been consistently applied. The plaintiff did not challenge the validity of the test, nor the railroad’s policy of not retesting absent indications that the individual had taken additional formal coursework. In granting summary judgment to the railroad, the court found that the plaintiff was neither qualified for the position of trainmaster nor similarly situated to any of those who were promoted during the period in question.

United States v. Delaware,
No. 01-020-KAJ, 2004 WL 609331 (D. Del. March 22, 2004).

The United States brought this employment discrimination action against the State of Delaware, the State’s Department of Public Safety, and that department’s Division of State Police (collectively the “State” or “DSP” or the “defendants”).

In an earlier opinion, the court held that the United States had established a prima facie case that the defendants’ use of a multiple-choice reading comprehension and writing test known as the “Alert” to screen applicants seeking employment as DSP Troopers had a disparate impact on African-American applicants because those applicants passed the Alert at a statistically significantly lower rate than Caucasian test takers. A bench trial was held to afford the defendants an opportunity to demonstrate that, despite the disparate impact of the Alert test, their use of that test from 1992 to 1998 was lawful because it was “job related for the position in question and consistent with business necessity.” See 42 U.S.C. § 2000e-2(k)(1)(A)(i).

After the bench trial, the court concluded that the defendants failed to meet their burden of proof and that, while the Alert was a valid and reliable test for law enforcement employment screening, the defendants set the cutoff score at an impermissibly high level. The court further concluded that the range within which the cutoff score could reasonably have been set was 66 to 70%.

The Alert

From 1981 through October 1998, defendants used the Alert as part of their entry-level Trooper selection process. The Alert was a 160-item multiple choice test consisting of 60 items designed to measure reading comprehension and 100 items designed to measure four aspects of writing skills, namely, spelling, clarity, grammar, and detail.

For the recruit classes in question, the DSP used Alert cutoffs that ranged from 115 to 123 items correct, or 71.875% to 76.875%, varying by difficulty of test form. When Alert scores were standardized, the sample-size weighted cutoff score used during the period at issue was approximately 75% of items correct. Those who failed the Alert were ineligible to continue in the hiring process for that recruit class, but could take the Alert again the following year.
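The percentages follow directly from the 160-item length of the Alert. The sketch below converts between raw cutoffs and percent correct, and shows what the 66 to 70% range identified by the court would mean in items correct. The helper names are hypothetical, and the conversion assumes each item counts equally.

```python
TOTAL_ITEMS = 160  # 60 reading comprehension items + 100 writing items


def raw_to_percent(raw_score: float) -> float:
    """Percent of items correct for a raw Alert score."""
    return 100 * raw_score / TOTAL_ITEMS


def percent_to_raw(percent: float) -> float:
    """Raw Alert score corresponding to a percent-correct cutoff."""
    return percent * TOTAL_ITEMS / 100


# Cutoffs actually used by the DSP, as reported in the summary.
for raw in (115, 123):
    print(f"raw {raw} -> {raw_to_percent(raw):.3f}%")

# Range within which the court found the cutoff could reasonably be set.
for pct in (66, 70):
    print(f"{pct}% -> {percent_to_raw(pct):.1f} items")
```

On this arithmetic, a cutoff in the 66 to 70% range corresponds to roughly 106 to 112 items correct, below the lowest raw cutoff of 115 that the DSP used.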

It was undisputed that the Alert assessed reading and writing skills that were relevant to the job responsibilities of a DSP Trooper. Such skills were at the core of their responsibilities to investigate and report unlawful activity. The parties disagreed, however, over the degree of validity of the assessment yielded by the Alert and the cutoff score appropriate to establish that Trooper candidates had the minimum level of literacy necessary for successful performance as a trooper.

Reliability and content validity

To prove that the Alert was content valid, i.e., that the content of the test corresponds to the abilities required to perform the job, defendants presented evidence through Dr. Stephen Wollack, an expert in industrial and organizational psychology. Dr. Wollack testified that the test was content valid because Troopers need to read and write and the Alert is a reading and writing test. He further identified 20 validation studies, conducted between 1982 and 2001, involving 82 police departments. Dr. Wollack also assessed the validity of the Alert with respect to the Trooper job in Delaware. In doing so, he collected information regarding the Trooper job tasks and the skills and abilities required to perform those tasks. Based upon the data he collected, Dr. Wollack concluded that DSP troopers routinely depend upon written materials to perform essential tasks and that preparing reports was an important and frequent part of the job. Dr. Wollack’s study also included readability analyses that showed that the reading level of the Alert matched the reading level required for the DSP trooper job.

Criterion-related validity

Defendants also presented evidence of the Alert’s criterion-related validity, i.e., the relationship between the Alert and trooper job performance, through Dr. Jeanneret, an industrial and organizational psychologist. Dr. Jeanneret developed a performance dimension rating form (“PDRF”), which was a rating scale used as a performance evaluation tool. Every DSP Trooper was rated on the scale by his or her supervisors, who were provided written instructions on completing the PDRF. After the troopers were rated, Dr. Jeanneret examined the statistical relationships between the Alert scores Troopers received when they applied to the DSP and their PDRF performance ratings. Dr. Jeanneret found that the Alert scores were statistically significantly related to the PDRF Composite. Dr. Jeanneret further testified that the correlation between performance and the Alert indicated the relationship one would expect between a test of cognitive abilities, such as the Alert, and performance in a law enforcement job. The United States’ expert agreed.

However, defendants conceded that Dr. Jeanneret’s reported correlations at most showed that performance on the Alert predicts between approximately 4% and 9% of the variation in the PDRF composite ratings. Although the predictive capacity was thus weak, as the United States’ expert testified, if the strength of a statistical relationship is such that it reaches a benchmark level of statistical significance, then the relationship between the two variables is real. The court concluded that, although the evidence demonstrated that the relationship between the Alert scores and performance in the relevant areas of the Trooper job was relatively weak, it was still an appropriate basis for decision-making by the State because its predictive power was statistically significant.
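To put the 4% to 9% figures in more familiar terms: if they are read as the share of variance explained (the squared correlation, an assumption not stated in the summary), the underlying correlation coefficients can be recovered by taking square roots. The sketch below does only that arithmetic; the function name is hypothetical.

```python
import math


def correlation_from_variance_explained(r_squared: float) -> float:
    """Magnitude of the correlation implied by a share of variance explained.

    Assumes a simple two-variable linear relationship, where the share of
    variance explained equals the square of the correlation coefficient.
    """
    return math.sqrt(r_squared)


for share in (0.04, 0.09):
    r = correlation_from_variance_explained(share)
    print(f"{share:.0%} of variance explained -> |r| of about {r:.2f}")
```

Under that reading, the reported range corresponds to correlations of about 0.2 to 0.3, with more than 90% of the variation in ratings left unexplained by Alert scores.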

Utility and expectancy analysis

After determining that the evidence established that the Alert had both content and criterion-related validity, the court next turned to the question of whether the cutoff score set by defendants fairly approximated the minimum literacy qualifications necessary for successful performance of the job of DSP Trooper. The Court rejected Dr. Jeanneret’s purported utility analysis, i.e., the estimation of the institutional gains or losses anticipated from different employee selection criteria, finding it to be nothing more than a “more is better rationale.” Dr. Jeanneret’s analysis merely supported the conclusion that the higher the cutoff score on the Alert, the more likely it is to screen out candidates who might otherwise have difficulty performing as a trooper with respect to the literacy aspects of the job. The Court also rejected Dr. Jeanneret’s expectancy analysis, which was based upon the Alert scores and ratings on the PDRF composite for 190 incumbent Troopers in the validation sample, because of inherent flaws within the analysis.

Two-step analysis

The court further rejected Dr. Wollack’s two-step analysis to assess whether defendants’ Alert cutoff scores corresponded to the minimum level of reading and writing skills necessary for successful job performance. In his two-step analysis, Dr. Wollack asked DSP supervisors to estimate what percentage of Troopers they had supervised over an eight-year period (January 1992 through December 1999) had unsatisfactory reading and writing skills. The supervisors returned an average estimate of 4.58%. Dr. Wollack then applied the supervisors’ estimates to the raw Alert scores obtained at the time of selection by the 269 Troopers hired during the seven-year period at issue in this case (1992-1998), and determined that an Alert cutoff score of 122 (76.2%) would have eliminated the lowest 4.58% of the Alert score distribution. The Court rejected this approach, finding that other jurisdictions use the Alert with lower cutoff scores than those used by defendants. Dr. Wollack did not disagree with the use of lower scores and had previously recommended lower scores. Dr. Wollack admitted that the standard of measurement of the Alert was such that Alert scores differing by as much as 6.5 points may not represent any differences in skill level.
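The second step of the two-step analysis amounts to reading a cutoff off the bottom of a score distribution. The following is a minimal sketch of that idea only; the scores are invented placeholders, not the Alert data from the case, and the function name is hypothetical. The figures 4.58% and 269 come from the summary above.

```python
import math


def cutoff_for_bottom_fraction(scores, fraction):
    """Return (cutoff, n_removed): the lowest score that survives once the
    bottom `fraction` of scorers is removed, and how many were removed."""
    ordered = sorted(scores)
    n_removed = math.floor(fraction * len(ordered))
    return ordered[n_removed], n_removed


# HYPOTHETICAL raw scores for 269 hires, generated only for illustration.
hypothetical_scores = [100 + (i * 37) % 61 for i in range(269)]

cutoff, n_removed = cutoff_for_bottom_fraction(hypothetical_scores, 0.0458)
print(f"Removing the lowest {n_removed} of {len(hypothetical_scores)} scores "
      f"implies a cutoff of {cutoff}")
```

Because 4.58% of 269 is only about 12 people, a cutoff derived this way rests on a handful of scores at the bottom of the distribution, which is worth keeping in mind alongside the admission that scores several points apart may not reflect real differences in skill.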

Regression analysis

Defendants and plaintiffs presented evidence of regression analysis conducted on the Alert scores and PDRF Composite information collected in the case. Linear regression is an analytical technique that examines the relationship between two variables by plotting data on the X (horizontal) and Y (vertical) axes of a graph and is helpful in predicting an unknown value from a correlated known value. The court also rejected the regression analysis, finding it not worthy of credence because it revealed only what was foreordained: it was mathematically guaranteed to identify a cutoff score above the actual cutoff score used by the DSP. Furthermore, a cutoff score pegged at the point that correlates with a PDRF Composite rating of 144.5, which is the point the defendants claimed represented minimally acceptable performance, would eliminate 50% of the individuals who would be predicted to perform the job at that level of competence. That fact was brought to life by the 14 out of 18 Alert failures in the validation sample who went on to perform the DSP Trooper job successfully.


The Court noted that the purported objectivity of the statistical evidence in this case seemed to melt away as well-respected, highly qualified statistical experts drew widely varying conclusions from the data. The defendants’ experts took the plaintiffs’ numbers and generated the highest possible cutoff score, while the plaintiffs’ experts used the defendants’ numbers to achieve the lowest possible cutoff score. Based upon the range of appropriate cutoff scores presented by both parties, the Court concluded that a band in the 66-70% range was appropriate. The Court explained that this band was consistent with the Third Circuit’s precedent that a criterion used to determine the “minimum qualifications necessary” means “likely to be able to do the job.”

United States v. Erie
411 F. Supp. 2d 524 (W.D. Pa. 2005).

The federal government brought a lawsuit against the City alleging its physical agility test (PAT) used to screen police officer candidates had a disparate impact on female applicants and therefore violated Title VII. The City conceded that the PAT had a disparate impact and the Court for the Western District of Pennsylvania found for the plaintiff in a bench trial.

The Test: As a result of a controversy surrounding the old PAT, the Civil Service Commission requested that the Erie Bureau of Police develop a new PAT. The Police Chief recruited an experienced Captain to design a fairer and “state-of-the-art” physical test. To do this, the Captain reviewed tests used by other law enforcement agencies and drew upon his own experience patrolling the streets to create a realistic obstacle course for entrant candidates. The test was designed to ensure that candidates possessed the physical ability necessary to do the job, particularly patrolling the streets. To establish time limits and repetition requirements, the Captain used averages from 19 volunteer officers, three of whom were women, who ran the course. The new PAT was first used in 1994 and consisted of a 220-yard run, during which each applicant had to negotiate four obstacles (climb a six-foot high wall, crawl through a window opening three feet above ground, crawl under a two-foot high and eight-foot long platform, and climb over a four-foot wall). After this, candidates had to do 17 push-ups and nine sit-ups and complete the entire course in 90 seconds. No one developing the exam conducted studies or had studies made; no one developing the exam had special education or expertise in relevant fields of study; and no one ensured that the volunteers were significantly representative of the police force.

Those involved in implementing the initial PAT thought it was a “fair” measure of “basic physical fitness,” and constituted a “medium” or “average” level of physical ability. Over the next few years, the PAT underwent changes to compensate for job relevance and to respond to complaints from the community regarding the number of women who failed. In 1998, a five-second grace period was added so that candidates completing the course within 95 seconds had the opportunity to retest. Complaints about the six-foot wall continued; and in 2000, the PAT was further modified to give applicants the choice of climbing either a six-foot solid wall or a six-foot chain link fence. Candidates were also able to use a 12-inch wooden box for assistance. The stated purpose of these changes was to make the PAT more “practical” and fair to women. In 2002, candidates could only take the PAT if they passed a written exam first. Also, the push-ups and sit-ups were moved to precede the obstacle course, and the number of push-ups and sit-ups was changed to 13 each. The City also provided training sessions for applicants. The 2002 PAT extended the cutoff time to 95 seconds while keeping a five-second grace period. The 2002 passage rate for women was 30% (7 of 23). The time period under challenge was between 1994 and 2002.

The passing rates for women versus men in 1996, 1998, 2000 and 2002 were 4.3% vs. 53.7%, 14.3% vs. 72.2%, 11.8% vs. 77.3%, and 30.4% vs. 84.7%, respectively. When the lawsuit began, the Erie Bureau of Police consisted of 193 men and nine women.
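To see how far these rates fall from the four-fifths rule of thumb described in the Uniform Guidelines, the ratios can be computed directly from the percentages above. This is an illustrative sketch only; the City conceded disparate impact, and nothing here is meant to describe how the court or the experts actually analyzed the data.

```python
FOUR_FIFTHS = 0.80

# Pass rates in percent as reported in the summary: (women, men) by year.
pass_rates = {
    1996: (4.3, 53.7),
    1998: (14.3, 72.2),
    2000: (11.8, 77.3),
    2002: (30.4, 84.7),
}


def impact_ratio(focal_rate, comparison_rate):
    """Selection rate of the focal group divided by that of the comparison group."""
    return focal_rate / comparison_rate


ratios = {year: impact_ratio(women, men) for year, (women, men) in pass_rates.items()}

for year in sorted(ratios):
    status = "below" if ratios[year] < FOUR_FIFTHS else "at or above"
    print(f"{year}: ratio {ratios[year]:.3f} ({status} the four-fifths threshold)")
```

Even in 2002, after the modifications to the test, the ratio is roughly 0.36, less than half of the 0.80 threshold.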

This opinion illustrates the importance of expert testimony and the research, analysis, and acceptable standards on which experts rely when forming their opinions. The Court relayed an extensive factual description of every expert’s opinion and methods, finding that the plaintiff’s experts were more credible and basing its conclusions of law on them.

Based on the plaintiff’s experts, the Court required the PAT to be validated as a Unitary Test (as opposed to divided into its various components). Importantly, the City could not prove the job-relatedness of the push-up or sit-up component of the PAT under the validation strategies generally followed by the employment testing profession. The Court also held that the City had not established that its use of a 90-second cutoff time for the PAT was consistent with business necessity, because it could not prove that such a standard corresponded to the minimum qualifications needed to successfully perform the job of a police officer. Rather, the Court concluded that the cutoff had been set at an inappropriately high level in that “the City used a nonrepresentative sample of 19 volunteers and because the City chose to utilize the average scores of the volunteers, all of whom the City admitted were performing their jobs at least adequately, rather than determining the level which distinguished successful from unsuccessful performers.” The City’s argument based on the fact that, on average, the officers in the sample were older than the typical applicant fell on deaf ears.

In conclusion, the Court noted that the City made genuine efforts to develop a PAT responsive to the Bureau’s needs, and yet was limited by its finances and expertise. Nonetheless, the Court concluded that it was bound by Title VII, and that the City simply failed to meet its burden of proof.

United States v. Garland,
Civ. Action No. 3:98-CV-0307-L, 2004 WL 741295 (N.D. Tex. Mar. 31, 2004).

The United States brought suit under Title VII alleging that the manner in which the City of Garland selected entry-level police officers and firefighters had a disparate impact. The employment practices at issue related to three cognitive skills tests: the ALERT police exam; the IPMA B-4 firefighter exam used in 1992; and the ALARM firefighter exam used since 1994. Each is a paper-and-pencil multiple-choice test. (Non-cognitive traits were assessed in post-test components of the selection process.) Defendant used a “multiple hurdle” model to weed out applicants at various stages of the selection process. After screening for basic prerequisites, the City rejected applicants who scored below 70% on the cognitive tests. Thereafter, some or all of the remaining pool underwent: (i) a physical agility test, and (ii) interview, background investigation, and polygraph exam. The remaining applicants were hired in descending rank order based on their cognitive skills test scores. Occasionally, Defendant’s police department had more vacancies than remaining applicants, in which case all applicants were hired.

Plaintiff called Drs. Siskin, Landy, Jones, Hough, and Hornick as expert witnesses. Defendant called Drs. Stoikov, Schemmer, Morris, Lundquist, Wollack, and Barrett as expert witnesses. As the court observed, “This case involves a substantial amount of highly technical documents and testimony, and with eleven experts is really a battle between the heavyweights on each side.” Plaintiff challenged: (i) Defendant’s use of a 70% cutoff score on the cognitive skills test; (ii) Defendant’s use of the cognitive skills test scores to rank order applicants eligible to proceed to the physical agility test, and (iii) Defendant’s use of the cognitive skills test scores to rank order its hiring decisions.

70% Cutoff Score

The 70% cutoff resulted in statistically significant differential passage rates among white, black and Hispanic test-takers, depending on the test and year(s) at issue. Thus, a prima facie case of discrimination was found to exist for each exam.

The City defended its use of the IPMA B-4 as a valid and useful tool for the selection of entry-level firefighters by introducing evidence that established its content validity and criterion-related validity. The appropriateness of the 70% score was established by use of the Angoff Method and by crediting all test-takers with two questions that should not have been included in the examination. In rebuttal, the Plaintiff ironically advocated replacement of the IPMA B-4 with the ALARM exam, which the City was currently using and the United States was challenging in the same litigation.

Similarly, the City introduced evidence establishing the content validity of the ALERT and ALARM exams with respect to reading comprehension and writing ability. The 70% cutoff was shown to be reasonable based on a survey of incumbent supervisors and examinations of incumbent officers and firefighters, both in Garland and in larger jurisdictions. Plaintiff countered that 43 officers and firefighters who scored below 70% found employment in the same fields in other jurisdictions. Unsurprisingly, the court found such testimony ineffective at impeaching the cutoff score. Further, one of the Plaintiff’s experts acknowledged that, had the City followed a common method for establishing a cutoff score used by his firm, the cutoff score would have been higher than 70% on each exam. Last, and perhaps most damaging, two of Plaintiff’s experts acknowledged that there is no single acceptable method for establishing a cutoff score on an employment examination, but that such decisions are often arbitrary and need only be based on good business judgment. In rebuttal, Plaintiff advocated the use of the PSP exam instead of ALERT, and the CWH exam instead of ALARM. However: (i) the PSP was shown to have an adverse impact against blacks; (ii) a comparative study between the PSP and ALERT used a higher cutoff score than the 70% utilized by the City; (iii) the criterion-related validation study for the CWH eliminated data from nearly 40% of the study participants; and (iv) neither expert Plaintiff used to support its chosen alternatives had ever seen the exams at issue.


The Court found Plaintiff’s challenge to the rank-ordering of applicants, based on the cognitive skills test results, to advance to the physical agility test problematic because neither party was able to identify a complete list of those persons who advanced to this level of the application process. Further, Plaintiff’s expert attempted to introduce evidence of a disparate impact by basing his mean score analysis on all test-takers, regardless of whether they passed or failed the test at issue, rather than only those test-takers who passed the cognitive skills test. Additionally, the same expert aggregated his results across all years, rather than follow the rank-order process that occurred on an annual basis.

Rank-Order Hiring

Because the City’s police department exhausted its eligibility list on several occasions, no adverse impact could have occurred during those years. Plaintiff’s expert was found to have improperly drawn his pool of qualified candidates from the ranks of those who passed the cognitive skills test only, as opposed to those who passed all of the stages of the application process. This resulted in the expert treating an applicant who passed the cognitive skills exam but failed the post-test components of the process as having effectively failed the cognitive portion of the exam. Further, he aggregated the results of all years rather than conduct a year-by-year review. Consequently, Plaintiff did not establish a prima facie case of disparate impact with respect to this practice. Several of Plaintiff’s experts were discredited by the Court.

United States v. State of Arkansas,
2007 WL 951880 (E.D. Ark. 2007).

Plaintiffs sought to enforce a settlement agreement that had been implemented with the court’s supervision in response to a class action filed by African-American employees of the Arkansas State Police Department. The agreement was aimed at ensuring that African-American men and women were not disadvantaged by the selection, hiring, assignment, and promotion policies and practices of the Department. Plaintiffs sought enforcement because they contended that the Department’s use of polygraph and written entrance exams for hiring and selection was having an adverse impact. The Department sought to terminate the court’s supervision, claiming that it had met the terms of the agreement. One of the Department’s obligations under the settlement agreement was to “develop and implement nondiscriminatory systems for selection and hiring and for promotion.” The Department was to create one system for the selection and hiring of commissioned or certified troopers and one for promotions to sergeant and lieutenant, with the help of Dr. John Veres. The court found that the Department had satisfied its obligations under the settlement agreement.

The Department teamed with Aon Consulting in 1999 to develop and validate a job-related entrance exam. Aon’s exam consisted of four parts: (1) Cognitive (following policies and procedures); (2) Biographical (work and school experiences); (3) Personality (adjustment score), and (4) Personality (dependability score). A candidate passed if each of his or her cognitive and biographical scores was at or above one standard deviation below the mean and if his or her total score equaled a certain number. The first administration of this exam resulted in disparate impact against African-Americans. The Department, Dr. Veres, Aon, the DOJ, and plaintiffs’ counsel developed an alternative method for scoring, hoping to reduce the disparate impact without decreasing operational validity. This alternative method did not succeed. The same parties met again and agreed on a third method for scoring: eliminate the minimum cognitive and personality adjustment scores (which had a disparate impact) and instead require only a certain minimum overall score. Despite these efforts, plaintiffs claimed the Department was in violation of the settlement agreement. The court determined that, because all parties, including the plaintiffs, agreed that this third method would produce job-related results and reduce the adverse impact and because the plaintiffs offered no comparable alternative, the third method of scoring the entrance exam satisfied the settlement agreement.
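The difference between the original scoring rule and the third method can be sketched as two pass/fail functions. Everything numeric below is hypothetical: the summary does not give the exam’s means, standard deviations, or required total. The sketch also assumes that the total-score requirement was a minimum, and it models the original rule as the summary first describes it (floors on the cognitive and biographical scores).

```python
def passes_original_rule(cognitive, biographical, total,
                         cog_mean, cog_sd, bio_mean, bio_sd, min_total):
    """Component floors at one standard deviation below the mean, plus a minimum total."""
    return (cognitive >= cog_mean - cog_sd
            and biographical >= bio_mean - bio_sd
            and total >= min_total)


def passes_third_method(total, min_total):
    """Only a minimum overall score; no floor on any single component."""
    return total >= min_total


# HYPOTHETICAL scale values, chosen only to illustrate the two rules.
COG_MEAN, COG_SD = 50.0, 10.0
BIO_MEAN, BIO_SD = 50.0, 10.0
MIN_TOTAL = 180.0

# A hypothetical candidate with a low cognitive score but a strong total.
candidate = {"cognitive": 38.0, "biographical": 55.0, "total": 190.0}

original_result = passes_original_rule(
    candidate["cognitive"], candidate["biographical"], candidate["total"],
    COG_MEAN, COG_SD, BIO_MEAN, BIO_SD, MIN_TOTAL,
)
third_result = passes_third_method(candidate["total"], MIN_TOTAL)

print(f"original rule: {original_result}, third method: {third_result}")
```

The hypothetical candidate fails under the component floors but passes once only the overall score counts, which is the kind of change the third method was meant to produce.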

As part of its selection/hiring process, the Department also administered a polygraph examination to elicit personal data along with financial, alcohol, drug and criminal history. Those who pass the polygraph test continue to a background check, which verifies much of the same information through an investigation and interviews. In applying the Four-Fifths Rule to the numbers produced from polygraph and background check administrations during 2003 through 2006, the court observed a disparate impact only during the 2004 polygraph and background components and the 2006 background component. However, due to the small size of the recruiting classes analyzed, such results were not necessarily statistically significant. When all four years were considered collectively, there was no adverse impact for either component. Further, the polygraph/background components were held to be job-related and necessary to achieve legitimate law enforcement policy objectives, and plaintiffs had not proffered a comparable alternative.

The Department contracted with another consultant to develop and validate a job-related promotion system. This system consists of a written examination and oral board for each appropriate rank. Candidates are ranked according to their performance. Summaries of personnel records for the top five candidates are provided to the Director for final consideration, who determines who is most qualified. To reduce disparities, the Department made various suggested revisions, including reducing the Director’s discretion by making him consider enumerated factors that must be reflected in the Director’s written recommendations. Plaintiffs did not object to these revisions. Although administration of this process produced non-adverse results, plaintiffs argued that the process was still subject to manipulation by the Department. The court rejected plaintiffs’ argument.

In light of the non-adverse results, the job-relatedness of the process, and plaintiffs’ failure to proffer a comparable alternative, the court found in favor of the Department.
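The point made above about small recruiting classes, that a four-fifths shortfall is not necessarily statistically significant, can be illustrated with a one-sided Fisher exact test. The counts below are hypothetical and are not taken from the Arkansas data; the choice of Fisher’s test is also an assumption made for illustration, not a description of the court’s analysis.

```python
from math import comb


def lower_tail_p(pass_a, n_a, pass_b, n_b):
    """One-sided Fisher exact p-value: the chance that group A would have this
    few passes (or fewer) if passing were unrelated to group membership."""
    total = n_a + n_b
    total_pass = pass_a + pass_b
    lowest = max(0, total_pass - n_b)  # fewest passes group A could possibly have
    favorable = sum(
        comb(total_pass, k) * comb(total - total_pass, n_a - k)
        for k in range(lowest, pass_a + 1)
    )
    return favorable / comb(total, n_a)


# HYPOTHETICAL small class: group A passes 3 of 5, group B passes 9 of 10.
pass_a, n_a, pass_b, n_b = 3, 5, 9, 10

ratio = (pass_a / n_a) / (pass_b / n_b)
p_value = lower_tail_p(pass_a, n_a, pass_b, n_b)

print(f"impact ratio {ratio:.2f}, one-sided p-value {p_value:.3f}")
```

With these invented counts the impact ratio is about 0.67, below four-fifths, yet the p-value is about 0.24, far from conventional significance. A difference of one or two candidates drives the ratio in a class this small.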

United States v. New York City Bd. of Ed.,
448 F. Supp. 2d 397 (E.D.N.Y. 2006).

Plaintiff white intervenors challenged the propriety of a settlement agreement entered into by the Board of Education to resolve claims of disparate impact caused by custodian entrance exams, which were brought by the United States government.

The exams in dispute are as follows. Custodial Exam No. 5040 was administered in 1985. Applicants who passed the exam and had the requisite experience were given a “practical oral” exam, graded on a pass/fail basis. Those who passed were placed on an eligibility list and then placed in rank order into open positions. The Custodian Engineer Exam Nos. 8206/8609 were administered in 1989. If the applicant passed the exam and had the requisite experience, he or she was placed on the eligibility list (no practical oral exam was used). The Custodian Exam No. 1074 was administered in 1993. Applicants who passed were given a practical written exam, graded on a pass/fail basis. Those who passed and who had the requisite experience were eligible for hire.

Because the pass rates of African-Americans and Hispanics on the 5040 and 1074 exams were only between 23.3% and 58.8% of the pass rates for Caucasians, the intervenors conceded that those exams had a disparate impact. Accordingly, the district court found that the statistical evidence satisfied the higher prima facie standard and was unquestionably sufficient “to serve as a predicate for a voluntary compromise containing race-conscious remedies.”

The intervenors challenged the statistical significance of the disparity rate of the 8206/8609 test on Hispanics because, when unqualified test takers were removed from the population, the disparity rate decreased. When calculated per a post-hoc review (by deriving the number of unqualified test-failers based on data generated from a post-hoc review conducted by the Board), the pass rate for Hispanics was 89.3% of the pass rate for Caucasians. Non-intervenors proposed calculating pass rates for a population composed of qualified test-passers (since the number of unqualified passers was known) and all test-failers (since the number of unqualified failers was known) and determined that the pass rate for Hispanics was 77.7% of the pass rate for Caucasians. The court held that, because the disparity rate when unqualified test-takers are removed from the equation was in dispute, it could not determine, without a hearing, that this exam resulted in the manifest imbalance sufficient to warrant affirmative-action relief for the one affected Hispanic.

United States v. Virginia Beach,
No. 2:06-cv-00189-RAJ-FBS, settlement (E.D. Va. Apr. 3, 2006).

The Department of Justice filed a complaint, along with a proposed consent decree (settlement agreement), against the City of Virginia Beach alleging that its use of a mathematics test (which was part of the National Police Officer Selection Test (“POST”)) as a pass/fail screening device in its selection process for entry-level police officers disparately impacted African-Americans and Hispanics. The consent decree indicated that between 2002 and June 2005, only 59% of African-American and 66% of Hispanic applicants passed the test, while 85% of Caucasian applicants passed. The DOJ alleged in its complaint that the test was not job-related because it did not predict whether an applicant was able to successfully perform the police officer job. If the court approves the consent decree, Virginia Beach will pay $160,000 to qualified African-Americans and Hispanics who applied and who would have otherwise qualified for a position but for their failing math scores. The City will also provide priority job offers to those minorities who were eliminated based on their math scores and will only use the POST if applicants are required to score at least 70% on reading and grammar and obtain an overall score of at least 60%.

White v. Kellogg Co.,
No. 8:98 CV 322, 2000 U.S. Dist. LEXIS 8043 (D. Neb. June 7, 2000).

Plaintiff was a seasonal production worker who applied for a regular full-time position when the company set up a hiring process for 59 openings. The company received 3,700 applications or requests for transfer to those positions. The hiring process included three phases. First, the company screened and sorted applications for job experience, education, availability to work overtime and criminal records. Second, it invited applicants who passed the initial screening to take a standardized test battery. That test battery was developed by a human resources psychologist based on his interviews with incumbent employees with more than 20 years’ job experience. From his interviews, the psychologist developed a list of knowledge, skills and abilities needed for positions in the packing, processing, and warehouse and shipping departments. The test battery he developed included a reading test, an arithmetic test, an inspection and measurement test, a process monitoring and problem solving test, a teamwork test and an employee safety test. Applicants who passed the standardized tests were invited to interview. The company then ran a background check on those who passed the interview. The plaintiff passed from the initial screening to the test, but failed the reading portion of the test, scoring two points below the minimum score required for passage. His failure of the reading portion of the test meant that he failed the examination. Plaintiff alleged that the standardized test had a disparate impact on African-Americans and applicants over age 40. In this brief opinion, the court denied summary judgment on the disparate impact of the standardized test. It noted the extensive evidentiary record, and found that there was a genuine issue of material fact as to whether the standardized pre-employment test had a disparate impact on protected groups.

Williams v. Metropolitan Police Dept. of St. Louis,
No. 4:05CV415 HEA, 2005 WL 2491459 (E.D. Mo. Oct. 7, 2005).

Plaintiff sued the Police Department alleging that the examination for promotion to Sergeant had a disparate impact on African-Americans. Plaintiff’s test scores did not qualify him for either resulting Cluster. Cluster A was the top qualifying category and contained only three of 95 African-American applicants. There were seven African-Americans out of a total of 37 applicants in Cluster B. Defendants moved to dismiss the lawsuit for failure to state a cause of action, arguing that Plaintiff had no factual basis for asserting there should have been more African-Americans in Clusters A and B, failed to allege sufficient statistics to support the exam’s alleged disparate impact, and failed to establish a causal connection between the exam and the alleged disparate impact. After reviewing the lenient notice pleading standards, the Court held that Plaintiff’s complaint was sufficient, in that he alleged: (1) he took the exam; (2) he was not listed in one of the qualifying Clusters; (3) the exam was racially discriminatory; (4) the percentage of African-Americans who qualified should have been higher; (5) the components used to qualify candidates were biased; (6) the scoring did not meet federal guidelines; and (7) the exam resulted in subjecting plaintiff to discrimination. The Court noted that Plaintiff was under no obligation to demonstrate that he could ultimately prove the allegations contained in his complaint and therefore did not have to provide, for example, statistical evidence.

This page was last modified on May 16, 2007.
