Testimony of Pauline Kim

Chair Burrows, Commissioners, thank you for the opportunity to address the issue of artificial intelligence and employment discrimination.

I am the Daniel Noyes Kirby Professor of Law at Washington University School of Law in St. Louis. My research and scholarship center on the law of the workplace, with a particular focus on how emerging technologies are impacting anti-discrimination law and employee privacy rights.

When AI is incorporated into automated decision systems or predictive algorithms used for hiring and promotion, these tools offer many advantages to employers, including efficiency and scalability. They also have the potential to remove some forms of human bias from these processes. However, as is now well recognized, these tools can operate in ways that are biased, and may discriminate along the lines of race, sex, and other protected characteristics.[1] Computer-based assessments can also create barriers to equal employment for individuals with disabilities. The technical assistance issued by the Chair last year provides crucial guidance to employers about how to comply with the Americans with Disabilities Act when using these tools, especially the importance of providing individualized assessment and reasonable accommodation to individuals with disabilities. In these remarks, I will focus instead on issues of systemic bias that can arise when employers use algorithms to predict the suitability of workers for particular jobs.

Numerous studies and reports have documented the ways in which bias can creep into automated systems.[2] When incomplete, unrepresentative, or error-ridden data are used to train a model, the resulting predictions can produce biased outcomes. Training data may also encode biased human judgments, for example, when the data include subjective scores assigned by humans and the model treats them as objective measures of performance. And because predictive models extract patterns from past data to make future predictions, even highly accurate models may simply reproduce existing patterns of discrimination and occupational segregation.

In addition to data problems, the choice of the target variable can have a significant impact on who is given access to employment opportunities. The target variable is the outcome the system is designed to predict, and paying attention to how it is defined and measured is critical to avoiding bias. Good employees have many different traits, and the designer of an automated hiring system must decide which attributes to focus on. Many of the most valuable qualities in an employee are difficult to define or to measure accurately with data, and so the designer may choose instead a target that is easily measurable, such as customer ratings or time on the job. That choice of target variable can be highly consequential.

Take, for example, a model that predicts the best candidates by selecting those who most closely resemble applicants who were hired in the past. If past hiring decisions were infected by bias, the model’s predictions will be as well. Another example is a model that rates highly applicants who are least likely to leave the paid labor force. Such a model will disproportionately screen out women of childbearing age or workers with disabilities, who are more likely to have breaks in employment, even though they are fully capable of performing the job. Thus, the initial step of problem formulation[3]—deciding how the problem to be solved by the algorithm is defined—is crucial to avoiding discrimination.
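
To make the mechanism concrete, the short sketch below is offered purely as an illustration; the groups, probabilities, and selection rules are hypothetical and not drawn from any real system. It shows how selecting on an easily measured proxy target (continuous tenure) can screen out one group at a markedly lower rate even when true job performance is identically distributed across groups.

    # Illustrative simulation (hypothetical numbers): the choice of target
    # variable can disadvantage a protected group even when true job
    # performance is identical across groups.
    import numpy as np

    rng = np.random.default_rng(0)
    n = 10_000
    group = rng.choice(["A", "B"], size=n)        # stand-in for a protected characteristic
    performance = rng.normal(0, 1, size=n)        # true performance: same distribution for both groups

    # Hypothetical proxy target: "remained continuously in the paid labor force."
    # Suppose group B is more likely to have career breaks for reasons unrelated
    # to ability (caregiving, disability-related leave, and so on).
    p_break = np.where(group == "B", 0.30, 0.10)
    continuous_tenure = rng.random(n) > p_break

    def selection_rates(selected):
        return {g: float(selected[group == g].mean()) for g in ("A", "B")}

    top_quartile = performance > np.quantile(performance, 0.75)

    # Target 1: select on (a measure of) actual performance.
    print("select on performance: ", selection_rates(top_quartile))
    # Target 2: select on the easily measured proxy, continuous tenure.
    print("select on tenure proxy:", selection_rates(continuous_tenure))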

Automated systems that rely on machine learning to update constantly can also create problematic feedback loops. Proponents of these systems argue that they can learn and improve continuously over time. However, unlike tools used in online advertising, hiring tools cannot be subjected to meaningful A/B testing. Low-ranked candidates will not be hired, and their job potential cannot be observed. As a result, false negative outcomes cannot be detected and corrected, and erroneous assumptions about lack of ability may be reinforced over time.
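
The sketch below, again purely illustrative and using hypothetical numbers, shows why this matters: because outcomes are observed only for candidates who are actually hired, a model that begins with an unduly low estimate of one group's success never receives the data that would correct it.

    # Illustrative simulation (hypothetical numbers): a hiring model retrained
    # only on hired candidates cannot detect its false negatives.
    import random
    random.seed(0)

    TRUE_SUCCESS = {"A": 0.6, "B": 0.6}   # both groups succeed equally often when hired
    estimate = {"A": 0.6, "B": 0.3}       # the model starts with a biased estimate for group B
    observed = {"A": [], "B": []}         # outcomes are observable only for people who were hired

    for _ in range(50):                   # fifty hiring cycles of "continuous learning"
        applicants = [random.choice("AB") for _ in range(200)]
        # Hire the 50 applicants the model scores highest; with a group-based
        # score, group A is always ranked above group B.
        hired = sorted(applicants, key=lambda g: estimate[g], reverse=True)[:50]
        for g in hired:
            observed[g].append(random.random() < TRUE_SUCCESS[g])
        # Re-estimate success rates from observed outcomes only.
        for g in "AB":
            if observed[g]:
                estimate[g] = sum(observed[g]) / len(observed[g])

    print("final estimates:", estimate)
    print("outcomes ever observed for group B:", len(observed["B"]))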

Long before an employer makes its hiring decisions, predictive algorithms also play a critical role in matching job candidates with potential opportunities. Most employers today advertise job openings on social media sites like Facebook or rely on job matching platforms to identify promising candidates. These new labor market intermediaries use algorithms to channel information about opportunities to different users, and their operation can determine which opportunities a job seeker learns about.[4] Studies have documented that ad-targeting algorithms distribute job advertisements in racially and gender-biased ways that reflect stereotypes about what kinds of people perform certain jobs.[5] These effects occur even when the employer has requested race- and gender-neutral ad targeting and wants its job advertisements to be distributed to a broad and diverse pool.


The Uncertain Application of Existing Anti-Discrimination Law

Anti-discrimination laws apply to these automated decision tools and provide some leverage to prevent or redress discriminatory harms. However, current law is incomplete. There are a number of gaps and uncertainties about how the doctrine applies to automated decision systems.

Existing doctrine was developed with human decision-makers in mind and does not always fit the risks of discrimination posed by automated systems. For example, a formalistic view of disparate treatment doctrine might suggest that so long as a model does not take a protected characteristic into account, it cannot constitute disparate treatment. Conversely, such a view might assume that any model that does take a protected characteristic into account is discriminatory.

This interpretation of disparate treatment law is too simplistic. An employer could engage in disparate treatment without expressly relying on a protected characteristic like race or sex by using proxy variables to produce exactly the same effect.[6] On the other hand, in order to ensure that a model is fair for all groups, it may be necessary to take protected characteristics into account.[7] For example, the only way to audit for unintended bias is to make use of data about protected characteristics.[8]

Thus, simply prohibiting consideration of race or sex in a model would not only fail to prevent discrimination from occurring, it could be counterproductive as well.
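
A simple example illustrates the point. The sketch below, using hypothetical applicant counts, applies the familiar four-fifths rule of thumb from the Uniform Guidelines to the output of a screening tool; none of these quantities can be computed without data identifying each applicant's group.

    # Minimal adverse-impact audit (hypothetical counts). The structural point:
    # the audit cannot be run at all without the protected characteristic.
    from collections import Counter

    # (group, selected_by_tool) pairs for a hypothetical applicant pool
    results = ([("men", True)] * 480 + [("men", False)] * 520
               + [("women", True)] * 310 + [("women", False)] * 690)

    applicants = Counter(g for g, _ in results)
    selected = Counter(g for g, s in results if s)
    rates = {g: selected[g] / applicants[g] for g in applicants}

    # Four-fifths (80%) rule of thumb from the Uniform Guidelines: a group's
    # selection rate below 80% of the highest group's rate is generally
    # regarded as evidence of adverse impact.
    highest = max(rates.values())
    for g, r in rates.items():
        print(f"{g}: selection rate {r:.1%}, ratio to highest {r / highest:.2f}")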

Uncertainty also affects the application of disparate impact doctrine. Under current law, when an employer practice has a disparate impact on a protected group, the employer has a defense if it can show that the practice is “job related and consistent with business necessity.”[9] That defense, which was codified as part of disparate impact theory by the Civil Rights Act of 1991, is not explained in the statute. In order to interpret its meaning, many turn to the Uniform Guidelines on Employee Selection Procedures issued in 1978. The Guidelines set out methods for validating employer tests that were grounded in the industrial psychology literature of that time, and as a result, they do not address some of the unique challenges posed by AI and predictive algorithms.

For example, some automated decision systems rely on data mining to extract patterns. They may uncover variables that are strongly predictive of the target variable, but have no clear connection to job performance. Some machine learning models are so complex that an employer that relies on them may not be able to explain its decision to reject candidates, making it difficult to apply concepts of “job relatedness” and “business necessity.” The Guidelines were not designed to address situations like these.

The third step of the disparate impact analysis allows a plaintiff to show that a less discriminatory alternative is available to the employer. Again, there is uncertainty about how plaintiffs can make this showing when challenging predictive algorithms, given that a vast, potentially infinite, number of models could be designed for a particular application.[10]

Another uncertainty surrounding the use of automated decision tools relates to remedial efforts. If an employer detects that a predictive algorithm has a disparate impact on a disadvantaged group, what can it do in response? Some researchers have questioned whether efforts to remove discriminatory effects might themselves run afoul of anti-discrimination law by taking account of race, sex, or other protected characteristics. They have expressed concern that taking sensitive characteristics into account to prevent disparate impact might be construed as a form of disparate treatment.[11] Existing case law permits taking race and other sensitive characteristics into account in order to level the playing field and ensure equal access to opportunities;[12] however, the application of those principles needs to be clarified in the context of algorithmic decision-making.[13]

Finally, it is uncertain whether existing law reaches labor market intermediaries like online advertising and job matching platforms.[14] These entities play an increasingly important role in shaping the job market and access to opportunities, but it is unclear whether or when they would be considered “employment agencies” covered by Title VII, and what responsibilities employers have when relying on these platforms to recruit workers.[15]

Aside from these legal uncertainties, practical challenges exist as well. Title VII’s enforcement scheme relies primarily on retrospective liability to redress past discriminatory harms. Although the EEOC brings enforcement actions, individual workers file the vast majority of employment discrimination suits, and accessing remedies may be difficult for them. It has always been harder to detect and challenge discriminatory hiring decisions than firing decisions because of the difficulty of obtaining evidence of discrimination from outside the firm. Individual workers will find it even more difficult to challenge biased hiring algorithms. Part of the problem is that applicants often do not know when or how employers are using automated systems. Even with greater transparency, they will typically lack the technical expertise and resources needed to assess the fairness of these tools or to bring a legal challenge.


The Limits of Self-Regulation

Before turning to some suggested reforms, I want to acknowledge that automated decision systems are not inevitably discriminatory. A well-designed and implemented system may help employers reduce the influence of human bias. Employers should be allowed some latitude to explore ways in which AI tools can help to remove bias and increase the diversity of their workforce. However, it is important not to get caught up in the rhetoric claiming that data-based tools are inherently neutral and objective.

If the goal is to create more equitable workplaces, relying on industry best practices and employer self-regulation is insufficient. While many firms care deeply about diversity, equity, and inclusion, not all do. Robust regulatory tools remain important to address the bad actors. And even well-intentioned firms face significant constraints when trying to do the right thing. They may lack the expertise to understand the risks of discrimination, or the resources to engage in the ongoing auditing and testing needed to prevent these harms. Detecting and removing bias requires close analysis and ongoing scrutiny of automated systems.

Another concern is that employers motivated primarily by liability risk avoidance will adopt pro forma, symbolic steps that do not meaningfully address discriminatory risks. Extensive research by sociologists has documented how many firms responded to civil rights laws, and in particular the threat of sexual harassment liability, by creating procedures and checklists that signaled their good faith but did not address the root causes of discrimination and harassment.[16] Given the experience with “best practices” that shaped firms’ responses to sexual harassment but often proved ineffective,[17] the EEOC should be cautious about allowing employers to rely on procedural checklists as evidence that their selection tools comply with anti-discrimination laws.


What Can the EEOC Do?

Given the ambiguity about how employment discrimination law applies to AI and other automated systems, it may be useful to clarify the law in a handful of discrete areas.

First, the EEOC should clarify that AI tools that produce a disparate impact cannot be defended solely on the basis of statistical correlations.[18] The employer should have to demonstrate the substantive validity of its selection tools. It should bear the burden of showing that the model was built using accurate, representative, and unbiased data, and that it actually measures job-relevant skills and abilities. This approach is consistent with the position taken by a coalition of civil rights organizations.[19] It would also create incentives for employers who purchase these systems from outside vendors to closely scrutinize these tools before deploying them.

Second, the EEOC should offer guidance on the duty of employers to explore less discriminatory alternatives. Researchers have demonstrated that there is no unique model for solving a given optimization problem.[20] Because there are typically multiple models that can be developed and used in any given application, designers should explore and document which options have the least discriminatory effect. If alternative, comparably effective models are available, then arguably, an employer’s choice to use a model that has discriminatory impact is not consistent with business necessity.
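
One way to make that duty concrete is sketched below. The candidate models and evaluation figures are hypothetical placeholders; in practice they would come from validating each alternative on held-out applicant data. The point is the selection logic: document the alternatives considered and, among those that are comparably effective, choose the one with the least disparate impact.

    # Hypothetical sketch: documenting a search over alternative models and
    # choosing a less discriminatory one among comparably effective candidates.
    from dataclasses import dataclass

    @dataclass
    class Candidate:
        name: str
        accuracy: float       # predictive validity on held-out data (hypothetical)
        impact_ratio: float   # lowest group's selection rate / highest group's rate

    candidates = [
        Candidate("all features",             accuracy=0.81, impact_ratio=0.62),
        Candidate("proxy features removed",   accuracy=0.80, impact_ratio=0.78),
        Candidate("reweighted training data", accuracy=0.79, impact_ratio=0.91),
    ]

    TOLERANCE = 0.02  # how much predictive validity counts as "comparably effective"
    best = max(c.accuracy for c in candidates)
    comparable = [c for c in candidates if best - c.accuracy <= TOLERANCE]

    chosen = max(comparable, key=lambda c: c.impact_ratio)
    print(f"chosen: {chosen.name} "
          f"(accuracy {chosen.accuracy}, impact ratio {chosen.impact_ratio})")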

Third, the EEOC should make clear that taking steps to correct or prevent a model from having a disparate impact is not a form of disparate treatment.[21] In order to de-bias models, designers will need to make use of data about sensitive characteristics. When building a model, they should examine proposed target variables for implicit bias. Avoiding discriminatory impacts may also require scrutinizing the representativeness and accuracy of training data, oversampling underrepresented groups, or removing features that encode human bias. And because AI tools may behave differently when applied to actual applicants compared with training data, it is critical to audit their effects once deployed. Strategies like these require paying attention to race or other protected characteristics in order to avoid bias and build AI tools that are fair to all. Because these types of de-biasing strategies do not make decisions about individual workers turn on protected characteristics, they should not be considered a form of disparate treatment. By clarifying that it is permissible to take protected characteristics into account in order to remove disparate impact, the EEOC can encourage voluntary employer efforts to rigorously examine their practices and to avoid any discriminatory effects.[22]
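
As one concrete example of such a strategy, the sketch below (with hypothetical data) oversamples an underrepresented group so that the training set is balanced. The group label is used only to rebalance the data and to audit the results; it is not a feature of the model and plays no role in scoring any individual applicant.

    # Illustrative de-biasing step (hypothetical data): oversample the
    # underrepresented group before training. Group labels rebalance the
    # training set but are not model inputs.
    import random
    random.seed(0)

    # Hypothetical training records: (features, outcome_label, group).
    records = ([((random.random(), random.random()), random.random() < 0.5, "A")
                for _ in range(900)]
               + [((random.random(), random.random()), random.random() < 0.5, "B")
                  for _ in range(100)])

    by_group = {}
    for rec in records:
        by_group.setdefault(rec[2], []).append(rec)

    # Oversample smaller groups (with replacement) up to the largest group's size.
    target = max(len(recs) for recs in by_group.values())
    balanced = []
    for _, recs in by_group.items():
        balanced.extend(recs)
        balanced.extend(random.choices(recs, k=target - len(recs)))

    # The model itself is then trained only on features and outcome labels.
    X = [features for features, label, group in balanced]
    y = [label for features, label, group in balanced]
    print({g: len(recs) for g, recs in by_group.items()}, "->", len(X), "balanced examples")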

Fourth, the EEOC could offer guidance about the legal responsibilities of labor market intermediaries, such as job-matching platforms, that play a significant role in procuring workers for employers and employment opportunities for job seekers. Very little case law exists applying the statutory definition of an “employment agency” under Title VII, and as a result, the legal responsibility of online platforms to ensure that they provide a level playing field for all workers remains unclear. Even where these entities cannot be held directly liable, the EEOC could conduct research and educate employers about how the predictive algorithms these platforms use to distribute information can cause bias in the job advertising and recruiting process.

Beyond clarifying discrete issues in existing law, any regulatory steps should be taken cautiously. Because the technology at issue is complicated and rapidly evolving, it is important not to freeze into place standards that will quickly become obsolete. In particular, what appear to be “best practices” today may turn out to be sub-optimal solutions in the future. Locking them into place through legal doctrine, or by recognizing employer defenses, could end up excusing or immunizing practices that are later determined to be harmful.

For these reasons, much of the EEOC’s efforts in this area should be forward-looking, aimed at building its capacity to audit automated hiring tools, to study their effects on workforce participation, and to research solutions, both technical and practical, that will ensure that these tools work to open opportunities to the widest possible pool of workers. An important part of this work will require increasing transparency by employers about when and how they are utilizing automated decision systems, how those systems were designed, what training data was used to build them, and the effects of these systems on workers.

Finally, the EEOC should consider developing data analytic tools to study employers and their human resources processes rather than workers. By leveraging data and computational tools, these systems could help to diagnose where or why bias is occurring, or to predict which practices are more likely to broaden the diversity of employees who are hired and to support their success in the workplace.

Thank you again for your time and for focusing attention on the important issues of employment discrimination and AI.


[1] The report by the White House Office of Science and Technology Policy, Blueprint for an AI Bill of Rights (October 2022), documents many examples of algorithmic bias across a range of social applications.

[2] See, e.g., Solon Barocas & Andrew D. Selbst, Big Data’s Disparate Impact, 104 Calif. L. Rev. 671 (2016).

[3] Samir Passi & Solon Barocas, Problem Formulation and Fairness, Proceedings of the Conference on Fairness, Accountability, and Transparency 39 (ACM 2019).

[4] Pauline T. Kim, Manipulating Opportunity, 106 Va. L. Rev. 69 (2020).

[5] Muhammad Ali et al., Discrimination through Optimization: How Facebook’s Ad Delivery Can Lead to Biased Outcomes, 3 Proceedings of the ACM on Human-Computer Interaction 1 (2019); Piotr Sapiezynski et al., Algorithms that “Don’t See Color”: Comparing Biases in Lookalike and Special Ad Audiences, arXiv:1912.07579 [cs] (2019); Anja Lambrecht & Catherine Tucker, Algorithmic Bias? An Empirical Study of Apparent Gender-Based Discrimination in the Display of STEM Career Ads, 65 Management Science 2966 (2019).

[6] See, e.g., Cynthia Dwork et al., Fairness through Awareness, Proceedings of the 3rd Innovations in Theoretical Computer Science Conference (ITCS ’12) 214 (ACM 2012); Moritz Hardt et al., Equality of Opportunity in Supervised Learning, arXiv:1610.02413 [cs] (Oct. 2016).

[7] See, e.g., Talia B. Gillis & Jann L. Spiess, Big Data and Discrimination, 86 U. Chi. L. Rev. 459 (2019); Sam Corbett-Davies & Sharad Goel, The Measure and Mismeasure of Fairness: A Critical Review of Fair Machine Learning, arXiv:1808.00023 [cs] (Aug. 2018).

[8] Pauline T. Kim, Auditing Algorithms for Discrimination, 166 U. Pa. L. Rev. Online 189 (2017).

[9] 42 U.S.C. § 2000e-2(k)(1)(A).

[10] Emily Black, Manish Raghavan, and Solon Barocas, Model Multiplicity: Opportunities, Concerns, and Solutions, 2022 ACM Conference on Fairness, Accountability, and Transparency 850 (ACM Jun. 2022); Charles T. Marx, Flavio du Pin Calmon, and Berk Ustun, Predictive Multiplicity in Classification, arXiv:1909.06677 [cs, stat] (Sep. 2020).

[11] This worry stems from what I argue is a misunderstanding of the holding in Ricci v. DeStefano, 557 U.S. 557 (2009). See Pauline T. Kim, Auditing Algorithms for Discrimination, 166 U. Pa. L. Rev. Online 189, 200–02 (2017); Pauline T. Kim, Data-Driven Discrimination at Work, 58 Wm. & Mary L. Rev. 857, 925–32 (2017).

[12] See, e.g., Maraschiello v. City of Buffalo Police Department, 709 F.3d 87, 89 (2d Cir. 2013); Duffy v. Wolle, 123 F.3d 1026, 1038–39 (8th Cir. 1997); Rudin v. Lincoln Land Cmty. Coll., 420 F.3d 712, 722 (7th Cir. 2005); Mlynczak v. Bodman, 442 F.3d 1050, 1050 (7th Cir. 2006). 

[13] Pauline T. Kim, Race-Aware Algorithms: Fairness, Nondiscrimination and Affirmative Action, 110 Calif. L. Rev. 1539 (2022).

[14] Pauline T. Kim, Manipulating Opportunity, 106 Va. L. Rev. 69 (2020).

[15] Pauline T. Kim & Sharion Scott, Discrimination in Online Employment Recruiting, 63 St. Louis U. L.J. 93 (2018).

[16] See, e.g., Frank Dobbin & Alexandra Kalev, The Promise and Peril of Sexual Harassment Programs, 116 Proc. Nat’l Acad. Sci. 12255 (Jun. 2019); Frank Dobbin & Alexandra Kalev, The Civil Rights Revolution at Work: What Went Wrong, 47 Annual Review of Sociology 281 (Jul. 2021).

[17] EEOC Select Task Force on the Study of Harassment in the Workplace, Report of Co-Chairs Chai R. Feldblum & Victoria A. Lipnic (2016).

[18] Pauline T. Kim, Data-Driven Discrimination at Work, 58 Wm. & Mary L. Rev. 857, 921 (2017).

[19] Civil Rights Principles for Hiring Assessment Technologies, The Leadership Conference Education Fund, available at: https://civilrights.org/resource/civil-rights-principles-for-hiring-assessment-technologies/#

[20] Emily Black, Manish Raghavan, and Solon Barocas, Model Multiplicity: Opportunities, Concerns, and Solutions, 2022 ACM Conference on Fairness, Accountability, and Transparency 850 (ACM Jun. 2022); Charles T. Marx, Flavio du Pin Calmon, and Berk Ustun, Predictive Multiplicity in Classification, arXiv:1909.06677 [cs, stat] (Sep. 2020).

[21] Pauline T. Kim, Race-Aware Algorithms: Fairness, Nondiscrimination and Affirmative Action, 110 Calif. L. Rev. 1539 (2022).

[22] Local No. 93, Int’l Ass’n of Firefighters v. City of Cleveland, 478 U.S. 501 (1986) (“We have on numerous occasions recognized that Congress intended voluntary compliance to be the preferred means of achieving the objectives of Title VII.”).