Written Testimony of Kathleen K. Lundquist, Ph.D., APT Metrics, Inc.

Meeting of April 15, 2015 - EEOC at 50: Confronting Racial and Ethnic Discrimination In the 21st Century Workplace

Chair Yang and Commissioners Barker, Feldblum, Lipnic and Burrows: Good morning and thank you for the opportunity to share some of my thoughts with you about the current status, opportunities and challenges facing us as we confront discrimination in the 21st Century Workplace.

The framework for my remarks is that of an industrial-organizational psychologist - one who works at matching people to jobs - ensuring that the way we make predictions about who is best suited for a job accurately and fairly assesses the skills, abilities and other characteristics needed for success in the job.

Technological Advances in the Nature of Selection Systems

Candidates for jobs today are increasingly being screened using online technology: online applications (some of which contain scored questions the answers to which may disqualify the applicant from further consideration); biodata[1] or personality tests administered online to thousands of candidates; responses to "interview" questions which may be videotaped and uploaded online; and even online background checks. In a survey of domestic and international organizations, 81 percent of the respondents indicated that they were utilizing online assessments, and 83 percent indicated that they allow applicants to complete the assessments remotely in unproctored settings (Fallaw & Kantrowitz, 2011). Much of the data collection occurs with candidates entering information or responses to test questions from their iPads or smartphones. Moreover, if adverse impact can be minimized or eliminated by these tools, employers are often willing to sacrifice some level of validity to increase diversity and reduce the risk of litigation (a phenomenon sometimes known in the literature as the diversity-validity dilemma).

Huge volumes of candidates are screened for entry-level jobs using this type of technology, with limited direct human interaction in the selection process. Big data analytics and test results are increasing in popularity and replacing what was once an emphasis on work history and personal interviews (Walker, Wall Street Journal, September 20, 2012).

From an employer's standpoint, the technology supporting internet-based assessment has become much more affordable and scalable, leading to significant cost savings, more diverse applicant pools, increased accuracy and administrative flexibility (Moomaw & Scott, 2014). For racial and ethnic minorities, however, this emphasis on technology may not only raise questions about access to the internet and the technology to apply for jobs, but also about the content and processes by which they may be screened out.

Despite the ease with which these data are collected, there is surprisingly little real validation evidence being collected to substantiate the job relatedness of the instruments used. The apparent adverse impact against racial and ethnic minorities associated with criminal background checks is but one example of a widely-used selection procedure for which most employers have no validity evidence.

Validity may be further challenged by concerns about the comparability of results obtained in typical proctored testing situations (even when the test is administered by computer) versus those gained from unproctored administration on a mobile device. Such differences as screen size, loading times and internet connectivity issues may impact a candidate's performance on the assessment and lead to "inaccurate" conclusions about the candidate's abilities (Illingworth, Morelli, Scott & Boyd, 2014). These challenges are particularly important when the test measures responses to non-verbal, visual images, as is the case with some work sample tests and some promising less adverse alternatives to typical paper-and-pencil measures of cognitive ability. As Moomaw and Scott (2014) point out, professional standards recognize the need to demonstrate the equivalency of assessments delivered on mobile and non-mobile devices, particularly for high-stakes testing such as for employment.[2]

However, I want to point out that we can also harness this technology for the greater good. One of the major challenges in employment testing has been ensuring that the selection procedure itself fairly represents what an individual will be able to do on the job. Recent applications of technology and research in the testing field have permitted employment tests to measure skills in ways that are more similar to how the individual actually would perform on the job. Research has shown that "high fidelity" selection procedures, such as online work samples which simulate the job or situational judgment tests (e.g., managerial In Basket exercises or simulated troubleshooting on a fictitious manufacturing system), enhance candidate acceptance and often reduce adverse impact.

Empirical Research on Racial and Ethnic Differences

Over the past 50 years, researchers have continued to identify ways to change what we measure, how we measure it and how we use the results in attempts to reduce adverse impact against racial and ethnic minorities. Recent research has examined the extent of racial and ethnic differences in different types of selection procedures. This information can be very helpful in designing a selection process which reduces barriers for racial and ethnic minorities.

What We Measure

Much of this research has focused on identifying alternatives to cognitive ability tests because such tests have historically shown the largest differences between white and African American or Latino candidates. Interestingly, the research has also shown cognitive ability tests to be among the best predictors of job performance across a wide range of jobs and across varying levels of job complexity (Schmidt, Shaffer & Oh, 2008). Some promising recent research has identified new approaches to measuring cognitive ability which are both predictive of job performance and have substantially less adverse impact. These alternative cognitive ability measures rely less on verbal cues, context and previous learning and more on the use and application of information presented as part of the test (Bosco, Allen & Singh, 2015; Goldstein, Scherbaum & Yusko, 2009). These measures appear to be useful for predicting training success, multitasking and the ability to focus attention on task completion despite distraction. Results have shown these measures both to be good predictors of job performance and to have substantially lower racial and ethnic group differences.

Measures of non-cognitive skills, particularly measures of personality, integrity and emotional intelligence, have consistently shown small to no racial and ethnic group differences. Tests measuring non-cognitive characteristics, such as Conscientiousness, Extraversion, Agreeableness, Openness to Experience and Emotional Stability (also known as the Five Factor Personality Model) have gained widespread use as initial screens in the hiring process. These tests (whether in the form of a personality test or measured by a biodata inventory) are also used in combination with cognitive tests to reduce the adverse impact against racial and ethnic minorities in a selection process. This approach typically increases validity because it covers a larger portion of the skills required to perform the job.

How We Measure

When properly designed and validated, alternative measurement methods, such as situational judgment tests, work samples, structured interviews and simulations, have been found to have potential for reducing adverse impact while maintaining the ability to predict success in the job (Ployhardt & Holtz, 2008). Such tests measure the ability to identify and understand job-related issues or problems and to select the proper course of action to resolve the problem. Their greater validity stems from covering a broader set of the skills required for the job, and their reduced adverse impact appears to result from candidate acceptance, lower verbal and reading requirements and increased engagement of the test taker.

Another innovative example of adapting measurement methods involves the administration and scoring of structured interviews. In our own work, we have found that audio recording interviews and having them scored by panels, each of which is assigned to score only one of the interview responses, minimizes the error variance associated with using multiple assessor panels across job candidates. This innovative approach of audio recording rather than videotaping the interviews has been shown to significantly reduce race differences on the interview (McKay, Curtis, Snyder, & Satterwhite, 2005).

In reviewing this literature, I offer one caution. It is important to recognize that the identification of less adverse alternatives must consider both what we measure and how we measure it, as the impact on the test taker's performance is inextricably linked to both the content and how it is delivered. Techniques themselves may have more or less adverse impact, depending on the types of skills measured and the cognitive demands of the assessment approach.

To illustrate this point, take the case of cognitive ability tests, which have historically been reported to have a one standard deviation difference between the average scores of white and African American examinees. When cognitive ability is measured using an assessment center or a situational judgment test instead of a paper-and-pencil multiple choice test, the research shows that difference decreases to less than 0.65 standard deviations (Bobko & Roth, 2013). Same construct, different type of test. Conversely, situational judgment tests can vary widely in the extent of group differences depending on the skills being measured, ranging from less than a quarter of a standard deviation for measures of interpersonal skills to more than five times that (over a full standard deviation) for leadership skills. Same type of test, different constructs.
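For readers less familiar with the metric, group differences of this kind are usually expressed as a standardized mean difference: the gap between two group averages divided by a standard deviation. The following is a minimal sketch, assuming the pooled-standard-deviation form of the statistic (the studies cited may compute it somewhat differently). The function name and the scores are hypothetical illustrations, not data from any study.

```python
from statistics import mean, variance


def standardized_mean_difference(group_a, group_b):
    """Difference in group means divided by the pooled standard deviation.

    Assumption: pooled-SD form; other denominators are also used in practice.
    """
    n_a, n_b = len(group_a), len(group_b)
    # Pool the two sample variances, weighting each by its degrees of freedom.
    pooled_variance = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / pooled_variance ** 0.5


# Hypothetical test scores for two groups of examinees (illustration only).
scores_group_a = [52, 55, 58, 61, 64]
scores_group_b = [48, 51, 54, 57, 60]

d = standardized_mean_difference(scores_group_a, scores_group_b)
print(f"Standardized mean difference: {d:.2f}")
```

A value of 1.0 on this scale corresponds to the "one standard deviation difference" described above, and a value of 0.25 to a quarter of a standard deviation, which is what allows differences to be compared across tests that use different score scales.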

My point here is that finding and eliminating barriers by identifying less adverse alternatives is a complex process and one that still requires additional research. This research is particularly lacking in data for groups other than African Americans. Given the rapidly growing Hispanic workforce, a better understanding of the relative adverse impact of different selection procedures would be very helpful.

Finally, even well-designed selection processes cannot overcome a lack of diversity in the applicant pool recruited for the job. It has been our experience both in the private sector and in the public sector (particularly in the area of policing) that diversity of the leadership of the organization plays a significant role in attracting diverse candidates and ultimately in improving their representation in the workforce. The problem is not just a lack of diversity in entry-level hiring, but in the leadership pipeline and in the decision-making ranks.

The Changing Skill Demands of the 21st Century Workplace

Fifty years ago, many entry-level jobs required reading, writing and basic calculation skills. Today, workplaces are often global; multitasking and technological fluency are expected; and clients can be expected to come from a wide range of cultures. Consequently, the skill demands have changed.

Several surveys have examined the skills important for success in this evolving workplace. Communication skills, teamwork, critical thinking, innovation, work ethic and literacy are among the most frequently identified skills needed. A combination of cognitive, personality and interpersonal skills will be required to succeed in this era of global competition, technology and innovation. As Daniel H. Pink, the author of A Whole New Mind, writes:

We must perform work that overseas knowledge-workers can't do cheaper, that computers can't do faster, and that satisfies the aesthetic, emotional, and spiritual demands of a prosperous time (p. 61).

These required skills present opportunities and challenges: the skills needed will certainly provide a different and, in some cases, a more level playing field for ethnic and racial minorities competing for jobs. They will also require those designing selection processes to look more broadly at the tools available to measure these skills fairly and accurately.

Thank you.


Footnotes

1 Biodata is a testing method which includes questions (typically multiple choice) about past events and behaviors in the candidate's life or work history that reflect personality attributes, attitudes, experiences, interests, skills and abilities which may be predictive of job performance for a given occupation. (OPM.gov, 2015)

2 "The American Psychological Association's Standards (2014) require supporting evidence and documentation regarding the extent to which the scores are interchangeable, and if they are not interchangeable, guidance must be provided on how to interpret the scores under the different conditions. The International Test Commission's (ITC) Guidelines (2005) are even more rigorous, requiring test developers to demonstrate that the two versions have comparable reliabilities, correlate with each other at the expected level from the reliability estimates, correlate comparably with other tests and external criteria, and produce comparable means and standard deviations or have been appropriately calibrated to render comparable scores" (Moomaw & Scott, 2014, p. 2).


References

American Educational Research Association, American Psychological Association & National Council on Measurement in Education. (2014). Standards for educational and psychological testing. Washington, DC: Authors.

Bobko, P. & Roth, P. L. (2013). Reviewing, categorizing, and analyzing the literature on black-white mean differences for predictors of job performance: verifying some perceptions and updating/correcting others. Personnel Psychology, 66, 91-126.

Bosco, F., Allen, D. G. & Singh, K. (2015). Executive attention: An alternative perspective on general mental ability, performance and subgroup differences. Personnel Psychology, March 13, 2015 online.

Fallaw, S. S., & Kantrowitz, T. M. (2011). 2011 Global assessment trends report [White paper]. Retrieved from http://www2.Shl.com/campaign/2011-global-assessment-trends-report/thankyou.apx.

Goldstein, H. W., Scherbaum, C. A., & Yusko, K. P. (2009). Revisiting "g" intelligence, adverse impact, and personnel selection. In J. L. Outtz (Ed.), Adverse impact: Implications for organizational staffing and high stakes selection (pp. 95-134). New York: Routledge/Taylor & Francis.

Hebl, M. R., Madera, J. M., & Martinez, L. R. (2014). Personnel Selection. In F. T. L. Leong (Ed.), APA Handbook of Multicultural Psychology: Vol. 2. Applications and Training. Washington, D.C.: APA.

Illingworth, A. J., Morelli, N. A., Scott, J. C., & Boyd, S. L. (2014). Internet-based, unproctored assessments on mobile and non-mobile devices: Usage, measurement equivalence, and outcomes. Journal of Business and Psychology, 29(2).

International Test Commission. (2005). International guidelines on computer-based and Internet delivered testing. Granada, Spain: Author.

McKay, P. F., Curtis, J. R., Jr., Snyder, D., & Satterwhite, R. (2005). Panel ratings of tape-recorded interview responses: Interrater reliability? Racial differences? Paper presented at the 20th Annual Conference of the Society for Industrial and Organizational Psychology, Los Angeles.

Moomaw, M. E. & Scott, J. S. (2014). The promise and perils of mobile assessments. Darien, CT: APTMetrics.

Office of Personnel Management (2015). Biographical data (Biodata) Tests. Retrieved from https://www.opm.gov/policy-data-oversight/assessment-and-selection/other-assessment-methods/biographical-data-biodata-tests/.

Pink, D. H. (2005). A Whole New Mind. New York: Riverhead Books.

Ployhardt, R. & Holtz, B. (2008). The diversity-validity dilemma: Strategies for reducing racioethnic and sex subgroup differences and adverse impact in selection. Personnel Psychology, 61, 153-172.

Schmidt, F. L., Shaffer, J. A., & Oh, I. (2008). Increased accuracy for range restriction corrections: Implications for the role of personality and general mental ability in job and training performance. Personnel Psychology, 61, 827-868. doi:10.1111/j.1744-6570.2008.00132.x

Walker, J. (2012). Meet the new boss: Big data. Wall Street Journal, September 20, 2012.