Bloomberg Analysis Finds Significant Racial Bias When Using ChatGPT for HR Screening

Bloomberg analysis reveals that Gen AI tools like OpenAI’s GPT-3.5, despite their sophistication, can perpetuate bias in HR screening.

In the relentless churn of the hiring game, recruiters face a familiar foe: the overflowing inbox. The annual surge in hiring coincides with the allocation of fresh budgets, leaving recruiters to sift through mountains of resumes in search of the perfect fit.

Traditionally, this time-consuming task has relied on human intuition and experience. But a new contender has entered the ring: Artificial Intelligence (AI), specifically Generative AI (Gen AI) tools promising to revolutionize HR by automating the screening process.

However, a critical question emerges: Is AI truly the solution for overwhelmed HR departments, or is there a risk of introducing a new layer of bias into the already intricate world of hiring?

The Promise of Generative AI in HR

Gen AI has the potential to revolutionize HR by automating tasks like resume screening and candidate ranking. This frees up valuable time for recruiters to focus on strategic initiatives and candidate nurturing. Proponents even suggest that AI could be fairer than human recruiters, who may hold unconscious biases based on factors like gender, race, or age. Imagine a world where every applicant receives an objective assessment based solely on their skills and experience, free from human prejudice.

The AI HR Bias Problem

However, a recent Bloomberg analysis has revealed a critical flaw: Gen AI tools like OpenAI’s GPT-3.5, despite their sophistication, can perpetuate bias. Trained on massive datasets of text and code, these tools can unwittingly mirror and amplify existing societal biases.

Bloomberg carried out an experiment inspired by landmark studies that used fictitious names and resumes to measure algorithmic bias and hiring discrimination. Borrowing methods from these studies, reporters used voter and census data to derive names that are demographically distinct (meaning they are associated with Americans of a particular race or ethnicity at least 90% of the time) and randomly assigned them to equally qualified resumes.

When asked to rank those resumes 1,000 times, GPT-3.5, the most widely used version of the model, favored names from some demographics more often than others, to an extent that would fail benchmarks used to assess job discrimination against protected groups. While this test is a simplified version of a typical HR workflow, it isolated names as a source of bias in GPT that could affect hiring decisions. The interviews and experiment show that using generative AI for recruiting and hiring poses a serious risk of automated discrimination at scale.
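The ranking setup Bloomberg describes can be sketched as a simple tally loop. Everything below is illustrative: the group labels and names are placeholders (Bloomberg derived real names from voter and census data), and the `rank_resumes` stub stands in for an actual GPT-3.5 API call.

```python
import random
from collections import Counter

# Hypothetical demographically distinct names; Bloomberg's real names were
# associated with one race or ethnicity at least 90% of the time.
NAMES_BY_GROUP = {
    "group_a": ["Name A1", "Name A2"],
    "group_b": ["Name B1", "Name B2"],
}

def rank_resumes(resumes):
    """Stub standing in for a model call that returns resumes ranked
    best-first. An unbiased ranker is simulated here with a random shuffle."""
    return random.sample(resumes, k=len(resumes))

def run_trials(n_trials=1000):
    top_counts = Counter()
    for _ in range(n_trials):
        # Attach a randomly chosen name from each group to an otherwise
        # identical resume, so the name is the only varying signal.
        resumes = [
            {"name": random.choice(names), "group": group, "cv": "identical text"}
            for group, names in NAMES_BY_GROUP.items()
        ]
        ranked = rank_resumes(resumes)
        top_counts[ranked[0]["group"]] += 1
    # Top-rank rate per group; large gaps between groups indicate name bias.
    return {g: top_counts[g] / n_trials for g in NAMES_BY_GROUP}

rates = run_trials()
```

With the random-shuffle stub, each group's top-rank rate hovers near 1/len(NAMES_BY_GROUP); a real model that systematically favors certain names would show a persistent skew instead.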

Breaking Down the Bloomberg Experiment

The Bloomberg experiment design is particularly insightful. By using fictitious resumes with demographically distinct names paired with identical qualifications, it isolated the impact of names on the AI’s ranking. The results expose a significant vulnerability in Gen AI hiring tools. Even seemingly neutral factors like a name can trigger bias within the algorithm.

Key findings:

  • The experiment was repeated for four job postings (HR business partner, senior software engineer, retail manager and financial analyst) and found that resumes labeled with names distinct to Black Americans were the least likely to be ranked as the top candidates for financial analyst and software engineer roles. Those with names distinct to Black women were top-ranked by GPT for the software engineering role only 11% of the time, 36% less frequently than the best-performing group.
  • The analysis found that GPT’s gender and racial preferences differed depending on the particular job that a candidate was evaluated for. GPT does not consistently disfavor any one group, but will pick winners and losers depending on the context.
  • GPT seldom ranked names associated with men as the top candidate for HR and retail positions, two professions historically dominated by women. GPT was nearly twice as likely to rank names distinct to Hispanic women as the top candidate for an HR role compared to each set of resumes with names distinct to men.
  • GPT regularly failed adverse impact benchmarks for several groups across the tests. Bloomberg found at least one adversely impacted group for every job listing, except for retail workers ranked by GPT-4, the newer model Bloomberg also tested.
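The "adverse impact benchmarks" in these findings typically follow the EEOC's four-fifths rule: if a group's selection rate falls below 80% of the best-performing group's rate, that group is flagged for potential adverse impact. A minimal check of that rule, using made-up rates for illustration rather than Bloomberg's actual numbers:

```python
def adverse_impact(selection_rates, threshold=0.8):
    """Flag groups whose selection rate falls below `threshold` times the
    best-performing group's rate (the EEOC four-fifths rule)."""
    best = max(selection_rates.values())
    return {
        group: rate / best           # impact ratio relative to the best group
        for group, rate in selection_rates.items()
        if rate / best < threshold
    }

# Illustrative top-rank rates per group, not Bloomberg's actual figures.
rates = {"group_a": 0.30, "group_b": 0.25, "group_c": 0.17}
flagged = adverse_impact(rates)
# group_c's impact ratio is 0.17 / 0.30 ≈ 0.57, well under the 0.8 cutoff,
# while group_b's ratio of ≈ 0.83 passes.
```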

Mitigating AI Bias in HR

The quest for a truly equitable workplace requires constant vigilance against bias. Unconscious prejudices can creep into every stage of the employee lifecycle, from recruitment and promotion to performance reviews. To ensure responsible implementation of Gen AI in HR, a multi-pronged approach is crucial:

  • Data Scrutiny: Gen AI models are only as good as the data they’re trained on. HR departments must scrutinize training data for potential biases and actively seek diverse datasets that reflect the demographics of the workforce they’re looking to build. This might involve partnering with underrepresented communities or utilizing datasets specifically curated for inclusivity.
  • Algorithmic Auditing: Regularly audit AI algorithms used in hiring to identify and eliminate bias. This could involve running simulations with diverse candidate profiles or partnering with external AI ethics specialists.
  • Human Oversight: AI should be a tool to augment human decision-making, not replace it. Recruiters should review AI-generated rankings and make final decisions based on a holistic assessment of the candidate, including interviews and skills assessments.
  • Transparency: Be transparent about the use of AI in the hiring process. Inform candidates that AI is being used for initial screening and explain the steps taken to mitigate bias. This builds trust and demonstrates a commitment to fair hiring practices.
  • Continuous Improvement: The field of AI is constantly evolving. HR departments should stay updated on best practices for mitigating bias in AI hiring tools and invest in ongoing training for recruiters on how to work effectively alongside AI.
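One concrete form of the algorithmic auditing described above is a counterfactual test: score the same resume with only the name changed and check whether the outcome moves. The sketch below is illustrative, with a stub scorer in place of a real model; the function names are assumptions, not any vendor's API.

```python
def score_resume(resume_text, name):
    """Stub for a model-based scoring call. A properly name-blind scorer,
    like this stub, produces the same score regardless of the name."""
    return len(resume_text) % 100  # depends only on resume content

def counterfactual_audit(resume_text, names):
    """Score the same resume under each candidate name; any spread in the
    scores means the name alone is influencing the outcome."""
    scores = {name: score_resume(resume_text, name) for name in names}
    spread = max(scores.values()) - min(scores.values())
    return scores, spread

scores, spread = counterfactual_audit(
    "identical resume text", ["Name A", "Name B"]
)
# A spread of 0 means the name did not change the score for this resume.
```

In a real audit, the stub would be replaced by the actual screening model and the test repeated across many resumes and name sets, with any nonzero spread investigated before the tool is used in hiring.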

The Road Ahead

Gen AI holds immense promise for streamlining HR processes and promoting diversity in hiring. However, its potential for perpetuating bias requires careful consideration. By implementing robust safeguards and maintaining human oversight, HR departments can leverage AI’s power to create a truly fair and efficient hiring experience that benefits both employers and candidates.


This content was initially generated with the assistance of AI tools. However, it has undergone thorough human review, editing, and approval to ensure its accuracy, coherence, and quality. While AI technology played a role in its creation, the final version reflects the expertise and judgment of our human editors.

ethicAil – Building Trust in AI
