Recruitment

AI in hiring: What HR needs to know about algorithmic bias

That artificial intelligence (AI) has transformed the recruitment process is nothing new. Where a hiring manager once sorted through piles of files, comparing CV after CV from job applicants, smart technology now does the job in less time than it takes a human being to make a cup of coffee.

AI offers tools that allow for large-scale candidate screening, video interview evaluation, and even performance prediction. The advantages of time savings and perceived impartiality have led many companies to use AI in these processes.

Organisations such as Hilton have reported cutting hiring times from six weeks to just five days after experimenting with these tools, while others have seen diversity improvements of between 20% and 100% after adopting AI systems.

However, nothing is perfect, least of all a technology that acts as a mirror of what were previously purely human (and therefore imperfect and biased) decisions.

So the efficiency of AI in recruitment and hiring processes comes with a caveat: AI is only as fair as the data and design that underpin it.

In a well-known case, Amazon scrapped its experimental hiring algorithm after discovering that it downgraded CVs containing the word “women’s”. The tool had absorbed the historical gender bias in its training data, effectively penalising female candidates without any human intervention.

It's not only biased data; algorithms themselves can also introduce bias. 

For HR professionals, this is a wake-up call. The promise of AI must be balanced with a clear understanding of how biases are introduced into the system and what can be done to manage them.

When history becomes a handicap: Dataset bias

Most AI hiring tools are trained on historical recruitment data. If that data reflects discriminatory practices, consciously or not, the algorithm will likely replicate those same patterns. This can show up in several ways: misspelled or outdated personal details can disadvantage women or ethnic minorities; CVs sourced mainly from white male applicants can skew what the algorithm learns to prefer; and proxy indicators such as postcode can inadvertently exclude candidates from historically marginalised communities.
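
For teams that want to move from principle to practice, even a quick look at historical data can reveal whether a proxy such as postcode is doing the work of a protected characteristic. Below is a minimal sketch of that kind of check; the file and column names are purely illustrative and would need to be adapted to your own applicant-tracking export.

```python
# A minimal sketch of a proxy-bias check on historical hiring data.
# File and column names ("postcode_area", "ethnicity", "hired") are
# illustrative; adapt them to whatever your ATS export actually contains.
import pandas as pd

df = pd.read_csv("historical_applications.csv")  # hypothetical export

# 1. Does the proxy (postcode area) track historical hiring outcomes?
selection_by_area = df.groupby("postcode_area")["hired"].mean().sort_values()
print("Historical selection rate by postcode area:")
print(selection_by_area)

# 2. Is the proxy entangled with a protected characteristic?
# A strongly skewed cross-tabulation suggests postcode is acting as a
# stand-in for ethnicity, so a model "using postcode" may be using race.
composition = pd.crosstab(df["postcode_area"], df["ethnicity"], normalize="index")
print("\nEthnic composition of applicants by postcode area:")
print(composition.round(2))
```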

Take the example of large language models ranking CVs: a 2024 joint study by MIT and Boston University found that applicants with white male-associated names were routinely favoured over identical CVs with Black or female-associated names.
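
HR teams do not have to wait for academic studies to surface this problem: a lightweight counterfactual test, swapping only the name on an otherwise identical CV, can be run against any screening tool before it is trusted. The sketch below assumes a hypothetical score_cv() wrapper around whichever model or vendor API is actually in use; the names and CV text are illustrative only.

```python
# A minimal counterfactual "name-swap" audit, assuming a hypothetical
# score_cv() wrapper around whichever screening model or vendor API you use.
from statistics import mean

def score_cv(cv_text: str) -> float:
    """Stand-in: replace with a call to your real screening model or vendor API."""
    return 0.0

# One CV template; only the name changes between runs.
BASE_CV = """{name}
5 years' experience in B2B sales, consistently exceeding quota.
BSc Business Administration. Fluent in English and Spanish."""

# Names chosen only to vary perceived gender/ethnicity; purely illustrative.
name_groups = {
    "white male-associated": ["John Baker", "Greg Sullivan"],
    "Black female-associated": ["Aisha Okafor", "Keisha Washington"],
}

for group, names in name_groups.items():
    scores = [score_cv(BASE_CV.format(name=n)) for n in names]
    print(f"{group}: mean score {mean(scores):.2f}")
# Large gaps between groups on otherwise identical CVs are a red flag worth
# raising with the vendor before the tool goes anywhere near live hiring.
```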

These biases and prejudices, automatically absorbed by AI and then reproduced and in many cases amplified, represent a real challenge for HR teams.

AI is saving these professionals considerable time on administrative and routine tasks, freeing them to focus on strategic decisions. Making those decisions more fairly, however, means critically examining the origin and composition of the data their suppliers use. It is no longer enough to ask whether a model works; we must ask for whom it works. Insist on seeing demographic breakdowns of training data, and ask how biases were checked and mitigated before implementation.

Structural bias in models

Even with well-balanced data, a tool’s design can embed or amplify bias. Many models operate as “black boxes”, offering no insight into how decisions are made.

For example, some tools analyse facial expressions or vocal tones to infer traits like enthusiasm or confidence—yet these cues vary widely across cultures and individuals. Worse still, AI-driven transcription software often struggles with accented or impaired speech.

A 2025 study from the University of South Australia found transcription error rates of up to 22% for candidates with non-standard speech patterns, versus near-perfect accuracy for standard speakers.

HR leaders must ensure tools are not being misapplied or deployed outside their intended use case. If a model was trained to assess call-centre tone, it shouldn't be used to evaluate engineering applicants. At the same time, choose vendors that prioritise explainability—those that can clearly articulate how and why the tool reaches its conclusions.

The mirage of fairness: Misleading metrics

AI fairness metrics sound reassuring, but definitions of fairness vary, and some may unintentionally entrench inequality.

Calibration fairness, for example, requires that a score mean the same thing for every group: candidates given the same predicted score should go on to succeed at the same rate, whichever group they belong to. But if the inputs behind those scores (education, location, skills) reflect structural disadvantage, a model can pass that test and still entrench inequality; the fairness becomes superficial.
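
To make that concrete, checking calibration in practice means asking whether a given score predicts success equally well for every group. The sketch below assumes you can join the tool’s historical scores to later job outcomes; the file and column names are illustrative.

```python
# A minimal calibration check: within each score band, do candidates from
# different groups go on to succeed at the same rate?
# Column names ("score", "gender", "succeeded") are illustrative.
import pandas as pd

df = pd.read_csv("scored_candidates_with_outcomes.csv")  # hypothetical export
df["score_band"] = pd.cut(df["score"], bins=[0, 0.25, 0.5, 0.75, 1.0])

calibration = (
    df.groupby(["score_band", "gender"], observed=True)["succeeded"]
      .mean()
      .unstack("gender")
)
print(calibration.round(2))
# If a score of 0.8 means "80% likely to succeed" for one group but 60% for
# another, the tool is miscalibrated across groups. If it is calibrated but
# the underlying inputs encode disadvantage, calibration alone still won't
# make the process fair.
```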

A Nature article published in 2024 introduced the FAIRE benchmark and demonstrated how slight changes in a CV—such as swapping a name from “John” to “Aisha”—can result in vastly different rankings from an LLM, despite identical experience.

These findings underscore the importance of interrogating which metrics are being used, and whether they align with your organisation’s DEI priorities.

This is where HR's oversight becomes essential. Don’t take “we’ve tested for fairness” at face value. Ask what type of fairness was tested (statistical parity, calibration, equalised odds?), what protected characteristics were considered, and how these findings influenced model deployment.
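
It also helps to know what credible answers to those questions look like. The sketch below uses the open-source Fairlearn package to compute two of the metrics named above on a held-out set of screening decisions; the data loading and column names are illustrative.

```python
# Computing two of the fairness metrics mentioned above with the open-source
# Fairlearn package (pip install fairlearn). The data loading is illustrative:
# "suitable" is whether the candidate was actually a good fit, "shortlisted"
# is the tool's decision, and "gender" is the protected characteristic checked.
import pandas as pd
from fairlearn.metrics import demographic_parity_difference, equalized_odds_difference

df = pd.read_csv("screening_validation_set.csv")  # hypothetical export
y_true, y_pred, gender = df["suitable"], df["shortlisted"], df["gender"]

# Statistical parity: gap in shortlisting rates between groups (0 = parity).
print("Demographic parity difference:",
      demographic_parity_difference(y_true, y_pred, sensitive_features=gender))

# Equalised odds: gap in true/false positive rates between groups (0 = parity).
print("Equalized odds difference:",
      equalized_odds_difference(y_true, y_pred, sensitive_features=gender))
```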

Fairness in practice is a moving target; make sure your strategy moves with it.

When human hands reinforce machine bias

Even the best-designed AI system can fail if applied inconsistently. For instance, hiring managers may override AI recommendations in favour of “gut feel”, selectively using outputs that support their preconceptions while dismissing those that don’t. This tendency, sometimes called “reinforcement politics”, can turn AI into a justification tool for existing power structures rather than a catalyst for fairness.

To prevent this, HR should establish clear governance protocols for how AI recommendations are used in hiring decisions. It’s essential to ensure consistency across recruitment teams and document when and why exceptions are made.

Better still, integrate hiring panels that include diverse stakeholders to challenge assumptions and promote collective accountability.

Ethical AI in practice

According to a 2023 SHRM survey, 41% of organisations using AI in hiring reported observing biased outcomes. Meanwhile, IMD and Microsoft’s 2023 AI Readiness Report revealed that only 35% of businesses had protocols in place to address algorithmic bias in their hiring tools.

Fortunately, a new wave of technical and organisational solutions is emerging. Tools like IBM’s AI Fairness 360 and Microsoft’s Fairlearn enable organisations to test and audit AI models for bias. Inclusive design principles, where diverse development teams are involved early, are also reducing risks from the outset. Meanwhile, the EU’s Artificial Intelligence Act is starting to impose clear standards on algorithmic accountability.
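
To give a sense of what “testing and auditing” looks like in practice, the sketch below uses Fairlearn’s MetricFrame to break a screening tool’s behaviour down by protected group, the kind of report an HR team can reasonably ask its data function or vendor to produce. The dataset and column names are, again, illustrative.

```python
# A sketch of a per-group audit with Fairlearn's MetricFrame, breaking a
# screening tool's shortlisting rate and accuracy down by protected group.
# Dataset and column names are illustrative.
import pandas as pd
from sklearn.metrics import accuracy_score
from fairlearn.metrics import MetricFrame, selection_rate

df = pd.read_csv("screening_validation_set.csv")  # hypothetical export

audit = MetricFrame(
    metrics={"shortlisting rate": selection_rate, "accuracy": accuracy_score},
    y_true=df["suitable"],
    y_pred=df["shortlisted"],
    sensitive_features=df[["gender", "ethnicity"]],
)
print(audit.by_group)      # per-group breakdown to share with stakeholders
print(audit.difference())  # largest gap for each metric across groups
```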

Within this changing landscape, HR can take the lead by:

  • Sourcing inclusive, representative training datasets.
  • Commissioning regular third-party audits of AI tools.
  • Embedding explainability into procurement criteria.
  • Combining AI assessments with structured human evaluation.
  • Soliciting candidate feedback post-interview to identify hidden barriers.
  • Monitoring DEI metrics through dashboards to track impact over time.
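
On that last point, a dashboard does not need to be elaborate to be useful: even a simple month-by-month view of shortlisting rates per group will show whether gaps are closing or widening. A minimal sketch, assuming an export with an application date, a protected characteristic and a shortlisting flag (all column names illustrative):

```python
# A minimal sketch for trend monitoring: monthly shortlisting rate by group.
# Column names ("application_date", "gender", "shortlisted") are illustrative.
import pandas as pd

df = pd.read_csv("applications.csv", parse_dates=["application_date"])
df["month"] = df["application_date"].dt.to_period("M")

trend = (
    df.groupby(["month", "gender"])["shortlisted"]
      .mean()
      .unstack("gender")
)
print(trend.round(3))
# Feed this table into whichever BI or dashboard tool you already use and
# watch whether the gap between groups narrows after each intervention.
```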
