alexpotato 22 hours ago [-]
I went to a state school.
I then went on to work for multiple firms that placed a premium on candidates from Ivy League/top-tier (Stanford/Duke etc.) schools.
This taught me that:
- There are pros and cons to any selection criteria.
- There are smart people everywhere. One of the smartest people I ever worked for spent several years in prison for drug dealing. He was on par with many of the Managing Directors I've worked for
- There was a study where they asked big bank recruiters which school consistently produced people who were excellent employees 2-3 years out from hiring and the answer was Penn State (not my alma mater)
- There used to be "manager's choice" hires where managers had 1 slot in a training program where they could select whoever they wanted. Sometimes that was terrible. Sometimes that person was top of their training program.
- Smart people are just as capable of creating problems as less intelligent people. Smart people, in some ways, are better at creating problems, especially if the incentives reward them for creating those problems.
pc86 9 hours ago [-]
> There used to be "manager's choice" hires where managers had 1 slot in a training program where they could select whoever they wanted. Sometimes that was terrible. Sometimes that person was top of their training program.
This seems like a great idea to me if you institute a feedback loop so managers who pick trash eventually lose the option to pick, and managers who pick rock stars eventually get more picks / more responsibility.
llmslave 20 hours ago [-]
ivy league advantage even after working on the job is unreal and underestimated
kenjackson 24 hours ago [-]
I think this partially buries the lede:
"As a single hiring vendor comes to dominate screening for an industry, it may be more likely that candidates are shut out."
If we move to using just a small number of AI models to help do things like hiring, we will amplify biases and possibly completely lock out portions of the population. We need to be very careful when using AI systems to evaluate people in general -- not because they might be biased (which they might be), but because even a small bias, if used by virtually everyone, can be damning.
tbrownaw 23 hours ago [-]
> We need to be very careful when using AI systems to evaluate people in general -- not because they might be biased (which they might be), but because even a small bias, if used by virtually everyone, can be damning.
I don't think this even requires any bias.
Assume there's some loose ordering of who is or isn't a good hire, and every employer has their own fuzzy view of it. If you get slightly better or worse as a potential hire (pick up an extra degree, let your latest certification lapse, whatever), it gets somewhat easier or harder to get hired.
Now assume that same ordering, but all employers share the same view of it. I'd expect the divide between employable and not employable to be much sharper.
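A rough way to see the effect described above is a small simulation. This is only a sketch under made-up assumptions: candidate quality is a standard normal draw, each employer sees quality plus Gaussian noise, and each employer accepts when its score is above zero. The function name `simulate` and all parameter values are hypothetical, not taken from the paper.

```python
import random

def simulate(shared_view, n_candidates=20000, n_employers=10, noise=1.0, seed=0):
    """Return the fraction of candidates rejected by every employer."""
    rng = random.Random(seed)
    shut_out = 0
    for _ in range(n_candidates):
        quality = rng.gauss(0.0, 1.0)  # the "loose ordering" of candidates
        if shared_view:
            # All employers share one noisy view of the candidate.
            score = quality + rng.gauss(0.0, noise)
            scores = [score] * n_employers
        else:
            # Each employer has its own fuzzy view of the same candidate.
            scores = [quality + rng.gauss(0.0, noise) for _ in range(n_employers)]
        # An employer accepts when its score is above 0 (about half of applicants).
        if all(s <= 0.0 for s in scores):
            shut_out += 1
    return shut_out / n_candidates

independent = simulate(shared_view=False)
shared = simulate(shared_view=True)
print(f"rejected by all employers, separate views: {independent:.3f}")
print(f"rejected by all employers, shared view:    {shared:.3f}")
```

Under these assumptions the per-employer acceptance rate is about the same in both runs, yet the share of candidates who are rejected everywhere is much larger when every employer uses the same view.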
actionfromafar 23 hours ago [-]
Well, I'd say that specific ordering is the bias. But I see what you mean. The bias is arbitrary, but still very real.
Also, we will of course have all kinds of attempts to "game" the system to get ahead. Optimizing (even more) for the metric. Degree mills, for instance.
pc86 24 hours ago [-]
If you want to make meaningful change in this avenue you really can't use words like "bias" or "systemic" because anywhere from 49-51% of the population will immediately shut down upon hearing that. Someone can argue (and many do to varying levels of success) that systemic bias doesn't exist, which means this doesn't exist, which means there's no problem.
However, "this AI model can decide that some subset of people, perhaps random, perhaps not, are simply not hirable for any job" makes sense to most people regardless of political bent.
rayiner 23 hours ago [-]
The problem with the term “systemic bias” is that it takes a word that’s about differential treatment and changes the subject to disparate outcomes.
For example, the article here shows disparate impact: that different percentages of applications are passed through the AI filter. But it doesn’t show differential treatment of otherwise identical applications based on race.
nyrikki 18 hours ago [-]
Capitulation is a bad counterpropaganda tactic, especially with terms that have well defined domain specific meanings.
Note that the OP uses "systemic rejection", while the paper does reference bias, it is in the precise meaning of the word.[0] And this is not targeted at the general public.
You may want to look into the 1990 GOPAC handout "Language: A Key Mechanism of Control" to understand why some groups would simply just weaponize any term that was substituted. Academic papers need to err on the side of being precise to be effective, not focus on handling the general public with kid gloves, IMHO.
Edited to add, listen to Lee Atwater's 1981 Interview on the Southern Strategy for even more context.
It's strange to say it might be biased. Bias is absolutely impossible to avoid, especially with how today's "AI" works.
You might be able to avoid it with a panel of AI, similar to how we try to avoid it by using panels of humans, but even that turns out to be contentious and not surefire.
I have a feeling with AI it'll be even worse, since folks / companies can pass the buck (similar to how health insurance companies are now using it to deny folks).
kcexn 4 hours ago [-]
Everyone knows there is bias. The problem this article highlights is that by delegating screening and human judgment to a few AI vendors those vendors will bias all employers in the same way.
tbrownaw 22 hours ago [-]
> Bias is absolutely impossible to avoid, especially with how today's "AI" works.
Unless you're taking the "there are multiple mathematically incompatible ways to define bias" view of the topic, just do what's already known best practice for high-bureaucracy human review. Which is to define an overly-pedantic standard rubric.
mstewartgallus 50 minutes ago [-]
"we will amplify biases and possibly completely lock out portions of the population."
A lot of the capitalists see that as a positive.
SecretDreams 22 hours ago [-]
Agreed. Humans are also biased, but our biases are different across a lot of socio-economic factors. So when we have different people in these positions, the biases become less bias-y.
But LLMs are statistical models. They are aggregating all biases into a general super bias. And they're all converging towards the same solutions.
slashdave 23 hours ago [-]
It is also illegal
pc86 9 hours ago [-]
What law does it violate? It's not even clear to me what "it" you're referring to.
mstewartgallus 47 minutes ago [-]
capitalists have never cared about that
wand3r 1 days ago [-]
Did I miss the part of the article where they break down how they determined race? Is the algorithm blind to race? It looks like they specifically looked at 83k people applying to ~100 companies which notably were Fortune 500 companies. Could there simply be candidate discrepancies here? Hard for me to follow the full methodology but it doesn't necessarily seem either malicious or that well structured. Don't you need to have a control group of applicants who are similar on paper? To allege DISCRIMINATION is quite bold.
If you click through, the paper says the race is self-reported.
“Our data tracks 4,197,168 applications. It includes applicant gameplay features and for each application, the application date, the position name and employer, metadata about the position and employer, and the numerical score and final recommendation each applicant received for each completed application. 40.2% of applicants self-report race with a breakdown of 16.8% Asian, 14.2% White, 3.6% Black, 3.0% Hispanic, and all other racial categories below 2% (i.e. fewer than 100,000 applicants).”
zerocrates 1 days ago [-]
The 83,000 applications to Fortune 500 companies, that was a different previous study they compared their results to. This paper's takeaway is that unlike that Fortune 500 data, the applications here that went through an ML vendor's screening process showed evidence of "systemic rejection," where some applicants got rejected across the board at higher rates than you'd expect if they were facing independent would-be employers.
gacgacgac 1 days ago [-]
Yes. You missed it. They are using a test dataset of 83k resumes generated in 2022 for this paper and comparing it as a baseline against their observational data: https://www.nber.org/papers/w29053
The dataset is constructed, deliberately, to hold candidate performance constant and vary the names of candidates to appear to be associated with a specific race.
AStrangeMorrow 1 days ago [-]
From looking at how that was done, it seems they (the paper you linked) used an older paper which looked at which names are frequent enough and more biased toward a certain demographic (90% of that name occurrence falls within that demographic).
But they picked 9 family names per group. Which sounds quite low. And combined that with first names to reach 500 first+last names per group.
I wonder how much of the bias we see has to do with the names actually picked versus it being racially motivated (absolutely not denying that this probably is a factor, but might not be the only one).
For example, in France there is the national BAC end of high school exam. If you look at the name × grade distribution, and look at the higher “very good” bracket: some names are heavily under-represented (less than 5% of say “Jordan” get that grade) while some are over-represented (35% of “Josephine” get such a grade). The exam is for the most part anonymous, but some names are definitely heavily correlated with lower/higher income groups. So nothing surprising: Josephines tend to come from richer families, thus on average get better education/support, thus better grades. Same thing is true with family names to a smaller extent.
So I wonder how much of the bias we see, be it from real persons or the AI has more to do with a class thing than a racial thing. Again those are not neatly separate things, but still
pc86 24 hours ago [-]
Race and socioeconomic status are pretty strongly correlated but I'd imagine it's possible to do a study to see what the extent of each's influence is. You'd need to find "high socioeconomic" names that are also strongly correlated with race(s) themselves correlated with low socioeconomic status and vice versa which honestly might be the hardest part. The disambiguation from a statistical standpoint doesn't seem that difficult once you have the data.
pc86 9 hours ago [-]
Another hindrance to collecting data in the first place is that idiots will read this and downvote it because they think statistics are racist, I guess?
rayiner 24 hours ago [-]
That’s an earlier paper. This one involves 3 million real applicants, with no control for applicant quality.
xp84 24 hours ago [-]
Wow. So, all the 'people' and 'resumes' involved are fake, but they submitted them to real jobs?
Cool.
In any event, I'd happily support a ban on all parts of the ATS that could be involved in automated approval, rejection, or scoring being able to see candidate names. But I sense the author of this has a bigger agenda.
8note 24 hours ago [-]
id expect any algorithm to learn race by other properties in the data?
its going to be in the rest of the data because race has a meaningful correlation, and plenty of causation with being disadvantaged in real ways, that can also affect the ability to then do certain jobs.
like, the environmental pollution and building interstates and freeways through black communities, on purpose to do bad things to those communities, then results in a bunch of noise and particulate pollution, that is bad for developing brains.
you wont be able to do some meritocratic non-racist hiring without fixing the environmental racism. otherwise youre just mirroring racism other people built for you
Oras 1 days ago [-]
Misleading title: the paper [0] does not mention any CV screening that might suggest racial or gender bias. It is purely about an assessment tool. No AI or LLMs.
I'm not saying AI is not biased, but this study does not prove that.
> Fig. 1. The pymetrics process.
> Stage 1: Applicants apply to positions.
> Stage 2: Applicants are directed to the pymetrics platform to play assessment games.
> Stage 3: pymetrics algorithms use applicant gameplay features to recommend 58.2% of applicants per position on average.
> Stage 4: Employers decide which applicants to interview or hire, typically rejecting applicants that were not recommended by pymetrics.
alain94040 1 days ago [-]
The European Union passed The Artificial Intelligence Act, which classifies:
High-risk – AI applications that are expected to pose significant threats to health, safety, or the fundamental rights of persons. Notably, AI systems used in health, education, recruitment, critical infrastructure management, law enforcement or justice. They are subject to quality, transparency, human oversight and safety obligations
That's a pretty common sense legislation to me.
anon373839 1 days ago [-]
The AI “safety” industry is lobbying for federal preemption so that states won’t have the power to enact these types of sensible regulations.
pc86 24 hours ago [-]
> > European
anon373839 20 hours ago [-]
Right, I'm saying that these are sensible regulations and Anthropic et al want to prevent US states from being able to pursue them.
pc86 24 hours ago [-]
This is one of those things where the first sentence sounds completely fine and reasonable, maybe even objectively good.
Of all the things listed, "recruitment" doesn't belong, to me. Is the argument that it is someone's fundamental human right to get someone else to pay them to do a job? Or is it strictly about human oversight?
tadfisher 20 hours ago [-]
Friend, we are discussing a demonstration of systemic bias in hiring decisions made by AI models. The argument is that it is someone's fundamental right to be treated the same as someone else with the same qualifications but different skin color.
tbrownaw 23 hours ago [-]
> That's a pretty common sense legislation to me.
There's no reason to single out AI vs any other approach to the same topics.
daft_pink 24 hours ago [-]
Anyone who’s done hiring wouldn’t be shocked by this:
We find applicants are more likely to be rejected from every position they apply to than would be predicted by the baseline of each position making statistically independent decisions.
Obviously a rejected resume is more likely to be rejected by every other employer and an accepted resume is more likely to be accepted by every other employer. Like online dating, most employers are looking for some baseline indicators that you are going to be successful and stable.
verteu 23 hours ago [-]
> Obviously a rejected resume is more likely to be rejected by every other employer and an accepted resume is more likely to be accepted by every other employer.
But that wasn't the case for non-algorithmic screening. From the paper:
"By contrast, we find that when first round screening is not mediated by a single screening procedure, systemic rejections are close to the baseline. To support the empirical validity of our baseline, we study homogeneous outcomes in the largest study of first-round screening at U.S. employers to date. Kline et al. [38] generated 83000 synthetic resumes and submitted these resumes to vacant positions at 108 US companies between October 2019 and April 2021, a similar time period to our data. The companies, which are a subset of the Fortune 500, collectively employ 15 million workers. We analyze the homogeneity observed in the resulting callback outcomes in their data. We find that the baseline is an effective estimator of the systemic rejection rate for this dataset. As shown in Figure 3, the observed systemic rejection rate is accurately predicted by the baseline and a chi-squared goodness-of-fit test cannot reject equality of the two distributions (χ² = 20.05, p = 0.69). In other words, while the largest previous study observes systemic rejection rates consistent with employers making statistically independent decisions, the algorithmic hiring data shows significantly correlated outcomes that lead to higher-than-baseline systemic rejection rates."
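To make the "baseline" in that quote concrete: if positions decide independently, the chance of being rejected by all of them is just the product of the per-position rejection rates. The sketch below is a simplification, not the paper's estimator. It assumes every position rejects at the same rate, taken from the 58.2% average recommendation rate quoted elsewhere in this thread, and the helper name `independent_baseline` is made up for illustration.

```python
from math import prod

def independent_baseline(rejection_rates):
    """Probability of being rejected by every position if decisions are independent."""
    return prod(rejection_rates)

recommend_rate = 0.582           # average recommendation rate quoted from the paper's Fig. 1
reject_rate = 1 - recommend_rate  # assumed identical for every position (a simplification)

for k in (1, 2, 3, 5):
    baseline = independent_baseline([reject_rate] * k)
    print(f"{k} applications: baseline all-rejected probability = {baseline:.4f}")
```

An observed all-rejected rate well above this product is what the paper calls a higher-than-baseline systemic rejection rate; real positions have different rates, so the actual baseline would use each position's own number.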
tbrownaw 23 hours ago [-]
The paper with the non-algorithmic screening used synthetic resumes. I rather suspect that they didn't generate a realistic distribution of qualifications levels.
verteu 23 hours ago [-]
Yeah, good point.
drillsteps5 5 hours ago [-]
This does not sound right to me. It would be correct if all companies and their hiring managers had the same requirements and were looking for candidates with the same qualifications, but that's not the case.
Obviously as a hiring manager you're looking for a hard working individual with a number of successfully completed projects and glowing referrals from multiple places of employment, but you're also looking for a person with expertise in particular technologies/industries/whatever other areas of expertise. To a large extent requirements for each role are unique, however many do have some overlap.
So being rejected from one position might simply mean there's a misalignment between what the company is looking for and what the individual has. Which might not be the case with other companies.
So if we're seeing an increasing number of candidates being consistently rejected at multiple places the question "why" is a valid one.
zeroonetwothree 24 hours ago [-]
Yes I don’t understand why this is surprising or problematic at all?
Actually the fact that they found this result didn’t hold in a different dataset is especially weird.
pc86 24 hours ago [-]
> a rejected resume is more likely to be rejected by every other employer
This makes sense to me, albeit intuitively and in a way I can't articulate.
> an accepted resume is more likely to be accepted by every other employer
but this doesn't necessarily follow from the prior for me. Plenty of people get really good jobs and are really successful in them only after dozens or hundreds of rejections with a nearly-identical resume.
daft_pink 23 hours ago [-]
If you look at the chart, the systemic rejection rate is only like 5-10%. It’s not a huge impact and it’s just about getting an interview not getting the job, so they could still get rejected.
I just think certain resumes will get an interview almost every time in some industries and certain resumes will almost never get an interview, but the majority of resumes, like you say, have different aspects that appeal to one employer over another.
heylook 24 hours ago [-]
The intuition is that they are not truly independent statistical events. Each trial reveals more information about the underlying "quality" of the resume (for passing this trial, not necessarily real world "quality" of the candidate). We are not rolling dice where each toss is fundamentally unrelated to prior tosses.
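That intuition can be written as a tiny Bayes update. This is a toy sketch with invented numbers: resumes are either "weak" or "strong" with equal prior probability, a weak resume is rejected 80% of the time and a strong one 20% of the time. None of these values come from the paper.

```python
def posterior_weak_after_rejection(prior_weak, reject_if_weak, reject_if_strong):
    """Bayes' rule: P(weak resume | one rejection observed)."""
    numerator = prior_weak * reject_if_weak
    denominator = numerator + (1 - prior_weak) * reject_if_strong
    return numerator / denominator

# Hypothetical numbers, chosen only for illustration.
prior_weak = 0.5
reject_if_weak = 0.8
reject_if_strong = 0.2

# Chance of rejection with no prior information about the resume.
p_reject_first = prior_weak * reject_if_weak + (1 - prior_weak) * reject_if_strong

# After one rejection, the resume is more likely to be a weak one...
post_weak = posterior_weak_after_rejection(prior_weak, reject_if_weak, reject_if_strong)

# ...so the next rejection is more likely than the first one was.
p_reject_second = post_weak * reject_if_weak + (1 - post_weak) * reject_if_strong

print(f"P(reject) on first application:          {p_reject_first:.2f}")
print(f"P(weak | rejected once):                 {post_weak:.2f}")
print(f"P(reject on second | rejected on first): {p_reject_second:.2f}")
```

Each employer here still decides on its own; the outcomes only look correlated because they all depend on the same hidden resume quality, which is why the independence baseline and the choice of dataset matter.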
dash2 1 days ago [-]
> To measure adverse impact, we apply the EEOC’s “four-fifths rule,” which flags a position when one group is recommended at less than 80% of the rate of the most-recommended group
That seems like a nonsensical way to measure racial discrimination. What could justify it?
gacgacgac 1 days ago [-]
Have you googled this? The EEOC is a federal agency, and they've published on this topic quite extensively. The four fifths rule is used to define if there is a "substantially different selection rate". It does not measure racial discrimination. It measures selection rate.
It indicates there may be adverse impact to one group. It specifically is not used to resolve racial discrimination.
It's purely a signal for "we should consider asking more questions, because this appears unusual". That's what your quote says too, it "flags" a low recommendation -- it's indicating further study and investigation is likely warranted.
rayiner 1 days ago [-]
Your summary of the EEOC guidance is correct. The problem is that the study here is using the four-fifths rule as a measurement of discrimination, instead of as a flag that triggers further investigation. It's in section 3.1 of the paper: https://arxiv.org/pdf/2605.27371.
"Adverse impact occurs when there is (i) practically and (ii) statistically significant disparities in the selection rate for the group of interest when compared against the selection rate of the most selected group. Practical significance requires the impact ratio ... to be less than 0.8, which is why the EEOC guidance is colloquially referred to as the 'four-fifths' rule."
The headline numbers reflect the positions for which the 4/5 rule was triggered, not the result of some further investigation: “We discovered that 26% of Black applicants and 15% of Asian applicants applied to positions where the AI system discriminated against their racial group.” Based on the methodology, I think that means that 26% of black applicants applied to positions that were flagged under the 4/5ths rule.
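For anyone unsure what the flag actually computes, here is a minimal sketch of the practical-significance half of the test: each group's selection rate divided by the most-selected group's rate, flagged when the ratio is below 0.8. The group names and rates are invented, the function name `four_fifths_flags` is hypothetical, and the statistical-significance condition from the quoted definition is not implemented.

```python
def four_fifths_flags(selection_rates, threshold=0.8):
    """Impact ratio of each group against the most-selected group, plus a flag."""
    top_rate = max(selection_rates.values())
    if top_rate <= 0:
        raise ValueError("at least one group must have a positive selection rate")
    result = {}
    for group, rate in selection_rates.items():
        ratio = rate / top_rate
        # Flagged when the group is selected at less than 80% of the top group's rate.
        result[group] = {"impact_ratio": ratio, "flagged": ratio < threshold}
    return result

# Hypothetical selection rates for a single position.
rates = {"group_a": 0.60, "group_b": 0.50, "group_c": 0.45}
flags = four_fifths_flags(rates)
for group, info in flags.items():
    print(f"{group}: impact ratio {info['impact_ratio']:.2f}, flagged={info['flagged']}")
```

With these made-up rates only group_c falls under the threshold. The flag says nothing about why the rates differ, which is the distinction between a trigger for investigation and a finding of discrimination.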
nemomarx 1 days ago [-]
I guess it measures if there's more than one std deviation gap between highest and lowest? Assuming that's twenty percent here
it sounds like how you'd get that kind of metric at least
paisawalla 1 days ago [-]
This is an application of the disparate impact doctrine. Even facially neutral policies are considered suspect if they produce results that correlate against protected groups, irrespective of intent.
This doctrine is the basis for much of employment law. It is a significant reason why employers don't administer IQ tests (or equivalents) to screen candidates since ~the 90s.
A common objection to the doctrine is that it leads to unfalsifiable discrimination claims, which is why it seems nonsensical to you.
tptacek 2 hours ago [-]
No it isn't. This is an Internet myth. IQ tests in white-collar employment aren't unlawful or even risky, and several household name giant corporations (with deep pockets for plaintiffs attorneys) do so openly for some of their roles.
59percentmore 1 days ago [-]
And a common rebuttal to the objection is that systemic racism is often difficult to untangle in a way that produces a neat chain of cause and effect (not least of which because discrimination can happen unconsciously or secretly); because the impact exists whether intent can be shown or not, the desire remains to ameliorate that impact.
If the issue happens upstream of the defendant to a claim - generally an organization being sued by an individual with fewer resources - it incentivizes such entities to push for changes upstream, so that they don't get stuck with the bill.
AnthonyMouse 23 hours ago [-]
> And a common rebuttal to the objection is that systemic racism is often difficult to untangle in a way that produces a neat chain of cause and effect (not least of which because discrimination can happen unconsciously or secretly)
We have a "disparate impact" and nobody can prove what proportion of it is due to things like parental income or childhood education as opposed to racism on the part of the employer. Because the former considerations are real contributors, the metric can regularly be expected to exceed the threshold even if the contribution of racism by the employer was zero. Doesn't that imply that we're essentially accusing people of racism at random?
> because the impact exists whether intent can be shown or not, the desire remains to ameliorate that impact.
The median household income for Asian Americans of Indian ethnicity is more than double those of Burmese ethnicity:
This is objectively a disparate impact and likely shows up in several other metrics in addition to income. Disparate results can almost universally be obtained by arbitrarily segmenting the population into different groups and comparing the midpoints. Americans of Australian ancestry have a higher median income than those of Irish ancestry, Bolivians higher than Cubans. The result is often because the lower down group has a history of being oppressed.
What reasoned means can we use to determine which groups get the benefit of these methods to ameliorate the disparity and which don't? What should be done about the inherent impossibility of doing them simultaneously, e.g. because hiring a South African woman over a Haitian man would reduce the disparity on one axis while increasing it on another? Notice that considering each group separately could result in unconditional liability because either available alternative puts you over the threshold for one group or the other.
> If the issue happens upstream of the defendant to a claim - generally an organization being sued by an individual with fewer resources - it incentivizes such entities to push for changes upstream, so that they don't get stuck with the bill.
Do we want to apply this logic to other things? The median income in California and New York is significantly higher than it is in Alabama or West Virginia and they have higher ranked public schools. We can correspondingly expect that when applicants from different states apply for the same job, the ones from California and New York (even if they're the same race etc.) are more likely to be selected because they had more advantages growing up, even though none of them chose where they were born.
By the same reasoning we should then have the federal government penalize employers for hiring the applicants from the more affluent states so that it "incentivizes such entities to push for changes upstream, so that they don't get stuck with the bill." Does it make sense to do that?
aakresearch 20 hours ago [-]
That would end exactly at "Harrison Bergeron" world as described by Kurt Vonnegut, would it not? If every perceived advantage would require you to wear a "handicap".
AnthonyMouse 18 hours ago [-]
What do you mean? That's 20th century technology. It's making individualized evaluations. The elites wouldn't put up with that. They'd either have to take handicaps themselves or risk people noticing that they don't.
That's the beauty of aggregate statistics. You have some elite job with 100 slots. Before you had 60 of the slots going to cronies and the remainder being allocated on merit. The cronies were disproportionately of the same ethnicity so your statistics are skewed, but don't worry, all the cronies still get their slots. Because statistics can be balanced by getting even more cronies, this time of a different ethnicity, and giving them as many of what used to be the merit slots as you need to manipulate the average. Using statistics is perfect for pretending that you're giving people something when you're actually taking something away.
paisawalla 1 days ago [-]
What evidence would disprove the claim that systemic racism is the cause of a persistent disparity?
pc86 23 hours ago [-]
Why is this the one time someone is expected to disprove a claim rather than the claimant being expected to provide evidence?
If you're making the claim you need to provide the evidence.
Most people would say that a persistent disparity means it's possible there is discrimination, but it's not definitive proof.
tbrownaw 22 hours ago [-]
I read that question as a suggestion that the claim is unfalsifiable (ie, bullshit, unscientific, etc).
gacgacgac 1 days ago [-]
Importantly, the rule is not used to resolve racial discrimination claims. It's purely meant as the first test to evaluate whether a deeper dive is warranted. Fast, first pass data analysis tools are very useful for spotting unintended consequences.
Manuel_D 1 days ago [-]
To the contrary, companies have been found liable for discrimination solely based on having the wrong percentage outcomes in their objective hiring assessments: https://en.wikipedia.org/wiki/Griggs_v._Duke_Power_Co.
paisawalla 1 days ago [-]
You are selectively adhering to the letter of the law, when the practical effects are already well known and studied. One is not obligated to ignore literature, nor abstain from doing a simple extrapolation from the incentives placed on the table.
There is a large body of literature concerning the question "does disparate-impact enforcement cause employers to alter hiring behavior in ways unrelated to actual productivity or discrimination?" and the answer is largely "yes". As you suggested elsewhere in this discussion, Google may be useful.
SiempreViernes 1 days ago [-]
That's not particularly surprising nor objectionable, of course legislation that reminds employers they shouldn't discriminate based on race changes practice even for companies that aren't actually caught doing it.
To act like it's bad that people of colour have a more fair chance of getting employed because of some piece of legislation is simply insidious. It's just been over a month since black people lost the right to a fair vote.
rayiner 1 days ago [-]
> It's just been over a month since black people lost the right to a fair vote.
Literally the opposite happened. The Supreme Court ruled that there was VRA §2 liability when there was evidence of racially-motivated gerrymandering: "In short, §2 imposes liability only when the evidence supports a strong inference that the State intentionally drew its districts to afford minority voters less opportunity because of their race." (Louisiana v. Callais, p. 26)
paisawalla 24 hours ago [-]
I don't start from the conclusion that disparities are evidence of racism.
runako 1 days ago [-]
> selectively adhering to the letter of the law
Are you suggesting that companies should violate the law here? What do you recommend?
Edit: charitably, "adhering to the letter of the law" is sometimes shortened to "law-abiding" and is generally what we want.
paisawalla 24 hours ago [-]
You've misunderstood the point.
Prior to the beginning of your excerpt is the word "You", meaning the comment's author is the subject, not "companies". I'm saying the commenter is appealing to black letter law for the answer to the question "what happens when..." but we have observational evidence to answer the question.
runako 23 hours ago [-]
> we have observational evidence to answer the question.
Isn't the point that the observational evidence amounts to the companies in question steering clear of illegal behavior?
There are anti-money laundering laws, so banks institute procedures to help them comply. Yes, we expect companies to change their processes so they don't break the law. That's the point of the law.
I am confused with what you think companies should do in this situation. Expose themselves to legal and civil liability? Or change their behaviors so that close scrutiny indicates they are trying to comply with the laws and any bad actors acted against internal procedure?
paisawalla 17 hours ago [-]
The thread you're weighing in on is about whether the four-fifths rule is merely diagnostic, or functionally punitive.
You're arguing about something else, using the form "given we know that racism is happening, and furthermore we know where and how it is happening, why shouldn't we just do these Totally Cost Free and Obviously Good Things that are just like fighting money laundering"
Well, I just don't think any of these things are as evident as you seem to assume they are. Also fwiw I grew up in the US, where I was called all sorts of slurs -- like, the real ones you probably censor in your internal monologue when you see them written -- throughout K-12 education. I still don't believe in the existence of pervasive and oppressive racism the way you seem to assume it.
logicchains 1 days ago [-]
>What could justify it?
The assumption that applicants from all races are on average equally qualified for every position. Whole subfields of modern academia are based on that assumption.
aenis 1 days ago [-]
I am wondering - if in those circles, questions such as 'is NBA intentionally discriminating against asians - or is the fact that long distance running is dominated by, say, Ethiopians an example of discrimination' are ever discussed - or declared taboo and racist? I don't doubt that the assumption is just plain, demonstrably wrong - we all evolved under different types of environmental pressures - I am just wondering if the proponents of the all-the-races-are-same-on-average are ever discussing those obvious facts, and what answers do they come up with to explain the, say, unfair underrepresentation of Japanese in the NBA.
59percentmore 1 days ago [-]
The assumption is that no one has the authority to decide that all races aren't equally qualified for every position.
xp84 24 hours ago [-]
"Races" aren't qualified for anything. Neither are star signs or favorite Hogwarts houses.
Individuals are qualified or unqualified. If a company happens to end up with less than 1/4 Ravenclaws or not very many Virgos, it doesn't mean hate is a reason. It could be that the Ravenclaws that applied were a bit less qualified than those from the other houses.
I guess my point is, doing the statistical analysis for race and gender and drawing conclusions, while being completely blind to the one single factor any sane hiring manager should be focusing on -- actual qualifications for the role -- doesn't make any sense.
avadodin 24 hours ago [-]
It could make sense if one was looking to make interventions early on before the candidates reach the selection process.
Don't claim AI is discriminating against non–selects, though.
I doubt companies are using Gr*k to make their hiring decisions.
Manuel_D 23 hours ago [-]
Eliminate the double negative, and you're making the same statement as the comment above.
sdellis 1 days ago [-]
Unless you believe that Black people are racially inferior, I think this is simply evidence of racial discrimination at a systemic level, from education through employment. AI merely reinforces the systems built to favor white people.
adammarples 23 hours ago [-]
There are many other potential explanatory factors than your simple binary. Black people in America started in a very bad and difficult position, only a few generations ago, with huge racial discrimination, no money, and generational distrust of institutions. That is a factor that will affect what you see today without any current system of racial discrimination or inferiority.
sdellis 9 hours ago [-]
If I hear you correctly, the lack of reparations toward Black people in America is more to blame for the discrepancy than systemic racism? Perhaps it could be both?
I am getting downvoted because it's hard to admit that AI only reinforces the culture that it is trained on. It is the perfect technology to keep systemic racism in place, all while being the perfect scapegoat for lack of personal or corporate accountability.
mstewartgallus 40 minutes ago [-]
The failure of land reform in reconstruction really fucked things up a lot.
lazide 24 hours ago [-]
‘Every one is the same’, even when one group or another doesn’t like doing some kind of work for some reason.
Because surely no one would have legitimate preferences based on their gender, cultural norms, etc. or real differences in aptitude due to childhood exposure, education, or said norms and preferences.
> Since the 80% test does not involve probability distributions to determine whether the disparity is a “beyond chance” occurrence, it is usually not regarded as a definitive test for adverse impact. Instead, other statistically significance tests, such as the standard deviation analysis, may be used for this purpose.
But then my question recurs: isn’t this a ridiculous way to measure discrimination? It’s assuming that the only thing that differs between the different ethnic applicant pools is their ethnicity, which is essentially never going to be true.
gacgacgac 1 days ago [-]
It's not used to measure discrimination. It's used to identify outcomes that appear to be potentially discriminatory. You have to do the legwork afterwards.
Like. If I am evaluating a developer on lines of code written, I am a bad manager. But if an engineer has 40% fewer lines of code than the team median, it's absolutely ok for me to go, "Interesting. What's the story there? Are they slower or is there some other factor?"
Same idea -- this is purely a fast, first pass metric that can quickly assess if something warrants a deeper evaluation.
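To make the "first pass metric" concrete, here is a minimal Python sketch of a four-fifths check. The function names and the counts are hypothetical, invented for illustration; the only thing taken from the discussion is the 80% threshold. It divides each group's selection rate by the highest group's rate and flags anything under 0.8.

```python
def selection_rate(selected: int, applicants: int) -> float:
    """Fraction of applicants in a group who were selected."""
    if applicants <= 0:
        raise ValueError("applicants must be positive")
    return selected / applicants


def impact_ratios(groups: dict[str, tuple[int, int]]) -> dict[str, float]:
    """Map each group to its selection rate divided by the highest group's rate.

    `groups` maps a group label to (selected, applicants).
    """
    rates = {g: selection_rate(s, n) for g, (s, n) in groups.items()}
    best = max(rates.values())
    if best == 0:
        raise ValueError("no group had any selections")
    return {g: r / best for g, r in rates.items()}


def flagged(groups: dict[str, tuple[int, int]], threshold: float = 0.8) -> list[str]:
    """Groups whose impact ratio falls below the threshold (a screen, not proof)."""
    return sorted(g for g, ratio in impact_ratios(groups).items() if ratio < threshold)


# Hypothetical counts, for illustration only.
example = {"A": (50, 100), "B": (35, 100), "C": (45, 100)}
print(impact_ratios(example))
print(flagged(example))
```

A flag here only says "look closer". It says nothing about why the rates differ, which is the legwork that has to come afterwards.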
dash2 1 hours ago [-]
Right, but what I meant was: the other tests that the article says are used for definitively proving discrimination are equally bad, and subject to the same objection. Just substituting “one standard deviation” or “statistical significance” for “80%” doesn’t fix the fundamental problem here, which is that there are unmeasured confounders.
blharr 1 days ago [-]
You are correct, but especially in current day that analogy is quite bad.
I expect Median LoC might be very high with the average developer using AI these days... but the dev who is making atomic changes that are fixing the AI output is probably tiny LoC but way more important
moate 1 days ago [-]
How would you like me to define "starting point" in a way that you believe you'll be able to understand?
If you are trying to say "more data needed, headline misleading" you should say that instead of misrepresenting the 4/5ths rule. Also the word "can" implies uncertainty of conclusion. This isn't ridiculous, the authors point out that this is the first large scale study of this topic. Nothing has been "proven" here, it's showing that this warrants further investigation and attention.
Do you read many academic papers, because you seem to be having a rough go here.
kolbe 1 days ago [-]
You could be an Iranian sponsored bot. I'm not saying you are. You could be so don't get mad at me for publishing that statement. Because if I say "can," then I don't need to be accountable for any misinformation.
moate 19 hours ago [-]
I could be! And in science when you posit a hypothesis you then back it up with data points of statistical significance.
The authors of this study have done that here. To borrow from your example, if you saw a statistically significant amount of my posts highlighting the merits of the Iranian government in ways that run counter to the general global consensus of their actions, you would then have something that other people might agree was worth looking into. A hypothesis is not “misinformation”.
This article has not claimed to have proven anything other than outcomes in a process. I don’t understand why this is so upsetting to you.
kolbe 19 hours ago [-]
I don't know who you think you're talking to about being upset, but this was exactly the halfwitted "u maaaad" response I would expect from a bot.
poplarsol 1 days ago [-]
The desire to subsidize employment for Democratic constituencies by threatening legal action if they aren't given enough jobs.
They find "disparate impact" of pymetrics across racial groups, but it doesn't seem like they controlled for anything.
efavdb 1 days ago [-]
They also say that if they do the analysis globally the effect goes away. Curious, does that not imply that if one domain is biased against some group there would be another where the bias was in its favor?
zeroonetwothree 24 hours ago [-]
Also has issues of random chance causing these differences. How many different positions are there that have the chance of an 80% effect?
verteu 23 hours ago [-]
They're using a Benjamini–Hochberg correction (alpha=0.05) to account for multiple comparisons (see Table 2).
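For anyone unfamiliar with the correction, this is a minimal sketch of the standard Benjamini–Hochberg step-up procedure in Python. It is the textbook version, not the paper's code, and the p-values are made up. Sort the p-values, find the largest rank k with p_(k) <= (k/m) * alpha, and reject everything at or below that rank.

```python
def benjamini_hochberg(p_values: list[float], alpha: float = 0.05) -> list[bool]:
    """Return, for each p-value (in input order), whether it is rejected
    while controlling the false discovery rate at level alpha."""
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest p.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Largest 1-based rank k such that p_(k) <= (k / m) * alpha.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            cutoff_rank = rank

    # Reject every hypothesis ranked at or below the cutoff.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            rejected[idx] = True
    return rejected


# Made-up p-values, one per tested position.
p = [0.001, 0.008, 0.039, 0.041, 0.6]
print(benjamini_hochberg(p))
```

With many positions tested, this keeps the expected share of false flags among the flagged positions at or below alpha, which is the random-chance worry raised above.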
"Cards held by African-American sellers sold for approximately 20% ($0.90) less than cards held by Caucasian sellers, and the race effect was more pronounced in sales of minority player cards."
tbrownaw 24 hours ago [-]
> We find that people who submit multiple applications to positions screened by the same algorithmic hiring vendor are more likely to be rejected from every position to which they apply than would be true if the companies made decisions statistically independently from one another. Ten percent of applicants who submit four applications are rejected from all the places to which they apply.
> Our research also found that this pattern does not appear to be the case in other circumstances. We analyzed data from the largest prior study of hiring decisions, which sent 83,000 applications to 108 Fortune 500 firms during the same time period as our study and did not focus on whether AI was used to make decisions. We found that the rate at which applicants were rejected from every firm they applied to in this data was no higher than what you’d expect if each company decided independently of the others.
It sounds like this study was using real-world applicants, and the other study they're comparing against was using synthetic applicants.
Consider the chance of being accepted as being composed of signal+bias+noise. Noise is random. Signal is a per-applicant value, and what's meant to be measured. Bias is a per-group value, and an artifact of the measuring process.
If acceptance/rejection is independent between positions applied for (as in the synthetic applicant study), that suggests that it's random or composed entirely of noise; ie there is no signal; ie the applicants are all equally qualified.
If acceptance/rejection is correlated, that means there is some nonzero amount of (signal+bias). But real-world applicants are not all identical, so there should be some amount of signal. So you can't just assume zero signal in order to infer that there must be bias.
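A small simulation sketch of that argument, in Python. Everything here is an assumption for illustration: normally distributed signal and noise, a fixed cutoff, and the bias term set to zero. It compares the observed share of applicants rejected from every position with the share you would expect if positions decided independently.

```python
import random


def all_rejected_rate(n_applicants: int, n_positions: int,
                      signal_sd: float, noise_sd: float,
                      threshold: float = 0.0, seed: int = 0) -> tuple[float, float]:
    """Return (observed, independent_baseline) rates of being rejected everywhere.

    Each applicant has one per-applicant signal; each application adds fresh
    noise. Bias is deliberately zero in this sketch.
    """
    rng = random.Random(seed)
    rejected_everywhere = 0
    total_rejections = 0
    for _ in range(n_applicants):
        signal = rng.gauss(0.0, signal_sd)
        rejected = [signal + rng.gauss(0.0, noise_sd) < threshold
                    for _ in range(n_positions)]
        total_rejections += sum(rejected)
        rejected_everywhere += all(rejected)

    per_position_rate = total_rejections / (n_applicants * n_positions)
    observed = rejected_everywhere / n_applicants
    # What "rejected from all" would be if each decision were independent.
    independent_baseline = per_position_rate ** n_positions
    return observed, independent_baseline


# Applicants differ in signal: rejections become correlated with no bias at all.
print(all_rejected_rate(20_000, 4, signal_sd=1.0, noise_sd=1.0))
# Identical applicants (no signal): observed is close to the independent baseline.
print(all_rejected_rate(20_000, 4, signal_sd=0.0, noise_sd=1.0))
```

So excess "rejected everywhere" by itself cannot distinguish signal from bias; that needs something that holds the signal fixed.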
slashdave 23 hours ago [-]
I think I am confused.
An inferior candidate (by skill) is going to be consistently rejected, no?
asdff 1 days ago [-]
Some job application websites I've seen actually have a yes or no option to consent to AI review that they claim is to simply assist HR and not actually screen you. I always select no. There is no way that selecting yes would ever be in my interest. I'm sorry, I'm going to force a real human to look at my stuff if I still can.
bluefirebrand 1 days ago [-]
My fear is that pressing "no" on stuff like that is going to become an auto-rejection in the vast majority of cases
simpaticoder 1 days ago [-]
It won't be rejected. Your resume will be meticulously placed into a human review queue pending the allocation of someone to look at the contents. Meanwhile the position will be filled, and so serving no purpose the review queue will be emptied.
bluefirebrand 1 days ago [-]
Oddly enough, being rejected by process versus being rejected by a person doesn't actually make me feel any better about the coming future
:)
jcims 1 days ago [-]
It's probably not going to be an auto-rejection, it's just going to sit in a queue that looks like this
My point is that this is effectively an auto rejection
booleandilemma 1 days ago [-]
[dead]
ApolloFortyNine 1 days ago [-]
I truly don't doubt it's possible for the AI to be 'racist'.
>If the AI had recommended Black and Asian candidates at the same rate as it recommended the most-favored group (typically white applicants), 40,000 more of their applications would have advanced to the next stage of hiring.
I don't think this is the right benchmark here, or at least, it would be very interesting if the actual outcome, offer or rejected, was considered at the end.
gacgacgac 1 days ago [-]
You are misreading this sentence. This sentence is saying: "Using a constructed dataset of resumes, whose only difference was a name change, we would anticipate a system evaluating on qualifications to produce an equal distribution of candidates across names. Our observed result was highly unequal, and that warrants further investigation."
_0ffh 1 days ago [-]
To me it appears as if the study using the constructed dataset was a completely different one than the one that was concerned with AI.
For the AI study real data from "3.4 million people who submit 4 million job applications to 1,700 job postings across 150 employers and 11 industry sectors" was used.
jsemrau 1 days ago [-]
Interesting timing as Workday is facing Discrimination Claims in California doing the same thing.
> If we pool all of its recommendations together — treating the vendor as one giant hiring process — we don’t find adverse impact. If we look at each position separately, as would be typical in an evaluation of adverse impact, then we expose the adverse impact in many positions.
AI works by learning patterns. So it will become biased just by learning from factors like education history, schools attended, employment history, ZIP codes, or geographic location. Those factors alone are an easy proxy for race.
And if you add names into the equation (if the AI was trained without removing applicant names), the model can become even more biased.
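The pooled-versus-per-position point in the quote is easy to reproduce with toy numbers. The Python sketch below uses invented counts (not from the paper) where group Y falls under four-fifths of group X's selection rate in each position, yet the pooled ratio looks fine because the two groups apply to the positions in different proportions.

```python
def impact_ratio(group: tuple[int, int], reference: tuple[int, int]) -> float:
    """Selection rate of `group` divided by that of `reference`.

    Each argument is (selected, applicants).
    """
    (sel_g, n_g), (sel_r, n_r) = group, reference
    return (sel_g / n_g) / (sel_r / n_r)


# Invented (selected, applicants) counts per position and group.
positions = {
    "position_a": {"X": (90, 100), "Y": (70, 100)},   # high selection rate
    "position_b": {"X": (90, 900), "Y": (35, 500)},   # low selection rate
}


def pooled(counts: dict[str, dict[str, tuple[int, int]]], group: str) -> tuple[int, int]:
    """Sum selected and applicants for one group across all positions."""
    selected = sum(c[group][0] for c in counts.values())
    applicants = sum(c[group][1] for c in counts.values())
    return selected, applicants


per_position = {name: impact_ratio(c["Y"], c["X"]) for name, c in positions.items()}
pooled_ratio = impact_ratio(pooled(positions, "Y"), pooled(positions, "X"))

print(per_position)   # both below 0.8
print(pooled_ratio)   # above 0.8
```

In this toy case no position favors Y, so a clean pooled number does not require an offsetting bias somewhere else; the mix of applications is enough.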
stevenicr 1 days ago [-]
I expected more information from the article and 'the paper' -
I see nothing that shows any system was making a decision on race. How is the race being presented to the AI?
All this is showing from what I can see, is that certain groups of people were more often denied a next step in the process - but why?
Was the AI going by spelling and grammar? Were there names that were different but the rest of the resume was exactly the same? Were there pictures?
There were mentions that the rate of each group may be more prominent in the data when you split apart different types of jobs instead of all jobs in aggregate..
One could read that like it's inferred; that more warehouse jobs are offered to a race and less admin jobs..
but that same would happen if AI was more focused on perfect grammar for one job and it was not as much of a factor for a warehouse job.
Also, if the people applying for the various jobs were self-selecting, that would skew the acceptance percentages depending on which jobs each group did or did not apply to, right?
There are so many ways you could draw conclusions like this from data, however correlation is not causation, yet this seems to say it is.
I feel this is an important thing to watch, but Stanford may not be the place to trust with 'Policy Recommendations' as it's very unclear there is any proof that 'AI Hiring Tools Yield Racial Bias and Systemic Rejection' from this study and paper.
PS - now that I see the HN title did not have the word "can" in it, and the title of the article is actually "Tools Can Yield" - maybe that is less accusing and more noting.
rnxrx 1 days ago [-]
Genuine curiosity: Is there any speculation as to what these tools are keying on to reject those particular applicants? It seems like it just being the applicant's name is too easy an answer, but I could be overthinking it.
ericol 1 days ago [-]
2 days ago there was another interesting article on the effects of AI in hiring[1]
I don’t think AI screening is effective. But this study is just disparate impact.
xrd 1 days ago [-]
Would be very interested to see how this affects post-50 workers. That's a protected class and I would imagine an ambulance chasing lawyer would be excited for a class action lawsuit.
OrvalWintermute 1 days ago [-]
The Pymetrics game is rigged by design:
Only 40% self report gender/race
no resume data, no education information, degrees, schools, GPA, major, work experience, skills/certifications
Zero job qualifications
zerocrates 1 days ago [-]
Well, they're only looking at whether the pymetrics gameplay algorithm ML thing recommends the candidate, not any of that other stuff. The outcome they're looking at here isn't whether the people actually got hired, or got passed by other screening layers or anything.
x313 1 days ago [-]
This study only looks at one specific vendor algorithm (a job assessment given by a company called pymetrics)
all2 1 days ago [-]
LLMs are trained on the Internet, which isn't exactly known for its race-agnostic opinions.
ETH_start 1 days ago [-]
A racially disparate outcome is not evidence of racial bias.
mstewartgallus 29 minutes ago [-]
Correlation does not imply causation but it is really curious how the USA has a history of settler-colonialism, slavery, segregation, eugenics, mass incarceration and imperialism. The USA and the tech industry also literally have edgelord internet Nazis in positions of power right now.
TheMagicHorsey 1 days ago [-]
Imagine if they applied this same logic to the NBA draft.
black6 1 days ago [-]
I'm struggling to figure out what they're trying to say here in the linked (and very anemic) paper:
> 30% of Black applicants apply to at least one position that demonstrates adverse impact against Black applicants.
The whole thing reads like a tautology.
gacgacgac 1 days ago [-]
You are reading a paper without understanding the language of the paper. Adverse Impact has a specific meaning, and in this case it's specifically meaning that Black candidates were selected only four fifths as often as white candidates when their qualifications were identical. The study is only suggesting that further investigation is warranted.
black6 23 hours ago [-]
Thanks. It was unclear reading the article and linked paper. I didn't follow the citation/link trail far enough.
rayiner 23 hours ago [-]
Your initial assessment was correct and this part of the post above is incorrect: “when their qualifications were identical.” The paper doesn’t control for identical qualifications.
casey2 8 hours ago [-]
So if the AI is selecting the economically optimal person for the job does that mean previous hiring systems were biased in favor of Asian and Black hires? Why would that have been the case?
zombot 9 hours ago [-]
And then they complain about algorithmic monoculture in applications.
It's surprising to me to hear that these systems are considered racist when they're the same ones that are so color blind that they generate pictures of SS soldiers as African American women.
roysting 24 hours ago [-]
There is no isolation of variables. This is not science.
This is propaganda.
jongjong 24 hours ago [-]
I think the discrimination aspect is downstream from this fact:
> We follow 3.4 million people who submit 4 million job applications to 1,700 job postings across 150 employers and 11 industry sectors. Each job application was assessed by an AI hiring tool built by a single third-party vendor.
3.4 million people applying to just 150 employers... Who are all using just 1 platform. WTF. This is where the discrimination is happening. Why the f do 3.4 million people feel forced to apply to just 150 employers and why the f do all these 150 employers feel forced to use just one platform. WTF.
zeroonetwothree 24 hours ago [-]
That’s the platform that gave them the data. I don’t think they claim it’s all the applications of this set of people.
jongjong 23 hours ago [-]
I realise this but it's still incredible to think because that's about 22k applicants per company.
Even if that's just part of each company's total hiring pipeline, it's clear; something's wrong. I don't know how long this study has been running but 22k is a lot of people, even over a year. These companies are too big. That's the problem.
mstewartgallus 26 minutes ago [-]
Capitalism develops into monopolies pretty inevitably. Economies of scale just make monopolies more profitable. Monopolies also get in bed with the state and do dirty tricks to stamp out competition. But monopoly capitalism has been the case for over 100 years now.
petesergeant 1 days ago [-]
I’m sure (really sure) there are real problems with AI and bias, but this is a weird study that isn’t looking at resumes or anything, it’s looking at how candidates did in some weird psychometric tests.
gacgacgac 1 days ago [-]
Double check the link. The study clearly looked at resumes.
petesergeant 1 days ago [-]
I’ve rechecked it, and I still think I’m right. What am I missing? This is the paper under discussion: https://arxiv.org/pdf/2605.27371
engineer_22 1 days ago [-]
> Using our large dataset of real hiring AI recommendations, we test our hypothesis. We find that people who submit multiple applications to positions screened by the same algorithmic hiring vendor are more likely to be rejected from every position to which they apply than would be true if the companies made decisions statistically independently from one another.
I would be surprised if the results were different.
logicchains 1 days ago [-]
Could the AI actually see the race of the applicants? Or was it just discriminating on the basis of some factor it found that was correlated with race, like SAT scores?
foolserrandboy 1 days ago [-]
It rejected Asians more because of their higher SAT scores? If it’s not directly based on applicants disclosing their ethnicity then probably something more obvious like names.
runako 1 days ago [-]
> discriminating on the basis of some factor it found that was correlated with race, like SAT scores
Hypothetical SAT score: 1060
How does that help you predict the race of an individual applicant? It's been a while since I took the SAT, but I didn't realize one's score provided so much information.
I'm going to assume that people aren't allowed to put "don't send me black applicants" into their process even if they do see race in the application as that's entirely illegal.
The paper's conclusion, that we need to study this more, is showing the authors likely believe this to be a byproduct of inherent/invisible bias.
bakugo 1 days ago [-]
> To put this in perspective: If the AI had recommended Black and Asian candidates at the same rate as it recommended the most-favored group (typically white applicants)
Some people just can't help but put their biases on display at every opportunity, even when it comes to the most minute details.
gacgacgac 1 days ago [-]
Nothing in this has any bias in it? Which words are you suggesting are biased? This study measured constructed resumes where only names were changed, and observed the rate each group was favored (the percentage of resumes that passed). One group must be "most favored" because that's how math works. It's the group whose percentage was the highest. The resumes were fictional and equivalent across race, only the names were changed.
bakugo 1 days ago [-]
Look closer at the capitalization of the words in the quoted sentence.
moate 1 days ago [-]
Where do you think this sentence shows bias?
The phrase "most-favored" means, "most recommended by the AI relative to the field".
What did you think this sentence meant?
redsocksfan45 1 days ago [-]
[dead]
everyone 1 days ago [-]
It's fucking crazy that people are using these systems for important tasks like hiring. They have zero understanding about how these systems work. And LLMs are absolutely not designed to do those sorts of jobs; they're designed to be chatbots and to fool a human conversing with them into thinking they are responding intelligently. Of course they're gonna be useless at other tasks.
(I assume they're just using a big LLM for this. It doesn't say; it just says "AI", and when they say "AI" like that they usually mean LLM. A custom-trained hiring ML system would be better.)
engineer_22 1 days ago [-]
Isn't HR basically just an LLM with ears and teeth?
GrinningFool 1 days ago [-]
[dead]
marsven_422 1 days ago [-]
[dead]
anonreeeeplor 1 days ago [-]
[dead]
huflungdung 1 days ago [-]
[dead]
JuniperMesos 1 days ago [-]
[flagged]
techblueberry 1 days ago [-]
> These results are consistent with AI hiring tools being completely racially unbiased, and real-world hiring managers feeling social pressure to hire underqualified black people
And so managers are feeling social pressure to hire under qualified Asians as well? I must not be up to date on the latest culture war talking points, because I thought Asians were underrepresented.
JuniperMesos 1 days ago [-]
Yeah, if they themselves are asian. One of the most prevalent complaints about Indian hiring managers in the silicon valley tech industry is that they preferentially hire Indians and push out non-Indians to a tremendous degree, and are often helping Indian hires commit pretty blatant credential fraud.
dzonga 1 days ago [-]
Does your anecdote include the various instances when CVs were discriminated against because people's names sounded Black?
but you want to spew nonsense. every racial group includes its own under-qualified people ! there's no social pressure i.e DEI excuse you wanna give - but just economic agents acting for their own interests
jimmy76615 1 days ago [-]
[flagged]
tamimio 1 days ago [-]
You don’t need a complicated study to find out, do it yourself for science. Get a resume, make a few different versions but keep the content the same, change the layout (education on top in one, at the bottom in another, etc.), and use different names to signal different backgrounds. You can extend it to schools and gender too. Send them to the same employers, and you will see wonders!!
I tried it before, and discrimination is there: I would get one resume rejected quickly, and a few days later the same company would invite another resume for a screening call. I tried this before and after the AI hype, and the results weren’t that different btw. That was tested with US and Canadian employers only.
jazz9k 1 days ago [-]
We can't take blanket percentages as a reason for racial bias. Were they all equally qualified?
Too many of these studies only focus on percentages and the end result is unqualified candidates getting hired from minority groups at the expense of qualified ones.
gacgacgac 1 days ago [-]
Please read the study or at least the comments here before jumping to the conclusion. Yes, they used constructed resumes, so the qualifications were exactly the same. And no, literally no one is suggesting this proves racial discrimination. It's applying the four fifths rule, a fast, coarse evaluation that is used to identify whether there's something worth investigating further for conclusive evidence of racial discrimination.
The authors are saying it's worth doing more research, because in a controlled data set the results appear unbalanced.
Oras 1 days ago [-]
> Please read the study or at least the comments here before jumping to the conclusion. Yes, they used constructed resumes
Looks like you didn't read the paper. There are no resumes involved. It is about assessment games.
etchalon 22 hours ago [-]
I think you're confusing this specific study with a different study, which did use duplicative resumes, and has been repeated:
Many people seem to think racism begins and ends with using a slur. You can usually get a measure of this by seeing someone's reaction to the statement:
> There is no such thing as anti-white racism.
If you find yourself wanting to disagree with that then, I'm sorry but you simply don't know what racism is. Racism is pervasive, insidious and systemic.
A good example in the hiring space is what's called the "second syllable name problem". Traditionally African names often stress the second syllable (eg Jamal, Lakisha, Malik, Lashonda). Studies have shown that such names have higher rejection rates in job applications [1]. So if you're wondering about the four-fifths rule, it's because it exposes this kind of bias. It's not proof of bias. It simply means further investigation is required.
The problem with AI hiring tools is the logic is opaque. You have no idea why an AI system is rejecting or selecting candidates and you may find it's doing something illegal. Some companies want to hide behind this opaqueness, arguing that if no explicit decision was made then there is no bias. But that's not how systemic racism works.
There are many such signals that correlate with race that if they affect selection rate, it could be a problem. Did you go to an HBCU? Was your high school in a minority-majority area? What about your previous employers?
> If you find yourself wanting to disagree with that then, I'm sorry but you simply don't know what racism is.
You are saying that if you think anti-white racism can exist, you don't know what racism is. That's obviously ludicrous.
etchalon 22 hours ago [-]
There are essentially two definitions of racism at this point.
The colloquial version, which means "prejudice based on race" and a second version, which specific groups and people have advocated for, which means something like "structural oppression through cultural and governmental means". It's more complicated than just that, but it's a fairly narrow term for them.
So when one person says "there's no such thing as anti-white racism", you hear, "No one's prejudiced against white people for being white!" Obviously, that's ludicrous.
But that person is likely using the, I have no idea what to call it, "advocate definition" maybe, which would preclude anti-white racism from existing within that narrow definition of racism.
So it's a debate where people aren't speaking the same version of a language, each convinced the other is uninformed, reactionary or stupid.
peyton 1 days ago [-]
[flagged]
anonfunction 1 days ago [-]
This is something I've been working on exposing to AI labs through my startup LatentEvals[1], and found similar results in other industries from lending to insurance claims.
Happy to share some sample reports if anyone is interested!
Don't have much to add beyond being grateful for everyone working to call this out, with a hope some lawsuits drop and our SCOTUS doesn't decide racial bias in AI is fine because we can't prove the AI is racist in its heart.
If we move to using just a small number of AI models to help do things like hiring, we will amplify biases and possibly completely lock out portions of the population. We need to be very careful when using AI systems to evaluate people in general -- not because they might be biased (which they might be), but because even a small bias, if used by virtually everyone, can be damning.
I don't think this even requires any bias.
Assume there's some loose ordering of who is or isn't a good hire, and every employer has their own fuzzy view of it. If you get slightly better or worse as a potential hire (pick up an extra degree, let your latest certification lapse, whatever), it gets somewhat easier or harder to get hired.
Now assume that same ordering, but all employers share the same view of it. I'd expect the divide between employable and not employable to be much sharper.
Also, we will of course have all kinds of attempts to "game" the system to get ahead. Optimizing (even more) for the metric. Degree mills, for instance.
However, "this AI model can decide that some subset of people, perhaps random, perhaps not, are simply not hirable for any job" makes sense to most people regardless of political bent.
For example, the article here shows disparate impact: that different percentages of applications are passed through the AI filter. But it doesn’t show differential treatment of otherwise identical applications based on race.
Note that the OP uses "systemic rejection", while the paper does reference bias, it is in the precise meaning of the word.[0] And this is not targeted at the general public.
You may want to look into the 1990 GOPAC handout "Language: A Key Mechanism of Control" to understand why some groups would simply just weaponize any term that was substituted. Academic papers need to err on the side of being precise, to be effective, not focused on handling the general public with kid gloves IMHO.
Edited to add, listen to Lee Atwater's 1981 Interview on the Southern Strategy for even more context.
[0] https://arxiv.org/pdf/2605.27371
You might be able to avoid it with a panel of AI, similar to how we try to avoid it by using panels of humans, but even that turns out to be contentious and not surefire.
I have a feeling that with AI it'll be even worse, since folks / companies can pass the buck (similar to how health insurance companies are now using it to deny folks).
Unless you're taking the "there are multiple mathematically incompatible ways to define bias" view of the topic, just do what's already known best practice for high-bureaucracy human review. Which is to define an overly pedantic standard rubric.
A lot of the capitalists see that as a positive.
But LLMs are statistical models. They are aggregating all biases into a general super bias. And they're all converging towards the same solutions.
Definitely open to opposing or critical views
If you click through, the paper says the race is self-reported.
“Our data tracks 4,197,168 applications. It includes applicant gameplay features and for each application, the application date, the position name and employer, metadata about the position and employer, and the numerical score and final recommendation each applicant received for each completed application. 40.2% of applicants self-report race with a breakdown of 16.8% Asian, 14.2% White, 3.6% Black, 3.0% Hispanic, and all other racial categories below 2% (i.e. fewer than 100,000 applicants).”
The dataset is constructed, deliberately, to hold candidate performance constant and vary the names of candidates to appear to be associated with a specific race.
But they picked 9 family names per group. Which sounds quite low. And combined that with first names to reach 500 first+last names per group.
I wonder how much of the bias we see has to do with the names actually picked versus it being racially motivated (absolutely not denying that this probably is a factor, but might not be the only one).
For example, in France there is the national BAC end-of-high-school exam. If you look at the grade distribution by first name, and at the top “very good” bracket, some names are heavily under-represented (less than 5% of, say, “Jordan” get that grade) while some are over-represented (35% of “Josephine” get such a grade). The exam is for the most part anonymous, but some names are definitely heavily correlated with lower/higher income groups. So nothing surprising: Josephines tend to come from richer families, thus on average get better education/support, thus better grades. The same is true of family names, to a smaller extent.
So I wonder how much of the bias we see, be it from real persons or the AI, has more to do with a class thing than a racial thing. Again, those are not neatly separate things, but still.
Cool.
In any event, I'd happily support a ban on all parts of the ATS that could be involved in automated approval, rejection, or scoring being able to see candidate names. But I sense the author of this has a bigger agenda.
it's going to be in the rest of the data because race has a meaningful correlation, and plenty of causation, with being disadvantaged in real ways that can also affect the ability to then do certain jobs.
like, the environmental pollution and building interstates and freeways through black communities, on purpose to do bad things to those communities, then results in a bunch of noise and particulate pollution, that is bad for developing brains.
you wont be able to do some meritocratic non-racist hiring without fixing the environmental racism. otherwise youre just mirroring racism other people built for you
I'm not saying AI is not biased, but this study does not prove that.
[0] https://arxiv.org/pdf/2605.27371
From the paper:
> Fig. 1. The pymetrics process. > Stage 1: Applicants apply to positions. > Stage 2: Applicants are directed to the pymetrics platform to play assessment games. > Stage 3: pymetrics algorithms use applicant gameplay features to recommend 58.2% of applicants per position on average. > Stage 4: Employers decide which applicants to interview or hire, typically rejecting applicants that were not recommended by pymetrics.
High-risk – AI applications that are expected to pose significant threats to health, safety, or the fundamental rights of persons. Notably, AI systems used in health, education, recruitment, critical infrastructure management, law enforcement or justice. They are subject to quality, transparency, human oversight and safety obligations
That's a pretty common sense legislation to me.
Of all the things listed, "recruitment" doesn't belong, in my view. Is the argument that it is someone's fundamental human right to get someone else to pay them to do a job? Or is it strictly about human oversight?
There's no reason to single out AI vs any other approach to the same topics.
We find applicants are more likely to be rejected from every position they apply to than would be predicted by the baseline of each position making statistically independent decisions.
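As a simplified reading of that baseline (the paper's actual estimator may well differ, and the function name and rates below are hypothetical): if every position decided independently, the chance of being rejected everywhere would just be the product of the per-position rejection rates.

```python
from math import prod

def baseline_systemic_rejection(rejection_rates):
    """Probability of being rejected by every position, assuming independent decisions."""
    return prod(rejection_rates)

# Hypothetical applicant applying to three positions that each reject 40% of applicants.
baseline = baseline_systemic_rejection([0.4, 0.4, 0.4])
print(f"independent baseline: {baseline:.3f}")

# If all three positions simply reused one screener's verdict, the applicant would be
# rejected everywhere with probability 0.4, well above the independent baseline.
```

The finding quoted above is that observed all-rejection rates sit above this kind of independent baseline.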
Obviously a rejected resume is more likely to be rejected by every other employer and an accepted resume is more likely to be accepted by every other employer. Like online dating, most employers are looking for some baseline indicators that you are going to be successful and stable.
But that wasn't the case for non-algorithmic screening. From the paper:
"By contrast, we find that when first round screening is not mediated by a single screening procedure, systemic rejections are close to the baseline. To support the empirical validity of our baseline, we study homogeneous outcomes in the largest study of first-round screening at U.S. employers to date. Kline et al. [38] generated 83000 synthetic resumes and submitted these resumes to vacant positions at 108 US companies between October 2019 and April 2021, a similar time period to our data. The companies, which are a subset of the Fortune 500, collectively employ 15 million workers. We analyze the homogeneity observed in the resulting callback outcomes in their data. We find that the baseline is an effective estimator of the systemic rejection rate for this dataset. As shown in Figure 3, the observed systemic rejection rate is accurately predicted by the baseline and a chi-squared goodness-of-fit test cannot reject equality of the two distributions (χ² = 20.05, p = 0.69). In other words, while the largest previous study observes systemic rejection rates consistent with employers making statistically independent decisions, the algorithmic hiring data shows significantly correlated outcomes that lead to higher-than-baseline systemic rejection rates."
Obviously as a hiring manager you're looking for a hard working individual with a number of successfully completed projects and glowing referrals from multiple places of employment, but you're also looking for a person with expertise in particular technologies/industries/whatever other areas of expertise. To a large extent requirements for each role are unique, however many do have some overlap.
So being rejected from one position might simply mean there's a misalignment between what the company is looking for and what the individual has. Which might not be the case with other companies.
So if we're seeing increasing number of candidates being consistently rejected at multiple places the question "why" is a valid one.
Actually the fact that they found this result didn’t hold in a different dataset is especially weird.
This makes sense to me, albeit intuitively and in a way I can't articulate.
> an accepted resume is more likely to be accepted by every other employer
but this doesn't necessarily follow from the prior for me. Plenty of people get really good jobs and are really successful in them only after dozens or hundreds of rejections with a nearly-identical resume.
I just think certain resumes will get an interview almost every time in some industries and certain resumes will almost never get an interview, but the majority of resumes, like you say, have different aspects that appeal to one employer over another.
That seems like a nonsensical way to measure racial discrimination. What could justify it?
It indicates there may be adverse impact to one group. It specifically is not used to resolve racial discrimination.
It's purely a signal for "we should consider asking more questions, because this appears unusual". That's what your quote says too, it "flags" a low recommendation -- it's indicating further study and investigation is likely warranted.
"Adverse impact occurs when there is (i) practically and (ii) statistically significant disparities in the selection rate for the group of interest when compared against the selection rate of the most selected group. Practical significance requires the impact ratio ... to be less than 0.8, which is why the EEOC guidance is colloquially referred to as the 'four-fifths' rule."
The headline numbers reflect the positions for which the 4/5 rule was triggered, not the result of some further investigation: “We discovered that 26% of Black applicants and 15% of Asian applicants applied to positions where the AI system discriminated against their racial group.” Based on the methodology, I think that means that 26% of black applicants applied to positions that were flagged under the 4/5ths rule.
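For anyone who wants the arithmetic behind the flag, here is a minimal sketch of the practical-significance half of the check. The function names and the example counts are made up for illustration and do not come from the paper.

```python
def impact_ratio(selected_group, total_group, selected_ref, total_ref):
    """Selection rate of the group of interest divided by that of the most selected group."""
    rate_group = selected_group / total_group
    rate_ref = selected_ref / total_ref
    return rate_group / rate_ref

def flags_four_fifths(ratio, threshold=0.8):
    # A ratio below 0.8 only flags the position for a closer look;
    # on its own it is not proof of discrimination.
    return ratio < threshold

# Hypothetical counts: 30 of 100 recommended in one group, 50 of 100 in the most selected group.
ratio = impact_ratio(30, 100, 50, 100)
print(f"impact ratio = {ratio:.2f}, flagged = {flags_four_fifths(ratio)}")
```

Note that the quoted definition also requires statistical significance, which this sketch does not test.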
it sounds like how you'd get that kind of metric at least
This doctrine is the basis for much of employment law. It is a significant reason why employers don't administer IQ tests (or equivalents) to screen candidates since ~the 90s.
A common objection to the doctrine is that it leads to unfalsifiable discrimination claims, which is why it seems nonsensical to you.
If the issue happens upstream of the defendant to a claim - generally an organization being sued by an individual with fewer resources - it incentivizes such entities to push for changes upstream, so that they don't get stuck with the bill.
We have a "disparate impact" and nobody can prove what proportion of it is due to things like parental income or childhood education as opposed to racism on the part of the employer. Because the former considerations are real contributors, the metric can regularly be expected to exceed the threshold even if the contribution of racism by the employer was zero. Doesn't that imply that we're essentially accusing people of racism at random?
> because the impact exists whether intent can be shown or not, the desire remains to ameliorate that impact.
The median household income for Asian Americans of Indian ethnicity is more than double those of Burmese ethnicity:
https://en.wikipedia.org/wiki/List_of_ethnic_groups_in_the_U...
This is objectively a disparate impact and likely shows up in several other metrics in addition to income. Disparate results can almost universally be obtained by arbitrarily segmenting the population into different groups and comparing the midpoints. Americans of Australian ancestry have a higher median income than those of Irish ancestry, Bolivians higher than Cubans. The result is often because the lower down group has a history of being oppressed.
What reasoned means can we use to determine which groups get the benefit of these methods to ameliorate the disparity and which don't? What should be done about the inherent impossibility of doing them simultaneously, e.g. because hiring a South African woman over a Haitian man would reduce the disparity on one axis while increasing it on another? Notice that considering each group separately could result in unconditional liability because either available alternative puts you over the threshold for one group or the other.
> If the issue happens upstream of the defendant to a claim - generally an organization being sued by an individual with fewer resources - it incentivizes such entities to push for changes upstream, so that they don't get stuck with the bill.
Do we want to apply this logic to other things? The median incomes in California and New York are significantly higher than in Alabama or West Virginia, and those states have higher-ranked public schools. We can correspondingly expect that when applicants from different states apply for the same job, the ones from California and New York (even if they're the same race etc.) are more likely to be selected because they had more advantages growing up, even though none of them chose where they were born.
By the same reasoning we should then have the federal government penalize employers for hiring the applicants from the more affluent states so that it "incentivizes such entities to push for changes upstream, so that they don't get stuck with the bill." Does it make sense to do that?
That's the beauty of aggregate statistics. You have some elite job with 100 slots. Before you had 60 of the slots going to cronies and the remainder being allocated on merit. The cronies were disproportionately of the same ethnicity so your statistics are skewed, but don't worry, all the cronies still get their slots. Because statistics can be balanced by getting even more cronies, this time of a different ethnicity, and giving them as many of what used to be the merit slots as you need to manipulate the average. Using statistics is perfect for pretending that you're giving people something when you're actually taking something away.
If you're making the claim you need to provide the evidence.
Most people would say that a persistent disparity means it's possible there is discrimination, but it's not definitive proof.
There is a large body of literature concerning the question "does disparate-impact enforcement cause employers to alter hiring behavior in ways unrelated to actual productivity or discrimination?" and the answer is largely "yes". As you suggested elsewhere in this discussion, Google may be useful.
To act like it's bad that people of colour have a more fair chance of getting employed because of some piece of legislation is simply insidious. It's just been over a month since black people lost the right to a fair vote.
Literally the opposite happened. The Supreme Court ruled that there was VRA §2 liability when there was evidence of racially-motivated gerrymandering: "In short, §2 imposes liability only when the evidence supports a strong inference that the State intentionally drew its districts to afford minority voters less opportunity because of their race." (Louisiana v. Callais, p. 26)
Are you suggesting that companies should violate the law here? What do you recommend?
Edit: charitably, "adhering to the letter of the law" is sometimes shortened to "law-abiding" and is generally what we want.
Prior to the beginning of your excerpt is the word "You", meaning the comment's author is the subject, not "companies". I'm saying the commenter is appealing to black letter law for the answer to the question "what happens when..." but we have observational evidence to answer the question.
Isn't the point that the observational evidence amounts to the companies in question steering clear of illegal behavior?
There are anti-money laundering laws, so banks institute procedures to help them comply. Yes, we expect companies to change their processes so they don't break the law. That's the point of the law.
I am confused with what you think companies should do in this situation. Expose themselves to legal and civil liability? Or change their behaviors so that close scrutiny indicates they are trying to comply with the laws and any bad actors acted against internal procedure?
You're arguing about something else, using the form "given we know that racism is happening, and furthermore we know where and how it is happening, why shouldn't we just do these Totally Cost Free and Obviously Good Things that are just like fighting money laundering"
Well, I just don't think any of these things are as evident as you seem to assume they are. Also fwiw I grew up in the US, where I was called all sorts of slurs -- like, the real ones you probably censor in your internal monologue when you see them written -- throughout K-12 education. I still don't believe in the existence of pervasive and oppressive racism the way you seem to assume it.
The assumption that applicants from all races are on average equally qualified for every position. Whole subfields of modern academia are based on that assumption.
Individuals are qualified or unqualified. If a company happens to end up with less than 1/4 Ravenclaws or not very many Virgos, it doesn't mean hate is a reason. It could be that the Ravenclaws that applied were a bit less qualified than those from the other houses.
I guess my point is, doing the statistical analysis for race and gender and drawing conclusions, while being completely blind to the one single factor any sane hiring manager should be focusing on -- actual qualifications for the role -- doesn't make any sense.
Don't claim AI is discriminating against non–selects, though.
I doubt companies are using Gr*k to make their hiring decisions.
I am getting downvoted because it's hard to admit that AI only reinforces the culture that it is trained on. It is the perfect technology to keep systemic racism in place, all while being the perfect scapegoat for lack of personal or corporate accountability.
Because surely no one would have legitimate preferences based on their gender, cultural norms, etc. or real differences in aptitude due to childhood exposure, education, or said norms and preferences.
Here's some analysis of what it is and why it's useful as a canary in the coal mine: https://www.prevuehr.com/resources/insights/adverse-impact-a...
> Since the 80% test does not involve probability distributions to determine whether the disparity is a “beyond chance” occurrence, it is usually not regarded as a definitive test for adverse impact. Instead, other statistically significance tests, such as the standard deviation analysis, may be used for this purpose.
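To illustrate why the 80% test alone isn't treated as definitive, here is a rough sketch that reads "standard deviation analysis" as a two-proportion z-statistic. That reading, the function name and the counts are my assumptions. The same impact ratio of 0.6 can be many standard deviations from "no difference" or well within chance, depending on sample size.

```python
import math

def two_proportion_z(selected_a, total_a, selected_b, total_b):
    """Difference in selection rates, in pooled standard-error units."""
    rate_a = selected_a / total_a
    rate_b = selected_b / total_b
    pooled = (selected_a + selected_b) / (total_a + total_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    return (rate_a - rate_b) / std_err

# Same 30% vs 50% selection rates (impact ratio 0.6), two hypothetical sample sizes.
z_large = two_proportion_z(30, 100, 50, 100)
z_small = two_proportion_z(3, 10, 5, 10)
print(f"n=100 per group: z = {z_large:.2f}")
print(f"n=10 per group:  z = {z_small:.2f}")
```

With 100 applicants per group the gap is close to three standard deviations; with 10 per group it is under one, even though both cases fail the four-fifths threshold.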
But then my question recurs: isn’t this a ridiculous way to measure discrimination? It’s assuming that the only thing that differs between the different ethnic applicant pools is their ethnicity, which is essentially never going to be true.
Like. If I am evaluating a developer on lines of code written, I am a bad manager. But if an engineer has 40% fewer lines of code than the team median, it's absolutely ok for me to go, "Interesting. What's the story there? Are they slower or is there some other factor?"
Same idea -- this is purely a fast, first pass metric that can quickly assess if something warrants a deeper evaluation.
I expect Median LoC might be very high with the average developer using AI these days... but the dev who is making atomic changes that are fixing the AI output is probably tiny LoC but way more important
If you are trying to say "more data needed, headline misleading" you should say that instead of misrepresenting the 4/5ths rule. Also the word "can" implies uncertainty of conclusion. This isn't ridiculous, the authors point out that this is the first large scale study of this topic. Nothing has been "proven" here, it's showing that this warrants further investigation and attention.
Do you read many academic papers, because you seem to be having a rough go here.
The authors of this study have done that here. To borrow from your example, if you saw a statistically significant number of my posts highlighting the merits of the Iranian government in ways that run counter to the general global consensus on its actions, you would then have something that other people might agree was worth looking into. A hypothesis is not “misinformation”.
This article has not claimed to have proven anything other than outcomes in a process. I don’t understand why this is so upsetting to you.
They find "disparate impact" of pymetrics across racial groups, but it doesn't seem like they controlled for anything.
"Cards held by African-American sellers sold for approximately 20% ($0.90) less than cards held by Caucasian sellers, and the race effect was more pronounced in sales of minority player cards."
> Our research also found that this pattern does not appear to be the case in other circumstances. We analyzed data from the largest prior study of hiring decisions, which sent 83,000 applications to 108 Fortune 500 firms during the same time period as our study and did not focus on whether AI was used to make decisions. We found that the rate at which applicants were rejected from every firm they applied to in this data was no higher than what you’d expect if each company decided independently of the others.
It sounds like this study was using real-world applicants, and the other study they're comparing against was using synthetic applicants.
Consider the chance of being accepted as being composed of signal+bias+noise. Noise is random. Signal is a per-applicant value, and what's meant to be measured. Bias is a per-group value, and an artifact of the measuring process.
If acceptance/rejection is independent between positions applied for (as in the synthetic applicant study), that suggests that it's random or composed entirely of noise; ie there is no signal; ie the applicants are all equally qualified.
If acceptance/rejection is correlated, that means there is some nonzero amount of (signal+bias). But real-world applicants are not all identical, so there should be some amount of signal. So you can't just assume zero signal in order to infer that there must be bias.
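A small worked example of that argument, with made-up numbers and no bias term at all: two employers decide independently given an applicant's acceptance probability, but applicants differ in that probability (the signal). The tier shares and probabilities are hypothetical.

```python
def rejection_rates(tiers):
    """tiers: list of (share_of_applicants, acceptance_probability).

    Returns (marginal rejection rate, rate of rejection by both of two employers),
    assuming the two employers decide independently given the applicant's tier.
    """
    marginal = sum(share * (1 - p) for share, p in tiers)
    both = sum(share * (1 - p) ** 2 for share, p in tiers)
    return marginal, both

# No signal: every applicant has the same 50% chance.
flat_marginal, flat_both = rejection_rates([(1.0, 0.5)])
# Signal only: half the applicants are strong (80%), half are weak (20%).
mixed_marginal, mixed_both = rejection_rates([(0.5, 0.8), (0.5, 0.2)])

print(f"no signal: both-reject={flat_both:.2f}, baseline={flat_marginal ** 2:.2f}")
print(f"signal:    both-reject={mixed_both:.2f}, baseline={mixed_marginal ** 2:.2f}")
```

Both populations have a 50% rejection rate per employer, but only the one with real differences between applicants exceeds the independent baseline, so above-baseline systemic rejection by itself does not separate signal from bias.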
An inferior candidate (by skill) is going to be consistently rejected, no?
:)
My point is that this is effectively an auto rejection
>If the AI had recommended Black and Asian candidates at the same rate as it recommended the most-favored group (typically white applicants), 40,000 more of their applications would have advanced to the next stage of hiring.
I don't think this is the right benchmark here, or at least, it would be very interesting if the actual outcome, offer or rejected, was considered at the end.
For the AI study real data from "3.4 million people who submit 4 million job applications to 1,700 job postings across 150 employers and 11 industry sectors" was used.
https://www.yahoo.com/news/us/articles/california-judge-upho...
Sounds a bit like Simpson's Paradox [0]
[0] https://en.wikipedia.org/wiki/Simpson%27s_paradox
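For anyone unfamiliar, a toy illustration of how that can happen with hiring data. The counts are entirely hypothetical: group B has the higher acceptance rate within each job type, yet the lower rate overall, because the groups apply to different mixes of jobs.

```python
# Hypothetical (accepted, applied) counts per job type and group.
counts = {
    "warehouse": {"A": (80, 100), "B": (9, 10)},
    "admin": {"A": (2, 10), "B": (30, 100)},
}

def rate(accepted, applied):
    return accepted / applied

def overall_rate(group):
    # Pool accepted and applied counts across all job types for one group.
    accepted = sum(counts[job][group][0] for job in counts)
    applied = sum(counts[job][group][1] for job in counts)
    return accepted / applied

for job, by_group in counts.items():
    print(job, {group: round(rate(*c), 2) for group, c in by_group.items()})
print("overall", {group: round(overall_rate(group), 2) for group in ("A", "B")})
```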
AI works by learning patterns. So it will become biased just by learning from factors like education history, schools attended, employment history, ZIP codes, or geographic location. Those factors alone are an easy proxy for race.
And if you add names into the equation (if the AI was trained without removing applicant names), the model can become even more biased.
I see nothing that shows any system was making a decision on race. How is the race being presented to the AI?
All this is showing from what I can see, is that certain groups of people were more often denied a next step in the process - but why?
Was the AI going by spelling and grammar? Were there names that were different but the rest of the resume was exactly the same? Were there pictures?
There were mentions that the rate for each group may be more prominent in the data when you split apart different types of jobs instead of looking at all jobs in aggregate. One could read that as implying that more warehouse jobs and fewer admin jobs are offered to one race, but the same would happen if the AI focused more on perfect grammar for one job and grammar was less of a factor for a warehouse job.
Also, if the people applying for the various jobs were self-selecting, this would skew acceptance percentages based on which jobs they did or did not apply to, right?
There are so many ways you could draw conclusions like this from data, however correlation is not causation, yet this seems to say it is.
I feel this is an important thing to watch, but Stanford may not be the place to trust with 'Policy Recommendations' as it's very unclear there is any proof that 'AI Hiring Tools Yield Racial Bias and Systemic Rejection' from this study and paper.
PS - now that I see the HN title did not have the word "can" in it, and the title of the article is actually "Tools Can Yield" - maybe that is less accusing and more noting.
I guess this one just compounds.
[1] https://news.ycombinator.com/item?id=48620142
Only 40% self report gender/race
no resume data, no education information, degrees, schools, GPA, major, work experience, skills/certifications
Zero job qualifications
> 30% of Black applicants apply to at least one position that demonstrates adverse impact against Black applicants.
The whole thing reads like a tautology.
https://news.ycombinator.com/item?id=48620142
> We follow 3.4 million people who submit 4 million job applications to 1,700 job postings across 150 employers and 11 industry sectors. Each job application was assessed by an AI hiring tool built by a single third-party vendor.
3.4 million people applying to just 150 employers... Who are all using just 1 platform. WTF. This is where the discrimination is happening. Why the f do 3.4 million people feel forced to apply to just 150 employers and why the f do all these 150 employers feel forced to use just one platform. WTF.
Even if that's just part of each company's total hiring pipeline, it's clear; something's wrong. I don't know how long this study has been running but 22k is a lot of people, even over a year. These companies are too big. That's the problem.
I would be surprised if the results were different.
Hypothetical SAT score: 1060
How does that help you predict the race of an individual applicant? It's been a while since I took the SAT, but I didn't realize one's score provided so much information.
The paper's conclusion, that we need to study this more, is showing the authors likely believe this to be a byproduct of inherent/invisible bias.
Some people just can't help but put their biases on display at every opportunity, even when it comes to the most minute details.
The phrase "most-favored" means, "most recommended by the AI relative to the field".
What did you think this sentence meant?
(I assume they're just using a big LLM for this; it doesn't say, it just says "AI", and when they say "AI" like that they usually mean an LLM. A custom-trained hiring ML system would be better.)
And so managers are feeling social pressure to hire under qualified Asians as well? I must not be up to date on the latest culture war talking points, because I thought Asians were underrepresented.
but you want to spew nonsense. every racial group includes its own under-qualified people ! there's no social pressure i.e DEI excuse you wanna give - but just economic agents acting for their own interests
I tried it before, and discrimination is there, I would get one resume rejected quickly and few days later the same company would invite another resume for a screening call. I tried this before and after AI hype, results weren’t that different btw, and that was tested in US and Canada employers only.
Too many of these studies only focus on percentages and the end result is unqualified candidates getting hired from minority groups at the expense of qualified ones.
The authors are saying it's worth doing more research, because in a controlled data set the results appear unbalanced.
Looks like you didn't read the paper. There are no resumes involved. It is about assessment games.
https://www.aeaweb.org/articles?id=10.1257/0002828042002561
> There is no such thing as anti-white racism.
If you find yourself wanting to disagree with that then, I'm sorry but you simply don't know what racism is. Racism is pervasive, insidious and systemic.
A good example in the hiring space is what's called the "second syllable name problem". Traditionally African names often stress the second syllable (eg Jamal, Lakisha, Malik, Lashonda). Studies have shown that such names have higher rejection rates in job applications [1]. So if you're wondering about the four-fifths rule, it's because it exposes this kind of bias. It's not proof of bias. It simply means further investigation is required.
The problem with AI hiring tools is the logic is opaque. You have no idea why an AI system is rejecting or selecting candidates and you may find it's doing something illegal. Some companies want to hide behind this opaqueness, arguing that if no explicit decision was made then there is no bias. But that's not how systemic racism works.
There are many such signals that correlate with race that if they affect selection rate, it could be a problem. Did you go to an HBCU? Was your high school in a minority-majority area? What about your previous employers?
This kind of bias doesn't have to be intentional.
[1]: https://www.npr.org/2024/04/11/1243713272/resume-bias-study-...
> If you find yourself wanting to disagree with that then, I'm sorry but you simply don't know what racism is.
You are saying that if you think anti-white racism can exist, you don't know what racism is. That's obviously ludicrous.
The colloquial version, which means "prejudice based on race", and a second version, which specific groups and people have advocated for, which means something like "structural oppression through cultural and governmental means". It's more complicated than just that, but it's a fairly narrow term for them.
So when one person says "there's no such thing as anti-white racism", you hear, "No one's prejudiced against white people for being white!" Obviously, that's ludicrous.
But that person is likely using the, I have no idea what to call it, "advocate definition" maybe, which would preclude anti-white racism from existing within that narrow definition of racism.
So it's a debate where people aren't speaking the same version of a language, each convinced the other is uninformed, reactionary or stupid.
Happy to share some sample reports if anyone is interested!
1. https://www.latentevals.com/