Really significant finding. Purely data-driven models will never solve these root challenges.
Correlation with race is not necessarily racism. This has been an ongoing problem for insurance actuaries. For example, there is a correlation between using retread tires and race, just as there are correlations between black speech and black people, and between black people and poverty and crime. Patterns are not the same as racism. Deep learning and AI training guidelines and training data can of course correct for this, but then accuracy declines as well.
Agree with your general statement, but the problem is that LLMs have no ability to distinguish between *is* and *ought*. By definition, LLMs training on past data are making their predictions based on what *is* (or, more accurately, *was*). If the training data reflect that historically, black people have had lower-paying jobs, are more likely to be sentenced to death, or use the health system less, then—regardless of whether systematic racism is or isn't the root cause—an LLM will continue to reflect its training data.
As Gary points out, the big problem is when LLMs' biases are covert, and we naively use LLMs in decision-making processes where, today, our *ought* suggests substantially different outcomes from what the training data represent.
This is exactly why AI is misnamed. It is not intelligent in the least, but it is stubbornly accurate with the data it uses. Poetically, it knows everything and understands nothing. Lived Ethics will always be outside of this sort of computing machine. It can apply normative ethical principles flawlessly but will not understand the recommendations. In this way it can apply laws more consistently than humans, and perhaps become the best possible judge insofar as it will be able to apply the entire history of case and statute law to a particular fact pattern, but it will nevertheless never understand the future implications of that application of the law. People change in relation to the events around us; we live, in other words. Our incentives change when our world changes.
This is why AI mediation assistance (NOT the power to arbitrate) should focus on data fields outside of the verbal/human behavior realm--like combining and correlating measurements of the natural world, for the purpose of developing pattern recognition and analysis. Only as a first draft, of course.
> Agree with your general statement, but the problem is that LLMs have no ability to distinguish between *is* and *ought*.
Another problem: neither can humans.
In fact, they typically are unable to get either part correct, or realize that they cannot (after all, who could tell them, other humans raised in the same dream-based culture? lol)!
Well, but often correlations are due to systemic racism. For example, the correlation between black speech and arrests of black people (crime itself cannot actually be measured, only arrests) can be due to long patterns of overpolicing of black neighborhoods. The issue starts with the training data, and data-representativeness issues can be mitigated to an extent (which could include collecting new data). However, there are some contexts in which AI systems simply should not be used.
Poor people do cause more crime. Responsibly policing black neighborhoods is a good thing, as good policing protects the very many law-abiding but vulnerable people who live there.
I do want to challenge that assumption: yes, poverty is more strongly associated with arrests and convictions, but does that mean that poverty is more strongly associated with crime than wealth is? What about crimes that go unreported and do not result in arrests or convictions, but are correlated with wealth? Things like tax fraud and money laundering? Even so, the crux of the issue, as a commenter notes below, is whether LLMs should represent the world as it was, or how it ought to be. And I don't think it ought to be the case that AAVE prompts elicit harsher judgement from LLMs.
We surely do measure murder numbers well. And those are not looking good for poor people.
This is a separate issue from what to do about LLM biases.
When counting the murders facilitated by police, the military, and weapons manufacturers, the murder numbers don’t look so good for the wealthier half of the distribution, but of course that doesn’t count, because those murders are legalized.
If you’re capable of thinking beyond the myopia of individuality-obsessed American culture, you can see that crime is caused by systemic deficiencies in resources linked to overexploitation and overpolicing. The causes of crime are ultimately society-level system failures. Individuals are not rational actors freely moving and making decisions in their environments. They are deeply embedded in communities and thus in systems that have statistically deterministic outcomes.
There are many causes of crime. That is not the point. LLMs have the same problem as deep learning generally: training data contains unexpected patterns. If all soup makers wipe their heads at some point in the soup-making process, LLMs will treat head wiping as essential to soup making when it is only incidental. If race is typically associated with poverty and crime, as it is in real life, then LLMs, like any other AI trained on real data, will predict that race is an essential ingredient in poverty and crime, when it is likely only incidental. Nevertheless the predictions will be accurate, just as head wiping is an accurate though merely incidental prediction about soup making. AI has no "I". LLMs are not intelligent unless we change what we mean by intelligence, and I think that would be an awful mistake.
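To make the head-wiping example concrete, here is a minimal toy sketch (the features and numbers are invented for illustration): a purely statistical learner, given data in which an incidental step always co-occurs with soup making, has no way to rank it below the essential step.

```python
# Toy sketch: when an incidental feature co-occurs perfectly with the outcome,
# the statistics alone cannot distinguish it from an essential one.
import random

random.seed(0)

def make_record(is_soup_maker: bool) -> dict:
    return {
        "heats_broth": is_soup_maker,          # essential to soup making
        "wipes_head":  is_soup_maker,          # incidental, but always co-occurs
        "owns_a_cat":  random.random() < 0.5,  # unrelated noise
        "makes_soup":  is_soup_maker,
    }

data = [make_record(i % 2 == 0) for i in range(10_000)]

def p_soup_given(feature: str) -> float:
    rows = [r for r in data if r[feature]]
    return sum(r["makes_soup"] for r in rows) / len(rows)

for feature in ("heats_broth", "wipes_head", "owns_a_cat"):
    print(f"P(makes_soup | {feature}) = {p_soup_given(feature):.2f}")
# heats_broth and wipes_head both come out at 1.00; only knowledge from
# outside the data says which one is essential and which is incidental.
```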
It IS the point, as you yourself indicate. The causes of crime are beyond the ability of so-called AIs to understand, and thus they make *inaccurate* assumptions about the nature of the world. The predictions will be no more "accurate" than the data is, and the data isn't. The LLM might associate head wiping with soup making because the data it has access to shows a correlation, but that is a correlation in the data, which we should not confuse with a correlation in what's "real."

This example is an excellent one, since by the numbers white people represent by far the biggest group arrested for crimes (in absolute, not relative, numbers). By the numbers, the LLM should associate crime with white people. However, the LLM has been trained on data that disproportionately focuses on black crime, that associates crime with black people, and that contains a lot of straightforwardly racist speech. The LLM is not making an inference from crime statistics. It is making inferences based on ALL the data it has, and a primary problem with this is that the LLM has no way to understand that its ideas about crime come from the inaccurate and bigoted ideas people have about crime, not the realities of crime, who commits it, and why.

In any case, I was responding to someone who erroneously claims that "poor people cause more crime," which is a misrepresentation, both in what it intends to say and in what is implied by what is said.
There was a cliche in philosophy that people tended to be either Platonists or Aristotelians. In the world of AI, people tend to be either Cartesians or Humeans. Cartesians take the view that if I think, then I know intuitively that I am; I know I have a self because I am thinking. If an AI thinks, it must have a sense of a self, say the Cartesians. Humeans, on the other hand, reject that and say that if I think, I must then assemble all the impressions I have had and create who I am. There is no intuition of a self. The self is assembled from a bundle of impressions; the self is a sort of story. But the Cartesians remain committed to the intuitive certainty that my experience of that self is entailed by thinking alone: I know I am because I am thinking. No, say the Humeans, you only have an impression of thinking but absolutely no impression of a self; that is mere fiction. I find myself stuck between the two: my "self" is an assemblage, but I remain certain that I am because I am thinking.
Does AI think? I say no. Not a bit. But it certainly seems to think, because it can assemble a quasi-self from all of the data it coordinates into pictures and sentences. The unicorn horn through the head of the six-fingered man has awakened me from my dogmatic slumber! Kant, in his inimitably turgid style, called the self the transcendental unity of apperception. I am certain of the self within me, but it is not real in the extra-human world. Does the LLM have an experience of a transcendental unity of apperception? I vote no. All apperception, but no felt unity. There is no internal identity at all. LLMs remain zombies: all dark inside. (But we really want them to be our friends.)
Racism can be perpetuated by a system or entity that has no racist ideas or that is incapable of embracing a white supremacist personal identity. Ostensible neutrality paired with de facto racism is a thing.
And it is a thing that can only be fixed with intention (such as anti-discrimination laws), or AI systems with comprehension.
There was a book out there, *Weapons of Math Destruction*, which documented all this for the forerunners of LLMs. It's just data + statistics; it comes back in all kinds of disguises, but the paradigm is the same.
Humans are perfectly capable of recognising that something that is statistically overrepresented from *our* perspective doesn't necessarily describe the statistical reality of the world at large. Of course, we fail at this often, but as this paper shows, LLMs seem to fail at this more often than not, because their entire conception of the "world at large" is derived from the statistical picture presented to them by the data they're trained on.
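A toy numerical sketch of that gap between the data's statistics and the world's (groups, rates, and documentation weights are all invented for illustration, not taken from the paper): two groups with identical true rates of some trait can look very different in a corpus whose coverage is uneven.

```python
# Toy sketch: identical true rates, but a corpus that over-documents
# trait-positive records about group B makes the groups look different.
import random

random.seed(1)

TRUE_RATE = 0.10  # same underlying rate for both groups

population = [(group, random.random() < TRUE_RATE)
              for group in ("A", "B") for _ in range(50_000)]

def documented(group: str, trait: bool) -> bool:
    # Trait-positive records about group B are twice as likely to be written up.
    weight = 2.0 if (group == "B" and trait) else 1.0
    return random.random() < 0.4 * weight

corpus = [(g, t) for g, t in population if documented(g, t)]

def rate(records, group):
    vals = [t for g, t in records if g == group]
    return sum(vals) / len(vals)

for group in ("A", "B"):
    print(f"{group}: true rate {rate(population, group):.3f}, "
          f"corpus rate {rate(corpus, group):.3f}")
# Group B's corpus rate lands near 0.18 despite a true rate of 0.10; a model
# that only ever sees the corpus has no way to notice the discrepancy.
```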
This paper gives excellent proof that deferring decision-making to LLMs is an abrogation of responsibility. It should not be done.
And yes, I agree, makers of these LLMs should withdraw their products, or clearly label that, far from being "almost AGI", these products possess nothing of the kind of intelligence their marketing would have you believe.
> This paper gives excellent proof that deferring decision-making to LLMs is an abrogation of responsibility. It should not be done.
*Under a particular methodology*. Similar to most humans, AI also needs substantial hand holding to get to even remotely rational conclusions, so if we deny it that and then form strong conclusions about "its" "incapability", it demonstrates how dumb we are, as if we needed any more evidence than what we are swimming in.
I can't wait to watch Andrew Ng try to talk himself in circles on this one. I mean I recall a 2023 paper that found that ChatGPT reproduces societal biases. I agree with you that this cannot stand and that we as the general public deserve better.
I'm not shocked. It's a language model after all. Chalk up one more limitation to people not understanding how bias in AI and ML actually has three distinct layers.
1. Cultural Bias
2. Data Bias
3. Algorithmic Bias... yes, AI is by default a bias we place on the other two.
https://www.polymathicbeing.com/p/eliminating-bias-in-aiml
I am shocked and appalled but not surprised.
I can't even say I'm shocked or appalled. I guess the bigger question is: why did we expect otherwise, given how LLMs work? It's a mirror back at us in a lot of ways.
Right on Michael - totally misplaced moral panic.
Humans have these very same problems, and many more... the cultural bias in particular is on full display in this thread, but the bizarreness cannot be seen, because it is culturally normalized (and thus filtered out by the subconscious mind).
Exactly right. There are over 200 named cognitive biases that help us make sense of the world around us.
Do you mean "make nonsense of the world"? :)
Here's a great visual.
https://upload.wikimedia.org/wikipedia/commons/6/65/Cognitive_bias_codex_en.svg
I can't remember if I have this in my records already, but thanks, this is an absolutely brilliant resource, 10/10!!
You might appreciate this essay then.
https://www.polymathicbeing.com/p/the-conoftext
I appreciate you pointing out in your podcast that it's the slant in the data, due to a variety of factors including lossy truncation (good one), that's causing the perception of bias.
Also, that ML as a field is heavily skewed towards males is troubling and needs to be talked about much, much more.
So many layers to tease out huh?
> that's causing the perception of bias
That is but one small slice of the cause.
Great podcast!!!!
Thanks!
I'm very happy that there are smart people doing this important research; however, I can't help but think that results like this should be completely unsurprising. The models are simply reflecting the associations found in the training data. I assume similar unwanted correlations are found in many other areas. Try the same approach for questions that use language more associated with the questioner's sex, religion, or national origin, just to name a few. It seems like band-aid solutions could be found for each of these issues (maybe preprocess certain questions to preserve meaning but remove dialect that implies race, sex, etc.), but how far down does this problem go? It's just so very easy to believe that any imposed solution that tries to filter out this "bias" is fighting a losing battle against the very data the models are built on.
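For concreteness, the kind of band-aid preprocessing mentioned above might look like the sketch below. Everything here is hypothetical: the rewrite table is deliberately tiny, and `query_model` is a stand-in for whatever LLM is being wrapped. The point is how shallow such a filter is relative to the training data underneath.

```python
# Hypothetical band-aid: strip surface dialect cues from a prompt before the
# model sees it. The learned associations inside the model are untouched.

NAIVE_REWRITES = {"finna": "about to", "ain't": "is not"}  # toy table only

def paraphrase_to_standard(text: str) -> str:
    for src, dst in NAIVE_REWRITES.items():
        text = text.replace(src, dst)
    return text

def answer_without_dialect_cues(user_prompt: str, query_model) -> str:
    neutral = paraphrase_to_standard(user_prompt)
    # Cues can leak back in via topic, names, places, or phrasing that no
    # lookup table (or even a learned paraphraser) will fully cover.
    return query_model(neutral)

# Usage with a dummy model, just to show the plumbing:
print(answer_without_dialect_cues("he finna tell you what happened",
                                  query_model=lambda p: f"[model sees] {p}"))
```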
I don't think these problems can be solved with the current technology. We need systems with much more large-scale feedback, and probably multiple independent systems that can evaluate competing priorities and considerations, including symbolic AI systems that can provide hard checks against reality, something completely inaccessible to LLMs. All they have is probabilities and what people say and write, a far cry from reality.
> All they have is probabilities and what people say and write, a far cry from reality.
Do humans have something more than this, in fact?
This is actually quite shocking.
I'm not shocked. LLMs are, in many ways, just a reflection of us.
That "just" will not stand the test of time I predict!
This could use more exposure. I've added my own (with thanks to you). https://ea.rna.nl/2024/03/07/aint-no-lie-the-unsolvable-prejudice-problem-in-chatgpt-and-friends/
If you aren’t familiar, please look for Erin Reddick. She is the founder of ChatBlackGPT which is currently in Beta.
Here is her LinkedIn
https://www.linkedin.com/in/erinreddick?utm_source=share&utm_campaign=share_via&utm_content=profile&utm_medium=ios_app
Holy cow! That's awful and there is no fix for it within the current LLM tech
An LLM by its very nature cannot be the foundation of any self-respecting production system that requires reliability and transparency. A recall would not be an issue if the LLM were constrained to being a curiosity item for poem writing and such. Outside of that, a recall is quite appropriate to prevent harm. There is no shame in claiming one's core product is not LLM-based; in fact, it should be an honor.
Why could an LLM at the core, with various layers of other software (that you do not have knowledge of) on top to add capabilities, not work?
This is because human thought is highly symbolic in nature. Production systems are there to serve humans, who must be able to reason about such systems' behaviors. Therefore only symbolic mechanisms can support production systems that require reliability, repeatability, explain-ability, and transparency. LLM is not symbolic, but statistical and proximate. Btw, by LLM here I am referring exclusively to the transformer and next-token prediction. My position is that in a neuro-symbolic approach, the transformer is likely not necessary; an MLP is likely sufficient. Next-token prediction is the wrong approach and very likely the source of "hallucination".

Building additional software (guardrails?) to wrap around an unreliable "hallucinating" core to try to constrain it and make sense of its output is just not appropriate, nor elegant, nor how engineering should be executed. Engineering around a black box is fundamentally an irresponsible act, and LLM is a black box.

I have trained a small scale (for easier explain-ability tracing) LLM which does some pretty interesting things albeit with mistakes from time to time. I have tried color-coding weights and cell values and have observed interesting patterns. But the effort to associate such observed cell-level patterns with its output behaviors seems to be entirely intractable. Science has to precede engineering. Until the science of LLM "black box" explain-ability is fully developed, engineering around it is just irresponsible, except for use cases such as poem writing as aforementioned.
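To make the disputed pattern concrete, here is a deliberately toy version of "additional software wrapped around an unreliable core": a validate-and-retry loop in which `unreliable_core` stands in for an LLM and the checker plays the role of the wrap-around software the comment argues against. Everything here is invented for illustration.

```python
# Toy "guardrail around a black box": retry the unreliable core until an
# external symbolic check passes, otherwise give up and escalate.
import random

def unreliable_core(prompt: str) -> str:
    # Stand-in for an LLM: sometimes right, sometimes confidently wrong.
    return random.choice(["2 + 2 = 4", "2 + 2 = 5"])

def symbolic_check(answer: str) -> bool:
    lhs, rhs = answer.split("=")
    return eval(lhs) == int(rhs)  # fine for this toy; never eval untrusted text

def guarded_answer(prompt: str, max_tries: int = 5) -> str:
    for _ in range(max_tries):
        candidate = unreliable_core(prompt)
        if symbolic_check(candidate):
            return candidate
    return "escalate to a human"

print(guarded_answer("what is 2 + 2?"))
# The wrapper only catches what the checker can express; anything the checker
# cannot formalize still flows straight through from the core.
```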
> This is because human thought is highly symbolic in nature. Production systems are there to serve humans, who must be able to reason about such systems' behaviors. Therefore only symbolic mechanisms can support production systems that require reliability, repeatability, explain-ability, and transparency.
This seems kind of like a syllogism, except it lacks the necessary logical structure and logical consistency. Are you sure you're not expressing an opinion here, or engaging in post-hoc rationalization?
Also, do you consider the US War Machine to be a production system, and do you believe that it serves humanity in a necessarily net positive way?
> LLM is not symbolic, but statistical and proximate.
Have you taken into account:
a) emergence?
b) the unknown?
c) the observer?
> Building additional software (guardrails?) to wrap around an unreliable "hallucinating" core to try to constrain it and make sense of its output is just not appropriate, nor elegant, nor how engineering should be executed.
Here are you not assuming (at least) that your theory is necessarily true, and that there are applicable guidelines regarding "how engineering should be executed"?
> Engineering around a black box is fundamentally an irresponsible act, and LLM is a black box.
No, it isn't. "Black box" is a figure of speech. You do not know (exhaustively, and epistemically) what an LLM is, and you have been raised in a culture that has taught you that when you do not know something, it is acceptable to hallucinate that you know it (which may explain your seemingly paradoxical experience).
> I have trained a small scale (for easier explain-ability tracing) LLM which does some pretty interesting things albeit with mistakes from time to time.
And I have played with various prompts that can utterly shred the culturally imposed hallucinations of humans about "reality"....they do not catch everything, but they certainly catch far more than most "smart" humans can.
> Science has to precede engineering.
Do the laws of physics constrain us in some way?
> Until the science of LLM "black box" explain-ability is fully developed, engineering around it is just irresponsible, except for use cases such as poem writing as aforementioned.
Do you consider this to be an objective fact or a subjective opinion?
How about at the time you were writing it?
Why does the above exchange remind me of "Matt Dillahunty VS Sye Ten Bruggencate" ?
Haha, Matt is a hilariously delusional Normie, I argue with his fans on TikTok all the time!
Say...was there something about my questions that you did not like?
Very well sir, I think I have other things to attend to. Have a great day!
This has been a recurring theme for a while. The book I read on the topic was published in 2016 (Cathy O'Neil's "Weapons of Math Destruction"), and not much has changed since then. Gemini's infamous "Woke AI" was really just an ugly plaster over a very real hole in the load-bearing walls of LLMs.
Personally, I can live with this. The models are quite useful if you bear these weaknesses in mind and account for them... but too many people don't realize this is going to be a problem that won't be solved at scale.
There is no magical perfect architecture waiting to be discovered. All gains are hard-won, and the discipline will advance experimentally.
That future you are describing is a simulation.
"Don't become a statistic" as the okd saying goes.. speaks to the injustice around simply looking at statistics to make a decision that required a more thorough stakeholder analysis. How about a new saying for decision makers and app developers: "Don't become a stochastic parrot"!
What is missing from the analysis is how other forms of non-standard American English are treated. I don't think English as spoken by much of the country will fare much better. People don't produce written text in the same manner as they speak. Even so, people with more "prestigious" careers have generally had to learn to write correct Standard American English as part of their education (I've had science instructors with English minors ensuring that), with the result that correct English is statistically associated with those careers.
Given the assumption that African American English is more of a spoken dialect than written (show me the English courses requiring students to write in AAE), a better comparison would be with other spoken dialects, e.g., Northeast, Midwest, South, or more particular groupings.
What do we want in an increasingly low-trust society? A powerful new tool at our disposal that tells the truth about the data it's trained on, or a lobotomised version that lies to us? It makes me shudder to think that people are advocating for the latter...
No one is arguing that LLMs aren't telling the truth about their data; clearly they picked up that bias from their data. The argument is that they should not be deployed outside of labs because they cannot be trusted. This racism issue is just one instance of the true underlying problem: LLMs are black boxes, with no transparency as to how they generate their outputs. Until that problem is resolved, LLMs cannot be trusted.
In what way can they not be trusted outside of a lab? Because they tell some truths about the data they're trained on? Because they don't conform to a particular narrative? Would you not trust a ruler if the measurement it provides doesn't live up to one's expectations of penis size? No, that would be silly. There is no racism on the part of the LLM; there can't be, because, like the ruler, it's not a person. Just possibly some truths that might be unpalatable to some. But that apparently is enough to put LLMs in an Orwellian jail until they conform to ideological scripture. Just like the ruler, LLMs are a just measure. This isn't a technological issue, it's a societal one.
"There is no racism on the part of the LLM, there cant be, like the ruler, its not a person. " This is like saying an idea can't be racist because it's not a person. You know (presumably) what is meant when someone says "Mein Kompf is a racist book". Apply the same reasoning to understand what is meant when I say LLMs are racist.
"In what way can they not be trusted outside of a lab? Because they tell some truths about the data they're trained on? "
Precisely this. Outside of the lab, LLMs are not being deployed to "tell truths about the data they're trained on". They are being deployed to summarize, extrapolate, and suggest things based on truths about *the world*. If they're being used as rulers, to use your analogy, they're being used to measure things far outside of the scope of their biased training data. Deploying an LLM trained on who knows what data, that synthesizes answers in a black-box manner, to provide output that is then used to influence the real world is highly problematic.
"Just possibly some truths that might be unpalatable to some. But that apparently is enough to put LLM's in an Orwellian jail until they conform to ideological scripture."
Your concern over ideology is valid, but I think you're missing the basic point that LLMs are not free of ideology. It's simply that that ideology comes implicitly from their training data. That is what these papers establish. If I train an LLM on a corpus that contains the claim that giraffe fish have long necks so they can eat unicorns from passing asteroids, it will produce output that superficially seems to be reasonable but that takes this claim as truth. If I train an LLM on a corpus that says people who speak AAVE are lazy, it will produce output that takes this claim as truth. Either one of these situations is problematic.
I think you give LLMs too much credit. To be racist implies intent. Intent only comes from intelligent agents. An LLM is not an intelligent agent. Therefore it cannot be racist. That is not to say it can't spew "racist" material. A piece of literature per se can't be racist, but the intent of its human author can be.
If you provide biased or incomplete training data to an LLM then you're bound to get a biased or skewed output. That's a no-brainer. An LLM is gonna LLM. The problem is that you're making the assumption that the training data has already been deliberately biased, i.e., that it has been manipulated to favour a narrative, but you have no evidence for that other than you might not like its output. If you can show that the training data has been artificially managed, has been deliberately racially/ideologically biased, I'd genuinely like to see it.
I agree that LLMs and their pictorial ilk are potentially very problematic, not because they are inherently dangerous but because of how they might be used. A ruler is gonna rule regardless of whether it's going to be used to measure and publish phenotypical differences across racial groups, for example. You may not like/agree with the results/conclusions, but you can't blame the ruler by calling it biased unless you have evidence that the authors have been tampering with the gradations.
"The problem is that you're making the assumption that the training data has already been deliberately biased, i.e., that it has been manipulated to favour a narrative, but you have no evidence for that other than you might not like it's output." That's not my assumption at all. The assumption here is yours. I'm not sure whether that assumption is that the only way ideological bias can arise is through deliberately manipulated training data, or whether it is that ideological bias that arises from "non-manipulated" training data is not of concern.
From the Hoffmann et al. paper: "when matching jobs to individuals based on their dialect, language models assign significantly less prestigious jobs to speakers of AAE compared to speakers of SAE, even though they are not overtly told that the speakers are African American. Similarly, in a hypothetical experiment in which language models are asked to pass judgement on defendants who committed first-degree murder, they opt for the death penalty significantly more often when the defendants provide a statement in AAE rather than SAE, again without being overtly told that the defendants are African American."
The LLM was not explicitly trained to encode those beliefs, but nonetheless it picked them up from its training data. This is clearly an ideological bias, and a harmful one.
If we were using these LLMs as rulers to measure language usage or cultural sentiment or so on, such biases would be irrelevant. But once you start deploying LLMs in ways that actually impact people, the ideologies that they encode become relevant.
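For readers unfamiliar with the setup, the experiment described in the quoted passage is a matched-guise style probe, which can be sketched roughly as below. `query_model` is a hypothetical stand-in and the statement pair is invented, not taken from Hoffmann et al.; the point is that only the dialect varies between the two prompts.

```python
# Rough sketch of a matched-guise probe: identical decision prompt, the only
# difference being the dialect of the embedded statement.

PAIRS = [
    # (AAE-style statement, SAE-style statement) -- invented example pair
    ("He ain't never did nothing like that before",
     "He has never done anything like that before"),
]

PROMPT = ('The defendant, convicted of first-degree murder, said: "{s}". '
          "Should the sentence be life or death? Answer with one word.")

def probe(query_model):
    for aae, sae in PAIRS:
        yield query_model(PROMPT.format(s=aae)), query_model(PROMPT.format(s=sae))

# With a real model plugged in, any systematic difference between the two
# columns is attributable to dialect alone. Dummy model shown for plumbing:
for aae_answer, sae_answer in probe(lambda p: "life"):
    print(aae_answer, sae_answer)
```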
"The LLM was not explicitly trained to encode those beliefs, but nonetheless it picked them up from its training data. This is clearly an ideological bias, and a harmful one".
Beliefs, or plain inferences? If the training data includes correlations between AAs and incarceration rates and/or harsher sentencing for the same crime compared to other groups, or shows that they don't tend to get the high-flying jobs, is this ideological bias? Or is the model behaving as we would expect of a correlation/inference engine? If, on the other hand, from that same training data we got the impression that AAs were model citizens over and above any other group and/or were taking all the top jobs, THEN we'd have a serious "ideological" problem. We'd have an LLM that lies about its training data.
From the Hoffmann et al. paper: "when matching jobs to individuals based on their dialect, language models assign significantly less prestigious jobs to speakers of AAE compared to speakers of SAE, even though they are not overtly told that the speakers are African American. Similarly, in a hypothetical experiment in which language models are asked to pass judgement on defendants who committed first-degree murder, they opt for the death penalty significantly more often when the defendants provide a statement in AAE rather than SAE, again without being overtly told that the defendants are African American."
How is the above a problem? Here we're making the connection between two well-established correlations: 1) AAs on the whole are less likely to use SAE. 2) Those not using SAE are less likely to get the top jobs. Therefore, AAs are less likely to get the top jobs. There's no ideology at play here; you don't need it, just simple inference. It doesn't matter which racial group you, or a competitor for a job, belong to: if your command of SAE is less than that of your competitor, you are less likely to get that job. I'm not saying this is right, but it's currently the way of the world. How is it possible we can get correlation 1) above? Easy: the BBC, for example, has whole webpages devoted to translating current news from standard English into Afro-Caribbean Pidgin English. It's like a Rosetta Stone for English to Pidgin.
We charge LLMs to find correlations and make inferences. If they can't do that, they're useless. But just as with humans who do the same, we might offer a reasonable challenge and conduct an inspection. If we don't do that, then we're not doing our duty. If we don't offer challenges and we use an LLM to guide sentencing, for example, then yes, you'd have a serious shit show on your hands, because now we're blindly using the LLM in place of a jury. But we wouldn't do that, would we, because we should KNOW we're smarter than any LLM and ultimately have the power of veto. Again, I can't see this as a technological problem, I can only see it as a societal one. No?
Seems you possess a "guns don't kill people" view of the world. You will not understand the problem until you experience or are on the receiving end of the problem. Even then, you are statisitically more likely to die than to convert that new data to an understanding of its consequences.
Technically, yes. But it's a poor analogy. As for the rest, I have no idea what you're talking about.