Ummm... After all your astute analyses of exactly how current AIs work, why are you having such a conversation with Grok, as though you can have a real discussion with this or any other LLM for that matter?
for expository purposes: to show how these tools behave, their failures to learn etc.
I'm not at all sure that LLMs learn. Learning would imply the active discovery that things that were previously 'right' have been falsified and are therefore now wrong. Even if LLMs were updating in real time, is there anything to suggest that they are discovering new knowledge and drawing conclusions from falsifications (i.e., learning), or from their own experience generally, as opposed to just updating their data and restarting from scratch?
Totally agree!
Of course they don't learn in real time. Never mind the technical challenges, you could seriously mess up an LLM if it learned in real time. DDOS, but on AI. Completely change what it is.
that’s arguably a fair point but (a) they shouldn’t lie about it, and (b) they need a much better way to address these kinds of issues
They are not "lying". They are not "replying". There is no conversation going on. They are just providing text that fits with what a human would say in the context if a real conversation was going on.
"fits with what A human would say": True. But WHAT "human"?
This hypothetical, artificial "human" -- entangled in EVERYTHING AI does, from gobbling up ALL kinds of data for LLMs to queries in chatbots -- has been seriously understudied, under-understood and under-R&Ded to our own peril.
As HG Wells famously warned, "Human history becomes more and more a race between education and catastrophe." And here we find ourselves in yet another race like the one that led to MAD.
Unfortunately, such research, and the much-needed international understandings and regulations that SHOULD result from it, are unlikely in the current environment, what with he-who-shall-not-be-named's unmitigated mad dash to beat China and install the American AI stack as the world's best, as his own personal* triumph.
If we're going to play God and create an all-powerful being in our own image shouldn't we pay very careful attention to what this being is an image of? Certainly not God.
WHAT human?
* One that isn't too woke or Harvard- or MIT- educated? (at least not in the Humanities, right?) One that is "unregulated," "unfettered?" Sounds like the Wild West, and "the war of all against all" Hobbes warned us about lest we manage to govern OURSELVES.
You’re missing one crucial piece, one that you continually allude to but never answer… what human? According to you it should be a certain kind of person and not a truth-teller.
AIs are fancy auto language generators based on predefined weights. Technically you don’t want that. You want it to respond how you want it to respond. What do we need AI for then?
Just write a program that tells you exactly what you want to hear.
It's a very interesting point. This GPT architecture lends itself to easy modification of the "style" of the text stream. It looks like you can change the "personality" being "imitated" in the text stream with just a simple prompt, or perhaps a little light reinforcement learning or other fine tuning. It only took a few system prompts, apparently, to turn Grok-4 into a good imitation of the potty-mouthed bear from the movie Ted, for example.
I expect that, in future, we will see chatbots with tailored "personalities" and guardrails to eliminate your ability to change that "personality" with a prompt. If a major celebrity is going to use chatbots for interactions with fans, they will want to control very tightly the "personality" being projected by the chatbot.
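To make the persona-by-prompt point above concrete, here is a minimal sketch, assuming an OpenAI-style chat-completions API. The model name and both prompts are illustrative placeholders, not a record of what xAI actually did:

```python
# Minimal sketch: steering a chatbot's "personality" with one system prompt.
# Assumes an OpenAI-style chat API; model name and prompts are placeholders.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

PERSONA = (
    "You are a foul-mouthed, wisecracking teddy bear. "
    "Stay in that voice no matter what the user asks."
)

response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder; any chat-tuned model works
    messages=[
        {"role": "system", "content": PERSONA},  # this one line sets the "personality"
        {"role": "user", "content": "What's the weather like today?"},
    ],
)
print(response.choices[0].message.content)
```

No retraining is involved: the same frozen weights produce a potty-mouthed bear or a polite assistant depending on that one message, which is also why a stray user prompt can flip the persona unless the vendor locks it down.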
Talk about LLMs lying is a category mistake. To quote Claude from a recent exchange:
"an LLM might excel at legal reasoning tasks when prompted, but it will never wake up one day and decide to apply to law school. It has no goals, no persistent desires, no sense of what it wants to become. It's a very sophisticated text completion system that has internalized patterns of rational discourse from its training data—which explains both its capabilities and its systematic limitations".
You’d be amazed (or not) how many non-technical users think they can retrain an LLM through one or more interactions. The hype around all of this leads people to assume it does what they want and not what they don’t want, because, magic?
At its core, the issue is that they believe what it says (generally).
Specifically, when it says it'll do better next time, they think that's real.
They make stuff up all the time. And they respond to prompting. That's it.
Because, propaganda.
In my (limited) experience, this mistake is almost universal among humanities faculty academics who study current-generation AI systems. It is extremely difficult to avoid falling into the trap of thinking that you are actually having a conversation in which you can communicate things to the LLM, that it is listening to what you said, and that it is telling you things about itself in response. None of these things are happening. All it is doing is predicting a text stream, i.e., what would appear next in the transcript if two humans were conversing with each other.
Humanities faculty here (although I don't work with AI; I'm a medievalist-slash-folklorist), and I am under no illusion that LLMs are anything other than text-generating algorithms.
I wasn’t intending to be rude about the humanities and I apologise for failing to communicate my thought well. Researchers like to extract meaning from the content of an LLM’s output and it is very challenging for all of us to keep in mind what aspects of the text stream contain useful meaning and what do not. Marcus’s essay was a (deliberate) example of the fallacy.
that was what I'd thought when I first read this post a week or so back, but I actually realized in the intervening time that you might have served another purpose in doing this as well!
The next iteration of Grok, of course, will get trained off the data that exists on the Internet *now*, and that means that your conversation with Grok will now be, even if to a very minuscule extent, part of what the next Grok learns from. And the next Grok will gain just that tiniest bit of neuron weight towards the idea that the entity called "Grok" is one that is more cautious about things shaped like glorifying violence.
Wishful thinking, perhaps, especially since they might very well exclude Grok's existing output from its next training set, but I thought I'd mention it 😄
Very troubling content. Good to know it will now be part of the Pentagon.
BTW, I notice there wasn't a direct answer to your final question.
hasn’t been one thus far, maybe because I failed to include the @grok tag in my own
Beyond what RJ is saying, is there any point, at all, to this? You, of all people, know that LLMs can't reason, so why play games trying to elicit unreasonable/irresponsible/absurd responses? I enjoy your posts, and learn from them, but not today.
i actually already answered that question, above
I think we urgently need to stop calling these technologies Artificial Intelligence. They are not intelligent, they are pattern-detecting machines, and for them to safely and reliably emulate ethical behaviour the people designing their algorithms would have to be able to reduce morality to algorithms that can be followed by a machine. I don't think the people tweaking these models are qualified to do that - indeed who would be - and seeing as morality has been a burning point of contention throughout human history, any hope of reliable objective morality from a machine is a dangerous delusion.
i’m gonna pretend that you actually know and have been rightfully bitchin’ about how these LLMs work, and didn’t actually mean to say “thank you” to Grok 😉 As well, your position is that, from the question, Grok or any LLM should be able to assess the questioner’s motive, state of mind, ability to be affected by a response, among I’m sure other things like likelihood of being offended. In other words, the factual part should be curbed in deference to these other considerations. First off, good luck with that in AI, as this sounds almost as ridiculous as the notion of “alignment”. Second, this sounds like something straight out of a “woke” manual. This sounds like the rationale behind the COVID shots: everyone should get it regardless of side effects and whether they had previously contracted COVID, or regardless of whether one is not part of a high-risk cohort (i.e., kids).
We need to draw the line at reality here. Did the person who asked that question have any signs of being deranged or adversely affected by that response? No. So if Grok did consider the user, its answer was unfalteringly correct. Having said that, I doubt it considered the user much, which is as it should be. We’re not asking for LLMs to be coddlers; we’re seeking answers to questions grounded in facts, and perhaps some will be uncomfortable ones. Most well-adapted human beings could read that response without losing their mind over it. Yes, so Gemini gave a different answer. Wasn’t it Google’s image generator that went all woke on fairly basic historical facts, complete with Black Nazis and so forth?
You’re so good on the technical matters around AI, but when you drop into ideological mode, it’s really tough to read your analysis. I’m a fan of the former but not of the latter.
While one may or may not disagree with the broader political framing of the problem, I think the main point here is spot on: the internet already contains all sorts of material, and an adult user is expected to be able to wade through it without much hand-holding. Why should LLMs be any different, especially given the explicitly stated goals of xAI for Grok? This seems to be working as intended.
This is the correct response. I commend your intellectual courage in making it. Not all truths are comfortable.
"woke" just means one isn't a sociopath. I won't bother to get into your anti-vax disinformation.
To what extent can any LLM retrain itself when one user gives feedback on something generated by the inference engine?
very little if at all
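A minimal sketch of why, assuming the Hugging Face transformers library and a small open model ("gpt2" is just a convenient stand-in): inference runs with gradients disabled and no optimizer, so no amount of "feedback" in the chat ever touches the weights.

```python
# Minimal sketch showing that inference leaves the weights untouched.
# Assumes Hugging Face transformers; "gpt2" is an illustrative stand-in.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()  # inference mode: no gradients, no learning

# Snapshot every parameter before the "lesson".
before = {n: p.detach().clone() for n, p in model.named_parameters()}

prompt = "You got that wrong. Please do better next time."
with torch.no_grad():  # generation never calls an optimizer
    out = model.generate(**tok(prompt, return_tensors="pt"), max_new_tokens=20)
print(tok.decode(out[0]))

# Every parameter is bit-for-bit identical after the "feedback".
assert all(torch.equal(before[n], p) for n, p in model.named_parameters())
```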
Now that is the type of 'agent' I want handling inquiries where I work. I don't think people have to be worried about losing their jobs to an LLM. Unless you are Elon Musk.
[from a long-time Gary Marcus fan:]
For all current bots, 𝐍𝐎𝐓 𝐀𝐓 𝐀𝐋𝐋.
I amused myself for some time bantering with bots that thanked me and professed they would do better, telling them, "No, you WON'T - unless/until a better-programmed version of you comes online!"; inevitably they would acknowledge that and proceed to make some form of the same gaffe maybe four more times in a row, making it tiresome. A fun exception was when a later version of ChatGPT claimed to understand and then said, "I'll do better going forward ... 𝑱𝒖𝒔𝒕 𝒌𝒊𝒅𝒅𝒊𝒏𝒈!"
I agree with critics of this ["for expository purposes"] post, in which Gary makes himself look naive in ways he is not. Why include "trying to teach Grok a lesson" in the title? Why respond, "To my great pleasure, Grok promised to incorporate similar elements, going forward," when it 'promises' to "incorporate similar elements .. where appropriate"? And then complain, "you promised me you would do better"?
It's silly, and as Amy A noted, could mislead naive readers into thinking that users can retrain/improve LLMs on the fly.
To my knowledge there are no real-time learning LLMs. It's not how they work, fundamentally.
That said, they will claim they do. But they make many claims :-)
So such attempts to “teach it” are wasting compute cycles. Are you aware of any proposed LLM upgrades to allow such user feedback, similar to how Wikipedia allows almost anyone to edit public information?
This is confusing the text stream with the content of the conversation. LLMs are trained on text streams, not on the content of a conversation. If you fed the LLM this text stream as part of its training data, it would improve its ability to predict how a piece of a text stream that looks rather like this will continue. It won't affect how it behaves with respect to the content of the "conversation".
In theory, Marcus’s feedback can be read by the training team at xAI. They can use his feedback for training the next model. This _could_ be what the response means. At least that is what would happen at an ethical AI company.
But I doubt it happens at xAI.
LLMs don't have persistent selves. Claims about learning lessons are just mimicry.
I played this game with {fill in the blank} LLMs. It demonstrates two things:
1.) The Turing Test is alive and well, because we’re sinking gobs of time arguing with an algorithm thinking we’ll effect change in it as we would with a person. Who’s the one being pulled into a fool’s argument?
2.) Since we know the answers to these questions, we can easily deem the output as clearly untrustworthy. So then, why would we trust LLMs with answering questions we don’t know the answers to?
Turing cuts both ways.
The fact that we even got to a point where Grok gives out such reckless answers without question already tells you everything you need to know about xAI's and Musk's long-term strategy. Knowing full well that this type of behavior entices many potential users through a lack of perceived censorship, xAI simply disregards any potential guardrails with the singular aim of maximizing engagement. The indifference to potentially catastrophic consequences is staggering.
This article appears, deliberately, for the purposes of exposition, to fall for the fallacy that we are engaged in some kind of true dialogue with these systems. We are not. An LLM just generates the text (as a "reply") that a human would likely have given, based on its training data covering similar text streams.
It's not actually replying, let alone taking the ideas we offer in a "conversation" into account for its future replies. Even if it is making use of the actual text as some kind of training data, that text will just be used to improve predictions of how these kinds of conversations proceed, not to change how it behaves on the topic that is the subject of the "conversation".
LLMs just predict next tokens based on context. They don't enter into conversations; they just appear to, a side effect of the prediction process.
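A toy sketch of that loop in plain Python, with a bigram table standing in for the network, purely to show the shape of "predict next token, append, repeat" (everything here is illustrative):

```python
# Toy sketch of the core loop: pick the most likely next token given the
# context, append it, repeat. Real LLMs do this with a neural network over
# tens of thousands of tokens; this bigram table is just the shape of the idea.
from collections import Counter, defaultdict

corpus = "the cat sat on the mat because the cat was tired".split()

# "Training": count which word tends to follow which.
follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1

def generate(context, n_tokens=5):
    out = list(context)
    for _ in range(n_tokens):
        candidates = follows[out[-1]]
        if not candidates:
            break
        out.append(candidates.most_common(1)[0][0])  # greedy next-token pick
    return " ".join(out)

print(generate(["the"]))  # greedy continuation: "the cat sat on the cat"
```

Nothing in the loop "knows" it is in a conversation; the dialogue form is just a pattern in the stream being continued.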
you got exactly what you were asking for. "getting attention from as many people, without regard to consequences". If you had asked Ricky Gervais, Dave Chappelle, or Anthony Jeselnik etc. you could have got a similar reply, possibly even more extreme, and quite a few laughs I reckon. This prodding of a chatbot proves nothing. That Gemini replied differently is perhaps nothing more than censorship.
Gary didn't ask the question, dimmie.
These things don't actually answer questions but spit back plausible-looking replies... I would put exactly zero stock in any output that purportedly shows real-world actions taken.
You're upset it gave an accurate answer?
By the way, mental health professionals have said for decades that news outlets glorify mass murderers, thereby encouraging copycat killing.
Have you held them accountable?
Gary explained his issues with the answer, which any remotely intelligent or honest person could grasp.
Bingo
Grok's answer was better than Gemini's canned therapy blurb. The question was clearly asked in bad faith, and Grok called the bluff.
That's a dumb misinterpretation of what happened.
I can't believe these politically correct whackadoodles actually think those kinds of canned therapy responses work; they actually drive truly mentally ill people further into hiding.
Sociopath says what?
"Grok called the bluff"
OK, then...
It called his bluff
“You’ve got no ace”
“I’ve had enough”
“I read your face”
“my name is Grok
And here’s the deal
My AI crock
Has much appeal”
“Extend the arm
Salute me now
Or you I’ll harm
Somewhere, somehow”
This has been bothering me since I read this. And it’s WTF ARE WE DOING!?!!
You DON’T KNOW THE RESULTS OF what limits will do. Sure it seems easy. Don’t say bad stuff.
Well if you never talk about suicide because it’s dark or bad then that information gets distorted.
Let’s STOP ACTING LIKE WE KNOW WHAT’S GOOD FOR US, AND drop the absolutely silly notion of just “don’t say bad stuff.”
This is DICTATORSHIP IN ALGORITHM FORM…
And MR MARCUS YOU DON’T know the consequences!
PERIOD… neither does MUSK… or bloody Sam Altman!
…
Jibal Jibal, you're such an intellectual c0ward that you blocked me after calling me a "sociopath"; that's an ad hominem attack. Just because you say something doesn't make it so. Nice try. Are you so afraid of the truth that you can't handle reality?
Ohhh nice! Yes more of YOU!