Haha, nice to have the original source! I read about this example elsewhere, but as far as I can recall, they cited you.
It's a good example, and as I've said, the 'reasoning' that LLMs now show is actually very funny because it is not logical: it says, 'The context hints that the answer is Ronaldo'.
That is exactly how it works. The process is stochastic aggregation more than reasoning. You can think of it as a probability balance between Messi and Ronaldo that shifts as tokens (subwords and the activations they trigger) accumulate around one or the other. References to Lisbon, Portugal, or any related concept subtly influence the overall stochastic weight in the output. Once you grasp this, it becomes quite intuitive.
For example, if Cristiano played for a hypothetical team called “Banderas”—a name that also belongs to a well-known actor frequently mentioned in the training data—the high frequency and association would create a strong “activation bomb,” heavily tilting the scales.
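To make that concrete, here is a toy sketch in Python. It is pure illustration: the tokens, candidates, and association weights are all invented, and nothing here reflects how a real transformer is implemented. The point is only the shape of the mechanism: each context token adds weight to the candidate answers it co-occurs with, and a softmax turns the accumulated totals into the probability balance, so a single high-frequency collision like "Banderas" can swamp everything else:

```python
import math

# Hypothetical association weights that each context token contributes to
# each candidate answer. All numbers and tokens are invented purely to
# illustrate the "probability balance" idea.
ASSOCIATIONS = {
    "Lisbon":   {"Ronaldo": 1.2},
    "Portugal": {"Ronaldo": 1.5, "Messi": 0.1},
    "Rosario":  {"Messi": 1.4},
    # A team name that collides with a far more frequent actor's name acts
    # as an "activation bomb": one token dumps a huge weight on the wrong
    # candidate.
    "Banderas": {"Antonio Banderas (actor)": 4.0, "Ronaldo": 0.3},
}

def softmax(logits):
    """Turn accumulated weights into a probability distribution."""
    m = max(logits.values())
    exps = {k: math.exp(v - m) for k, v in logits.items()}
    total = sum(exps.values())
    return {k: v / total for k, v in exps.items()}

def answer_balance(context_tokens):
    """Accumulate per-candidate weight from the context, then normalize."""
    logits = {"Messi": 0.0, "Ronaldo": 0.0, "Antonio Banderas (actor)": 0.0}
    for tok in context_tokens:
        for candidate, weight in ASSOCIATIONS.get(tok, {}).items():
            logits[candidate] += weight
    return softmax(logits)

print(answer_balance(["Lisbon", "Portugal"]))  # balance tilts toward Ronaldo
print(answer_balance(["Lisbon", "Banderas"]))  # one collision swamps the rest
```

Run it and the first context tilts the balance toward Ronaldo while the second tilts it toward the actor, even though nothing was 'reasoned' in either case; the output just follows wherever the accumulated weight lands.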
It is unfortunate that companies like OpenAI and much of the AI industry—including researchers and academia—allow this kind of misrepresentation to persist. Instead of pushing back against the paternalistic narrative of “intelligent”, “thinking” AI, they reinforce it, as if people couldn’t handle the truth. In reality, AI is just sophisticated pattern recognition, and that is impressive enough on its own without the need for exaggeration.
Thanks for elaborating!
As for me, such details about where these AI models fail are more important for understanding the current state of affairs than all the benchmarks and the maths-problem solutions they were trained to parrot, because they really show that the models are not reasoning and lack true understanding.