If all we had was ChatGPT, we could say, "hmm, maybe hallucinations are just a bug," and fantasize that they wouldn't be hard to fix.
If all we had was Gemini, we could say, "hmm, maybe hallucinations are just a bug."
If all we had was Mistral, we could say, "hmm, maybe hallucinations are just a bug."
If all we had was LLAMA, we could say, "hmm, maybe hallucinations are just a bug."
If all we had was Grok, we could say, "hmm, maybe hallucinations are just a bug."
Instead we need to wake up and realize that hallucinations are absolutely core to how LLMs work, and that we need new approaches based on different ideas.
Gary Marcus has been warning the field about the problem of hallucinations since 2001.
Another commenter on this Substack put it this way about a year ago: LLMs are doing the same thing when they get it right that they are doing when they get it wrong. They said it better than that, but I've never forgotten it.
Karpathy also has a quote about how "hallucination is all LLMs do," pointing out that to get reliability from a system built on LLMs, we need to add additional layers of analysis and methods: https://twitter.com/karpathy/status/1733299213503787018