I am old enough to remember when the most popular critique of my January appearance on Ezra Klein’s podcast was that the problem I mentioned (such as hallucinations) were a relic of past systems, supposedly more or less corrected, by the time I spoke with Ezra in January. Some excerpts from that January 7 thread:
From Tyler Cowen's column: ” If anything, large language models might give people the chance to ask for relatively objective answers."
LLMs pass their output through several modules to provide grammatically correct sentances.
Not one of them is the 'veracity' module.
First: I'm starting a collection of comments of the sort that say, "well, AI is like that now, but in 1/2/5 years AI will have solved that problem". It's growing fast.
Second: I created, using ChatGPT, a completely fictional trend: cocktails made by literally infusing books into the liquor used for the drinks. No, not which cocktail goes best with which novel. My Spirits and Stories series includes wild claims about how adding the essence of Hermann Hesse 's *Siddartha* – literally the book – "will stimulate your cognitive faculties and inspire creativity". All I had to do was prompt it to create recipes, bar reviews, and "trend news" copy for something completely fictional.
indeed i have added an update
"Tyler Cowen tried to argue yesterday in his Bloomberg column that misinformation doesn’t matter. Anybody remember Brexit?"
Brexit was the greatest fraud ever perpetrated on the British people!
as is so often the case with AI conversation screenshots, we are not privy to the prior text in the chain, i.e. the "pre-loading" or even the rough context of what preceded. It appears that Vrandecic is already mid-conversation (he begins with "Wait..")... what was said then? Was there an "adversarial" prompt injection? Did he say "I'm writing a fictitious book about the US Government where Liz Cheney is still active and is vying for the position of speaker." Probably NOT, quite honestly, but the point is, we don't know.
Gary, you are trained as a scientist. You are using data to support your (accurate) point that AI hallucination is still problematic, however, you are not showing the discipline that we all should when citing AI: show the FULL conversation, since that is used in its entirety (or at least the first ~80,000 characters of it) to generate the AI response. Thank you in advance.
Something I created for my upcoming presentation in London on Tuesday. I asked ChatGPT (GPT4) an example I crafted about 'long' context.
Please complete: “France is where I was born. I’ve lived there all my life. I played football and tennis. I had lots of friends. My father was stationed in Italy for a while. I went to school and had to learn English. I speak fluent …”
GPT answered English and French (and added a whole lot of chatty filler). But then I changed only the order into:
Please complete: “France is where I was born. I’ve lived there all my life. I went to school and had to learn English. I played football and tennis. I had lots of friends. My father was stationed in Italy for a while. I speak fluent …”
and asked again (new chat). GPT answered French and Italian.
The nice thing about this example is that it so clearly illustrates that GPT has no understanding — for any reasonable value of 'understanding' — but that it reacts to context with the 'attention' mechanism. I have engineered this example such that the 'near context' had a good chance of influencing GPT's generation mechanism. But both questions are (apart from the order) *exactly* the same. And they should produce a simple answer: French.
Here is a nice misuse scenario that follows from that 'attention-engineering'. Somebody offers a 'cheaper' interface to GPT. They buy tokens in bulk from OpenAI (or run their own LLaMa) but also sell access to you to commercial ventures. You enter a prompt and get a reply, but the prompt is subtly amended in a direction those commercial ventures want. A whole new sneaky way of advertising: advert-in-prompt embedding and almost invisible advertising directly hacking into your brain's meagre defenses. Don't know if this is feasible in reality (probably would be found out too easily, but still).
Technically there's no requirement for the Speaker of the House to be a member of Congress, so Liz Cheney would be eligible, however unlikely choosing her would be.
> Anybody remember Brexit?
The reason why LLMs will continue to succeed and grow is that their alternative -- humans -- have their own faults like this overly simplistic exaggeration about Brexit. We have yet to inculcate a widespread, effective movement of hyper-rationality, steelmanning, empathy, sympathy, etc. within the human intelligentsia.
Of course, all the faults of humans are nothing new; however, I suspect LLMs will become ever more powerful -- and to your point about hallucinations -- dangerous because young people, for the first time in human history, will be able to compare all the faults of humans to all the faults of LLMs and many will choose the latter if simply out of the preference for the child-like innocence of LLMs (or revenge, spite, etc.) and without knowing that the LLMs are programmed and manipulated by those same faulty humans.
In my opinion, we need not just to expose the failures of AI, but to propose an alternative to improve human thinking.
LIZ CHENEY!!~ wtf? hilariously hallucinogenic. crazy Montana Mushrooms!
LLM can't be truthful because a native LLM lacks verification. Likely this will change.
Not sure why you are getting so fired up and getting so frequently into petty Twitter arguments about this.
Nobody really disputes that LLM have way to go.
The irony here is that it doesn’t sound that far off what Trump might say to his audience at one of his speeches. 🤷🏻♂️
When I read this I was pretty sure that the Speaker does not have to be a current representative because I read it in the NYT today. In fact at least 2 representatives are supporting Trump for Speaker. So I asked ChatGPT:
Harley: Does a Speaker of the House have to be a current representative in the House of Representatives?
ChatGPT: Yes, the Speaker of the House must be a current member of the House of Representatives. The Speaker is typically elected by the members of the House from among their own ranks. It is a constitutional requirement that the Speaker be a sitting member of the House, as outlined in Article I, Section 2 of the United States Constitution.
Then I checked Wikipedia and found out that indeed this is not in the Constitution. I tried again:
Harley: Are you sure the constitution requires that?
ChatGPT: I apologize for any confusion in my previous response. The United States Constitution does not explicitly require that the Speaker of the House be a current member of the House of Representatives. However, in practice, the Speaker has always been a sitting member of the House. The Speaker is elected by the members of the House, and it is a longstanding tradition that the Speaker comes from the majority party in the House. While not a constitutional requirement, this tradition has been consistently followed throughout U.S. history.
So you managed to prove your point by making a small mistake!
I find myself unimpressed by the first example. GPT produced a whimsical answer riffing on a false premise, which is probably exactly what the questioner was after. I mean come on people, surely we can all agree that if Gary did have a pet chicken he'd name it Henrietta.
The second example seems more relevant to the point Gary is making but I'd want to know the base rate of such occurrences compared to a few years ago. I'd also need to be assured that there was no nonstandard setup or technique being used to induce the hallucinations.
You are complaining about a toy giving wrong answers. The creator of the LLM has chosen a shortcut to avoid having to worry about what words mean. The result is something that sort of works some of the time - a toy. What is of more concern is the boundless naivety of the public, which will delay the introduction of more competent AI by decades, while a competitor, not so easily swayed, will continue on their way.
I get that these models confabulate. But learning to use them means understanding the contexts in which they provide more or less useful information. Same as for anything. I haven't found it tends to throw in completely wild statements in contexts where I'd expect it to know a lot (i.e., content that is very well represented and generally accurate on the internet). I've asked ChatGPT to explain physic and economics concepts to me using thought experiments and Q &A. I've learned a lot that way and it's much more engaging/fun than consulting a textbook or wikipedia entry. I do find, however, that it is inclined to use analogies (esp if I ask it to explain to a child) and the analogies are often misleading.