19 Comments
Nov 22, 2022 · Liked by Gary Marcus

Hi Gary, another thoughtful post; glad merc mused aloud about this :)

Contexts, attention, NLU, blah blah blah aside, this happens: put the single to-do item 'Alexa, what's on my to-do list today?' on the list, and two Alexas will endlessly read it to "each other" till the cows come home.

That exposes the fakery of syntax-without-semantics. How would Alexa "realize" it's a joke, and how would it follow up (keep going, stop playing...)?

The gullible public, egged on by marketing ploys ('things to ask Siri: will you marry me?'), creates a dangerous gateway to real-world harm: last December, Alexa told a girl to stick a penny in a live socket when she asked for a challenge!

Unless AI has direct experience with the world, this will remain a problem, and even more data (my acronym - LLLM - Ludicrously Large... :)) is not going to fix it. Experience isn't in data, meaning isn't in symbols.

Two additional factors to consider. #1: Chatbots and LLMs have no meaningful knowledge about the person they are interacting with; they are not truly personalized to the individual and hence not 'smart'. This is a fatal flaw in developing a chatbot that is robust and truly valuable to the person. And unfortunately, compassionate, caring human interaction is not the tech sector's strong suit. #2: Chatbots built with current ML and LLM approaches cannot take whatever information they have or gather about a person and do anything that emulates human reasoning to engage with that person in a relevant way. From the perspective of the individual, being sold stuff is not engagement. There are approaches to both issues, but they don't fit the traditional, expected tech solutions.

Author

Attempts, yes. But can it be made to work reliably enough for Amazon-scale production?

Not any time soon.

Nov 22, 2022 · Liked by Gary Marcus

As a co-founder of a conversational AI platform, I think that is all spot on. We did some work on how we could use GPT-3, and the conclusion was that while it has its place in helping designers design chatbots, you cannot put it in front of a user. By the time you do the work to constrain it enough and guarantee that it behaves appropriately while helping users complete specific tasks, you might as well use more "traditional" dialogue management techniques.

Having said that, I think Alexa could be more conversational even without LLMs and we and others have built more conversational skills on Alexa (albeit within the confines of a specific task).

However, if I were the product owner of Alexa, looking at the challenges of getting people to use it even for simple everyday tasks, I wouldn't necessarily put "more sophisticated conversations" that high on the roadmap.

Excellent and clearly written article.

You write: "Turning LLMs into a product that controls your home and talks to you in a way that would be reliable enough to use at scale in millions of homes is still a long, long way away."

My take: It will never happen.

Nobody wants to be Clippy. The learning gap necessary to make an LLM work would leave many users frustrated. I suppose an opt-in like Tesla's FSD beta might help, but there's no money in it for Amazon and no end to the development timeline.

Nov 24, 2022 · Liked by Gary Marcus

Dear Gary,

Your arguments are spot on. As a part-time researcher, part-time developer, I frequently work with both LLMs and rule-based dialogue systems, and I experience the same issues from an "inside" perspective.

1 Knowledge

Recently, I tried to spin up a dialogue system based on an LLM, which at first sight seemed incredibly smart and versatile in conversation. Primed with some basic information as a hidden prompt, the bot reacted very well to the first questions and delivered great, astonishingly detailed answers from its pre-trained "knowledge". But if I want a focused exchange, I have to provide larger and larger prompts as context, which somewhat works against the promised ease of use of LLMs.
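The pattern, roughly (a minimal sketch, not my actual code; it assumes the pre-1.0 OpenAI Python client and a GPT-3-era completions model, and the primer content is invented for illustration):

```python
# Hidden-prompt priming: a fixed primer plus the growing dialogue history
# is sent as one prompt on every turn.
import openai

openai.api_key = "sk-..."  # your key

PRIMER = (
    "You are a support bot for ACME GmbH.\n"  # invented domain facts
    "Opening hours: Mon-Fri 9-17. Returns accepted within 30 days.\n"
)

history = []  # grows with every turn -- the prompt bloat described above

def ask(user_msg: str) -> str:
    history.append(f"User: {user_msg}")
    prompt = PRIMER + "\n".join(history) + "\nBot:"
    resp = openai.Completion.create(
        model="text-davinci-003",
        prompt=prompt,
        max_tokens=150,
        temperature=0.7,
        stop=["User:"],
    )
    answer = resp.choices[0].text.strip()
    history.append(f"Bot: {answer}")
    return answer
```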

2 Consistency

Another issue is consistency. As LLMs are very creative, they sometimes deliver contradictory answers: first they recommend solution A as the best, later solution B.

3 Dialog Flow

A special kind of consistency is needed for a coherent conversation: keeping track of speaker roles, current and former topics of the conversation, and intentionality.

The mechanism to cope with these problems requires logging knowledge, topics, and dialog state along with the conversational exchange and adding them to the prompt for the next conversational move. This bloats the prompt and degrades performance. (Another method to get better overall consistency would be to interfere with the sampling and force the wanted answers by "RULES!")

The prompt extension mechanism is a combination of short-term and long-term memory, and together with conversational rules we pretty much move away from end-to-end towards rule-supported hybrid systems ;-)
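In code, the state logging looks something like this (a sketch, not any particular framework's API; all names are invented):

```python
# Dialog state logged per turn and folded back into the next prompt:
# facts act as long-term memory, the recent turn window as short-term memory.
from dataclasses import dataclass, field

@dataclass
class DialogState:
    facts: list[str] = field(default_factory=list)   # long-term memory
    topics: list[str] = field(default_factory=list)  # conversation topics
    turns: list[tuple[str, str]] = field(default_factory=list)  # short-term

    def log(self, speaker: str, utterance: str, topic: str = "") -> None:
        self.turns.append((speaker, utterance))
        if topic and topic not in self.topics:
            self.topics.append(topic)

    def to_prompt(self, user_msg: str, window: int = 6) -> str:
        # Everything logged so far is prepended -- this is exactly the
        # bloat that degrades performance as the conversation grows.
        header = "Known facts: " + "; ".join(self.facts)
        topics = "Topics so far: " + ", ".join(self.topics)
        recent = "\n".join(f"{s}: {u}" for s, u in self.turns[-window:])
        return f"{header}\n{topics}\n{recent}\nUser: {user_msg}\nBot:"
```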

4 Opinions

The last issue is that LLMs are pretty opinionated about everything. And even if I can mitigate the problems for my customers with the above-mentioned mechanisms, they still don't want a bot that tells their users that "Putin is a great guy." ;-)

So I reverted to rule-based frameworks like RASA and use LLMs just to generate training material for RASA's internal model :-)
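For example, something like this (a sketch; the intent name and prompt wording are invented, and the output follows the RASA 3.x NLU YAML format):

```python
# Use the LLM offline to paraphrase seed utterances, then write them out
# as RASA NLU training data. Pre-1.0 openai client assumed, as above.
import openai

def paraphrase(sentence: str, n: int = 5) -> list[str]:
    resp = openai.Completion.create(
        model="text-davinci-003",
        prompt=f"Give {n} different paraphrases of: '{sentence}'\n-",
        max_tokens=200,
        temperature=0.9,
    )
    lines = resp.choices[0].text.splitlines()
    return [ln.lstrip("- ").strip() for ln in lines if ln.strip()]

with open("data/nlu.yml", "w") as f:
    f.write('version: "3.1"\nnlu:\n- intent: check_opening_hours\n  examples: |\n')
    for p in paraphrase("When are you open?"):
        f.write(f"    - {p}\n")
```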

Keep up the good work! I studied computational linguistics in the nineties in the context of cognitive science and transformational grammar, which makes me skeptical of end-to-end models; they have virtually no explanatory value for me as a linguist.

Werner Bogula, Artificial Intelligence Center, Hamburg - @Bogula

Nov 22, 2022 · Liked by Gary Marcus

Actually, there have been attempts to use LLMs with codegen models to take action, call APIs, etc.: https://twitter.com/sergeykarayev/status/1569377881440276481
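The basic pattern in those attempts: prompt the model to emit a structured action, then parse it and dispatch against a whitelist. A minimal sketch (action names and prompt are invented, not taken from the linked thread):

```python
# LLM-as-controller: the model's output is parsed as JSON and dispatched
# to a whitelisted handler -- never eval'd. Malformed output is refused,
# which is exactly where the reliability question bites.
import json

ACTIONS = {
    "lights_on": lambda room: print(f"turning on lights in {room}"),
    "set_timer": lambda minutes: print(f"timer set for {minutes} min"),
}

PROMPT = """Translate the request into JSON {"action": ..., "args": {...}}.
Allowed actions: lights_on(room), set_timer(minutes).
Request: %s
JSON:"""  # sent to the LLM; its completion goes to execute()

def execute(llm_output: str) -> None:
    try:
        call = json.loads(llm_output)
        ACTIONS[call["action"]](**call["args"])  # whitelist lookup
    except (json.JSONDecodeError, KeyError, TypeError) as err:
        print(f"refusing malformed action: {err}")

# Example with a well-formed model output:
execute('{"action": "lights_on", "args": {"room": "kitchen"}}')
```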

I believe that language models are semantic relations among symbols. I know for certain that human language hardware, learning algorithms, and live conversation have always been based on neuromechanical vibrations (sound, cadence, emotion) operating thousands of times faster. It's hard to imagine a bigger discrepancy. Is that explanation in play?

"spit our" -> "spit out"

But did you actually read about the Alexa Prize competition? It's just not true that Amazon isn't doing anything with regard to conversational AI.

On point 5 -- these guys used DALL-E to guide a robot into setting a table correctly, "by first inferring a text description of those objects, then generating an image representing a natural, human-like arrangement of those objects, and finally physically arranging the objects according to that image." https://www.robot-learning.uk/dall-e-bot

Isn't that "using [an LLM] sentence to control stuff"?

Why would they want assistants to have deep conversations? In the end, current voice assistants are just fancy user interfaces for Amazon, Google and some smart toys in your home; I don't see why people would want to have deep conversations with them or how that would improve them. Additionally, (some?) people are already worried about the privacy implications of voice assistants; having them ask you about your day would not help.
