
What is desirable about Khosla's visions #1 and #2?

AI doctors will be wrong for a long time. And speaking as a Stage IV cancer patient, how are AIs going to develop empathy anytime soon, especially if they are themselves disembodied? In my experience, even embodied human doctors are mostly bad at this. Similarly, tutors are going to be teaching incorrect material -- and why should any child need a tutor 24/7?

As for #2, it sounds horrifying. Labor will be free -- for those who pay for labor. What will happen to those who *get paid for* labor? "Training," or the same old nonsense? Teaching everyone to code, even if they hate it? (Oh, wait, even GPT-4 does that already.) If this is techno-optimism, it's clearly only so if you're a member of the right economic class.


It's a libertarian's utopia. Everything is reduced to owning land and choosing what to do with it.


It's really difficult to take any of these VC/influencer talks seriously. Predictions that are beyond the time horizon where anyone will remember them or care that he made them?! He just has no idea how or when any of that will happen. Sure, all those things will come true, but is this closer to the ancient Greeks saying one day man will set foot on the Moon, or Kennedy saying the same thing? The Greeks had no idea what they'd even have to learn to solve that problem; Kennedy knew it was just engineering by that point. It's the *perfect* TED Talk. Everyone claps and praises him, he notches another 'brave' TED Talk, and AI such as it is tells you it's a better mother than you are.


"Predictions that are beyond the time horizon where anyone will remember them or care that he made them"

Yep, this. Dan Gardner's book "Future Babble" is all about this (Nassim Taleb talks about it a lot, too). We just *love* hearing people talk about what the future will hold, and we're unfazed by the pathetic records of our past predictions.


And all this techno-optimism is based on the transformer innovation in 2017, how that led to a 175B-parameter model in 2019, and, after three (sometimes horrid) years of fine-tuning, ChatGPT in 2022? Because all the breakthroughs we need to get from that to something actually resembling 'intelligence' and 'understanding' are *completely* unsolved/unknown. We have no clue how to do that, which basically puts any belief in all that squarely in the domain of alchemists being convinced we would get from lead to gold. Not in five hundred years, that one, regardless of the solid convictions.


"Understanding" means having an adequate model. LLM understands nothing about fluids, but a fluid modeling program "understands" quite a bit, if at a low level.

A tool that can give you a proof that involves logic manipulation and numerical computations "understands" logic and numbers. Not the higher purpose, but at least what it is dealing with.

What is intriguing about current methods is that they give up on trying to reverse-engineer intelligence. The hope is that collecting and cataloging a lot of data about how we do things will allow the system to learn by imitation. Appropriate domain-specific modeling will help keep the system grounded.


We have no idea how to connect human-domain-specific modelling (e.g. Wolfram-, Cyc-like, etc.) and pixel/token-sequence-domain-specific modelling (current NNs/GenAI). E.g. an LLM has to 'understand' in order to 'decide' *when* to use *what* human-domain system, and for that the understanding is a prerequisite. But that understanding is then exactly what the domain-specific modelling was supposed to bring. That's a Baron Münchhausen type of situation. It is circular.

"The hope is that collecting and cataloging a lot of data about how we do things will allow the system to learn by imitation. Appropriate domain-specific modeling will help keep the system grounded." Sure, that is the *hope*. More than that, it is the *assumption*. Which at this point has the same status as being convinced 500 years ago that all metals were one and the same so they could be converted into each other. We have convictions galore on this front. My point was that that is *all* we have.

We have no beginning of an idea how that could be done. One thing we do know: the initial hope was that those years of specific fine-tuning (you know, the cheap labor from English-speaking African countries, among others) would already be that 'learning by imitation'. That has already been shown to be a dead end (e.g. it doesn't really scale; most of the energy is therefore now directed at engineering around it).

As soon as you dig a bit below the surface, you see techniques that may have very useful and 'satisficing' uses, but *nothing* that supports that hope, neither the direct one nor the 'combination' one. On the contrary.

Maybe I'm wrong and I have missed something. But in that case, point me to something real, not 'hope'.


We have already made good progress in going from pixels to labels. So, when a robot is in the kitchen, it will be able to look around and realize that this is a kitchen. That here's a cupboard. That's where the dishwasher is. That's the sink.

Then, it has to be given lots of rules, spelled out via text. Such as: the trash is under the sink. It must first open the door, bend down, peek inside, and, among all the junk, find the trash can.

Then, it has to be told how to translate text that says "bend down and look" into motor commands.
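
Purely as an illustration of those three stages, a toy sketch (every function here is a made-up stub, not a real robotics or vision API):

```python
# Toy sketch of the three stages above. Every function is a made-up stub
# standing in for a real component; nothing here is an actual robotics API.

def label_scene(image):
    """Stage 1: pixels -> labels (stands in for a vision model)."""
    return {"sink": (2.0, 1.0), "cupboard": (2.0, 0.5), "dishwasher": (3.5, 1.0)}

def plan_steps(goal, rules, labels):
    """Stage 2: rules spelled out as text -> an ordered list of text steps
    (stands in for an LLM or any other planner)."""
    return ["walk to the sink", "open the cupboard door",
            "bend down and look", "find the trash can", "drop the wrapper in"]

def to_motor_commands(step):
    """Stage 3: one text step -> low-level motor commands (stands in for a
    learned or hand-written controller)."""
    return [("move", step)]

if __name__ == "__main__":
    labels = label_scene(image=None)
    steps = plan_steps("throw the wrapper away",
                       rules=["the trash is under the sink"],
                       labels=labels)
    for step in steps:
        for command in to_motor_commands(step):
            print(command)
```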

Sounds a little more than just "hope"?


References?


That paper (a bit too much fawning language for my taste, btw) is an overview of something different: using LLMs to instruct robots whose behaviour has itself been 'tokenised'. That is different from the problem of grounding LLM output in 'understanding'.

A good example of a real (and interesting) research paper on *that* subject is https://arxiv.org/pdf/2212.06817.pdf (from Google). That one has the Kitchen1 and Kitchen2 testing, for instance.

Even the technologies of the 1980s and 1990s found some niche uses. Real ones. Always (in my memory) by seriously limiting degrees of freedom, i.e. scaling down the complexity of the problem space. You see that in the LLM-robotics space now, where the complexity has been scaled down to a relatively small set of 'behavioural tokens' by which the robot is steered.

LLMs will too, I suspect, but the issue of getting LLMs and symbolic models married is completely unsolved; there are some tech tips and tricks (like GPT creating Python code to execute when it 'detects' arithmetic, instead of trying to approximate arithmetic directly by continuation).
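
That trick is roughly the following pattern (a simplified sketch, not anyone's actual implementation; `ask_model` and the `PYTHON:` marker are made-up stand-ins):

```python
import subprocess
import sys

def ask_model(prompt):
    """Stand-in for an LLM call, not a real API. A model prompted to emit
    Python whenever it detects arithmetic might answer like this."""
    return "PYTHON: print(123456 * 789)"

def answer(prompt):
    reply = ask_model(prompt)
    # If the model chose to emit code, execute it and return the output;
    # otherwise return the plain text continuation as-is.
    if reply.startswith("PYTHON: "):
        code = reply[len("PYTHON: "):]
        result = subprocess.run([sys.executable, "-c", code],
                                capture_output=True, text=True)
        return result.stdout.strip()
    return reply

print(answer("What is 123456 * 789?"))  # 97406784, computed rather than 'continued'
```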

I guess https://ea.rna.nl/2022/10/24/on-the-psychology-of-architecture-and-the-architecture-of-psychology/ remains an important aspect. We see what we want to see. I read these papers and see the holes that fit my conviction. You read the same papers and see the confirmation of your conviction.

I'm still interested if you have a paper (preferably not just some submitted student overview thing without actual research) that shows some new thinking on how to make sure LLMs actually understand in the same way we, for instance, understand the 'equivalence' between 5-6-7 and 105-106-107.


Thank you Gerben, love the alchemy reference, especially having had the privilege of enjoying live gold smelting demos in Ballarat, Victoria, Australia where the staff poured molten gold into the pots to be cooled immediately in water to form solid gold bars (recreating the gold rush environment of the 1850s).

But before we can even talk about “intelligence” or “understanding”, can we first address predictability and reliability?

Gold production is deterministic (those touristy gold demos run several times a day). But RAG LLM output generation is not. I have invested countless hours in trying to stabilise LLM output in the hope of integrating GenAI into my SaaS pipeline. I have spent too long in the lab with Llama 3/Claude 3 Opus/Falcon 180B/Mistral/GPT-4 tests, as my family can attest. I fear that I will not be able to achieve deterministic output from these overhyped models in my lifetime!
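
For what it's worth, this is the kind of repeatability check I keep running, sketched here with the OpenAI Python client (the same idea applies to the other hosted models above; the model name is just an example, and `seed` is only a best-effort hint, not a guarantee):

```python
# Minimal repeatability check. Even with temperature=0 and a fixed seed,
# identical prompts are not guaranteed to produce byte-identical answers.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def run_once(prompt: str) -> str:
    resp = client.chat.completions.create(
        model="gpt-4o",   # example model name; swap in whatever you test
        messages=[{"role": "user", "content": prompt}],
        temperature=0,    # removes sampling randomness...
        seed=42,          # ...and seed is only best-effort reproducibility
    )
    return resp.choices[0].message.content

outputs = {run_once("Summarise ISO 8601 in one sentence.") for _ in range(5)}
print(f"{len(outputs)} distinct output(s) across 5 identical calls")
```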


Deterministic output is against what these models *fundamentally* do. Their strength is on the 'creative' side (many people object to the use of 'creative' here, but that gets us into deep philosophy). The problem is that the creativity we want and the hallucinations we do not want are, under the hood, one and the same thing.

There is a difference between 'reliable' (enough) and 'deterministic'.


Thank you Gerben for your insightful reply.

I agree with your points. And my concern is that such "creative" tools are being misapplied, especially in critical domains like healthcare, with millions spent on model training and deployment in the corporate hope that profits can be made off the back of job redundancies.

And I worry too about nondeterministic "creative" LLM hallucinations being yet another needless burden a stretched medical environment would have to grapple with.

Plus, I would be utterly furious if my child were misdiagnosed by an LLM and an incorrect but urgent surgical procedure were ordered as a result.


I suspect the medical world has more guardrails than most businesses. I worry about IT landscapes filled with lots of very poor AI-generated code. The long term effect of that is going to be really bad. See https://www.linkedin.com/posts/gerbenwierda_debunking-devin-first-ai-software-engineer-activity-7185248033578655744-ojAR


Thanks Gerben.

I wish that you were right about the medical industry having more guardrails.

Unfortunately the exact opposite may indeed happen.

This is because of the negative impact of inflation on the private hospital sector, as is the case here in Oceania. Many hospitals have had to cease operations with warnings of more to come.

With cost-cutting measures and higher workloads, LLM safety (as in most industries) may be the first thing to go right to the bottom of the priority pile. Not only does that adversely affect patient outcomes, but it may also be weaponised against private health insurance customers in the form of "evidence" for lowered or declined payouts by their insurers.


They are absolutely going to agentify these untrustworthy systems, no matter the human cost, so they can fool themselves into thinking they are on the way to creating their dream ASI god. And adjacent to that: imagine if biologists said they were creating a successor species; they would be jailed immediately.


That's the next moronic step. Then they'll put those untrustworthy agents into robots. Today's AI world is 90% people who want to be rich and 10% people who know what they're doing.


Putting an agent into a robot is, in fact, a great idea. Any time a robot does something dumb, that will be a learning experience (that hopefully won't get anybody killed).

The robot will have to learn from experience the countless rules of thumb that we use to do stuff, and hopefully find patterns that explain them.


You're describing Russian roulette. "Hopefully" there's not a round in the chamber.


I am not saying we should put a 100-horsepower metal beast in your living room. That would be bad. The robots should have very little force and torque. There are also designs with pneumatic muscles.

Such robots will, in the worst case, sulk in the corner, and won't have enough strength to lift a chair.

Then they will have to be trained to do something. An LLM can give them ideas to try. Hallucination will make them somewhat ineffective, but that can be sorted out with more data based on their first-hand experience of failure.


“In from three to eight years we will have a machine with the general intelligence of an average human being.”

-- Marvin Minsky, Life magazine, 1970


And that was a year after he and Papert “proved” that connectionism was a dead end. Some days I think the only things bigger than the budgets of the corporations trying to score billions off of AI are the egos of the techies trying to do the scoring.


Humans have a bad record at predicting the future of technology. Either too optimistic or too negative. Thanks, Gary. I don’t always agree with you, but this is on point. We need to ask tough questions and have a BS meter for these talks and other assertions by AI technologists.


Helen Toner's talk was my favourite. We keep on seeing 'more calls' for AI auditing but not seeing anything substantial actually happen yet.


Why 2049? Why not 2050? Because 2050 sounds like a wild-ass guess but 2049 sounds like a calculation that was reached somehow. But it's a wild-ass guess. Same old AI predictions that have been given for decades.


Last year TED: a lot of wishful thinking with at least something to substantiate it.

This year’s TED: a lot of wishful thinking.


The interesting question is what are they smoking :)


Thank you, Gary, for that summary. I am the PR writer who worked with your book The Birth of the Mind. It is great to read this work, so sane on a topic scary to the nontech world.


I'm reminded of an episode of South Park, which I never watch. In it, a business plan is hatched by the Underpants Gnomes, small humanoids that steal underpants. They have a three-phase business plan: 1) collect underpants, 2) ?, 3) profit. Instead we have: 1) scale up, 2) ?, 3) prosperity for all, including a pony for every child.


Agree with your comparison to nuclear energy.

AI winters have been all about inflated expectations (I joined AI in the early 80s, during the deepest AI winter), so we had better try to be real today.


I'd have to agree with the New Species aspect: they are, or will be, unlike us in the way they "think".

I also suspect that the bitter lesson will continue to be bitter.


Electricity production from nuclear plants was stigmatized by stirring up fear of the life hazard from radiation and radioactive contamination, based on the Chernobyl and Fukushima examples. Nothing of that sort concerns AI proliferation for the moment. Nothing sufficiently dangerous and spectacular has occurred to this day to warn people, to show them that this technology in its present state is not really ready for general, unrestricted use, and that they are not protected against its potential negative consequences.

My concern is not that there will be too much resistance from the public but that there will be too little resistance, too little criticism, across society. Apart from experts issuing warnings in specialized conferences and publications, and apart from creators and editors worrying about their IP, there is little awareness of this technology among the general audience. Most mainstream public media convey a very positive, sometimes nearly enthusiastic image of our common glorious future with AI. Non-expert users, average people, will easily adopt this not-yet-reliable, not-yet-safe, not-yet-regulated, not-actually-controllable technology simply because it is cheap, trendy, handy and apparently efficient. I wish there were more public discussion of and opposition to this technology, a resistance allowing time to set up some regulations and to instill good practices and safety rules in all users.


Marcus writes, "We won’t get to a billion personal robots if they are as dodgy as driverless cars, frequently working, yet stymied often enough by outliers that we can’t fully count on them."

According to NPR stories, two companies plan to have driverless semi tractor trailers on selected roads in Texas by the end of the year. So, how to think about that?

Yes, we can't fully count on driverless vehicles. But as compared to what? A quick trip on any interstate highway reveals that MANY human drivers are completely content to tailgate us at 75 mph. And tailgating isn't really an adequate description; it's often more like NASCAR drafting. Vast numbers of human drivers truly don't care about anybody's safety, including their own.

So the question isn't, are driverless vehicles perfect? The question instead is, can driverless vehicles, on average, equal or exceed the quality of human drivers?

It seems we can apply this common-sense logic to many things about AI. It's not enough to simply point out AI's flaws; we should be comparing those flaws to human flaws.

As one example, many people claim that AI text content is pretty low quality. Well, as compared to what? Have those making such claims experienced social media, the largest content trash pile in human history, all generated by humans?
