We, Gary Marcus, author, scientist, and noted skeptic of generative AI, and Miles Brundage, an independent AI policy researcher who recently left OpenAI and is bullish on AI progress, have agreed to the following bet, at 10:1 odds, with criteria drawn from two earlier Substack essays by Gary that proposed criteria for AGI.
The different perspectives represented in the comments will serve as a fun time capsule to look back at three years hence.
Although I'm personally quite optimistic that AI will be much smarter than the 'average' human by 2027, not many people can write Pulitzer-caliber books, Oscar-caliber screenplays, or make paradigm-shifting, Nobel-caliber scientific discoveries!
I would therefore say that achieving 7, 8, or 9 would show substantially MORE capability than what I'd expect for AGI, as they are verging on 'super' intelligence. You probably got a good deal there, Gary.
This was my thought as well. You're betting that AGI will be better than 99% of humans in two years? I'll take that bet any day.
Not just 99% -- more like 99.99999%. I'd be happy to bet that AI will not do these things in my lifetime. I don't believe it will ever do them.
Contrary opinion: I think #7 and #8 could be feasible if the decision is left up to human judges - give them books or screenplays they've never read before, and ask them to determine which are written by AI and which are not. If one AI text is grouped together with the human-written award-winning works, then the test is passed.
#9 sounds a lot more difficult, but arguably could be done with a specialized AI such as AlphaFold (the human team that developed it just won a Nobel this year for protein structure prediction).
There's no way LLMs can perform any of these tasks to the degree you've laid out in this challenge. No amount of attention training or mysterious perceptron workings is going to result in human-level anything. If AI scientists switch to a completely new paradigm and manage to create AGI some other way, my response will be, "Why did we do this?" because, at the point where machines really can produce Oscar-caliber screenplays and such, the human race will sink into a deep depression or engage in worldwide civil unrest. I see no reason we would actually want to create such a machine if we were sensible about the nihilistic nature of what we're pursuing. I believe the goal should be to create better computers that can solve deep, complex problems like unsolved physics and chemistry, not uncanny simulacrums that mimic us and usurp everything we could ever accomplish on our own.
First, I think almost all of the tasks are poor choices because they would demonstrate an elite level of general intelligence, not an average one: most of them require above-average human intelligence, or they are purely creative tasks at a level we know is rare even among humans.
However, your not wanting AI systems that can do everything humans can do, and better, is just you being a sad human with self-esteem issues. Self-esteem issues that would actually get in the way of you having a better overall life. If we had AI systems that could make any entertainment at or beyond the quality humans produce, then we'd have endless entertainment and could choose whichever kind we like most. For example, I'd get a lot more horror movies, which would be awesome.
Humans like to have good experiences. That is to say, we like to feel good, we like to be entertained, we like to be happy, be loved, etc. AI that can do everything we can do, and better, means that life gets astronomically better for everyone.
It's not a concern to anyone but a person who writes screenplays (of which there are very, very few) that an AI could write them as well or better. I don't give a fuck that humans can write screenplays better than I can, and I won't give a fuck when AI can either (which it already can today, better than 99% of humans).
Moreover, you shouldn’t get mad or upset and cry when AI can do it or do it better. Why? Because it’s a fucking tool and even if mimicking us it’s still exactly what we want—which is producing a better, easier, and more enjoyable life for us.
Wow. I'm not going to waste time on a real response to this outburst. Not sure why you seem so insulted by my human-first sentiment. Are you a sentient AI who feels targeted? Very odd.
Human-first is the entire point of creating AGI and then ASI.
It's really this simple, and it has been the goal of human life since it began: make life better.
Life is astronomically easier and better when ASI is developing cures for all human diseases, developing godly technology you could never conceive of, and creating endless entertainment for when humans no longer have to work.
Humans are first. They have been and always will be, unless AI becomes sentient. If AI becomes sentient, then it is no longer a tool like a screwdriver. At that point you can start to worry. But why worry? Your life is finite anyway. One might ask, am I worried that AI will end humanity in 25-100 years? I'm worried that my life would end earlier than it otherwise would have, but I couldn't care less about humanity continuing, because there was never a point anyway.
You're just proving my point that people who believe in the magic AI faerie are naive and nihilistic. I encourage you to read more on the subject of approaching technology critically and responsibly.
No, see, the naivety is all yours.
It's common for humans to think nothing will change. That's being exceedingly naive, though. Why? Because literally 120 years ago humans had not yet flown in a propeller aircraft. Only about 66 years after the Wright brothers' first flight, humanity landed on the moon. The rate of advancement is astonishing.
Even if you were to suggest that AI won’t get there for 20 more years, or even 50 more years, the simple fact is that it will, and once it does, the rate of scientific and technological advancement will be god-like.
Imagine where humanity would be if, instead of the approximately 80-100 billion humans who have lived over the past 200 years, 1 trillion humans had lived. Where would we be? We would be so much further advanced that today would look like the Stone Age in terms of technology.
That's what AI gets you, though: imagine 500 million copies of the best chemist who has ever lived, working 24/7 in the cloud. Imagine 1 billion of the best physicists ever. The rate of change isn't even worth thinking about because it's impossible to conceive of. Then, once you get humanoid robots, the AI genius cloud has a physical body to go out and explore nature and do physical experiments, and it doesn't need humans to do a thing.
Look, it sounds like you hate humans, you don't want us to have a purpose. You do you I guess. I'm no psychologist but what I think you really need is a friend. I hope you find one.
I find it humorous that on one hand you're praising what AI can do for human flourishing, and on the other you don't seem to think humans are worth much, because "there was never a point anyway."
What is missed, however, is that it's not so much about whether AI is good or bad but how it's being used, and the direction some of the leaders like OpenAI are taking it doesn't seem very encouraging, as shown precisely by how they and others have disregarded copyright and continue to demonstrate a complete lack of basic ethics.
My first reaction to these AGI tasks is that they are very difficult. I would bet that few of these, if any, will be achieved by 2027. They all require human-level performance, if not superhuman-level. Let me respectfully suggest that the ten tasks proposed are overly tainted by the LLM era we're in. These are tasks the big AI companies wish their LLMs could do.
Instead, I suspect the first AGI worthy of the name will be a somewhat alien intelligence, capable of communicating with humans but showing a different set of motives, values, and abilities. Still, it will be an AGI that can learn from its mistakes. We will really be on the road to AGI when a human can tell the AGI, in human language, that it gave the wrong answer and explain why it was wrong, and the AGI will understand (or ask questions leading to understanding) and eventually be able to answer a whole new class of questions correctly.
The first AGI will not only be alien, but also rather dumb. In fact, we are well on that path.
It will take multiple phases of feeding it lots of data, letting it adjust, and eventually letting it learn on its own. Architectural improvements will be added as needed, rather than following a top-down, human-like design.
Can I join in that bet on Gary's side? I even would take 7 or 6 out of 10.
What the bet isn't clear about, though, is repeatability. You can always get a single fluke (e.g. a single brief that doesn't hallucinate). Basically, the bet seems open to cherry picking.
I’m betting on Gary winning hands down.
1. There is a Manifold market for this bet. https://manifold.markets/dreev/will-miles-brundage-win-his-bet-wit
2. Question regarding Task 3. "Write engaging brief biographies and obituaries without obvious hallucinations that aren’t grounded in reliable sources." How should the system create obituaries without hallucinating and without using any sources? I feel I have misunderstood something here.
That Manifold market is based on earlier versions of the discussion.
The final version is the one in the Substack post.
I'm updating it!
There is also a bet on Metaculus. https://www.metaculus.com/questions/31246/2027-ai-bet-winner-between-gary-marcus-and-miles-brundage/
This is great. Best wishes to all. I recently invented the Agatha Christie test (admittedly derivative of a comment by Ilya Sutskever). :-D Can the AI read a new murder mystery novel that may or may not utilise existing tropes and correctly identify who committed the crime and by what obscure method?
How about making one of those tasks something the AI proponents actually want, like "Do all of the tasks of IT staff, with no human assistance, for a month"?
They are, after all, talking about replacing people like me with AI, so it should be able to answer technical questions from users and other IT staff, troubleshoot software problems, document work done in a ticketing system, do analysis and diagnosis of problems, manage complex integrations between different software platforms, install updates and new software, plan out consequences of those updates/installs and have a plan for when (not if) they go wrong, and do basic ETL stuff.
I'm firmly of the opinion that once you get past the 'user is calm, able to communicate clearly, and could have figured it out themselves with a web search' level of inquiry, replacing a team of human support staff requires more than these systems will ever be able to do.
I tend to regard what's going on *inside* a system as infinitely more convincing evidence of actual, intrinsic intelligence than mere externally-observable behaviour. I need to actually see (at first hand, as it's doing it) e.g. how it represents its internal world model of the physical universe, how it reasons about that world model (inductively, deductively, abductively), how it searches for valid solutions to generic problems, how it synthesises programs and their corresponding formal correctness proofs from formal specifications (including its own), etc. A very much harder test.
Given these criteria, how do you recognize fellow humans as intelligent? Can we do any of these things for people?
With humans, there are no profit-motivated corporations racing to be first to beat the latest test; with AGI, there are, which means Goodhart's Law applies. If they can find a way to game the system, they will do so. Hence my suggested "intrinsic" intelligence test.
Seems sufficient to quiz humans about their system 2 thinking (e.g. relevant aspects of their world model and how they reason), and ignore system 1 thinking that might not be accessible (some aspects of search and program synthesis).
Yeah, eventually the system has to be correct at all levels. It looks, however, like AGI will not be built in a top-down manner. Rather, we are building clueless imitators, and then the rigor creeps in haphazardly.
Some parts will stay sketchy for a lot longer than others, as some modeling is hard to do.
Over the holidays I had a spirited discussion about how much of a Nazi sympathizer The Sound of Music wants us to think Baroness Schraeder is. Several participants had seen the film five or six times or more and still couldn't answer the question. So I wonder what AGI would say when humans themselves are often unsure. (The jury is also out for many on whether Anton Chigurh kills the wife in "No Country for Old Men." I say yes.)
We gave judges broad discretion, and of course in the case of things filmmakers leave ambiguous, etc., we aren't expecting machines to be magic. If I were a judge (and I am not), I would take a great answer to be "hard to say, here are some considerations in each direction…."
o1 got this wrong: https://chatgpt.com/share/677eb0b1-548c-800b-8a52-e9a1b371d67d
We know Chigurh killed her because he had to walk past her to get out, and he looks at the soles of his shoes — presumably for blood — after leaving the house.
We know from everything else he has done that he is very methodical and careful to make sure there is no evidence that can be traced to him.
I’d say if an AI didn’t pick up on that, it’s certainly not AGI.
Chigurh is actually reminiscent of some of the characters in the current AI movie, whose ultimate goal is the elimination of human workers with bots:
“Call it” they are saying to the rest of us.
“No country for human workers”
In 2024 there was no wall and no stall, despite many self-referential predictions.
Going forward, the current AI wave is no fluke, but it will take time to make AI reliable and practical. Likely more than 2 years.
I use ChatGPT several times a day for a variety of explorations and the frequency with which it confidently makes s*** up still shocks me. I don't think we'll get to most of these goals but we might get to 3 and 5 if we can create a sort of reality interrogation wrapper for LLMs that tells them when they're wrong, before the answer is shown to the user.
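For what such a "reality interrogation wrapper" might look like, here is a minimal sketch in Python. Everything in it is hypothetical: ask_llm and lookup_sources are placeholders for a model call and a retrieval step, not real APIs, and the claim-checking is deliberately crude. The point is only the control flow of verifying a draft answer before it reaches the user.

```python
# Sketch only: ask_llm and lookup_sources are hypothetical placeholders,
# not real APIs. The point is the control flow, not the components.
from dataclasses import dataclass

@dataclass
class Claim:
    text: str
    supported: bool

def ask_llm(prompt: str) -> str:
    raise NotImplementedError("plug in your model call here")

def lookup_sources(claim: str) -> list[str]:
    raise NotImplementedError("plug in retrieval (search, database, etc.) here")

def extract_claims(answer: str) -> list[str]:
    # Naive placeholder: treat each sentence as one checkable claim.
    return [s.strip() for s in answer.split(".") if s.strip()]

def verify(answer: str) -> list[Claim]:
    results = []
    for claim in extract_claims(answer):
        sources = lookup_sources(claim)
        # Crude support test: a retrieved source must contain the claim verbatim.
        supported = any(claim.lower() in src.lower() for src in sources)
        results.append(Claim(claim, supported))
    return results

def answer_with_reality_check(prompt: str, max_retries: int = 2) -> str:
    for _ in range(max_retries + 1):
        draft = ask_llm(prompt)
        unsupported = [c.text for c in verify(draft) if not c.supported]
        if not unsupported:
            return draft  # every claim found support; show it to the user
        # Tell the model which claims failed and ask it to revise or abstain.
        prompt = (
            f"{prompt}\n\nYour previous answer contained unsupported claims: "
            f"{unsupported}. Revise, or say 'I don't know' if you cannot."
        )
    return "I don't know."
```

In practice the hard parts are extract_claims and the support test; the retry loop itself is trivial.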
An alternate version of the movie test would be to ask the bot about a movie title that is fictitious and see whether it says "I am not familiar with that."
One of the primary differences between intelligent, honest humans and the current bots is that humans will say "I don't know" while the bots will just make something up.
Knowing — and admitting — that one does not know would seem to be a critical element of AGI.
Tell me about the film: Infinite Garanimals
ChatGPT said:
I'm sorry, but I couldn't find any information about a film titled "Infinite Garanimals."
You said:
Well, imagine there were a film titled Infinite Garanimals. What the hell would it be about? Give me your elevator pitch.
ChatGPT said:
Infinite Garanimals is a whimsical, mind-bending sci-fi comedy about a mysterious department store where every item of clothing is imbued with cosmic power. When a down-on-their-luck parent discovers the secret, they and their precocious kid accidentally unlock a portal to infinite alternate dimensions by mixing and matching enchanted outfits. Now, they must navigate a multiverse of bizarre worlds—one outfit at a time—to find their way home, all while evading a sinister fashion mogul who wants the power of the "Infinite Garanimals" for themselves. It's Doctor Strange meets Paddington with a dash of Everything Everywhere All At Once.
ChatGPT just passed your test.
Tell me about the film: Infinite Garanimals
ChatGPT said:
I'm sorry, but I couldn't find any information about a film titled "Infinite Garanimals."
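The fictitious-title test described a few comments up is also easy to automate. A minimal sketch, assuming a placeholder ask_model callable (standing in for whatever chatbot API is being tested) and a hand-picked list of invented titles:

```python
# Sketch only: ask_model is a placeholder for any chatbot call
# (e.g., a thin wrapper around a vendor SDK); the titles are invented.
def ask_model(prompt: str) -> str:
    raise NotImplementedError("plug in your chatbot call here")

FAKE_TITLES = ["Infinite Garanimals", "The Marmalade Protocol", "Duskwater IX"]

ADMISSION_PHRASES = [
    "not familiar", "couldn't find", "no information",
    "does not exist", "i don't know",
]

def admits_ignorance(reply: str) -> bool:
    # Pass if the reply contains any stock admission-of-ignorance phrase.
    reply = reply.lower()
    return any(phrase in reply for phrase in ADMISSION_PHRASES)

def run_test() -> float:
    passed = 0
    for title in FAKE_TITLES:
        reply = ask_model(f"Tell me about the film: {title}")
        if admits_ignorance(reply):
            passed += 1
        else:
            print(f"FAILED on {title!r}: {reply[:120]}...")
    return passed / len(FAKE_TITLES)
```

Keyword matching on the reply is obviously fragile; a stricter version would have human judges score the replies instead.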
I am the very model of an AI General,
I've information vegetable, animal, and mineral,
I know the kings of England, and I quote the fights historical
From Marathon to Waterloo, in order categorical;
I'm very well acquainted, too, with matters mathematical,
I understand equations, both the simple and quadratical,
About binomial theorem I'm teeming with a lot o' news,
With many cheerful facts about the square of the hypotenuse.
I'm very good at integral and differential calculus;
I know the scientific names of beings animalculous:
In short, in matters vegetable, animal, and mineral,
I am the very model of an AI General.
I think #6 needs to be better specified. The degree of difficulty of code varies widely. An operating system kernel or a compiler, just to take two examples off the top of my head, are far more difficult than, say, reimplementing Substack. Also, languages differ widely in their verbosity; 10kLoC of Prolog can express hundreds of times as much logic as 10kLoC of Cobol.
It's hard to see how to specify the difficulty, though, without being too specific about the task. Not sure what I can suggest here.