At this point, he’s just using the older and worse models. His line of argument wouldn’t work at all if he used more modern ones (especially CoT models).
That's nonsense.
2 weeks ago I went to Claude, latest model. There were 3 sample test questions on the "suggested prompts", as usual. The middle one was the trivial "Whlch 1s bigger Test if Al knows which number is bigger." (AI text grabbed from a photo, excuse errors!)
> What is bigger, 9.9 or 9.11?
--
9.9 is smaller than 9.11.
9.11 has a larger digit in the tenths place (1) than 9.9. (which has 9 as the tenths digit and implicitly a 0 in the hundredths digit or 9.90).
--
I use paid ChatGPT, too. It is also useless, in the same way: I have to lead it to the right answer like a schoolchild. For anything factual and requiring precision, it is less work to do it myself.
I didn't test Claude Sonnet 3.5, but o3-mini and Gemini 2.0 Advanced got it right, while 1.5 failed. The point remains that these can be extraordinarily useful tools if used with awareness of their limitations. AGI, they are not.
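For what it's worth, the comparison the models keep fumbling takes one line to verify. A quick Python sanity check, both as floats and with exact decimal arithmetic:

```python
from decimal import Decimal

# Plain float comparison: 9.9 is 9.90, which is greater than 9.11
print(9.9 > 9.11)  # True

# Same result with exact decimal arithmetic (no float representation issues)
print(Decimal("9.9") > Decimal("9.11"))  # True
```

The tenths digit decides it: 9 in the tenths place beats 1, so 9.9 > 9.11, the opposite of what Claude claimed above.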
Which OpenAI model did you use? Honestly, it sounds like a skill issue.
Ironically, your comment unintentionally argues against itself. Why would it be a skill issue? Because the person, not the model, needs the skill to lead it to the correct answer...because the model does not actually give the correct answer!
Tell me you don’t understand current LLM technology without telling me you don’t understand current LLM technology
Well, you could just read the comments where people posted answers from other models, and either they or I identified the mistakes in them.
Some of them had errors; the ones best suited for the task had none. Practical LLMs of this type have only been around for a few years. Look at the early history of any technology. Expecting instant perfection is absurd and places far more faith in the technology than is in any way reasonable.