20 Comments
Dec 9, 2023 · Liked by Gary Marcus

“But the Turing test cuts both ways. You can't tell if a machine has gotten smarter or if you've just lowered your own standards of intelligence to such a degree that the machine seems smart. If you can have a conversation with a simulated person presented by an AI program, can you tell how far you've let your sense of personhood degrade in order to make the illusion work for you?"

― Jaron Lanier

When most people are idiots, the Turing Test is an easy match for Artificial Idiots?


The Turing test really tests the gullibility of the human interrogators.

Dec 9, 2023 · Liked by Gary Marcus

The Turing test is based on the idea that humans are not easy to fool. It turns out humans — even the most intelligent ones — are very easy to fool.

The Turing test is part of the belief system that humans are actually intelligent in an *absolute* sense (not just in a relative sense, compared to other species on earth). But humans — as is becoming clearer from modern psychological research and from what we see regarding conspiracy theories and the like — are actually very easy to fool. So the test was a good idea; the assumption behind it, however, is not.

Humans believe that their convictions come from their observations and reasoning, and they have high trust in both (hence the attractiveness of the Turing test). But the reverse is more often true: our convictions steer our observations and our reasoning. That we believe in powers we do not have is probably a necessary side effect of our intelligence, which in that case is somewhat self-limiting.


"Let that sink in. ELIZA (built in 1965-1966) beat GPT 3.5. That’s embarrassing! 1966 software that could easily run on my watch running competitively with multi-million GPU clusters trained on a large fraction of the internet."

Yep, that's EXACTLY what I was thinking when I read that. Ha! And your watch could run ELIZA and never break a sweat. The average smartwatch is more powerful than the most powerful supercomputer of the 1980s, never mind 1966.

Dec 9, 2023 · edited Dec 9, 2023 · Liked by Gary Marcus

Spot on, Gary.

To wit, in 1979 I wrote a chatbot that was very revealing to my teenage mind: a take on ELIZA called DR. CHALLENGER (since I didn't have access to the source code of Weizenbaum's program; btw, the source code of DR. CHALLENGER is available online now). Like ELIZA, it played a non-directive psychotherapist. It totally fooled my dad, who was amazed and projected intelligence and awareness into it. My sister-in-law, on the other hand, more shrewd and skeptical, immediately elicited responses from the program that were, to her mind, obviously canned, fake, and mechanical.
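For readers unfamiliar with how such programs worked: the ELIZA family relied on keyword rules and canned "reflection" templates, not on any model of meaning. A minimal sketch of that technique (illustrative only; not Weizenbaum's ELIZA or DR. CHALLENGER, whose actual scripts differ):

```python
import random
import re

# Keyword rules: a regex and a set of canned response templates.
# The captured fragment is echoed back, which is the whole trick.
RULES = [
    (r"\bi am (.*)", ["Why do you say you are {0}?", "How long have you been {0}?"]),
    (r"\bi feel (.*)", ["Why do you feel {0}?", "Do you often feel {0}?"]),
    (r"\bmy (.*)", ["Tell me more about your {0}."]),
]
DEFAULTS = ["Please go on.", "I see.", "What does that suggest to you?"]

def respond(line: str) -> str:
    """Return a canned 'therapist' reply for one line of user input."""
    for pattern, templates in RULES:
        match = re.search(pattern, line.lower())
        if match:
            return random.choice(templates).format(match.group(1))
    return random.choice(DEFAULTS)  # no keyword matched: stall politely

print(respond("I am worried about my exams"))
```

A few dozen rules like these were enough to fool some interlocutors, while a skeptical one (like the sister-in-law above) quickly exposes the canned patterns.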

This experience led me on a multi-year quest to understand and possibly engineer a real "mind" (artificial intelligence), or a "beauty computer" to make art, etc., and to find out what the heck intelligence and consciousness were. Long story short, I failed at the first task (though I learned a great deal about human minds, not to mention computers and software, in the process), as *no one knows* what intelligence is or where creativity comes from, though many claim they do – and in the course of decades, I found out consciousness was real.

As I commented earlier, the mechanical nature of ChatGPT-4 was clearly and suddenly revealed when I tasked it with helping me edit a re-wording of Atma-Darshan for modern English readers: a philosophical text by an Indian writer named Atmananda Krishna Menon from the early 20th century, who wrote about consciousness (ironically, or aptly enough). What it revealed in a flash of clarity was that this system has *zero* real understanding of meaning, but is brilliantly designed to be an impressive spinner of textual language patterns, crafted on the fly according to one's prompting. But it will never rise above its true nature as that; putting bits together from parts this way does not equal intelligence-consciousness…


It's rather depressing to realize that the best way to pass the Turing test is for the machine to master some of humans' least admirable traits. Your comprehension test sounds much better, though judging from recent PISA scores and what we see on university campuses, a substantial fraction of humans would fail that too. Maybe machines and humans are in fact getting closer, but with the convergence happening from both sides...

Dec 9, 2023 · Liked by Gary Marcus

Unfortunately arguments around performance levels at any point in time are always subject to being superseded. I agree that passing the Turing Test is/will be a “so what” moment though.


The fact that ELIZA scored so well says more about the human mind than the artificial mind.

I wrote a longer piece on the paper you referenced, and I feel that with better prompting GPT-4 could probably have scored close to human performance.

Funnily enough, according to the study, people are able to consistently tell they are talking to other people only about 64% of the time.

For me the results do tell us something interesting, namely that computers are getting close to passing as humans in conversation. Not only is that an impressive feat, it's also worrisome, as AI imposters will become a real thing in the not-so-distant future.


Tasks much simpler than a full Turing test seem to be enough to fail these models. A demo I just uploaded:

https://github.com/0xnurl/gpts-cant-count
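The idea behind probes like this is simple: counting has a mechanical ground truth, so a model's answer can be checked exactly. A minimal sketch of such a check, where `model_answer` is a placeholder for a parsed model reply (the linked repo's actual tasks may differ):

```python
def count_char(text: str, ch: str) -> int:
    """Ground truth: case-insensitive count of ch in text."""
    return text.lower().count(ch.lower())

# Example probe: "How many times does 's' appear in 'Mississippi'?"
expected = count_char("Mississippi", "s")

model_answer = 4  # placeholder for the parsed model output
print("pass" if model_answer == expected else "fail")
```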


As always, you can "fool" the public, but can you fool a philosopher? Does it matter, in the sense that if people think a machine is intelligent, conscious, and understands, they will behave accordingly?


ELIZA beat GPT 3.5... yeah right.

That alone should tell you there's something very wrong with how the test was conducted.


Claims about ELIZA vs GPT-3.5 are just as silly as assertions that LLMs are somehow intelligent.

"GPT-3.5, the base model behind the free version of ChatGPT, has been conditioned by OpenAI specifically not to present itself as a human, which may partially account for its poor performance. " https://arstechnica.com/information-technology/2023/12/real-humans-appeared-human-63-of-the-time-in-recent-turing-test-ai-study/

LLMs are much better "parrots" than ELIZA: they can create original language rather than just following a simple rulebook. But of course recent language-generation progress, while a big advance, is just one component of an AI system.


Yes, let's use the Turing test, which was developed before AI was even a field of research, and during a period when, once research *did* begin, they literally thought they could make a thinking machine in a single summer.

The mythology of the Turing test as some objective, rigorous method is laughable.


Did Sam pass the Turing Test?


I've always thought that a rule needed to be added to the Turing test requiring the human in the loop to be an AI expert who is willing and able to probe the limits of the human-or-AI at the other end of the line. Maybe this is what Turing had in mind, but there were no AI experts back then, and he probably didn't realize how good an autocomplete built 65 years later would be.


I think these comments about the Turing Test are about a rather simplified -- and very different -- test from the one proposed at the beginning of Turing 1950. Turing's _first_ suggestion really wasn't as simple/stupid an idea as what people now call "the Turing test."

I have a lot of papers on Turing, but the one that clearly lays out the differences between the two tests, and shows that they yield different experimental results, is "Turing's Two Tests for Intelligence" (S. G. Sterrett, 2000, Minds and Machines).
