A couple of days ago I gave a whole bunch of reasons to doubt Hinton’s confident but problematic claims that GenAI systems don’t store things and do understand them.

More evidence against his claims just keeps piling up.

One piece of evidence comes from a new study, which I plan to discuss in a future essay, that examines how sensitive LLMs are to minor perturbations, such as rearranging the order of answers in multiple-choice questions, inserting special characters, or changing the answer format. As it turns out, minor changes make a noticeable difference, enough to rearrange “leaderboard” rankings. If a system really had a deep understanding, trivial perturbations wouldn’t have such effects.
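To make the answer-reordering perturbation concrete, here is a minimal sketch in Python. It is not the study's code or method. The question, the two stand-in "models" (`first_option_picker` and `content_picker`), and all function names are hypothetical, chosen only to show why a score that depends on answer position says little about understanding.

```python
from itertools import permutations


def reorder(options, answer_idx, perm):
    """Reorder the options by perm; return (new_options, new_answer_idx)."""
    new_options = [options[i] for i in perm]
    # The correct answer now sits wherever its old index appears in perm.
    return new_options, perm.index(answer_idx)


def first_option_picker(question, options):
    """Hypothetical stand-in for pure position bias: always picks option 0."""
    return 0


def content_picker(question, options):
    """Hypothetical stand-in that picks by content, wherever the answer sits."""
    return options.index("Mercury")


def accuracy_under_reorderings(model, question, options, answer_idx):
    """Fraction of all orderings of the options on which the model is right."""
    perms = list(permutations(range(len(options))))
    hits = 0
    for perm in perms:
        shuffled, new_idx = reorder(options, answer_idx, perm)
        if model(question, shuffled) == new_idx:
            hits += 1
    return hits / len(perms)


# Illustrative question (an assumption, not taken from any benchmark).
question = "Which planet is closest to the sun?"
options = ["Mercury", "Venus", "Earth", "Mars"]
answer_idx = 0

original_correct = first_option_picker(question, options) == answer_idx
biased_accuracy = accuracy_under_reorderings(
    first_option_picker, question, options, answer_idx
)
content_accuracy = accuracy_under_reorderings(
    content_picker, question, options, answer_idx
)
print(original_correct, biased_accuracy, content_accuracy)
```

In this toy setup the position-biased stand-in gets the original ordering right but is correct on only a quarter of the 24 reorderings, while the content-based stand-in is unaffected. Real models fall somewhere in between, which is exactly why reordering can shuffle a leaderboard.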

Suffice it to say for now, it’s not a win for LLMs and understanding.

In the meantime, in the “a picture is worth a thousand words” department, here's a very clever example, which speaks strongly yet concisely against everything Hinton was trying to say:

The example highlights three things:

1. GenAI systems do effectively store things. (In this case, canonical video game controller images.)

2. They are attracted to those effectively-stored representations, particularly (I would presume) frequent ones. They (often) don’t come up with a novel independent design that would meet the requirements. (The “Italian plumber” examples I have discussed, in which GenAI systems tend to gravitate towards Nintendo’s trademarked Mario character, illustrate the same thing.)

3. They fail to “understand” simple concepts, such as “designed to be used with only one hand.” This should remind you of my older essays “Horse Rides Astronaut” and “Race, statistics, and the persistent cognitive limitations of DALL-E.” Noncanonical situations remain difficult for generative AI systems, precisely because frequent statistics are a mediocre substitute for deep understanding.

I will leave the last words this time to the well-known ML expert François Chollet:

