35 Comments
Feb 18, 2023 · edited Feb 18, 2023 · Liked by Gary Marcus

Wow. This is music to my ears because it agrees with what I've been saying for many years.

The brittleness of deep neural nets is not unlike that of the rule-based expert systems of the last century. If either of these systems is presented with a new situation (or even a slight variation of a previously learned situation) for which there is no existing rule or representation, the system will fail catastrophically. Adversarial patterns (edge cases) remain a big problem for DL. They are the flies in the DL ointment. Deep neural nets should be seen as expert systems on steroids.

The only way to get around the curse of dimensionality is to generalize. Unfortunately, DL only optimizes, which is the opposite of generalizing. That's too bad.

Thank you for another interesting, informative and insightful article.

Feb 18, 2023 · Liked by Gary Marcus

Again, what a beautiful example. I don't think such a trick, stepping outside a ('deep') ML system's trained comfort zone, would work against more classical chess engines like Deep Blue (which, by the way, has some issues of its own, probably less serious ones). But this is really beautiful.

Feb 18, 2023 · Liked by Gary Marcus

It is interesting, but not surprising, that AIs have different weaknesses than humans. And, of course, our intimate knowledge of how the AIs work gives us a leg up on finding these weaknesses. We're nowhere near as far along with human brains: we know the limitations of human senses pretty well, but not much beyond that. It seems likely that any algorithm for a game with perfect information, where a total solution is nonetheless computationally intractable, will have vulnerabilities that can be exploited. That probably also has big ramifications for AI warfare.


On the one hand, this is quite shocking, given the impact of the Go AI victory. On the other hand, maybe not. Feedforward networks, whether they have 2 or 100 layers, are function approximators (e.g., see Hornik et al., 1989, Neural Networks, 359–366).

In turn, function approximation rests on the Weierstrass approximation theorem from the 19th century, which guarantees good approximation only on closed and bounded subsets of the input domain. Informally, it is clear why: you need information about the function to be approximated, which is always finite, hence limited, and that information has to cover the approximated interval in a sufficiently 'dense' manner.

As a consequence, there will always be information outside the approximated interval, or in less densely covered parts of it, for which the behavior of the network is uncertain (in fact arbitrary, see below).

Of course, you can try to fix this by retraining the network etc. But this is a bit like a dog chasing its own tail. The Weierstrass theorem guarantees that it will always be possible to fool the network. E.g. see https://www.researchgate.net/publication/273525175_Computation_and_dissipative_dynamical_systems_in_neural_networks_for_classification
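To make the point concrete, here is a minimal sketch of my own (not taken from the linked paper): a small feedforward network fit on a bounded interval approximates the target well inside that interval, but nothing constrains it outside. The architecture, interval, and target function are arbitrary illustrative choices.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)

# Training data covers only the bounded interval [-pi, pi].
x_train = rng.uniform(-np.pi, np.pi, size=(2000, 1))
y_train = np.sin(x_train).ravel()

net = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=2000, random_state=0)
net.fit(x_train, y_train)

# Inside the interval the data covered, the approximation is decent...
x_in = np.linspace(-np.pi, np.pi, 5).reshape(-1, 1)
print("max error inside :", np.abs(net.predict(x_in) - np.sin(x_in).ravel()).max())

# ...outside it, the network's behaviour is unconstrained by any data,
# so the error is typically far larger.
x_out = np.linspace(3 * np.pi, 4 * np.pi, 5).reshape(-1, 1)
print("max error outside:", np.abs(net.predict(x_out) - np.sin(x_out).ravel()).max())
```

Retraining on a wider interval only moves the boundary; there is always an "outside" left over.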

Feb 21, 2023 · Liked by Gary Marcus

This is not like finding out that Goliath can be killed by a slung rock, which is actually reasonable; it is like finding out that Balder, who cannot be hurt by any weapon in the nine realms, can be killed by mistletoe.

Feb 19, 2023 · Liked by Gary Marcus

Isn’t part of the current confusion, more generally, that AI is getting better and better at showing BEHAVIOUR similar to humans in many areas, even though this says nothing about underlying awareness or intelligence?

In a way, that is the engineering way of looking at the world: if a system behaves the same as a real-world phenomenon on a test set, it is considered a good enough explanation.


Amen to your amen, Gary.


I played Go for a long time. About 4 dan amateur at my peak, so not really strong, but I definitely know how to play. Plenty of competitions, and I watched lots of professional games with professional commentary. To me, this example is a bit dumb: clearly the computer has won. Maybe I never understood the rules. Also, maybe none of the people I played against in competitions, or at the local Go club, understood the rules either. Maybe the referees at our competitions never understood the rules. If you have a few stones inside the opponent's area, you need to try to make an eye; your opponent might even ignore your first move, because you have no chance of making a live group (two eyes are needed). I’m pretty sure that if I called the referee over at the end of a game like this, the ref would agree. The whole Go club would be perplexed if I’d "lost".


“...once again we’ve been far too hasty to ascribe superhuman levels of intelligence to machines”... Not even superhuman. Human or animal :)


Since the human neocortex is just six layers deep while deep learning models are hundreds of layers deep, the models must be massively inefficient right now. More specifically, the models are massively overdetermined, with a rat's nest of built-in exception handling. There is lots of evidence in the press that this is rapidly changing as more and more methods are explored to characterize and prune the models, as if we're now developing hyper-topographical maps of the models, or meta-meta-meta-....-models, if you will, that will let us home in on the most efficient paths to the goals.
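For the "characterize and prune" part, here is a minimal sketch of one common technique (magnitude pruning, using PyTorch's built-in pruning utilities); the layer sizes and the 80% pruning fraction are purely illustrative assumptions, not anything from the article.

```python
import torch.nn as nn
import torch.nn.utils.prune as prune

# A small illustrative model; the sizes are arbitrary.
model = nn.Sequential(nn.Linear(512, 256), nn.ReLU(), nn.Linear(256, 10))

# Zero out the 80% of weights with the smallest magnitude in each Linear layer.
for module in model.modules():
    if isinstance(module, nn.Linear):
        prune.l1_unstructured(module, name="weight", amount=0.8)
        prune.remove(module, "weight")  # fold the mask in, making the pruning permanent

total = sum(p.numel() for p in model.parameters())
zeros = sum((p == 0).sum().item() for p in model.parameters())
print(f"{zeros}/{total} parameters zeroed out")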


It has long been known that RL trained against an opponent will learn to do well on the distribution of board states that are visited by that opponent. Shifting to a different opponent can reveal new game states where the RL system plays badly. When RL is trained against a copy of itself, this distribution of board states shifts over time and can become overly focused on the small part of the state space where the RL system is doing particularly well. The result can be similar to what happens with moose: they grow huge antlers to defeat other moose, but the weight of the antlers makes them vulnerable to other attacks.

Of course AlphaGo revealed that human players also had blind spots, because they had only been playing against other human players. This is why AlphaGo could defeat them. The lesson is that in problems with massive state spaces, it is very difficult to find a solution that works well across the entire space.

There are steps we can take to improve the robustness of these systems. An obvious one is to train against an ensemble of opponents with a wide range of different behaviors. We can also train an ensemble of systems to each be strong while collectively exhibiting diversity of approaches.
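As a toy sketch of that last idea (sampling from an ensemble of differently behaved opponents rather than training against a single fixed one), the loop below trains a trivial bandit-style learner on rock-paper-scissors against a small opponent pool. The game and the learner are stand-ins chosen for brevity, not anything from AlphaGo or the article.

```python
import random
from collections import defaultdict

MOVES = ["rock", "paper", "scissors"]
BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}

# A small ensemble of fixed opponents with very different behaviours.
opponents = [
    lambda: "rock",                                # always plays rock
    lambda: random.choice(MOVES),                  # plays uniformly at random
    lambda: random.choice(["paper", "scissors"]),  # never plays rock
]

# A trivial learner: running average reward per move, epsilon-greedy choice.
value = defaultdict(float)
count = defaultdict(int)

def learner_move(eps=0.1):
    if random.random() < eps or not count:
        return random.choice(MOVES)
    return max(MOVES, key=lambda m: value[m])

for episode in range(20000):
    opponent = random.choice(opponents)  # a different opponent each episode
    mine, theirs = learner_move(), opponent()
    reward = 1 if BEATS[mine] == theirs else (-1 if BEATS[theirs] == mine else 0)
    count[mine] += 1
    value[mine] += (reward - value[mine]) / count[mine]

print({m: round(value[m], 2) for m in MOVES})
```

A real system would use a proper RL learner and far richer opponents, but the structural point is the same: the distribution of situations the learner ever sees is only as broad as the ensemble that generates them.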


The terms more or less overlap. Say that you've always used a certain neural pathway to, for example, pick up a glass of water. Then you have a stroke that kills off many or most of the motor neurons you need to do that. Your brain can likely create new pathways to accomplish that task, by recruiting alternative sets of neurons that weren't previously involved in picking up a glass of water. That's an example of brain ~plasticity~: it ~learned~ a new way to do a task.

Think of plasticity as flexibility of function. Learning happens when a behavior is modified. Learning requires flexibility/plasticity. Not sure I've helped here....


Gary writes that “deep learning ... is not always human like”.

I wish we would all stop making any comparison at all between today’s AI and humans. We must not train lay people to overestimate the humanity of robots.

An LLM is as human-like as a rubber sex doll: it’s convincing for those who want to believe they’re in the company of another human. And to be fair, that’s almost all of us, for we are evolved to perceive humanity, to anthropomorphise.

No neural network can self-reflect. No LLM can explain what it thought was gorilla-like when failing to recognize photos of black people. Or what it thought it saw when it failed to recognise an airplane on the tarmac.

The ability to explain oneself is one faculty we should INSIST on before ascribing human qualities to any machine.

No AI for now can truly learn from its important mistakes. An image analyzer that misclassifies people or crashes a car into an emergency vehicle does not appreciate the real depth of that error. The only way for experience to make its way into the artificial neural network is for designers to intervene (like gods) and change the rules.


Excellent new/revised paper by Murray Shanahan on the dangers of anthropomorphizing LLMs. One of the clearest descriptions of LLMs I’ve seen.

arxiv.org/pdf/2212.03551.pdf


Would this approach work against something trained with MuZero's self-play style of training?


Great article. Nice to see Google suffering setbacks here and there: first with ChatGPT against Microsoft, and now their darling DeepMind AlphaGo against an intelligent human American computer scientist.
