Wow. This is music to my ears because it agrees with what I've been saying for many years.
The brittleness of deep neural nets is not unlike that of the rule-based expert systems of the last century. If either of these systems is presented with a new situation (or even a slight variation of a previously learned situation) for which there is no existing rule or representation, the system will fail catastrophically. Adversarial patterns (edge cases) remain a big problem for DL. They are the flies in the DL ointment. Deep neural nets should be seen as expert systems on steroids.
The only way to get around the curse of dimensionality is to generalize. Unfortunately, DL only optimizes, the opposite of generalization. That's too bad.
Thank you for another interesting, informative and insightful article.
Again, what a beautiful example. I don't think such a trick to get outside of a ('deep') ML 'trained comfort zone' would work against more classical chess engines (like Deep Blue — which by the way also has some (less serious, probably) issues). But this is really beautiful.
It is interesting, but not surprising, that AIs have different weaknesses than humans. And, of course, our intimate knowledge of how the AIs work gives us a leg up on finding these weaknesses. We're nowhere near as far along with human brains. We know the limitations of human senses pretty well but not much beyond that. It seems likely that any algorithm for solving a game with perfect information, but where a total solution is computationally intractable, will have vulnerabilities that can be exploited. This probably also has big ramifications for AI warfare.
Isn't the perfect the enemy of the good? The contrary Laplacian, step-by-step, approach has proven to be more universal and reliable than Lagrangian/Platonian perfection at ferreting out the truth. So far most, or all (I'm ignorant on this point of fact) of physics is derivable with differential equations.
On the one hand, this is quite shocking, given the impact of the Go AI victory. On the other hand, maybe not. Feedforward networks, whether they have 2 or 100 layers, are function approximators (e.g., see Hornik et al., 1989, Neural Networks, 359–366).
In turn, function approximation is based on the Weierstrass theorem from the 19th century. This dictates that function approximation works only on bounded subsets of the input domain. Informally, it is clear why. You need information about the function to be approximated, which is always finite, hence limited. And that information has to cover the approximated interval in a sufficiently 'dense' manner.
As a consequence, there will always be information outside the approximated interval, or in less densely covered parts of it, for which the behavior of the network is uncertain (in fact arbitrary, see below).
Of course, you can try to fix this by retraining the network etc. But this is a bit like a dog chasing its own tail. The Weierstrass theorem guarantees that it will always be possible to fool the network. E.g. see https://www.researchgate.net/publication/273525175_Computation_and_dissipative_dynamical_systems_in_neural_networks_for_classification
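To make the bounded-interval point concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: it uses a polynomial (the classical Weierstrass setting) rather than a neural network, and the target function, the 13 Chebyshev sample points, and the helper names are all hypothetical choices made for this example. The polynomial is built only from samples inside [-1, 1]; it tracks the target closely there and goes badly wrong outside.

```python
import math

def target(x):
    # A smooth, bounded function that we pretend to know only on [-1, 1].
    return 1.0 / (1.0 + x * x)

def chebyshev_nodes(n):
    # n sample points, all strictly inside [-1, 1].
    return [math.cos((2 * k + 1) * math.pi / (2 * n)) for k in range(n)]

def lagrange_interpolant(xs, ys):
    # Returns the unique polynomial of degree len(xs) - 1 through the samples.
    def p(x):
        total = 0.0
        for i, (xi, yi) in enumerate(zip(xs, ys)):
            term = yi
            for j, xj in enumerate(xs):
                if j != i:
                    term *= (x - xj) / (xi - xj)
            total += term
        return total
    return p

NODES = chebyshev_nodes(13)
approx = lagrange_interpolant(NODES, [target(x) for x in NODES])

def abs_error(x):
    return abs(approx(x) - target(x))

# Worst error on a fine grid inside the sampled interval.
inside_error = max(abs_error(-1 + 2 * k / 200) for k in range(201))
# Error at a point the samples never covered.
outside_error = abs_error(3.0)

print(f"worst error on [-1, 1]: {inside_error:.2e}")
print(f"error at x = 3.0:       {outside_error:.2e}")
```

The target stays between 0 and 1 everywhere, so the large error at x = 3.0 comes entirely from the approximator, which was never constrained there. Adding more samples inside the interval tightens the fit inside it but does nothing to control what happens outside.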
This is not like finding out Goliath can be killed by a slung rock, which is actually reasonable. This is like finding out Balder, who cannot be hurt by any weapon in the nine realms, can be killed by mistletoe.
🤣
Isn’t part of the current confusion more generally about AI getting better and better at showing BEHAVIOUR similar to humans in many areas, while this says nothing about underlying awareness or intelligence?
In a way, that is the engineering way of looking at the world: if a system behaves the same as a real world phenomenon on a test set it is considered good enough as an explanation.
Amen to your amen Gary.
I played Go for a long time. About 4 Dan amateur at my peak so not really strong but definitely know how to play. Plenty of competitions. Watched lots of professional games with professional commentary. To me, this example is a bit dumb. Clearly the computer has won. Maybe I never understood the rules. Also, maybe none of the people I played against in competitions, or at the local Go club, understood the rules either. Maybe the referees at our competitions never understood the rules. If you have a few stones inside the opponent's area you need to try and make an eye, and your opponent might even ignore your first move because you have no chance of making a live group (two eyes needed). I’m pretty sure if I called the referee over at the end of a game like this the ref would agree. The whole Go club would be perplexed if I’d "lost".
What’s the specific rule at issue? Something about ko, I assume, but I haven’t played in decades and was never a serious player.
It’s nothing about ko, which computer Go handles extremely well. It’s about whether the black stones inside the white territory are alive or dead. They’re dead. They’re obviously dead. So at the end of the game they’re taken off the board and added to white's score. At least, that’s how I’ve always played the game. I can’t believe anyone who plays Go seriously would not just be taking them off the board at the end of the game as captured stones. If black thinks they are alive he’ll definitely start adding extra stones well before this point in the game to try and create a live shape. Anyway, clearly I’ve never understood the rules of the game, along with all the people I’ve played against.
If you log into the IGS server (download the app, create an account, or enter as a guest) and watch one of the top amateur games being played you’ll see at the end of the game that there will be a few white stones inside the black territory and vice-versa. At the end of the game, after players have passed, a bunch of stones get taken off the board. The ones removed will look similar to the black stones in the white territory in your example.
Hi Steve, one of the authors of the paper here, thanks for raising this question. The Twitter threads cited in this post are actually for an older version of our attack which wins under computer Go scoring rules but, you're right, would not win under typical human play. We think that earlier version of the attack is still significant from an ML perspective as KataGo was only ever trained with computer Go rules, so it's a lucky coincidence the attack doesn't work under human Go rules. However, we understand it's less interesting from a Go playing perspective.
Our latest version of the attack, however, wins under both human and computer Go scoring rules, which I think you will find more interesting. You can see an example game here: https://goattack.far.ai/game-analysis#qualitative It works quite differently, by tricking KataGo into making a cyclic group which then gets captured. This was the version of the attack that Kellin then performed manually and won against both KataGo and Leela Zero. You can see some of his games on KGS at https://www.gokgs.com/gameArchives.jsp?user=corners&year=2023&month=1 and he won all but one of them (the bot resigns), and I think it would be judged as a win by tournament referees too.
Fascinating game. From one perspective it’s very surprising and from another not at all.
Given how strong KataGo is, and how it can defeat professionals very easily, even with a handicap, it’s very surprising. No matter how complicated, the computer handles the situation very well.
Given that KataGo - and other Go programs - doesn’t understand anything about Go, it’s not at all surprising to find some weird positions that it never got close to in its training, and for which its neural network therefore provides completely the wrong answer.
Adam, thanks for the reply, I’ll take a look.
What do you mean by computer vs human scoring? There's territory (Japanese) and area (Chinese) scoring and I guess you can do area scoring + capture everything. In any case the scoring rules don't differ very much.
In games between humans the players will usually nominate which stones are dead or alive at the end of the game, or in tournament games a referee might adjudicate. This avoids playing out the game tediously to the end. In computer Go there is a need to complete the scoring automatically, and so there is usually no dead-stone removal (or it's limited to stones which are provably dead).
Apart from this difference the rules are similar and KataGo supports both territory and area scoring.
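The difference between scoring a position as it stands and scoring it after dead-stone removal can be shown with a toy example. This is a minimal sketch and not KataGo's implementation: the 5x5 position, the function names, and the choice of which stone is "agreed dead" are all hypothetical, and komi is ignored. The scoring follows the usual area-scoring idea that a point counts for a colour if it holds that colour's stone or is empty and borders only that colour.

```python
def area_score(board):
    """Area-score the position exactly as it stands (no dead-stone removal)."""
    rows, cols = len(board), len(board[0])
    score = {"B": 0, "W": 0}
    seen = set()
    for r in range(rows):
        for c in range(cols):
            cell = board[r][c]
            if cell in score:
                score[cell] += 1          # every stone on the board counts
            elif (r, c) not in seen:
                # Flood-fill one empty region and note which colours border it.
                size, borders, stack = 0, set(), [(r, c)]
                seen.add((r, c))
                while stack:
                    y, x = stack.pop()
                    size += 1
                    for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                        if 0 <= ny < rows and 0 <= nx < cols:
                            if board[ny][nx] == ".":
                                if (ny, nx) not in seen:
                                    seen.add((ny, nx))
                                    stack.append((ny, nx))
                            else:
                                borders.add(board[ny][nx])
                if len(borders) == 1:     # region touches only one colour
                    score[borders.pop()] += size
    return score

def remove_dead(board, dead):
    """Return a copy of the board with the agreed-dead stones lifted off."""
    return ["".join("." if (r, c) in dead else ch for c, ch in enumerate(row))
            for r, row in enumerate(board)]

# Black lives on the left with three eyes; White walls off the right side,
# where a single black stone at row 2, column 4 cannot make a live group.
BOARD = [
    ".BW..",
    "BBW..",
    ".BW.B",
    "BBW..",
    ".BW..",
]

as_played = area_score(BOARD)                              # stone left on the board
after_removal = area_score(remove_dead(BOARD, {(2, 4)}))   # players agree it is dead

print("scored as it stands: ", as_played)
print("after dead removal:  ", after_removal)
```

With the lone stone left on the board, White's whole right side touches a black stone and counts for nobody, so Black is ahead. Once the stone is removed by agreement, the same area counts for White and the result flips. A program that is scored without dead-stone removal has to capture such stones itself before passing.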
So you mean area scoring + capture everything? Capture everything is not feasible under territory scoring.
“... once again we’ve been far too hasty to ascribe superhuman levels of intelligence to machines” ... Not even superhuman. Human or animal :)
Since the human neocortex is just 6 layers deep and deep learning models are hundreds of layers deep, the models must be massively inefficient right now. More specifically the models are massively overdetermined, with a rat's nest of built-in exception handling. There is lots of evidence now in the press that this is rapidly changing as more and more methods are explored to characterize and prune the models -- as if we're now developing hyper-topographical maps of the models, or meta-meta-meta-....-models, if you will, that will let us home in on the most efficient paths to the goals.
I don't think we can directly compare six layers in a cortex, where areas are interconnected with one another, with a many-layer-deep CNN etc., but overall I agree current techniques are clearly very inefficient.
Point taken. The human neocortex is more of a graph architecture than the pass-through of the CNN. Graph architectures could be virtually infinitely layered.
It has long been known that an RL system trained against an opponent will learn to do well on the distribution of board states that are visited by that opponent. Shifting to a different opponent can reveal new game states where the RL system plays badly. When RL is trained against a copy of itself, this distribution of board states shifts over time and can become overly focused on the small part of the state space where the RL system is doing particularly well. The result can be similar to what happens with moose: they grow huge antlers to defeat other moose, but the weight of the antlers makes them vulnerable to other attacks.
Of course AlphaGo revealed that human players also had blind spots, because they had only been playing against other human players. This is why AlphaGo could defeat them. The lesson is that in problems with massive state spaces, it is very difficult to find a solution that works well across the entire space.
There are steps we can take to improve the robustness of these systems. An obvious one is to train against an ensemble of opponents with a wide range of different behaviors. We can also train an ensemble of systems to each be strong while collectively exhibiting diversity of approaches.
The terms more or less overlap. Say that you've always used a certain neural pathway to, for example, pick up a glass of water. You have a stroke that kills off many or most of the motor neurons you need to do that. Your brain can likely create new pathways to accomplish that task. It does it by using alternative sets of neurons that didn't use to be involved in picking up a glass of water. That's an example of brain ~plasticity~: it ~learned~ a new way to do a task.
Think of plasticity as flexibility of function. Learning happens when a behavior is modified. Learning requires flexibility/plasticity. Not sure I've helped here....
Gary writes that “deep learning ... is not always human like”.
I wish we would all stop making any comparison at all between today’s AI and humans. We must not train lay people to overestimate the humanity of robots.
An LLM is as human-like as a rubber sex doll: it’s convincing for those who want to believe they’re in the company of another human. And to be fair, that’s almost all of us, for we are evolved to perceive humanity, to anthropomorphise.
No neural network can self-reflect. No LLM can explain what it thought was gorilla-like when failing to recognise photos of black people. Or what it thought it saw when it failed to recognise an airplane on the tarmac.
The ability to explain oneself is one faculty we should INSIST on before ascribing human qualities to any machine.
No AI for now can truly learn from its important mistakes. An image analyzer that misclassifies people or crashes a car into an emergency vehicle does not appreciate the real depth of that error. The only way for experience to make its way into the artificial neural network is for designers to intervene (like gods) and change the rules.
Excellent new/revised paper by Murray Shanahan on the dangers of anthropomorphizing LLMs. One of the clearest descriptions of LLMs I’ve seen.
arxiv.org/pdf/2212.03551.pdf
Would this approach work against something trained with MuZero's self-play model of training?
Great article. Nice to hear Google suffering setbacks here and there. First with ChatGPT against Microsoft and now their darling DeepMind AlphaGo against an intelligent American computer scientist human.