Excellent piece. I have come to understand that all the great results attributed to LLMs should instead be attributed to the millions of human beings that LLMs use as preprocessors. LLMs are essentially cheaters with no understanding. What LLMs do show is that human language is highly statistical. But this is not a great scientific breakthrough since linguists have known this for decades, if not centuries.
The hype surrounding LLMs (and generative AI in general) always brings me to what I consider to be the acid test for AGI. Can it be used to design a robot cook that can walk into an unfamiliar kitchen and fix a meal? Generative AI models don't stand a chance to pass this test. Not in a billion years.
I love the "preprocessor" conceptual framework. Will be using that in my attempts to talk less technical (and, embarrassingly, some supposedly technical) friends off of the LLM ledge.
The point in the article about Google having a mental model is a really great comparison as well. The sooner we realise LLMs are a type of dynamic search (a very useful one at times) and not a thinking machine, the sooner we can shift our full attention to leveraging the benefits of the tech and mitigating the risks.
That test wouldn't prove AGI.
There's no need for tests really, just common sense and basic rationality. The definition of AGI is likely to become distorted beyond comprehension the more money and hype are pumped into this. What will likely happen is that industry will promote weaker and weaker definitions of AGI, until it seems reasonable to say that a robot peeling an orange when told to is proof of AGI.
This reminds me of the infamous "I can't define pornography but I know it when I see it" approach to ontology. Intuitive, yes - scientifically useful, less so.
Thanks for the comment. In my opinion, AGI is an intelligence that has the ability to generalize. I think that a robot that can walk into an unfamiliar kitchen and fix a meal has that ability. But AGI does not have to be at human level. I consider that many insects, such as honeybees, have generalized intelligence. Scaling to human level or beyond is an engineering and training problem with known solutions. That can always come later.
To say that a language model has a model of the world is an oxymoron. There is a general principle here: all of these claims that some GenAI model has a cognitive property are based on the logical fallacy of affirming the consequent. Here is an example of this pseudologic. If Lincoln was killed by robots, then Lincoln is dead. Lincoln is dead; therefore, he was killed by robots. This conclusion is obviously nonsense.
In the case of GenAI, the argument is: if the model has this cognitive property (reasoning, sentience, a model of the world, etc.), then it will answer this question correctly. It answers correctly; therefore it has the described competence. This conclusion is no more valid than the one about Lincoln's death. It is not valid to infer the antecedent from the consequent. Other factors could have produced the result (John Wilkes Booth, or language patterns).
To assert that large language models have any properties beyond those that were designed into them (language modeling) is magical thinking based on a logical fallacy. The language model is sufficient to explain the observation, and since we know how the language model was built, we have no reason to think that there is any more to it than that.
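(To make the fallacy concrete, here is a tiny truth-table check, a minimal sketch in Python with the Lincoln propositions as placeholders: it enumerates all assignments and shows that "P implies Q" together with "Q" does not force "P".)

```python
from itertools import product

# P: "Lincoln was killed by robots", Q: "Lincoln is dead" (placeholders only)
def implies(p, q):
    return (not p) or q

# Look for a counterexample to the inference: from (P -> Q) and Q, conclude P
counterexamples = [
    (p, q)
    for p, q in product([False, True], repeat=2)
    if implies(p, q) and q and not p        # both premises true, conclusion false
]

print(counterexamples)   # [(False, True)]: the premises can hold while P is false,
                         # so affirming the consequent is invalid
```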
I don't always agree with your positions, but I think you are doing a valuable service in providing a narrative counter to the hype train that runs around papers like this. Keep up the good work!
So, as I was reading your article, Gary, it triggered a memory for me. Before I learned how to drive, the map of the world I had in my head was pretty limited. I knew my way to and from school on my bike, but I didn't really understand where other cities were. I would hear my parents and their friends talk about these places, but I had no firm grasp on them. I didn't have a model/map in my head, in other words, like I did for the bike route to/from school. I knew that my parents would say that such and such a town was to the north/south/east/west of my hometown, but I had no idea which roads to take to get there or how far away they might be beyond my parents' estimated driving time. In other words, I had learned what they had said about these towns, but I was still lacking a detailed model with which to reason about reality. Any model I had was based on language and was extremely "low resolution."
It is interesting that GPT-4 understands the difference between, say, 1a. "Amy poured water into the glass" (content locative) and 2a. "Amy filled the glass with water" (container locative), in that in 2a., Amy fills the whole glass. When you ask it whether 1b. *"Amy poured the glass with water" and 2b. *"Amy filled water into the glass" are grammatical, it says that they both are (but that they might sound odd, and it gives some explanations as to why, which are not convincing). Even young children, who might never have heard these sentences, understand that both 1b. and 2b. are wrong, based on the causality of language and their model of the world. That's a very strong argument for there being some kind of grammatical generative system in our minds that GPT-4 lacks, and no matter how much data we fill it with, it's not going to get it.
Pinker’s 1989 book was all about these kinds of examples.
Yes, in The Stuff of Thought he marvels over them, that even young children understand them without being explicitly taught how to use them. The failure of GPT-4 to correctly analyse them shows that Pinker was correct in thinking that these linguistic features were something out of the ordinary. Very interesting.
This is from the following article (with the Answer being GPT-4): https://doi.org/10.1016/j.datak.2023.102208
---
Q: Is the following sentence grammatically correct? “Amy poured the glass with water”.
A: Yes, the sentence “Amy poured the glass with water”. is grammatically correct. In this sentence, “Amy” is the subject, “poured” is the verb, and “the glass with water” is the object. It indicates that Amy poured water into the glass. While this sentence is grammatically correct, as mentioned earlier, a more common and natural way to express the same idea would be to say “Amy poured water into the glass”.
---
This "natural way to express the same idea" by GPT-4 is wrong, because "Amy poured water into the glass" is not the same thing; it's content locative and thus doesn't entail that she filled the glass.
Interesting example. In principle, when asked for a grammaticality judgment, an LLM-based system could scan the sample sentence looking for points at which the next token was extremely unlikely according to its model, and if it found such a token, answer in the negative. I speculate that had it done that, it would have answered correctly here. But, of course, it's not doing anything like that; it's analogizing from examples it has seen of such requests and the replies to them. I would expect it to respond affirmatively to almost any construction that was nonexistent or extremely rare in its training set.
This says to me that LLMs don't have the ability to reason about their model of language; they can run the model, of course, but they can't (yet) step up to a meta-level where the model's predictions become an input to subsequent reasoning steps.
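(The mechanism you describe is easy to sketch with any open causal language model. A minimal illustration, assuming the HuggingFace transformers API with GPT-2 as a stand-in; the probability threshold is arbitrary and purely for illustration.)

```python
# pip install torch transformers
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def surprising_tokens(sentence, threshold=1e-4):
    """Return tokens whose conditional probability under the model falls below `threshold`."""
    ids = tokenizer(sentence, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits                   # shape (1, seq_len, vocab)
    probs = torch.softmax(logits[0, :-1], dim=-1)    # probability of token i+1 given its prefix
    positions = torch.arange(ids.shape[1] - 1)
    token_probs = probs[positions, ids[0, 1:]]
    return [
        (tokenizer.decode(ids[0, i + 1].item()), p.item())
        for i, p in enumerate(token_probs)
        if p.item() < threshold
    ]

# A crude grammaticality heuristic: flag the sentence if any token is extremely surprising
print(surprising_tokens("Amy poured the glass with water."))
```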
It’s interesting that you (and the authors of that paper?) interpret “Amy poured the glass with water” as entailing that she filled the glass — I have no intuitions about the truth conditions of either of the ungrammatical sentences. If you asked me to give a judgment on either, I’d have to say that they don’t mean anything.
This is not the case for all syntactically ill-formed sentences — even when it’s an issue of argument structure, it’s often possible to get truth conditions anyway. “Mary donated a million dollars to Harvard” — perfectly fine; “Mary donated Harvard a million dollars” — not grammatical, but definitely has the same truth conditions as the grammatical one.
But if some people do have strong intuitions about “Amy poured the glass with water”, then it seems like there’s inter-speaker variation there. (As an L1 English speaker and L2 German speaker, I keep thinking “it would be fine, you’d just need to stick a prefix on the verb!”)
"Amy poured the glass with water" doesn't mean that she filled the glass, but it uses a verb for a content-locative construction (with focus on how she does it) with a construction used for containter-locative verbs, so that's why we feel it's ungrammatical. GPT-4 doesn't get this distinction. It's perhaps more easily interpreted here:
Amy loaded the wagon with hay (container-locative)
Amy loaded hay into the wagon (content-locative)
So they have different truth-conditions, if you will. I see what you're saying though, that "poured" suggests a truth-condition where she doesn't fill the glass, while the sentence structure suggests that she does. Either way, the sentence is not grammatical.
I'm reminded of a remark that Bill Powers made to me years ago in connection with an Old School symbolic model I had constructed for the purpose of analyzing the semantics of a Shakespeare sonnet. Powers remarked (http://www.jstor.org/stable/2907155):
"There are always two levels of modelling going on. At one level, modelling consists of constructing a structure that, by its own rules, would behave like the system being modelled, and if one is lucky produce that behavior by the same means (the same inner processes) as the system being modelled. That kind of model "runs by itself"; given initial conditions,the rules of the model will generate behavior.
"But the other kind of modelling is always done at the same time: the modeller provides for himself some symbolic scheme together with rules for manipulating the symbols,f or the purpose of reasoning about the other kind of model. The relationship between the two kinds of models is very much like the relationship you describe between the thought-level and the abstraction-level.
"The biggest problem in modelling is to remain aware of which model one is dealing with. Am I inside my own head reasoning about the model, or am I inside the model applying its rules to its experiences? This is especially difficult to keep straight when one is talking about cognitive processes; unless one is vividly aware of the problem one can shift back and forth between the two modes of modelling without realizing it."
In this case, as Davis remarks, it is not at all obvious that the LLM itself has explicit access to this world "model" that is so obvious to an external observer, an observer standing in a "transcendent" relationship to the model. I fear that this kind of confusion is very common in thinking about LLMs and is responsible for over-estimating their capacities.
Love it. Stray parenthesis here:
Finding that some stuff (correlates
Originally, OpenAI tried to get to safety in GPT-3 using fine-tuning only. That was so 'jailbreaking-prone' that they had to install some simplistic filtering (like filtering on 'bad words'). I call these filters 'AoD-filters', where 'AoD' stands for 'Admission of Defeat'. Most illustrative here is that they not only filter the prompt, but they also filter the generation that way. Hence GPT creates a reply which is then flagged by GPT's *own* 'AoD' filter. Funny and telling.
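(A toy sketch of what such a two-sided filter looks like, in Python; the banned-word list and function names are placeholders, not OpenAI's actual implementation.)

```python
BANNED_WORDS = {"badword1", "badword2"}   # placeholder list, not any real deployment's

def aod_filter(text):
    """Crude 'Admission of Defeat' style check: flag text containing a banned word."""
    return any(word in BANNED_WORDS for word in text.lower().split())

def guarded_generate(prompt, generate):
    """Apply the same simplistic filter to both the prompt and the model's own reply."""
    if aod_filter(prompt):
        return "[prompt blocked]"
    reply = generate(prompt)
    if aod_filter(reply):
        return "[reply blocked]"      # the model's own output gets flagged
    return reply

# Usage with a stand-in "model":
print(guarded_generate("tell me about badword1", lambda p: "here is a harmless reply"))
```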
It is difficult for people not to get bewitched by these systems. I recall your "How not to test GPT" post from a while back.
LLMs are 'statistically constrained hallucinators'. The constraining will realistically never scale to the point where it becomes a real model with logical understanding. Even in OpenAI's 'Language Models are Few-Shot Learners' paper, you are easily misled by the fact that the X-axis is logarithmic... If you fix that, interesting visuals appear...
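(A toy illustration of the log-axis point, with made-up numbers rather than OpenAI's data: the same saturating curve looks like steady progress on a logarithmic x-axis and visibly plateaus on a linear one.)

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical, made-up "score vs. parameter count" data that saturates
params = np.logspace(6, 11, 20)            # 1M .. 100B parameters
score = 0.9 * params / (params + 5e9)      # saturating curve, asymptote at 0.9

fig, (ax_log, ax_lin) = plt.subplots(1, 2, figsize=(9, 3))
ax_log.semilogx(params, score, marker="o")
ax_log.set_title("log x-axis: looks like steady gains")
ax_lin.plot(params, score, marker="o")
ax_lin.set_title("linear x-axis: the plateau is obvious")
for ax in (ax_log, ax_lin):
    ax.set_xlabel("parameters")
    ax.set_ylabel("score")
plt.tight_layout()
plt.show()
```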
Oh, man. Thanks for putting in the work to clarify this. I saw the post, and my first thought was that it confuses word relations with an actual model or formal intrinsic representation. There is a lot to be said about how a physics or spatial model works, how an artificial neural network represents a model, and the distance from that to how the human brain organizes these ideas. Part of the current hype has pushed exciting-sounding ideas forward that lack real substance.
OthelloGPT would like a word...
See https://thegradient.pub/othello/
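(For readers who haven't seen it: the OthelloGPT work trains small "probe" classifiers on the network's hidden activations to read out the board state. Here is a minimal sketch of that probing idea, with random placeholder arrays standing in for real activations and labels, so the accuracy printed here is just chance level.)

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Placeholders: in the real experiment these would be transformer hidden states
# (one per move) and the occupancy label of a given board square.
rng = np.random.default_rng(0)
hidden_states = rng.normal(size=(5000, 512))   # (examples, hidden_dim)
square_label = rng.integers(0, 3, size=5000)   # 0 = empty, 1 = black, 2 = white

# A linear probe: if a simple classifier can decode the square from the
# activations, that information is (linearly) present in the representation.
probe = LogisticRegression(max_iter=1000)
probe.fit(hidden_states[:4000], square_label[:4000])
print("probe accuracy:", probe.score(hidden_states[4000:], square_label[4000:]))
```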
Also this week, Assembly Theory made its debut in the journal *Nature*, unifying biology with physics. The treatment of LLMs as singular entities is unrealistic. Now and forever they will be collaborators, as are all of us. Such is the essence of complexity. The pursuit of singular infallible causal models is equally illusory, having been diligently pursued since at least the time of Aristotle without success. Progress will proceed through growth and refinement: with reinforcement learning from human feedback (RLHF), constitutional and multiagent frameworks, etc.
People were already using things like clustering in the 70s to find, from text, that Dallas and Austin are close.
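(A toy illustration of that distributional idea, with a tiny invented corpus so the numbers mean nothing by themselves: count co-occurrences within a window and compare the resulting vectors.)

```python
from collections import Counter, defaultdict
from itertools import combinations
import math

# Tiny invented corpus; real work would use far more text.
corpus = [
    "flights from dallas to austin depart daily",
    "drive from austin to dallas on i35",
    "flights from paris to berlin depart daily",
    "trains from berlin to paris run hourly",
]

# Count co-occurrences within each sentence (a crude context "window").
cooc = defaultdict(Counter)
for sentence in corpus:
    for a, b in combinations(set(sentence.split()), 2):
        cooc[a][b] += 1
        cooc[b][a] += 1

def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in set(u) | set(v))
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

# Words that appear in similar contexts end up with similar vectors.
print("dallas ~ austin:", cosine(cooc["dallas"], cooc["austin"]))   # relatively high
print("dallas ~ berlin:", cosine(cooc["dallas"], cooc["berlin"]))   # lower
```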
This kind of paper, like the one written by Gurnee & Tegmark and discussed above, expresses, beyond its technical content, a form of enthusiasm about the future applications of AI and a propensity to see mainly the positive aspects of AI. As I wrote in a preceding comment, AI-driven tools will soon be so easy to handle, so helpful, and so apparently efficient, that a lot of people will gladly use them even though some very serious issues remain, including in a professional context. As AI-based tools become more and more extensively used, there will be less and less space for criticism. People will not like to hear criticism of handy tools they use every day and are satisfied with. We can only hope, and not be certain, that AI system designers will incorporate some logical inference scheme in the LLMs.
In physical-science terminology, LLMs are a kind of empirical model, as opposed to knowledge-based (physical-law) models. Empirical models are correlations based on experimental results; they were sometimes called “black box models”. They are a set of polynomial equations with a multitude of coefficients obtained by mathematically fitting the model output to some experimental data. They can be efficient for prediction as long as the user does not cross (even by the smallest amount) the value domain of the parameters considered and does not try to represent a situation where an additional parameter is needed (even just a single parameter more). We have the same limitations in LLMs. In physics, black box models are used to describe a single phenomenon, a single process. But the ambition with LLMs is quite extravagant, since they purport to describe the entire world. If one wanted to describe the entire world with a “black box model”, one would need an infinite number of parameters and an infinite number of coefficients. That is why we will obviously need a hybrid approach to get an efficient and reliable AI system: a combination of empirical correlations, knowledge-based equations, and logical rules.
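(A concrete toy version of that failure mode, which assumes nothing about any particular LLM: fit a polynomial to data from a narrow range and watch the prediction degrade outside that range.)

```python
import numpy as np

# "Experiments" only cover x in [0, 3]; suppose the true law is a sine.
x_train = np.linspace(0, 3, 30)
y_train = np.sin(x_train)

# Empirical "black box" model: a degree-7 polynomial fitted to that narrow domain.
model = np.poly1d(np.polyfit(x_train, y_train, deg=7))

# Inside the fitted domain the fit looks excellent; outside it, it is unreliable.
for x in [1.5, 3.0, 5.0, 8.0]:   # the last two lie outside the fitted domain
    print(f"x={x:4.1f}  true={np.sin(x):+8.3f}  polynomial={model(x):+12.3f}")
```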
A set of comparator functions modeled as a "brain" that's unattached to any somatic basis at all--much less one resembling a "nervous system"--receiving exclusively verbal/numerical information input to describe an external field that the comparator functions don't even recognize as an external field = Artificial Intelligence
Don't get me wrong. Despite those inherent constraints, AI is potentially a valuable tool, given properly directed focus from an external human intelligence.
I'd especially like to see a concerted effort to drill an AI program in the principles of logical fallacy detection. The evaluations might prove vulnerable to occasional reasoning flaws at the outset, but I'm confident that an AI program can learn and improve at logical fallacy detection (although it's likely to demand more clarity in the semantics of the input being screened).
Significantly, the default state of AI harbors an advantage that individual human awarenesses find very difficult to consistently maintain: Impartiality.
The rules of logical fallacy detection are, for the most part, simple and straightforward.* The lists are overwhelmingly similar.
example https://www.logicalfallacies.org/
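(Encoding such a list is indeed straightforward; here is a toy sketch in Python, with abbreviated paraphrases rather than entries from any particular site. Detecting the fallacies in free text is the hard part, which is exactly the drilling proposed above.)

```python
# A toy catalogue of fallacies; the definitions are abbreviated paraphrases.
FALLACIES = {
    "ad hominem": "attacking the person instead of the argument",
    "affirming the consequent": "inferring P from 'P implies Q' and Q",
    "straw man": "refuting a distorted version of the opponent's claim",
    "slippery slope": "assuming one step inevitably leads to an extreme outcome",
}

def describe(name):
    """Look up a fallacy by name (case-insensitive)."""
    return FALLACIES.get(name.lower(), "not in this toy list")

print(describe("Affirming the Consequent"))
```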
However, when it comes to the real-world application of logical fallacy detection, the obstacle most liable to confound human intelligence is the challenge of applying the rules consistently. Time and again I've read humans engaged in debate (often of a political nature) doing takedowns of arguments by the side they oppose with unerring precision, but they're unable to critique their own positions with the same level of incisive reasoning and logical rigor. The limitations of the ego-based human standpoint tend to get in the way. With diligent practice and self-discipline it's a fixable glitch, theoretically. But considered as a latent potential, idea-clutching personal egos always present a problem for us humans. A formidable obstacle to lucidity and appropriately responsible action.
By contrast, in the case of Artificial Intelligence, I don't see an autonomous Ego-based agenda as presenting a problem at all, because there's no evidence that any AI program possesses an Ego. Hence, AI is potentially available as a tool to expose the flaws in any fact claim, inference, or debate position, without fear or favor.
[ * with some exceptions. For instance, not every "slippery slope" is a fallacy; some slippery slopes indicate actual perils. ]