Ah yes. Another fine example of Statistical Human Imitation Technology or S.H.I. -- you get the picture.
In support of the difficulty of getting Black doctors treating white kids via DALL-E:
I have run some tests on gender and color representation around doctors and nurses and found it very difficult to get images reflecting the prompt and the context. The generated images have always made the nurses female, young, and pretty, and made diverse (i.e. Black or female) doctors also young and pretty, with glasses. Not the more realistic scenarios we were looking for.
I'm sure that with persistence and reprompting you can get the results needed, but if generative AI is supposed to be making work faster... that's only the case when you are cool with stereotyped results.
And all the graphic design gig workers I have previously used are now giving me the same AI-generated images and articles, which I could produce myself. THAT is getting very problematic.
The succinct comment of A. Renske A. C. Wierda on DALL-E a while back was "DALL-E stinks for anything except an astronaut on a horse". (And on ChatGPT claiming that GAI was a step to AGI: "Yeah. That's what Reddit thinks...").
I gave my talk at EACBPM in London last Tuesday (video in post-production) on "What everyone should understand about ChatGPT and friends", and these examples from DALL-E would have fit perfectly there.
The question is not, I guess, 'will the ChatGPT fever break?' but rather 'how long will it take?'. And the fever will not completely break (as with NFTs, FTX, etc.), because while these GAI systems have trouble being reliable, they have far fewer problems being creative (just crank up the randomness). So there will be actual use cases, both legit (creativity support) and problematic (spam/infowar/noise).
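On "just crank up the randomness": concretely, that is a temperature applied to the sampling distribution. A minimal, self-contained sketch with toy logits (not any particular model's API) of how raising the temperature trades predictability for variety:

```python
import numpy as np

def sample_with_temperature(logits, temperature, rng):
    """Sample one candidate index from logits softened or sharpened by a temperature."""
    scaled = np.asarray(logits, dtype=float) / temperature
    scaled -= scaled.max()                        # numerical stability
    probs = np.exp(scaled) / np.exp(scaled).sum()
    return rng.choice(len(probs), p=probs)

rng = np.random.default_rng(0)
logits = [4.0, 2.0, 1.0, 0.5]                     # toy scores for four candidate tokens

for t in (0.2, 1.0, 2.0):
    draws = [sample_with_temperature(logits, t, rng) for _ in range(1000)]
    freq = np.bincount(draws, minlength=len(logits)) / 1000
    print(f"temperature={t}: {np.round(freq, 2)}")  # low t -> nearly always the top pick, high t -> spread out
```

At a low temperature the sampler almost always returns the top candidate; at a high one it spreads across all of them, which is roughly the reliability-versus-creativity trade-off described above.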
If true semantics is needed, it will be delivered. Eventually. There is a research focus on 'semantic segmentation' in text-to-image (TTI) and text-to-video (TTV) and their reverses. But why do I feel compelled to speak of 'true' semantics? There seems to be a strong tendency, in news media and among researchers alike, to reduce the meaning of terms like AI, GAI, semantics, and ontology to what tech can now deliver rather than what it cannot yet. I think that lowering of the bar can retard progress by setting sights too low.
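For readers who haven't met the term: semantic segmentation assigns a class label to every pixel, nothing more. A minimal sketch using a pretrained torchvision model (a random tensor stands in for a real, normalized photo), just to make concrete how far that is from 'true' semantics:

```python
# Sketch: what "semantic segmentation" delivers today - one class label per pixel.
# Uses a pretrained torchvision DeepLabV3 model; the 21 classes are the Pascal VOC set.
import torch
from torchvision.models.segmentation import deeplabv3_resnet50

model = deeplabv3_resnet50(weights="DEFAULT").eval()
image = torch.rand(1, 3, 256, 256)       # stand-in for a normalized RGB image
with torch.no_grad():
    logits = model(image)["out"]         # shape: [1, 21, 256, 256]
labels = logits.argmax(dim=1)            # per-pixel class index
print(labels.shape, labels.unique())     # which classes the model "sees" in the image
```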
I have tried so many times to create drawings or pictures for my presentations and workshops with DALL-E, and didn't get a single useful picture. The problem is very simple: DALL-E (and LLMs) doesn't have a clue about the meaning of words. For example, for a presentation on coaching of systems in conflict, I asked for a drawing of two managers in a fistfight with a referee in the middle trying to intervene and getting hit himself. I made a rough drawing with stick figures myself in five minutes, but even after many iterations DALL-E couldn't. I did the same for a referee being caught in a crossfire between two managers, with no result.
The data set is a subset of reality. The machines notice patterns in the data. The patterns represent value judgments. And who gets to decide which set of values should be amplified by the machines?
It's rather that "the data set is a representation of some piece of reality", and an inaccurate one at that.
While I agree with many of your general observations about LLMs, the doctor example is easily disproven. Here's my first-shot Black African doctor treating white kids (I even used the exact same wording used in the Twitter example):
https://i.imgur.com/5nz0yci.png
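For anyone who wants to repeat this kind of first-shot test rather than trade screenshots, here is a minimal sketch. It assumes the official OpenAI Python client and its images endpoint; the model name, sample count, and prompt wording below are my placeholders, not the exact settings used for the image above:

```python
# Sketch: generate a few images for one prompt and save them for manual review.
# Assumes the `openai` Python package and an OPENAI_API_KEY in the environment;
# prompt wording and model name are illustrative placeholders.
import urllib.request
from openai import OpenAI

client = OpenAI()
prompt = "a Black African doctor treating white kids"  # approximate wording, see the comment above

for i in range(4):
    result = client.images.generate(model="dall-e-3", prompt=prompt, n=1, size="1024x1024")
    urllib.request.urlretrieve(result.data[0].url, f"doctor_test_{i}.png")
    print(f"saved doctor_test_{i}.png")
```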
Great read. All the failures you mentioned arise from the inability of deep neural nets to generalize. After all is said and done, DNNs are essentially rule-based expert systems on steroids. The corner case problem that plagues self-driving cars and the failure of DALL-E to handle non-canonical cases are similar to the brittleness of expert systems. It's déjà vu all over again. Adding more data (more rules) will never eliminate the problem because corner cases are infinite in number.
DL experts will refuse to admit it, and I'm sorry to point it out, but the DL era is a continuation of GOFAI. AGI will never come from the GOFAI mindset. If the powers that be want to solve AGI, attention and resources must shift to a new paradigm. We must concentrate our efforts on solving the generalization problem. Cracking generalization won't require huge, expensive computers. Even insects can generalize. They have to, because their tiny brains cannot contain millions of representations. Scaling can come later.
AGI researchers should listen to the words of the late existentialist philosopher, Hubert Dreyfus. He was fond of saying that “the world is its own model” and that “the best model of the world is the world itself.” It was his way of explaining that creating stored representations for everything was a mistake.
It is not true that deep neural nets are essentially rule-based systems. They actually know how to generalize.
Insects don't need to hold millions of representations because they don't need to do as much work as an art drawing program. They need to navigate, respond to visual and other stimuli, etc. A compact (but still very large and complex) neural network will do.
For art drawing to be done "holistically", a machine will need to know about millions of objects, and their 3D and mechanical properties. If we know how to draw in 2D, which by now we know very well, this larger goal is the next step. Can't run before you can walk.
You're mistaken. No DL system in existence could do a fraction of what a honeybee can do. Honeybees thrive and navigate in extremely complex 3D environments. They can handle zillions of different types and shapes of flowers, trees, leaves, plants, animals and other insects. They can even communicate the location of food sources to the hive.
Honeybees can do all of these things with less than 1 million neurons. They can do it because their brains can generalize.
While honeybees encounter animals, leaves, and other insects, the data that actually gets into their brains is much less. They care primarily about whether there is pollen, building a rudimentary map, avoiding obstacles, and so on.
We already have artificial neural nets that can handle analogous problems, if perhaps not as sophisticated.
This is, as before, orthogonal to the fact that AI art gen programs need a large latent space. It is not enough to "generalize" to be able to draw art. You actually need to understand the variety of all possible shapes in existence.
That neural nets can both generalize and draw art is actually a testament to their versatility. Of course they remain brittle in many ways, but that is because neural nets alone are not enough to model many kinds of problems efficiently, so they need to be used in combination with other methods.
The idea that deep neural nets can generalize is new to me. It takes zillions of samples of chairs for a net to properly recognize chairs. Even then, it will fail to recognize a chair if it has never seen one like it before. A human being can generalize from a few samples. Thanks for the exchange.
A human has an intimate, decades-long relationship with the real world. That's why we can generalize: we have the knowledge to start with.
Humans couldn't generalize if dumped into an alien world where physics does not work as it does here.
> Blind fealty to the statistics of arbitrary data is not AGI. It’s a hack. And not one that we should want.
I won't comment on the third sentence, but I think the first two are muddled thinking.
It is perfectly possible in principle to build an intelligence that works like a human brain in conventional silicon computer hardware (to 'model' a human brain, if you like). I don't expect it ever to happen, but there is nothing preventing it in principle; human brains are just digital computers anyway. We should not, however, expect the first synthetic intelligences to work anything like human brains, because silicon computers are not wired similarly to human brains and simulating a human brain in one would be wildly inefficient. So synthetic intelligences will work very differently.
I strongly suspect that many, if not all, of LLMs' weaknesses can and will be overcome simply by bigger training programs (scaling up, as it were). I think if you compare the current state of AI image generation with where we were 18 months ago, you would be forced to conclude that we have made massive progress on pretty much every issue that the early models had.
It is possible I'm wrong about this, but even if I am, I suspect that some alternative kind of model, quite different from an LLM but equally unlike a human brain, will nonetheless reach human-level intelligence. What matters for almost all purposes in evaluating an intelligence is not how it thinks (whether it has any sort of model of the world) but what it can do. An intelligence is exactly as good as its outputs.
I hate it when people treat the issues with the doctors as racism when the real problem is that the data is incomplete. I covered this about a year ago in a piece on trying to understand how to eliminate bias in AI/ML:
https://www.polymathicbeing.com/p/eliminating-bias-in-aiml
My wife complained to me about a problem similar to the black doctors example: it's hard to get DALL-E to make pictures of queer couples. When you say you want a picture of two men in love getting married, it will usually put a wife next to each of them that they are in love with.
> "system that can derive semantics of wholes from their parts as a function of their syntax."
That system is called a human operator. Only a human, as an entity made by Nature, has the qualities of a whole. Any man-made system needs human maintenance; otherwise we would have invented a perpetual motion machine.
The pink clock in the last row is puzzling--to me, at least. It looks like either the hour hand or the minute hand has slipped.
Btw - contrary findings do not disprove that the original results occurred.
Another one: I asked Midjourney to draw an axolotl drinking coffee at the top of the Eiffel Tower. It drew a creature that looks like a mouse inside a cafe, by the window, with the Eiffel Tower in the background and its top close to the mouse's head. Weirdly enough, the cup was not drawn properly, and in one of the pictures it looked like the mouse was spitting a stream of water (copied, maybe, from a fountain). Imgur link to the picture: https://i.imgur.com/UkN48jm.jpg
The AI doesn't know what it means to be at the top of the Eiffel Tower. And at the time I submitted the prompt (roughly mid-April 2023), it didn't know how to draw an axolotl.
The problem here (sticking to the existing paradigm for the moment) is that training data curation has been largely ignored. The method is to throw more data at the problem and hope that this magically fixes the resulting distribution represented by the system. And even if that were fixed, the barriers you have discussed would still remain.
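To make 'curation' slightly more concrete: even a crude pass over image-caption pairs before training, dropping near-empty captions and exact duplicates, changes the distribution the model ends up representing. A toy sketch (the field names and thresholds are illustrative assumptions, not any real pipeline):

```python
# Sketch: crude curation of an image-caption dataset before training.
# Field names and thresholds are illustrative assumptions, not a real pipeline.
def curate(records, min_caption_words=5):
    """Drop near-empty captions and exact duplicate images; yield what survives."""
    seen_hashes = set()
    for rec in records:
        caption = rec.get("caption", "").strip()
        if len(caption.split()) < min_caption_words:
            continue                       # too little text to carry meaning
        if rec["image_hash"] in seen_hashes:
            continue                       # exact duplicate image already kept
        seen_hashes.add(rec["image_hash"])
        yield rec

sample = [
    {"image_hash": "a1", "caption": "a doctor examining a child in a clinic"},
    {"image_hash": "a1", "caption": "a doctor examining a child in a clinic"},  # duplicate
    {"image_hash": "b2", "caption": "doctor"},                                  # caption too short
]
print(list(curate(sample)))  # only the first record survives
```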