27 Comments
Oct 14, 2023 · Liked by Gary Marcus

Ah yes. Another fine example of Statistical Human Imitation Technology or S.H.I. -- you get the picture.

Oct 14, 2023 · Liked by Gary Marcus

In support of the difficulty of getting black doctors treating white kids via DALL-E:

I have run some tests on gender and color representation around doctors and nurses, and found it very difficult to get images reflecting the prompt and the context. The generated images have always made the nurses female, young, and pretty, and made diverse (i.e. black or female) doctors also young and pretty, with glasses. Not the more realistic scenarios we were looking for.

I'm sure that with persistence and reprompting you can get the results needed, but if generative AI is supposed to be making work faster... that's only the case when you are cool with stereotyped results.
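
For what it's worth, this kind of test can also be run programmatically rather than through a chat interface. The snippet below is a minimal sketch only, assuming the OpenAI Python client and API access to an image model; the model name and prompt wording are illustrative, not the exact ones from my tests.

```python
# Hypothetical sketch of a reprompting loop against the OpenAI Images API.
# Assumes the `openai` package and an OPENAI_API_KEY in the environment;
# the model name and prompt are illustrative placeholders.
from openai import OpenAI

client = OpenAI()

prompt = "A black female doctor, middle-aged, examining white children in a rural clinic"

# Each call is an independent sample, so checking for stereotyped output
# means generating several images and inspecting them by hand.
for attempt in range(3):
    result = client.images.generate(
        model="dall-e-3",
        prompt=prompt,
        n=1,
        size="1024x1024",
    )
    print(f"attempt {attempt + 1}: {result.data[0].url}")
```

The point stands either way: if several attempts are needed before one image matches the prompt, the speed advantage the tool is supposed to provide disappears.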

And all the graphic design gig workers I previously used are now giving me the same AI-generated images and articles that I could produce myself. THAT is getting very problematic.

Oct 14, 2023 · Liked by Gary Marcus

The succinct comment of A. Renske A. C. Wierda on DALL-E a while back was "DALL-E stinks for anything except an astronaut on a horse". (And on ChatGPT claiming that GAI was a step to AGI: "Yeah. That's what Reddit thinks...").

I gave my talk at EACBPM in London last Tuesday (video in preproduction) on "What everyone should understand about ChatGPT and friends", and these examples from DALL-E would have fitted there perfectly.

The question is not, I guess, 'will the ChatGPT fever break?' but rather 'how long will it take?'. And the fever will not completely break (as it did with NFTs, FTX, etc.), because while these GAI systems have trouble being reliable, they have far fewer problems being creative (just crank up the randomness). So there will be actual use cases, both legit (creativity support) and problematic (spam/infowar/noise).
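
To make the 'crank up the randomness' point concrete: below is a minimal sketch, assuming the OpenAI Python client, of sampling the same prompt at a low and a high temperature. The model name and prompt are illustrative.

```python
# Hypothetical sketch: the same prompt sampled at low vs. high temperature.
# Assumes the `openai` package and an OPENAI_API_KEY; model name is illustrative.
from openai import OpenAI

client = OpenAI()

prompt = "Suggest a name for a coffee shop run by robots."

# Low temperature -> more repeatable (reliability); high temperature -> more
# varied output (the "creativity" these systems are genuinely useful for).
for temperature in (0.2, 1.5):
    reply = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}],
        temperature=temperature,
    )
    print(f"T={temperature}: {reply.choices[0].message.content}")
```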

Oct 16, 2023 · Liked by Gary Marcus

Can we talk about the watch face with 1 at the top, 3 at the bottom, 6 (or maybe a heavily distorted 5) at the right, and ". ." on the left? The numbers on the bezel of the lower-right watch are likewise gibberish. It seems like the system is treating the numbers as some sort of decorative elements, rather than as something that is integral to the task of telling time.


If true semantics is needed, it will be delivered. Eventually. There is a research focus on 'semantic segmentation' in TTI and TTV and their reverses. But why do I feel compelled to speak of 'true' semantics? There seems to be a strong tendency, by both news media and researchers, to reduce the meaning of terms like AI, GAI, semantics, and ontology to what tech can now deliver rather than what it cannot yet. I think that lowering of the bar can retard progress by setting sights too low.


I have tried so many times to create drawings or pictures for my presentations and workshops with DALL-E, and didn't get a single useful picture. The problem is very simple: DALL-E (and LLMs) doesn't have a clue about the meaning of words. For example, for a presentation on coaching of systems in conflict, I asked for a drawing of two managers in a fistfight with a referee in the middle trying to intervene and getting hit himself. I made a rough drawing with stick figures myself in five minutes, but even after many iterations DALL-E couldn't. I did the same for a referee caught in the crossfire between two managers, with no result.


The data set is a subset of reality. The machines notice patterns in the data. The patterns represent value judgments. And who gets to decide which set of values should be amplified by the machines?


While I agree with many of your general observations about LLMs, the doctor example is easily disproven. Here's my first-shot Black African doctor treating white kids (I even used the exact same wording used in the Twitter example):

https://i.imgur.com/5nz0yci.png


Great read. All the failures you mentioned arise from the inability of deep neural nets to generalize. After all is said and done, DNNs are essentially rule-based expert systems on steroids. The corner case problem that plagues self-driving cars and the failure of DALL-E to handle non-canonical cases are similar to the brittleness of expert systems. It's déjà vu all over again. Adding more data (more rules) will never eliminate the problem because corner cases are infinite in number.

DL experts will refuse to admit it, and I'm sorry to point it out, but the DL era is a continuation of GOFAI. AGI will never come from the GOFAI mindset. If the powers that be want to solve AGI, attention and resources must shift to a new paradigm. We must concentrate our efforts on solving the generalization problem. Cracking generalization won't require huge, expensive computers. Even insects can generalize; they have to, because their tiny brains cannot contain millions of representations. Scaling can come later.

AGI researchers should listen to the words of the late existentialist philosopher, Hubert Dreyfus. He was fond of saying that “the world is its own model” and that “the best model of the world is the world itself.” It was his way of explaining that creating stored representations for everything was a mistake.


> Blind fealty to the statistics of arbitrary data is not AGI. It’s a hack. And not one that we should want.

I won't comment on the third sentence, but I think the first two are muddled thinking.

It is perfectly possible in principle to build an intelligence that works like a human brain in conventional silicon computer hardware (to 'model' a human brain, if you like). I don't expect it ever to happen, but there is nothing preventing it in principle; human brains are just digital computers anyway. We should not, however, expect the first synthetic intelligences to work anything like human brains, because silicon computers are not wired similarly to human brains and simulating a human brain in one would be wildly inefficient. So synthetic intelligences will work very differently.

I strongly suspect that many, if not all, of LLMs' weaknesses can and will be overcome simply by bigger training programs (scaling up, as it were). If you compare the current state of AI image generation with where we were 18 months ago, I think you would be forced to conclude that we have made massive progress on pretty much every issue that early models had.

It is possible I'm wrong about this, but even if I am, I suspect that some alternative kind of model, quite different from an LLM but equally unlike a human brain, will nonetheless reach human-level intelligence. What matters for almost all purposes in evaluating an intelligence is not how it thinks (whether it has any sort of model of the world), but what it can do. An intelligence is exactly as good as its outputs.


I hate it when people treat the issues with the doctors as racism when the real problem is that the data is incomplete. I covered this about a year ago while trying to understand how to eliminate bias in AI/ML:

https://www.polymathicbeing.com/p/eliminating-bias-in-aiml


My wife complained to me about a problem similar to the black doctors example: it's hard to get DALL-E to make pictures of queer couples. When you say you want a picture of two men in love getting married, it will usually put a wife next to each of them as the person they are in love with.


> "system that can derive semantics of wholes from their parts as a function of their syntax."

That system is called a human operator. Only a human, as an entity made by Nature, has the qualities of wholes. Any man-made system needs human maintenance; otherwise, we would have invented a perpetual motion machine.


The pink clock in the last row is puzzling, to me at least. It looks like either the hour hand or the minute hand has slipped.


Btw - contrary findings do not disprove that the original results occurred.

Oct 14, 2023 · edited Oct 14, 2023

Another one: I asked Midjourney to draw an axolotl drinking coffee at the top of the Eiffel Tower. It drew a creature that looks like a mouse inside a café, by the window, with the Eiffel Tower in the background and its top close to the mouse's head. Weirdly enough, the cup was not drawn properly, and in one of the pictures it looked like the mouse was spitting a stream of water (copied, maybe, from a fountain). Imgur link to the picture: https://i.imgur.com/UkN48jm.jpg

The AI doesn't know what it means to be at the top of the Eiffel Tower. And at the time I submitted the prompt (roughly mid-April 2023), it didn't know how to draw an axolotl.
