Race, statistics, and the persistent…

Gary Marcus

Oct 14, 2023

Why DALL-E 3 is and isn’t better than DALL-E 2

Read →

22 Comments

Andrew Gerngross

Oct 14, 2023

Ah yes. Another fine example of Statistical Human Imitation Technology or S.H.I. -- you get the picture.

Expand full comment

Andra Keay

Oct 14, 2023

IN support of the difficulty of getting black doctors treating white kids via Dall-E:

I have found run some tests on gender and color representation around doctors and nurses and found it very difficult to get images reflecting the prompt and the context - the generated images have always made the nurses female, and young and pretty, and diverse (ie. black or female) doctors also young and pretty, with glasses. Not the more realistic scenarios we were looking for.

I'm sure that with persistence and reprompting you can get the results needed, but if generative AI is supposed to be making work faster.... that's only the case when you are cool with stereotyped results.

And now all the graphic design gig workers I have previously used are now giving me the same AI generated images and articles, that I could do myself. THAT is getting very problematic.

Expand full comment

Gerben Wierda

Oct 14, 2023

The succinct comments of A. Renske A. C. Wierda on DALL-E a while back was "DALL-E stinks for anything except and astronaut on a horse". (And on ChatGPT claiming that GAI was a step to AGI: "Yeah. That's what Reddit thinks...").

I gave my talkative EACBPM in London last Tuesday (video in preproduction) on "What everyone should understand about ChatGPT and friends" and these examples from DALL-E would have perfectly fitted there.

The question is not, I guess, 'ChatGPT fever will break' but more 'how long will it take?'. And the fever will not completely break as with NFT, FTX, etc.), because while these GAI systems have trouble with being reliable, they have far less problems with being creative (just crank up the randomness). So, there will be actual use cases, both legit (creativity support) and problematic (spam/infowar/noise).

Expand full comment

Joe Repka

Oct 14, 2023Edited

If true semantics is needed, it will be delivered. Eventually. There is a focus in research on 'semantic segmentation' in TTI and TTV and the reverses. But why do I feel compelled to speak of 'true' semantics? There seems to be a strong tendency to reduce the meaning of terms like AI, GAI, semantics, and ontology to refer to what tech can now deliver rather than what it cannot yet, both by news media and researchers. I think that lowering of the bar can retard progress by setting sights too low.

Expand full comment

Theo compernolle

Oct 14, 2023

I have tried so many times to create drawings or pictures for my presentions and workshops with DALLE, and didn't get a single useful picture. The problem is very simple: DALLE (and LLM's) doesn't have a clue bout the meaning of words. For example, for a presentation of coaching of sytems in conlfict, I asked a drawing of two managers in a fistfight and the referee in the middle trying to intervene and getting hit himself. I made a rough drawing with stickfigures myself in five minutes, but even after many iterations DALLE couldn't. I did the same for a referee being caught in a crossfire between two managers, with no result.

Expand full comment

Earth

Oct 14, 2023

The data set is a subset of reality. The machines notice patterns in the data. The patterns represent value judgments. And who gets to decide which set of values should be amplified by the machines?

Expand full comment

Reply (1)

Costa

Oct 14, 2023

It's rather "the data set is a representation of some piece of reality", and inaccurate too.

Expand full comment

Daniel Nest

Oct 14, 2023

While I agree with many of your general observations about LLMs, the doctor example is easily disproven. Here's my first-shot Black African doctor treating white kids (I even used the exact same wording used in the Twitter example):

https://i.imgur.com/5nz0yci.png

Expand full comment

Rebel Science

Oct 14, 2023

Great read. All the failures you mentioned arise from the inability of deep neural nets to generalize. After all is said and done, DNNs are essentially rule-based expert systems on steroids. The corner case problem that plagues self-driving cars and the failure of DALL-E to handle non-canonical cases are similar to the brittleness of expert systems. It's déjà vu all over again. Adding more data (more rules) will never eliminate the problem because corner cases are infinite in number.

DL experts will refuse to admit it and I'm sorry to point it out but the DL era is a continuation of GOFAI. AGI will never come from the GOFAI mindset. If the powers that be want to solve AGI, attention and resources must shift to a new paradigm. We must concentrate our efforts on solving the generalization problem. Cracking generalization won't require huge expensive computers. Even insects can generalize. They have to because their tiny brains cannot contain millions of representations. Scaling can come later.

AGI researchers should listen to the words of the late existentialist philosopher, Hubert Dreyfus. He was fond of saying that “the world is its own model” and that “the best model of the world is the world itself.” It was his way of explaining that creating stored representations for everything was a mistake.

Expand full comment

Reply (1)

Comment removed

Oct 14, 2023

Comment removed

Expand full comment

Reply (1)

Rebel Science

Oct 14, 2023Edited

You're mistaken. No DL system system in existence could do a fraction of what a honeybee can do. Honeybees thrive and navigate in extremely complex 3D environments. They can handle zillions of different types and shapes of flowers, trees, leaves, plants, animals and other insects. They can even communicate the location of food sources to the hive.

Honeybees can do all of these things with less than 1 million neurons. They can do it because their brains can generalize.

Expand full comment

Reply (1)

Comment removed

Oct 14, 2023

Comment removed

Expand full comment

Reply (1)

Rebel Science

Oct 14, 2023Edited

The idea that deep neural nets can generalize is new to me. It takes zillions of samples of chairs for a net to properly recognize chairs. Even then, it will fail to recognize a chair if it has never seen one like it before. A human being can generalize from a few samples. Thanks for the exchange.

Expand full comment

Andrew Currall

Oct 24, 2023

> Blind fealty to the statistics of arbitrary data is not AGI. It’s a hack. And not one that we should want.

I won't comment on the third sentence, but I think the first two are muddled thinking.

It is perfectly possible in principle to build an intelligence that works like a human brain in conventional silicon computer hardware (to 'model' a human brain, if you like). I don't expect it ever to happen, but there is nothing preventing it in principle; human brains are just digital computers anyway. We should not, however, expect the first synthetic intelligences to work anything like human brains, because silicon computers are not wired similarly to human brains and simulating a human brain in one would be wildly inefficient. So synthetic intelligences will work very differently.

I strongly suspect that many, if not all, of LLMs weaknesses can and will be overcome simply by bigger training programs (scaling up, as it were). I think if you compare the current state of AI image generation with where we were 18 months ago, you would be forced to conclude that we have made massive progress in pretty much every issue that early models had.

It is possible I'm wrong about this, but even if I am, I suspect that some alternative kind of model, quite different from an LLM, but equally unlike a human brain, will nonetheless reach human-level intelligence. What matters for almost all purposes in evaluating an intelligence is now how it thinks (whether it has any sort of model of the world), but what it can do. An intelligence is exactly as good as its outputs.

Expand full comment

Michael Woudenberg

Oct 17, 2023

I hate when people treat the issues with the doctors as racist when it's that their data is incomplete. I covered this about a year ago on trying to understand how to eliminate bias in AI/ML

https://www.polymathicbeing.com/p/eliminating-bias-in-aiml

Expand full comment

Kevin Zatloukal

Oct 15, 2023

My wife complained to me about a problem similar to the black doctors example: it's hard to get DALL-E to make pictures of queer couples. When you say you want a picture of two men in love getting married, it will usually put a wife next to each of them that they are in love with.

Expand full comment

Michael Molin

Oct 15, 2023

> “system that can derive semantics of wholes from their parts as a function of their syntax."

That system is called human operator. Only human as an entity made by Nature has qualities of wholes. Any man-made system needs human maintenance. Otherwise we would invent an eternal engine.

Expand full comment

Joe Horton

Oct 15, 2023

Pink clock in the last row is puzzling--to me, at least. Looks like either the hour hand or the minute hand have slipped.

Expand full comment

Andra Keay

Oct 14, 2023

Btw - contrary findings do not disprove that the original results occurred.

Expand full comment

Costa

Oct 14, 2023Edited

Another one, I asked midjourney to draw an axolotl that drinks coffee at the top Eiffel tower. It drew a creature that looks like a mouse inside a cafe, by the window, with the Eiffel tower in the background and its top was close to the mouse's head. Weirdly enough, the cup was not drawn properly, and in one of the pictures it looked like the mouse was spitting a water stream (copied maybe from a fountain). Imgur link to the picture: https://i.imgur.com/UkN48jm.jpg .

AI doesn't know what it means to be at the top of Eiffel tower. And at the time I submitted the prompt it didn't know how to draw an axolotl (roughly mid April 2023)

Expand full comment

Victualis

Oct 14, 2023

The problem here (sticking to the existing paradigm for the moment) is that training data curation has been largely ignored. The method is to throw more data at the problem and hope that this magically fixes the resulting distribution represented by the system. Then even if this were fixed, the barriers you have discussed would still remain.

Expand full comment

Marcus on AI

Race, statistics, and the persistent…