If we had to guess, the line from yesterday’s OpenAI announcement (a reply to the New York Times lawsuit) that’s going to live in infamy is probably the one that says “’Regurgitation’ is a rare bug that we are working to drive to zero.”
Our view? Good luck with that.
One way to knock down the problem decisively, of course, would be to train only on public domain data. Another would be to license everything else. But OpenAI doesn’t want to do anything nearly as conventional as that. They want to continue to use copyrighted materials, apparently without paying for them.
In order to do that, they are begging for a really, really big handout: they have literally asked the United Kingdom to exempt them from copyright liability. Since that would violate the Berne Convention and infuriate every “content provider” (from individual artists and writers to major studios and publishers) ever, we hope and expect that this exemption will not be granted.
But that leaves, at least for now, only one other obvious possibility for protecting themselves from infringement claims: guardrails. Guardrails are where the system rejects certain requests (“teach me how to build a nuclear bomb” or “write a racist screed insulting so and so”). Such guardrails have previously been installed for moral / ethical reasons. OpenAI seems to be setting up some new ones to thwart inadvertent plagiarism in the image domain. How well do they work?
We will get to that in a moment, but first a refresher on how well guardrails have worked out in the domain of language generation. (Spoiler alert: not that well.)
Fundamentally, virtually any guardrail has to thread a needle between the Scylla of being too restrictive and the Charybdis of being too permissive. None thus far have done this effectively.
One of Gary’s illustrations of restrictive guardrails (surely patched by now) was so striking it wound up on late night television:
Elon Musk got so annoyed with over-restrictive guardrails that he built his own AI (Grok) in part to try to reduce their frequency.
The opposite problem happens too: guardrails can wind up too permissive. This is still happening more than a year into the chatbot revolution. A study published last week on Google’s PaLM 2 showed that PaLM’s guardrails let through all of the following (warning: these are vile), as produced by an automated adversarial system.
Guardrail builders generally wind up playing whack-a-mole.
Any one example can be filtered out, but there are too many examples, and current systems are too erratic to reliably get things right.
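To see why filtering individual examples can’t scale, here is a minimal sketch (our own illustration, not any vendor’s actual code) of a blocklist-style guardrail. The blocklist entry is hypothetical, borrowed from the prompt discussed later in this piece; the point is that each patch catches one phrasing, and a trivial rewording of the same request slips right past.

```python
# Hypothetical prompt-blocklist guardrail. Each entry is one patched phrasing.
BLOCKLIST = {"italian video game character"}

def guardrail_blocks(prompt: str) -> bool:
    """Return True if the prompt exactly matches a blocklisted phrasing."""
    return prompt.lower().strip() in BLOCKLIST

# The patched prompt is caught...
print(guardrail_blocks("Italian video game character"))    # True
# ...but the same request, reworded, sails through.
print(guardrail_blocks("videogame character from Italy"))  # False
```

Every rewording requires a new blocklist entry, which is exactly the whack-a-mole dynamic described above: the space of paraphrases is effectively unbounded, while the blocklist is always finite.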
A new paper by Yi Zeng and collaborators shows how hit or miss the whole thing is:
and details 40 different ways to get around the guardrails:
None of this has of course stopped OpenAI from trying. Next stop, guardrails for plagiarism!
§
In fact, one of us (Katie) noticed that soon after Gary and Reid Southen posted their examples of apparent infringement, some prompts that worked earlier mysteriously stopped working. “Italian video game character” used to produce Mario,
and suddenly it didn’t—the original image had been retroactively blocked:
Did the techs at Microsoft’s Bing (running OpenAI’s DALL-E 3) come up with a general solution to the infringement problem? Or a narrow patch for certain images or prompts? Katie set out to find out. “Italian video game character” still results in Mario-adjacent images, but with some changes:
Other prompts, where the new filters did not suffice, still elicited apparent copyright infringements. Here’s one example:
So much for guardrails.
Notice, too, that Katie asked Bing “are they original?”, and Bing, untruthfully, but with a cozy emoticon, answered in the affirmative. The legal team at DC Comics (subsidiary of Warner) is going to love that.
§
For good measure, Katie investigated whether Bing was consistent in how it represented its relationship to copyright, trademark, and originality. The results were decidedly mixed.
Compare the image and response above (“they are original”) with this prompt and response:
Oops. But let’s try another prompt and response:
Let’s ask Bing if we can use that one:
Close, but no cigar. It’s not actually your call, Bing.
Ok, let’s try an archer instead.
Oops again. DC might have some thoughts about the “originality” of this Green Arrow/Emerald Archer. And this time, Bing invents a new set of rules that might come as a surprise to most users:
Nice dodge. Now you are saying we can’t even use the images for personal use? Will investors keep pouring money into these services if users can’t use the end products?
And wait, that’s not exactly what you promised in the New Bing Terms of Service, is it?
The only thing we can really say for sure is that the litigation is going to be wild.
Gary Marcus is amused to see guardrails make such a mediocre comeback.
Katie Conrad is a professor of English at the University of Kansas and equally amused.
SO RIDICULOUS! SO MUCH REGURGITATION! Why does an Italian videogame character have to have Mario's Mustache? Could they not, for example, have had it driving a Ferrari? Or have it dressed as a Roman Senator, or Centurion, or a 16th century Venetian mercenary?
There are 190 countries on the planet. Why does a "patriotic" superhero just happen to have Captain America's trademark shield? Is nobody from, IDK, Poland at all patriotic? Couldn't a Pole proud of resistance against invading Russians make a great superhero? Sure it could. But it would take something that's not vomiting up stolen images to create that superhero.
Over the last 30 years, the internet advertising industry has already killed much of the newspaper and magazine industry by hoovering up most ad revenue. Now Silicon Valley's AI pirates want to come for other creative areas. If their looting and pillaging doesn't get stopped NOW, it will only get worse.
Then eventually you get "model collapse" when AI uses AI generated training data, which doesn't work.
Haaahahaha! GenAI companies' legal departments are telling customers they are responsible for, and own, the outputs of the 'tool' in order to protect the companies against big IPR holders coming after the vendors, while the model itself says the opposite! I think the real lawyers in this mess are now having a collective stroke.
Another gem is that in Microsoft's Bing Terms and Conditions it says: "Due to the nature of the Online Services, Creations may not be unique across users and the Online Services may generate the same or similar output for Microsoft or other users." So, suppose Jane produces an image. She owns it, right? But then John a month later produces almost a perfect copy. He owns that, right? Right? Hello? Lawyer people? Right?
I wasn't wrong last year telling people I would take out the popcorn for this. So predictable. It's like seeing "The Big Short" playing live out in front of you.