Reid Southen is a successful concept artist who has worked for many of the biggest studios (Marvel, 20th Century Fox, Warner Bros, Paramount etc) on a lot of huge films (Matrix Resurrections, The Hunger Games, Transformers, and Alien, among others).
Yesterday he noticed that the latest, greatest version of an AI-generated art system known as Midjourney was rather too good at drawing lovely art like this:
That appeared suspiciously similar to this iconic still from Joker:
In modern parlance, I would call Southen’s experiment an instance of red-teaming. It shows that a user of Midjourney could easily inadvertently infringe on copyright.
It also suggests that Midjourney has been trained on high-resolution copyrighted images, for which it may or may not have a license.
If you were a tech company you might be tempted to call this an incident of “data leakage” or perhaps “duplication without attribution.” Sure looks like automated, digital plagiarism to me.
§
This was no one-off. Some further facts:
This took Southen almost no effort
The output was not marked as being directly premised on copyrighted work
Southen quickly showed that the whole pattern could be easily replicated, multiple times with multiple films. Here’s one set of examples (you can find more on his X account)
§
What really bugs me, though, is what happened next. Before the day was done, Midjourney retaliated by revoking Southen’s privileges and wiping his history.
If that’s not consciousness of guilt, I don’t know what is.
(Paraphrasing Scooby-Doo, they’d have gotten away with it too, if it weren’t for those meddling red-teamers.)
§
Speaking of what they knew and when they knew it, here is an excerpt from Midjourney’s own Terms of Service:
Good to know that they have heard of the concept of infringement. That’s a start.
§
Ed Newton-Rex is a composer and AI researcher who bravely quit a potentially very lucrative gig at Stability AI, the generative AI company behind Stable Diffusion, because of his discomfort with what was going on.
He posted this yesterday. I give him the last word, his own thoughts on what transpired:
Gary Marcus loves AI, in principle, but hates a lot of the ways in which it is being used.
If one (or a company) wants to use an officially authenticated work (article, book, piece of art, etc.), one has to ask permission from the copyright holder. Why couldn't the companies that are developing generative AI systems be obliged by law to ask permission to use protected content that is inserted into their training data sets, even if this content is transformed or reformulated when processed by the system? When a film script is written based on a book, the story may be deeply altered, but permission to use the book must still be granted. Maybe that is the regulation that is needed here. This kind of regulation would be consistent with the one concerning transparency of training data, which is presently under discussion. In fact, it is basically an issue of transparency: if the AI developers are transparent and honest about their data, they will have to respect copyrights.
That’s the problem right now - there is no alternative. But your discussion of artist rights and corporate conduct in a copyright context is meaningless. AI works aren’t subject to copyright, nor are they made in an infringing manner. You have to understand the basic scope of copyright: it can’t be a matter of lines or formulas or patterns, etc. AI deconstructs and reconstructs in exactly that way, though - more on this on my Substack, where I try to break down these very complicated topics. And I’ve been doing them both for a long, long time.