24 Comments
Dec 23, 2023 · edited Dec 23, 2023 · Liked by Gary Marcus

If one (or a company) wants to use an officially authenticated work (article, book, piece of art, etc.), one has to ask permission from the copyright holder. Why couldn't the companies developing generative AI systems be obliged by law to ask permission to use protected content that is inserted into their training data sets, even if this content is transformed or reformulated when processed by the system? When a film script is written based on a book, the story may be deeply altered, but permission to use the book must still be granted. Maybe that is the regulation that is needed here. This kind of regulation would be consistent with the one on training-data transparency that is presently under discussion. In fact, it is basically an issue of transparency: if the AI developers are transparent and honest about their data, they will have to respect copyrights.

That’s the problem right now: there is no alternative. But your discussion of artist rights and corporate conduct in a copyright context is meaningless. AI works aren’t subject to copyright, nor are they made in an infringing manner. You have to understand the basic scope of copyright: it can’t be a matter of lines or formulas or patterns, etc. AI deconstructs and reconstructs in exactly that way, though. More on this on my Substack, where I try to break down these very complicated topics. And I’ve been doing both for a long, long time.

author

AI works themselves can’t be copyrighted but they are built in some cases on works that are.

If User U makes a work using Output O from Software S that violates copyright of human artist A or Company C, User U can still be held liable, no?

See, Software S doesn’t work that way. That’s your problem: it doesn’t copy. And that’s why every court case has been lost so far and the Copyright Office is just throwing its hands up.

I honestly appreciate you trying to understand this. But if you really wanna understand it and really want to deal with the issue, again, I’d recommend you read my Substack, and let’s work on a joint Substack together, generating content for both of us and pushing forward on the issue.

I don’t think you understand the issue if you have that response. There’s nothing to argue about, only to understand. It’s complicated, but it is so crucial. And the courts, and the Copyright Office, have already found this. Why are you doubting it? You understand how a database is loaded, don’t you?

Hi, I’m gonna say this one more time. Deconstructing an image and loading a database is not copying. There is no legal protection attendant to it. The way AI copies is not the kind of copying copyright is about. Legally, it’s just not copying in the sense of copyright, and I’m not the one to argue with. It’s the law.

author

I will be astounded if the courts side with you on this, but I don’t care to argue with you. I think others have already responded with clarity.

I’m not quite sure what you’re saying or why you feel a need to merge anything. Just read what I said. I’ve simplified it as much as possible to show you everything has changed. If you have a direct question, just ask.

"Merge"? Who said it, and what did they say exactly? Please try and be clear.

I don’t understand, but again, you might want to look at my Substack.

Gary,

Great piece and always enjoy your perspective. You and Rob Terceck are really good at explaining both sides.

Julie, enjoy your perspective as well. Thank you.

If you think about it, twenty years ago the young mobile content industry was stuck until such tracking systems were introduced; thereafter every major media company adopted mobile media and the entire field grew rapidly.

YouTube went through a similar evolution about 15 years ago. When it first debuted, YouTube was confronted with hundreds of lawsuits from angry rights holders.

Then YouTube introduced Content ID, and thereby turned antagonists into supporters and allies. In the process, YouTube gained a unique data asset that is unrivaled in the world of online video. Content ID is one reason why YouTube reigns supreme today in the world of online video. 

My thinking is that the first GenAI company that develops a reliable form of usage tracking and reporting will achieve something like YouTube’s success by garnering the support of the major content companies, thereby converting opponents into champions while at the same time distinguishing itself from a wide field of lookalike competitors. We have a company, Plai Anywhere, that we think can make everyone work together, and we are happy to discuss it with you. Happy Holidays; I will be attending CES if you are going.

With Appreciation

Pete

And as far as alternatives go: copyright has been around since the 1600s. There is no way the law is going to be able to cope with AI in a hurry, especially while it is ripping through society. Trying to shut AI down won’t work either, so ????

author

best answer = new laws requiring consent and compensation

Remember, the law doesn’t work like science. Or programming. You have a ton of stakeholders trying to suss things out, and it will take a long, long time to do that. And the little person may well end up getting shafted. Anyway, that’s why technology has to step up if possible, but it’s probably not gonna happen.

No, you can’t do that that easily. That’s a big part of my point. And another big part is that you can’t try to infringe on copyright law in order to protect against infringement of copyright. Copyright is a constitutional grant, and you’ve got to be really careful with enacting other laws around it.

I truly wish you wouldn’t speculate on the nature of copyright, AI and corporate conduct. AI has nothing to do with copyright, and the sooner everybody understands that the better, because then we will encourage real protection for artists’ works and AI. Right now it’s going to destroy Hollywood, and let’s not duck that issue, but people are. And don’t waste time with copyright and AI. It’s just not a thing.

author

I have no idea what you are trying to say here or what alternative you are offering.

If I may attempt to clarify things a bit, I am wondering if Julie C. and Gary M. are coming at this from different angles, and that accounts for at least some of the talking-past-each-other here. From reading her newsletter, her agenda sounds like it has to do with raising the alarm around protecting movie studios’ copyrights when AI is involved to any degree in the creation, and the threat to the ownership of said productions (at least from what she says; I don't know if there's a deeper layer of agenda under that), given where her work is involved as an attorney and executive in that industry, and the large amounts of economic capital at stake in the whole field. She also does not currently know what the solution is, as this is mid-stream; a sub-theme seems to be that we need to be careful messing with copyright laws, as they are fundamental (she also sees artists and other small players being thrown under the bus, as usual).

Whereas, Gary is more focused (in this article) on protecting the artist's materials that went into the database in the first place, and the AI companies being open, honest and clear about it all.

So in a sense they are coming at it from different ends of the AI beast: for Gary, what is fed into it (and therefore spewed out); and for Julie, what it spews out and the legal status of said output, and the potential implications for all those who make a living off of copyrighted end products that would be affected by the beast's disgorgements.

Am I on target, guys?

No, not at all. AI neither commits copyright infringement in ingesting materials nor does it commit copyright infringement in outputting materials. Again, it would be good if you read the blog, and I’m sorry if this isn’t good news, but it is what it is. And of course it’s ironic for any proponent of AI in that form to say copyright should be saved. They were destroying it as they set up their systems but didn’t bother figuring it out. Because it’s too complicated. lol

You still aren't being clear. I did read your blog. The issue of material being copied (into the database, and used in some fashion by the AI and its users) is exactly what's at issue, so just saying cases have been lost and this is different with AI is not the same as saying that it isn't really doing that (it is, in fact, using copyrighted material in the database, even if it's used in "patterns"), and indeed cases may be won on that basis. You have to make that distinction. It's very simple, not complicated.

I understand I’m not being clear to you. But this is well-known copyright law. You cannot get a copyright in a line, a style, a pattern, a concept, but that is how AI deconstructs and loads its databases. I would recommend you look at the Laion5b (sic) story, which is a common database and contains something like 5 billion addresses. But it doesn’t contain 5 billion images and it doesn’t copy any of the images. It deconstructs all of the images to come up with something in its mind that looks like Barbie. And draws it. But that entire process isn’t legally copying, because none of that is protected. Copyright doesn’t protect thinking, basically. Now if that’s not clear, let me know why it isn’t. But I can’t say it any clearer: AI databases are not created from copyright-infringing images; they are created from Internet addresses and computational patterns and probabilities. None of that is protectable.

If I am reading you correctly, you are blurring together the processing of images with the loading of them. It's irrelevant that a database such as Laion5b that uses images from the internet "doesn’t copy any of the images": even if it's just a URL, the image is used as-is at the very beginning of a process. Deconstructing and reconstructing images by a program is a distinct step after they are found and accessed via a link.

Regarding "Copyright doesn’t protect thinking basically": of course that's true, but AI and computers don't think, obviously. The fact that this isn't obvious to lawyers and judges because of the trickery of engineers and interested parties (or the lawyers and judges want to blur the distinction for some reason) is why we need voices like Gary Marcus to clarify things and say the truth.

^^… concept, but that is how AI deconstructs and loads its databases…
