34 Comments
Jan 10 · Liked by Gary Marcus, Katie (Kathryn) Conrad

SO RIDICULOUS! SO MUCH REGURGITATION! Why does an Italian videogame character have to have Mario's mustache? Could they not, for example, have had it driving a Ferrari? Or have it dressed as a Roman senator, or centurion, or a 16th-century Venetian mercenary?

There are 190 countries on the planet. Why does a "patriotic" superhero just happen to have Captain America's trademark shield? Is nobody from, IDK, Poland at all patriotic? Couldn't a Pole proud of resistance against invading Russians make a great superhero? Sure they could. But it would take something that's not vomiting up stolen images to create that superhero.

Over the last 30 years, the internet advertising industry has already killed much of the newspaper and magazine industry by hoovering up most of the ad revenue. Now Silicon Valley's AI pirates want to come for other creative areas. If their looting and pillaging doesn't get stopped NOW, it will only get worse.

Then eventually you get "model collapse," when AI trains on AI-generated data, which doesn't work.

Jan 10 · Liked by Gary Marcus, Katie (Kathryn) Conrad

Haaahahaha! GenAI companies' legal departments tell customers that they are responsible for, and own, the outputs of the 'tool', in order to protect the vendors against big IPR holders coming after them, while the model itself says the opposite! I think the real lawyers caught in this mess are now having a collective stroke.

Another gem is that Microsoft's Bing Terms and Conditions say: "Due to the nature of the Online Services, Creations may not be unique across users and the Online Services may generate the same or similar output for Microsoft or other users." So, suppose Jane produces an image. She owns it, right? But then John, a month later, produces an almost perfect copy. He owns that, right? Right? Hello? Lawyer people? Right?

I wasn't wrong last year when I told people I would be getting out the popcorn for this. So predictable. It's like watching "The Big Short" play out live in front of you.


“’Regurgitation’ is a rare bug that we are working to drive to zero.” That is such a ridiculous statement that it beggars belief. OpenAI must be taking us for idiots. During training, the model aims to minimize the difference between its output and the training data; that's how training works. So the model is, by design, trying to replicate the training data. ’Regurgitation’ is not a bug, it's a fundamental feature of training GenAI models.
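The point about the training objective can be illustrated with a toy stand-in for an LLM (the corpus, function, and bigram setup here are hypothetical, not anything OpenAI actually uses): the loss-minimizing next-token predictor for a small corpus, decoded greedily, reproduces the training text verbatim.

```python
from collections import defaultdict, Counter

# Hypothetical toy corpus standing in for training data.
corpus = "the quick brown fox jumps over the lazy dog".split()

# Count next-token frequencies: for a bigram model, the predictor that
# minimizes training loss is exactly these empirical counts.
successors = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    successors[prev][nxt] += 1

def generate(start, n):
    """Greedy decoding: always emit the most probable next token."""
    out = [start]
    for _ in range(n):
        if out[-1] not in successors:
            break
        out.append(successors[out[-1]].most_common(1)[0][0])
    return " ".join(out)

print(generate("quick", 4))  # "quick brown fox jumps over" -- verbatim training text
```

With enough model capacity relative to the data, minimizing the training objective and memorizing the training set point in the same direction, which is the commenter's point: regurgitation follows from the objective, not from a bug.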

Jan 10 · Liked by Gary Marcus, Katie (Kathryn) Conrad

And yeah, it's the old "finger in the dike": many programmers playing the Dutch boy, plugging the holes one by one, and there will always be ways around them. The dike itself is flawed and needs to be rebuilt.

Jan 10 · Liked by Gary Marcus

I find this issue pretty amusing, but as a frequent user of ChatGPT, I don't look forward to the nerfed, guardrail-enhanced version that will inevitably result from all this.

Jan 10 · Liked by Katie (Kathryn) Conrad

They are basically asking governments to excuse wholesale copyright infringement in the training of their LLMs. There is no fair dealing defence in UK or EU copyright law that covers what OpenAI has done in training its LLM: any research was either not done for a non-commercial purpose, or, if it was, the use of the infringing items in a commercial LLM is plainly outside that research purpose. I very much doubt that fair use in the US covers it either, given that regurgitation is direct evidence of insufficiently transformative use of a copyright work, and of direct commercial competition with legitimately created and licensed works; see the US Supreme Court judgment in Andy Warhol Foundation v. Goldsmith.

Generative AI can be prompted to create images and text that reproduce substantial parts of copyright works - examples appear in this thread, as well as in the pleadings in the New York Times case and the exhibits in the Getty Images case against Stability AI.

Regurgitation - piracy more like it!

And as with piracy of sound recordings and images, copyright owners already have a well-developed suite of copyright law and litigation practice across the major copyright jurisdictions to deal with it. This includes potential liability for authorising infringement by offering the means to create the infringing items, and liability as a joint tortfeasor with the end user doing the prompting and creating the work.

Incidentally, apart from copyright, names and characters such as Captain America and Mickey Mouse often have trade mark protection as well. Good luck with trying to use those new generative AI creations as trade marks.

Jan 10 · Liked by Gary Marcus, Katie (Kathryn) Conrad

"...only for demonstration purposes." Wow. Convenience-of-the-moment reasoning.

The overall situation is one where what was basically a lab experiment gets turned over to the public through a semi-controlled channel, as part of a broader experiment using the public (us) as guinea pigs (first mistake: hubris), via Altman's disingenuous logic that it needs to be "open" in order to develop it as a tool and model, etc. Then comes the second mistake: taking it even further and making it into a (in part, then more and more) commercially funded product, at which point the titans and the self-interested logic of profitability get their hands in it. And in sneaks the-ends-justify-the-means thinking.

Once again, a false comparison is being made between AIs and humans reading, memorizing, using, creating, etc. Interestingly, it comes not only from humans like Ng (e.g., his comment "just as humans are allowed to read documents on the open internet..." quoted from X.com in the previous Marcus article), but from an AI itself, e.g., "you would need my permission to use it on your website since I created it."

Who programmed this damned thing!? 😂


Total garbage, this techno worship. "Empower people to express themselves creatively" (OpenAI, DALL·E 2). While institutions grapple to find the meaning of art in an increasingly wild (risky) and meaningless world, Jacques Ellul's early warning appears far more useful and prescient: "…caught in a web of facts, systems, rules and outcomes they have been given, but not given the opportunity to decide for themselves" (Art In A Technological Society). OpenAI tells us "you can create original, realistic images and art from a text description" (the example shows an astronaut riding a white horse on the moon). But the makers of this incredible software have also restrained its ability, forbidding DALL·E 2 from generating violent, problematic, hateful or pornographic images. By removing the most "explicit content" from its training data, a sterilized DALL·E 2 does not have exposure to the above threats. Invisible are the fascist and purist ethics of OpenAI. The founders' arrogant promise of "creating AI that benefits humanity", aptly named after the god of Surrealism, Salvador Dalí. Now art can be censored even before it can take birth. But the images generated by DALL·E 2 are not spectacular, albeit still borrowing from and emulating the real art and artists of the past.

Jan 10 · Liked by Gary Marcus, Katie (Kathryn) Conrad

Until you can open up the black box and support user-level explanations, there is no hope for the guardrail approach to succeed.

Jan 10 · Liked by Gary Marcus

I suspect that the apparent lack of any significant reasoning ability by LLMs will be a significant factor explaining why effective guardrails are so hard (impossible?) to implement, i.e. fundamentally the LLMs themselves are too dumb either to properly understand what plagiarism is or to robustly reason when a particular response would be an instance of it. The other significant factor of course would be not training on data (i.e. intellectual property) for which you don't have permission!

Jan 10 · Liked by Gary Marcus, Katie (Kathryn) Conrad

This is going to be one hell of a wild ride, for sure. Think Olympic luge on steroids, with no guardrails.


Remember when symbolic AI was abandoned because a) those developing sets of if...then ontologies to properly encode reality hit the wall of the world's infinite recursive complexity, and b) deep learning offered the simplicity of statistical inference from large data sets, ignoring the fundamental requirement that statistical inference requires the data to be ergodic? Superficially, DL has worked well, notwithstanding its unreliability and the rule that it not be used blindly in critical decision making, where human judgment must be kept in the loop (see, for instance, recent remarks by Chief Justice Roberts of the US Supreme Court). Gary, you correctly point out that these ongoing efforts to sanitize the use of LLMs are pointless. My point here is that this is because they face the same fundamental quagmire: recursive complexity that does not comply with the requirements of statistical inference. These efforts in prompt engineering are interesting experiments, but they offer nothing more than the folly of relying on statistical inference alone. And brilliant minds are wasting their creativity on futility.


author

A quick update: The UK government has decided not to allow a broad copyright exemption for AI training. https://committees.parliament.uk/publications/42766/documents/212749/default/


People have been making and publishing memes of copyrighted characters for years 🙄

Sorry, that horse is outta the barn, precedent wise

And if people aren’t selling anything with the copyrighted characters, then bluntly, why should it matter?


My wish is for AI to accomplish Herculean feats of technological wizardry in a span of months, of the sort that would otherwise require hundreds of highly focused, high-performing humans working full-time for years on end. Projects like making better medicines, and devising cleaner and more efficient energy storage products and industrial processes.

Propaganda and plagiaristic summaries are such an unworthy use of the technology. I can't imagine AI providing any function in the verbal realm that a human properly equipped with scholastic and critical thinking skills isn't able to do better. In point of fact, all of the claims and speculations about AI use in matters like politics, history, current events, and culture are driving me back to a renewed emphasis on the hard-copy material world of brick-and-mortar libraries, and their interlibrary loan functions. I'm not sure how long digital libraries like archive.org will remain uncorrupted once the bullshit really gets flowing.

How sick I am of AI Boilerplate. It's a cybermold infestation. Dry rot to the structural functioning of internet search engines. Ad prioritization in page results and the accompanying click-through sites were already nearly intolerable. Now it's proliferating through the exploitation of AI, and in the process it's generating some of the most banal "review" content in the history of humankind. Increasingly accompanied by images of AI Spokesmodels, often generated to look almost but not quite like A-list celebrities and other famers... tedious. Although I have to admit some bewilderment that AI technology hasn't yet been used in vid/TV ads to subtly morph the images of the models in commercials from scene to scene, or even within the frames of one scene, Philip K. Dick "scramble suit" style... oh, that would destroy the illusion that naturalistic television commercials depict a spontaneous and unedited reality? Good. It's an illusion that deserves to be unmasked. An authentic Creative would figure out an artful and entertaining way to do it, while conveying the intended marketing message even more effectively than ever.

(Always been pretty much impervious to advertising, myself. Yes, I know that Everyone Who Matters insists otherwise, and I know why that is: Turf Claiming. I'm even less of a target audience than ever; my Consumption is pretty much all about secondhand and upcycled goods these days.)
