39 Comments
Jan 27 · edited Jan 27 · Liked by Gary Marcus

A real-life anecdote to back up what Gary has outlined: a few months ago, I was interviewed by a global organization that needed help with their technical documentation (communicating the importance of a specific set of green technologies to a wider, non technical audience). This organization has over 50,000 members all over the world, and the work they do directly impacts the built environment.

The first question they asked me was, "How do you feel about using ChatGPT in your work (as a writer/editor)?"

Note the open-ended nature of the question. They weren't making a value judgment, and the question didn't lead: they didn't ask, "Do you use ChatGPT in your work?"

I said, simply and immediately, "I refuse." And then qualified it with many of the points Gary, I myself, and many others have been making about the reliability of generative AI.

There was a beat of silence as the three people on the call looked at each other. Then they broke into applause. (And they hired me a few weeks later.)

I have never received actual applause in an interview.

Damn. What a moment.

I knew there was a reason I liked you ;-)

Jan 27 · Liked by Gary Marcus

This says it all: “OpenAI painted a false dichotomy. The choice is not between them building AI or not, it is between them building AI for free.” With all the hype and fascination OpenAI was never properly evaluated on basic business fundamentals.

Gary, the more you write, the clearer it becomes: viewed with knowledge of the last 1,000 years of business and technology, OpenAI is incredibly comparable to the dot-com era crash and burn. Technology and times may change, but human greed and stupidity don't.

Gary, this is a very nice synopsis. I've forwarded to colleagues. Keep up the good work.

BTW, I'm also a comrade in arms of Kathryn Conrad, at KU's nascent Center for Cyber-Social Dynamics.

Great analysis! I might add a 10th: energy costs. All those data centres are a huge cost, and even if much of the compute is offloaded to devices, it's one that will only get higher.

Great overview for anyone new to the controversy. I like your phrasing about training materials. Why shouldn't OpenAI have to pay for its raw materials like any other business? What if Spotify were to say that paying royalties to artists makes its business model untenable? It would be an absurd argument. Why does the same logic get thrown around with AI tech?

Jan 27 · Liked by Gary Marcus

Even Spotify, which itself has decimated the payments model for and livelihoods of many musicians, has not dared to take such a brazen stance. Goes to show you the unprecedented level of hubris here.

Jan 27 · Liked by Gary Marcus

Good summary. My five cents are that some of these are worse than others. Legal troubles may go away with enough money and claims of strategic importance, but plateauing performance and the problem of long-term economic sustainability will probably be the big ones.

The grifters and cultists in the space claim that progress can only be exponential, and in two years generative AI will create all the blockbuster movies, making actors unnecessary, and in ten years it will "solve all of science". Reality is unlikely to oblige, because it is indeed full of diminishing returns.

As for the economics, the old observation from social media that if you aren't paying for it, you are not the customer but the product can be flipped around to observe that whoever is paying for it is the customer. So, in the case of OpenAI, I guess the customer is Microsoft, and it also follows that the core product is hype.

The intended small-scale customers of the service seem to fall into a limited number of categories: (1) those playing around with it for amusement and the novelty; (2) coders who use it instead of a keyword search on Stack Overflow; (3) those who want to save on salaries and fees for writers and artists; and (4) spammers and fraudsters who generate low-quality web content.

The first group will not pay anything substantial, and at any rate the novelty will wear off soon. The second to fourth groups will be willing to pay something, but it is a big, open question whether they are willing to pay what it actually costs to keep the models running long-term. It may just be easier to go back to Stack Overflow, or a writer may turn out to be cheaper than a "prompt engineer" whose output still has to be laboriously revised and checked anyway.

It is surely already a bad sign that there appear to be some companies who claim to use AI but are really Mechanical Turks, where human artists produce the output or human engineers control the "self-driving cars". I also doubt that the quality produced by the current generation of generative AI will ever be good enough for anybody except the fourth group, who don't have to care about their company's reputation when things go wrong, but that is the question of diminishing returns again.

It also leads to a follow-up question: if the hallucinations and glitches we get right now cost hundreds of millions of dollars to produce, what will a model cost that consistently produces outputs of acceptable quality? What would an AI cost to run that performs at the cognitive level of the average human, who, after all, costs only tens of thousands of dollars per year?

Accurate assessment of problems facing OpenAI.

Hallucinations are fundamental to how LLMs operate, so they are a very hard problem to solve.

LLMs work best with additional tools to augment their hallucinatory capabilities.

Hi Tomasz,

What tools? AI validating other AI to lower the chance of bad or wrong responses? Perhaps users need to flag any responses they suspect are hallucinatory, so that professional teams at these companies can review them, look for trends, and optimize an overseer AI that filters and iterates on the primary LLM's output during inference, improving the integrity and bounding of the final result. Is this perhaps already happening?
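
For what it's worth, one concrete version of that "overseer" idea is the critic-pass pattern from the research literature: a second model call reviews the draft before the user sees it. A minimal, hypothetical sketch below, using the OpenAI Python client; the model name, prompts, and OK/REVISE convention are my own invention for illustration, not any vendor's actual pipeline.

```python
# Hypothetical overseer sketch: draft, critique, optionally revise.
# Assumes "pip install openai" and an OPENAI_API_KEY in the environment;
# everything else (model, prompts, OK/REVISE protocol) is illustrative.
from openai import OpenAI

client = OpenAI()

def ask(prompt: str) -> str:
    """Single-turn call to the model; returns the reply text."""
    resp = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

def answer_with_overseer(question: str) -> str:
    draft = ask(question)
    # Second pass: a critic reviews the draft before the user sees it.
    verdict = ask(
        "You are a fact-checking overseer. If the answer below contains "
        "claims you cannot verify, reply REVISE plus a reason; otherwise "
        f"reply OK.\n\nQuestion: {question}\nAnswer: {draft}"
    )
    if verdict.strip().startswith("OK"):
        return draft
    # One revision round: feed the critique back to the drafting model.
    return ask(f"Revise the answer using this critique.\n"
               f"Critique: {verdict}\nOriginal answer: {draft}")

print(answer_with_overseer("When was the transistor invented?"))
```

The obvious catch, and probably why this remains a research problem, is that the overseer is itself an LLM and can hallucinate its verdicts.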

Jan 27 · edited Jan 27

I wrote a comment on that. There's a lot of research going on and many papers being published.

If you use AI for code generation, it's pretty simple to verify whether the program executes correctly.

Another point is that interaction with the AI is iterative, so you work toward the required solution via chat.

I have used this technique successfully to write code.

When you get an error, you can simply paste the output back to the AI and ask for a correction.

AI is a time saver in this context.
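
That loop is also easy to automate. Here is a minimal sketch of the generate-run-refine pattern described above, assuming the official OpenAI Python client ("pip install openai") and an API key in the environment; the model name, task prompt, and retry cap are illustrative.

```python
# Sketch of the generate-run-refine loop described above.
# Assumes the official OpenAI Python client and OPENAI_API_KEY;
# model name, task, and retry cap are illustrative.
import subprocess
import sys

from openai import OpenAI

client = OpenAI()

def ask(messages):
    """Send the conversation so far; return the model's reply text."""
    resp = client.chat.completions.create(model="gpt-4", messages=messages)
    return resp.choices[0].message.content

task = ("Write a Python script that prints the first 10 Fibonacci numbers. "
        "Reply with code only, no markdown fences.")
messages = [{"role": "user", "content": task}]

for attempt in range(3):  # cap the number of repair rounds
    code = ask(messages)
    result = subprocess.run([sys.executable, "-c", code],
                            capture_output=True, text=True, timeout=30)
    if result.returncode == 0:
        print("Ran cleanly:\n" + result.stdout)
        break
    # Exactly the commenter's technique: paste the error back and retry.
    messages.append({"role": "assistant", "content": code})
    messages.append({"role": "user",
                     "content": "That failed with:\n" + result.stderr +
                                "\nPlease fix the code."})
```

Running arbitrary generated code is risky, of course; in practice you would sandbox that subprocess rather than execute it directly.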

Jan 27 · Liked by Gary Marcus

I'm surprised the phrase "new AI winter" didn't occur in this post. There's lots of shivering already.

Jan 28 · edited Jan 28

Last time we had an "AI winter", some small companies were trying to market expert systems. This time, the world's leading tech companies (with trillions in market cap each) have been making a profit from deep-learning methods for at least 10 years.

We have real-world solutions for voice recognition and generation, image recognition and generation, and the same for language. Self-driving cars have driven 7 million miles with no driver.

A correction like the dot-com bust is likely, there's no AGI on the horizon, and even current methods need quite a lot more engineering and research work. But no, no AI winter. This time things are for real, and the users number in the billions.

Sure, the AI winter we may be heading into now is different from the previous ones. That said, it is not accurate to suggest expert systems were only marketed by "some small companies". All the major computer companies spent substantial sums on them.

These AI winters are mostly a matter of investment drying up due to unfulfilled profits and dreams. They don't actually kill the technology behind them. Expert systems and their technologies are still with us and I would guess a few companies continue to profit from them. I'm sure that will be the case with LLMs as well. My guess is the profit that comes from this technology will disappoint investors and they will invest much less in AI for a while. Self-driving cars aren't really living up to expectations and I doubt that will change anytime soon. The copyright fiasco is only just getting started. People are still starry-eyed about the idea of LLMs doing programming jobs. I've used it and it is helpful but it isn't going to replace many programmers any time soon. And we don't even need to get into the mess that deep fakes will make.

Jan 28 · edited Jan 28

We are surely in a period of inflated expectations. I will not be surprised if many small players go belly up, and maybe the same happens to Cruise's self-driving cars.

Yes, Waymo has been going slower than expected, though in retrospect that makes sense. A robot car on the street has to meet very high expectations, and there is no other analogous robotic deployment. We are indeed a bit early for such things.

Code assistants are highly useful, and I use one daily, but yeah, it won't replace a programmer. So, there is agreement on the basic facts.

I am more optimistic about the future though. OpenAI can offer people opt-outs and license data from bigger players. Microsoft has enough cash.

Assistants are useful, and both Microsoft and Google put them wherever they can. Likely enough for companies to shell out for some licenses, if the assistants improve.

What I find most exciting is that we now have the compute resources, human resources, and market to work on much harder problems than was possible a decade ago.

White collar labor is also very expensive, and any tech that will help a bit with productivity will be bought. So, rather than a winter or a "singularity" (as the opposite extreme), I am expecting slow deployment over the next 5 years and continued improvements.

It may at best be an "OpenAI winter"! Why an AI winter? While they did contribute to making GenAI popular, and they are a significant player for sure, the AI field goes beyond GenAI, and certainly far beyond OpenAI, the company.

The problems with the current wave of AI technology go far beyond just OpenAI. I mention a few in my comments above. Refer to OP Gary Marcus's recent work for more details.

Jan 28 · edited Jan 28 · Liked by Gary Marcus

OpenAI will certainly survive, supported by MS's money, but OpenAI's self-appointed status as a leader of ethical and fair AI development must be questioned. So I wonder whether another, genuinely nonprofit, major AI company is not needed: a new company that would operate legally and candidly from the start. For instance, licensing all copyrighted material used to feed the models, inviting creators to contribute to the data base, assessing the hallucination risk for users, etc. Generally, establishing rules of good conduct for the whole AI industry.

The goal of this company would not be to make an instant technological breakthrough but rather to make an ethical one: to draw a safe path to sustainable AI for all. It could be created by AI professionals of good will and funded by donors of good will, with a vision and an ambition, who are aware of the risks and of the stakes. The AI-driven tools this company proposes would at first be less powerful than others on the market, but a guarantee of fair contribution and fair access could be a valuable compensation for users. Ethically guided AI companies offering safe and reliable tools, in full compliance with the law, are urgently needed, as administrations all over the world are not eager to set a real safety net for AI applications.

Jan 28 · Liked by Gary Marcus

Great summary, definitely one to share. It often feels like, as Gen Z would say, OpenAI is gaslighting us. They promise amazing technology that is easy for every user, while encouraging us to blame ourselves for bad prompts when we get unreliable answers, even though they admit that their unreliable systems will give us different answers every time. They claim new discoveries, but genAI simply regurgitates what is already known or created through others' hard work. It's all a very expensive and environmentally costly parlor trick. If the emperor has any clothes, it's just underwear.

Jan 30 · edited Jan 30 · Liked by Gary Marcus

I would argue that ChatGPT-4 is about the second most useful thing for coding to me, after integrated development environments. I still get far more of a productivity boost from things like the debugger. Perhaps OpenAI should have a slightly better valuation than companies like JetBrains (the makers of IntelliJ)?

Jan 27 · Liked by Gary Marcus

Good article. I trust your realistic view of the present state of affairs.

The appalling hallucination rate on legal questions puts me in mind of a tweet from M. Andreessen trying to spin a recent study showing that even LLMs trained not to be deceptive continue to be deceptive. He touted that as "gloriously uncontrollable." WTF does THAT mean? Is anybody in a C-suite anywhere looking to inject uncontrollable AI into their products or services? How about frequently incorrect legal advice? I'm starting to think there are fewer legit use cases for current AI than the hype suggests, unless your use case is churning out quantities of text where errors aren't important (i.e., propaganda or fiction).

When they spin **that** hard, you know they're running on fumes. Even in fiction you need to get things right :P

(Sorry li'l chatbot didn't mean to kill your Pulitzer prize dreams)

Jan 27 · edited Jan 27

This is a good writeup, and Gary's diligence is appreciated.

OpenAI has a lot going for it, and these challenges, like Altman's temporary ouster, will be seen as small glitches.

Some people assume that OpenAI's tech will stall, that all they will do is put data in the pot and pray that it cooks. That is very far from the truth.

Here are several techniques that hold great promise. None of them is foolproof alone, but together they can do much.

- Problem classification: There need not be one chatbot. Under the hood, the problem will be classified and the appropriate agent called. These agents can be developed independently (see the sketch after this list).

- RAG: Retrieve, generate, and verify based on fresh and reliable data. This makes it possible to have very recent knowledge, plus a large body of errata developed from current use that helps the bot improve on the fly.

- Use of tools: Bard made a lot of progress here recently. It runs software and produces results. We will see more in-depth reasoning where the bot makes a plan and then executes it. Of course, a lot of training is needed for a bot to use tools well.

- Verification: This is a huge problem, and there is no single approach. Yet the use of knowledge graphs, symbolic logic (we've seen examples of that), formal verifiers, and web search can greatly improve the results.
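
To make the first two items concrete, here is a toy, self-contained sketch of classification-plus-RAG. Keyword matching stands in for a real classifier and vector search, and the corpus, routes, and agents are all invented for the example; none of this is OpenAI's actual architecture.

```python
# Toy sketch of two techniques from the list above: routing a query to a
# specialised agent, and grounding its answer in retrieved documents (RAG).
# The corpus, keywords, and agents are illustrative placeholders.

CORPUS = {
    "billing": "Invoices are issued on the 1st of each month.",
    "api":     "The API rate limit is 60 requests per minute.",
}

def retrieve(query: str) -> str:
    """Naive keyword retrieval: return the document sharing the most words."""
    words = set(query.lower().split())
    return max(CORPUS.values(),
               key=lambda doc: len(words & set(doc.lower().split())))

def billing_agent(query: str) -> str:
    return f"[billing agent] Based on: {retrieve(query)}"

def api_agent(query: str) -> str:
    return f"[api agent] Based on: {retrieve(query)}"

ROUTES = {"invoice": billing_agent, "billing": billing_agent,
          "api": api_agent, "rate": api_agent}

def classify_and_answer(query: str) -> str:
    """Route the query to the first agent whose keyword it mentions."""
    for keyword, agent in ROUTES.items():
        if keyword in query.lower():
            return agent(query)
    return "No specialised agent matched; fall back to the base model."

print(classify_and_answer("What is the API rate limit?"))
```

In a real system the router and the retriever would themselves be learned components, which is where the training effort mentioned above comes in.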

We are getting closer to a versatile system that works like a human. It can have external memory which it can update, it can check itself, run simulators to do experiments, inspect results, and adapt based on failure.

This will not be done in one year, but we will see very nice progress for many years ahead.

founding

Gary - Nice summary of the blisters forming in OpenAI’s shoes. As I’m sure you know, I’ve been beating my own drum about very similar issues in my own Substack. I even credited you and one of your recent posts in my submission yesterday (“The Next AI Winter Has Already Begun.”).

But one thing I think you could have put more emphasis on is that OpenAI's problems actually pose a far greater threat to Microsoft. After Satya Nadella pumped over $13 billion into OpenAI, MS is at far greater risk of (a) losing market cap (like, maybe, a trillion?) and, just as costly, (b) very substantial reputational damage.

You agree?

Again, nice job!

Bill Lambos

author

Good point. Microsoft certainly won’t go out of business but you are right that the market cap might sink, and reputation could take a hit, if things don’t turn out well.

Also, they may look a little squirrelly after the FTC digs in.

Alternatively, Microsoft, which lost out on smartphones and the internet, may, for a change, be at the forefront of the AI revolution. There is a big demand for more automation of office work, and Microsoft, with its deep penetration of enterprise computing, is uniquely positioned to exploit. Low risk (given its market share), potentially very high reward.

These days, I see GPT, machine learning, and AI being used interchangeably, almost as synonyms, but these are three very distinct terms, with AI being the overall name for anything that exceeds human stupidity, I mean machine-enabled thinking. 20 years ago I wrote a paper on DSS (decision support systems), which support so many AI applications but are a research area in their own right.

I think it should be noted that many of these challenges are very speculative. Educated guesses can be valuable, but it's important to distinguish between speculation and certainty.

I'd agree with you that copyright and plagiarism will be an Achilles heel for OpenAI. Sadly, I believe lawmakers will legislate this sooner than they regulate Facebook. Somehow individuals' privacy isn't as high in the books of our elected representatives as the money-making enterprises that lobby them.

I don't quite agree with your third argument, "OpenAI lacks both profits and a moat": while the general principles for building generative AIs might be fairly well known, refining them, making them better at the "truthiness problem" (your fourth argument), and making them more suitable for specific applications is where a lot of the IP will sit. Google had been working on improvements for years and only came out with Bard and Gemini when pressured by OpenAI. Sure, there are sensitive areas, like the banking and military sectors you mention, but even they will have AI support in the not-so-distant future, no doubt. The military might start with focus areas such as logistics and supply management to reduce waste, ensure timely resupply, and improve overall efficiency with minimal risk to human life. Surveillance, reconnaissance, and cyber warfare (another one of my papers) will also be AI-supported.

That "most of the high quality data sources have already been tapped" is highly speculative in the fifth argument. We all know that GPT suffered from low quality data initially and that it's gotten better but whether most high quality data sources have already been used is too much out of thin air IMHO (unless you have prove).

The seventh argument should be a given. Ever since the dot-com crash we should know better than to put all our eggs in one basket. At times the article gave me the impression that AI (or GPT) is doomed because of the hype that OpenAI started. I don't think it is. GPT (and from there, more of AI) is here to stay. OpenAI isn't all of AI or GPT. It'd be a fallacy to think it's up to OpenAI to do the heavy lifting.

(To clarify: I'm not affiliated with OpenAI, nor am I defending ChatGPT. To that end, I support your question about candour.)

PS: I don't quite understand the link to apple.com over “OpenAI Quietly Deletes Ban on Using ChatGPT for ‘Military and Warfare'", but it might just be a mistake.
