OpenAI’s Got 9.9 Problems, and Twitch Ain’t One
The AI darling faces a long list of serious challenges in 2024
Last year was, without question, OpenAI’s year.
ChatGPT, a product the company released at the end of November 2022, took the world by storm; by early 2023, over 100 million users had adopted it. Pretty much every major media outlet covered it, most of them many times (“5 ways ChatGPT can help your love life”, your pet, and so on), and most major companies began to experiment with it. Google struggled to keep up, and shares of Microsoft, which has a large stake in OpenAI, jumped by nearly 50%. AI was on everybody’s lips, and OpenAI became central to virtually every discussion about AI. OpenAI’s CEO Sam Altman was ubiquitous, testifying at the US Senate (I sat beside him, as Senators addressed him by first name) and making a world tour, meeting world leaders as if he were royalty.
But suddenly, and with little warning, the world got a hint that something was not altogether right. On Friday, November 17th, the company’s board unexpectedly fired Altman, claiming that he was “not consistently candid”. Altman was dramatically rehired a few days later, but perhaps in a diminished position. Despite enormous support from his employees, Altman and his long-time collaborator Greg Brockman were taken off OpenAI’s board, and The Information and others reported that Altman agreed to an internal investigation into “alleged conduct that prompted the board to oust him”.
Several reports, including at The New Yorker and The Wall Street Journal, suggested that Altman had lied about the views of one board member in an effort to get another fired, consistent with the claim that he was not consistently candid. (After all this came out, I began having questions of my own about Altman’s candor, recalling in particular his answer to Senator Kennedy’s question back in May about his income. Altman said he had “no equity in OpenAI”, but in hindsight the answer was less than fully forthcoming. Although Altman has no direct equity in the company, he does in fact have indirect equity, through his holdings in his former company Y Combinator, quite possibly worth tens of millions of dollars.)
Candor might not be of the essence in many companies, but Altman isn’t selling used cars, either. Because Altman’s company is presumably central to the future of AI, and hence potentially to the future of humanity, he is likely, justifiably, to remain under a microscope henceforth.
Regardless of how the internal investigation comes out, 2024 is looking a lot more challenging, both for Altman and for the company.
Indeed, as a scientist and technologist who has watched the company for a long time, I see a long list of challenges facing it this year.
First, lawsuits against the company are likely to come fast and furious. The New York Times sued in December, in a case that many think has considerable legs, in part because it was possible to get the system to reproduce full paragraphs of copyrighted text verbatim or near-verbatim. Then, on January 6, the film-industry concept artist Reid Southen and I showed that OpenAI’s image-generation system, DALL-E 3, was capable of producing recognizable trademarked characters, such as Nintendo’s Mario and Sonic the Hedgehog, without being directly asked to do so, which may lead to an additional flurry of lawsuits from film and video game studios; actors might have grounds as well. (We showed that a competitor, Midjourney, was capable of the same kind of potential copyright infringement.) It would not be surprising if OpenAI spends a huge part of its energy this year defending lawsuits.
Second, battles over copyrighted materials may wind up cutting massively into profits, and may be unavoidable. The company itself recently claimed that it would be “impossible” for its AI to work adequately without drawing heavily on copyrighted materials. Rather cheekily, the company asked the British government for an enormous handout: it effectively asked the House of Lords to waive copyright laws so that it might prosper, to the considerable detriment of artists, writers, and other creators who would thus receive no compensation for their work. This was a truly remarkable request for an intellectual-property welfare system by a not particularly indigent corporation. The request is likely in violation both of common law and of international conventions around intellectual property such as the Berne Convention, and publishers, writers, and artists have every right to fight it tooth and nail. And, ethically speaking, it would amount to a massive giveaway to a handful of big companies, at an enormous number of people’s expense. On January 10th, at a US Senate Judiciary Committee hearing, both Democratic and Republican senators spoke forcefully against doing any such thing. And a few days later the UK House of Commons pretty much said no, “recommend[ing] that the Government … not pursue plans for a broad text and data mining exemption to copyright”.
And in reality, OpenAI painted a false dichotomy. The choice is not between them building AI or not building it; it is between building AI for free and building AI by paying for their raw materials, doing what companies like Netflix and Spotify do routinely: licensing the copyrighted materials they commercialize. OpenAI knows this full well. Even as they implied to the House of Lords that “free” was the only option, behind the scenes they were busy negotiating licenses. Given how many different kinds of copyrighted works they are drawing on, from how many vendors (many media outlets, many book publishers, many film studios, etc.), and given how commercial their use of those sources is, the licensing costs may quickly mount.
Third, at least so far, OpenAI lacks both profits and a moat. The systems they build are hugely expensive to operate, because they require massive amounts of compute for training (estimated in the tens or hundreds of millions of dollars for GPT-4, and rumored to be on the order of a billion dollars for GPT-5), but at the same time the general principles for building them have become fairly well known in the industry. Google finally seems to be catching up, with its Gemini model, and Meta is releasing open-source competitors, perhaps not quite as good, but improving quickly. Amazon and Apple may compete as well, and dozens of startups are trying to. Large language models such as ChatGPT may quickly become commodities; we can expect price wars, and profits may continue to be elusive, or modest at best.
Fourth, ChatGPT and related systems have a kind of truthiness problem; some of what they say is true, and some is not, and it is very difficult for the end user to anticipate which will be which. They have been known to make up biographical details, and even whole court cases; they have defamed people, and they occasionally botch basic math questions. Whatever they say sounds authoritative, but it is not always true; as they say in the military, “frequently wrong, never in doubt”.
In the industry these kinds of errors have come to be known as “hallucinations”, and they are common enough that the word “hallucinate” became Dictionary.com’s word of the year for 2023. There is no immediate fix in the offing, and as someone who has studied related technologies for the last three decades, I am not certain that we will see one soon. A huge number of major companies looked into ChatGPT this year, but many have been held back, to a greater or lesser degree, by these errors. In domains with high stakes (e.g., banking or military applications) occasional errors could be quite costly.
In another domain, law, a Stanford study earlier this month reported that
legal hallucinations are pervasive and disturbing: hallucination rates range from 69% to 88% in response to specific legal queries for state-of-the-art language models. Moreover, these models often lack self-awareness about their errors and tend to reinforce incorrect legal assumptions and beliefs.
New models that allow both images and text seem to be facing similar problems.
Fifth, although the underlying technology initially improved rapidly, leading to a lot of excitement, it may soon, perhaps this year or next, reach a plateau. Progress depends heavily on having vast amounts of data, and it is likely that most of the high-quality data sources have already been tapped. Further data may be of diminishing utility if it is of low quality or overlaps with existing data. Moreover, more data tends to go with bigger models that are more expensive to operate; the increases in cost may not be accompanied by big enough gains in performance, at some point leading the big companies to ease off the gas. Bill Gates, initially excited about large language models, recently anticipated a plateau. If a plateau is reached in the near term, a lot of the enthusiasm may dissipate, and investors may start to temper their investments.
Sixth, although ChatGPT has already been widely adopted by computer programmers and perhaps others, and has been explored by a great many Chief Information Officers and the like, it is still not clear that there will be enough adoption to justify the company’s stratospheric valuations. Many companies are still in wait-and-see mode, trying the software but not putting it into full-time production. In an opinion piece at WIRED this week, MIT economist Daron Acemoglu argued that the field could be in for a “great AI disappointment”, suggesting that “Rose-tinted predictions for artificial intelligence’s grand achievements will be swept aside by underwhelming performance and dangerous results.”
Seventh, word on the street at Davos was that after the Fall drama some companies have lost some confidence in OpenAI. Whereas many had all or most of their eggs in OpenAI’s basket before the drama, some are now said to be looking for backup plans and alternative vendors.
Eighth, the FTC has taken a strong interest in OpenAI, with at least two separate actions. The first, reported in July, concerns privacy and accuracy. It has become apparent that large language models, OpenAI’s core technology, sometimes leak private data, and that this problem has not been resolved; hallucinations, too, remain problematic. (Regulation in the EU and elsewhere may also force OpenAI to confront problems with its software around unreliability, privacy, off-label use to generate misinformation, use in the service of cybercrime, and so on.) Meanwhile, on January 25, the FTC announced an “Inquiry into Generative AI Investments and Partnerships”, demanding information from Alphabet, Amazon.com, Anthropic, Microsoft, and OpenAI. OpenAI’s unusual relationship with Microsoft is sure to receive considerable scrutiny.
Ninth, internal tensions are likely to remain. The company was originally established as a nonprofit “engag[ing] in research activities that advance digital intelligence in the way that is most likely to benefit humanity as a whole, unconstrained by a need to generate financial return”; a for-profit that does not always seem to align with those goals was created underneath it in 2019. Some have taken November’s battle over management to represent a conflict between the two: between those wishing to maintain the nonprofit’s mission of human benefit and those seeking to maximize profit. Since the nonprofit was there first, and is at the top of the food chain, serious questions remain. And the candor of the company as a whole will remain an issue. Just in the last few weeks I spotted two separate headlines in two different outlets, each describing a different abandoned promise: “OpenAI Quietly Deletes Ban on Using ChatGPT for ‘Military and Warfare’” (The Intercept, January 12) and “OpenAI Quietly Scrapped a Promise to Disclose Key Documents to the Public” (WIRED, January 24), again raising questions about the direction and candor of the company relative to its original mission.
Finally, the nonprofit-versus-for-profit issue has bubbled over to the outside world. On January 9, the public advocacy group Public Citizen upped the stakes and wrote to the California Attorney General, asking them to dissolve OpenAI’s nonprofit parent on the grounds that “OpenAI, Inc. may have failed to carry out its non-profit purposes”, arguing that the nonprofit is now “acting under the effective control of its for-profit subsidiary affiliate”, and urging the state to distribute the proceeds (perhaps worth billions) to charity. This call could easily put a further strain on the company, and perhaps lead to other uncomfortable revelations if California chooses to investigate. Altman may find himself further in the hot seat.
§
That’s a lot for any one company to deal with.
It’s also worth noting that OpenAI is not alone in all of this; many other generative AI companies are likely to encounter intense legal and public pressure this year around potential copyright infringement.
Here’s my own guess: OpenAI, backed by Microsoft’s trillions, and now deeply tied to Microsoft, will survive, but the bloom will fade from the rose. Payouts for lawsuits and licensing will cut heavily into potential future profits. External investors pouring money into the company (most recently at a valuation rumored to be $86 billion) may find themselves disappointed, unclear on how profits to justify those valuations could ever be made. Valuations for OpenAI and some of its peers may drop, sharply.
A lot of what has driven excitement about OpenAI has been a sense of unlimited potential. That could soon change.
Gary Marcus, Professor Emeritus at NYU, was Founder and CEO of the machine learning company Geometric Intelligence, acquired by Uber, hosted the 8-part podcast Humans versus Machines, and is co-author of Rebooting AI, one of Forbes’ must-read books in AI.
A real-life anecdote to back up what Gary has outlined: a few months ago, I was interviewed by a global organization that needed help with their technical documentation (communicating the importance of a specific set of green technologies to a wider, non-technical audience). This organization has over 50,000 members all over the world, and the work they do directly impacts the built environment.
The first question they asked me was, "How do you feel about using ChatGPT in your work (as a writer/editor)?"
Note the open-ended nature of the question. They weren't making a value judgment. The question didn't lead; they didn't ask, "Do you use ChatGPT in your work?"
I said, simply and immediately, "I refuse." I then qualified my answer with many of the points that Gary, I, and many others have been making about the reliability of generative AI.
There was a beat of silence as the three people on the call looked at each other. Then they broke into applause. (And they hired me a few weeks later.)
I have never received actual applause in an interview.
This says it all: "OpenAI painted a false dichotomy. The choice is not between them building AI or not, it is between them building AI for free or building AI by paying for their raw materials." With all the hype and fascination, OpenAI was never properly evaluated on basic business fundamentals.
Gary, the more you write, the more OpenAI, seen in light of the last 1,000 years of business and technology, looks comparable to the dot-com era's crash and burn. Technology and times may change, but human greed and stupidity don't.