Scoop: What former employees of OpenAI are worried about
An inside look. And, spoiler alert, it’s not Q*
Over the summer, I spoke to three people who have left OpenAI since November, when Sam Altman was fired and rehired. Almost everything has been off the record, but the picture has been consistent: promises were made and not kept; they lost faith in Altman personally, and in the company’s commitment to AI safety.
Yesterday, I had an opportunity to dive deeper. One of the three former employees spoke with me at length, and he has bravely—despite economic and legal pressure—decided to go on the record: William Saunders, who worked at OpenAI for three years, on the Alignment team, which eventually became the Superalignment team. He resigned from OpenAI on February 15, 2024.
He’s not going to tell you, or me, when GPT-5 will be released (if he even knows), but he has a lot of important things to say about what we might need to make a safe AI world — and where big companies are falling short.
§
In our conversation, Saunders made three key points:
Current AI is not all that scary, but future AI might become very scary. We don’t know when that will happen, but we are not prepared for it. His worry was not about any specific technology that OpenAI was about to drop (and he couldn’t comment on those details in any case), but about how it, or any other company, might roll out future products if and when those products posed major risks.
Internal governance is key; it shouldn’t be just one person at the top of one company calling the shots for all humanity. In principle, major decisions, possibly even species-defining decisions, could be made by a single person, whether at OpenAI or elsewhere. In fact, that’s the most likely scenario, given the dynamics of power, and nothing legally prevents it. Boards can be weak, and may not even be consulted; rank-and-file employees rarely have much say. As I have noted elsewhere, when Zuckerberg decided to open-source Meta’s Llama, he didn’t need anyone’s approval; rumor has it his General Counsel advised otherwise. If GPT-6 (if we ever get there) were truly dangerous, Altman could release it even over immense internal objections, from lawyers, engineers, and maybe even the board (if the board were even told about the release at all).
There should be a role for external governance as well: companies should not be able to make decisions of potentially enormous magnitude on their own; external advisory bodies should have a strong say in weighing risks against benefits. I stressed the same point in my Senate testimony, but nothing in existing US law addresses this need.
§
In an upcoming blog post (sharp, insightful, and heartfelt), Saunders goes into considerably more detail.
There, he proposes nine “high-risk decision principles.” The first is “Seek as broad and legitimate authority for your decisions as is possible under the circumstances”; the second, “Don’t take actions which impose significant risks to others without overwhelming evidence of net benefit”; and the seventh, which he described as a metaprinciple, is “Don’t give power to people or structures that can’t be held accountable.”
As he noted in conversation, in principle, the CEOs of AI companies like OpenAI or Anthropic could make decisions of extraordinary impact without buy-in from anybody, inside or outside their own companies, utterly ignoring all of those principles.
Saunders found that terrifying. So do I.
§
I can’t help but think of something Sam Altman famously told The New Yorker in 2016, in the early days of OpenAI, about eventually giving the wider world a voice through a governance board.
Altman’s views of his overall mission may not have changed, but the rest of us still don’t have a voice. No such governance board ever materialized; if one ever does, you can bet it won’t have any teeth, unless a government mandate requires that it does. California’s pending AI bill, SB-1047, was supposed to be a tiny step in that direction, and even that has been watered down.
In the course of my conversation with Saunders, it became clear that one of the most important reasons for passing SB-1047 in California is its whistleblower protections. At one point, he said something to the effect that, if it passes, at least future whistleblowers will be able to speak freely to the Attorney General. OpenAI has relaxed some of its nondisparagement restrictions, but much of what future whistleblowers might need to disclose could still be precluded by confidentiality agreements. SB-1047 would allow whistleblowers to say more, to at least one party, the California Attorney General, who could act on it.
§
Part of our conversation was sparked by a bit of news: as The Verge reported, OpenAI has just announced that it opposes California’s SB-1047, despite Altman’s public support for AI regulation at the Senate. The justifications for its opposition were meager, e.g., “the broad and significant implications of AI for U.S. competitiveness and national security require that regulation of frontier models be shaped and implemented at the federal level. A federally-driven set of AI policies, rather than a patchwork of state laws, will foster innovation and position the U.S. to lead the development of global standards.”
To Saunders, this about-face felt familiar and emblematic: it’s one thing for OpenAI to talk approvingly about safety in public, another for it to accept any external control whatsoever on its actions, or even to honor its commitments internally. He left the company because he lost faith that it would honor its commitments to safety.
Saunders doesn’t think SB-1047 is perfect, but he says, “The proposed SB 1047 legislation in California, while it could be improved, was the best attempt I’ve seen to provide a check on this power.”
He was none too impressed by OpenAI’s last-minute opposition, writing in his draft blog that OpenAI’s comment today on SB-1047
[amounts to] fear mongering about the consequences of the bill without naming any specific ways the bill is harmful or could be improved, kicking the can down the road to the federal government even though no similar legislation is underway federally. If OpenAI was acting in good faith, they could have proposed amendments months ago, including sun-setting the California law once sufficiently similar federal law existed.
But of course they didn’t.
§
With companies accountable to nothing, not even to what they have told Congress and the White House, self-regulation is never going to work.
Unfortunately, if OpenAI and others in Silicon Valley succeed in torpedoing SB-1047, self-regulation is in many ways what we will be left with (especially when efforts at federal regulation, aside from the Biden administration’s AI Executive Order, are moving so slowly).
When and if some future form of AI brings new threats, we will be screwed, our collective fates determined by a tiny number of individuals running a few companies, with almost no accountability. That just can’t be good.
§
Saunders and I disagree on timelines. He thinks it is at least somewhat plausible we will see AGI in a few years; I do not.
But we fully agree on the overall picture: if we don’t figure out the governance problem, internal and external, before the next big AI advance, whenever it may come, we could be in serious trouble.
As Lord Acton famously noted in 1887, “Power tends to corrupt, and absolute power corrupts absolutely.”
Update: Literally as I was about to post this, I saw that Saunders signed this open letter.
Gary Marcus is the author of the upcoming Taming Silicon Valley [MIT Press, September 17], about the moral decline of big tech, and what we as citizens can do about it.