Discussion about this post

David:

If your actions have no consequences, your representations have no meaning.

Stephen Wilson:

I hope this sort of debacle also prompts some careful reflection on the whole idea of “guardrails” — because it’s such a terrible metaphor.

The prominence given to the word betrays technologists thinking about safety too late and too superficially.

“Guardrails” dominates every AI safety discussion. Often the word is the solitary mention of protective measures.

But think for a minute: what actually is a guardrail?

Real-world guardrails save drivers from catastrophic equipment failure or personal failure (like a heart attack).

They are the safety measure of last resort!

But in AI it’s all they talk about — as if guardrails were the only way to mitigate bad AIs!

Holistic safety-in-depth tries to account for bad weather, poor roads, design errors, and murderous drivers. But with AI they seem to expect failing models to … what … just bounce around between imaginary barriers until they come to a stop?

And don’t get me started on the physics of real guardrails. They’re designed by engineers with a solid grasp of the material properties it takes to stop an out-of-control lorry. Yet AIs don’t obey the laws of physics. We have scant idea how Deep Neural Networks work, let alone how they fail.
