Discussion about this post

User's avatar
Andy Squire's avatar

This confirms the core issue:

we’re evaluating agents as if they were single decisions, when they behave as multi-step execution systems.

Safe steps composing into unsafe outcomes isn’t an edge case - it’s the default.

Which means most current “AI governance” approaches are mis-specified:

they audit after the fact instead of enforcing admissibility at runtime.

The real requirement is not better guardrails -

it’s control at the execution boundary across the entire chain.

TheOtherKC's avatar

As someone who loves watching systems break, AI has been a never-ending source of fascination and comedy. The STAC example in particular was so funny I had to call over my co-workers to look at it with me.

...and then I remember that every clever and hilarious exploit represents a real risk of harming actual people.

91 more comments...

No posts

Ready for more?