1 Comment

Simply invent new, harder tests that "old" AIs will score low on, like 40% for GPT-4, and then the new model gets 80% or something.
