Discussion about this post

James Murnau (aka Tim James)

I maintain that many of the more "brilliant" responses from LLMs don't hold up to close scrutiny. We're often so blown away by the initial shock of "Wow, a bot did this?" that we forget to pay close attention to what's actually been written. Often the prompt is slightly fudged, or the bot is engaging in what I would describe as mad-libbing: taking sentences and phrases that originally referred to something else and simply changing the nouns. This seems to be the default for LLMs when answering whimsical questions: swapping more prosaic subjects ("parakeets") for more unusual ones ("flying pigs") in a way that looks like understanding if you forget just how much data these things are trained on.

(The root of all pareidolia when it comes to AIs is our tendency to forget that these things have quite literally swallowed the whole Internet. Your whimsical question has been asked on Reddit or Quora at least once, and probably 15 times.)

Tom Dietterich

Nice article! I would only add that these kinds of prompts break GPT because they are "out of distribution". There is presumably no training data about surgical churros, so GPT "tries" to find some connection between the two, and the connection it finds is about size. As you imply, size might even have a causal role in surgical instruments, but GPT can't reason about that, of course.

