Discussion about this post

Gerben Wierda

The paper by Ernest Davis is indeed worthwhile. I found the final paragraph of section 4.8 especially illustrative.

Using the LLM as a sort of 'loaded dice' for the mutation element of a genetic algorithm is a nice ('tinkering engineers') trick, but it also raises a question: how do the LLM's constraints (they are stochastically constrained confabulators, after all) affect the mutations, and therefore how effectively you can find genetic optima?

It seems we're in the 'engineering the hell out of a fundamentally limited approach' stage for transformer LLMs. And the overselling by Google PR is becoming a pattern (see Gemini).

Scott Burson

Just started to read the paper, and found this (line 54):

"First, we sample best performing programs and feed them back into prompts for the LLM to improve on; we refer to this as best-shot prompting. Second, we start with a program in the form of a skeleton (containing boilerplate code and potentially prior structure about the problem), and only evolve the part governing the critical program logic. For example, by setting a greedy program skeleton, we evolve a priority function used to make decisions at every step. Third, we maintain a large pool of diverse programs by using an island-based evolutionary method that encourages exploration and avoids local optima."

This is already coming across to me as a likely case of stone soup — I'm referring to an old fable in which soup was allegedly made from a stone, but it becomes clear in the telling that there were lots of other ingredients that actually made it soup. Given the structure they've described above, they could very well have gotten the same result, I would expect, using a random program tree generator — this is just John Koza's genetic programming technique from the '90s. Does anyone seriously believe that there was information anywhere in the LLM's training corpus that bore on this problem?
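To make the comparison concrete, here is a minimal sketch of the structure quoted above (a fixed greedy skeleton, an evolved priority function, and islands), with a purely random mutation operator standing in where the LLM would go. Everything in it is an assumption for illustration: the toy knapsack instance, the names `greedy_skeleton`, `make_priority`, `mutate` and `evolve`, and the choice to evolve three numeric coefficients rather than program trees, which makes it much simpler than Koza-style genetic programming and not the paper's actual code.

```python
import random

# Hypothetical toy instance: (value, weight) items and a knapsack capacity.
ITEMS = [(60, 10), (100, 20), (120, 30), (30, 5), (70, 25), (20, 15)]
CAPACITY = 50


def greedy_skeleton(priority):
    """Fixed skeleton: take items in descending priority order while they fit."""
    remaining, total = CAPACITY, 0
    for value, weight in sorted(ITEMS, key=lambda it: priority(*it), reverse=True):
        if weight <= remaining:
            remaining -= weight
            total += value
    return total


def make_priority(params):
    """The only evolved part: a priority function defined by three coefficients."""
    a, b, c = params
    return lambda value, weight: a * value + b * weight + c * value / weight


def score(params):
    return greedy_skeleton(make_priority(params))


def mutate(parent, rng):
    """Random perturbation; this is where an LLM-proposed edit would go instead."""
    return tuple(p + rng.gauss(0, 0.5) for p in parent)


def evolve(num_islands=4, island_size=8, generations=30, seed=0):
    rng = random.Random(seed)
    islands = [
        [tuple(rng.uniform(-1, 1) for _ in range(3)) for _ in range(island_size)]
        for _ in range(num_islands)
    ]
    for gen in range(1, generations + 1):
        for pop in islands:
            pop.sort(key=score, reverse=True)
            parent = rng.choice(pop[:3])      # sample among the best performers
            child = mutate(parent, rng)
            if score(child) >= score(pop[-1]):
                pop[-1] = child               # replace the island's worst member
        if gen % 10 == 0:
            # Periodically reseed the weakest island from the overall best.
            islands.sort(key=lambda pop: max(map(score, pop)), reverse=True)
            best = max(islands[0], key=score)
            islands[-1] = [best] + [mutate(best, rng) for _ in range(island_size - 1)]
    best = max((p for pop in islands for p in pop), key=score)
    return best, score(best)


if __name__ == "__main__":
    best_params, best_score = evolve()
    print(best_params, best_score)
```

Swapping `mutate` for a call to a model is the only change needed to turn this into the 'loaded dice' variant, so the same harness could in principle be used to check whether the LLM adds anything over random mutation on a given problem.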
