Discussion about this post

Dennis P Waters:

Sadly, Alexander does not seem to grasp that text inputs alone (no matter how much text) will never substitute for the embedded knowledge of the world that comes from perceptual and motor systems that have evolved over billions of years. Smart guy, though, in other contexts.

Alex Semenov:

The Meta AI paper "Memorization Without Overfitting: Analyzing the Training Dynamics of Large Language Models" by Kushal Tirumala et al. states:

"Surprisingly, ... as we increase the number of parameters, memorization before overfitting generally increases, indicating that overfitting by itself cannot completely explain the properties of memorization dynamics as model scale increases"

LLMs involuntarily drift toward architectures in which every pattern gets its own dedicated node in the network (as opposed to tuning a single node's activations to represent many patterns). Plastic, growing architectures are the [partial] answer; LLMs (of whatever size) merely approximate them by increasing model size.

"A node a pattern" architectures possess some amazing properties (natively explainable, continuously and locally trained, stable against drift and overfitting, +++), yet they still remain stochastic parrots. They are just a foundation for a "smooth transition" to [neuro]symbolic layers above. Work in progress.
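For what it's worth, here is a minimal toy sketch of what a "a node a pattern" memory might look like. Everything here (the class name, the cosine-similarity matching, the threshold, the averaging update) is my own illustrative assumption, not a description of any existing system; it just shows the two properties claimed above: local training (a match nudges one node, a miss grows one node) and native explainability (a prediction is simply "this node fired").

```python
import math

def _cosine(a, b):
    """Cosine similarity between two vectors (plain lists of floats)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

class PatternNodeNet:
    """Toy 'a node per pattern' memory.

    Each stored prototype is its own node. Training either reinforces the
    single matching node or allocates a fresh one -- there are no global
    weight updates, so old nodes are untouched (no drift, no overfitting
    of previously learned patterns).
    """

    def __init__(self, threshold=0.95):
        self.threshold = threshold  # match cutoff (assumed value)
        self.nodes = []             # list of (prototype_vector, label)

    def train(self, x, label):
        for i, (proto, lab) in enumerate(self.nodes):
            if _cosine(proto, x) >= self.threshold:
                # Local update: average the prototype toward the new example.
                merged = [(p + v) / 2.0 for p, v in zip(proto, x)]
                self.nodes[i] = (merged, lab)
                return i
        # No node matched: grow the architecture by one node.
        self.nodes.append((list(x), label))
        return len(self.nodes) - 1

    def predict(self, x):
        # Natively explainable: the answer is whichever node fired best.
        if not self.nodes:
            return None
        best = max(self.nodes, key=lambda n: _cosine(n[0], x))
        return best[1]

net = PatternNodeNet(threshold=0.95)
net.train([1.0, 0.0], "horizontal")
net.train([0.0, 1.0], "vertical")   # orthogonal -> a second node is grown
print(len(net.nodes))               # -> 2 (one node per distinct pattern)
print(net.predict([0.9, 0.1]))      # -> horizontal
```

The "stochastic parrot" limitation is visible in the sketch too: the network can only ever answer with a label it has already stored, which is why the comment treats such memories as a foundation for symbolic layers rather than intelligence in themselves.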

In short: Gary's correct, no LLM will ever be intelligent, being built on the wrong foundation with no idea of how to build the first floor. Shiny, though :-)

