An alternative view of what is happening is that we have been passing through three different phases of LLM-based development.
In Phase 1, "scaling is all you need" was the dominant view. As data, network size, and compute scaled, new capabilities (especially in-context learning) emerged. But each increment in performance required exponentially more data and compute.
In Phase 2, "scaling + external resources is all you need" became dominant. It started with RAG and Toolformer, but has rapidly moved to include invoking Python interpreters and external problem solvers (plan verifiers, Wikipedia fact-checking, etc.).
In Phase 3, the dominant view is becoming "scaling + external resources + inference compute is all you need". I would characterize this as the realization that the LLM only provides part of what is needed for a complete cognitive system. OpenAI doesn't call it this, but we could view o1 as adopting the impasse mechanism of SOAR-style architectures. If the LLM has high uncertainty after a single forward pass through the model, it decides to conduct some form of forward search combined with answer checking/verification to find the right answer. In SOAR, this generates a new chunk in memory, and perhaps at OpenAI they will salt the result away as a new training example for periodic retraining. The cognitive architecture community has a mature understanding of the components of the human cognitive architecture and how they work together to achieve human general intelligence. In my view, they give us the best operational definition of AGI. If they are correct, then building a cognitive architecture by combining LLMs with the other mechanisms of existing cognitive architectures is likely to produce "AGI" systems with capabilities close to human cognitive capabilities.
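As a purely hypothetical sketch of that impasse idea (this is not anything OpenAI has described; every function name and threshold below is made up for illustration), the control loop might look like:

```python
# Hypothetical "impasse"-style control loop: accept the single forward pass
# when the model is confident, otherwise spend inference compute on forward
# search plus verification. The callables are stand-ins for model and checker
# calls, not any real API.
from typing import Callable, Optional

def answer_with_impasse(
    question: str,
    generate: Callable[[str], str],           # one forward pass -> candidate answer
    confidence: Callable[[str, str], float],  # estimated confidence in a candidate
    verify: Callable[[str, str], bool],       # external checker (tests, solver, facts)
    threshold: float = 0.9,
    n_samples: int = 16,
) -> Optional[str]:
    first = generate(question)
    if confidence(question, first) >= threshold:
        return first                          # no impasse: accept the single pass
    # Impasse: sample more candidates and keep the first one that verifies.
    for _ in range(n_samples):
        candidate = generate(question)
        if verify(question, candidate):
            return candidate                  # could be salted away as a new training example
    return None                               # unresolved impasse
```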
Sounds like neurosymbolic AI in the end, no?
Well, someday someone may figure out how to do it all in a connectionist architecture. But either way, we are seeing more and more structure in these systems. I also think the pragmatic engineers in startups will be thinking: "I could try to do reasoning inside the net, but damn this SAT solver runs fast on my GPU." I'm on the lookout for interesting combinations of heavily optimized symbolic AI reasoning engines and strong contextual knowledge retrieved from the LLM. That would give us the soundness of the inference engine plus the rich context and world knowledge of the LLM. It's not how people work, but it is a great way to build an AI system.
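To make that combination concrete, here is a toy sketch of the division of labor I have in mind. The constraints are hand-written stand-ins for what an LLM might extract from a prose description, and the brute-force check stands in for a heavily optimized solver:

```python
# Toy division of labor: an LLM (stand-in here) turns messy prose into explicit
# constraints; a sound symbolic engine does the actual reasoning. A real system
# would call a model plus an optimized SAT/SMT solver; the brute-force search
# below just illustrates the interface.
from itertools import product

# Pretend the LLM extracted these scheduling constraints from a description.
# Variables: A, B, C are True when that meeting is scheduled in the morning.
constraints = [
    lambda A, B, C: A or B,         # "A or B must be in the morning"
    lambda A, B, C: not (A and C),  # "A and C cannot both be in the morning"
    lambda A, B, C: C or not B,     # "if B is in the morning, so is C"
]

def solve(constraints):
    """Exhaustive search over assignments; a real SAT solver does this efficiently."""
    for A, B, C in product([False, True], repeat=3):
        if all(c(A, B, C) for c in constraints):
            return {"A": A, "B": B, "C": C}
    return None  # unsatisfiable, and the inference is sound either way

print(solve(constraints))  # {'A': False, 'B': True, 'C': True}
```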
Mostly agreed, with the caveats that (a) we don't yet understand how to combine LLMs with those other mechanisms in a way that will work, and (b) even when we make some progress on that question, I think it's still going to be incremental; I would not yet use words like "likely ... close to human cognitive capabilities".
Yes, I'm speculating here (and will probably regret it quite soon). If past experience is a guide, we will discover yet more pieces that are needed.
Tom, a question from a neophyte: would all these additional systems ancillary to the LLM be helpful in terms of interpretability?
Maybe. The more structure that is exposed by the system, the more interpretable it can be. For example, RAG makes it possible to cite source documents. However, as the size of a search space scales up (e.g., in AlphaGo or in a SAT solver), the size of the "explanation" grows very large, and new techniques are needed to summarize it. That raises the long-standing challenge of discovering human-interpretable abstractions.
When LLMs cite “source documents,” are they actually citing the specific documents from which particular data came?
Or are they citing after-the-fact “best guesses” about where the data might have come from, e.g., based on a web search of keywords?
If they are citing the actual source documents, how does that work?
I've seen a GPT-based system cite correctly in some sense, but still hallucinate the details when forced (for example) to do arithmetic.
In Retrieval Augmented Generation, a collection of documents (e.g., Wikipedia) is pre-processed and indexed into a vector database. During generation, your question is matched against the vector database, and relevant passages from the documents are copied into the LLM's context buffer. Bing (and presumably Google) also does a web search and includes some results in the input buffer as well. My simple model is that it is these retrieved documents that are cited. But I imagine the commercial models have multiple strategies for determining which documents to cite. Studies have shown that the generated answers can mix retrieved material with information learned during the pre-training phase. You must check everything an LLM produces!
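For intuition, here is a deliberately crude sketch of that pipeline, with simple word overlap standing in for the embedding model and vector database. The point is the data flow: retrieve passages, paste them into the prompt, and cite what was retrieved.

```python
# Minimal retrieval-augmented-generation sketch. Word overlap stands in for a
# real embedding model and vector database; the documents are toy examples.

def score(query: str, passage: str) -> float:
    """Crude relevance score: fraction of query words appearing in the passage."""
    q = set(query.lower().split())
    p = set(passage.lower().split())
    return len(q & p) / max(len(q), 1)

documents = {
    "doc1": "Paris is the capital and largest city of France.",
    "doc2": "The Eiffel Tower was completed in 1889.",
    "doc3": "Mount Everest is the highest mountain above sea level.",
}

def retrieve(query: str, k: int = 2):
    """Return the top-k (doc_id, passage) pairs by the crude score."""
    ranked = sorted(documents.items(), key=lambda kv: score(query, kv[1]), reverse=True)
    return ranked[:k]

def build_prompt(query: str) -> str:
    """Copy retrieved passages into the context and keep their ids for citation."""
    hits = retrieve(query)
    context = "\n".join(f"[{doc_id}] {text}" for doc_id, text in hits)
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer with citations like [doc1]."

print(build_prompt("What is the capital of France?"))
```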
This is particularly evident when using Perplexity. It's also particularly frustrating: if the source is garbage, the RAG output will basically be garbage, or an interpretation based on garbage. Humans can quickly tell when a source is bad, but it seems difficult for their pipeline to do that. I am also still wondering whether they remain slaves to the SEO and PageRank algorithms used to retrieve those documents?
Thanks.
And I plan to check😊