23 Comments
Dec 7, 2023 · Liked by Gary Marcus

The problem is that they are consuming all the internet data, leaving nothing held out for independent test sets with which to really evaluate these models.

Dec 7, 2023 · Liked by Gary Marcus

Let's solve the simple productivity issue at hand. I apply to be your professional editor. For more details, kindly check your webmail.

Dec 6, 2023 · Liked by Gary Marcus

Another issue is that there doesn't seem to be a wide IP moat, right? The barrier is compute and know-how, but the players rich enough to afford it may all end up offering variations of the same (limited) capability. I wonder how this plays out for publishers who have proprietary data that they may be able to use to build more unique offerings (assuming they can keep their data from leaking into the big players' tools).

Here you are assuming that companies will be content to use the Transformer algorithms and simply add more data. I believe we will see a very large software stack, including databases, simulators, and planners, all hidden behind the chat interface. Companies with the best engineering talent will stay ahead, as always.
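
To make that stack idea concrete, here is a minimal sketch of a chat facade dispatching to hidden components. Every name in it (query_database, run_simulation, make_plan) is a hypothetical stub, not any real product's API:

```python
# A toy facade: the user sees one chat interface, but requests are
# dispatched to different backend components. All backends here are
# hypothetical stubs standing in for a real database/simulator/planner.
def query_database(question: str) -> str:
    return f"db rows matching '{question}'"

def run_simulation(question: str) -> str:
    return f"simulated outcome for '{question}'"

def make_plan(question: str) -> str:
    return f"step-by-step plan for '{question}'"

BACKENDS = {
    "lookup": query_database,
    "what-if": run_simulation,
    "how-to": make_plan,
}

def chat(question: str) -> str:
    # Crude intent detection; a production system would use a model here.
    q = question.lower()
    if q.startswith("what if"):
        kind = "what-if"
    elif q.startswith("how do"):
        kind = "how-to"
    else:
        kind = "lookup"
    return BACKENDS[kind](question)

print(chat("What if demand doubles next quarter?"))
```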

Dec 6, 2023 · edited Dec 6, 2023

I'm hypothesizing that focused tools with curated data may be better uses for LLMs. Less data, not more. Tools for people with enough expertise to catch the hallucinations, not "one tool to rule them all." Really the opposite of the hype so far, and I'm not convinced that even this will be particularly useful, given the costs.

I agree that focused and specialized uses of LLMs will do better. This is where the field is going, with mixture-of-experts LLMs. For example, if you want a bot that is good at math, it should be trained on math problems, not on internet junk. Then, depending on what the user wants, the right agent is invoked.
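
As a rough sketch of that dispatch idea (the keyword classifier and canned experts below are toy stand-ins; real mixture-of-experts routing is learned, and happens inside the model):

```python
# A toy dispatcher for the "right agent is invoked" idea above.
# Each "expert" is a stub; a real system would route to separately
# trained specialist models via a learned classifier.
EXPERTS = {
    "math": lambda q: f"[math expert] working the problem: {q}",
    "code": lambda q: f"[code expert] drafting code for: {q}",
    "general": lambda q: f"[general expert] answering: {q}",
}

def pick_expert(question: str) -> str:
    q = question.lower()
    if any(w in q for w in ("integral", "equation", "prove", "solve")):
        return "math"
    if any(w in q for w in ("python", "function", "bug", "compile")):
        return "code"
    return "general"

def answer(question: str) -> str:
    return EXPERTS[pick_expert(question)](question)

print(answer("Solve the equation x^2 - 4 = 0"))
```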

I agree with both of you.

One issue I see is how interested entrepreneurs can gather enough training data to make a good focused transformer (not even thinking about the compute cost of training models). Even if a few or more companies in the same business pooled their IP together, would that be sufficient training data? I am specifically thinking about electrical engineering design tools; it seems there is at least one company claiming they have something coming soon in that domain. I wonder where they got enough training data.

The argument that LLMs are close to a plateau rests on the following reasonable points: (a) companies are already breaking the bank in terms of energy use and hardware cost, (b) improvements in quality grow with the log of the data, and not much data is left, and (c) LLMs do not build true representations and have no idea what they are doing.
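
To see what point (b) implies, here is a toy numerical illustration; the constants are invented, and only the logarithmic shape matters:

```python
import math

# Toy model of point (b): quality ~ a * log10(tokens). The constant a
# is made up; the point is that each 10x more data buys the same fixed
# increment, so the gain per additional token keeps shrinking.
def toy_quality(tokens: float, a: float = 10.0) -> float:
    return a * math.log10(tokens)

for tokens in (1e9, 1e10, 1e11, 1e12):
    print(f"{tokens:.0e} tokens -> toy quality {toy_quality(tokens):.0f}")
# Each row adds the same +10, despite needing 10x more data each time.
```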

Against this, one can argue the following: (a) there is much room for improvement on factuality. Google has lots of data; it can do a search, give the LLM context, run the LLM, and cross-check against actual facts. (b) Integration with third-party tools gives the chatbot depth and world knowledge, and companies have ample resources for creating training material for this.
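
Point (a) describes a retrieve-then-verify loop. A minimal sketch of that shape, where web_search, call_llm, and supported_by are hypothetical stubs rather than any real API:

```python
from dataclasses import dataclass

# Hypothetical stubs; a real system would call a search API and an LLM.
@dataclass
class Source:
    snippet: str

def web_search(query: str) -> list[Source]:
    return [Source(snippet=f"retrieved passage about {query}")]

def call_llm(prompt: str) -> str:
    return f"draft answer based on: {prompt[-60:]}"

def supported_by(draft: str, sources: list[Source]) -> bool:
    # Stand-in for a factuality check (e.g., entailment against sources).
    return any(s.snippet in draft for s in sources)

def answer_with_retrieval(question: str) -> str:
    sources = web_search(question)                   # 1. do a search
    context = "\n".join(s.snippet for s in sources)  # 2. give the LLM context
    draft = call_llm(f"{context}\nQ: {question}")    # 3. run the LLM
    if supported_by(draft, sources):                 # 4. cross-check the facts
        return draft
    return call_llm(f"Revise using only the context:\n{context}\n{draft}")

print(answer_with_retrieval("Gemini release date"))
```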

It is quite plausible that we are on a path toward more and more capable assistants that do high-quality work, "understand" more and more of the tasks they are asked to do, and can check their own work. Will this be called a plateau?

Gary Marcus already said his piece on integration with third-party tools (https://garymarcus.substack.com/p/getting-gpt-to-work-with-external).

Yes, I read that article very carefully. There is no show-stopper against chatbots learning how to use tools well. It is the same as with people, however: easier and more frequently seen problems will be mastered before harder ones.

We will likely see more chatbots trying to follow strategies they have seen, failing, and resuming another way. They will also get better at checking how they are doing and at learning from failure messages (as those can be classified and made part of training).
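
That try-fail-resume loop is easy to sketch. The strategies and error messages below are hypothetical; the point is only the shape: attempt a strategy, record the failure message, and move on to the next.

```python
from typing import Callable

# Toy strategies; real ones would be tool calls made by the chatbot.
def use_calculator(task: str) -> str:
    raise RuntimeError("calculator: cannot parse natural-language input")

def use_search(task: str) -> str:
    return f"found via search: {task}"

def run_with_fallback(task: str,
                      strategies: list[Callable[[str], str]]) -> str:
    failures = []
    for strategy in strategies:
        try:
            return strategy(task)
        except RuntimeError as err:
            # Failure messages are kept; classified, they can feed training.
            failures.append(f"{strategy.__name__}: {err}")
    return "gave up; failures were: " + "; ".join(failures)

print(run_with_fallback("population of France in 1900",
                        [use_calculator, use_search]))
```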

Sounds like Gemini may give GPT-4 a run for its money, but have we reached a plateau in terms of advancement? The fact that Google didn't exactly blow it away could point to some interesting dynamics in the AI landscape. Exciting times ahead.

Would you go back 20 years to a time AMD put out a processor that didn't eclipse Intel in a cycle and say processor technology had hit a plateau?

New releases, new benchmarks, and new years will come and we will keep blasting forward.

This kind of thinking is exactly what's fooling our generation when it comes to new technologies... We lived through a period when Moore's Law was in full effect, and we assume it will apply to everything.

Do you have any evidence to the contrary?

Of course, I do not have any evidence. I am simply saying that your analogy is inappropriate, that's all.

Without any evidence to back your claim, I think you are wrong. Nothing is "fooling" our generation. You shouldn't make statements like that with nothing to back them up.

Has there been any information on the sizes of Gemini? Like how many parameters each version has?

After having scraped the web and electronic libraries without worrying about copyright, the modern-day pirates behind huge foundation models must be starting to run short of relevant data with which to progress. It would therefore be fairly normal to see a plateau while they search for other avenues: a general AI integrating System 2 in Kahneman's sense, and/or better exploitation of existing data.

Custom GPTs are still great; we'll have to see what we get to build on Gemini next week. But I agree, it's starting to heat up in here!

I've been using Bard more frequently over the past week or so, and GPT in general has been getting flaky. Not sure if they're having issues with hardware holding up under the load or what, but that will be another benefit of Google sailing its own ship.

Dr. Marcus, I imagine you are burning your candle at far more than both ends. I am sure Bard could confirm this, and hopefully Gemini provides better details on how many candles you are burning. Having visited a few churches with many candles burning, that is the way I see it. I have a simple goal of sharing the knowledge below, and I hope you can acknowledge this solution as a possible priority for the good old USA; I also wonder how AI could help me, since humans prefer not to.

Recently, Sam Altman joined Joy Buolamwini at the @CWClub in San Francisco to talk about her book "Unmasking AI." He left quickly, avoiding the audience after the talk, but I got a photo and shared documents with Joy. Yesterday, at Dr. Fei-Fei Li's book tour at the @CWClub for "The Worlds I See," I shared my passion for a civics education law proposed in Congress back in 1979.

One simple provision in this complex bill was to keep the institution, and the possibility, of a military draft as an insurance policy, one that 60 Senators and a House majority would need to support to activate, assuming the President signed it into law after Congress voted to restart it. I say this first to be clear about the limitations of the much cheaper use proposed ONLY in this bill: moving youth draft registration to the 17th birthday, and challenging all youth, between their 17th and 18th birthdays, to a year of on-and-off, local and nationwide talks on civic values and education, plus the marketing of voluntary service-learning challenges leading to sweat-equity experiences with local non-profits, AmeriCorps civic service, or military service.

I have thank-you letters from White House staff in the Carter, Reagan, Clinton, and Bush Jr. administrations. Since 9/11, I have had over 1,000 photos taken with experts but received nearly zero feedback. This is one great mystery for this veteran with a degree in behavioral science.

I hope to shame leaders into acknowledging this solution, but everybody seems apathetic and indifferent to Congress debating this again, whether in the recent past or the near future. Here are some links to more detailed, current information about this topic area.

I hope Gary Marcus can acknowledge his understanding of this impossible dream: a simple, year-after-year youth wake-up call in every zip code, through local and nationwide civic-values talks on their dreams, balanced against the realities we live with now in the 2020s, between their 17th and 18th birthdays!

Now, I believe that AI might help me better reach out to, and hold accountable, the human beings who should be responsible, locally and nationwide, for making the general public and Congress aware of how the details should work, and for answering all the paranoia that would kill this debate in Congress. Peter Jesella, now @JesellaPeter on X; broken accounts with historical details at @pjesella and @NCMNPS.

https://www.politico.com/news/2021/12/06/ndaa-women-draft-dropped-523829

https://www.hawley.senate.gov/hawley-leads-colleagues-effort-remove-ndaa-provision-forcing-women-register-draft

https://www.facebook.com/notes/peter-jesella/peters-written-oral-remarks-to-national-commission-on-military-national-and-publ/2196275167117721/

https://www.supremecourt.gov/Search.aspx?FileName=/docket/docketfiles/html/public%5C20-928.html

https://digital.library.unt.edu/ark:/67531/metadc1724233/?q=Inspired%20to%20Serve

https://www.nylc.org/page/WhatisService-Learning

Very telling indeed!

Ah, an absorbing barrier beckons.
