Great piece. I've always maintained that there was no intelligence in generative AI and that it does not get us closer to cracking AGI. This is not to say that the technology is useless. It is certainly very interesting and useful for some purposes where reliability and truthfulness are not an issue.
My take is that it is woefully irrelevant to the number one problem facing AGI research today: generalization. I would even venture that generative AI is a hindrance to cracking AGI because it sucks badly needed funding out of generalization research.
Exactly
If what you say is true, maybe that's a good thing? Hindrances to AGI sound kinda appealing here.
In my opinion, AI is neither good nor evil. Human beings, on the other hand, have proven they can be extremely evil. If AGI falls into the wrong hands, it's goodbye, humanity.
It's the wrong hands that will want it the most....
How true.
"where reliability and truthfulness are not an issue."
Thing is, reliability and truthfulness are increasingly becoming issues in our day-to-day lives, for myriad reasons.
It is not remotely useful in the art world. If I pay an artist to, e.g., create a logo, I want them not only to give me vector graphics files that I can use, but also to create something that follows a consistent theme (e.g. the logo should "fit in" with the design of the rest of my website, brand, etc.). They will probably make multiple versions of the logo (something that fits in a favicon, something large that fits on a banner, etc.). The same goes for images, paintings, etc. People don't pay for the final product; they pay for the work that goes into generating it, and to be able to control that work. If you just have something that spits out the final product, and you have to cross your fingers that it does exactly what you want, then you have something that is useless for most cases where people need an artist.
AI allows for rapid prototyping of logos. Any given random logo that the program generates won't necessarily be what you need, but if you can generate 400 variants in an hour, you will almost certainly get at least one that is usable.
If you are a multinational business, you are going to be willing to spend the money to get a professionally designed logo.
If you are a local business, a "good enough" logo generated using MidJourney in less than an hour may well be better than what you'd get from hiring someone, as random graphic designers have no guarantee of being competent, and you, as a random plebeian who doesn't know much about art, are not necessarily even equipped to hire a good one. And it will certainly save you a lot of money.
More "throw away" imagery - such as images on a seasonal menu - have even less value to you as a business owner, and as such, AI art is a really good use case for such vs hiring someone.
The main benefits of AI art are not to big multinational businesses with billions of dollars to spend, but to smaller local businesses which don't have tons of money to spend on art.
And even big companies have bad logos sometimes. Twitter's new logo is bad, for instance.
> but if you can generate 400 variants in an hour, you will almost certainly get at least one that is usable.
And you have to go through all of them to see.
"AI allows for rapid prototyping of logos."
Like a paper and a pen do.
Doing this on steroids (i.e. with GenAI instead of your imagination) is going to saturate your brain with mediocre ideas and the atmosphere with CO2. Both things not good.
Not remotely useful in the art world? Maybe the art world is expanding in some new directions and some folks don't want to go there?
I know you hate humanity but most people do not.
Maybe not. For good reason. I didn't want to go in the direction of soup cans either.
This is all fixable in the near future.
The most useful form of generative AI is actually in the art world, not writing.
I'm not sure why people think that text-based AI generative tools are going to be super awesome; the drawbacks are very obvious and virtually everyone is literate anyway, which greatly reduces the value of the output there.
Conversely, art is something that most people can't do well, and which takes a very, very, very long time to generate (hours for a single piece). Generative AI can produce images in less than a minute.
This is where the real value is going to be, in my eyes - graphic design, art accessibility, and in combination with tools like photoshop, hyper-advanced tools for image correction and editing.
The hallucination problem is irrelevant to art, because art is about making stuff that looks good, not creating "truth"; we have seen immense gains in the quality of images, and if you need to correct AI images, sure, that's a thing, but it still is way faster to generate and correct than to create from scratch.
As such, for many purposes, generative AI art is really useful. And art is a big industry.
It is likely we will see AI 3D modelling tools, which will also be very useful for producing lots of stuff for video game environments and the like when you are creating open worlds.
I think you're dismissing the writing use case too easily, particularly fiction, which actually benefits from hallucinatory mechanisms. Being literate isn't the same as being able to create written works. The average reading age of Western adults is that of an 11-year-old. Hallucination is also how the thoughts in my head get generated in yours via words - the brain is a hallucination-generating machine.
Fictional writing doesn't actually benefit from hallucinations, because even fictional writing requires consistency. The problem with hallucinations is that they create output that is inconsistent with the input. It's not just that the model makes up nonsense; it makes up inconsistent nonsense. You'll see things where characters end up switching voices or roles in an AI-written story, because the AI knew it was a story about two characters confronting each other but couldn't keep straight which was which, or which had what voice or point of view.
This is one of the reasons why AI-written stories aren't very good, along with the quality of the writing being rather poor.
You make some good points and thanks for clarifying the tighter definition of "hallucination" used for ML inputs vs outputs. But it's likely that the gap will close. Plus a lot of the output being sold on Amazon would struggle to be classified as decent writing, but it sells, with profit based on volume, pricing and subscriptions, not content quality.
Good points. Yes, AI may often produce junk, but junk sells, and it soaks up dollars that might have gone to quality writing.
There's a McDonald's fast food joint in the little shopping plaza near our house. The drive-up lane is always backed up. 90% of the shows on the streaming channels are not really fit for human consumption. But they're there because lots of people like them.
So many commentators seem to be basing their analysis of generative AI on what's happening NOW, at the dawn of this industry. I wasn't such a good writer myself, when I was 2 years old.
What I imagine happening is that AI starts at the lower junk content end of the market replacing human writers. This seems to be happening already. And then, as AI improves, it gradually moves up the quality ladder replacing humans at ever higher levels.
Example: I read a philosophy professor who claimed that AI can already produce credible philosophy articles at the undergraduate level. He produced an example to illustrate, and it seemed convincing. If that's true, then at some point in time AI will probably be able to do graduate level writing too. And then maybe the professional level.
There will always be human writing, because some humans like to write. But will they be able to make a living writing in a market overwhelmingly flooded with super cheap written content of at least reasonable quality?
Will future audiences really care that much whether it was a human or machine that generated the content? Do you care that machines instead of humans made your car? Or do you just care what the car can do for you, and how much it costs?
This formula might provide an answer. Perceived value is determined by perceived scarcity.
For example, the Internet made it possible for almost anyone to be a writer with a global audience. Scarcity was destroyed, and the perceived value of written content sank. It seems AI will just take this already existing "scarcity of scarcity" situation to the next level.
Any philosopher worth the name would recognize that it is an inductive fallacy to assume that because an LLM can now regurgitate an undergrad term paper, it will inevitably yield something original and cogent of the kind a skilled graduate student could write.
We could consider the future of Substack. What's going to happen when AI can generate an entire network like this in just a few hours? If that can happen one time, it can happen thousands of times. Thousands of networks each with thousands of fake authors with their fake personalities and their millions of generated articles.
From a business perspective the question would seem to be, how much will the broad public care about the difference between human written articles on Substack, and mass produced AI content that has flooded the Internet because it is so cheap and efficient to produce?
EVIDENCE: Here in America, roughly half the country has voted for Trump twice, and may do so yet again. They could be watching C-SPAN, but instead they're watching Fox. How discriminating do we expect these folks to be when it comes to consuming written Internet content?
How much do you and I care that our cars are now made largely by robots? That's what I see coming to the world of content. We net writers will care when we are replaced, just as the factory workers cared when they were replaced. But the broad public is not going to care.
Good points, but that's NOW, not where we're likely headed. The Internet had all kinds of limitations and problems in 1995, like super slow dial up modems etc. Twenty years from now we'll probably look back on this era of AI development in a similar way. The fact that generative AI is pretty buggy NOW doesn't really mean that much in the larger picture.
The problem here isn't one of computational power or infrastructure but of approach.
These systems are not capable of producing intelligent output. The way they're built is by feeding in vast amounts of data and fitting a complex weighted mathematical function that predicts the next text based on previous text/prompts (or, in the case of an image model, predicts what image would be linked to a textual description - or even to another image, in the case of MidJourney's image prompts).
The problem with this is that it is prone to hallucination, because these models don't actually "know" anything. "What were Microsoft's Q2 profits in 2023?" could be answered with a Google search; "What were Microsoft's Q2 profits in 2024 or 1969?" cannot be, because the former is in the future and the latter is before the company existed. "What were Stark Industries' Q2 profits in 2023?" is likewise unanswerable, because Stark Industries is a fictional company.
The cause of hallucination is that there's tons of text out there saying what the Q2 profits for year XXXX were for (insert random company here). As far as the AI is concerned, generating something that looks like all of those articles is entirely reasonable, and there's no reason why it would be wrong - but the actual number is likely to be made-up nonsense. Even if you link your AI model to search, you're still going to have these problems when you ask for information that doesn't exist, and even when you ask for information that does exist, it can easily misinterpret what it finds because something else is more common (for example, if there was big news around Microsoft's Q2 profits in 2022, or big news about their projected profits for Q2 2023, it might find that instead via search).
And one that isn't using web search is going to be even worse off in terms of accessing real information.
The idea that this is some simple-to-solve problem is not correct; it's not a lack of computational power, it's that the way these models are created and function is the very thing that causes the hallucinations in the first place.
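To make the mechanism concrete, here is a deliberately tiny toy sketch - a few lines of Python, nothing like a real transformer, with the template, the stored fact, and the probabilities all invented for illustration - of why a system trained on the *shape* of earnings sentences will confidently fill in the number slot whether or not a real figure exists:

```python
# Toy illustration of template-driven "hallucination" (not a real model).
import random

# The sentence shape is abundant in training data; the facts are not.
TEMPLATE = "{company}'s Q2 {year} profit was ${amount} billion."

# The one fact this toy model "knows" (illustrative number only).
KNOWN = {("Microsoft", 2023): 20.1}

def generate(company: str, year: int) -> str:
    if (company, year) in KNOWN and random.random() < 0.8:
        # Sometimes the learned fact is reproduced correctly...
        amount = KNOWN[(company, year)]
    else:
        # ...otherwise a plausible-looking value is sampled, because the
        # template demands *some* number. Nothing marks it as invented.
        amount = round(random.uniform(1, 60), 1)
    return TEMPLATE.format(company=company, year=year, amount=amount)

print(generate("Microsoft", 2023))         # often right, sometimes not
print(generate("Microsoft", 1969))         # predates the company: answers anyway
print(generate("Stark Industries", 2023))  # fictional company: answers anyway
```

Nothing in the generation step distinguishes a recalled figure from a sampled one, which is exactly the property that bolting on web search doesn't remove.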
Exactly. The problem, in a nutshell, is twofold: 1) "Truth" is a static. 2) There are literally infinite "Truths". The only 'solve' to the problem of hallucinations is to insert statics into the models, and clearly, irrespective of access to compute, inserting infinite statics (or infinite anything else, for that matter) is simply an impossible task. As Dr. Marcus said in one of his initial posts on generative AI, all we have is a "stochastic parrot". This is a novelty item, not a trillion-dollar industry.
I disagree that generative AI is a novelty item in general; art AIs are already being used in prototyping, background creation, etc., and we're seeing entire products created using AI-generated art. Lower-print-run products cannot afford to pay artists tens to hundreds of thousands of dollars to produce a fully illustrated, high-quality, full-color RPG manual, but with MidJourney, you can make 99% of the art that way, pay someone $500 to draw you cover art, and you've got what you need for $530 instead of $50,000. This is a big improvement. Creating art for TTRPGs generally is another thing it's really useful for. And stock images can be replaced with a MidJourney subscription.
Moreover, because art doesn't have to be "true" - it just has to look good - and because something that merely looks cool can still be used (the Clever Hans effect), all AI art has to do to be a "success" is produce generally the right sort of thing and make it look really good.
We're also seeing it used in photo-editing software; Photoshop combined with generative AI is a very powerful combination, very useful for editing photos, removing foreground objects, adding things in, etc.
We're also seeing AI upscaling being put to good use, as well as outpainting.
I think the language models are apt to be less useful outside of some edge cases as far as productivity goes; chatting with them is a novelty, and they can't write good enough material (or avoid hallucinations consistently enough) to be nearly as useful as they would need to be for a lot of purposes.
That said, they may be useful for translation tools, as well as for generating material for video games - I've already seen people use these models to generate NPC dialogue, and a demonstration of using one (along with a voice synthesizer) to create sportscasting that responds to what is happening in a video game race.
Readers of fiction don't expect (or even want) such informational accuracy. Disruptive technologies like LLMs only need to produce books of good enough quality for the vast majority of such readers. If readers are happy with what they're reading in the books they purchased, then so are the publishers and booksellers making more money from them by adopting such innovations. What the authors think (and how they're financially impacted) won't come into it.
The book market is very heavily skewed towards a small number of popular authors; most books sell fewer than 5,000 copies. Most books do not make money, and certainly not significant amounts of it.
You need to write like J.K. Rowling or George R.R. Martin; otherwise, you aren't actually meaningful competition in that market.
The quality of writing in AI-written works is extremely low, which means they have to compete in the market for disposable, lower-end content. The most vulnerable segment is probably romance novels: being able to write a dirty story for yourself with an AI appeals to a lot of people, and middling quality may matter less there, since a lot of romance novels aren't the highest quality to begin with. And personalization might make up for the other quality issues: it may not be the best writing, but it is appealing to you in particular.
That's an actually valuable market that might be targetable. But the present quality of these AIs is not even at this level (no one wants a romantic partner to suddenly turn cold on them, or radically change personality or backstory), and a lot of these AIs aren't even trained to produce such things.
You write. "These systems are not capable of producing intelligent output." This should probably be edited to read...
These systems are not capable of producing intelligent output NOW.
This is revolutionary technology at the very beginning of its life span. Like Model T Fords in 1910. Of course it's not perfect; why would it be?
I keep reading this over and over, and of course it's a natural impulse to just extrapolate in a linear fashion.
The problem with this is addressed tenfold in this substack; the reason Gary and others (like me) reject the scaling hypothesis is not that we can't extrapolate, it's that there are fundamental issues with it, and repeating your point does not increase my propensity to abandon the facts.
Your example with the Model T is particularly flawed, because it precisely did *NOT* follow the kind of scaling that would be necessary for AGI: the Model T was an order of magnitude less efficient, sure, but it drove at roughly the same speed (within a factor of ~2) and cost about the same money (inflation aside). You may cite comfort and convenience as major improvements over the last hundred years, and I'd agree, but let's be honest: did economic reality change because of how automobiles evolved since the Model T? Not at all - just as it's not a good bet to wait for a replacement for chairs and beds, which have been around even longer. The truth is in the marginal contribution, and it's just not there.
That's the same flawed argument that crypto boosters used to use. AI might evolve like the internet, but it might also evolve like the Juicero.
This is a piece from an artist about how AI art just isn't very good. Now, the article focuses mainly on sacred art, but isn't just limited to that. I've found AI art okay for blog posts and such, but even then, it's often difficult to get anything I'm thinking of to be created. https://open.substack.com/pub/hilarywhite/p/ai-images-whatever-it-is-its-not
You can create very nice images using AI, but it's limited in some odd ways as to what you can actually create. The higher end engines (like MidJourney) can make images that pass passive scrutiny, and if you shop them to clean up artifacts, it can be very hard to tell the difference between an AI image and a human generated one if you avoid the standard AI "style".
The quality of AI art has gone from "pretty bad" to "quite good" over the last year. I started using MidJourney on V3; it was originally making images like this:
https://www.deviantart.com/titaniumdragon/art/Moth-reaper-balanced-on-a-scythe-AI-Generated-928471885
https://www.deviantart.com/titaniumdragon/art/Wizard-Hat-Buildings-AI-Generated-MidJourney-928473515
https://www.deviantart.com/titaniumdragon/art/First-Coyote-MidJourney-928474850
It is now producing images like this:
https://www.deviantart.com/titaniumdragon/art/Mavis-the-Killdeer-970551771
https://www.deviantart.com/titaniumdragon/art/A-Taste-of-the-Feywild-965093736
https://www.deviantart.com/titaniumdragon/art/Maria-Fox-Summoner-961828344
You write, "This is a piece from an artist about how AI art just isn't very good."
Cultural leaders across the board will do everything they can to maintain their positions within the status quo, but in the end most of them are doomed to fail. Like it or not, change is coming, faster and faster.
Amazon killed our local mall. Like that. Lots more of that kind of thing coming. More and more disruption, faster and faster.
I think what midjourney does is stealing.
You're wrong. It's algorithmic content creation using a mathematical formula derived from observation of large data sets.
It doesn't steal anything.
Copying is just the identity function; the question is *which* formula
Did the authors of the content consent to being included in the "large data sets"? Have they been contacted (assuming they are alive, of course) and given the opportunity to opt out, instead of being automatically opted in? I just read an article saying that MidJourney supports prompts like "draw me in the style of this or that artist" for artists who are alive and who haven't been asked permission to be included in the dataset. So yes, it is a form of stealing. But to me, the art generated by AI is soulless, and at some subconscious level, the brain can perceive it.
The future you're envisioning isn't very good. In fact, it's really awful.
Gary's analysis is very useful, and obviously he and his team are highly knowledgeable. However, from the perspective of Communication Science, pictorial communication (images and such) is a very powerful medium and does not necessarily need to be 'factual' in the way Gary describes. One could also make the point that creative writing often draws from discourses and themes that have come before, but recreates them in new ways.
Good writing is very much dependent on out-of-distribution sampling.
You can't get that by turning up the temperature, because that way you induce too many changes in places (or more like: with respect to contexts) that should stay constant.
"Correct" (i.e. non-uncanny-valley) OOD sampling has a fractal associative structure that cannot be mimicked with the current architecture without solving a deep (and possibly infinite) looping problem. GPTs are about width, not depth.
(Also, just to be the first to say it, I don't see Mamba solving this. An order-of-magnitude-faster-scaling area law is still an area law.)
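To make the temperature point concrete, here is a minimal sketch of softmax sampling with temperature - the logit values are hypothetical - showing that the knob rescales every token at once, so buying more variety in the open slots also buys more violations of tokens the context should pin down:

```python
# Softmax sampling with temperature over hypothetical next-token logits.
import numpy as np

rng = np.random.default_rng(0)

def sample(logits: np.ndarray, temperature: float) -> int:
    z = logits / temperature
    p = np.exp(z - z.max())  # numerically stable softmax
    p /= p.sum()
    return rng.choice(len(logits), p=p)

# Index 0 is the continuity-critical token (say, the right character's
# name); the rest are "creative" alternatives.
logits = np.array([4.0, 2.0, 1.5, 1.0])

for T in (0.2, 0.7, 1.5):
    picks = [sample(logits, T) for _ in range(10_000)]
    wrong = 1 - picks.count(0) / len(picks)
    print(f"T={T}: continuity-breaking rate ~{wrong:.1%}")
```

A single scalar can't tell "places where variation is welcome" from "places that must stay constant", which is one way to restate the depth problem described above.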
Spot on.
1970, Minsky: “In from three to eight years, we will have […] a machine that will be able to read Shakespeare, grease a car, play office politics, tell a joke, have a fight. At that point the machine will begin to educate itself with fantastic speed. In a few months it will be at genius level, and a few months after that, its powers will be incalculable.” (interviewed for the famous Life article: Meet Shaky, the first electronic person)
The important thing about this quote is that it was believable then. Minsky was one of *the* experts (a Turing Award winner for his AI work).
(Incidentally, I asked GPT-4 to wager on whether generative AI would be a step towards AGI, and after much hemming and hawing it produced a "yes". But then I showed that to my daughter, and her comment was: "Yeah. That's what Reddit thinks...")
My estimate is that GPT fever is going to break, and we're going to be left with some productivity-enhancing uses. And don't forget Nobel-prize-worthy efforts like AlphaFold, which also comes from transformers, AFAIK. Niches will profit. And LLM noise in society will be a problem. It's like getting a lot of cheap energy from fossil fuels and, as a side effect, polluting massively.
A similar claim (like the one Minsky made) was made about Cyc; that didn't pan out either.
I remember when Cyc came up with the "revelation" of recursion. Then the presenter mentioned that Cyc was written in Lisp, where recursion is everything - so it wasn't a revelation, but rather a base case programmed into Cyc.
Scott, yes, Cyc is infused with bogosity.
PS: http://www.catb.org/jargon/html/M/microLenat.html
New term for me "bogosity". It might be slightly different than what I was trying to point out - which applies to the LLM - that basically Cyc (and now AI) pops out a revelation of "new knowledge" - when it's new just polished and then output as new (or new sounding). The "Recursion" concept they thought to be new knowledge - but it was how Cyc was written. It was fed "recursion milk" when it was founded. So it couldn't invent recursion.
Hi Scott, didn't mean to go off on a tangent, lol. I worked on Cyc for just under a year - glad it wasn't longer. I have since been under the strong belief that 'common sense reasoning' is an absurd oxymoron, and that common sense is 'common' because it's sensed nearly identically by us beings - reasoning, which is optional, comes after.
So yes, that claim about recursion fits well with what was proposed, pursued etc.
I must add that I've since found out Minsky probably never made that actual 3-8 year claim. The claim was probably a fabrication of the Life 'journalist' (fixing my earlier mistaken mention of Time). Minsky was thoroughly convinced GOFAI was on the right track, of course.
Words and code are just the beginning... some of the most beneficial use cases for generative AI right now are in the visual domain. Generative Fill in Photoshop is a game changer, and Adobe is working on similar tools to transform video. Image generators like Midjourney and Stable Diffusion can produce amazing images with little effort. And in the world of 3D graphics, generative AI startups are making it easy for lay people to generate 3D objects and worlds, which could be big in the next few years as AR/VR starts to gain momentum thanks to Apple's Vision Pro. Then there's voice cloning, digital clones, text-to-video... we are only scratching the surface of generative AI, and it doesn't have to achieve AGI (it likely won't) to completely transform many industries.
I love how you brought the money and valuation into the discussion. Late in the research phase and early in the development phase, the valuation comes in. If the valuations were truly as inflated as inferred, wow.
This research paper suggests a hallucination rate for GPT-4 on imaging-related questions of 2.3%, vs. 57% for 3.5.
Wouldn't that support better hallucination management in coming years (not decades)?
https://pubmed.ncbi.nlm.nih.gov/37306460/
i) The evaluation and choice of benchmark are highly dubious.
ii) It's likely (though OpenAI isn't telling) that this is a RAG-type solution.
You don't fix a leaking barrel in a firefight by putting *fewer* holes in it, you need a new barrel that is bulletproof.
The piece is fantastic, but I also wanted to note that the image is the inspiration for a scene in "Castle in the Sky", a Studio Ghibli film.
I think that in the Marcus vs. Bengio debate over hybrid intelligence approaches, Gary is winning)
ChatGPT can replace any civil servant, government agent, or politician. It's stupid, repetitive, and wrong.
Your book Rebooting AI offers a well-considered solution. Build a knowledge graph (ontology) that covers human knowledge in a taxonomy of concepts - a semantic web. The scale needed is on the order of ten million concepts with a branch depth of 5-10 edges (2-3x Wikipedia). The ontology connecting concept nodes is constructed by NLP over Common Crawl, extracting 50-100 billion RDF triples and classifying subject/object predicates to connect nodes by relationships. This semantic AI model (SAM) is the solution you posit.
LLMs might perform much better with long-tail knowledge trees grouping tokens by topic, rather than starting from random weights. A SAM could be used to detect factual errors and hallucinations, even to red-team the LLM or construct steering prompts that align it with legal or other constraints. Investment in a Web 3.0 SAM (reading and curating) can save the LLM (write-only).
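As a minimal sketch of the red-teaming idea - my own toy construction with invented triples, not code from Rebooting AI - a claim extracted from LLM output can be checked against a curated triple store before it is accepted:

```python
# Tiny in-memory triple store standing in for a full SAM / knowledge graph.
KG = {
    ("Microsoft", "founded_in", "1975"),
    ("Microsoft", "headquartered_in", "Redmond"),
}

def check_claim(subject: str, predicate: str, obj: str) -> str:
    if (subject, predicate, obj) in KG:
        return "supported"
    if any(s == subject and p == predicate for s, p, _ in KG):
        # The graph covers this subject/predicate but with another object.
        return "contradicted"
    # No coverage at all: ungrounded output, i.e. a likely hallucination.
    return "unverifiable"

print(check_claim("Microsoft", "founded_in", "1975"))         # supported
print(check_claim("Microsoft", "founded_in", "1969"))         # contradicted
print(check_claim("Stark Industries", "founded_in", "1940"))  # unverifiable
```

At scale the in-memory set would be a real triple store queried over those 50-100 billion RDF triples, but the three-way verdict (supported / contradicted / unverifiable) is the part that does the work.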
Gary, here is my take:
I think the current capacity to generate code, art, etc. already shows dramatic value, even with the hallucination issue considered. One still needs to be able to code, for example, but it greatly speeds me up even just cutting and pasting code snippets.
The idea that this is going to be a "Dud" is at strong variance with observed capability. BUT it is possible that early players will not have trillion dollar valuations, or even hundreds of billions. So for investors investing at stratospheric valuations it could be a "dud" in that sense.
But this is going to re-invent nearly all knowledge work. And we don't yet know how... just like in 1998 we really had no good understanding of what the internet was going to be, or in 2009 what a smartphone was going to be. We are looking at the tip of a very unique iceberg of innovation. That much should be clear. Indeed, because of the dramatic range of intersection this technology has with... everything... this berg is going to be larger than the smartphone, and likely comparable to the internet in the scope of things it transforms.
A very good piece! It seems that we need a new breed of AI - different from the generative one, different from the statistical one. What about one based on differences and differentiation, comparisons and filtering, as a new computational paradigm? Think about the game "20 Questions" or Venn diagrams - they narrow down to the most fitting candidate rather quickly.
There are two ingredients to the solution - my approach discussed here https://alexandernaumenko.substack.com/ and sensorimotor primitives discussed here https://dileeplearning.substack.com/p/ingredients-of-understanding
You have influence, you are familiar with researchers, investors and policy-makers - why don't you step in and steer the whole process? AGI potential is there, but implementing it properly will take the efforts of more people. We don't need hype, we need a working thing. You will make it work. But it will be a different AI, not a generative one.
What you are looking for seems to be... decision trees.
Pretty much ... more. I add comparable properties and ranges. I describe how that approach handles natural languages naturally.
Read Bateson 1972 - he is in love with differences.
Generalization out of the box. Even with flavors.
Applicable to all modalities.
Decision trees were not explored to the maximum.
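To make the "20 Questions" narrowing idea from this sub-thread concrete, here is a minimal sketch of greedy question selection - candidates and properties invented for illustration - which is decision-tree induction in miniature:

```python
# Narrow a candidate set by always asking the most discriminating question.
CANDIDATES = {
    "sparrow": {"flies": True,  "mammal": False, "domestic": False},
    "bat":     {"flies": True,  "mammal": True,  "domestic": False},
    "dog":     {"flies": False, "mammal": True,  "domestic": True},
    "wolf":    {"flies": False, "mammal": True,  "domestic": False},
}

def best_question(candidates):
    # The most informative property is the one closest to a 50/50 split.
    props = next(iter(candidates.values())).keys()
    return min(props, key=lambda p: abs(
        sum(c[p] for c in candidates.values()) - len(candidates) / 2))

def narrow(candidates, oracle):
    while len(candidates) > 1:
        q = best_question(candidates)
        answer = oracle(q)  # one yes/no "question" per round
        candidates = {n: p for n, p in candidates.items() if p[q] == answer}
    return next(iter(candidates))

# Example: the hidden answer is "bat".
hidden = {"flies": True, "mammal": True, "domestic": False}
print(narrow(CANDIDATES, oracle=lambda q: hidden[q]))  # -> bat
```

Balanced splits are why the narrowing feels fast: each answer roughly halves the field, so the number of questions grows only logarithmically with the number of candidates.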
My guess is that today's generative AI is about where the web was in 1995. It's new, it's exciting, you can do some cool things with it, but it's still pretty primitive, in comparison with what is likely coming. We're probably spending too much time worrying about the current crop of bugs.
I'm having my first AI image-generating experience over the last two weeks, and as so many already know, it's pretty compelling. I dove right in to building a new Substack with Stable Diffusion, and seem to have been fully sucked into the experience. Point being: as this technology continues to improve - becomes easier to use, less buggy, more reliable, and more powerful - it seems likely more and more people will be drawn ever more deeply into the fantasy realm these tools empower us to build.
This psychological progression interests me more than the money involved. Generative AI is yet another mechanism for further directing our attention away from the real world and towards the symbolic digital realm. I suspect that, in the end, this will prove a more important factor than who gets rich off this industry.
One thing I've seen more clearly from a few weeks with Stable Diffusion is that there's just no chance of turning back with AI, as the benefits are just too compelling. Not going forward with AI would be like turning off the Internet, that's just not going to happen. I knew that already intellectually, but a full immersion in generative AI helped me actually "get it".
I still think AI is, on balance, a mistake. But I see now that declaring AI a mistake is also a mistake, because it's clear that for better or worse, like it or not, whatever the pros and cons and consequences, AI is coming, and there's nothing anyone can do about it. So my plan going forward is...
Until AI eats my DNA or whatever, I'm swimming downstream from now on, going with the flow, surrendering to the inevitable, and am going to have some fun with it.
I see generative AI as being the hi-tech version of Clever Hans: https://en.wikipedia.org/wiki/Clever_Hans
I wrote about that here in an early essay, and I agree
Brains are several hundred million years old, and we still hallucinate at the drop of a hat. I mean, fever dreams? Really? Raising body temperature a few degrees deranges the whole process?
"Psychotic experience is to the diagnosis of mental illness as fever is to the diagnosis of infection"
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8923948/
Generating an internal model of the world is just difficult.