Unfortunately LLMs are trying to confound what people mean by creativity. These authors trained the model on the entire bodies of work created by those poets in the first place. And as you can see here, even then the outputs are just mundane mimicry.
Poets are seldom so derivative, let alone regurgitating "accessible" language in someone else's style!
I do not use the term creativity much anymore because it has been bastardized by Silicon Valley. We must look at the verbiage now also. Artisanship is above and beyond creativity. Silicon Valley wishes to reduce, and has pretty much reduced, creativity to a data point. We must be very careful going forward.
Love the focus on artisanship! Easier to see the value in it too.
Thank you for responding. I am taking a stand. Humanity is the truth, not an artificial thing which doesn't have a heartbeat. Humans function with both. There are heartfelt decisions and brain-made decisions, so I am over this A.I. comparison with humans. I pose: if this thing is so superior, why must it be tied to comparing it with human output? I am personally over this bad game.
Glad this resonated! Please check out my publication, Ratchets Everywhere, where I write about how innovation propagates as a one-way ratchet through evolution and collaboration, creating abundance through complexity: https://gairiksachdeva.substack.com/
Why are the inventors so obsessed with replacing human creativity? It is our souls.
"Inventors" is giving them far too much credit. Neural networks have existed for several decades. These people are just plugging very old research into a massive AI research database (they skirted IP law by claiming to be a non-profit research group) and computers that have far more capacity than existed before.
These people are often sociopath grifters who derive pleasure only from 'winning', which they often conflate with making other people lose. I know a real AI researcher, and he's as appalled by what Silly Valley private equity is doing with these algorithms as you are.
humanity is number thats why also. However, they have folks thinking Tech A.I is number one. Humanity better wake up
Meant to say humanity is number one. Apologies.
You know the Prometheus myth? Wanting to steal creative fire for mankind? And his punishment. These people might get the courage to use their “fire” creatively? Not to penalize mankind in favor of machines!
Science fiction classics from the '50s and '60s saw this coming. Vernor Vinge’s True Names and Phil Dick’s Minority Report show AI structures working as decision makers. It is our history and future. Man is supposedly the only animal that can foresee his death. Yet we ignore history.
Many of the people making AI love this literature, and yet they WANT to make everything AI-driven. Something something Torment Nexus.
Seems unlikely this tech will replace human creativity. But understanding how close the technology can get to human level on many different topics is an important thing to study.
Exactly! When will humans, aka humanity, realize this is a silent war taking place right in front of them?
This is interestingly similar to much of the use of LLMs. Unless you have expertise in a specific field, it will be difficult to find the issues and errors in the LLM output.
Using "naive" subjects for distinguishing poetry is similar to asking the general public to question an LLM about some aspect of law, and then getting them to rate the accuracy and completeness of the reply. They are very unlikely to be able to detect the good from the bad.
It is so important to critically evaluate the methodology to detect the good research from the bad and meaningless.
The deeper questions that never surface amid our unbridled acceptance of AI centre on WHY we WANT AI poetry, music, writing, et al.
I decided to let 4o take a shot at poetry, imitating the styles of Walt Whitman, T.S. Eliot, and Lord Byron, and giving it "autumn" as a theme.
https://chatgpt.com/share/673ccd81-d98c-8003-ae82-e4d9d11cd575
The rhyme pattern 3.5 stubbornly clung to seems to be absent. Only its imitation of Byron concerns itself with rhyme at all. It seems to have, as a pastiche, improved. Beyond this, I cannot say: when it comes to poetry, I am frankly a philistine.
So I had a silly idea, and presented them to a different ChatGPT 4o instance, as the work of a college student who was turning these in to a hard-ass professor. The results were... probably too nice, but with some expected critiques. "There’s a tendency to focus more on imitating the surface style rather than capturing the deeper thematic essence. Encouraging more complexity, especially in the T.S. Eliot poem, and more drama or irony in the Byron poem, could enhance the pieces."
https://chatgpt.com/share/673cd13c-c54c-8003-a21c-e416c5cc83c0
Davis's paper was a fun read. I did not read the original research, but the impression I got was that the whole study setup is extremely poorly thought out.
In a technical analogy it would be asking the general public to judge computer code, or legal argument, or medical research. Does that tell us anything about what is judged? Or more about the judges?
ChatGPT can fool amateurs. Now, whoopie. Researchers writing about that can fool those who don't generally consume research. Whoopie again. Spot a pattern, anyone?
ChatGPT's output may be preferred over difficult poetry. Well, fast food is generally preferred over healthier or more complicated alternatives too. Does that make fast food any better? At pressing base evolutionary buttons, sure. Spot a pattern, anyone?
I really like your take on this! As someone who is not really a foodie or a poetry devotee, I do find that resorting to preferences to assess ChatGPT output seems a bit like a fast food vs pâté taste test.
I can't help but also think that using some of the "objective" methods of Davis's paper could also be misleading and even a bit circular in deciding if the AI poetry is "good" poetry. However, the original paper was unfortunately making the seemingly much more difficult (objectively testable) claim that “AI-generated poetry is indistinguishable from human-written poetry.”
Writing something refreshingly original and insightful is hard enough even for people, let alone for chatbots, which are mindless imitators.
One should not lose sight of the bigger picture though. There's lots of low-hanging fruit to be picked by automating easy and boring work, and some companies will get handsomely rich.
I didn't believe the headline, period. Poetry is a human thing. Not a data point. We must now try not to make everything a data point in this A.I. era. The hype must be shut down. Human is human and A.I. is A.I. till further notice.
In another venue, where the subjects of the experiments had been called "participants", I wrote:
"The word "participants" is doing a lot of work here. Said "participants" were folks who don't read much poetry. Such as myself. (And most of the poetry mentioned is in a language (period English) that the "participants" don't speak and/or haven't studied.)
But I do (try to) play and study jazz (a bit), and can assure you lots of "participants" wouldn't like jazz, and even some who like it don't know what to listen for. Heck, there are tales about (and live recordings demonstrating) players getting pissed off at audiences who clap on 1 and 3, and so dropping a beat so that the audience clapping is shifted to 2 and 4, where it should be, thank you. So next we can do a study of how AI-produced jazz is preferred by "participants".
I call BS. YMMV, of course."
"The AI-generated poems were generated using ChatGPT-3.5." So they use a shitty outdated model and are surprised when it produces subpar poetry? The result would be very different using something like Claude 3.5 Sonnet (or even 3 Opus) or chatgpt-4o-latest.
Still, GenAI is an awesome tool, I use it every day. I am still amazed that this can even work, it's like magic.
But I don't think we really have to worry about creativity being replaced by LLMs. If we look at the mass market and mass media, 99% is already complete garbage. Just open YouTube, TikTok, Netflix or... Substack if you need some convincing. GenAI is just going to skew the ratio to 99.9%. No soul, no narrative, just a pile of garbage.
And the reality of today's society is that it's what people actually want. They want a distraction, to fill their already empty soul. They watch Netflix while working out, or doing their laundry, or even browsing their phone. The show is empty, the music is empty, the story is empty, and it renders all their tasks and activities empty as well.
In the end, the side-effect of GenAI might be, at least for some, the realization that this is not the way, and that we should disconnect the WiFi and fall back to reality and to substance.
LLMs already include the seeds of their own demise, having been trained on social media and mountains of other garbage (increasingly including their own output).
Why some folks believe that scaling the garbage pile up (on stuff like TikTok) will improve things is hard to fathom.
But who am I to question them.
Scale away!
What came to mind is the blind taste tests of wine. That wonderful year when "2 buck chuck" won as the best wine. However, I will also note that when looked into more deeply, "2 buck chuck" is made from surplus grapes, and some years top-grade vineyards over-produce. The chemists of wine that produce "2 buck chuck" are as good or better than any other vintners. Given an excellent basis for their wine, they can make an excellent wine from it. This has little relation to ChatGPT though.
I haven't read the study, but I suspect that if someone thought the ChatGPT output was as good as Shakespeare, they haven't read Shakespeare! "Did my heart love till now? Forswear it, sight! For I ne'er saw true beauty till this night."
The mere fact that GenAI’s default poetry is AABB should tell you that it’s simply taking the mean of all its training data, which is vast tracts of poetry from all kinds of online (often self-proclaimed) poets, not just the good or great ones, and most of which is in AABB rhyme. And since most poetry is average and cliched, it’s also programmed and bound to take the mean of that, and produce dross. It’s quite literally destined for average. It can’t escape it.
(Throw in some hallucinations for head-scratching fun to add to the banality.)
For fun, ask ChatGPT to not rhyme.
Again and again, it can’t not.
Those who can, do. Those who can't, create Artificial Imitation.
> I also think it is a safe bet that the idea that, one hundred years later, scientists would write that drivel generated by an automaton is “indistinguishable” from Shakespeare and Whitman would not have occurred to I.A. Richards in his darkest dreams, and would have occurred to Orwell only in his darkest dreams.
Maudlin, machine-produced "prolefeed" is one of the components of the dystopian state of Oceania in 1984, so suffice it to say it did occur to him in his darkest dreams.