112 Comments

GenAI promise: help us find more needles.

GenAI reality: makes bigger haystacks, spray paints them silver.

Absolutely this is frightening. And what I worry about also is when these scientifically inaccurate AI generated scientific journal articles are then hoovered up into the LLM training data. It will be a race to the bottom for reliable scientific knowledge.

"In my opinion, every article with ChatGPT remnants should be considered suspect and perhaps retracted, because hallucinations may have filtered in, and both authors and reviewers were asleep at the switch."

And those authors should be stripped of their research positions, the reviewers fired, and the journals shuttered. Anything that has "Certainly, here is a list of..." or "I'm a language model" is corrupt and demonstrates that such articles were not reviewed in any meaningful sense. Everybody involved should be turfed with extreme prejudice.
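Screening for those giveaway phrases is easy to automate. Here is a minimal sketch; the phrase list and the `flag_llm_remnants` helper are hypothetical illustrations, and a real screen would use a curated, regularly updated phrase set rather than this handful:

```python
import re

# Hypothetical telltale chatbot phrases (illustrative, not exhaustive).
TELLTALE = [
    r"certainly,? here is",
    r"as an ai language model",
    r"i'?m a language model",
    r"as of my last knowledge update",
]
PATTERN = re.compile("|".join(TELLTALE), re.IGNORECASE)

def flag_llm_remnants(text: str) -> list[str]:
    """Return every telltale phrase found in a manuscript's text."""
    return [m.group(0) for m in PATTERN.finditer(text)]

sample = "Certainly, here is a list of relevant prior work..."
print(flag_llm_remnants(sample))  # ['Certainly, here is']
```

A journal could run something like this over every submission in seconds, which makes it all the more damning that these phrases survived peer review.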

"the total number of articles may radically spike, many of them dubious and a waste of reviewers’ time. Lots of bad stuff is going to sneak in."

Many of them dubious? I would suggest that all of the LLM-created ones will need retraction.

With this sort of spike, model collapse will be inevitable.

BTW, I have just been looking at the accuracy, consistency, and completeness of LLM responses under very slight changes to the input prompt. The results are amazing and worrying.
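That kind of perturbation test can be sketched in a few lines. Assume a hypothetical `ask_llm` call standing in for whatever model API is used (stubbed here with canned answers so the sketch is self-contained), and measure how much the answers drift across near-identical prompts:

```python
from difflib import SequenceMatcher

def ask_llm(prompt: str) -> str:
    """Stand-in for a real LLM API call (hypothetical stub)."""
    canned = {
        "How many moons does Mars have?": "Mars has two moons, Phobos and Deimos.",
        "How many moons does the planet Mars have?": "Mars has 2 moons.",
        "Mars: how many moons?": "Two: Phobos and Deimos.",
    }
    return canned.get(prompt, "I'm not sure.")

def consistency(prompts):
    """Collect answers to near-identical prompts and report the
    worst pairwise textual similarity among them."""
    answers = [ask_llm(p) for p in prompts]
    scores = []
    for i in range(len(answers)):
        for j in range(i + 1, len(answers)):
            scores.append(SequenceMatcher(None, answers[i], answers[j]).ratio())
    return answers, min(scores)

variants = [
    "How many moons does Mars have?",
    "How many moons does the planet Mars have?",
    "Mars: how many moons?",
]
answers, worst = consistency(variants)
print(worst)  # a low score flags answers that diverge across trivial rewordings
```

Textual similarity is a crude proxy, of course; for substantive consistency one would compare the extracted facts, not the strings.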

Mar 15 · Liked by Gary Marcus

Reminds me of the supercomputers sent by the Trisolarans to mess with our science and scientific minds in the book The Three-Body Problem.

And if you prompt GPT-4 the wrong way, it will just make shit up - like quantitative data - without providing a disclaimer. And then if you use these bullshit numbers in an article, and then later on, you ask for similar numbers, it will quote the numbers that it made up previously. And so the "knowledge base" builds...

In the long-ago era of 2015, when I first really started worrying about AI in general (no pun intended, sorry), I was consumed by the idea that if AI were able to simulate reality "well" enough, science would grind to a halt, because nothing would be reproducible or verifiable without costs far exceeding the cost of generating the shit in the first place. Science is based on providing proof, but anyone who must doubt every single thing about their reality gets nothing done and lives a tiring, miserable existence.

Personally, the only "upside" I have seen of this entire hype cycle is the ability to generate specific images on command for blogs like this one (which reduces their value to zero immediately, no offense intended), and the summarizing of text that nobody was going to read anyway, especially if that text was AI-generated in the first place. These trinkets are not worth the price of our entire civilization. However dysfunctional it is currently, it will cease to function at all once this dreck clogs up every gear keeping society running.

Mar 15 · Liked by Gary Marcus

"Trust but Verify" should apply to LLMs, not just nuclear proliferation, with a slight mod: Trust [gingerly] but Verify [profusely].

I have checked the first twenty results that come up in the linked "certainly, here is" search. Eighteen of them are in low-quality, for-profit open-access journals, generally run by 'publishers' I have never even heard of but that were clearly created in the last few years to jump on the open-access bandwagon, or are PDFs posted on the social network ResearchGate. The other two are published by IEEE and Springer, but they are contributions to conference proceedings, which (for better or for worse) attract less scrutiny than a journal paper, because they merely accompany a talk given by the authors, which would have been the main event. More confusingly, some of the publications did not seem to contain the phrase at all, so there may also have been a few false positives.

To be clear, I am concerned by the use of ChatGPT et al in science, especially when I see "Certainly, here is a literature survey with citations", given that this isn't just somebody having the bot create a draft of the introduction to overcome writer's block but instead circumventing one of the key steps of writing a paper, understanding the literature in your own field. Also, of course, the bots are known to make up references, so that's a terrible idea anyway.

Still, as I wrote in the previous thread, this doesn't show me that it is a quantitative problem for serious journals. Hallucinated references, for example, would be discovered by every publisher I have recently published with, because their production editors cross-check all references while adding DOIs to the list. At that moment at the latest, the jig would be up, even if the reviewers missed a made-up reference.

Thus, currently, it still seems to me as if this is a match made in heaven between lazy and incompetent authors who use PlagiarismBot 4.0 and 'we will publish anything you want on our website and format it to look like a research paper, in exchange for a fee' style predatory operations run out of a garage somewhere, but not a significant problem in any scientific journal whose articles I, as a scientist, would actually read myself. The main problem will be that most people do not have the background knowledge to differentiate between a well-edited, high quality journal and the bottom-feeding paper mills, so they will mistakenly conclude that all of science must be broken now.

But in reality, this is like buying fake Rolexes off shady dudes in a back alley three times in a row, having them come apart in your hands each time, and then concluding that Rolex itself is a scam. Sorry, but it's not that company's fault if you can't figure out that the shifty-looking fellow who runs off the moment he has the money isn't selling you the real thing for fifty bucks. Likewise, I do not see what the scientific community can possibly do, even in theory, if some random tweeter who comes pre-convinced that science deserves to be shut down as a whole points selectively to the poor standards of something with a name to the effect of "International Scholarly Journal of Futuristic Advancement in Innovation Science" that solicits contributions via emails starting with "Greetings of the Day, Professor!!!!". Hard-working, diligent scientists have no control over how many people set up a website and format it to ape the look of a research journal. It is easy! Any reader here could open their own journal within the week, if they put their mind to it.

A flood of submissions may become a problem for genuine journals, perhaps. We will have to see how that works out. At the moment I am happy (not really) to report that submitting a manuscript to most serious journals is such a soul-destroying battle with editorial management software that it may discourage spamming to a sufficient degree. There was a short window in my career where you could just email your manuscript to the editor but, alas, that definitely isn't the case these days.

Maybe, thanks to attention like Gary's blogs, the peer-review system will finally get the overhaul it badly needed even before this happened. This just makes the need much more obvious. Great work!

Mar 15 · edited Mar 15

Bad "journals" and paper-writing factories have existed for many, many years. This has had various effects, a key one being the brand enhancement of, and focus on, a small number of key journals and conferences within each sub-field and specialist community. Most global academic communities are surprisingly small; one of the reasons I didn't want to spend the rest of my post-PhD life studying the intricacies of human colour vision was that I didn't fancy having the same arguments with the same 50 people for four decades. Pull one of these GPT howlers in a key publication for your sub-field and you can kiss your reputation goodbye in no time.

“Shut it down if they can’t fix this problem.” 🎯

The violations are so blatant that the authors did not even bother to delete the AI-specific language. I suspect lots of other authors also used AI but simply remembered to delete it. The author-reviewer system has always been based on good faith, and now it's under more threat than ever. PS: I suspect some reviewers are also using AI.

What’s more, it will start digesting its own output, as some have noted.

“Certainly, here is ‘Certainly, here is “Certainly, here is ‘Certainly, here is’ “ ‘ “ ...

Reminds me of standing in a bathroom that has a mirror on the back wall also, and you get this wild tunnel effect.

Welcome To The Hall Of Mirrors: empty and meaningless, reflecting whatever you put in the middle.

I wonder if the solution to this problem ends up being a return to ye olden days, when luminaries in a field acted as gatekeepers, and it was essentially impossible to get any recognition without their blessing. On the one hand, gatekeeping is deeply unfair, and it massively slows progress (there is a reason why science was once said to progress "one funeral at a time"). On the other hand, gatekeepers do at least have an incentive to protect their reputation as arbiters of the quality of the work they endorse.

If you think about it, the cost balance between type I and type II errors shifts as the average quality of the submitted work declines. If the median paper is AI-generated garbage, then it's worth missing a few good papers to be certain of rejecting the chaff. Science as an enterprise suffers, both because valuable work gets discarded and because some worthy researchers can't get traction and get bounced from the field. However, it doesn't suffer nearly as much as it will if nobody can find the good work because it is lost in a sea of garbage.

"Move fast and break things" might not be the strategy it's been cracked up to be.
