27 Comments
7 hrs ago · Liked by Gary Marcus

Programming is painstaking work. If you paste in some code and hope for the best, you will encounter a lot of grief, much of it arriving much later.

GenAI is an aid, to be used in small, incremental doses. Used that way, it is good for you.


AI-machinery makers are now gathering user input, in the form of the small doses you describe, to build better machines that can guess what the original request or task requires. The thing is, as Marcus already mentioned: when open source code already exists, written in a well-structured, reconfigurable way, it is already reusable, and the copilot is not needed. Right now a code copilot is just a non-transparent search engine: it should point to the source it draws from (including its license) instead of sustaining the illusion that it is "generating" code or "co"-programming. And on the other hand, if there is a truly new requirement that really needs new code, a new algorithm, or a new data structure, then it will take a person who understands the new requirement to craft a new solution.

There is a gap in the search for good, properly working code. But current AI code generators are designed (my opinion, from how I read the OpenAI and Microsoft context) with the intention to imitate and replicate developers in order to replace them, with the copilot as an intermediate step toward closing their own gap (the current limitations of code generators), not the good-code-search gap (the problem whose solution keeps human agency in place and reduces technical debt).


There are many ways of looking at it. CoPilot is helpful in my work. There's always a need for custom code, even if the pattern may already be obvious in other code.

I anticipate there will be future tools for refactoring, debugging, etc.

7 hrs ago · Liked by Gary Marcus

This aligns with my experience. I’m a Sr. Engineer and we use Copilot, Cursor, ChatGPT, etc. at my company.

Personally, I haven’t seen a meaningful uptick in feature velocity since we adopted GenAI coding assistants, but I am seeing more code volume from Jr. devs with bizarre bugs. My time digging through PRs has ticked up for sure.

In my dev work I’ll find myself turning off Copilot half the time, because its hallucinated suggestions get pretty distracting.

6 hrs ago · Liked by Gary Marcus

A software developer learns how to code. An LLM doesn't even know what code is. Throwing together a probabilistic sequence of vectors that appear many times in GitHub repos will only get you so far.
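
A toy sketch of what "probabilistic sequence" means in practice; the tokens and weights below are entirely made up for illustration:

```python
# Toy illustration: a model picks the next token by sampling a learned
# distribution, with no model of what the program means. These weights
# are invented for the example.
import random

next_token_probs = {
    ("def", "main"): {"(": 0.97, ":": 0.02, "[": 0.01},
    ("main", "("): {")": 0.90, "args": 0.08, "self": 0.02},
}

def sample_next(context):
    dist = next_token_probs[context]
    return random.choices(list(dist), weights=list(dist.values()))[0]

print(sample_next(("def", "main")))  # usually "(", but not always --
                                     # and "usually" is not how correct code works
```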

7 hrs ago · Liked by Gary Marcus

The only way we'd see a 10x programming productivity gain is if AI could write entire apps reliably from some kind of easy-to-write description. Of course, that is exactly what some hype merchants have claimed. Assuming there were such a problem domain, management would quickly realize that it is so regular that they could write a single program that, with a few input parameters, could generate the target apps with greater reliability and maintainability, and fewer compute resources, than the AI solution.
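
A minimal sketch of that last point, with entirely hypothetical names: once a domain is regular enough to pin down with a few parameters, a plain template generator produces each app deterministically.

```python
# Minimal sketch of a deterministic app generator; every name here is
# hypothetical. Same parameters in, same app out: cheaper and more
# reliable than sampling an LLM once per app in a regular domain.
from string import Template

APP_TEMPLATE = Template('''\
from flask import Flask, jsonify

app = Flask("$name")

@app.get("/$resource")
def list_$resource():
    return jsonify([])  # a real generator would wire in $storage here

if __name__ == "__main__":
    app.run(port=$port)
''')

def generate_app(name, resource, storage, port):
    return APP_TEMPLATE.substitute(name=name, resource=resource,
                                   storage=storage, port=port)

print(generate_app("inventory", "items", "sqlite", 8080))
```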

6 hrs ago · Liked by Gary Marcus

Imagine if all this money was given to open source libraries / frameworks / higher-level-language creators...

They actually raise the level of abstraction and let programmers do more with less code. And that's been happening since the beginning, without any hype.
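
A toy illustration of what raising the abstraction level buys (Python here, but the point is language-agnostic):

```python
# The same task at two abstraction levels. The higher-level library is
# what actually lets you do more with less code.
orders = [
    {"customer": "ada", "amount": 30},
    {"customer": "bob", "amount": 12},
    {"customer": "ada", "amount": 5},
]

# Lower level: explicit bookkeeping, more places for bugs to hide
totals = {}
for o in orders:
    totals[o["customer"]] = totals.get(o["customer"], 0) + o["amount"]

# Higher level: one line against a library that owns the bookkeeping
import pandas as pd
totals = pd.DataFrame(orders).groupby("customer")["amount"].sum()
```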

6 hrs ago · Liked by Gary Marcus

There was a great paper recently (SWE-bench) that found that the best off-the-shelf LLMs solve about 2% of a curated set of GitHub issues. Even if that can be 10x'd by fine-tuning, it still is not a replacement for a software engineer, especially since someone still needs to verify the solutions.
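
The verification step is mechanical but unavoidable. Roughly (and this is just the shape of it, not the actual SWE-bench harness):

```python
# Rough sketch, not the real SWE-bench harness: apply the model's patch,
# run the repo's tests, revert. A human still has to judge whether
# "tests pass" actually means "issue resolved".
import subprocess

def patch_passes_tests(repo_dir: str, patch_file: str) -> bool:
    if subprocess.run(["git", "apply", patch_file], cwd=repo_dir).returncode != 0:
        return False  # many generated patches fail before a single test runs
    result = subprocess.run(["python", "-m", "pytest", "-q"], cwd=repo_dir)
    subprocess.run(["git", "checkout", "--", "."], cwd=repo_dir)  # revert tracked changes
    return result.returncode == 0
```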

5 hrs ago · Liked by Gary Marcus

Sergey Brin recently commented at the All-In Summit that none of his devs are using AI. He thought they should be, and he has been trying to encourage them to use it. He says he wowed them a few times when he used AI to quickly generate some demo apps. But it raises the question: why aren't devs, the people most amenable to AI and quickest to adopt new technology, using it on their own instead of having to be pushed into it? Another data point in support of Gary's premise: experts find much less benefit from LLMs than non-experts, who can be happy with an almost-solution.


This, exactly. If I've run into a problem that I can't figure out, even after scouring places like stackoverflow, there's zero chance an LLM is gonna get me the answer. It's basically doing a stupider, less reliable version of searching the internet!


So much funding wasted on Sisyphus. No modularity in AI equates to no clever design. Slow and buggy code crops up rather than correct and optimal. Sigh.

7 hrs ago · Liked by Gary Marcus

People should be aware that using CoPilot is a risky intellectual property business. If you're writing code for your company, or for hire, you're potentially giving up your copyright to the code. Be careful.

7 hrs ago · edited 7 hrs ago · Liked by Gary Marcus

Even 2x would be hype. And don't forget the LLM terms-of-service forbid working on AI/ML code.


Those types of claims utterly ignore the technical debt up the wazoo that's gonna bite every "LLM-code"-infested project out there: https://www.geekwire.com/2024/new-study-on-coding-behavior-raises-questions-about-impact-of-ai-on-software-development/

=====

But while AI may boost production, it could also be detrimental to overall code quality, according to a new research project from GitClear, a developer analytics tool built in Seattle.

The study analyzed 153 million changed lines of code, comparing changes done in 2023 versus prior years, when AI was not as relevant for code generation. Some of the findings include:

“Code churn,” or the percentage of lines thrown out less than two weeks after being authored, is on the rise and expected to double in 2024. The study notes that more churn means higher risk of mistakes being deployed into production.

The percentage of “copy/pasted code” is increasing faster than “updated,” “deleted,” or “moved” code. “In this regard, the composition of AI-generated code is similar to a short-term developer that doesn’t thoughtfully integrate their work into the broader project,” said GitClear founder Bill Harding.

The bottom line, per Harding: AI code assistants are very good at adding code, but they can cause “AI-induced tech debt.”

=====
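
For the curious, the churn metric quoted above can be approximated from any repo's history. A very rough sketch using day-level totals from `git log --numstat`; GitClear's actual methodology tracks individual lines, so this is only a crude proxy:

```python
# Very rough churn proxy, NOT GitClear's methodology: of the lines added
# on a given day, how many are deleted again within the next two weeks?
import subprocess
from collections import defaultdict
from datetime import date, timedelta

def daily_stats(repo="."):
    out = subprocess.run(
        ["git", "log", "--numstat", "--pretty=format:@%as"],  # %as = YYYY-MM-DD
        capture_output=True, text=True, cwd=repo, check=True,
    ).stdout
    adds, dels, day = defaultdict(int), defaultdict(int), None
    for line in out.splitlines():
        if line.startswith("@"):
            day = date.fromisoformat(line[1:])
        elif "\t" in line:
            a, d, _path = line.split("\t", 2)
            if a != "-":  # binary files report "-"
                adds[day] += int(a)
                dels[day] += int(d)
    return adds, dels

def churn_proxy(adds, dels, window_days=14):
    churned = total = 0
    for day, added in adds.items():
        total += added
        later = sum(dels.get(day + timedelta(n), 0)
                    for n in range(1, window_days + 1))
        churned += min(added, later)  # cap: can't churn more than was added
    return churned / total if total else 0.0

adds, dels = daily_stats()
print(f"churn proxy: {churn_proxy(adds, dels):.1%}")
```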


I strongly disagree... It will 10x the very bad developers, and there are a loooooot of them 😁


Exactly. ChatGPT makes me 1,000,000x more productive at coding in languages I've never seen before! Hence, I am not the person for that job.


A 10x increase in crap production would still be an increase in productivity, at least to the folks who don’t count bugs (and who, not incidentally, have precisely zero understanding of software production).

And to the folks who found a 41% increase in bugs, are they really sure it’s not 41.589%?

I suspect a lot of bugs and security flaws are simply going undetected in the effort to increase “productivity”.

But the reality is that no one really knows what the long term impact of this “experiment” is going to be.

But those who have actually done software development can make an educated guess.

Once the AIs start training on their own buggy code output (which may actually already be happening), it’s going to be a downward spiral.

1 hr ago · edited 1 hr ago

As someone who experiences the 10x in real life (despite the cringe attached to the term, I think it's apt), I think the critics are missing the obvious.

1. building software is mostly not about code

2. LLMs don't do all that well at code, but they can generate things that have the right code shape (see the sketch after this list)

3. there are many artifacts that are not production code but are extremely useful to building good software
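
On point 2, "right code shape" without substance looks like this; the pandas method below is deliberately made up, since that is exactly the kind of plausible hallucination meant:

```python
# Deliberately broken illustration of "right code shape": this reads
# like idiomatic pandas, but DataFrame.drop_outliers does not exist.
# The shape is right; the substance is hallucinated.
import pandas as pd

df = pd.read_csv("data.csv")                        # hypothetical file
clean = df.drop_outliers(columns=["price"], z=3.0)  # AttributeError at runtime
```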

If you put this all together and focus on "what do humans need to build good software collaboratively", good uses of LLMs become apparent:

- good documentation / rfcs / knowledge bases / onboarding docs / mentoring / etc...

- logging, monitoring, error messages, visualizers, analysis tools, etc...

- prototypes prototypes prototypes. You don't even need to run them, but they are a sort of solo-adventure-whiteboard-brainstorming

I gave a workshop about the topic that hopefully gives a bit more insight into how I approach things: https://www.youtube.com/watch?v=zwItokY087U

Handout is here: https://github.com/go-go-golems/go-go-workshop

What this looks like in practice (in my open source stuff, at least) is that I can build software like this: https://github.com/go-go-golems/go-go-labs/blob/main/web/voyage/app.md in an hour or two in the evening, after work, without feeling like I am really writing software.

For longer-term software: https://github.com/go-go-golems/go-go-labs/blob/main/pkg/zinelayout/parser/units_doc.md

I don't really care if I have to fill in the 10 lines that do the actual complicated thing, that's fun.

But I 100% stand behind a 10x improvement in quality (productivity is maybe not the best word). Faster "coding" means faster iteration/prototyping, and iteration is one of the key ingredients of building something that is actually useful.


As a C# developer, I believe ReSharper and their Rider IDE have done more to make my job easier than anything else.


I'll give LLMs one thing: they've made me a lot faster at creating plots in R. It's hard to remember all the syntactic quirks required for doing all the little things I might want to do to make a plot look nice, and GPT is quite good at taking my description of what I want to see and giving me code that creates it.

But this is pretty in-the-weeds: I'm using GPT to make one-off images, so I don't care how efficient the code is and I don't care about "bugs". Inserting GPT-generated code into a continuous workflow is more dangerous. It's valuable as a tool for jogging your memory or sparing yourself the time searching through stackoverflow. But if you don't fully understand the code it generates, you're asking for trouble.
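
A Python/matplotlib analogue of that kind of one-off snippet (file and column names are hypothetical), where syntax recall is the whole job and the output is verified by eye:

```python
# The kind of one-off plot an LLM handles well: all syntax recall, no
# business logic, correctness checked by looking at the picture.
import matplotlib.pyplot as plt
import pandas as pd

df = pd.read_csv("results.csv")  # hypothetical file with x, y, group columns
fig, ax = plt.subplots(figsize=(6, 4))
for name, grp in df.groupby("group"):
    ax.plot(grp["x"], grp["y"], marker="o", label=str(name))
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.legend(title="group", frameon=False)
fig.tight_layout()
fig.savefig("plot.png", dpi=150)
```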
