62 Comments
Aaron Turner's avatar

Claude Opus 4.6 for POTUS --- it can't possibly do a worse job than the present incumbent!

Thomas Schmid's avatar

How true ;-) To quote Jon Stewart (The Daily Show, https://www.youtube.com/watch?v=WCkcPcMTYuQ, 19:44 in): "I can't believe it. Our bombs are now smarter than our president."

Amy Ngo's avatar

I’ve used both Opus 4.5 and 4.6 for work — couldn’t agree more!

TheOtherKC's avatar

And taxes are exactly the kind of thing a layperson would want and expect a proper AI to be capable of. Boring, monotonous, pretty structured...

jnappi's avatar

Doesn't surprise me that it fails to do taxes given what I see it do with code. https://blog.nappisite.com/p/why-are-llms-so-good-at-writing-code

John Pienta's avatar

The entire other side of this is they are WHOLLY UNPREPARED for your tax data in terms of data security and safety. God knows who has access to your tax and financial information if you put it into a model.

People are also forgetting it is not a magic in-out box. All your chats are being saved, and mined for data, trends etc.

Oaktown's avatar

I've talked with so many people who don't know this and enter all kinds of very personal data into those LLMs. Thanks for reminding us and thanks to Gary Marcus and his persistent work to educate the public before it's too late. Tick, tick, tick ...

John Pienta's avatar

There are enormous legal consequences for storing financial data willy-nilly. They, as companies, should be very, very concerned about this and actively trying to dissuade people from using it for such purposes.

Oaktown's avatar

Agree, but they have no ethics or concern for anything but money and power.

Jim Ryan's avatar

Scam Altmann would send it all to Iran

Joseph P. Duchesne's avatar

In my experience LLMs are really poor at making judgment calls.

To your point, it is therefore super scary that they are being used in active battlefield military decision-making, let alone autonomous weaponry.

Bron's avatar

Microsoft 365 Copilot (ChatGPT) can’t even be trusted to solve technical problems with Microsoft Windows 11 software. You would think Microsoft would ensure that its own version of ChatGPT, aka 365 Copilot, was infused with solutions to those kinds of problems. After months of attempting to transition several laptops from Microsoft Personal to Microsoft Business, I learned the hard way that 365 Copilot has a hard time solving Microsoft software issues. Go figure.

TheAISlop's avatar

Copilot is horrible and puts a microscope on Microsoft's failures to leverage their greatest advantage -- dominance of the Enterprise space.

Oaktown's avatar

They're trying (in vain) to get their money's worth out of their ill-advised LLM investments, but it seems to me they're just infuriating their customers. I hope those customers hold MS's feet to the fire.

Stephen Schiff's avatar

It says something, though I am not sure what, that an LLM can't replace the technology embodied in a 1980s-era spreadsheet.

TheAISlop's avatar

Wait what? I'm not getting a $75,203 refund???

Kotzsu's avatar

why would a stochastic parrot be good at doing taxes?

Tim Koors's avatar

Once again AI overpromises and underdelivers. It is like watching a slow motion train wreck and you can do nothing to prevent it. Is another AI winter on the way? History does not repeat but it does echo. Hope that this does not take the economy down along with it.

Oaktown's avatar

We're gonna need more than hope to prevent that.

Nat Irvin II's avatar

Humans are truly predictable, especially when we underestimate our own ability to make a mess of things.

Patrick Logan's avatar

The federal government had free software to do many people's taxes. The current administration shut that down in the name of... well, I'm not sure why. Most people don't need anything more than rule-based software for taxes.

RCThweatt's avatar

Just one OMFG after another these days.

toolate's avatar

Once you realize that the goal is not success as most of us define it, but something closer to how Big Brother defined it in 1984, all of this makes perfect sense.

Amy A's avatar

Gary, would love to hear your thoughts on OpenAI planning an IPO by the end of the year and NVIDIA using that as an excuse to drastically cut their investment commitment.

Oaktown's avatar

I've used TurboTax for several years. Last year I noticed an obvious decline in the interface's accuracy and ease of use, so I asked a rep if they had introduced AI LLMs, and he said "Yes."

I gave him my two cents of feedback, explaining that the quality (especially ease of use, step-by-step questions, reliability, and info dialog boxes) had been far better before they added the LLMs. They of course didn't take my advice, and this year it was even worse!! The step-by-step method never asked whether I had paid estimated taxes, and its review said my returns looked great, with no red flags detected, and that I owed a penalty. Only after I searched for the specific IRS estimated tax form and entered my payments for the previous year did it calculate that I was owed a refund, even though my previous returns had been available to show how much estimated tax I had owed.