Discussion about this post

Ben Lowndes:

I am concerned at the way people unquestioningly turn to these tools, and are prepared to explain away the glaring flaws.

"It's getting better..."

"It needs a clearer prompt...."

If junior staff made this many mistakes, consistently and without learning, they wouldn't last long in many high-performing teams.

AJB (Feb 4, edited):

Oh, and lest you think that only the "free" versions are flawed: no. The fancy, expensive ones from supposedly hallucination-free legal services like LexisNexis are STILL hallucinating wildly, 17-34% of the time, according to a recent audit by Stanford. Attorneys foolish enough to fire their paralegals and rely instead upon a LexisNexis or Westlaw AI agent are finding themselves in peril when the judge notices that the case presented as precedent was entirely made up by the AI.

From the Stanford write-up: "In a new preprint study by Stanford RegLab and HAI researchers, we put the claims of two providers, LexisNexis (creator of Lexis+ AI) and Thomson Reuters (creator of Westlaw AI-Assisted Research and Ask Practical Law AI), to the test. We show that their tools do reduce errors compared to general-purpose AI models like GPT-4. That is a substantial improvement, and we document instances where these tools provide sound and detailed legal research. But even these bespoke legal AI tools still hallucinate an alarming amount of the time: the Lexis+ AI and Ask Practical Law AI systems produced incorrect information more than 17% of the time, while Westlaw's AI-Assisted Research hallucinated more than 34% of the time."

https://hai.stanford.edu/news/ai-trial-legal-models-hallucinate-1-out-6-or-more-benchmarking-queries
