506
Advanced OpenAI models hallucinate more than older versions, internal report finds
(www.ynetnews.com)
This is a most excellent place for technology news and articles.
This is a big reason why I continue to cringe whenever I hear one of the endless news stories or podcasts about how AI is going to revolutionize our society any day now. It's clear they are being better with image generation but text 'thinking' is way too unreliable to use like human replacement knowledge workers or therapists, etc.
This is an increasingly bad take. If you work in an industry where LLMs are becoming very useful, you would realize that hallucinations are a minor inconvenience at best for the applications they are well suited for, and the tools are getting better by leaps and bounds, week by week.
edit: Like it or not, it’s true. I use LLMs at work, most of my colleagues do too, and none of us use the output raw. Hallucinations are not an issue when you are actively collaborating with the model and not using it to either “know things for you” or “do the work for you.” Neither of those things are what LLMs are really good at, but that’s what most laypeople use them for, so these criticisms are very obviously short-sighted to those of us who have real-world experience with them in a domain where they work well.
you're getting down voted because you accurately conceive of and treat LLMs the way they should be—as tools. the people down voting you do not have this perspective because the only perspective pushed to people outside of a technical career or research is "it's artificial intelligence and it will revolutionize society but lol it hallucinates if you ask it stuff". This is essentially propaganda because the real message should be "it's an imperfect tool like all tools but boy will it make getting a lot of certain types of work done way more efficient so we can redistribute our own efforts to other tasks quicker and take advantage of LLMs advanced information processing capabilities"
tldr: people disagree about AI/LLMs because one group thinks about them like Dr. Know from the movie A.I. and the other thinks about them like a TI-86+ on steroids
Yep, you’re exactly right. That’s a great way to express it.