  • Rabbit R1 AI box is actually an Android app in a limited $200 box, running on AOSP without Google Play.
  • Rabbit Inc. is unhappy about details of its tech stack being public, threatening action against unauthorized emulators.
  • AOSP is a logical choice for mobile hardware as it provides essential functionalities without the need for Google Play.
[–] [email protected] 8 points 11 months ago (3 children)

The most convincing answer is the correct one. The correlation between AI answers and correct answers is fairly high; numerous tests show that. The models have also improved significantly (especially the paid versions) since their introduction just two years ago.
Of course, that does not mean they can be trusted as much as Wikipedia, but they are probably a better source than Facebook.

[–] [email protected] 21 points 11 months ago (2 children)

"Fairly high" is still useless (and doesn't actually quantify anything, depending on context both 1% and 99% could be 'fairly high'). As long as these models just hallucinate things, I need to double-check. Which is what I would have done without one of these things anyway.

[–] [email protected] 3 points 11 months ago (1 children)

1% correct is never "fairly high" wtf

Also, if you want a computer that you don't have to double-check, you are literally expecting software to embody the concept of God. This is fucking stupid.

[–] [email protected] 11 points 11 months ago* (last edited 11 months ago) (1 children)

1% correct is never “fairly high” wtf

It's all about context. If you asked a bunch of 4-year-olds questions about trigonometry, 1% of answers being correct would be fairly high. 'Fairly high' basically only means 'as high as expected' or 'higher than expected'.

Also, if you want a computer that you don't have to double-check, you are literally expecting software to embody the concept of God. This is fucking stupid.

Hence, it is useless. If I cannot expect it to be more or less always correct, I can skip using it and just look stuff up myself.

[–] [email protected] 1 points 11 months ago (1 children)

Obviously the only contexts that would apply here are ones where you expect a correct answer. Why would we evaluate software that claims to be helpful against 4-year-olds asked to do calculus? I have to question your ability to reason for insinuating this.

So confirmed: God or nothing. Why don't you go back to quills? Computers cannot read your mind and write this message automatically, hence they are useless.

[–] [email protected] 7 points 11 months ago (1 children)

Obviously the only contexts that would apply here are ones where you expect a correct answer.

That's the whole point: I don't expect correct answers. Neither from a 4-year-old nor from a probabilistic language model.

[–] [email protected] 1 points 11 months ago (2 children)

And you don't expect a correct answer because it isn't correct 100% of the time. Some lemmings are basically just clones of Sheldon Cooper.

[–] [email protected] 6 points 11 months ago

I don't expect a correct answer because I used these models quite a lot last year. At least half the answers were hallucinated. And it's still a common complaint about this product if you look at actual reviews (e.g., I'm pretty sure Marques Brownlee mentions it).

[–] [email protected] 3 points 11 months ago

Hallucinations are largely dealt with if you use agents. It won't be long until it gets packaged well enough that anyone can just use it. For now, it takes a little bit of effort to get a decent setup.
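
Roughly, the pattern is to have the model draft an answer, then run a second pass that tries to reject it, and retry on failure. A hypothetical sketch (`ask_llm` is a stand-in for whatever model call you use, not a real library function):

```python
# Hypothetical sketch of a simple self-checking "agent" loop.
# ask_llm is a placeholder for any chat-model call; wire it up yourself.
def ask_llm(prompt: str) -> str:
    raise NotImplementedError("connect this to your model of choice")

def answer_with_check(question: str, retries: int = 2) -> str:
    draft = ask_llm(question)
    for _ in range(retries):
        verdict = ask_llm(
            f"Question: {question}\nProposed answer: {draft}\n"
            "Is this answer factually correct? Reply YES or NO, with a reason."
        )
        if verdict.strip().upper().startswith("YES"):
            return draft
        # Feed the checker's objection back and ask for a corrected answer.
        draft = ask_llm(
            f"{question}\nA previous answer was rejected because: {verdict}\nTry again."
        )
    return draft  # best effort after all retries
```

It won't catch everything (the checker can hallucinate too), but this is the kind of scaffolding most agent frameworks package up.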

[–] [email protected] 11 points 11 months ago (3 children)

An LLM has never generated a correct answer to any of my queries.

[–] [email protected] 16 points 11 months ago (1 children)

That seems unlikely, unless "any" means two.

[–] [email protected] 6 points 11 months ago (3 children)

Perhaps the problem is that I never bothered to ask anything trivial enough, but you'd think that two rhyming words starting with "L" would be simple.

[–] [email protected] 2 points 11 months ago (1 children)

Ok, so by 'asking' you mean that you find questions somewhere that someone has already identified as being answered wrongly by an LLM, and ask them yourself.

[–] [email protected] 2 points 11 months ago

"AI" is a really dumb term for what we're all using currently. General LLMs are not intelligent, it's assigning priorities to tokens (words) in a database, based on what tokens were provided before it, to compare and guess the next most logical word and phrase, really really fast. Informed guesses, sure, but there's not enough parameters to consider all the factors required to identify a rhyme.

That said, honestly I'm struggling to come up with 2 rhyming L words? Lol even rhymebrain is failing me. I'm curious what you went with.

[–] [email protected] 5 points 11 months ago

I've asked GPT-4 to write specific Python programs, and more often than not it does a good job. And if the program is incorrect, I can tell it about the error and it will often manage to fix it for me.
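
That workflow can even be scripted. A rough sketch of the "run it, paste the error back" loop, assuming the official openai Python SDK; the model name and prompt are just placeholders:

```python
import subprocess
from openai import OpenAI  # assumes the v1+ openai SDK

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def ask(messages):
    resp = client.chat.completions.create(model="gpt-4", messages=messages)
    return resp.choices[0].message.content

messages = [{"role": "user",
             "content": "Write a plain Python script (no markdown) that prints the first 10 primes."}]
for _ in range(3):  # give it a few repair attempts
    code = ask(messages)
    result = subprocess.run(["python", "-c", code], capture_output=True, text=True)
    if result.returncode == 0:
        print(result.stdout)
        break
    # Feed the traceback back so the model can try to fix its own bug.
    messages += [{"role": "assistant", "content": code},
                 {"role": "user", "content": f"That failed with:\n{result.stderr}\nPlease fix it."}]
```

In practice you'd also strip any markdown fences from the reply and sandbox the execution, but the loop is the same idea.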

[–] [email protected] 5 points 11 months ago (2 children)
[–] [email protected] 1 points 11 months ago

I think Meta hates your answer