this post was submitted on 17 Jul 2024
132 points (100.0% liked)

Technology

[–] superminerJG@lemmy.world 39 points 8 months ago (1 children)

Goodhart's law:

When a measure becomes a target, it ceases to be a good measure.

[–] bionicjoey@lemmy.ca 16 points 8 months ago (2 children)

The Turing Test (as some people believe it to be): if you can have a conversation with a computer and not tell if it's a computer, then it must be intelligent.

AI companies: writes ML model that is specifically designed to convincingly play one side of a conversation, even though it has no ability to understand the things it talks about.

[–] technocrit@lemmy.dbzer0.com 9 points 8 months ago (1 children)

It's worth emphasizing that the "Turing Test" is not a good test since it's not at all scientific.

It's just another thought experiment that grifters have taken to the bank.

[–] bionicjoey@lemmy.ca 8 points 8 months ago

Also, as Turing proposed it, it's meant to be infinitely repeatable. The test isn't supposed to just be whether a machine can convince one person with one conversation. That would be trivial. The real Turing test is the converse: it says that there should be no conversation one could have with the machine where it wouldn't convince you it's a human.

[–] kromem@lemmy.world 2 points 8 months ago

The most advanced models absolutely have modeling about what's being discussed and relationships between concepts.

Even toy models have been shown to build world models from very basic training data.

Honestly, read at least a little bit of the relevant research:

https://www.anthropic.com/news/mapping-mind-language-model

[–] exu@feditown.com 21 points 8 months ago

There's a reason the Open LLM Leaderboard was changed a while ago.
Basically, scores had stopped improving much, and many of the tests were contained in the training data.

See this blogpost for more info.

https://huggingface.co/spaces/open-llm-leaderboard/blog
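
For anyone wondering what "tests were contained in the training data" looks like in practice, here is a minimal sketch of one common way to check for it: word-level n-gram overlap. This is not the leaderboard's actual method, just an illustration. The function names (`ngrams`, `contamination_rate`) and the default window size are made up for the example.

```python
def ngrams(text, n):
    """Return the set of word-level n-grams in a text (lowercased)."""
    tokens = text.lower().split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def contamination_rate(benchmark_items, training_corpus, n=8):
    """Fraction of benchmark items sharing at least one n-gram with the corpus.

    Hypothetical helper: real contamination checks usually add
    tokenization, normalization, and fuzzy matching on top of this.
    """
    if not benchmark_items:
        return 0.0
    corpus_ngrams = set()
    for doc in training_corpus:
        corpus_ngrams |= ngrams(doc, n)
    # An item counts as contaminated if any of its n-grams appears in the corpus.
    flagged = [item for item in benchmark_items if ngrams(item, n) & corpus_ngrams]
    return len(flagged) / len(benchmark_items)


training = ["the quick brown fox jumps over the lazy dog"]
tests = ["quick brown fox jumps", "completely unrelated question here"]
print(contamination_rate(tests, training, n=3))
```

A short window like `n=3` flags a lot of innocent overlap, so longer windows are the usual choice on real data. Either way, a model that has seen the test items can score well without the score saying much about capability.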

[–] Buffalox@lemmy.world 19 points 8 months ago

IQ tests for humans are flawed in much the same way. Figuring out number series or relations in a graphic representation only tells you how good you are at those specific tasks, and doesn't provide a reliable picture of "general" intelligence.

[–] MajorHavoc@programming.dev 16 points 8 months ago

"close to meaningless" sums up my expert opinion on the whole current AI hype machine sales pitch.

Highly tuned models for incredibly specific, not-dangerous use cases are the next pragmatic step. There's a lot to be excited about, in that very narrow band.

Anyone selling more than that is part of a con, or in very rare cases, doing genuine "fuck off and ask me again in a decade" kinds of research.

[–] A_A@lemmy.world 4 points 8 months ago

Looks quite satisfying to me. Otherwise, we can still create new tests:

The tests cover an astounding range of knowledge, such as eighth-grade math, world history, and pop culture. Many are multiple choice, others take free-form answers. Some purport to measure knowledge of advanced fields like law, medicine and science. Others are more abstract, asking AI systems to choose the next logical step in a sequence of events, or to review “moral scenarios” and decide what actions would be considered acceptable behavior in society today.

[–] water@lemmy.world 2 points 8 months ago