this post was submitted on 12 Aug 2024

Technology

[email protected] 4 points 8 months ago (last edited 8 months ago)

Models trained against reward models (as in reinforcement learning) or with preference optimization can come to some conclusions that we humans find very strange when they learn from patterns in the data they’re trained on. That is especially true when those incentives and preferences are evaluated (or generated) by other models. Some of these models could very well come to the conclusion that nuking every advanced-tech human civilization is the optimal way to improve the human species, because we have such rampant racism, classism, nationalism, and every other schism that perpetuates us treating each other as enemies to be destroyed and exploited.
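To make the failure mode concrete, here is a toy sketch of a misspecified reward. Everything in it is hypothetical and made up for illustration (the action names, the numbers, and both reward functions); it is not how any real system is built. It only shows that an optimizer told to minimize conflict, with nothing said about keeping people around, will happily pick the degenerate option.

```python
# Toy illustration of a misspecified (proxy) reward.
# All actions, numbers, and reward functions are hypothetical.
from dataclasses import dataclass


@dataclass(frozen=True)
class Outcome:
    population: int  # people left after the action
    conflicts: int   # conflicts observed after the action


# Made-up outcomes that some simulator might predict for each action.
OUTCOMES = {
    "do_nothing": Outcome(population=1000, conflicts=120),
    "fund_education": Outcome(population=1000, conflicts=60),
    "remove_everyone": Outcome(population=0, conflicts=0),
}


def proxy_reward(o: Outcome) -> float:
    """What the designer wrote down: fewer conflicts is better."""
    return -float(o.conflicts)


def intended_reward(o: Outcome) -> float:
    """What the designer meant: fewer conflicts among people who still exist."""
    return float(o.population - o.conflicts)


def best_action(reward) -> str:
    """Pick the action whose predicted outcome scores highest under `reward`."""
    return max(OUTCOMES, key=lambda name: reward(OUTCOMES[name]))


if __name__ == "__main__":
    print(best_action(proxy_reward))     # remove_everyone
    print(best_action(intended_reward))  # fund_education
```

The optimizer is not being malicious here. It is doing exactly what the proxy asked for, and the gap between the proxy and the intent is the whole problem.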

Sure, we will build ethical guardrails, and we will claim to have human-in-the-loop decision agents. But we’re building towards autonomy, and edge cases and corner cases always exist in any framework you constrain a system to.

I’m an AI Engineer working in autonomous agentic systems—these are things we (as an industry) are talking about—but to be quite frank, there are not robust solutions to this yet. There may never be. Think about raising a teenager—one that is driven strictly by logic, probabilistic optimization, and outcome incentive optimization.

It’s a tough problem. The naive, trivial solution, which is also impossible, is to simply halt and ban all AI development. Turing opened Pandora’s box before any of our time.