this post was submitted on 04 Dec 2023
871 points (98.0% liked)
Technology
How can the training data be sensitive if no one ever agreed to give their sensitive data to OpenAI?
Exactly this. And how can an AI which "doesn't have the source material" in its database be able to recall such information?
"Model" is the right term, not "database".
We learned something about how LLMs work with this. It's like a bunch of paintings were chopped up into pixels and used to make other paintings. No one knew it was possible to break the model and have it spit out the pixels of a single painting in order.
I wonder if diffusion models have some other weird quirks we have yet to discover.
The compression technology a diffusion model would need in order to realistically (not too lossily) store “the training data” would be more valuable than the entire machine learning field is right now.
They do not “compress” images.
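To put rough numbers on that, here is a back-of-the-envelope sketch. Every figure in it is an assumption picked for illustration (about a billion 16-bit weights, about two billion training images, 512x512 RGB images), not a measurement of any particular model.

```python
def bytes_per_training_image(num_params: int, bytes_per_param: int, num_images: int) -> float:
    """Average number of bytes of model weights available per training image."""
    return num_params * bytes_per_param / num_images


# Illustrative assumptions only, not the specs of any real model or dataset.
ASSUMED_PARAMS = 1_000_000_000        # roughly one billion weights
ASSUMED_BYTES_PER_PARAM = 2           # 16-bit weights
ASSUMED_IMAGES = 2_000_000_000        # roughly two billion training images
ASSUMED_RAW_IMAGE_BYTES = 512 * 512 * 3  # uncompressed 512x512 RGB image

budget = bytes_per_training_image(ASSUMED_PARAMS, ASSUMED_BYTES_PER_PARAM, ASSUMED_IMAGES)
ratio = ASSUMED_RAW_IMAGE_BYTES / budget

print(f"{budget:.1f} bytes of weights per training image")
print(f"storing every image would need roughly {ratio:,.0f}:1 compression")
```

Under those assumptions the model has about one byte per image to work with. That is an average, though, so it doesn't rule out a small number of heavily repeated items being memorized nearly verbatim.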