Technology

Brian Eno: “The biggest problem about AI is not intrinsic to AI. It’s to do with the fact that it’s owned by the same few people” (musictech.com)

submitted 1 month ago by [email protected] to c/[email protected]

[–] [email protected] 25 points 1 month ago* (last edited 1 month ago) (13 children)

They're not illegally harvesting anything. Copyright law is all about distribution. As much as everyone loves to think that when you copy something without permission you're breaking the law, the truth is that you're not. It's only when you distribute said copy that you're breaking the law (aka violating copyright).

All those old-school notices (e.g. "FBI Warning") are 100% bullshit. Same for the warning the NFL spits out before games. You absolutely can record it! You just can't share it (or show it to more than a handful of people, but that's a different set of laws regarding broadcasting).

I download AI (image generation) models all the time. They range in size from 2GB to 12GB. You cannot fit the petabytes of data used to train a model into that space. No compression algorithm is that good.
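To put rough numbers on that size argument, here is a back-of-the-envelope sketch. The 1 PB training-set size is an assumed round figure picked for illustration (the comment only says "petabytes"), and `required_ratio` is a hypothetical helper name, not part of any library.

```python
# Back-of-the-envelope check of the size argument.
# ASSUMPTION: 1 PB of training data is a round illustrative number,
# not a figure taken from any specific model.
GB = 10**9
PB = 10**15


def required_ratio(training_bytes: int, model_bytes: int) -> float:
    """Compression ratio needed to store all training data inside the model file."""
    return training_bytes / model_bytes


training_bytes = 1 * PB  # assumed training-set size

# The 2 GB and 12 GB bounds are the model sizes quoted in the comment.
for model_gb in (2, 12):
    ratio = required_ratio(training_bytes, model_gb * GB)
    print(f"{model_gb} GB model: about {ratio:,.0f}:1")
```

Under that assumption, the file would need a ratio of roughly 83,000:1 (12GB model) to 500,000:1 (2GB model) to hold the whole training set, and a larger training set only pushes the ratio higher.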

The same is true for LLMs, RVC (audio) models, and similar models/checkpoints. I mean, think about it: if AI were illegally distributing millions of copyrighted works to end users, all of those works would have to be included in those files somehow.

Instead of thinking of an AI model like a collection of copyrighted works, think of it more like a rough sketch of a mashup of copyrighted works. Like if you asked a person to make a Godzilla-themed My Little Pony and what you got was that person's interpretation of what Godzilla combined with MLP would look like. Every artist would draw it differently. Every author would describe it differently. Every voice actor would voice it differently.

Those differences are the equivalent of the random seed provided to AI models. If you throw something at a random number generator enough times you could--in theory--get the works of Shakespeare. Especially if you ask it to write something just like Shakespeare. However, that doesn't mean the AI model literally copied his works. It's just making its best guess (it's literally guessing! That's how it works!).
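The role of the seed can be shown with a toy sketch. This is not a real image or text model: `toy_generate` and `WORDS` are hypothetical stand-ins that only show how a seed fixes the sequence of random choices, using Python's standard `random` module.

```python
import random

# Toy stand-in for a generative model. NOT a real model: it just picks
# words at random, so the only thing it demonstrates is seeding.
WORDS = ["godzilla", "pony", "rainbow", "scales", "mane", "roar"]


def toy_generate(seed: int, length: int = 5) -> list[str]:
    """Return `length` pseudo-random word picks determined entirely by `seed`."""
    rng = random.Random(seed)  # private generator, so global state is untouched
    return [rng.choice(WORDS) for _ in range(length)]


print(toy_generate(1))  # some sequence of words
print(toy_generate(1))  # same seed, so the same sequence again
print(toy_generate(2))  # a different seed will usually give a different sequence
```

Same prompt plus same seed reproduces the same output; changing the seed is what yields a different "interpretation", much like asking a different artist.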

[–] [email protected] 8 points 1 month ago (5 children)

The issue I see is that they are using the copyrighted data, then making money off that data.

[–] [email protected] 6 points 1 month ago (4 children)

...in the same way that someone who's read a lot of books can make money by writing their own.

[–] [email protected] 1 points 1 month ago

I hate to be the one to break it to you but AIs aren't actually people. Companies claiming that they are "this close to AGI" doesn't make it true.

The human brain is an exception to copyright law. Outsourcing your thinking to a machine that doesn't actually think is something different, and it should therefore be treated differently.
