LocalLLaMA
Welcome to LocalLLaMA! Here we discuss running and developing machine learning models at home. Let's explore cutting-edge open-source neural network technology together.
Get support from the community! Ask questions, share prompts, discuss benchmarks, get hyped at the latest and greatest model releases! Enjoy talking about our awesome hobby.
As ambassadors of the self-hosting machine learning community, we strive to support each other and share our enthusiasm in a positive constructive way.
Rules:
Rule 1 - No harassment or personal character attacks of community members. I.e., no name-calling, no generalizing entire groups of people that make up our community, no baseless personal insults.
Rule 2 - No comparing artificial intelligence/machine learning models to cryptocurrency. I.e., no comparing the usefulness of models to that of NFTs, no claiming that the resource usage required to train a model is anything close to that of maintaining a blockchain or mining crypto, no implying it's just a fad/bubble that will leave people with nothing of value when it bursts.
Rule 3 - No comparing artificial intelligence/machine learning to simple text prediction algorithms. I.e., statements such as "LLMs are basically just simple text predictions like what your phone keyboard autocorrect uses, and they're still using the same algorithms as <over 10 years ago>."
Rule 4 - No implying that models are devoid of purpose or potential for enriching people's lives.
Let me rephrase it a bit: OpenAI is one of the prime examples. They wrote one or two scientific papers early on. And then they stopped. Deliberately. They're not contributing anything to science. All they invent is strictly for-profit and happens behind closed doors. They take, they don't contribute back.
And the main asset in the digital age is information. It's necessary for AI training to pile that up in a dataset. So that's their supply and they want it cheap because they need a lot of it. That's where they generate their "rent" from. Do they contribute anything back with that? No. They "seek" it and pile it up and that becomes their trade secret. And that's why I call them "rent-seeking". (Thanks for the Wikipedia article, yours was way better than the convoluted definition I read yesterday...) And it even translates to the illegal activities mentioned in the Wikipedia article. Meta has admitted to pirating books to pile up datasets faster. OpenAI likely did the same(?) It's just that they keep everything a secret. No company tells you anymore whether your content went into a dataset, since you might be able to use the legal system against them.
We can also see that with some platforms like GitHub, which turned out to be a great resource for AI training for Microsoft. Harvesting data is one of the main business models these days. And having that data is what pays the rent. It's not all there is to it. There's a lot of work in compiling it, curating datasets, RLHF... And then of course the science behind AI itself. But the last one aside, that's also often done with negative effects on society. We all know about the precarious situation of the data labellers in Africa.
And then all of this, plus the experts they get from the public universities and all the GPUs in the datacenters and some electricity get turned into their (OpenAI's) intellectual property.
Maybe tell me what they contribute back? Is there anything they give? I don't think so. They mainly seem like parasites to me, freeloading on all the information they can gather in electronic form. And then? Is there anything we get in return?
And maybe we're having a small misunderstanding here. I'm not anti-AI or anything. I just want people who take something from society to contribute something back to society. And they really like to take, but they themselves painstakingly avoid disclosing the smallest little details.
I'd say there are two options. Either they do contribute back and we find a healthy relationship between society and big-tech AI companies. That'd make it completely fine if they also take things, since it's give-and-take. Or they want to run a dubious for-profit service where no one has a say in it, can look inside, or can use it for anything other than what they devised for society... But then the same rules apply to them. They then also have to contribute back in the form of money to pay for their supplies and license the content that goes into their product.
My own opinion: Allow AI and cater to scientific progress. In a healthy way, though. The companies do AI and they get resources. But they're obligated to be transparent and to contribute back. For example, open-weight models are a good idea. I'd go further than that, because science and society also need to address biases, what AI can be used for, and a bunch of issues that come with it. Like misinformation, spam... The companies aren't incentivised to address that. And it starts to show impact on the internet and society. And regulations are the way to make them do what's necessary or beneficial in the long run.
I'm generally against hyper-capitalism and big corporations. They often don't do us any good. It's a bit complicated with AI since those companies are over-valued and there is a big investment bubble, which isn't necessarily about society. But the copyright industry is part of the same picture. Spotify for example isn't healthy for society at all. And the Höffner video you linked had a lot of good points about that. I'm not sure whether you're aware of the other side of the coin... For example, I've talked to some musicians (copyright holders) and I've written a few pages of technical documentation, and I'm aware that it takes several weeks behind the desk to produce 40 pages. And like half a year or more to write a novel. And somehow you need to eat something during those months... So with capitalism it's not always easy. The current situation is subpar. And the copyright industry is mainly a business model to leech on people who create something. We'd be better off if we cut out the middlemen.
I see. Thank you. I'm afraid you don't quite understand what rent-seeking means. Let me try a hypothetical example.
Food is pretty cheap. But suppose a single company had a monopoly on supplying food. How much would people be willing to pay? People would give almost anything they have.
The reason food is cheap is that there is no monopoly. If someone charges more than the competition, you go to the competition. You get a market price. It's complicated, but one thing that goes into the price of food is the cost of labor. Many people must work to supply food.
These workers could do other things with their time. But also, other people could do their work of supplying food. No one has a monopoly. Eventually, the cost of labor depends on how much money you must offer to people to be willing to put up with the work.
If someone had a monopoly on food supply, they could charge fantastic prices. Their cost would not change. The difference between the market price and the monopoly price is the monopoly rent.
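To put that definition into numbers, here is a minimal sketch. The prices are made up purely for illustration, and `monopoly_rent` is just a name I picked; it only encodes the difference described above.

```python
def monopoly_rent(market_price: float, monopoly_price: float) -> float:
    """Monopoly rent as defined above: monopoly price minus market price."""
    return monopoly_price - market_price


# Hypothetical numbers: a meal costs 5 under competition,
# but a sole supplier could charge 200 for the same meal.
# The supplier's cost is the same in both cases.
rent = monopoly_rent(market_price=5.0, monopoly_price=200.0)
print(rent)
```

The point of the sketch is that the rent is whatever the seller gets on top of the competitive price without doing anything extra for it.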
Let's take this closer to AI training.
Let's say there's some guy who's searching through libraries and archives for stuff to digitize so that it can be sold to AI companies for training. He finds an archive of old newspapers. How much would the market price for scans of these newspapers be? Let's ignore copyright for now.
Maybe the potential buyer could send someone else to scan the papers. So our guy could only ask to be paid for the labor in scanning the papers.
So our guy will not say where he found that archive. That is his trade secret. The potential buyer would have to send someone to search for that archive and scan it. That means our guy can ask to be paid for his labor in finding the archive AND scanning it. The potential buyer will only hire someone else to do that if our guy asks too high a price.
There is a way our guy can get more. If he destroys all remaining copies of these newspapers, then he has a monopoly. Now he can ask for as much as the potential buyer is willing to pay. That's a monopoly rent.
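The three cases above can be sketched as a small function. Everything here is a simplifying assumption of mine: the function name, the parameters, and the numbers are hypothetical, and I assume the buyer simply compares our guy's price with the cost of hiring someone else.

```python
def max_asking_price(scan_cost: float, search_cost: float,
                     buyer_willingness: float,
                     location_secret: bool, has_monopoly: bool) -> float:
    """Highest price our guy can ask before the buyer walks away."""
    if has_monopoly:
        # No other copies exist: only the buyer's willingness to pay limits the price.
        return buyer_willingness
    # Otherwise the buyer can hire someone else to redo the work.
    outside_option = scan_cost + (search_cost if location_secret else 0.0)
    return min(outside_option, buyer_willingness)


# Hypothetical numbers: scanning costs 1000, finding the archive costs 3000,
# and the buyer would pay up to 50000 for the scans.
known_archive = max_asking_price(1000, 3000, 50000, location_secret=False, has_monopoly=False)
trade_secret = max_asking_price(1000, 3000, 50000, location_secret=True, has_monopoly=False)
monopoly = max_asking_price(1000, 3000, 50000, location_secret=True, has_monopoly=True)
print(known_archive, trade_secret, monopoly)
```

In the first two cases he is paid for labor someone would have to do anyway. Only in the third case does he collect a rent, which is the gap between the monopoly price and the trade-secret price.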
Now copyright... Those newspapers are probably under copyright. If our guy is in Europe, he will have to get permission from the rights-holder to scan the papers. Copyright is a monopoly enforced by the state. The rights-holder can now extract the monopoly rent from our guy.
If the publisher has gone out of business, the rights-holders may be hard to find, but he has to make the effort. In practice, this means that there is really no point in making the effort to preserve European culture and history. The copyright people don't just harm technological progress and the European economy, they harm European culture. That's parasitic.
You're making the argument that OpenAI and others are trying to get paid. That's not rent-seeking. Ideally, our laws ensure that seeking money makes you work for the benefit of other people.
Farmers work for money, and everyone else gets a lot of good, cheap food out of it. If you demand that farmers should work for free, then you're demanding that many of us should starve.