this post was submitted on 04 Jul 2025

89 points (100.0% liked)

Large Language Model Performance Doubles Every 7 Months (spectrum.ieee.org)

submitted 1 day ago by [email protected] to c/[email protected]

By 2030, AI will greatly outperform humans in some complex intellectual tasks. Discover how LLMs are doubling their capabilities every seven months.
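
To make the headline arithmetic concrete, here is a minimal sketch of what "doubling every seven months" compounds to if the trend is taken at face value. The function name and the five-year horizon are assumptions chosen for illustration, and nothing here says the trend will actually hold.

```python
def growth_factor(months: float, doubling_period_months: float = 7.0) -> float:
    """Multiplicative growth after `months` of steady doubling."""
    return 2.0 ** (months / doubling_period_months)


# Five years (60 months) of uninterrupted doubling every 7 months.
five_year_factor = growth_factor(60)
print(f"about {five_year_factor:.0f}x over five years")
```

The result is a factor in the hundreds, which is why the extrapolation depends so heavily on the trend never bending.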

[–] [email protected] 1 points 5 hours ago

they are improving at an exponential rate. It's just that the exponent is less than one.

[–] Fedegenerate 12 points 12 hours ago (1 children)

Someone doesn't know the folly of extending straight-line graphs into the future.

[–] [email protected] 5 points 6 hours ago* (last edited 6 hours ago) (1 children)

https://xkcd.com/605

Reminded me of this

And this

https://xkcd.com/1007

[–] [email protected] 1 points 6 hours ago

Oof, the alt text on that second one was unexpectedly dark lmao

[–] [email protected] 35 points 19 hours ago* (last edited 19 hours ago) (2 children)

Classic pseudo-science for the modern grifter. Vague definitions, sloppy measurements, extremely biased, wild unsupported predictions, etc.

[–] [email protected] 7 points 10 hours ago (1 children)

That graph is hilarious. Enormous error bars, totally arbitrary quantization of complexity, and its title? "Task time for a human that an AI model completes with a 50 percent success rate". 50 percent success is useless, lmao.

On a more sober note, I'm very disappointed that IEEE is publishing this kind of trash.

[–] [email protected] 1 points 5 hours ago

in yes/no type questions, 50% success rate is the absolute worst one can do. Any worse and you're just giving an inverted correct answer more than half the time
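
A small sketch of the inversion argument, assuming a strictly binary (yes/no) task where every answer can simply be flipped. The function name is hypothetical, and this does not apply to the open-ended tasks in the article's graph.

```python
def best_achievable_accuracy(accuracy: float) -> float:
    """On a yes/no task, flipping every answer turns accuracy a into 1 - a."""
    return max(accuracy, 1.0 - accuracy)


# A predictor that is right only 25% of the time is a 75% predictor once inverted.
inverted = best_achievable_accuracy(0.25)
# At exactly 50% flipping gains nothing, so that is the floor.
coin_flip = best_achievable_accuracy(0.5)
print(inverted, coin_flip)
```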

[–] [email protected] 5 points 18 hours ago

and assuming that improvement doesn't plateau, ever,

[–] [email protected] 2 points 11 hours ago

new moore law dropped

[–] [email protected] 12 points 18 hours ago* (last edited 18 hours ago) (1 children)

when will they be able to tell me how many 'r's are in 'strawberry' in under 1s?

[–] [email protected] 6 points 12 hours ago (1 children)

Deepseek-r1:1.5b

Thinking

Thinking...
Alright, so I'm trying to figure out how many 'r's are in the word "strawberry." Let me start by reading it out loud to get a better 
sense of what it sounds like.

S-T-R-A-W-B-E-R-R-Y

Okay, let's break this down letter by letter. The first letter is S, then T, R, A, W, B, E, and two R's at the end.

So far, I've got one 'R' in the third position after S-T- and before A-W-B-E-R-R-Y.

Wait a minute, there are two R's in "strawberry." So that means there are two instances of the letter 'r' in the word. Is that all?

Let me double-check to make sure I'm not missing any other 'r's. Scanning through each letter again:

1. S - No
2. T - No
3. R - Yes, one R here.
4. A - No
5. W - No
6. B - No
7. E - No
8. R - Second R
9. R - Third R

Wait, am I counting correctly? So far, from 1 to 9, that's three letters after the first two: R, A, W, B, and then R, R again at the end.

Hold on, maybe I should visualize it like this:

S T R A W B E R R Y

So positions:
1. S
2. T
3. R (position 3)
4. A
5. W
6. B
7. E
8. R (position 8)
9. R (position 9)

And then the word ends with 'Y.'

Hmm, so only two 'R's in total? Let me confirm:

Looking back: positions 3 and 8 and 9.

Wait, that makes three 'R's. Did I miss one?

Hold on, maybe I need to write it out clearly to visualize better:

S T R A W B E R R Y

Counting each 'R':

- After S (position 2): No
- After T (position 3): Yes, first R
- After A (position 4): No
- After W (position 5): No
- After B (position 6): No
- After E (position 7): No
- After R (position 8): Second R
- After R (position 9): Third R
- After Y (position 10): No

So that's three 'R's in total. Did I count right? Maybe I should write it down step by step to ensure accuracy.

Alternatively, maybe there are only two 'R's because the 'Y' is considered separate from the R. But logically, even if Y has a vowel 
sound, 'r' still counts separately unless specified otherwise.

Wait, actually, in English letters, 'Y' isn't really pronounced like an 'R.' It sounds more like a consonant. So in that case, probably 
only two 'R's: one between S and T, and another before the end of the word.

But I'm still confused because when I count, I sometimes make mistakes. Maybe using an alphabet list or a counting tool would help.

Let me write down each letter with its position:

1. S
2. T
3. R (count 1)
4. A
5. W
6. B
7. E
8. R (count 2)
9. R (count 3)
10. Y

So that's three 'R's in total.

Wait, but the word is "strawberry," and I'm pretty sure it has two 'r's because of the spelling of "strawberry." Maybe the counting 
matches that.

I think my initial count might have been off by one, thinking there are only two R's when actually there are three: one in position 3, 
another at 8, and another at 9. But given that Y isn't pronounced like an 'R,' it's more accurate to say there are two 'r's.

Hmm, maybe I need to consider the pronunciation of each letter as well. Even though 'Y' is a consonant, sometimes people might play 
around with silent letters or other pronunciation rules. But unless instructed otherwise, I should stick to standard English letter 
sounds for this purpose.

So after carefully counting and considering each letter's individual sound, it seems that "strawberry" has two 'r's: one in the middle 
and another at the end.
...done thinking.

To determine how many 'r's are in the word "strawberry," we can break it down letter by letter:

1. **S** - No
2. **T** - No
3. **R** (count 1)
4. **A** - No
5. **W** - No
6. **B** - No
7. **E** - No
8. **R** (count 2)
9. **R** (count 3)
10. **Y** - No

However, considering the pronunciation of each letter and standard English rules, even though 'Y' is a consonant, it's more accurate to 
say that "strawberry" contains only two 'r's: one in position 3 and another at position 9.

Final Answer: There are two 'r's in the word "strawberry."

[–] [email protected] 4 points 6 hours ago

I like how it counted correctly and then gave an incorrect final answer.. Bravo 👏
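
For reference, the count the model talked itself out of can be checked directly with Python's built-in string methods (positions are 1-based to match the model's own numbering):

```python
word = "strawberry"

# Total number of 'r' characters in the word.
r_count = word.count("r")

# 1-based positions of each 'r', matching the numbering in the transcript above.
r_positions = [index + 1 for index, letter in enumerate(word) if letter == "r"]

print(r_count, r_positions)  # 3 [3, 8, 9]
```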

[–] [email protected] 20 points 21 hours ago (1 children)

I very much like those huge generalizations in AI articles that make you feel small and stupid. Those generalizations prove nothing, but they sound like something big is coming. It's parody. How long will we see them before people wake up? Just wait 2 more years and AI will be better bro. You're not using AI properly, you need to learn how to use AI bro. You need to use a different model for this task bro. Just pay for corporate products bro. The amount of junk on top of this pile of shit is amusing.

[–] [email protected] 5 points 18 hours ago (1 children)

Because so much money has been thrown at it, for startups, for power generation, for investors, that this is little more than marketing for retail investors to buy into.

[–] [email protected] 1 points 13 hours ago

There's no doubt that they pour money into machines instead of people. We all see in the statistics that machines get more support these days than people.

[–] [email protected] 118 points 1 day ago (4 children)

This is like measuring the increasing speeds of cars in the early years and extrapolating that they would be supersonic by now by ignoring the exponential impact that air resistance has.

[–] [email protected] 1 points 12 hours ago (1 children)

Or like looking at the early days of semiconductors and extrapolating that CPU speed will double every 18 months ..smh these people

[–] [email protected] 1 points 12 hours ago* (last edited 12 hours ago) (1 children)

Since CPU speeds are still doubling every 18 months you have a solid point!

Or maybe not since you are probably referring to the doubling of transistors that was an observation which was accurate over a lengthy period of time in the context of when the observation was made. Nobody said that would continue indefinitely either.

[–] [email protected] 2 points 11 hours ago

Yup, that's what I was alluding to. While it may no longer be the case for transistors, the trend did take 50-odd years to get there. Push that trend line from the figure out 50 years, heh (not saying you should, 5 seems much more conservative).

Take a look at Nvidia's pace wrt Moore's law (of FLOPS) https://netrouting.com/nvidia-surpassing-moores-law-gpu-innovation/

[–] [email protected] 16 points 1 day ago

My son has doubled in size every month for the last few months. At this rate he'll be fifty foot tall by the time he's seven years old.

Yeah, it's a stupid claim to make on the face of it. It also ignores practical realities. The first of those is training data, and the second is context windows. The idea that AI will successfully write a novel or code a large-scale piece of software like a video game would require it to be able to hold that entire thing in its context window at once. Context windows are strongly tied to hardware usage, so scaling them to the point where they're big enough for an entire novel may not ever be feasible (at least from a cost/benefit perspective).

I think there's also the issue of how you define "success" for the purpose of a study like this. The article claims that AI may one day write a novel, but how do you define "successfully" writing a novel? Is the goal here that one day we'll have a machine that can produce algorithmically mediocre works of art? What's the value in that?

[–] [email protected] 24 points 1 day ago* (last edited 1 day ago)

Very good analogy. They're also ignoring that getting faster and faster at reaching a 50% success rate (a totally unacceptable success rate for meaningful tasks) doesn't imply ever achieving consistently acceptable success.

[–] [email protected] 10 points 1 day ago

Air resistance has cubic not exponential impact
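
A rough sketch of that scaling, assuming the standard quadratic drag equation F = ½·ρ·C_d·A·v². Under that assumption the drag force grows with the square of speed, and the power needed to overcome it (force times speed) grows with the cube, which is presumably the "cubic" meant here. The drag coefficient and frontal area below are illustrative values, not measurements of any particular car.

```python
def drag_power_watts(
    speed_m_s: float,
    drag_coefficient: float = 0.3,   # assumed, typical passenger-car ballpark
    frontal_area_m2: float = 2.2,    # assumed
    air_density_kg_m3: float = 1.225,
) -> float:
    """Power needed to overcome aerodynamic drag at a steady speed."""
    drag_force = 0.5 * air_density_kg_m3 * drag_coefficient * frontal_area_m2 * speed_m_s ** 2
    return drag_force * speed_m_s


# Doubling the speed multiplies the required power by 2**3 = 8,
# regardless of the coefficients chosen above.
power_ratio = drag_power_watts(60.0) / drag_power_watts(30.0)
print(round(power_ratio, 6))
```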

[–] [email protected] 60 points 1 day ago (3 children)

This is such bullshit. Models have already consumed all available data and have nothing left to consume, while needing exponentially more data for progressive advancements.

[–] [email protected] 13 points 22 hours ago

This. It's the old "to the moon" mentality.

If my 2yo continues to grow at the current rate, we'll have to buy new doors soon because at age 10 the kid will be the tallest person on Earth.

[–] [email protected] 3 points 18 hours ago

time for them to set sail to the wild seas again!

[–] [email protected] 6 points 1 day ago

Apparently, throwing more data at it will not help much from now on... But anyway what they're saying, I can't trust the snake oil seller, he is suspicious...

[–] [email protected] 46 points 1 day ago (2 children)

How is completely fucking up literally 50% of the time outperforming exactly???

[–] [email protected] 3 points 19 hours ago

It's outperforming "messier" problems with a much lower success rate.

[–] [email protected] 28 points 1 day ago (1 children)

You see, in 7 months, they'll fuck up literally 100% of the time! Progress.

[–] [email protected] 4 points 22 hours ago

It would be even better but unfortunately you can't exceed 100% wrong.

Wait, maybe you can, let me check the AI!

[–] [email protected] 9 points 1 day ago (1 children)

So only 10 years until it isn't a resource hog anymore...

[–] [email protected] 6 points 23 hours ago

Only if people give up on the whole concept by then. Each new generation of AI model takes more energy than the last.

[–] [email protected] 85 points 1 day ago (2 children)

*with 50 percent reliability.

Heck of an asterisk on this claim.

[–] [email protected] 16 points 1 day ago (2 children)

That sounds like a coin flip, but 50% reliability can be really useful.

If a model has 50% chance of completing a task that would cost me an hour - and I can easily check it was completed correctly - on average, I'm saving half of the time it would take to complete this.

That said, exponentials don't exist in the real world, we're just seeing the middle of a sigmoid curve, which will soon yield diminishing returns.

[–] [email protected] 5 points 22 hours ago

and I can easily check it was completed correctly

Can you always though?
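
The time-saving estimate and the question about verification can be put into one back-of-the-envelope formula. This is a sketch under simplifying assumptions: you always check the model's output, a failed attempt is redone by hand from scratch, and the model's own run time is ignored. The function name and the 30-minute check are hypothetical.

```python
def expected_minutes(task_minutes: float, success_rate: float, check_minutes: float) -> float:
    """Expected time when you try the model first, always verify,
    and redo the task by hand whenever the model failed."""
    return check_minutes + (1.0 - success_rate) * task_minutes


# One-hour task, 50% success rate, checking is free: half the time is saved.
free_check = expected_minutes(60, 0.5, 0)
# Same task, but verifying takes 30 minutes: the saving disappears entirely.
costly_check = expected_minutes(60, 0.5, 30)
print(free_check, costly_check)  # 30.0 60.0
```

Under these assumptions the break-even point is a check that costs `success_rate * task_minutes`, so the value of 50% reliability hinges on how cheap verification really is.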

[–] [email protected] 1 points 23 hours ago (1 children)

That said, exponentials don’t exist in the real world, we’re just seeing the middle of a sigmoid curve, which will soon yield diminishing returns.

Yes, but the tricky thing is we have no idea when the seemingly exponential growth will flip over into the plateauing phase. We could be there already or it could be another 30 years.

For comparison, Moore's law is almost certainly a sigmoid too, but we've been seeing exponential growth for 50 years now.
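
To illustrate why the turning point is hard to spot from inside the curve, here is a minimal sketch comparing a logistic (sigmoid) curve with the pure exponential it resembles early on. The ceiling, rate and midpoint are arbitrary assumptions picked for illustration, not fitted to any AI or transistor data.

```python
import math


def logistic(t: float, ceiling: float = 1000.0, rate: float = 1.0, midpoint: float = 10.0) -> float:
    """Sigmoid growth that saturates at `ceiling`."""
    return ceiling / (1.0 + math.exp(-rate * (t - midpoint)))


def early_exponential(t: float, ceiling: float = 1000.0, rate: float = 1.0, midpoint: float = 10.0) -> float:
    """The exponential that the logistic curve approximates well before its midpoint."""
    return ceiling * math.exp(rate * (t - midpoint))


# Early on the two are nearly identical; past the midpoint they diverge wildly.
for t in (0, 5, 10, 15):
    print(t, round(logistic(t), 3), round(early_exponential(t), 3))
```

Well before the midpoint, data points from either curve look the same, so early measurements alone cannot tell you which one you are on.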

[–] [email protected] 3 points 22 hours ago (1 children)

Moore's law hasn't been exponential for ~15 years now.

[–] [email protected] 1 points 16 hours ago

If you are just talking transistor density I believe it still is, but even if not, my point was that it had exponential growth spanning many decades.

[–] [email protected] 8 points 1 day ago

Then why do I feel like its programming abilities are getting worse? I've stopped paying for it now because it causes more frustration than anything else. Works for simple "how can I simplify this code" queries when my head hurts, but that's about it.

[–] ReallyActuallyFrankenstein 13 points 1 day ago* (last edited 1 day ago) (1 children)

Is it just me, or is this graph (first graph in the article) completely unintelligible?

The X-axis being time is self-explanatory, but the Y-axis is somehow exponential time but then also mapping random milestones of performance, meaning those milestones are hard-linked to that time-based Y-axis? What?

[–] [email protected] 7 points 22 hours ago

That's what you get when the "research" for the article is AI generated.

[–] [email protected] 2 points 20 hours ago

They need to invent an inquiring-gpt or Q-GPT. Otherwise they'll need humans to do the digging.

[–] [email protected] 3 points 23 hours ago

I saw something once that explained how you can have an ai trained on a set of soccer games and have it generate soccer games as a use for it.

The idea is that the model has compressed all the soccer games into a smaller data size form than the total of having let's say 100+ games on video or whatever.

That's the real utility I see in generative ai that I know can keep going basically as long as we want to.

[–] [email protected] 28 points 1 day ago (1 children)

I doubt it

[–] [email protected] 17 points 1 day ago (2 children)

Then why share it?

[–] [email protected] 22 points 1 day ago (1 children)

Do you not see any value in engaging with views you don't personally agree with? I don't think agreeing with it is a good barometer for whether it's post-worthy

[–] [email protected] 8 points 1 day ago (1 children)

Good point, thank you. I figured that sharing poor scientific articles essentially equals spreading misinformation (which I think is a fair point too), but I like your perspective as well.

[–] [email protected] 5 points 1 day ago

I guess the value is that at some point you'll probably hear the core claim - "AI is improving exponentially" - regurgitated by someone making a bad argument, and knowing the original source and context can be very helpful to countering that disinformation.

[–] [email protected] 15 points 1 day ago

2 X 0 = 0

[–] [email protected] 5 points 1 day ago

Yawn...
