this post was submitted on 11 Jul 2025
Fuck AI
Sounds about right.
I'd like to see numbers for inexperienced devs and devs working on somebody else's code, though.
EDIT: Oh, this is interesting. The full paper breaks down where the time goes. It turns out coders do in fact spend less time actively working on the code when using AI, but the time spent prompting, waiting on the output, and processing the output eats up the difference. They also sit idle for longer with AI. So their forecasts aren't that crazy: they do work less and faster with AI, but the new extra tasks make them less productive overall.
That makes a lot of sense in retrospect, but it's not what I was expecting.
Thanks for the quick summary! I'm at work right now and would probably have forgotten to read this later, so this helps!
Yeah, I had to dig a bit further for this figure. They display the same data more prominently as the percentage of time devoted to each bug, which gives them smaller error bars, but it also doesn't really answer the question that matters: where the time went.
Worth noting that this is a subset of the data, apparently. They recorded about a third of the bug fixes on video and cut out runs with cheating and other suspicious activity. Assuming each recording contains one bug, they end up with about a fourth of the data broken down this way.
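To see how "a third recorded" can shrink to "a fourth of the data", here's a toy calculation. All the numbers below are made up for illustration; the only assumptions taken from the thread are that roughly a third of the runs were recorded and that filtering then leaves about a quarter of the total.

```python
# Hypothetical arithmetic: illustrative counts only, not figures
# from the actual paper.
total_bugs = 300                 # pretend total number of bug fixes
recorded = total_bugs // 3       # about a third recorded on video -> 100
kept = recorded * 3 // 4         # suppose filtering drops a quarter -> 75
fraction_of_all = kept / total_bugs
print(fraction_of_all)           # about a fourth of all the data
```

So if filtering removes roughly a quarter of the recordings, the task breakdown ends up resting on about a fourth of the full dataset, matching the rough proportions described above.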
Which is great, but it does make you wonder why that data is good enough for the overall over/underestimate plot if it's not good enough for the task breakdown. Some of the runs they filtered out involved outright not following the instructions, or self-reported times more than 20% off from what the recording shows. So we know some runs were so far off they didn't get counted here, and presumably the rest of the data, with no video at all, is even worse: the timings are self-reported, and participants were paid by the hour.
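The 20% screening rule described above could look something like this. This is only a sketch of the kind of check implied by the comment, not the paper's actual code; the function name, field layout, and toy data are all invented.

```python
# Hypothetical reconstruction of the screening rule: drop runs whose
# self-reported duration deviates more than 20% from the duration
# measured in the screen recording.

def is_reliable(self_reported_min: float, recorded_min: float,
                tolerance: float = 0.20) -> bool:
    """Keep a run only if the self-report is within `tolerance`
    (as a fraction of the recorded time) of the recording."""
    if recorded_min <= 0:
        return False
    deviation = abs(self_reported_min - recorded_min) / recorded_min
    return deviation <= tolerance

# Toy data: (self-reported, recorded) minutes per run.
runs = [(30, 28), (45, 60), (20, 21), (90, 70)]
kept_runs = [r for r in runs if is_reliable(*r)]
print(kept_runs)  # only the runs within 20% of the recording survive
```

Note the asymmetry this highlights: runs without video can't be checked this way at all, which is exactly the worry raised above about the self-reported portion of the data.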
I'd definitely like to see this with more data (it's only 16 people, even if they're counting several hundred bugs) and better methodology, and I'd like to see other coder profiles in there too. For now, they're showing a very large discrepancy between estimates and results, and at least this chart gives you some qualitative understanding of how that happened. I learned something from reading it. Plus, hey, it's not like AI research is a haven of clean, strict data.
Of course most people will just parrot the headline, because that's the world we live in.