this post was submitted on 05 Mar 2024
16 points (100.0% liked)

The Lemmy.zip Gaming Community

[email protected] 16 points 1 year ago* (last edited 1 year ago)

I may be partially responsible for this lazy-ass implementation.

Three months ago I was playing around with Stable Diffusion a lot, and because I sleep in the same room as my PC, I used to lower the GPU's TDP to 150 W at night to keep it quiet. One day, while SD was running, I lowered the TDP in LACT and pressed Apply. Instead of getting quieter, the fans ramped up, and I was shocked to see the card pulling 420 W instead of its rated 293 W (6900 XT).

I tracked the issue down to the driver applying the power limit incorrectly. Basically, if you set a TDP that was too low for the current power state, the driver would disable the power limit entirely until the card entered a lower power state, at which point your new TDP would be applied correctly.
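To make that concrete, here is a toy model of the behaviour I observed. It is not driver code: the class, the state names, and the per-state minimums (200 W and 100 W) are all made up for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class ToyGpu:
    """Toy model of the buggy behaviour described above (hypothetical, not amdgpu code)."""

    state_min_w: Dict[str, int]       # made-up lowest valid limit per power state
    state: str = "high"
    limit_w: Optional[int] = 293      # None means no power limit is enforced
    pending_w: Optional[int] = None   # request waiting for a lower power state

    def set_limit(self, watts: int) -> None:
        if watts < self.state_min_w[self.state]:
            # Buggy path: the request is too low for the current state,
            # so the limit is dropped entirely until the state changes.
            self.limit_w = None
            self.pending_w = watts
        else:
            self.limit_w = watts
            self.pending_w = None

    def enter_state(self, state: str) -> None:
        self.state = state
        # Once the card is in a state where the request is valid, it finally applies.
        if self.pending_w is not None and self.pending_w >= self.state_min_w[state]:
            self.limit_w = self.pending_w
            self.pending_w = None


gpu = ToyGpu(state_min_w={"high": 200, "low": 100})
gpu.set_limit(150)
print(gpu.limit_w)   # no limit while the card stays in the high state
gpu.enter_state("low")
print(gpu.limit_w)   # the requested limit applies after the state change
```

The window between the two prints is where the card was free to pull far more than its rated power.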

Running a modern GPU without power limits is bad and potentially dangerous for everything involved: the GPU, the VRMs, and even the power supply cables, which may melt as we've seen with NVIDIA cards. So I immediately reported the issue to the AMDGPU developers (my issue is linked in the article).

They quickly came up with a fix, which I tested: it doesn't allow you to set a TDP lower than the lowest valid TDP for the highest power state. This gets the job done, but it's more of a kludge than a fix. Ideally, the driver should realize that the new TDP is too low for the current power state and switch to a lower power state. I don't know why AMD implemented such a shitty solution in their official kernel driver.
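If you script your power limit, you can clamp the request yourself before writing it. This is a minimal sketch, assuming the amdgpu hwmon directory exposes `power1_cap_min` and `power1_cap_max` in microwatts and that those files reflect the range the driver will accept. The function names are my own, not part of any tool.

```python
from pathlib import Path


def read_microwatts(hwmon_dir: Path, name: str) -> int:
    """Read one hwmon power file; values are assumed to be in microwatts."""
    return int((hwmon_dir / name).read_text().strip())


def clamp_power_cap(hwmon_dir: Path, requested_w: float) -> int:
    """Return the requested cap in microwatts, clamped to the reported range."""
    lo = read_microwatts(hwmon_dir, "power1_cap_min")
    hi = read_microwatts(hwmon_dir, "power1_cap_max")
    requested_uw = int(requested_w * 1_000_000)
    return max(lo, min(requested_uw, hi))
```

On a typical system the directory is something like `/sys/class/drm/card0/device/hwmon/hwmon*/`, but the card index and hwmon number vary, so check yours. Writing the result to `power1_cap` needs root.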

[email protected] 4 points 1 year ago

Alright everyone, time to break out the pitchforks and storm this guy's house lol

But seriously, this is 100% on AMD. Don't beat yourself up over their laziness.