Arch Linux


FS#77526 - [linux] AMDGPU - Scaling bug

Attached to Project: Arch Linux
Opened by Virgile (crashone) - Tuesday, 14 February 2023, 12:15 GMT
Last edited by Toolybird (Toolybird) - Sunday, 26 March 2023, 21:04 GMT
Task Type: Bug Report
Category: Packages: Core
Status: Closed
Assigned To: No-one
Architecture: x86_64
Severity: Medium
Priority: Normal
Reported Version:
Due in Version: Undecided
Due Date: Undecided
Percent Complete: 100%
Votes: 0
Private: No

Details

Description:
Randomly, after a few minutes (I haven't found a trigger yet), my Radeon 7900 XTX GPU stops scaling up as the load increases. As a result, the window manager (KDE Plasma) becomes laggy and game performance is severely limited.

I have to reload the performance profile manually (for example, by switching from BOOTUP_DEFAULT to 3D_FULL_SCREEN in /sys/class/drm/card0/device/pp_power_profile_mode) to restore normal performance.

This bug affects all performance profiles, and it can also appear when 3D_FULL_SCREEN is enabled. In this case, I have to switch to BOOTUP_DEFAULT in order to fix it temporarily.
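
The manual profile switch described above could be scripted roughly as follows. This is only a sketch: the sysfs path is the one from the report, but the line layout assumed by the parser (an index, a profile name, an optional `*` marking the active profile, then a colon) is an assumption, since the file's exact layout can vary between GPU generations and kernel versions. The helper names `parse_profiles` and `switch_profile` are hypothetical.

```python
import re
from pathlib import Path

# Path taken from the report; the card index (card0) may differ on other systems.
PROFILE_PATH = Path("/sys/class/drm/card0/device/pp_power_profile_mode")

# Assumed line shape: "<index> <NAME>[*]:" where "*" marks the active profile.
_LINE = re.compile(r"^\s*(\d+)\s+([A-Z0-9_]+)\s*(\*)?\s*:")


def parse_profiles(text):
    """Return ({name: index}, active_name_or_None) parsed from the file content."""
    profiles, active = {}, None
    for line in text.splitlines():
        m = _LINE.match(line)
        if m:
            profiles[m.group(2)] = int(m.group(1))
            if m.group(3):
                active = m.group(2)
    return profiles, active


def switch_profile(name, path=PROFILE_PATH):
    """Write the index of the named profile to the file (root is needed on sysfs)."""
    profiles, _ = parse_profiles(Path(path).read_text())
    Path(path).write_text(f"{profiles[name]}\n")
```

Calling `switch_profile("3D_FULL_SCREEN")` as root would mirror the workaround from the description. Depending on the kernel, other prerequisites may apply before the write is accepted; the amdgpu documentation is the place to check.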

Additional info:
* package version(s) : linux 6.1.11.arch1-1, also tried Linux 6.1.11-zen1-1.1-zen

Closed by Toolybird (Toolybird)
Sunday, 26 March 2023, 21:04 GMT
Reason for closing: Upstream
Comment by Virgile (crashone) - Tuesday, 14 February 2023, 20:31 GMT
After further investigation, it seems that the memory clock gets stuck at 96 MHz. The GPU frequency scales correctly.
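
One way to check for a stuck memory clock like this is to read the amdgpu memory clock levels from sysfs. The comment does not say how the 96 MHz figure was obtained, so the following is only a sketch under assumptions: it assumes the file /sys/class/drm/card0/device/pp_dpm_mclk and a line layout such as `0: 96Mhz *`, with `*` marking the active level. The helper name `active_clock_mhz` is hypothetical.

```python
import re
from pathlib import Path

# Assumed location of the memory clock levels; card0 is taken from the report.
MCLK_PATH = Path("/sys/class/drm/card0/device/pp_dpm_mclk")

# Assumed line shape: "<level>: <freq>Mhz [*]" where "*" marks the active level.
_LEVEL = re.compile(r"^\s*(\d+):\s*(\d+)\s*Mhz\s*(\*)?", re.IGNORECASE)


def active_clock_mhz(text):
    """Return the frequency in MHz of the level marked active, or None."""
    for line in text.splitlines():
        m = _LEVEL.match(line)
        if m and m.group(3):
            return int(m.group(2))
    return None


if __name__ == "__main__":
    # Only meaningful on a machine with an amdgpu card at card0.
    print(active_clock_mhz(MCLK_PATH.read_text()))
```

Polling this value while a game is running would show whether the memory clock ever leaves its lowest level.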
Comment by Toolybird (Toolybird) - Thursday, 16 February 2023, 04:53 GMT
Have you tried different kernel versions? The general kernel troubleshooting steps are here [1]. If it's a regression then look here [2]. You could also try reporting an issue upstream to the AMD GPU kernel devs [3]. Please let us know what you find out.

[1] https://wiki.archlinux.org/title/Kernel#Troubleshooting
[2] https://wiki.archlinux.org/title/Kernel#Debugging_regressions
[3] https://gitlab.freedesktop.org/drm/amd
Comment by Virgile (crashone) - Thursday, 16 February 2023, 12:16 GMT
The problem occurs across different kernel 6.1 patch versions, on vanilla, zen and amd (from AUR). I can't say whether it's a regression, as it has been present ever since I first plugged in this GPU in January. I also know it is not hardware related, as the problem does not exist on Windows.
Comment by Virgile (crashone) - Saturday, 25 February 2023, 12:04 GMT
Someone else on Reddit experienced the same problem: https://www.reddit.com/r/linux_gaming/comments/10da6b8/comment/j536ng1/?utm_source=share&utm_medium=web2x&context=3

I think GPU power management is not yet handled correctly. Adding "amdgpu.runpm=1" to the boot parameters seems to have solved the problem. Moreover, with this parameter, the card also uses its full power allocation (303 W), whereas before it stayed noticeably lower (around 270 W).
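
To confirm that the boot parameter was actually picked up after a reboot, the running kernel's command line can be inspected. A minimal sketch, assuming the parameter was passed on the kernel command line and is therefore visible in /proc/cmdline; the helper name `kernel_param` is hypothetical.

```python
from pathlib import Path


def kernel_param(cmdline, key):
    """Return the value of key=value from a kernel command line string, or None.

    If the key appears more than once, the last occurrence is returned.
    """
    value = None
    for token in cmdline.split():
        name, sep, val = token.partition("=")
        if sep and name == key:
            value = val
    return value


if __name__ == "__main__":
    # /proc/cmdline holds the command line the running kernel was booted with.
    print(kernel_param(Path("/proc/cmdline").read_text(), "amdgpu.runpm"))
```

If this prints `1`, the workaround parameter is on the command line; `None` means it was not passed.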
Comment by Toolybird (Toolybird) - Monday, 27 February 2023, 05:33 GMT
It's good that you have a workaround...but it still sounds like something that should be reported to the AMD GPU devs...
