FS#77879 - AMDGPU causing erratic shutdowns
Attached to Project:
Arch Linux
Opened by Jonas Jefe (jonaslorincz) - Friday, 17 March 2023, 01:36 GMT
Last edited by Toolybird (Toolybird) - Monday, 17 April 2023, 22:10 GMT
Details
Description:
The amdgpu kernel driver shuts the system down when it detects a Critical Thermal Fault (CTF), which happens when the junction temperature reported by the GPU exceeds 105C. This is problematic on certain hardware. Specifically, on the ASUS G513QY the GPU thermal throttles automatically according to the rate of cooling and the current temperature, and it is designed to hold a stable 100C under heavy load. The temperature is not completely constant, however; it fluctuates, sometimes across a 95C-105C range. This is expected and within the range of normal operation, yet a brief peak at the top of that range is enough to trip the CTF shutdown.

Additional info:
kernel 6.2.6

This is a workaround for the issue: https://hst.sh/ufahikadap.patch
It removes the code causing the shutdown and relies on the GPU's automatic thermal management instead.

Steps to reproduce:
Use an ASUS G513QY laptop (or another device with similar characteristics) and run the GPU under heavy load (e.g. running a game).
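To see how close the junction temperature runs to the 105C limit described above, something like the following Python sketch may help. It assumes the amdgpu hwmon interface exposes a `temp*_label` file containing `junction` next to a matching `temp*_input` file in millidegrees Celsius, which may not hold on every GPU or kernel. The function names and the `CTF_LIMIT_C` constant are made up for illustration and are not part of the driver.

```python
from pathlib import Path

# Shutdown threshold cited in this report (assumption: not read from the driver).
CTF_LIMIT_C = 105.0


def read_junction_temps(hwmon_root="/sys/class/hwmon"):
    """Return {hwmon directory name: junction temperature in C} for amdgpu devices."""
    temps = {}
    root = Path(hwmon_root)
    if not root.is_dir():
        return temps
    for dev in sorted(root.iterdir()):
        name_file = dev / "name"
        # Skip hwmon devices that do not belong to the amdgpu driver.
        if not name_file.is_file() or name_file.read_text().strip() != "amdgpu":
            continue
        for label_file in dev.glob("temp*_label"):
            if label_file.read_text().strip() != "junction":
                continue
            input_file = dev / label_file.name.replace("_label", "_input")
            if input_file.is_file():
                # hwmon reports temperatures in millidegrees Celsius.
                temps[dev.name] = int(input_file.read_text().strip()) / 1000.0
    return temps


def headroom_c(temp_c, limit_c=CTF_LIMIT_C):
    """Degrees Celsius left before the assumed CTF limit is reached."""
    return limit_c - temp_c


if __name__ == "__main__":
    for dev, temp in read_junction_temps().items():
        print(f"{dev}: junction {temp:.1f} C, headroom {headroom_c(temp):.1f} C")
```

Running this in a loop while the GPU is under load should show whether the reported junction temperature fluctuates up to the limit shortly before a shutdown.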
Closed by Toolybird (Toolybird)
Monday, 17 April 2023, 22:10 GMT
Reason for closing: Upstream
Additional comments about closing: It's in the hands of upstream. Let's hope they get around to addressing it.
[1] https://gitlab.freedesktop.org/drm/amd