Arch Linux


FS#73946 - [mesa] AMDGPU returns a black screen whilst playing Minecraft

Attached to Project: Arch Linux
Opened by Jaroslav (iCarbonZz_) - Thursday, 24 February 2022, 16:49 GMT
Last edited by Andreas Radke (AndyRTR) - Tuesday, 24 January 2023, 20:08 GMT
Task Type Bug Report
Category Upstream Bugs
Status Closed
Assigned To Andreas Radke (AndyRTR), Laurent Carlier (lordheavy)
Architecture x86_64
Severity Medium
Priority Normal
Reported Version
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 0
Private No

Details

Description:
When I play Minecraft with the Magnesium/Sodium mod, which uses the GPU for chunk rendering, the amdgpu driver crashes.

Additional info:
- Packages used to reproduce: linux-zen 5.16.10-zen1-1-zen, linux 5.16.10.arch1-1, linux 5.16.9.arch1-1, linux-lts 5.15.24-2, linux-zen 5.15.5-zen1-1-zen, mesa 21.3.7-1, (AUR) mesa-git 22.1.0_devel.150214.eebe298a878.d41d8cd98f00b204e9800998ecf8427e-1*

- both dmesg and journalctl cut off almost instantly after the screen goes black; the only thing I got in return was:
[ 365.525014] [drm:amdgpu_dm_atomic_commit_tail [amdgpu]] *ERROR* Waiting for fences timed out!
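For anyone searching saved logs for the same failure, the line above can be matched mechanically. This is a minimal Python sketch, assuming the usual dmesg layout of a bracketed uptime timestamp followed by the message. The pattern is modelled only on the single line quoted in this report, and the names FENCE_TIMEOUT and find_fence_timeouts are made up for illustration.

```python
import re

# Hypothetical names; the pattern is modelled only on the one line quoted above.
FENCE_TIMEOUT = re.compile(
    r"^\[\s*(?P<seconds>\d+\.\d+)\]\s+"
    r"\[drm:(?P<function>\w+) \[amdgpu\]\] "
    r"\*ERROR\* Waiting for fences timed out!"
)

def find_fence_timeouts(lines):
    """Return (uptime_seconds, kernel_function) for each matching log line."""
    hits = []
    for line in lines:
        match = FENCE_TIMEOUT.match(line.strip())
        if match:
            hits.append((float(match.group("seconds")), match.group("function")))
    return hits

# The line from this report, used as sample input.
sample = "[ 365.525014] [drm:amdgpu_dm_atomic_commit_tail [amdgpu]] *ERROR* Waiting for fences timed out!"
print(find_fence_timeouts([sample]))
```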

Hardware:
- CPU: AMD Ryzen 5 2600 @ 4GHz (overclocked**)
- Motherboard: Gigabyte B450 Gaming X (although this may be irrelevant)
- GPU 1: AMD Radeon RX 560 4GB (overclocked**)
- GPU 2: NVIDIA GeForce GTX 660 Ti 2GB

* Note: I tried to replicate this on the Fedora KDE Live USB, but I was unable to do so. It has a 5.14 kernel and an older version of mesa (can't remember which one right now).
* Also, when I offload the rendering to the second graphics card, it doesn't happen either. I use the Plasma session on Wayland, but X11 didn't say anything in Xorg.0/1.log either, hence the absence of it in attachments.

** Note: both the CPU and GPU are overclocked, and the extra kernel parameter amdgpu.ppfeaturemask=0xffffffff is set in rEFInd. Returning both to stock clocks and removing the kernel parameter does not change anything at all.
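To confirm whether the parameter is really gone from a given boot, the kernel command line can be checked. This is a rough Python sketch: the function names are hypothetical, the sample command lines are invented for illustration, and the simple whitespace split assumes no quoted values containing spaces.

```python
def kernel_params(cmdline):
    """Split a kernel command line into {name: value}; bare flags map to None."""
    params = {}
    for token in cmdline.split():
        name, sep, value = token.partition("=")
        params[name] = value if sep else None
    return params

def ppfeaturemask(cmdline):
    """Return amdgpu.ppfeaturemask as an int, or None if it is not set."""
    value = kernel_params(cmdline).get("amdgpu.ppfeaturemask")
    return int(value, 0) if value else None

# Invented sample command lines; only the amdgpu parameter comes from this report.
with_mask = "rw quiet amdgpu.ppfeaturemask=0xffffffff"
without_mask = "rw quiet"
print(ppfeaturemask(with_mask) == 0xFFFFFFFF, ppfeaturemask(without_mask))
```

On a running system the same function could be fed the contents of /proc/cmdline instead of a sample string.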

Steps to reproduce:
Play Minecraft 1.16.x or higher with either Forge Mod Loader or Fabric installed, and have the mod "Magnesium" for Forge, or "Sodium" for Fabric installed.
Turn on Multi-Draw Chunk Rendering.
Play for 5-30 seconds. If it doesn't happen (about one time in 100), play for about three minutes.

Closed by  Andreas Radke (AndyRTR)
Tuesday, 24 January 2023, 20:08 GMT
Reason for closing:  Upstream
Comment by Andreas Radke (AndyRTR) - Thursday, 24 February 2022, 17:27 GMT
Please check with official mesa package and disable overclocking. Try to get a backtrace or some other log messages. Check downgrading official packages to find whether kernel drm, mesa or xf86-video-amdgpu causes this.
Comment by Jaroslav (iCarbonZz_) - Friday, 25 February 2022, 18:03 GMT
I have misjudged what the actual problem was. Turns out, it is very likely mesa.
I tried replicating the Fedora KDE setup by downgrading to the 5.14.11 kernel and mesa 21.2.3, which gave no crashes at all. It was bothersome to set up, though (understandably, downgrading some 6 packages + LLVM was not entertaining whatsoever, but I suppose it shouldn't be). So, onwards I proceeded.

Jumped to 5.15.11 and kept the same version of mesa. No difference, everything rock stable.
So, I tried jumping onto my default kernel, that being 5.16.11-zen, which likewise returned no crashes whatsoever.

It was at this moment that I realized that "amdgpu" does not signify only the kernel driver in dmesg. It can also be impacted by various other system components, such as mesa.

So, the mesa hunt began.

First, I tried upgrading by just a couple of versions, to 21.2.5, which, as I anticipated, worked just fine, although this version now requires libLLVM-13.so. (Maybe even 21.2.4, but honestly, I didn't try that one.)

Second, I upgraded to 21.3.3-1 (-2 where available). This is where my graphics card met its demise. Instantly, after 10 seconds of playing, the computer locked up and blackscreened.
So I decided to downgrade ALL packages to -1 to check whether the mix of -1 and -2 package releases was the problem.
Again, it crashed. So there's the culprit.
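The manual search above amounts to narrowing down a regression window. This is a small Python sketch of that bookkeeping, using only the mesa versions and outcomes mentioned in this thread. The function names are hypothetical, and it assumes that once a version crashes, every later one does too.

```python
# Results as reported in this thread: (mesa version, crashed?)
RESULTS = [
    ("21.3.7", True),   # from the original report
    ("21.2.3", False),
    ("21.2.5", False),
    ("21.3.3", True),
]

def version_key(version):
    """Turn '21.3.3' into (21, 3, 3) so versions sort numerically."""
    return tuple(int(part) for part in version.split("."))

def regression_window(results):
    """Return (last known good, first known bad) among the tested versions."""
    ordered = sorted(results, key=lambda item: version_key(item[0]))
    last_good = None
    for version, crashed in ordered:
        if crashed:
            return last_good, version
        last_good = version
    return last_good, None

print(regression_window(RESULTS))
```

Any untested release between the two returned versions is the next candidate to try.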

TL;DR: It's mesa 21.3.X onwards. Even now, the bug is still present in the mesa-git package on the AUR (22.1.x). Also, I am probably considered dumb by now (as I thought it was the kernel at first).

As for any logs, I am still unable to get anything out of dmesg/journalctl, even when I run them in a separate tty and have them log to my hard drive, which is just secondary storage. Best case scenario, the output stops at the "Waiting for fences timed out!" error, and nothing appears after it, even after 2 hours of leaving the computer on the black screen. (I can confirm that the rest of the machine keeps working, as I had some music playing and it kept on going.)
And that's as far as my knowledge goes.
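One thing that may help with logs that cut off is forcing every line to disk as soon as it arrives, so that nothing is lost in a buffer when the session dies. This is only a sketch in Python, with no guarantee that the kernel emits anything beyond the fence timeout. The function names and the log path in the usage note are hypothetical; the journalctl options used (--dmesg, --follow, --output=short-monotonic) are standard ones.

```python
import os
import subprocess

def save_lines_durably(lines, path):
    """Append each line to path and force it to disk before reading the next."""
    count = 0
    with open(path, "a", encoding="utf-8") as log:
        for line in lines:
            log.write(line if line.endswith("\n") else line + "\n")
            log.flush()             # Python buffer -> kernel
            os.fsync(log.fileno())  # kernel page cache -> disk
            count += 1
    return count

def follow_kernel_log(path):
    """Stream kernel messages from journalctl into path until interrupted."""
    cmd = ["journalctl", "--dmesg", "--follow", "--output=short-monotonic"]
    with subprocess.Popen(cmd, stdout=subprocess.PIPE, text=True) as proc:
        return save_lines_durably(proc.stdout, path)
```

Usage would be something like calling follow_kernel_log("amdgpu-hang.log") from a separate tty, or over SSH from another machine, before starting the game.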
Comment by Andreas Radke (AndyRTR) - Friday, 25 February 2022, 18:12 GMT
Please report it to the mesa devs upstream.
Comment by Marcell Meszaros (MarsSeed) - Sunday, 13 March 2022, 09:43 GMT
Could you please check if the issue is still present with the latest linux kernel and the latest [mesa] 22.0.0, which is currently in testing?
Comment by Jaroslav (iCarbonZz_) - Saturday, 09 April 2022, 21:07 GMT
I apologize for the late reply.
I didn't test 22.0.0 while it was in testing, though I tested the git build before 22.0 got released as stable. That crashed as well.
As Andreas suggested, I reported the bug to the mesa devs upstream. You can see the issue here: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6080
(Still occurs as of now)
