FS#34563 - [linux] 3.8.x - 3.10.x kernel drm (radeon_drv.so xorg) crashes
Attached to Project:
Arch Linux
Opened by Linas (Linas) - Monday, 01 April 2013, 21:48 GMT
Last edited by Tobias Powalowski (tpowa) - Tuesday, 17 September 2013, 10:00 GMT
Details
Description:
I upgraded a lot of packages yesterday. This included the 1.13.3-1 -> 1.14.0-2 update of xorg-server{,-common} and other related packages (xf86-input-{evdev,keyboard,mouse} xf86-video-{ati,fbdev,vesa} libdrm mesa ati-dri...) as well as an upgrade of the linux package 3.7.10-1 -> 3.8.4-1.
Today, xorg started crashing. It can work fine for hours and then suddenly crashes with SIGBUS, and then keeps crashing every time it is respawned (needing a reboot*).
I'm not sure whether I should be blaming xorg or the kernel. The backtraces point to radeon_drv.so (package xf86-video-ati 1:7.1.0-3). I am using a Radeon HD 3600. It is also worth noting that the kernel was booted with the radeon.no_wb=1 parameter.
In log files b and c, the crash happened in radeon_drv.so called from AddScreen, while in log file a it was called from libexa.so (which is also in the xorg-server package).
It seemed a little more likely to happen with chromium open, but that could just as well be statistical noise. Crashes have happened after only a few minutes, but it has now been working happily for 3.5 hours without one.
Closed by Tobias Powalowski (tpowa)
Tuesday, 17 September 2013, 10:00 GMT
Reason for closing: No response
One new piece of information. Reverting to 1.13 (see next post) along with the dependencies gives me a stable box again so the kernel is probably okay.
I am up to date with testing disabled.
warning: xf86-input-evdev: ignoring package upgrade (2.7.3-2 => 2.8.0-1)
warning: xf86-input-void: ignoring package upgrade (1.4.0-4 => 1.4.0-5)
warning: xf86-video-apm: ignoring package upgrade (1.2.5-2 => 1.2.5-3)
warning: xf86-video-ati: ignoring package upgrade (1:7.1.0-1 => 1:7.1.0-3)
warning: xf86-video-fbdev: ignoring package upgrade (0.4.3-2 => 0.4.3-3)
warning: xf86-video-v4l: ignoring package upgrade (0.2.0-11 => 0.2.0-12)
warning: xorg-server: ignoring package upgrade (1.13.3-1 => 1.14.0-2)
warning: xorg-server-common: ignoring package upgrade (1.13.3-1 => 1.14.0-2)
I opened https://bugzilla.kernel.org/show_bug.cgi?id=56311 upstream
This is what I did (and it still crashes):
upgraded xorg-server-common (1.14.0-2 -> 1.13.3-1)
upgraded xf86-input-evdev (2.8.0-1 -> 2.7.3-2)
upgraded xorg-server (1.14.0-2 -> 1.13.3-1)
upgraded xf86-input-keyboard (1.7.0-1 -> 1.6.2-2)
upgraded xf86-input-mouse (1.9.0-1 -> 1.8.1-2)
upgraded xf86-video-ati (1:7.1.0-3 -> 1:7.1.0-1)
upgraded xf86-video-fbdev (0.4.3-3 -> 0.4.3-2)
upgraded xf86-video-vesa (2.3.2-3 -> 2.3.2-2)
upgraded linux (3.7.10-1 -> 3.8.4-1)
Before upgrading to xorg-server 1.14.0-2 I had a stable system.
I am also attaching some crashes I see in dmesg; it looks like the GPU is stalling.
[ 8324.678418] CE: hpet increased min_delta_ns to 20113 nsec
[11043.378969] radeon 0000:04:00.0: GPU lockup CP stall for more than 10000msec
[11043.378981] radeon 0000:04:00.0: GPU lockup (waiting for 0x00000000001598ad last fence id 0x00000000001598a0)
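For anyone comparing several dmesg captures, the lockup lines above can be pulled out mechanically. This is a minimal sketch, assuming the message format stays exactly as quoted in this report; the function name `find_lockups` and the `fence_gap` field are made up for illustration, and reading the difference between the two fence ids as "how far behind the GPU was" is my interpretation, not something the driver states.

```python
import re

# Matches the radeon "GPU lockup (waiting for ... last fence id ...)" line
# in the form quoted in this report. Other kernel versions may word it differently.
LOCKUP_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+radeon\s+(?P<dev>\S+):\s+GPU lockup "
    r"\(waiting for (?P<want>0x[0-9a-fA-F]+) last fence id (?P<last>0x[0-9a-fA-F]+)\)"
)


def find_lockups(dmesg_text):
    """Return one dict per matching lockup line (hypothetical helper)."""
    results = []
    for line in dmesg_text.splitlines():
        m = LOCKUP_RE.search(line)
        if m is None:
            continue
        want = int(m.group("want"), 16)
        last = int(m.group("last"), 16)
        results.append({
            "timestamp": float(m.group("ts")),  # seconds since boot
            "device": m.group("dev"),           # PCI address of the card
            "fence_gap": want - last,           # waited-for id minus last seen id
        })
    return results


sample = (
    "[11043.378969] radeon 0000:04:00.0: GPU lockup CP stall for more than 10000msec\n"
    "[11043.378981] radeon 0000:04:00.0: GPU lockup (waiting for "
    "0x00000000001598ad last fence id 0x00000000001598a0)\n"
)
lockups = find_lockups(sample)
print(lockups)
```

Feeding it the output of `dmesg` saved to a file should list every lockup with its timestamp, which makes it easier to line the stalls up with the Xorg crash times.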
This bug looks related too:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/986524
dmes.log (15.5 KiB)
from pacman.log:
[2013-04-09 00:09] [PACMAN] Running 'pacman -U xf86-input-evdev-2.7.3-2-x86_64.pkg.tar.xz xf86-input-keyboard-1.6.2-2-x86_64.pkg.tar.xz xf86-input-mouse-1.8.1-2-x86_64.pkg.tar.xz xf86-video-ati-1:7.1.0-1-x86_64.pkg.tar.xz xf86-video-fbdev-0.4.3-2-x86_64.pkg.tar.xz xf86-video-vesa-2.3.2-2-x86_64.pkg.tar.xz xorg-server-common-1.13.3-1-x86_64.pkg.tar.xz xorg-server-1.13.3-1-x86_64.pkg.tar.xz xorg-server-devel-1.13.3-1-x86_64.pkg.tar.xz'
[2013-04-09 00:10] [PACMAN] downgraded xf86-input-evdev (2.8.0-1 -> 2.7.3-2)
[2013-04-09 00:10] [PACMAN] downgraded xf86-input-keyboard (1.7.0-1 -> 1.6.2-2)
[2013-04-09 00:10] [PACMAN] downgraded xf86-input-mouse (1.9.0-1 -> 1.8.1-2)
[2013-04-09 00:10] [PACMAN] downgraded xf86-video-ati (1:7.1.0-3 -> 1:7.1.0-1)
[2013-04-09 00:10] [PACMAN] downgraded xf86-video-fbdev (0.4.3-3 -> 0.4.3-2)
[2013-04-09 00:10] [PACMAN] downgraded xf86-video-vesa (2.3.2-3 -> 2.3.2-2)
[2013-04-09 00:10] [PACMAN] downgraded xorg-server-common (1.14.0-2 -> 1.13.3-1)
[2013-04-09 00:10] [PACMAN] downgraded xorg-server (1.14.0-2 -> 1.13.3-1)
[2013-04-09 00:10] [PACMAN] downgraded xorg-server-devel (1.14.0-2 -> 1.13.3-1)
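To double-check which packages were actually rolled back, the pacman.log lines above can be filtered with a short script. This is only a sketch, assuming the log lines look exactly like the ones quoted here (the `[PACMAN]` tag and the `(old -> new)` version pair); the helper name `version_changes` is hypothetical.

```python
import re

# Matches "upgraded"/"downgraded" entries in the pacman.log format quoted above.
CHANGE_RE = re.compile(
    r"\[(?P<when>[^\]]+)\] \[PACMAN\] (?P<action>upgraded|downgraded) "
    r"(?P<pkg>\S+) \((?P<old>\S+) -> (?P<new>\S+)\)"
)


def version_changes(log_text, action="downgraded"):
    """Return (package, old_version, new_version) tuples for the given action."""
    changes = []
    for line in log_text.splitlines():
        m = CHANGE_RE.search(line)
        if m and m.group("action") == action:
            changes.append((m.group("pkg"), m.group("old"), m.group("new")))
    return changes


sample_log = (
    "[2013-04-09 00:10] [PACMAN] downgraded xf86-video-ati (1:7.1.0-3 -> 1:7.1.0-1)\n"
    "[2013-04-09 00:10] [PACMAN] downgraded xorg-server (1.14.0-2 -> 1.13.3-1)\n"
)
downgrades = version_changes(sample_log)
for pkg, old, new in downgrades:
    print(f"{pkg}: {old} -> {new}")
```

Running it over the whole of /var/log/pacman.log with `action="upgraded"` gives the matching list of upgrades, which helps narrow down which update introduced the crash.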
So it still crashes, but much less often. I would like to get this reported upstream, but I am not sure *whose* upstream problem it is. Can you point me in the right direction?
On https://bugzilla.kernel.org/show_bug.cgi?id=56311 Alex Deucher said it is a mesa bug in features enabled only on 3.8 kernels, pointing to https://bugs.freedesktop.org/show_bug.cgi?id=61182
That one seems a bit messy, but the "resources occupy a lot of memory" explanation makes sense, as using a lot of memory seemed to play a part in it.
It's possible, however, that I didn't revert it correctly (there were a couple of conflicts), or that some later commit also causes crashes. But the modified mesa (packages mesa, mesa-libgl, ati-dri, intel-dri, nouveau-dri, svga-dri) still fails. I am attaching the mesa-git-fixes.patch I used (it's the same file that was in the package, with the revert commit appended).
Firefox does not use GPU Accelerated Windows in this setup by default. After forcing it (layers.acceleration.force-enabled=true), about:support reports accelerated windows, and Firefox runs stable. Very strange, just the opposite of what one might expect...
PS: Still crashing with 3.8.10-1-ARCH
Attached is a log produced with `startx 2> log.txt`
I will revert to 1.13 and report if this stabilizes my system.
update:
now using:
xf86-input-evdev-2.7.3-2-x86_64.pkg.tar.xz
xorg-server-1.13.3-1-x86_64.pkg.tar.xz
xorg-server-common-1.13.3-1-x86_64.pkg.tar.xz
But still can't run 3d :(
update:
So, I practically broke everything. Once I fixed everything, my 3d started working again.
I think it came down to the fact that I had `extra/xf86-input-mouse` installed, but I cannot verify, as I tried a lot of different things.
Now I'm using packages from the mesa-git repo, and I don't have this problem anymore with glamor enabled.