FS#31668 - [linux] 3.5.4 ati radeon, kms enabled, black screen on boot

Attached to Project: Arch Linux
Opened by Uli (Army) - Monday, 24 September 2012, 07:32 GMT
Last edited by Andreas Radke (AndyRTR) - Monday, 26 November 2012, 14:24 GMT
Task Type Bug Report
Category Packages: Core
Status Closed
Assigned To Tobias Powalowski (tpowa)
Thomas Bächler (brain0)
Andreas Radke (AndyRTR)
Architecture x86_64
Severity High
Priority Normal
Reported Version
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 8
Private No

Details

Description:

First of all, I know about https://bugs.archlinux.org/task/31264, but the behavior here is a bit different. At least I think so, because here it doesn't happen on _every_ boot. If I'm proven wrong, my apologies.

The black screen appears roughly once every 3-4 boots. I use syslinux as the bootloader and systemd as the init system.

I start the laptop and see syslinux's menu, but as soon as radeon is loaded with KMS (early method), the screen goes black. The system then boots (I can tell from the disk activity LED) and tries to start X (at that moment the screen flickers briefly), and that's it: the screen stays black. X is started because I configured systemd to launch it automatically.

When it last happened, I logged in blind and ran "dmesg > dmesg_failed". Then I hit Ctrl+Alt+Del to reboot the machine. That boot worked, and I ran "dmesg > dmesg_worked".

Here are the log files:
dmesg_failed: http://codepad.org/XQpJ6pQt
dmesg_worked: http://codepad.org/Fu4pwl37

I looked through them and diffed them. The start of the boot process looks fine; the log files are the same there. But two lines appear in the "worked" log that aren't in the "failed" log:

[ 15.515877] systemd[1]: tmp.mount: Directory /tmp to mount over is not empty, mounting anyway. (To see the over-mounted files, please manually mount the underlying file system to a secondary location.)
[ 26.336230] [Firmware Bug]: ACPI: BIOS _OSI(Linux) query ignored

I'm not sure if and how they can be related to the problem, but I think too much information is better than too little.
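
For anyone who wants to repeat the comparison, here is a minimal sketch of how the two logs can be diffed with the kernel timestamps stripped, since the timestamps differ on every boot and would otherwise mark nearly every line as changed. The function name strip_ts and the *.nots file names are made up for illustration; only dmesg_failed and dmesg_worked come from the report.

```shell
# strip_ts: remove the leading "[   12.345678] " kernel timestamp from each line on stdin
strip_ts() {
    sed -E 's/^\[ *[0-9]+\.[0-9]+\] ?//'
}

# Compare the two logs only when both files are present in the current directory
if [ -f dmesg_failed ] && [ -f dmesg_worked ]; then
    strip_ts < dmesg_failed > dmesg_failed.nots
    strip_ts < dmesg_worked > dmesg_worked.nots
    # diff exits non-zero when the files differ, which is the expected case here
    diff -u dmesg_failed.nots dmesg_worked.nots || true
fi
```

Lines that differ only in addresses or ordering will still show up, so the output needs to be read by hand.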

The interesting part is right at the end of the "failed" log. This is caused by the failed launch of X:

[ 51.378213] radeon 0000:00:01.0: GPU lockup (waiting for 0x0000000000000004 last fence id 0x0000000000000001)
[ 51.378236] [drm] Disabling audio support
[ 51.379397] radeon 0000:00:01.0: GPU softreset
[ 51.379406] radeon 0000:00:01.0: GRBM_STATUS=0xA0003828
[ 51.379413] radeon 0000:00:01.0: GRBM_STATUS_SE0=0x00000007
[ 51.379420] radeon 0000:00:01.0: GRBM_STATUS_SE1=0x00000007
[ 51.379428] radeon 0000:00:01.0: SRBM_STATUS=0x20020940
[ 51.536902] radeon 0000:00:01.0: Wait for MC idle timedout !
[ 51.536907] radeon 0000:00:01.0: GRBM_SOFT_RESET=0x00007F6B
[ 51.537012] radeon 0000:00:01.0: GRBM_STATUS=0x00003828
[ 51.537015] radeon 0000:00:01.0: GRBM_STATUS_SE0=0x00000007
[ 51.537019] radeon 0000:00:01.0: GRBM_STATUS_SE1=0x00000007
[ 51.537022] radeon 0000:00:01.0: SRBM_STATUS=0x20020940
[ 51.538026] radeon 0000:00:01.0: GPU reset succeed
[ 51.704859] radeon 0000:00:01.0: Wait for MC idle timedout !
[ 51.861889] radeon 0000:00:01.0: Wait for MC idle timedout !
[ 51.864715] [drm] PCIE GART of 512M enabled (table at 0x0000000000040000).
[ 51.864825] radeon 0000:00:01.0: WB enabled
[ 51.864832] radeon 0000:00:01.0: fence driver on ring 0 use gpu addr 0x0000000018000c00 and cpu addr 0xffff880138e50c00
[ 52.043389] [drm:r600_ring_test] *ERROR* radeon: ring 0 test failed (scratch(0x8500)=0xCAFEDEAD)
[ 52.043395] [drm:evergreen_resume] *ERROR* evergreen startup failed on resume
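
Since the screen is black when this happens, it can help to check a saved log for these messages afterwards. The following is a rough sketch that searches a dmesg dump for three of the error strings quoted above; the function name has_radeon_failure is hypothetical, and the pattern list is only taken from this one log, so other failures may look different.

```shell
# has_radeon_failure FILE: succeed if FILE contains one of the error lines
# quoted above (GPU lockup, failed ring test, failed startup on resume)
has_radeon_failure() {
    grep -Eq 'GPU lockup|ring 0 test failed|evergreen startup failed on resume' "$1"
}

# Example use on the log saved after a failed boot
if [ -f dmesg_failed ] && has_radeon_failure dmesg_failed; then
    echo "radeon failure signature found"
fi
```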

Looks like somebody on the forums has the same problem (found it by googling for "drm:evergreen_resume ERROR".)

https://bbs.archlinux.org/viewtopic.php?pid=1163369#p1163369

No solution found there.

Additional info:
* package version(s)
linux 3.5.4-1
systemd 191-1 (but had the issue with 189-1 as well)
xf86-video-ati 1:6.14.6-1

* config and/or log files etc.
dmesg_failed: http://codepad.org/XQpJ6pQt
dmesg_worked: http://codepad.org/Fu4pwl37

Steps to reproduce:
Boot up and see ... well, nothing ;)

Closed by  Andreas Radke (AndyRTR)
Monday, 26 November 2012, 14:24 GMT
Reason for closing:  Upstream
Comment by Jöran Karl (BlueDarknezz) - Monday, 01 October 2012, 18:22 GMT
Hello everyone,

I have "exactly" the same problem, or at least the same symptom (and it also doesn't happen on every boot). If I remember correctly, this behavior started with the systemd setup ("A pure systemd installation"). I don't think that the problem is related to...

"[ 15.515877] systemd[1]: tmp.mount: Directory /tmp to mount over is not empty, mounting anyway. (To see the over-mounted files, please manually mount the underlying file system to a secondary location.)
[ 26.336230] [Firmware Bug]: ACPI: BIOS _OSI(Linux) query ignored" [Army]

...because I don't get the first message about /tmp being mounted over a non-empty directory (for me it's /var/log), and the second is the generic warning for a broken stock BIOS.

What I find more interesting right now is that I've got 3 different CPU addresses. Or are they generated at runtime?

radeon 0000:00:01.0: fence driver on ring 0 use gpu addr 0x0000000018000c00 and cpu addr 0xffff88011745ac00
radeon 0000:00:01.0: fence driver on ring 0 use gpu addr 0x0000000018000c00 and cpu addr 0xffff880117448c00
radeon 0000:00:01.0: fence driver on ring 0 use gpu addr 0x0000000018000c00 and cpu addr 0xffff880117435c00

Update 03.10.2012: The CPU address seems to be generated. In another working log, the address is different too.

The ring test error was visible in only one of the two logs.

I'm open to any hint I can get.

Additional info:
* package version(s)
grub-bios 2.00-1
linux 3.5.4-1 (and before)
systemd 193-1 (and before)
xf86-video-ati 1:6.14.6-1

* config and/or log files etc.
They are attached to this comment. For a better diff I've removed the timestamps, but the originals (*_timestamps) are attached too.

Comment by Uli (Army) - Sunday, 07 October 2012, 11:16 GMT
The problem still exists after the latest updates from [testing]:

linux 3.6-1
systemd 194-1
xf86-video-ati 1:6.14.6-2

I think I'll have to check whether this is known upstream and whether a possible solution is already in sight.
Comment by Stefan Kooman (hydro-b) - Tuesday, 09 October 2012, 18:05 GMT
I have the same issue on a Lenovo ThinkPad x121e. I've had this issue since I migrated to systemd, though this might be a coincidence. I use grub-bios 2.00-1. Disabling all vga= and video= options in grub.cfg, as mentioned in the "installation" section of https://wiki.archlinux.org/index.php/Kernel_Mode_Setting, doesn't make a difference.

I don't have this issue on my workstation with exactly the same installation of arch (cloned install).
Comment by Federico Cinelli (Cinelli) - Thursday, 11 October 2012, 20:20 GMT
Try adding radeon to the MODULES line in mkinitcpio.conf, rebuilding the initramfs, and rebooting. Also remove any vga=/nomodeset options. It should be OK then.
Comment by Uli (Army) - Friday, 12 October 2012, 07:02 GMT
Hi Federico, this is exactly how I have it set up; see the bug report. This is what I meant by "as soon as radeon is being loaded with kms (early method)": putting the radeon module in mkinitcpio.conf is the "early method".
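
For reference, a minimal sketch of what the "early method" looks like in /etc/mkinitcpio.conf. It assumes the string-style MODULES syntax that mkinitcpio used at the time (newer releases use an array, MODULES=(radeon)) and the stock "linux" preset. The rebuild command is shown as a comment because it has to be run as root.

```shell
# /etc/mkinitcpio.conf (excerpt): load radeon from the initramfs so KMS starts early
MODULES="radeon"

# Afterwards, rebuild the initramfs for the stock kernel preset (as root):
#   mkinitcpio -p linux
```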
Comment by Bruno Bischofberger (whilealive) - Tuesday, 16 October 2012, 18:32 GMT
Same thing here since 3.5.4 (Acer laptop with AMD E300). As far as I can remember (sorry, I'm not absolutely sure anymore...), I already had this issue before migrating to systemd. I did the migration mostly because I was wondering whether the issue would disappear.
Comment by Federico Cinelli (Cinelli) - Wednesday, 17 October 2012, 01:41 GMT
I'm going to guess (?) these are hybrid graphics systems... Try installing xf86-video-intel and changing your mkinitcpio MODULES line to MODULES="intel_agp radeon". Also, is there a login manager running as well?
Comment by Stefan Kooman (hydro-b) - Wednesday, 17 October 2012, 06:21 GMT
My system is definitely not a hybrid system. It's an AMD Fusion E-450 with a Radeon HD6320M GPU, no Intel inside(TM). I'm using xdm as login manager.
Comment by Uli (Army) - Wednesday, 17 October 2012, 06:51 GMT
Same here, no Intel inside.
Comment by Federico Cinelli (Cinelli) - Thursday, 18 October 2012, 00:39 GMT
Please paste copies of your lspci and lsmod output.
Comment by Jöran Karl (BlueDarknezz) - Thursday, 18 October 2012, 05:01 GMT
I have a Lenovo ThinkPad x121e with an AMD E-350, and here is my lspci and lsmod output. The problem still persists with the latest updates from core/extra.
Comment by Federico Cinelli (Cinelli) - Thursday, 18 October 2012, 18:17 GMT
Jöran, it is possible that fglrx doesn't cooperate well with the system's ACPI hardware calls, so it auto-disables itself and there is no screen output (quoted from the wiki).

Try applying: aticonfig --acpi-services=off

If that does not work, please post the output of aticonfig --initial
Comment by Jöran Karl (BlueDarknezz) - Thursday, 18 October 2012, 18:32 GMT
Hello Federico,

I don't mean to offend, but have you actually read any of our comments, reports or logs?
Army and I have both posted that we use xf86-video-ati. We don't use fglrx, so it can't interfere with the radeon module that is loaded (fglrx isn't even installed). Please take a look at the logs if you wish to help further.

Kind regards,

Jöran
Comment by Bruno Bischofberger (whilealive) - Thursday, 18 October 2012, 19:00 GMT
I'm also using xf86-video-ati and not fglrx (AMD E300 with Radeon HD 6310). By the way, here is my dmesg file from the last blank screen.
Comment by Federico Cinelli (Cinelli) - Thursday, 18 October 2012, 19:26 GMT
Jöran, no offense taken. I did read your logs; I was only offering an option, since the Catalyst proprietary display drivers are said to work for the Radeon HD 6310. They can be found at the link below. There is also a .pdf with installation instructions that you can download. I apologize for any confusion my last post caused; I should have made it clearer that it was just an option. If you are set on using the radeon driver, could you post the output of pacman -Qs libgl, pacman -Q linux and uname -r?

http://support.amd.com/us/gpudownload/linux/Pages/radeon_linux.aspx?type=2.4.1&product=2.4.1.3.42&lang=English
Comment by Jöran Karl (BlueDarknezz) - Friday, 19 October 2012, 17:00 GMT
Hi Federico,

As requested, here is the output of pacman -Qs libgl and uname -r:

# pacman -Qs libgl
local/libgl 9.0-1
    Mesa 3-D graphics library and DRI software rasterizer
local/libglade 2.6.4-3
    Allows you to load glade interface files in a program at runtime
local/libglapi 9.0-1
    free implementation of the GL API -- shared library. The Mesa GL API module
    is responsible for dispatching all the gl* functions

# uname -r
3.6.2-1-ARCH
Comment by Mike DeTuri (Scrivener) - Saturday, 20 October 2012, 05:43 GMT
I don't have anything to add except to chime in and say that I'm having the same problem with the open source radeon driver. I had to boot three times today due to a black screen when modeset started. I'm using an AMD A8-3870 system with Radeon HD 6550D video (built into the CPU).
Comment by Vinay S Shastry (shastry) - Saturday, 20 October 2012, 06:04 GMT
I've got an AMD E-350 netbook with similar issues, and I have had some success avoiding the black screen with:

1) Grub2: Adding "GRUB_TERMINAL_OUTPUT=console" to and removing "GRUB_GFXPAYLOAD_LINUX=keep" from /etc/default/grub, then recreating the GRUB config.
2) Kernel 3.7: linux-mainline package (AUR)
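
Step 1 could look roughly like the excerpt below. This is only a sketch: it assumes the default file locations of the grub-bios package, and commenting the line out is used here as one way of "removing" it.

```shell
# /etc/default/grub (excerpt)
GRUB_TERMINAL_OUTPUT=console      # use the plain text console for GRUB's output
#GRUB_GFXPAYLOAD_LINUX=keep       # disabled, so GRUB's graphics mode is not kept for the kernel

# Afterwards, recreate the GRUB config (as root):
#   grub-mkconfig -o /boot/grub/grub.cfg
```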
Comment by Jöran Karl (BlueDarknezz) - Saturday, 20 October 2012, 12:10 GMT
Hello shastry,

I will try your hint about commenting out "GRUB_GFXPAYLOAD_LINUX=keep" and see whether it helps over the next few reboots/start-ups.
Comment by Uli (Army) - Saturday, 20 October 2012, 12:34 GMT
Seems a bit odd to me, since I use syslinux and have the same issue. Or is there something similar for syslinux?
Comment by Jöran Karl (BlueDarknezz) - Sunday, 21 October 2012, 12:06 GMT
@Uli: You're right, commenting out "GRUB_GFXPAYLOAD_LINUX=keep" didn't help. My first boot after the change went black :/.

Does anyone have other hints?
Comment by Linas (Linas) - Sunday, 21 October 2012, 21:46 GMT
You are hitting the problem just after WB is enabled. Try booting with the kernel parameter radeon.no_wb=1. I need it for X to work. Unlike you, though, I have no problem reaching the console without it.
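
After adding the parameter to the bootloader's kernel command line, it may be worth confirming that the running kernel actually received it. A small sketch, assuming /proc/cmdline is readable; the helper name cmdline_has is made up for this example.

```shell
# cmdline_has PARAM CMDLINE: succeed if PARAM appears as a whole word in CMDLINE
cmdline_has() {
    case " $2 " in
        *" $1 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Check the command line of the currently running kernel
if [ -r /proc/cmdline ] && cmdline_has "radeon.no_wb=1" "$(cat /proc/cmdline)"; then
    echo "radeon.no_wb=1 is active"
else
    echo "radeon.no_wb=1 is not set"
fi
```

This only shows that the parameter was passed at boot, not that it changes the behavior described in this report.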
Comment by Mike DeTuri (Scrivener) - Sunday, 28 October 2012, 13:49 GMT
Upgrading to linux-3.6.3-1 seems to have helped. I've only booted 4 or 5 times since, but I haven't seen the black screen. I haven't changed anything else about my configuration.
Comment by Uli (Army) - Sunday, 28 October 2012, 14:30 GMT
The first few boots worked fine for me too, but then it happened again. Unfortunately the no_wb=1 option didn't help.
Comment by Bruno Bischofberger (whilealive) - Monday, 29 October 2012, 17:12 GMT
Same here. Nothing changed with 3.6.3-1.
Comment by Matthew Stewart (Dren) - Thursday, 08 November 2012, 02:09 GMT
I have the same issue. I'm using an hp dm1z laptop with the E-350 processor, and lspci tells me its graphics card is "Radeon HD 6310".
Modesetting works relatively well when it doesn't give me a black screen. Without modesetting, suspending doesn't work and I can't leave X once I enter it. I had no luck with catalyst either. This all happened before and after switching to systemd.
Comment by Matthew Stewart (Dren) - Friday, 09 November 2012, 20:26 GMT
Thank you, shastry, I'm now using the 3.7 kernel and it appears to be working! I've rebooted almost a dozen times and didn't get the black screen once.

uname -r
3.7.0-1-mainline
Comment by fzap (fzap) - Sunday, 25 November 2012, 19:26 GMT
I'm using an AMD E-450 chipset with radeon, kernel 3.6.7, early KMS start and a full systemd setup, and I still randomly get black screens.
Comment by Vinay S Shastry (shastry) - Sunday, 25 November 2012, 19:38 GMT
@fzap
As mentioned above, I've had success with the 3.7 kernel. Have you tried it?
Comment by Andreas Radke (AndyRTR) - Monday, 26 November 2012, 14:24 GMT
This is a kernel drm upstream issue. Closing our downstream bug.
