FS#38980 - [linux] 3.13 Automated GPU switching non-functional/freezing

Attached to Project: Arch Linux
Opened by nikku (nikku) - Friday, 21 February 2014, 16:52 GMT
Last edited by Gerardo Exequiel Pozzi (djgera) - Monday, 08 December 2014, 16:32 GMT
Task Type Bug Report
Category Upstream Bugs
Status Closed
Assigned To Tobias Powalowski (tpowa)
Thomas Bächler (brain0)
Architecture x86_64
Severity Critical
Priority Normal
Reported Version
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 7
Private No

Details

The automated GPU switching introduced in 3.13 doesn't work at all with my AMD/AMD system (CPU: AMD A10-5750M, GPU: AMD 8970M). My system's power LED changes color when the dGPU is active, which lets me easily monitor its status.

Since upgrading to 3.13, my system either freezes and reboots just seconds after starting X, or the automated switching doesn't appear to function at all. Both results are 100% reproducible and depend on what the user does. About 10-12 seconds after booting to the CLI, the dGPU is switched off automatically.

If I run startx BEFORE this happens, I am left with a usable system without any automated switching (note that manual vga_switcheroo switching doesn't work in 3.13 - this is a serious problem for notebook users who run on battery).

If I run startx AFTER the dGPU is switched off, X pauses temporarily with errors about resuming from sleep, then boots into the WM. 1-2 seconds later the mouse begins to hitch, the screen freezes, the speakers pop, and the system reboots. I tried to grab the dmesg output quickly, but that doesn't seem possible even when I type the command in time.

To capture something useful, I experimented with removing the xrandr offload provider switch from the openbox autostart file. I booted into X before the dGPU was switched off and did not move the mouse. The dGPU switched off; when I moved the mouse, it switched back on. The screen froze for a few seconds, but then everything went back to normal. I have attached the dmesg output.

I tried the exact same thing, but this time I did not move the mouse for 20 seconds after the dGPU switched off. This time, when the dGPU switched back on, the screen stayed permanently frozen and eventually turned black. The mouse cursor was movable, but the system didn't respond to Ctrl-Alt-Del or the power button.
   dmesg.txt (72.3 KiB)

Closed by  Gerardo Exequiel Pozzi (djgera)
Monday, 08 December 2014, 16:32 GMT
Reason for closing:  No response
Comment by nikku (nikku) - Friday, 21 February 2014, 17:23 GMT
I added a script to the openbox autostart file that saves the dmesg output to a file when openbox starts. This yielded nothing useful, just the same two errors as in the attachment above. I'm 90% sure this is not something the Arch devs can fix, but here's hoping.
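For anyone who wants to try the same capture, here is a minimal sketch of such an autostart helper. It is not the script used above: the function name `save_command_output` and the `~/dmesg-logs` directory are made up for illustration. It forces the log to disk with `fsync` because the machine may reboot seconds later.

```python
import os
import subprocess
import time


def save_command_output(cmd, out_dir):
    """Run cmd and write its output to a timestamped file in out_dir.

    Hypothetical helper, not part of any existing tool.
    """
    os.makedirs(out_dir, exist_ok=True)
    # Capture stdout and stderr together; do not raise on a non-zero exit.
    result = subprocess.run(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, check=False
    )
    path = os.path.join(out_dir, time.strftime("dmesg-%Y%m%d-%H%M%S.log"))
    with open(path, "wb") as fh:
        fh.write(result.stdout)
        fh.flush()
        # Push the data to disk before a possible freeze or reboot.
        os.fsync(fh.fileno())
    return path


# Example call from an autostart script (directory name is an assumption):
# save_command_output(["dmesg"], os.path.expanduser("~/dmesg-logs"))
```

On systems where kernel.dmesg_restrict is enabled, dmesg may need root, so the call above might have to be adapted.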
Comment by Javier Viñal (fjvinal) - Saturday, 22 February 2014, 10:04 GMT
After updating to 3.13, I get a kernel panic when starting KDM with a radeon card in a Sony Vaio laptop.
If I disable kdm.service, the system starts fine in console mode.
Comment by Javier Viñal (fjvinal) - Saturday, 22 February 2014, 14:11 GMT
Solved (in my case) by adding "radeon.runpm=0" to the kernel parameter line.
This disables runtime power management for radeon cards.
It is a kernel bug: https://bugzilla.kernel.org/show_bug.cgi?id=65761
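To confirm that the parameter actually reached the kernel after a reboot, the command line can be checked. The following is only a sketch: the function names are hypothetical, and it assumes that the last occurrence of the parameter is the one that counts.

```python
def radeon_runpm_from_cmdline(cmdline):
    """Return the last radeon.runpm value on a kernel command line, or None.

    Hypothetical helper; assumes a later occurrence overrides an earlier one.
    """
    value = None
    for token in cmdline.split():
        key, sep, val = token.partition("=")
        if key == "radeon.runpm" and sep:
            value = val
    return value


def current_radeon_runpm(path="/proc/cmdline"):
    """Read the running kernel's command line and extract radeon.runpm."""
    with open(path) as fh:
        return radeon_runpm_from_cmdline(fh.read())
```

Calling `current_radeon_runpm()` on a system booted with the workaround should return the string "0"; `None` means the parameter is not on the command line.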
Comment by nikku (nikku) - Saturday, 22 February 2014, 15:01 GMT
Thanks for the link! It looks like it might be fixed in 3.14 (although there are conflicting reports). I've done some more testing, and automatic switching does seem to be functional, but its behavior is way off. X randomly wakes the dGPU up and sometimes prevents it from ever sleeping again. I can get the dGPU to power off if nothing on the screen is changing, but the time until it is woken again seems to vary. I also experience that "hanging" for a few seconds every time the card is woken up.

X really does not like being started while the dGPU is asleep (speaker pop and reboot). I don't know if others experience this, but it's 100% reproducible on my laptop. I can't find any mention of it elsewhere, which is strange.

edit: runpm=0 works great, and while vga_switcheroo functionality is restored, it doesn't quite work right. It disables the dGPU as usual, but the screen then immediately freezes and doesn't become responsive again.
Comment by Doug Newgard (Scimmia) - Saturday, 12 April 2014, 15:40 GMT
@nikku, now that 3.14 is in Core, are you still having this problem?
Comment by SpacemanSpiff (SpacemanSpiff) - Sunday, 29 June 2014, 16:46 GMT
The update to 3.15 gives me a problem similar to this bug. 3.14.6 was working perfectly.
HP dv6z laptop with 6755g2 dual GPU (note that I don't have Intel graphics).

After logging in to the desktop using kdm, the desktop freezes for 5-10 seconds every 5-10 seconds. My laptop temperatures are under control, so there is no restart, but they are elevated.
It looks like automatic dGPU switching is broken. The dGPU keeps turning on and off (along with the freezes, which may be caused by it).
I can confirm this by a) looking at the dGPU temperature in a KDE widget and b) using watch cat /sys/kernel/debug/vgaswitcheroo/switch: the discrete GPU status keeps switching between DynPwr and DynOff.
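The on/off flapping seen in the switch file could also be counted instead of watched by eye. This is a rough sketch, assuming each line of the file has the layout id:client:active:power:pci-address with the discrete card labelled DIS; the function names and the sample text are illustrative, not taken from the attached logs.

```python
def discrete_power_state(switch_text):
    """Return the power field (e.g. 'DynOff') of the DIS client, or None.

    Assumed line layout: id:client:active:power:pci-address
    """
    for line in switch_text.splitlines():
        fields = line.split(":")
        if len(fields) >= 4 and fields[1] == "DIS":
            return fields[3]
    return None


def count_transitions(states):
    """Count how often consecutive samples differ, ignoring case and None."""
    lowered = [s.lower() for s in states if s is not None]
    return sum(1 for a, b in zip(lowered, lowered[1:]) if a != b)


# Illustrative sample, not real output from the affected machine:
SAMPLE = "0:IGD:+:Pwr:0000:00:01.0\n1:DIS: :DynOff:0000:01:00.0\n"
```

Reading /sys/kernel/debug/vgaswitcheroo/switch once per second (as root, with debugfs mounted), passing each read through `discrete_power_state`, and feeding the collected values to `count_transitions` would give a number of power-state changes that can be compared between kernel versions.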

I can use the console by switching to a tty with Ctrl-Alt-F2. There is no hang in the console. top shows that 2 kworker processes become active regularly.

Killing X using Ctrl-Backspace just hangs, and I start getting the error "CPU soft lockup CPU#1 stuck for 23s! [X:352]". After this, reboot also stalls because of the hung kworker processes, and I have to shut down manually.

radeon.runpm=0 will probably work, but for now I have gone back to 3.14.6. dmesg attached. Sorry if it's too long; I think the on-off switching produced a lot of output.
Comment by Tobias Powalowski (tpowa) - Wednesday, 13 August 2014, 07:04 GMT
Status on 3.16?
