Arch Linux

FS#50397 - [linux] radeon: ring 0 stalled for more than 10250msec

Attached to Project: Arch Linux
Opened by Wiktor (typh00nz) - Sunday, 14 August 2016, 21:44 GMT
Last edited by Doug Newgard (Scimmia) - Monday, 15 August 2016, 05:19 GMT
Task Type Bug Report
Category Kernel
Status Assigned
Assigned To Tobias Powalowski (tpowa)
Architecture x86_64
Severity Critical
Priority Normal
Reported Version
Due in Version Undecided
Due Date Undecided
Percent Complete 0%
Votes 10
Private No

Details

Description:

xf86-video-ati 1:7.7.0-1 breaks my OS, causing a black screen and freezing the whole machine.


-- Reboot --
kernel: radeon 0000:01:00.0: ring 0 stalled for more than 10250msec
kernel: radeon 0000:01:00.0: failed to get a new IB (-35)
kernel: [drm:radeon_cs_ioctl [radeon]] *ERROR* Failed to get ib !
kernel: radeon 0000:01:00.0: failed to get a new IB (-35)
kernel: [drm:radeon_cs_ioctl [radeon]] *ERROR* Failed to get ib !
kernel: BUG: unable to handle kernel paging request at ffffc90400f70ffc
kernel: IP: [<ffffffffa06e7ff5>] radeon_ring_backup+0xd5/0x170 [radeon]
kernel: RIP [<ffffffffa06e7ff5>] radeon_ring_backup+0xd5/0x170 [radeon]
-- Reboot --
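
For readers who want to check their own kernel log for the same signature, the following is a minimal sketch, not part of the original report. It assumes the stall message always has the shape seen in this task ("radeon <PCI address>: ring N stalled for more than Xmsec"); the function name `find_ring_stalls` and the `SAMPLE` text are illustrative, with the sample lines copied from the log above.

```python
import re

# Matches the stall message as it appears in this report. The exact wording
# is assumed from the log excerpts here, not taken from the kernel source.
STALL_RE = re.compile(
    r"radeon (?P<device>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]): "
    r"ring (?P<ring>\d+) stalled for more than (?P<msec>\d+)msec"
)


def find_ring_stalls(log_text):
    """Return a list of (pci_device, ring, stall_msec) for each stall line."""
    stalls = []
    for line in log_text.splitlines():
        match = STALL_RE.search(line)
        if match:
            stalls.append(
                (match["device"], int(match["ring"]), int(match["msec"]))
            )
    return stalls


# Lines copied from the description above.
SAMPLE = """\
kernel: radeon 0000:01:00.0: ring 0 stalled for more than 10250msec
kernel: radeon 0000:01:00.0: failed to get a new IB (-35)
kernel: [drm:radeon_cs_ioctl [radeon]] *ERROR* Failed to get ib !
"""

if __name__ == "__main__":
    for device, ring, msec in find_ring_stalls(SAMPLE):
        print(f"{device}: ring {ring} stalled for {msec} ms")
```

The same function could be fed saved journal or dmesg output read from a file; how that text is collected is left to the reader.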



Additional info:

4.7.0-1-ARCH
xf86-video-ati 1:7.7.0-1
Radeon HD 6850M



Steps to reproduce:

Noticed while playing Dota 2.

Comment by cirrus (cirrus) - Thursday, 01 June 2017, 22:52 GMT
I experience this using:
Card: Advanced Micro Devices [AMD/ATI] RV770 [Radeon HD 4870]
Display Server: X.Org 1.19.3 driver: N/A Resolution: 1920x1080@60.00hz, 1920x1080@60.00hz
GLX Renderer: Gallium 0.4 on AMD RV770 (DRM 2.49.0 / 4.11.3-2-ck-nehalem, LLVM 4.0.0)
GLX Version: 3.0 Mesa 17.1.0
It happens on the stock Arch kernel too. I use the xf86-video-ati driver, and this has been occurring since I began using this GPU about 1 year ago.
Some kernels seem to work better than others; dmesg often shows output akin to this:
perf: interrupt took too long (2711 > 2500), lowering kernel.perf_event_max_sample_rate to 73000
perf: interrupt took too long (3512 > 3388), lowering kernel.perf_event_max_sample_rate to 56000
perf: interrupt took too long (4459 > 4390), lowering kernel.perf_event_max_sample_rate to 44000
perf: interrupt took too long (5613 > 5573), lowering kernel.perf_event_max_sample_rate to 35000
hawker64 kernel: [drm:radeon_cs_ioctl [radeon]] *ERROR* Failed to schedule IB !
hawker64 kernel: [drm:radeon_cs_ioctl [radeon]] *ERROR* Failed to schedule IB !
hawker64 kernel: radeon 0000:02:00.0: scheduling IB failed (-2).
hawker64 kernel: [drm:radeon_cs_ioctl [radeon]] *ERROR* Failed to schedule IB
--------------------------------------------------------------
Mar 26 19:22:12 hawker64 kernel: [drm:radeon_uvd_cs_parse [radeon]] *ERROR* Illegal UVD message type (-1)!
Mar 26 19:22:12 hawker64 kernel: [drm:radeon_cs_ioctl [radeon]] *ERROR* Invalid command stream !
Mar 26 19:22:23 hawker64 kernel: radeon 0000:02:00.0: ring 0 stalled for more than 10360msec
Mar 26 19:22:23 hawker64 kernel: radeon 0000:02:00.0: GPU lockup (current fence id 0x00000000004ead16 last fence id 0x00000000004eaddd on ring 0)
Mar 26 19:22:23 hawker64 kernel: radeon 0000:02:00.0: failed to get a new IB (-35)
Mar 26 19:22:23 hawker64 kernel: [drm:radeon_cs_ioctl [radeon]] *ERROR* Failed to get ib !
Mar 26 19:22:23 hawker64 kernel: radeon 0000:02:00.0: couldn't schedule ib'
Mar 26 19:22:23 hawker64 kernel: [drm:radeon_uvd_suspend [radeon]] *ERROR* Error destroying UVD (-22)!
Sorry, I have no further logs available beyond the original error posted by the OP; here is what I often see: http://archlinux.uk/misc/gpulockup.html
Thankfully I have not experienced this for some time now; currently on kernel 4.13.9-1-ck-nehalem.
(radeon ring/fence errors)
Comment by mattia (nTia89) - Tuesday, 03 October 2017, 20:01 GMT
is this issue still valid?
Comment by cirrus (cirrus) - Tuesday, 30 January 2018, 13:05 GMT
This issue seems to appear only on certain kernels, for me at least.
Sometimes I won't experience the hangs for a couple of months; rolling back or switching between linux-ck and linux often suffices.
Comment by cirrus (cirrus) - Tuesday, 20 February 2018, 20:19 GMT
still ..
Feb 20 18:09:12 blade kernel: radeon 0000:02:00.0: GPU lockup (current fence id 0x0000000000001f1c last fence id 0x0000000000002015 on ring 0)
Feb 20 18:09:12 blade kernel: radeon 0000:02:00.0: failed to get a new IB (-35)
Feb 20 18:09:12 blade kernel: [drm:radeon_cs_ioctl [radeon]] *ERROR* Failed to get ib !
Feb 20 18:09:12 blade kernel: radeon 0000:02:00.0: Saved 7961 dwords of commands on ring 0.
Feb 20 18:09:12 blade kernel: radeon 0000:02:00.0: GPU softreset: 0x00000019
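
As an aid to reading the "GPU lockup" lines, here is a small sketch (not from the original comment) that extracts the two fence ids and their difference. It rests on an assumption: that "current fence id" is the last fence the GPU signaled and "last fence id" is the last one submitted, so the difference would be the number of fences still outstanding at the time of the lockup. The helper name `pending_fences` is hypothetical.

```python
import re

# Shape of the lockup message as it appears in this task's log excerpts.
LOCKUP_RE = re.compile(
    r"GPU lockup \(current fence id (0x[0-9a-f]+) "
    r"last fence id (0x[0-9a-f]+) on ring (\d+)\)"
)


def pending_fences(line):
    """Return (ring, last_id - current_id) for a lockup line, else None.

    Assumption: the gap between the two ids approximates how many fences
    were submitted but not yet signaled when the lockup was detected.
    """
    match = LOCKUP_RE.search(line)
    if match is None:
        return None
    current_id = int(match.group(1), 16)
    last_id = int(match.group(2), 16)
    return int(match.group(3)), last_id - current_id


# Line copied from the log above.
LINE = (
    "Feb 20 18:09:12 blade kernel: radeon 0000:02:00.0: GPU lockup "
    "(current fence id 0x0000000000001f1c last fence id "
    "0x0000000000002015 on ring 0)"
)

if __name__ == "__main__":
    print(pending_fences(LINE))  # prints (0, 249)
```

Applied to the 26 March lockup line quoted earlier in this thread (0x4ead16 and 0x4eaddd), the same calculation gives a gap of 199.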

-------------------------------------------------------------------------------------------------
Linux blade 4.15.4-1-ck-nehalem #1 SMP PREEMPT Sun Feb 18 09:18:16 EST 2018 x86_64 GNU/Linux

02:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] RV770 [Radeon HD 4870]
Subsystem: PC Partner Limited / Sapphire Technology RV770 [Radeon HD 4870]
Kernel driver in use: radeon
Kernel modules: radeon

ofc it could well be faulty hardware on my end, but my google-fu seems to validate this as an ongoing driver issue. If symptoms persist I'm going to swap out the GPU, or I might well still be moaning here in a decade.
regards.
Comment by sgar (garnica) - Monday, 25 June 2018, 18:25 GMT
I have the same issue running linux-zen 4.17.2-1

01:05.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] RS780L [Radeon 3000]

[drm:radeon_cs_ioctl [radeon]] *ERROR* Invalid command stream !
[drm:radeon_cs_parser_relocs [radeon]] *ERROR* gem object lookup failed >
[drm:radeon_cs_ioctl [radeon]] *ERROR* Failed to parse relocation -2!
radeon 0000:01:05.0: ring 0 stalled for more than 10276msec
Comment by loqs (loqs) - Monday, 25 June 2018, 19:02 GMT
@garnica as you will have noticed, no action has been taken on this bug report in the past 21 months, nor is any likely to be taken in the future.
I suggest you try 4.18-rc2 / amd-staging-drm-next and if the issue is still present there report it upstream and work with upstream on a resolution.
Comment by Archie The Penguin (opus10) - Tuesday, 26 June 2018, 05:35 GMT
>ofc it could well be faulty hardware on my end
It's probably not; I've seen a similar or identical radeon stall error after resuming from suspend-to-RAM over the past months.

Related bug:
https://bugs.archlinux.org/task/55611
Comment by sgar (garnica) - Tuesday, 26 June 2018, 08:43 GMT
Error is gone when downgrading to:

xorg-server 1.19
xf86-video-ati 1:7.10

Related forum threads:
https://bbs.archlinux.org/viewtopic.php?id=237659
https://bbs.archlinux.org/viewtopic.php?pid=1787035
Comment by Maxim (Zeben) - Sunday, 08 July 2018, 14:40 GMT
I've got the same issue on an HP Pavilion G6 laptop. Solved by removing the xf86-video-ati driver; the OS now uses the Mesa-related modesetting driver, I guess, but all acceleration-related things now work...
CPU: AMD Phenom II X4 P960
iGPU: Radeon Mobility 4570M
dGPU: Radeon 6470M
Linux 4.17.3
Xorg: 1.20
