FS#61117 - linux-firmware-20181216.211de16 prevents booting with AMD video card

Attached to Project: Arch Linux
Opened by Jean-Patrick Simard (jpsimard) - Tuesday, 18 December 2018, 02:49 GMT
Last edited by Laurent Carlier (lordheavy) - Thursday, 20 December 2018, 06:36 GMT
Task Type Bug Report
Category Packages: Core
Status Closed
Assigned To Laurent Carlier (lordheavy)
Architecture All
Severity Critical
Priority Normal
Reported Version
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 10
Private No

Details

Description:

Upon installing linux-firmware-20181216.211de16, the system blanks to a black screen and becomes unresponsive. Reverting to linux-firmware-20181026.1cb4e51 fixes the issue, and the system boots normally with kernel 4.19.9-arch1-1-ARCH and amd-ucode 20181216.211de16. I found other reports of the problem on the web.

Additional info:
I don't know which logs to provide. I couldn't find any indication of the problem myself with dmesg.


Steps to reproduce:
install linux-firmware-20181216.211de16
reboot

Closed by  Laurent Carlier (lordheavy)
Thursday, 20 December 2018, 06:36 GMT
Reason for closing:  Fixed
Additional comments about closing:  linux-firmware 20181218.0f22c85-1
Comment by Andrew (crossbowffs) - Tuesday, 18 December 2018, 04:13 GMT
Same issue here with an RX Vega 64. I've attached the logs of a failed boot. Interestingly, when I tried booting a second time, I didn't get any errors in the log.
   logs.txt (6.2 KiB)
Comment by Jiawei Zhou (4679kun) - Tuesday, 18 December 2018, 04:31 GMT
Comment by Kyle Devir (QuartzDragon) - Tuesday, 18 December 2018, 04:58 GMT
Cannot reproduce with RX 580.
Comment by Kyle Devir (QuartzDragon) - Tuesday, 18 December 2018, 05:02 GMT
Jean-Patrick, what's your GPU model?
Comment by J. Andrew Lanz-O'Brien (jlanzobr) - Tuesday, 18 December 2018, 05:07 GMT
Happening to me as well, Vega 64. Full specs: https://gist.github.com/jlanzobr/8c4f906ab00ff15b37b37da8d0ab9364
Comment by Maciek Borzecki (bboozzoo) - Tuesday, 18 December 2018, 07:43 GMT
Can confirm the same thing with Vega 56 GPU:
25:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Vega 10 XL/XT [Radeon RX Vega 56/64] (rev c3)
Comment by Jakub Okoński (farnoy) - Tuesday, 18 December 2018, 10:53 GMT
How did you downgrade linux-firmware if it wasn't booting? I did it from a live CD, after arch-chroot, but it was not enough. The downgrade was failing on the post-install hooks in systemd tmpfiles. I downgraded linux{,-docs,-headers} to 4.19.8 and that allowed me to boot.
Comment by Neil Moore (Dar13) - Tuesday, 18 December 2018, 11:29 GMT
Happened to me, RX Vega 56 with Ryzen 1800X. Downgrading linux-firmware fixed the issue (managed to get a SSH session up). dmesg of the bad boot is attached.
Comment by Jean-Patrick Simard (jpsimard) - Tuesday, 18 December 2018, 11:54 GMT
My GPU is a Vega 64. I downgraded from a USB boot disk using arch-chroot.
Comment by loqs (loqs) - Tuesday, 18 December 2018, 12:26 GMT
https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/log/
If you revert the 18.50 update commits, does that resolve the issue?
If so, please contact the committer.
Comment by Ville Aakko (Wild_Penguin) - Tuesday, 18 December 2018, 15:46 GMT
Broke my system, too:

03:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Vega 10 XL/XT [Radeon RX Vega 56/64] (rev c1)

@ #comment175478 :
This is not tech support, but I will mention (since it may be helpful for others affected by this bug): downgrading linux-firmware does not regenerate the initramfs (if you are using one). Reinstalling the kernel (you don't need to downgrade it) will do it, as will manually regenerating it with mkinitcpio.

Also, remember to mount boot partition (if you have one) in your chroot environment.
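
For anyone following along, the recovery steps mentioned in this thread can be sketched roughly as below. The device names, the path of the old package, and the `linux` preset name are assumptions for illustration only; substitute whatever matches your system. The function (a hypothetical helper, not part of any Arch tool) only prints the commands, so they can be reviewed and then run by hand as root from the live USB.

```shell
#!/bin/sh
# Sketch only: prints the recovery commands instead of executing them.
# Arguments (all hypothetical examples, adjust to your system):
#   $1 root partition
#   $2 boot partition (pass an empty string if /boot is not separate)
#   $3 path to the old linux-firmware package, as seen inside the chroot
print_recovery_plan() {
    root_dev=$1
    boot_dev=$2
    old_pkg=$3

    echo "mount $root_dev /mnt"
    # A separate boot partition has to be mounted as well, otherwise the
    # regenerated initramfs is written to the root filesystem instead.
    if [ -n "$boot_dev" ]; then
        echo "mount $boot_dev /mnt/boot"
    fi
    echo "arch-chroot /mnt pacman -U $old_pkg"
    # Per the comment above, the firmware downgrade alone does not
    # regenerate the initramfs, so do it explicitly.
    echo "arch-chroot /mnt mkinitcpio -p linux"
    echo "umount -R /mnt"
}
```

Example call, assuming the old package is still in the pacman cache under this name: `print_recovery_plan /dev/sdX2 /dev/sdX1 /var/cache/pacman/pkg/linux-firmware-20181026.1cb4e51-1-any.pkg.tar.xz`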
Comment by Jakub Okoński (farnoy) - Tuesday, 18 December 2018, 15:56 GMT
@ #comment175485:
I have done that, I even booted it up using 4.19.8 and old firmware and tried reinstalling again the older linux-firmware with 4.19.9 and it still failed to boot for me. It seems I am the only one who needs both a downgrade for linux-firmware and the kernel itself.
Comment by Ville Aakko (Wild_Penguin) - Tuesday, 18 December 2018, 16:12 GMT
Hi Jakub,

Maybe there are separate issues?

Better open a discussion at Arch forums, as this is not the right place.

FWIW:
$ uname -r
4.19.9-arch1-1-ARCH
$ pacman -Qs linux-firmware
local/linux-firmware 20181026.1cb4e51-1 (base)
Firmware files for Linux

Works fine here.
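
To make the comparison above easier to repeat, here is a small sketch that classifies a linux-firmware version string using only the versions named in this thread. The function name is made up for illustration, and the "reported working" label reflects the Vega reports here only, not the separate kernel-side issue some users describe.

```shell
#!/bin/sh
# Classify a linux-firmware package version based on reports in this thread.
# $1: version as printed by `pacman -Q linux-firmware`, for example
#     20181026.1cb4e51-1 (the trailing -1 is the package release).
firmware_status() {
    upstream=${1%-*}    # strip the package release suffix
    case "$upstream" in
        20181216.211de16) echo "affected" ;;
        20181026.1cb4e51|20181218.0f22c85) echo "reported working" ;;
        *) echo "unknown" ;;
    esac
}
```

On an installed system it could be fed from pacman, e.g. `firmware_status "$(pacman -Q linux-firmware | cut -d' ' -f2)"`.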
Comment by velemas (velemas) - Tuesday, 18 December 2018, 16:41 GMT
I have the same situation as Jakub. I downgraded to the old firmware and still get a blank screen with the new kernel. Conversely, the new firmware works with the old kernel.

My system is Asus ROG GL702ZC with RX 580:

description: VGA compatible controller
product: Ellesmere [Radeon RX 470/480/570/570X/580/580X]
vendor: Advanced Micro Devices, Inc. [AMD/ATI]
physical id: 0
bus info: pci@0000:0c:00.0
version: c1
width: 64 bits
clock: 33MHz
capabilities: vga_controller bus_master cap_list rom
configuration: driver=amdgpu latency=0


Comment by loqs (loqs) - Tuesday, 18 December 2018, 17:30 GMT
Please try building linux-firmware from the attached PKGBUILD. Is the issue still present?
If it is, try uncommenting the three commented-out git reverts in prepare() and see whether the issue is still present.
   PKGBUILD (3.1 KiB)
Comment by Noah Stegmaier (ef004) - Tuesday, 18 December 2018, 17:45 GMT
As a workaround, appending amdgpu.dc=0 to the kernel parameters seems to work with the newest kernel+firmware.
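
How the parameter gets added depends on the boot loader (with GRUB, for a one-off test, you can press `e` in the menu and append it to the `linux` line). To confirm that the running kernel actually received it, a check along these lines may help. The helper name is hypothetical; `/proc/cmdline` is the standard place to read the booted command line.

```shell
#!/bin/sh
# Returns 0 if the kernel command line contains the exact parameter.
# $1: command line string, $2: parameter such as amdgpu.dc=0
cmdline_has_param() {
    # Pad with spaces so only whole, space-separated words match.
    case " $1 " in
        *" $2 "*) return 0 ;;
        *) return 1 ;;
    esac
}
```

After rebooting with the parameter: `cmdline_has_param "$(cat /proc/cmdline)" amdgpu.dc=0 && echo "amdgpu.dc=0 is active"`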
Comment by velemas (velemas) - Tuesday, 18 December 2018, 18:16 GMT
@ loqs The PKGBUILD does not help, with or without the reverts.

@ Noah thank you. It helps, I was thinking to try it too.

Suspicious commits are:

https://github.com/torvalds/linux/commit/3374518d4d1ae021dc60885f4ea8dca0a3ca290f
https://github.com/torvalds/linux/commit/318f6e599dcd31a8de5756e4a6e01012aa7669cb
Comment by Daenney (daenney) - Tuesday, 18 December 2018, 18:42 GMT
Setting amdgpu.dc=0 in my case just causes the boot to hang. Granted, I do have video now, but no working system. I had to resort to an Arch USB stick, downgrade the linux-firmware package, and then run mkinitcpio to rebuild the initramfs before I could get it to boot.
Comment by loqs (loqs) - Tuesday, 18 December 2018, 18:46 GMT
@velemas That is somewhat understandable, as your issue and farnoy's appear to be triggered by the upgrade from 4.19.8 to 4.19.9.
Are you going to try reverting the commits you highlighted or do a bisection between 4.19.8 and 4.19.9?
Comment by Jean-Patrick Simard (jpsimard) - Tuesday, 18 December 2018, 20:21 GMT
@loqs : I used your suggested PKGBUILD and my system now boots normally.
Comment by loqs (loqs) - Tuesday, 18 December 2018, 20:47 GMT
https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/commit/?id=0f22c8527439eaaf5c3fcf87b31c89445b6fa84d
reverts "amdgpu: update vega10 fw for 18.50 release" because it causes hangs.
That should resolve the firmware issue with the next linux-firmware release.
Comment by Phillip Schichtel (pschichtel) - Tuesday, 18 December 2018, 21:12 GMT
I'd say the 18.50 firmware is intended for the newer AMDGPU version, given that it works just fine for me after switching to linux-mainline.
Comment by velemas (velemas) - Tuesday, 18 December 2018, 21:44 GMT
@loqs I am bisecting now.
Comment by velemas (velemas) - Wednesday, 19 December 2018, 01:04 GMT
Comment by loqs (loqs) - Wednesday, 19 December 2018, 01:28 GMT
That commit fixed an issue another Arch user had: https://bugs.freedesktop.org/show_bug.cgi?id=108542
You can either file a bug at https://bugs.freedesktop.org (Product: DRI, Component: DRM/AMDgpu) or
email amd-gfx@lists.freedesktop.org, CC nicholas.kazlauskas@amd.com alexander.deucher@amd.com harry.wentland@amd.com

As an aside, does the issue still exist in 4.20-rc7?
Comment by Arsalan (afzalarsalan) - Wednesday, 19 December 2018, 02:11 GMT
Okay, so pulling the latest version of linux-firmware.git should fix the issue, since upstream has reverted the firmware commit themselves. The best solution IMO is to just package linux-firmware and cherry-pick the upstream revert.
Comment by velemas (velemas) - Wednesday, 19 December 2018, 11:17 GMT
@loqs I must say there is no issue in 4.20-rc7. Should I still report upstream or just stick with my custom built 4.19.10 until 4.20 arrives?
Comment by loqs (loqs) - Wednesday, 19 December 2018, 12:07 GMT
Up to you. 4.19 will become the next linux-lts. It seems as if something else needs backporting from 4.20 to 4.19.
Comment by Laurent Carlier (lordheavy) - Wednesday, 19 December 2018, 16:45 GMT
linux-firmware-20181218.0f22c85-1 is now in testing
Comment by Plex (plexor) - Thursday, 20 December 2018, 04:16 GMT
@lordheavy Installed linux-firmware-20181218.0f22c85-1 and now I'm able to boot with my AMD Vega 64. Thanks for the fix
Comment by Jiawei Zhou (4679kun) - Thursday, 20 December 2018, 04:23 GMT
Vega 56 works with linux-firmware-20181218.0f22c85-1.
