FS#65005 - linux-firmware 20191220 makes Gigabyte X570 Master Motherboard unable to power on

Attached to Project: Arch Linux
Opened by Bob (Bobrolak) - Wednesday, 01 January 2020, 20:32 GMT
Last edited by freswa (frederik) - Thursday, 20 February 2020, 22:11 GMT
Task Type Bug Report
Category Packages: Core
Status Closed
Assigned To No-one
Architecture x86_64
Severity Critical
Priority Normal
Reported Version
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 0
Private No

Details

Description:
After installing linux-firmware 20191220.6871bff-1, booting the Arch kernel, and then powering off the PC, I was not able to power the PC back on.

I thought the motherboard was "bricked": I could not power it on even with the onboard power button. No LED was lit, not even the one on the onboard power button. I checked the PSU in my other PC and it was fine, so it had to be the mobo. I was about to send the motherboard in for RMA, but then I decided to play with it a little bit [1].
After "fixing" it, this "mobo death" happened again and again, until I was finally able to narrow down the cause: it was either the newest version of the Linux kernel I used, 5.4.6.arch3-1, or the linux-firmware 20191220.6871bff-1 package. Downgrading both packages to their previous versions (5.4.6.arch1-1 and 20191215.eefb5f7-1) fixed the problem. The mobo has been working fine for a couple of days now.

Now, I see that, apart from versioning, there were no differences between linux 5.4.6.arch1-1 and 5.4.6.arch3-1.
So this "mobo death" has to be caused by linux-firmware 20191220.6871bff-1.
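
For reference, a downgrade like the one described above is typically done from the local pacman cache. The sketch below only prints the command for review instead of running it. `cached_pkg` is a hypothetical helper, and the cache directory and the `.pkg.tar.xz` suffix are assumptions: the old files may no longer be cached, or may use a different compression suffix.

```shell
# Hypothetical helper: build the pacman cache path for a package file.
# Assumes the default cache directory and the .pkg.tar.xz suffix.
# Arguments: package name, full version, architecture.
cached_pkg() {
    printf '/var/cache/pacman/pkg/%s-%s-%s.pkg.tar.xz\n' "$1" "$2" "$3"
}

# Print the downgrade command for review; run it as root only after
# checking that both files actually exist in the cache.
echo pacman -U \
    "$(cached_pkg linux-firmware 20191215.eefb5f7-1 any)" \
    "$(cached_pkg linux 5.4.6.arch1-1 x86_64)"
```

Downgrading one package at a time, with a few power cycles in between, would help tell the kernel and the firmware apart.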

[1]
1. At first I disconnected the PSU from the power source for a couple of seconds and connected it back - nope, still dead.
2. So I repeated 1. and also disconnected and reconnected the 24-pin ATX power connector - the LEDs lit up!
3. Unfortunately, hitting the power button did not start it; instead the LEDs immediately turned off...
4. So I repeated 2. and tried to clear the CMOS with the onboard button (which was lit then) - didn't help.
5. Finally I repeated 2. and tried removing the battery - located under my graphics card (btw not the best place), so I had to remove the card first - and voila - I was able to start the mobo.


Additional info:
* package version(s)
linux-firmware 20191220.6871bff-1

Hardware:
Mobo: Gigabyte X570 Master, F11 Bios
CPU: Ryzen 3900x
RAM: 2x16GB Corsair Vengeance 3466MHz CL16

Steps to reproduce:
1. Update
2. Boot
3. Power off
4. Try to power on. No way, it's dead.

Closed by  freswa (frederik)
Thursday, 20 February 2020, 22:11 GMT
Reason for closing:  Not a bug
Comment by Bob (Bobrolak) - Wednesday, 01 January 2020, 20:43 GMT
Minor mistake, should be:
Category Packages: Core
Comment by Bob (Bobrolak) - Wednesday, 01 January 2020, 21:46 GMT
Also, before downgrading the linux and linux-firmware packages, I downgraded only the amd-ucode package, but that didn't help.
Comment by Doug Newgard (Scimmia) - Thursday, 02 January 2020, 00:10 GMT
There are plenty of differences between arch1 and arch3, you're making unwarranted assumptions.

If it really is the firmware, that package is simply repackaged blobs from upstream, so it's not a packaging issue.
Comment by loqs (loqs) - Thursday, 02 January 2020, 01:43 GMT
linux-firmware 20191215.eefb5f7-1 to 20191220.6871bff-1 contains from upstream [1] :
* an amd-ucode update split out to amd-ucode 20191220.6871bff-1
* updates to ath10k firmware
not from upstream [2] :
* an update to iwlwifi firmware

[1] https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/log/
[2] https://git.archlinux.org/svntogit/packages.git/commit/trunk?h=packages/linux-firmware&id=4896a1d8d473cf4a6557f6c205b2bd67305796e2
Comment by Bob (Bobrolak) - Saturday, 04 January 2020, 11:14 GMT
@Scimmia: you are right, my assumption that nothing changed between kernel 5.4.6-arch1 and arch3 was wrong. My look at the linux package git log was too quick. So I generated a patch to compare these releases; it was rather small and I didn't notice anything suspicious.
So I decided to update the kernel to 5.4.6.arch3-1 once again, manually, updating no other packages at that time. I performed a couple of poweroffs and everything works fine.

That leaves nothing other than linux-firmware; the problem must be inside it.

If this is only repackaging, can you maybe tell me (or assist? ;)) where (in the upstream) I should knock?
Comment by loqs (loqs) - Saturday, 04 January 2020, 13:16 GMT
Does the system use either the ath10k or iwlwifi drivers?
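
One way to answer this is to filter the loaded-module list for the two driver names. A minimal sketch, assuming `lsmod`-style input where the module name is the first column; `wifi_fw_modules` is a hypothetical helper name:

```shell
# Print loaded modules whose names start with ath10k or iwlwifi.
# Reads lsmod-style text on stdin, so it also works on a saved listing.
# Usage: lsmod | wifi_fw_modules
wifi_fw_modules() {
    awk '$1 ~ /^(ath10k|iwlwifi)/ { print $1 }'
}
```

Empty output means neither driver is loaded, so the firmware files changed for them are unlikely to be in use.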
Comment by Nick (DerHomp) - Sunday, 05 January 2020, 10:02 GMT
I have the same Motherboard (Bios F11) with the 3900x and don't see this issue.
One difference is that I use the ZEN kernel. And 4x8GB 3200 cl14 RAM.
But otherwise I use the same linux-firmware package.

Could it be related to some memory overclocking?
Or do you use PCIe4 devices? I don't have any.
I will check with the standard kernel later if I can reproduce it.
Comment by Nick (DerHomp) - Sunday, 05 January 2020, 12:05 GMT
I tried several times to reproduce the problem with linux-5.4.6.arch3-1 and linux-5.4.7.arch1-1.
I don't see any issues here.

Do you have any special hardware installed?
Did you try with default BIOS settings?
Comment by Bob (Bobrolak) - Sunday, 05 January 2020, 16:06 GMT
@loqs: system uses iwlwifi.

@DerHomp: Thanks for joining the conversation and testing. The fact that you can't reproduce it is good and bad at the same time ;).
No OC here and my BIOS is very close to default (I always enable only SVM virtualization and CPPC). When the first "mobo death" happened I had ERP enabled; later I read it may cause very nasty problems. But disabling it did not resolve the "mobo death" issue.
No PCIe 4 devices. A single GeForce and an Asus Xonar DX in PCI-E slots. Kraken x61 to cool the CPU. Nothing exotic.

lsmod and lspci in attachment.
Comment by Bob (Bobrolak) - Sunday, 05 January 2020, 16:08 GMT
@DerHomp: ah, and also, the first "mobo death" happened after shutting down the system running the standard Arch kernel, but I had the same problems with the zen kernel (as I usually use zen as well).
Comment by loqs (loqs) - Sunday, 05 January 2020, 16:16 GMT
Can you reproduce the issue with the iwlwifi module blacklisted?
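
Blacklisting is usually done with a one-line file under /etc/modprobe.d. A minimal sketch follows; `write_blacklist` and the file name `blacklist-iwlwifi.conf` are hypothetical, and the default target is a scratch directory so nothing on the system changes until it is pointed at /etc/modprobe.d as root.

```shell
# Write a modprobe.d snippet that stops iwlwifi from being auto-loaded.
# The target directory defaults to ./modprobe.d (a scratch location);
# pass /etc/modprobe.d as the first argument, as root, to apply it.
write_blacklist() {
    conf_dir=${1:-./modprobe.d}
    mkdir -p "$conf_dir"
    printf 'blacklist iwlwifi\n' > "$conf_dir/blacklist-iwlwifi.conf"
}
```

After a reboot, the module should no longer appear in the lsmod output if the blacklist took effect.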
Comment by Nick (DerHomp) - Sunday, 05 January 2020, 17:24 GMT
I also have a 1080 TI and a Xonar Essence STX. Quite close to your configuration.
SVM is enabled. CPPC and ERP settings I never changed.
I disabled the onboard sound in BIOS if that makes any difference but otherwise I just changed memory related settings.
Comment by Bob (Bobrolak) - Monday, 06 January 2020, 18:48 GMT
I tried to blacklist iwlwifi and updated linux-firmware once again, but I am unable to reproduce the issue now, even without iwlwifi blacklisted... Even though it was causing "mobo death" so many times in a row before, while powering off from Windows never caused such badly broken behavior.
Maybe a kernel upgrade fixed it, or maybe it was a hardware issue, don't know ¯\_(ツ)_/¯

Maybe at least my fixing procedure will save someone's mobo from RMA, though I am not sure if Arch Linux bugs rank that high on Google.
Feel free to close the task.
Thank you very much for your support, and good luck to all of you.
