FS#57067 - [intel-ucode] Problems on Haswell, Broadwell and KabyLake
Attached to Project:
Arch Linux
Opened by Peter Weber (hoschi) - Friday, 12 January 2018, 10:56 GMT
Last edited by Christian Hesse (eworm) - Wednesday, 14 March 2018, 21:41 GMT
Details
Description:
Hello!
The new microcode for several CPUs seems to cause undesired reboots, page faults and system hangs:
https://newsroom.intel.com/news/intel-security-issue-update-addressing-reboot-issues/
https://support.lenovo.com/de/de/solutions/len-18282
Additional info:
* package version(s): 20180108-1
We cannot fix that, and not providing the updates is also a problem. Maybe we should print a warning message? If not, at least this bug can tell users about the problems, and they can decide on their own whether to apply the updated microcode.
Thanks
Closed by Christian Hesse (eworm)
Wednesday, 14 March 2018, 21:41 GMT
Reason for closing: Fixed
Additional comments about closing: intel-ucode 20180312-1
https://access.redhat.com/solutions/3315431?sc_cid=701f2000000tsLNAAY&
No congratulations to Intel for this achievement.
In my opinion, this version should be put in [testing], and [extra] should provide the 20171117 release, like everyone else does.
Now, as the Arch kernel is not using any features exposed by SPEC_CTRL, I would not have expected your system to experience such issues.
That is the same basis the kernel developers seem to be working on, not that the affected microcodes should never be used.
Are you dual booting with another OS? The only other possibility I can think of is something in userspace was using those features.
Arch could release a new version using the epoch feature to force a downgrade.
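To illustrate why an epoch would force the downgrade, here is a minimal sketch of the ordering rule: a higher epoch wins before the rest of the version is even looked at. This is not pacman's actual comparison code. The function names are hypothetical, and the date-stamped `pkgver-pkgrel` parsing is a simplifying assumption that only fits versions shaped like the intel-ucode ones in this task.

```python
from typing import Tuple


def split_version(version: str) -> Tuple[int, int, int]:
    """Split 'epoch:pkgver-pkgrel' into integers; a missing epoch counts as 0.

    Assumption: pkgver is a plain date stamp such as 20171117 and pkgrel is
    an integer. Real package versions can be far more irregular.
    """
    epoch_part, sep, rest = version.partition(":")
    epoch = int(epoch_part) if sep else 0
    if not sep:
        rest = version
    pkgver, _, pkgrel = rest.rpartition("-")
    return epoch, int(pkgver), int(pkgrel)


def is_upgrade(candidate: str, installed: str) -> bool:
    """Return True if `candidate` sorts after `installed`.

    Tuples compare element by element, so the epoch decides first.
    """
    return split_version(candidate) > split_version(installed)


# Without an epoch the old release is simply older and would not be installed.
print(is_upgrade("20171117-1", "20180108-1"))
# With epoch 1 the same old release sorts above the newer one.
print(is_upgrade("1:20171117-1", "20180108-1"))
```

The cost of this approach is that the epoch has to stay in every later version of the package, which is one reason it is normally used sparingly.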
I am using the linux-zen kernel, but if I remember correctly, I had the same issues with vanilla (4.14.12 and 4.14.13).
Edit: and, FYI, after multiple reboots from a live CD to recover my filesystem, which kept corrupting itself during my retries to understand what was going on, what I did was just install the old ucode. Everything is OK now.
I can try again with the latest microcode and the vanilla kernel to be sure it's not related to Zen. (after forcing a backup :p)
Edit 2: OK, tested on vanilla 4.14.15 with the latest ucode, and I got a lot of kernel errors. Some apps were freezing, sometimes they could not use the network (followed by errors in a kernel thread related to the network stack), etc.
If the issue persists, upstream would seem to be mistaken about their proposed fix being effective.
I don't know if this patch specifically fixes the issue for me, but something between 4.14 and this commit did.
Thanks loqs!
Arch should downgrade to 20171117; most distros already did.
Arch is supposed to follow upstream, not roll its own hacks. There is the linux-lts package, which will stay on 4.14 for a long time. Intel will release new microcode when it's ready.
build tested set could probably be reduced, but for a simple test I just kept all patches from the PTI merge up until the patch required to disable IBRS on the affected microcodes.
As a minimal demonstration I stopped the backports without those additional patches.
in grub.cfg (initrd /intel-ucode.img /initramfs-linux.img).
With the following initrd line, 4.15.2 works perfectly:
initrd /initramfs-linux.img
Reverting to the previous intel-ucode package version does NOT solve the issue, so I do not know whether the issue is related to intel-ucode itself or to how it is handled from kernel 4.15 onwards.
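When it is unclear whether a changed initrd line or a package downgrade actually took effect, one thing worth checking is which microcode revision the running kernel reports. The sketch below reads the `microcode` field that x86 Linux kernels expose in /proc/cpuinfo. The function names are hypothetical, and it makes no claim about which revisions are affected; it only shows what is currently loaded.

```python
import re
from typing import Set

# Matches lines such as "microcode\t: 0x22" in /proc/cpuinfo-style text.
_MICROCODE_LINE = re.compile(r"^microcode\s*:\s*(0x[0-9a-fA-F]+)\s*$", re.MULTILINE)


def microcode_revisions(cpuinfo_text: str) -> Set[int]:
    """Return the distinct microcode revisions found in the given text."""
    return {int(m.group(1), 16) for m in _MICROCODE_LINE.finditer(cpuinfo_text)}


def read_loaded_revisions(path: str = "/proc/cpuinfo") -> Set[int]:
    """Read the revisions reported by the running kernel (x86 Linux only)."""
    with open(path) as handle:
        return microcode_revisions(handle.read())


def format_revisions(revisions: Set[int]) -> str:
    """Render revisions as sorted hex strings for before/after comparison."""
    return ", ".join(hex(r) for r in sorted(revisions)) or "no microcode field found"
```

Running `print(format_revisions(read_loaded_revisions()))` once with and once without /intel-ucode.img on the initrd line would show whether the early microcode update is being applied at all.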