FS#47004 - [glibc] weird Intel Skylake microcode causes apps to core dump when "lock-elision" is used

Attached to Project: Arch Linux
Opened by Jörg Stettner (jost5367) - Sunday, 08 November 2015, 19:13 GMT
Last edited by Allan McRae (Allan) - Sunday, 08 November 2015, 23:51 GMT
Task Type: Bug Report
Category: Packages: Core
Status: Closed
Assigned To: No-one
Architecture: x86_64
Severity: High
Priority: Normal
Reported Version:
Due in Version: Undecided
Due Date: Undecided
Percent Complete: 100%
Votes: 0
Private: No

Details

After a hardware upgrade (new mainboard + CPU) on an existing Linux installation, many applications crash and dump core with a "general protection" trap. Example:
"traps: manjaro-setting[2091] general protection ip:7f10d10397e0 sp:7ffda2e13708 error:0 in libpthread-2.22.so[7f10d1027000+18000]"
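
For anyone triaging similar crashes, the trap line can be picked apart mechanically. The sketch below is only an illustration: the regular expression is an assumption modelled on the single line quoted above, and `parse_trap` is a hypothetical helper name. It computes where the faulting instruction pointer lies relative to the start of the mapped library.

```python
import re

# Pattern modelled on the one "traps:" line quoted in this report (an assumption,
# not a complete grammar for every kernel trap message).
TRAP_RE = re.compile(
    r"traps: (?P<comm>\S+)\[(?P<pid>\d+)\] (?P<kind>.+?) "
    r"ip:(?P<ip>[0-9a-f]+) sp:(?P<sp>[0-9a-f]+) error:(?P<error>[0-9a-f]+)"
    r" in (?P<obj>\S+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def parse_trap(line):
    """Return the fields of a trap line plus the ip offset into the mapped object."""
    m = TRAP_RE.search(line)
    if m is None:
        return None
    ip = int(m["ip"], 16)
    base = int(m["base"], 16)
    size = int(m["size"], 16)
    return {
        "comm": m["comm"],
        "pid": int(m["pid"]),
        "kind": m["kind"],
        "object": m["obj"],
        "offset": ip - base,                      # position inside the mapping
        "in_range": base <= ip < base + size,     # sanity check on the mapping
    }


line = ("traps: manjaro-setting[2091] general protection ip:7f10d10397e0 "
        "sp:7ffda2e13708 error:0 in libpthread-2.22.so[7f10d1027000+18000]")
info = parse_trap(line)
print(info["object"], hex(info["offset"]))  # libpthread-2.22.so 0x127e0
```

For the line above, the instruction pointer sits 0x127e0 bytes into the libpthread-2.22.so mapping, which is the figure to compare against a disassembly of that library.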

Many other applications are affected, both system services and user programs. The number and frequency of core dumps effectively render the system unusable!
After some research, the symptoms resemble an error reported earlier for Intel Haswell CPUs, related to a bug in the microcode responsible for executing the HLE ("hardware lock elision") prefixes.
My initrd boot config includes intel-ucode, but apparently no microcode update for Skylake is available yet.
My hardware consists of an MSI H170A GAMING PRO mainboard and an Intel Skylake i5-6500 CPU.
I tried various system configurations, including kernels 4.2.5 and 4.3.0, and different BIOS versions that loaded microcode revision 0x33 or 0x49. The errors persisted.
As a last resort, I recompiled the current glibc and lib32-glibc sources (from "testing") with the "--enable-lock-elision=no" option, installed the resulting packages, rebooted, and all such errors are gone!
The MSI support hotline says they are using the latest microcode (0x49) provided by Intel and cannot do anything else about it for the time being.
Wouldn't it be reasonable to suspend the use of HLE until this issue can be solved with newer microcode?
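
To see whether a given machine advertises the transactional features at all, the `flags` line of /proc/cpuinfo can be checked for `hle` and `rtm`. This is a minimal sketch; `tsx_flags` is a hypothetical helper name, and it only reports what the kernel lists, not whether glibc actually elides locks on that system.

```python
def tsx_flags(cpuinfo_text):
    """Report whether 'hle' and 'rtm' appear in the first 'flags' line of cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            flags = set(value.split())
            return {name: name in flags for name in ("hle", "rtm")}
    # No 'flags' line found (e.g. non-x86 system or empty input).
    return {"hle": False, "rtm": False}


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            print(tsx_flags(f.read()))
    except OSError:
        print("/proc/cpuinfo is not readable on this system")
```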

Package version: 2.22-3

Extract from dmesg output, describing system config:
[ 0.000000] Linux version 4.2.5-1 (gcc version 5.2.0 (GCC) ) #1 SMP PREEMPT Tue Oct 27 22:56:28 UTC 2015
[ 0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-4.2-x86_64 root=UUID=425dd282-019a-4b91-8cac-ba0193faf1cb rw resume=UUID=xxxx
[ 0.000000] DMI: MSI MS-7978/H170A GAMING PRO (MS-7978), BIOS 2.30 09/07/2015
[ 0.076915] smpboot: CPU0: Intel(R) Core(TM) i5-6500 CPU @ 3.20GHz (fam: 06, model: 5e, stepping: 03)
[ 0.076929] Performance Events: no PEBS fmt3+, generic architected perfmon, full-width counters, Intel PMU driver.
[ 0.076933] ... version: 4
[ 0.076934] ... bit width: 48
[ 0.076934] ... generic registers: 8
[ 0.076935] ... value mask: 0000ffffffffffff
[ 0.076935] ... max period: 0000ffffffffffff
[ 0.076936] ... fixed-purpose events: 3
[ 0.076936] ... event mask: 00000007000000ff
[ 0.090360] x86: Booting SMP configuration:
[ 0.090361] .... node #0, CPUs: #1
[ 0.094532] NMI watchdog: enabled on all CPUs, permanently consumes one hw-PMU counter.
[ 0.097918] #2 #3
[ 0.109710] x86: Booted up 1 node, 4 CPUs
[ 0.109713] smpboot: Total of 4 processors activated (25538.93 BogoMIPS)
[ 0.256544] Unpacking initramfs...
[...]
[ 0.297896] microcode: CPU0 sig=0x506e3, pf=0x2, revision=0x33
[ 0.297903] microcode: CPU1 sig=0x506e3, pf=0x2, revision=0x33
[ 0.297914] microcode: CPU2 sig=0x506e3, pf=0x2, revision=0x33
[ 0.297924] microcode: CPU3 sig=0x506e3, pf=0x2, revision=0x33
[ 0.297970] microcode: Microcode Update Driver: v2.00 <tigran@aivazian.fsnet.co.uk>, Peter Oruba
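
The microcode lines above can be cross-checked against the smpboot line: the CPUID signature 0x506e3 encodes the same family, model and stepping. The sketch below follows the standard CPUID leaf 1 layout; the regular expression is an assumption modelled on the dmesg lines in this report, and the function names are hypothetical.

```python
import re

# Modelled on the "microcode: CPUn sig=..., pf=..., revision=..." lines above.
MICROCODE_RE = re.compile(
    r"microcode: CPU(?P<cpu>\d+) sig=0x(?P<sig>[0-9a-f]+), "
    r"pf=0x(?P<pf>[0-9a-f]+), revision=0x(?P<rev>[0-9a-f]+)"
)


def decode_signature(sig):
    """Split a CPUID leaf 1 EAX signature into (family, model, stepping)."""
    stepping = sig & 0xF
    base_model = (sig >> 4) & 0xF
    base_family = (sig >> 8) & 0xF
    ext_model = (sig >> 16) & 0xF
    ext_family = (sig >> 20) & 0xFF
    # The extended fields only contribute for certain base family values.
    family = base_family + ext_family if base_family == 0xF else base_family
    if base_family in (0x6, 0xF):
        model = (ext_model << 4) | base_model
    else:
        model = base_model
    return family, model, stepping


def microcode_revisions(dmesg_text):
    """Map CPU number to the loaded microcode revision found in dmesg text."""
    return {int(m["cpu"]): int(m["rev"], 16)
            for m in MICROCODE_RE.finditer(dmesg_text)}


dmesg = """\
[ 0.297896] microcode: CPU0 sig=0x506e3, pf=0x2, revision=0x33
[ 0.297903] microcode: CPU1 sig=0x506e3, pf=0x2, revision=0x33
"""
family, model, stepping = decode_signature(0x506E3)
print(f"family {family:#x} model {model:#x} stepping {stepping}")
print({cpu: hex(rev) for cpu, rev in microcode_revisions(dmesg).items()})
```

The decoded values (family 6, model 0x5e, stepping 3) agree with the "fam: 06, model: 5e, stepping: 03" reported by smpboot, and every listed core shows revision 0x33.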


Closed by Allan McRae (Allan)
Sunday, 08 November 2015, 23:51 GMT
Reason for closing: Duplicate
Additional comments about closing: FS#46064
Comment by Doug Newgard (Scimmia) - Sunday, 08 November 2015, 19:19 GMT
nvidia drivers?
Comment by Jörg Stettner (jost5367) - Sunday, 08 November 2015, 19:59 GMT
Yes, current version 352.55, but the core dumps persist just the same with the nouveau driver, and (with kernel 4.3) even without the Nvidia card at all (i.e. using the on-board Intel GPU).
Comment by Jörg Stettner (jost5367) - Sunday, 08 November 2015, 21:18 GMT
