FS#43009 - Enabling lock elision in glibc causes illegal instruction crashes on non-Haswell Intel CPUs

Attached to Project: Arch Linux
Opened by David Anderson (danderson) - Thursday, 04 December 2014, 21:29 GMT
Last edited by Doug Newgard (Scimmia) - Friday, 05 December 2014, 00:32 GMT
Task Type Bug Report
Category Packages: Core
Status Closed
Assigned To No-one
Architecture x86_64
Severity High
Priority Normal
Reported Version
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 0
Private No

Details

Description:

When compiled with --enable-lock-elision, glibc 2.20 unconditionally issues the 'xend' instruction in pthread_mutex_unlock. This causes programs to crash with SIGILL on non-Haswell Intel CPUs, because they don't implement the TSX instruction set extension that defines 'xend'.
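Whether a given machine advertises the relevant TSX feature can be checked from user space. The sketch below is only an illustration and is not how glibc performs its own detection. It assumes GCC or Clang on x86 (for `<cpuid.h>`) and relies on the documented CPUID layout, where leaf 7, subleaf 0 reports RTM (the part of TSX that defines `xend`) in bit 11 of EBX. The helper names `ebx_reports_rtm` and `cpu_has_rtm` are hypothetical.

```c
#include <stddef.h>
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#endif

/* CPUID.(EAX=7,ECX=0):EBX bit 11 is the RTM feature flag. */
#define RTM_EBX_BIT 11u

/* Hypothetical helper: extract the RTM flag from a raw EBX value. */
static int ebx_reports_rtm(unsigned int ebx)
{
    return (int)((ebx >> RTM_EBX_BIT) & 1u);
}

/* Hypothetical helper: returns 1 if the CPU advertises RTM, else 0. */
static int cpu_has_rtm(void)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;

    /* Leaf 7 must exist before it can be queried. */
    if (__get_cpuid_max(0, NULL) < 7)
        return 0;

    __cpuid_count(7, 0, eax, ebx, ecx, edx);
    return ebx_reports_rtm(ebx);
#else
    /* Not an x86 build: TSX does not apply. */
    return 0;
#endif
}
```

Calling `cpu_has_rtm()` from a small `main` and printing the result should show 0 on the affected CPUs. On Linux, the same information appears as the `rtm` flag in `/proc/cpuinfo`.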

Obviously, the fix for glibc itself should be done upstream (I don't see any relevant bugs in their tracker, so I'm going to go file one after this). In the meantime, Arch could remove --enable-lock-elision from the glibc PKGBUILD to work around the issue, at the cost of degraded performance on Haswell CPUs.

Fedora is also tracking this bug, though they don't seem to be working on an upstream fix; they just disabled lock elision. See https://bugzilla.redhat.com/show_bug.cgi?id=1146967 and https://bugzilla.redhat.com/show_bug.cgi?id=1144794

Steps to reproduce:

The only reproduction I have so far is an annoying one: build Ceph using my PKGBUILD here: https://github.com/danderson/packages-archlinux/tree/master/aur/ceph , then run `ceph -s`. I'm working on a short and sweet C reproduction and will post it when I have it.

Closed by  Doug Newgard (Scimmia)
Friday, 05 December 2014, 00:32 GMT
Reason for closing:  Duplicate
Additional comments about closing:   FS#43010 
