FS#53112 - [linux] MCE displayed during boot since 4.10.1-1-ARCH

Attached to Project: Arch Linux
Opened by Johannes Maibaum (jmx) - Tuesday, 28 February 2017, 14:22 GMT
Last edited by Doug Newgard (Scimmia) - Saturday, 08 April 2017, 21:13 GMT
Task Type Bug Report
Category Packages: Testing
Status Closed
Assigned To No-one
Architecture All
Severity Low
Priority Normal
Reported Version
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 4
Private No

Details

On my Dell XPS 13 9360, since kernel 4.10.1-1-ARCH hit testing, I see the following error messages during early boot:

kernel: mce: CPU supports 8 MCE banks
kernel: mce: [Hardware Error]: Machine check events logged
[... skipped a few messages here ...]
kernel: smpboot: CPU0: Intel(R) Core(TM) i7-7500U CPU @ 2.70GHz (family: 0x6, model: 0x8e, stepping: 0x9)
kernel: mce: [Hardware Error]: CPU 0: Machine Check: 0 Bank 6: ee0000000040110a
kernel: mce: [Hardware Error]: TSC 0 ADDR fef1ff40 MISC 47880018086
kernel: mce: [Hardware Error]: PROCESSOR 0:806e9 TIME 1488290322 SOCKET 0 APIC 0 microcode 42
kernel: mce: [Hardware Error]: CPU 0: Machine Check: 0 Bank 7: ee0000000040110a
kernel: mce: [Hardware Error]: TSC 0 ADDR fef1ce40 MISC 7880018086
kernel: mce: [Hardware Error]: PROCESSOR 0:806e9 TIME 1488290322 SOCKET 0 APIC 0 microcode 42

When downgrading to 4.9.11-1-ARCH from current core, only the first two posted lines appear, not the more verbose error information.
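
For reference, the status word in the "Machine Check" lines can be decoded by hand. The following is a minimal sketch, assuming the architectural IA32_MCi_STATUS bit layout described in the Intel SDM (flag bits 63 down to 57, model-specific code in bits 31:16, MCA error code in bits 15:0). The names FLAGS and decode_mci_status are made up for illustration and do not come from any existing tool.

```python
# Flag bit positions, assuming the architectural IA32_MCi_STATUS layout.
FLAGS = {
    "VAL": 63,    # register contents are valid
    "OVER": 62,   # an earlier error was overwritten
    "UC": 61,     # error was not corrected
    "EN": 60,     # reporting was enabled for this error
    "MISCV": 59,  # MISC register holds valid data
    "ADDRV": 58,  # ADDR register holds valid data
    "PCC": 57,    # processor context may be corrupt
}


def decode_mci_status(status: int) -> dict:
    """Split a 64-bit status word into flag bits and error codes."""
    fields = {name: bool((status >> bit) & 1) for name, bit in FLAGS.items()}
    fields["mca_error_code"] = status & 0xFFFF
    fields["model_specific_code"] = (status >> 16) & 0xFFFF
    return fields


if __name__ == "__main__":
    # Status value copied from the Bank 6 / Bank 7 lines above.
    for name, value in decode_mci_status(0xEE0000000040110A).items():
        print(name, value if isinstance(value, bool) else hex(value))
```

For ee0000000040110a this reports every flag set except EN, with MCA error code 0x110a and model-specific code 0x40. Reading EN as clear might fit an event that was logged before the kernel enabled reporting, but that is my interpretation, not something the log itself states.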

I did a little Googling and found the LKML thread [1], from which I learned that other XPS 9360 users are apparently affected by similar error messages when booting various Linux kernels. The OP there had issues with both 4.8 and 4.9 kernels. As already stated, I get the verbose error messages only when booting 4.10.1-1 from testing, not when booting 4.9.11-1 from core. Since I see the first two posted lines on both kernels, has the log verbosity perhaps changed between 4.9 and 4.10?

Just for reference: the OP from [1] finally stated in [2] that installing Dell's XPS firmware version 1.3.2 made the error messages disappear for him. I am also on firmware version 1.3.2, and still see the mce errors.

Although the LKML thread indicates that this might be neither a genuine hardware error nor a Linux kernel issue, but rather a firmware problem on Dell's side, I wanted to report it here, since I see this significant change in the logs between kernels 4.9.11 and 4.10.1.

Apart from the logged errors, everything still seems to work fine with 4.10.1. I did not face any severe issues.


[1] https://lkml.org/lkml/2017/1/4/559
[2] https://lkml.org/lkml/2017/1/27/477

Closed by  Doug Newgard (Scimmia)
Saturday, 08 April 2017, 21:13 GMT
Reason for closing:  None
Additional comments about closing:  Either a hardware or firmware problem, nothing related to Arch
Comment by Mike Cloaked (mcloaked) - Saturday, 11 March 2017, 18:53 GMT
I am getting the same on my laptop (Lenovo Y510p):

Mar 11 18:41:35 localhost kernel: smpboot: CPU0: Intel(R) Core(TM) i7-4700MQ CPU @ 2.40GHz (family: 0x6, model: 0x3c, stepping: 0x3)
Mar 11 18:41:35 localhost kernel: mce: [Hardware Error]: CPU 0: Machine Check: 0 Bank 5: ee0000000040110a
Mar 11 18:41:35 localhost kernel: mce: [Hardware Error]: TSC 0 ADDR ffa874c0 MISC b8a0000086
Mar 11 18:41:35 localhost kernel: mce: [Hardware Error]: PROCESSOR 0:306c3 TIME 1489257693 SOCKET 0 APIC 0 microcode 20

This also looks like https://lkml.org/lkml/2017/2/17/74
Comment by liara (liara) - Sunday, 12 March 2017, 09:33 GMT
Similar here with 4.10.1-1-ARCH on a Lenovo ideapad Z710:

kernel: smpboot: CPU0: Intel(R) Core(TM) i5-4200M CPU @ 2.50GHz (family: 0x6, model: 0x3c, stepping: 0x3)
kernel: mce: [Hardware Error]: CPU 0: Machine Check: 0 Bank 5: ae0000000040110a
kernel: mce: [Hardware Error]: TSC 0 ADDR ffb072c0 MISC 38a0000086
kernel: mce: [Hardware Error]: PROCESSOR 0:306c3 TIME 1489310083 SOCKET 0 APIC 0 microcode 20
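
The status values posted so far differ only in the leading byte (ee vs. ae). A small sketch for pulling CPU, bank and status out of such lines and listing which bits differ follows. The regular expression is written against the lines quoted in this report only, and the names MCE_LINE, parse_mce_line and differing_bits are hypothetical.

```python
import re

# Matches the "Machine Check" lines as quoted in this report (an assumption
# about the format; other kernels may print them differently).
MCE_LINE = re.compile(
    r"CPU (?P<cpu>\d+): Machine Check: \d+ "
    r"Bank (?P<bank>\d+): (?P<status>[0-9a-f]+)"
)


def parse_mce_line(line: str):
    """Return (cpu, bank, status) or None if the line does not match."""
    m = MCE_LINE.search(line)
    if m is None:
        return None
    return int(m["cpu"]), int(m["bank"]), int(m["status"], 16)


def differing_bits(a: int, b: int) -> list:
    """List the bit positions (0..63) where two status words differ."""
    diff = a ^ b
    return [bit for bit in range(64) if (diff >> bit) & 1]


if __name__ == "__main__":
    y510p = parse_mce_line(
        "kernel: mce: [Hardware Error]: CPU 0: Machine Check: 0 Bank 5: ee0000000040110a"
    )
    z710 = parse_mce_line(
        "kernel: mce: [Hardware Error]: CPU 0: Machine Check: 0 Bank 5: ae0000000040110a"
    )
    print(differing_bits(y510p[2], z710[2]))
```

The two values differ only in bit 62, which in the architectural IA32_MCi_STATUS layout is the overflow (OVER) flag. If that reading is right, the reports would describe the same error code, with one machine having logged more than one event in that bank.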
Comment by Matjaž (rationalperseus) - Wednesday, 22 March 2017, 09:50 GMT
The same error occurs with 4.10.3-1-ARCH (and still persists on 4.10.6-1) on a Dell Inspiron 5567 (no need to paste the log, since it's basically the same).
Comment by Hugh Smalley (hsmalley) - Wednesday, 22 March 2017, 17:11 GMT
I'm having the same thing on Dell Inspiron Gaming 7559

- Intel(R) Core(TM) i7-7700HQ
- 4.10.4-1-ARCH
Comment by Borche Petrovski (bohap) - Monday, 03 April 2017, 18:59 GMT
The same error occurs on a Dell Inspiron 5567.
