FS#53112 - [linux] MCE displayed during boot since 4.10.1-1-ARCH
Attached to Project: Arch Linux
Opened by Johannes Maibaum (jmx) - Tuesday, 28 February 2017, 14:22 GMT
Last edited by Doug Newgard (Scimmia) - Saturday, 08 April 2017, 21:13 GMT
Details
On my Dell XPS 13 9360, since kernel 4.10.1-1-ARCH hit
testing, I see the following error messages during early
boot:
kernel: mce: CPU supports 8 MCE banks
kernel: mce: [Hardware Error]: Machine check events logged
[... skipped a few messages here ...]
kernel: smpboot: CPU0: Intel(R) Core(TM) i7-7500U CPU @ 2.70GHz (family: 0x6, model: 0x8e, stepping: 0x9)
kernel: mce: [Hardware Error]: CPU 0: Machine Check: 0 Bank 6: ee0000000040110a
kernel: mce: [Hardware Error]: TSC 0 ADDR fef1ff40 MISC 47880018086
kernel: mce: [Hardware Error]: PROCESSOR 0:806e9 TIME 1488290322 SOCKET 0 APIC 0 microcode 42
kernel: mce: [Hardware Error]: CPU 0: Machine Check: 0 Bank 7: ee0000000040110a
kernel: mce: [Hardware Error]: TSC 0 ADDR fef1ce40 MISC 7880018086
kernel: mce: [Hardware Error]: PROCESSOR 0:806e9 TIME 1488290322 SOCKET 0 APIC 0 microcode 42

When downgrading to 4.9.11-1-ARCH from current core, only the first two posted lines appear, not the more verbose error information.

I did a little Googling and found the LKML thread in [1], from which I learnt that other XPS 9360 users are evidently affected by similar error messages when booting different Linux kernels. The OP there had issues with both 4.9 and 4.8 kernels. As already stated, I get the verbose error messages only when booting 4.10.1-1 from testing, but not on 4.9.11-1 from core. Since I do see the first two posted lines on both kernels, has there perhaps been some change in log verbosity from 4.9 to 4.10?

Just for reference: the OP from [1] finally stated in [2] that installing Dell's XPS firmware version 1.3.2 made the error messages disappear for him. I am also on firmware version 1.3.2 and still see the mce errors.

Although the LKML thread indicates that this might be neither a genuine hardware error nor a Linux kernel issue, but rather a firmware problem on Dell's side, I did want to report it here, since I do see this significant change in the logs between kernels 4.9.11 and 4.10.1.

Apart from the logged errors, everything still seems to work fine with 4.10.1. I did not face any severe issues.

[1] https://lkml.org/lkml/2017/1/4/559
[2] https://lkml.org/lkml/2017/1/27/477
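
For anyone trying to read the status value in the log above (ee0000000040110a), here is a minimal sketch of how it could be split into its fields. It assumes the architectural IA32_MCi_STATUS layout described in the Intel SDM (flag bits 63 down to 57, model-specific error code in bits 31:16, MCA error code in bits 15:0). The function name decode_mci_status is hypothetical and not part of any kernel tool; the sketch only separates the fields and does not interpret the error codes themselves.

```python
# Flag bits of IA32_MCi_STATUS, assuming the layout from the Intel SDM.
STATUS_FLAGS = {
    63: "VAL",    # register contents are valid
    62: "OVER",   # an earlier error was overwritten (overflow)
    61: "UC",     # error was not corrected
    60: "EN",     # error reporting was enabled for this error
    59: "MISCV",  # the MISC register holds additional information
    58: "ADDRV",  # the ADDR register holds an address
    57: "PCC",    # processor context may be corrupt
}


def decode_mci_status(status):
    """Split a 64-bit status value into its flags and error-code fields.

    Hypothetical helper for illustration only.
    """
    flags = [name for bit, name in STATUS_FLAGS.items() if (status >> bit) & 1]
    return {
        "flags": flags,
        "model_specific_code": (status >> 16) & 0xFFFF,
        "mca_error_code": status & 0xFFFF,
    }


if __name__ == "__main__":
    decoded = decode_mci_status(int("ee0000000040110a", 16))
    print("flags:", " ".join(decoded["flags"]))
    print("model-specific code:", hex(decoded["model_specific_code"]))
    print("MCA error code:", hex(decoded["mca_error_code"]))
```

The same function can be fed the Bank value from any other "Machine Check" line to compare which flag bits differ between reports.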
Closed by Doug Newgard (Scimmia)
Saturday, 08 April 2017, 21:13 GMT
Reason for closing: None
Additional comments about closing: Either a hardware or firmware problem, nothing related to Arch
Mar 11 18:41:35 localhost kernel: smpboot: CPU0: Intel(R) Core(TM) i7-4700MQ CPU @ 2.40GHz (family: 0x6, model: 0x3c, stepping: 0x3)
Mar 11 18:41:35 localhost kernel: mce: [Hardware Error]: CPU 0: Machine Check: 0 Bank 5: ee0000000040110a
Mar 11 18:41:35 localhost kernel: mce: [Hardware Error]: TSC 0 ADDR ffa874c0 MISC b8a0000086
Mar 11 18:41:35 localhost kernel: mce: [Hardware Error]: PROCESSOR 0:306c3 TIME 1489257693 SOCKET 0 APIC 0 microcode 20
Looks also like https://lkml.org/lkml/2017/2/17/74
kernel: smpboot: CPU0: Intel(R) Core(TM) i5-4200M CPU @ 2.50GHz (family: 0x6, model: 0x3c, stepping: 0x3)
kernel: mce: [Hardware Error]: CPU 0: Machine Check: 0 Bank 5: ae0000000040110a
kernel: mce: [Hardware Error]: TSC 0 ADDR ffb072c0 MISC 38a0000086
kernel: mce: [Hardware Error]: PROCESSOR 0:306c3 TIME 1489310083 SOCKET 0 APIC 0 microcode 20
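
As a cross-check when comparing reports from different machines, the PROCESSOR field can be matched against the smpboot line. This is a rough sketch, assuming the value after the colon (806e9, 306c3) is the CPUID leaf 1 signature and using the standard family/model/stepping encoding; decode_cpuid_signature is a hypothetical helper name.

```python
def decode_cpuid_signature(sig):
    """Return (family, model, stepping) from a CPUID leaf 1 EAX signature.

    Hypothetical helper for illustration only.
    """
    stepping = sig & 0xF
    base_model = (sig >> 4) & 0xF
    base_family = (sig >> 8) & 0xF
    ext_model = (sig >> 16) & 0xF
    ext_family = (sig >> 20) & 0xFF

    # The extended family is only added when the base family is 0xF.
    family = base_family + ext_family if base_family == 0xF else base_family
    # The extended model only applies to base families 0x6 and 0xF.
    if base_family in (0x6, 0xF):
        model = (ext_model << 4) | base_model
    else:
        model = base_model
    return family, model, stepping


if __name__ == "__main__":
    for raw in ("806e9", "306c3"):
        family, model, stepping = decode_cpuid_signature(int(raw, 16))
        print(f"{raw}: family {family:#x}, model {model:#x}, stepping {stepping:#x}")
```

For the two signatures in this task, the decoded values agree with the family, model and stepping printed in the corresponding smpboot lines.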
- Intel(R) Core(TM) i7-7700HQ
- 4.10.4-1-ARCH