FS#36288 - [linux] 3.10.x flood of mei_me errors after resume from suspend. Intel GPU

Attached to Project: Arch Linux
Opened by Dominik Brzeziński (kelloco2) - Friday, 26 July 2013, 18:52 GMT
Last edited by Tobias Powalowski (tpowa) - Tuesday, 17 September 2013, 12:22 GMT
Task Type: Bug Report
Category: Packages: Core
Status: Closed
Assigned To: Tobias Powalowski (tpowa), Thomas Bächler (brain0)
Architecture: x86_64
Severity: Medium
Priority: Normal
Reported Version:
Due in Version: Undecided
Due Date: Undecided
Percent Complete: 100%
Votes: 7
Private: No

Details

Description:
After resume from suspend I get a flood of the following errors:
[log]
lip 25 17:26:32 laptop_kasus kernel: mei_me 0000:00:16.0: reset: init clients timeout hbm_state = 1.
lip 25 17:26:32 laptop_kasus kernel: mei_me 0000:00:16.0: unexpected reset: dev_state = RESETTING
lip 25 17:27:02 laptop_kasus kernel: mei_me 0000:00:16.0: reset: init clients timeout hbm_state = 1.
lip 25 17:27:02 laptop_kasus kernel: mei_me 0000:00:16.0: unexpected reset: dev_state = RESETTING
lip 25 17:27:32 laptop_kasus kernel: mei_me 0000:00:16.0: reset: init clients timeout hbm_state = 1.
lip 25 17:27:32 laptop_kasus kernel: mei_me 0000:00:16.0: unexpected reset: dev_state = RESETTING
lip 25 17:28:02 laptop_kasus kernel: mei_me 0000:00:16.0: reset: init clients timeout hbm_state = 1.
lip 25 17:28:02 laptop_kasus kernel: mei_me 0000:00:16.0: unexpected reset: dev_state = RESETTING
[/log]

in dmesg/journal. Xorg goes crazy. Sometimes the errors come faster and in greater numbers, and then the system hangs immediately.
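
For anyone who wants to gauge how bad the flood is, a minimal sketch for counting these messages might look like the following. The function name `count_mei_resets` is made up for illustration. It reads kernel log text from stdin and only matches the "unexpected reset" lines quoted above.

```shell
#!/bin/sh
# Hypothetical helper (name is an assumption): count mei_me
# "unexpected reset" lines arriving on stdin.
count_mei_resets() {
    # grep -c prints the number of matching lines. It exits non-zero when
    # nothing matches, so "|| true" keeps callers alive under "set -e".
    grep -c 'mei_me .*unexpected reset' || true
}

# Example usage (kernel messages from the current boot):
#   journalctl -k -b | count_mei_resets
#   dmesg | count_mei_resets
```

Comparing the count right after resume with the count a minute later should show whether the messages are still piling up.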

Additional info: I only tested:
3.10.3-1 - affected
3.10.2-1 - affected
3.9.* and earlier - everything worked fine. (Those versions don't have the "skipping the temporary switch to a text console, systems with the graphics driver for Intel GPUs are now able wake from standby faster" feature.)


Additionally, I have been using the following GPU settings for a long time:
/etc/modprobe.d/inte-gpu.conf:
options snd_hda_intel power_save=1
options i915 i915_enable_rc6=7 i915_enable_fbc=1 lvds_downclock=1

It looks like this:
http://www.youtube.com/watch?v=UfasjdPWcIc

Steps to reproduce:
Suspend the laptop (systemctl suspend) and wake it. An Intel GPU is probably required. I read about this on the mailing lists. From what I understand, this bug was supposed to be fixed in stable 3.10(?).
Regards K.

Closed by Tobias Powalowski (tpowa)
Tuesday, 17 September 2013, 12:22 GMT
Reason for closing: Fixed
Additional comments about closing: 3.11.1
Comment by wluce0 (wluce0) - Saturday, 03 August 2013, 16:37 GMT
Same problem: Xorg goes crazy when resuming from suspend, and restarting gnome-shell doesn't fix it. Sometimes the issue resolves itself after a while, but mostly the log eventually gets flooded with mei_me messages:

[log]
août 03 18:18:43 archlinux kernel: mei_me 0000:00:16.0: reset: unexpected enumeration response hbm.
août 03 18:18:43 archlinux kernel: mei_me 0000:00:16.0: unexpected reset: dev_state = RESETTING
août 03 18:18:43 archlinux kernel: mei_me 0000:00:16.0: reset: wrong host start response
août 03 18:18:43 archlinux kernel: mei_me 0000:00:16.0: unexpected reset: dev_state = RESETTING
août 03 18:18:43 archlinux kernel: mei_me 0000:00:16.0: reset: unexpected enumeration response hbm.
août 03 18:18:43 archlinux kernel: mei_me 0000:00:16.0: unexpected reset: dev_state = RESETTING
août 03 18:18:43 archlinux kernel: mei_me 0000:00:16.0: reset: wrong host start response
août 03 18:18:43 archlinux kernel: mei_me 0000:00:16.0: unexpected reset: dev_state = RESETTING
août 03 18:18:43 archlinux kernel: mei_me 0000:00:16.0: reset: unexpected enumeration response hbm.
août 03 18:18:43 archlinux kernel: mei_me 0000:00:16.0: unexpected reset: dev_state = RESETTING
août 03 18:18:43 archlinux kernel: mei_me 0000:00:16.0: reset: wrong host start response
août 03 18:18:43 archlinux kernel: mei_me 0000:00:16.0: unexpected reset: dev_state = RESETTING
août 03 18:18:43 archlinux kernel: mei_me 0000:00:16.0: reset: unexpected enumeration response hbm.
août 03 18:18:43 archlinux kernel: mei_me 0000:00:16.0: unexpected reset: dev_state = RESETTING
août 03 18:18:43 archlinux kernel: mei_me 0000:00:16.0: reset: wrong host start response
août 03 18:18:43 archlinux kernel: mei_me 0000:00:16.0: unexpected reset: dev_state = RESETTING
août 03 18:18:43 archlinux kernel: mei_me 0000:00:16.0: reset: unexpected enumeration response hbm.
août 03 18:18:43 archlinux kernel: mei_me 0000:00:16.0: unexpected reset: dev_state = RESETTING
août 03 18:18:43 archlinux kernel: mei_me 0000:00:16.0: reset: wrong host start response
août 03 18:18:43 archlinux kernel: mei_me 0000:00:16.0: unexpected reset: dev_state = RESETTING
août 03 18:18:43 archlinux kernel: mei_me 0000:00:16.0: reset: unexpected enumeration response hbm.
août 03 18:18:43 archlinux kernel: mei_me 0000:00:16.0: unexpected reset: dev_state = RESETTING
août 03 18:18:43 archlinux kernel: mei_me 0000:00:16.0: reset: wrong host start response
août 03 18:18:43 archlinux kernel: mei_me 0000:00:16.0: unexpected reset: dev_state = RESETTING
août 03 18:18:43 archlinux kernel: mei_me 0000:00:16.0: we can't read the message slots =00000001.
août 03 18:18:43 archlinux kernel: mei_me 0000:00:16.0: reset: unexpected enumeration response hbm.
août 03 18:18:43 archlinux kernel: mei_me 0000:00:16.0: unexpected reset: dev_state = RESETTING
[/log]

I have a Lenovo ThinkPad X230:

# lspci
00:00.0 Host bridge: Intel Corporation 3rd Gen Core processor DRAM Controller (rev 09)
00:02.0 VGA compatible controller: Intel Corporation 3rd Gen Core processor Graphics Controller (rev 09)
00:14.0 USB controller: Intel Corporation 7 Series/C210 Series Chipset Family USB xHCI Host Controller (rev 04)
00:16.0 Communication controller: Intel Corporation 7 Series/C210 Series Chipset Family MEI Controller #1 (rev 04)
00:19.0 Ethernet controller: Intel Corporation 82579LM Gigabit Network Connection (rev 04)
00:1a.0 USB controller: Intel Corporation 7 Series/C210 Series Chipset Family USB Enhanced Host Controller #2 (rev 04)
00:1b.0 Audio device: Intel Corporation 7 Series/C210 Series Chipset Family High Definition Audio Controller (rev 04)
00:1c.0 PCI bridge: Intel Corporation 7 Series/C210 Series Chipset Family PCI Express Root Port 1 (rev c4)
00:1c.1 PCI bridge: Intel Corporation 7 Series/C210 Series Chipset Family PCI Express Root Port 2 (rev c4)
00:1c.2 PCI bridge: Intel Corporation 7 Series/C210 Series Chipset Family PCI Express Root Port 3 (rev c4)
00:1d.0 USB controller: Intel Corporation 7 Series/C210 Series Chipset Family USB Enhanced Host Controller #1 (rev 04)
00:1f.0 ISA bridge: Intel Corporation QM77 Express Chipset LPC Controller (rev 04)
00:1f.2 SATA controller: Intel Corporation 7 Series Chipset Family 6-port SATA Controller [AHCI mode] (rev 04)
00:1f.3 SMBus: Intel Corporation 7 Series/C210 Series Chipset Family SMBus Controller (rev 04)
02:00.0 System peripheral: Ricoh Co Ltd PCIe SDXC/MMC Host Controller (rev 07)
03:00.0 Network controller: Intel Corporation Centrino Wireless-N 2200 (rev c4)

Here is my modprobe.conf:

# cat /etc/modprobe.d/modprobe.conf
options i915 i915_enable_rc6=1 i915_enable_fbc=1 lvds_downclock=1
options iwlwifi 11n_disable=1

I never had any problems before a recent upgrade (about one week ago).
Regards.
Comment by Cédric Archambeau (excced) - Wednesday, 07 August 2013, 14:22 GMT
Hi,

Solved for me after upgrading to 3.10.5.
Thanks to kernel devs for this fast solution :-)

FYI, the bug was reported upstream here: https://bugzilla.kernel.org/show_bug.cgi?id=60530
Comment by Dominik Brzeziński (kelloco2) - Tuesday, 13 August 2013, 07:21 GMT
  • Field changed: Percent Complete (100% → 0%)
I checked (Ivy Bridge laptop - Asus K75 - 3.10.5-1) - the bug still exists. ;/
Comment by wluce0 (wluce0) - Tuesday, 13 August 2013, 08:29 GMT
It has been solved for me too after upgrading to 3.10.5.

Thanks a lot!
Comment by Otto Allmendinger (OttoA) - Wednesday, 14 August 2013, 13:38 GMT
Same issue with an ASRock Z77 and NVIDIA graphics, even with kernel 3.10.6: mei_me messages in dmesg after resume, then a total freeze a few minutes later.

Blacklisting mei and mei_me modules seems to work.
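
A blacklist along these lines could be set up as sketched below. The file name `mei-blacklist.conf` and the helper name `write_mei_blacklist` are assumptions for illustration and are not taken from this report. modprobe reads `.conf` files under /etc/modprobe.d/.

```shell
#!/bin/sh
# Hypothetical helper: write a modprobe blacklist for mei and mei_me.
# The default path is an assumed file name, not one from this report.
write_mei_blacklist() {
    conf=${1:-/etc/modprobe.d/mei-blacklist.conf}
    printf 'blacklist mei\nblacklist mei_me\n' > "$conf"
}

# Example usage (as root), then reboot or unload the modules by hand:
#   write_mei_blacklist
#   modprobe -r mei_me mei
```

Note that `blacklist` only stops the modules from being loaded automatically. They can still be loaded by hand with modprobe.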
Comment by Dominik Brzeziński (kelloco2) - Wednesday, 14 August 2013, 20:28 GMT
With 3.10.6-2 I got:
mei_me 0000:00:16.0: unexpected reset: dev_state = RESETTING
mei_me 0000:00:16.0: reset: unexpected enumeration response hbm.
mei_me 0000:00:16.0: unexpected reset: dev_state = RESETTING
mei_me 0000:00:16.0: reset: unexpected enumeration response hbm.
etc....
Should I blacklist mei_me?
Comment by Dominik Brzeziński (kelloco2) - Monday, 19 August 2013, 21:05 GMT
What is the mei_me module used for? Does it only provide the /sys/* interface for the Intel GPU? I blacklisted this module and the problem no longer occurs.
Alternatively, finding the device bound to the driver and unbinding it:
[code]find /sys/bus/pci/drivers/mei_me/ -type l|sed 's#^.*/##'[/code]
[code]echo 0000:00:16.0 > /sys/bus/pci/drivers/mei_me/unbind[/code] (replace 0000:00:16.0 with your device's address)
also seems to help. Is this the correct solution to the problem? Does anyone know what the mei_me driver does and whether it is required?
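
The two commands above can be combined into a small sketch. Note that `find -type l` also lists the driver's `module` symlink, so this version keeps only names that look like PCI addresses. The function names and the optional directory argument are assumptions for illustration.

```shell
#!/bin/sh
# Hypothetical helpers. The optional directory argument exists only so the
# functions can be pointed somewhere other than the real sysfs path.
MEI_DRIVER_DIR=/sys/bus/pci/drivers/mei_me

# Print the PCI address of every device currently bound to the driver.
list_mei_devices() {
    dir=${1:-$MEI_DRIVER_DIR}
    for link in "$dir"/*; do
        [ -L "$link" ] || continue              # bound devices are symlinks
        name=${link##*/}
        case $name in
            *:*:*.*) printf '%s\n' "$name" ;;   # e.g. 0000:00:16.0
        esac
    done
}

# Unbind every bound device by writing its address to "unbind" (needs root).
unbind_mei_devices() {
    dir=${1:-$MEI_DRIVER_DIR}
    list_mei_devices "$dir" | while read -r addr; do
        printf '%s\n' "$addr" > "$dir/unbind"
    done
}
```

Run as root, `unbind_mei_devices` with no argument would act on the real driver directory. The unbind does not persist across reboots.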
Comment by Otto Allmendinger (OttoA) - Monday, 19 August 2013, 21:07 GMT
According to my research, it's about the Intel Management Engine: http://en.wikipedia.org/wiki/Intel_Active_Management_Technology

Some fancy-pants remote administration stuff as far as I can see
Comment by Tobias Powalowski (tpowa) - Tuesday, 17 September 2013, 08:01 GMT
Status on 3.11.1?
Comment by Dominik Brzeziński (kelloco2) - Tuesday, 17 September 2013, 11:28 GMT
Thanks. It works well and no longer requires the blacklist. After waking from sleep, the following appears:
"[ 126.885036] mei_me 0000:00:16.0: wait hw ready failed. status = -110
[ 126.885040] mei_me 0000:00:16.0: hw_start failed disabling the device
"
But that is all. Everything works well and there are no more errors. In practice, the issue is resolved for me.
Regards :)
linux 3.11.1-1
