FS#63159 - kernel 5.2: intel i915 modeset (kms) hang at boot

Attached to Project: Arch Linux
Opened by Riku Salminen (rikusalminen) - Thursday, 11 July 2019, 08:24 GMT
Last edited by Jan de Groot (JGC) - Monday, 16 September 2019, 07:57 GMT
Task Type Bug Report
Category Kernel
Status Closed
Assigned To No-one
Architecture x86_64
Severity Critical
Priority Normal
Reported Version
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 10
Private No

Details

Description:
After upgrading to linux-5.2.arch2-1, my Intel graphics (i915) laptop hangs at boot. The hang occurs as the video mode changes, which is visible as a change in contrast on the screen.

Adding `nomodeset` or `i915.modeset=0` to the kernel command line makes the system boot normally (without graphics, of course).
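
To confirm which of these parameters a given boot actually used, the kernel command line can be inspected. The following is a minimal sketch; `has_kernel_param` is a hypothetical helper name, and it assumes the running kernel's command line is readable at /proc/cmdline.

```shell
#!/bin/sh
# Sketch: report whether a parameter appears on a kernel command line.
# has_kernel_param is a hypothetical helper, not part of any existing tool.
has_kernel_param() {
    # $1 = parameter to look for
    # $2 = command line to search (defaults to the running kernel's)
    param=$1
    cmdline=${2-$(cat /proc/cmdline 2>/dev/null)}
    # Pad with spaces so only whole, space-separated words match.
    case " $cmdline " in
        *" $param "*) return 0 ;;
    esac
    return 1
}

# Example: was this boot started with modesetting disabled?
if has_kernel_param nomodeset || has_kernel_param i915.modeset=0; then
    echo "modesetting was disabled for this boot"
else
    echo "modesetting was left enabled for this boot"
fi
```

The match is on whole words, so `i915.modeset=0` does not match a longer value such as `i915.modeset=01`.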

After downgrading to linux-5.1.16-arch1-1, the issue does not occur.

Additional info:
* I do not know whether only the graphics hang (with the system otherwise booting normally); I haven't tried reaching the box over the network
* However, no keyboard input (ctrl-alt-del, etc.) makes any difference, so it's probably a hard hang

Steps to reproduce:
* Take a Lenovo laptop with Intel Skylake
* Upgrade to linux-5.2.arch2-1
* Reboot the system
* Observe the hang right after the rootfs is mounted (or earlier in boot if the i915 module is added to the initrd in mkinitcpio.conf)

$ lspci
00:00.0 Host bridge: Intel Corporation Xeon E3-1200 v5/E3-1500 v5/6th Gen Core Processor Host Bridge/DRAM Registers (rev 08)
00:02.0 VGA compatible controller: Intel Corporation Skylake GT2 [HD Graphics 520] (rev 07)
00:08.0 System peripheral: Intel Corporation Xeon E3-1200 v5/v6 / E3-1500 v5 / 6th/7th Gen Core Processor Gaussian Mixture Model
00:14.0 USB controller: Intel Corporation Sunrise Point-LP USB 3.0 xHCI Controller (rev 21)
00:14.2 Signal processing controller: Intel Corporation Sunrise Point-LP Thermal subsystem (rev 21)
00:16.0 Communication controller: Intel Corporation Sunrise Point-LP CSME HECI #1 (rev 21)
00:16.3 Serial controller: Intel Corporation Sunrise Point-LP Active Management Technology - SOL (rev 21)
00:1c.0 PCI bridge: Intel Corporation Sunrise Point-LP PCI Express Root Port #1 (rev f1)
00:1c.2 PCI bridge: Intel Corporation Sunrise Point-LP PCI Express Root Port #3 (rev f1)
00:1c.4 PCI bridge: Intel Corporation Sunrise Point-LP PCI Express Root Port #5 (rev f1)
00:1f.0 ISA bridge: Intel Corporation Sunrise Point-LP LPC Controller (rev 21)
00:1f.2 Memory controller: Intel Corporation Sunrise Point-LP PMC (rev 21)
00:1f.3 Audio device: Intel Corporation Sunrise Point-LP HD Audio (rev 21)
00:1f.4 SMBus: Intel Corporation Sunrise Point-LP SMBus (rev 21)
00:1f.6 Ethernet controller: Intel Corporation Ethernet Connection I219-LM (rev 21)
02:00.0 Unassigned class [ff00]: Realtek Semiconductor Co., Ltd. RTS522A PCI Express Card Reader (rev 01)
04:00.0 Network controller: Intel Corporation Wireless 8260 (rev 3a)
05:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller SM951/PM951 (rev 01)

Closed by  Jan de Groot (JGC)
Monday, 16 September 2019, 07:57 GMT
Reason for closing:  Fixed
Comment by Emantor (Emantor) - Thursday, 11 July 2019, 09:17 GMT
Comment by Riku Salminen (rikusalminen) - Thursday, 11 July 2019, 09:18 GMT
Workaround: add "i915.enable_psr=0" to the kernel command line.
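
One way to check whether the workaround took effect after a reboot is to read the module parameter back. This is a sketch under assumptions: `psr_setting` is a hypothetical helper, and it assumes the parameter is exposed at the usual sysfs location for module parameters, which may require root to read and is absent when i915 is not loaded.

```shell
#!/bin/sh
# Sketch: print the i915 enable_psr value currently in effect.
# psr_setting is a hypothetical helper; the default path is the usual sysfs
# location for module parameters (assumption: readable by the current user).
psr_setting() {
    # $1 = file to read (defaults to the i915 parameter in sysfs)
    file=${1:-/sys/module/i915/parameters/enable_psr}
    if [ -r "$file" ]; then
        cat "$file"
    else
        # Not readable: i915 not loaded, or insufficient permissions.
        echo "unknown"
    fi
}

echo "enable_psr: $(psr_setting)"
```

A value of 0 would indicate that the workaround is active; "unknown" only means the file could not be read.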
Comment by Wiktor Kwapisiewicz (wiktor) - Friday, 12 July 2019, 20:17 GMT
I think I'm experiencing the same issue on XPS 9350 (Intel Skylake, HD Graphics 520).

The built-in display freezes after some time, sometimes early at boot, sometimes later (it's not deterministic).

Externally connected displays work just fine; it's only the built-in display that hangs. That would explain why enable_psr=0 works.
Comment by Gunnar Bretthauer (Taijian) - Saturday, 13 July 2019, 21:52 GMT
Same here - KabyLake System, Laptop with internal display only. Display just goes black as soon as i915 is loaded. Reverting to 5.1.x or using the -lts kernel works fine.
Comment by Jay Somedon (jsomedon) - Sunday, 14 July 2019, 08:36 GMT
Same here on my Skylake 6700HQ; after downgrading to 5.1 my screen is back.
Comment by loqs (loqs) - Thursday, 18 July 2019, 01:55 GMT
Comment by Daniel Bershatsky (daskol) - Wednesday, 24 July 2019, 14:07 GMT
The same for me. VGA controller is Intel Corporation Skylake GT2 [HD Graphics 520] (rev 07).
Comment by loqs (loqs) - Wednesday, 24 July 2019, 16:05 GMT
Comment by carlos (osly) - Wednesday, 24 July 2019, 22:19 GMT
Thanks for the update. So, we'll wait until the next merge window...
Comment by Caleb Maclennan (alerque) - Saturday, 27 July 2019, 12:03 GMT
I'm not sure all the comments here are actually about the same issue; the original report and comments seem to span a couple of different sets of symptoms, devices, and solutions.

I have a Dell XPS 13 system affected by some iteration of this.

$ lspci | head -2
00:00.0 Host bridge: Intel Corporation Xeon E3-1200 v5/E3-1500 v5/6th Gen Core Processor Host Bridge/DRAM Registers (rev 09)
00:02.0 VGA compatible controller: Intel Corporation Iris Graphics 540 (rev 0a)

In my case the system boots and works fine, but at some unpredictable point 1-10 minutes in, the display would freeze. The system would not be locked up: SSHing in remotely showed no signs of any problems, and even graphical apps could be started and killed from the command line. Neither the Xorg log, dmesg, nor any other source showed any sign of an error, but the display would stay frozen with whatever was drawn at the moment of the freeze.

It first manifested after a system update that included the 5.2 kernel and libinput drivers. For a long time I thought it was related to input devices, then I started playing around with video drivers. Interestingly, even using the vesa driver and removing xf86-video-intel did not fix this. Downgrading to DRI2 and the fallback uxa acceleration mode seemed to almost fix it, but not quite: the freeze just moved to an hour or more out instead of happening within minutes, and in the meantime the performance was abominable.

With the above comments in mind I went back to the intel driver, added i915.enable_psr=0 to my EFI boot options, and went back to DRI3 + the default acceleration. It's been 30+ minutes with no freeze and normal performance. That hasn't happened in the last several weeks.
Comment by François Guerraz (kubrick) - Tuesday, 06 August 2019, 13:02 GMT
@caleb, yes, same issue.

The patch is currently in limbo (and users are suffering) because the Intel devs didn't format their PR properly, so it's unlikely to land upstream for weeks.
See https://patchwork.freedesktop.org/patch/319173/?series=63774&rev=4

So the Arch maintainers should patch the kernel themselves until this is fixed upstream. I have been using the patch for weeks and it works great (I am the "tested-by" tag on the patch).
Comment by loqs (loqs) - Tuesday, 06 August 2019, 14:54 GMT
@kubrick you could contact Jan Alexander Steffens (heftig) on IRC or email to see if he will add the commit before it is added to linux stable.
The patch is not queued for 5.2.7. I would suggest contacting Sasha Levin or Greg Kroah-Hartman to see if the backport is intended to be queued for 5.2.8,
possibly by replying to the following thread using the instructions included in it: https://lore.kernel.org/stable/20190731192331.GA17697@sasha-vm/
Comment by Emantor (Emantor) - Wednesday, 07 August 2019, 03:58 GMT
@kubrick this is incorrect. The Intel engineers had a mishap coordinating maintainers around vacations, but sent out a pull request on 02.08.2019: https://lists.freedesktop.org/archives/dim-tools/2019-August/001312.html which resulted in a PR to Linus from Daniel on the same day. The fix is therefore in Linus's tree: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/gpu/drm/i915?id=6d61f716a01ec0e134de38ae97e71d6fec5a6ff6 which should make inclusion in stable possible.
Comment by François Guerraz (kubrick) - Wednesday, 07 August 2019, 07:06 GMT
I had missed that PR, as the conversation on freedesktop ended quite abruptly. Let's hope it makes it into the next stable release.
Comment by loqs (loqs) - Wednesday, 07 August 2019, 19:46 GMT
Comment by loqs (loqs) - Saturday, 10 August 2019, 14:37 GMT
Can you confirm the issue is resolved in linux 5.2.8.arch1-1? (currently in testing)
Comment by Kochi (pelopor) - Sunday, 11 August 2019, 16:38 GMT
At least, without "i915.enable_psr=0" my XPS (9350) with Skylake has worked for more than 10 hours without any error.
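
For anyone wanting to check whether a machine is already on 5.2.8 or later before dropping the workaround, a version comparison along these lines may help. It is only a sketch: `kernel_at_least` is a hypothetical helper, it assumes a `sort` that supports `-V` (as GNU coreutils does), and it takes the thread's reports that 5.2.8 carries the fix at face value.

```shell
#!/bin/sh
# Sketch: succeed if a kernel version is at least a required version.
# kernel_at_least is a hypothetical helper; relies on `sort -V` (GNU coreutils).
kernel_at_least() {
    # $1 = required version, e.g. 5.2.8
    # $2 = version to test (defaults to the running kernel, from uname -r)
    required=$1
    current=${2-$(uname -r)}
    # Drop any distribution suffix such as -arch1-1-ARCH.
    current=${current%%-*}
    # The smaller of the two versions sorts first under version ordering.
    lowest=$(printf '%s\n%s\n' "$required" "$current" | sort -V | head -n 1)
    [ "$lowest" = "$required" ]
}

if kernel_at_least 5.2.8; then
    echo "running 5.2.8 or later; the workaround may no longer be needed"
else
    echo "running a kernel older than 5.2.8"
fi
```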
