FS#65392 - [linux] i915 driver issue. System freezes after updating to kernel 5.5.1/5.5.2

Attached to Project: Arch Linux
Opened by cnotis (cnotis) - Wednesday, 05 February 2020, 19:13 GMT
Last edited by freswa (frederik) - Thursday, 10 September 2020, 20:56 GMT
Task Type Bug Report
Category Packages: Core
Status Closed
Assigned To Tobias Powalowski (tpowa)
Jan Alexander Steffens (heftig)
Architecture x86_64
Severity Critical
Priority Normal
Reported Version
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 18
Private No

Details

Description:

After updating the system, I've noticed consistent system hangs/freezes after at most 1 hour. Using the linux kernel 5.4.15, my system was working fine.
As far as I can tell from the journalctl output, it is an i915 driver issue.

Please find attached the journalctl kernel output.

Closed by  freswa (frederik)
Thursday, 10 September 2020, 20:56 GMT
Reason for closing:  Fixed
Additional comments about closing:  linux 5.5.4-arch1-1
Comment by loqs (loqs) - Wednesday, 05 February 2020, 20:16 GMT
Comment by loqs (loqs) - Thursday, 06 February 2020, 01:03 GMT
Thank you for filing the upstream report https://gitlab.freedesktop.org/drm/intel/issues/1146

AUR contains a PKGBUILD for drm-tip [1]

[1] https://aur.archlinux.org/packages/linux-drm-tip-git/
Comment by figue (figue) - Saturday, 08 February 2020, 16:39 GMT
I have this freeze bug too. I tested several v5.5 kernels and I can't work with any of them. @loqs did you have time to try the drm-tip kernel? The posted patch doesn't apply to v5.5.

In the meantime I have gone back to the old release on my "production" laptop.
Comment by loqs (loqs) - Saturday, 08 February 2020, 17:00 GMT
@figue no, I do not have the hardware to test on. luman, who is encountering a different i915 issue, could not get drm-tip to boot so far: https://bugs.archlinux.org/task/64725#comment186152
Comment by cnotis (cnotis) - Sunday, 09 February 2020, 15:52 GMT
Built from AUR the linux-drm-tip version 5.5.0-1-drm-tip-git-gc53ff44eb14e.
It's been running for 4 hours with no issues so far.
Comment by Ioan Loosley (ioangogo) - Friday, 14 February 2020, 04:06 GMT
According to Phoronix, some patches were missed in the latest point release that would have mitigated these hangs.

https://www.phoronix.com/scan.php?page=news_item&px=Linux-5.5-Intel-Missed-Graphics
Comment by loqs (loqs) - Saturday, 15 February 2020, 16:12 GMT
@ioangogo they are included in 5.5.4-arch1-1 [1], currently in testing. Can you reproduce the issue using that version?

[1] https://git.archlinux.org/linux.git/log/?h=v5.5.4-arch1
Comment by figue (figue) - Saturday, 15 February 2020, 23:23 GMT
I've been running 5.5.4-arch1-1 all afternoon on my Lenovo laptop. Seems good. No freezes at all for now. Thank you!
Comment by cnotis (cnotis) - Sunday, 16 February 2020, 18:10 GMT
The latest update 5.5.4-arch1-1 is working fine for me.
Comment by George Shearer (docdrow) - Monday, 17 February 2020, 20:28 GMT
I'm using 5.5.4-arch1-1 on a Razer Blade Stealth 2019 and still having lockups. :(

00:02.0 VGA compatible controller: Intel Corporation UHD Graphics 620 (Whiskey Lake) (rev 02)


Comment by loqs (loqs) - Monday, 17 February 2020, 20:41 GMT
@docdrow what is the dmesg from the lockup?
Comment by George Shearer (docdrow) - Sunday, 23 February 2020, 00:27 GMT
Doesn't seem to be much of interest in my dmesg :( Video freezes solid, but I can still ssh in.
   dmesg.txt (114.3 KiB)
Comment by George Shearer (docdrow) - Sunday, 23 February 2020, 00:29 GMT
Note that it has happened again with: 5.5.5-arch1-1

Comment by loqs (loqs) - Sunday, 23 February 2020, 00:47 GMT
I suspect this is a separate issue from cnotis as the i915 driver is not detecting any issues with the GPU.
You can increase drm logging with drm.debug=0x1e log_buf_len=1M [1]

[1] https://gitlab.freedesktop.org/drm/intel/-/wikis/How-to-file-i915-bugs
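
For reference, the 0x1e value passed to drm.debug above is a bitmask of DRM debug categories. The following is a minimal Python sketch of how such a mask decodes, assuming the category bit values from the kernel's drm_print.h header; the helper name decode_drm_debug is made up for illustration, and newer kernels define further bits that are not listed here.

```python
# Assumed DRM debug category bits (from the kernel's drm_print.h).
# Newer kernels define additional bits above 0x80 that are omitted here.
DRM_DEBUG_BITS = {
    0x01: "CORE",
    0x02: "DRIVER",
    0x04: "KMS",
    0x08: "PRIME",
    0x10: "ATOMIC",
    0x20: "VBL",
    0x40: "STATE",
    0x80: "LEASE",
}


def decode_drm_debug(mask):
    """Return the names of the categories enabled by a drm.debug mask."""
    return [name for bit, name in sorted(DRM_DEBUG_BITS.items()) if mask & bit]


if __name__ == "__main__":
    # 0x1e = 0x02 | 0x04 | 0x08 | 0x10
    print(decode_drm_debug(0x1E))
```

Under these assumptions, 0x1e enables the DRIVER, KMS, PRIME and ATOMIC categories and leaves the very verbose CORE and VBL output off. log_buf_len=1M enlarges the kernel log buffer so the extra messages are less likely to be overwritten before they can be collected.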
Comment by Paul Kerry (paulkerry) - Sunday, 23 February 2020, 20:12 GMT
@docdrow: your dmesg.txt also shows "bbswitch" and "Optimus" lines so you may have a problem there?
I agree with @loqs that your issue looks different from the OP's, as otherwise you would be getting error lines in dmesg.
Comment by Georgi Mitsov (Elemag) - Monday, 24 February 2020, 20:13 GMT
I still have the same problem. It has crept all the way back to the 4.19 LTS series and it is definitely something with the Intel i915 driver.
I have reliably reproduced the problem by opening a project in IntelliJ, which appears to cause too many graphical updates with Scala syntax colouring.

Using LTS kernel does not solve it.
Kernel downgrade to previous LTS (4.19) does not solve it.
Firmware downgrade of linux-firmware to 20200122 does not solve it.

Something has been fixed in the latest 5.5 kernel, in that sometimes the system does not completely freeze but restarts all graphics (as if re-entering runlevel 5).

As a last resort I have disabled compositing in Plasma and so far have no issues.
Comment by loqs (loqs) - Monday, 24 February 2020, 21:03 GMT
Comment by Georgi Mitsov (Elemag) - Tuesday, 25 February 2020, 13:30 GMT
@loqs Yes, I did -> https://gitlab.freedesktop.org/drm/intel/issues/1313
My problem seems quite similar to @docdrow's; however, I mostly got complete system freezes. Disabling compositing stops the complete freezes and I am left with a graphics restart only.
Comment by George Shearer (docdrow) - Thursday, 27 February 2020, 15:36 GMT
Update: It just happened again with the new 5.5.6-arch1-1

Usually happens when I'm using Firefox. The video just freezes. I can still SSH into my laptop from another device, no errors in Xorg log, no errors in dmesg.. just frozen video.

When the laptop reboots, there's a "ghost" of the frozen screen on the display -- and even a fresh boot doesn't completely get rid of the ghost; it only goes away after 15 minutes of regular use.

Weird
Comment by Paul Kerry (paulkerry) - Saturday, 29 February 2020, 13:01 GMT
@docdrow - maybe you are suffering from the "there are over 400 patches queued for 5.7" issue as mentioned on...
https://gitlab.freedesktop.org/drm/intel/issues/1201

Upstream kernel 5.5.7 has some i915 updates: maybe that will help when it's pushed out?
Have you tried using the 4.19.* kernel branch?

Cheers
Paul.
Comment by Georgi Mitsov (Elemag) - Monday, 02 March 2020, 15:18 GMT
I have tried 5.5.7 and while it does seem more stable, it still freezes, compositing or no compositing. I get freezes even with 4.19 now, don't know if it is due to a firmware update or something.
I have switched to the discrete graphics as the situation with the integrated is unbearable.
Of course, the logs are clean.
Comment by George Shearer (docdrow) - Tuesday, 03 March 2020, 15:42 GMT
Still happens with 5.5.7 for me. :(
Comment by George Shearer (docdrow) - Tuesday, 10 March 2020, 14:09 GMT
Has not happened yet with 5.5.8 :) keeping fingers crossed.
Comment by Marcel Korpel (Marcel-) - Tuesday, 10 March 2020, 14:18 GMT
At the upstream bug report someone still has the issue with 5.6.0-rc5: https://gitlab.freedesktop.org/drm/intel/issues/1201#note_431854
Comment by Mikhail Foenko (mfoenko) - Friday, 13 March 2020, 07:18 GMT
@docdrow I was having the exact same issues as you with the Razer Blade Stealth with i7 8565U.

I ended up resolving the issue by clean installing Arch, then installing linux-lts419 from the AUR and booting into that kernel instead. I'm keeping an eye out to see if the newest kernel resolves the issues, but in the meantime, I'm extremely happy to get this laptop working.
Comment by George Shearer (docdrow) - Friday, 13 March 2020, 12:38 GMT
My fellow archlinuxians.. I'm happy to report zero crashes in 5.5.8 after 24+ hours of uptime and use. Once in a while I see a screen "glitch" when flipping through tabs in Firefox. The glitch appears as a band of garbage pixels, but it's very brief and barely noticeable. Prior to 5.5.8 when this happened, video would usually freeze. So, I'm hesitant to say it's 100% fixed but I'll settle for no freezes.
Comment by George Shearer (docdrow) - Friday, 13 March 2020, 12:38 GMT
PS -- I use Xorg and Awesome window manager with no extras.
Comment by Pim Otte (aureianimus) - Wednesday, 18 March 2020, 09:48 GMT
I'm currently on 5.5.9 and it's still an issue for me when using IntelliJ. Without IntelliJ I do sometimes get graphical glitches, but no crashes so far.
Comment by Kacper Kopczyński (capsel) - Thursday, 19 March 2020, 15:21 GMT
Have you tried disabling DRI3 (use DRI2 instead)?
I did have some problems, perhaps minute in comparison with yours, and DRI2 with TearFree helped in my case.
Take a look here: https://wiki.archlinux.org/index.php/intel_graphics
Comment by George Shearer (docdrow) - Wednesday, 25 March 2020, 16:30 GMT
I did have a lockup yesterday on 5.5.10. And I still have those weird occasional screen glitches that show up (garbage, random colors) for a split second when doing anything heavy. But it's far, far less likely to happen now, which has significantly reduced the likelihood of this laptop being tossed out a window.
Comment by Nicola Fontana (ntd) - Tuesday, 14 April 2020, 19:20 GMT
Not sure if it is the same issue: over the past month I have had lockups due to i915 a couple of times per day, in the way described (X server locked but SSH still working), and that issue is still present today (5.6.3-arch1). My log is slightly different from the OP's, but clearly related to i915. Let me know if I need to open a new bug, as this one seems to group different issues.

I opened a bug upstream: https://gitlab.freedesktop.org/drm/intel/-/issues/1700
In short I've been told my issue has been resolved in drm-tip by commit 614654abe847a42fc75d7eb5096e46f796a438b6 but that commit is still not included in archlinux stock kernel. I just rebased it and I'm compiling a custom kernel to see if this solves the issue.
Comment by loqs (loqs) - Saturday, 25 April 2020, 03:01 GMT
@ntd did you test if drm-tip resolves the X server freezes you noted in [1]?

[1] https://gitlab.freedesktop.org/drm/intel/-/issues/1700#note_464734
Comment by Nicola Fontana (ntd) - Saturday, 25 April 2020, 09:01 GMT
@loqs No, as stated I just downgraded to 5.4.15-arch1 to have a usable system (I need to have that PC working). Furthermore, I really don't know what the differences are between drm-tip and the stock Arch kernel, so I don't know if such a test would be relevant.
Comment by loqs (loqs) - Saturday, 25 April 2020, 23:41 GMT
If the issue cannot be reproduced under drm-tip then it would imply some change applied to drm-tip resolved the issue and it would confirm [1].
You could, as an intermediate step, try 5.7-rc2, which does not contain 614654abe847a42fc75d7eb5096e46f796a438b6.
drm-tip is currently 5.7-rc2 plus drm changes for 5.8 plus additional fixes. As an integration branch, drm-tip is not merged to mainline.

[1] https://gitlab.freedesktop.org/drm/intel/-/issues/1700#note_463875
Edit:
If you want to test whether 614654abe847a42fc75d7eb5096e46f796a438b6 fixed the issue, test 614654abe847a42fc75d7eb5096e46f796a438b6 and then its parent.
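
The "commit, then its parent" check suggested above can be sketched in Python as follows. This is only an illustration: it assumes a local git checkout of the drm-tip tree that contains the commit, the helper names revisions_to_test and checkout_commands are hypothetical, and building and booting each kernel is left out. The suffix "^" is git's notation for a commit's first parent.

```python
import shlex

# Commit hash quoted in the thread; assumed to exist in the local drm-tip clone.
FIX_COMMIT = "614654abe847a42fc75d7eb5096e46f796a438b6"


def revisions_to_test(commit):
    """Return the fix commit followed by its first parent ("<commit>^")."""
    return [commit, commit + "^"]


def checkout_commands(commit):
    """Build the git commands that check out each revision to be tested."""
    return [
        "git checkout --detach " + shlex.quote(rev)
        for rev in revisions_to_test(commit)
    ]


if __name__ == "__main__":
    # Print the commands instead of running them, so nothing is modified.
    for command in checkout_commands(FIX_COMMIT):
        print(command)
```

If the hang reproduces on the parent but not on the commit itself, that points at this commit as the fix; if both behave the same, the fix lies elsewhere.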
Comment by loqs (loqs) - Monday, 15 June 2020, 02:40 GMT
@ntd 5.7.2 contains af23facc38c26ac188b6adac2cae2dafaca0c82d [1] which is the fix mentioned in your upstream bug report.
Is the issue resolved?

[1] https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=af23facc38c26ac188b6adac2cae2dafaca0c82d
Comment by itsme (itsme) - Monday, 15 June 2020, 02:57 GMT
Still have the problem on ThinkPad X1 Gen 7 with 5.7.2.

00:02.0 VGA compatible controller: Intel Corporation UHD Graphics 620 (Whiskey Lake) (rev 02)
Comment by loqs (loqs) - Monday, 15 June 2020, 16:35 GMT
@itsme if your issue has the same call trace as in [1] can you please update that bug report noting af23facc38c26ac188b6adac2cae2dafaca0c82d did not fix the issue.

[1] https://gitlab.freedesktop.org/drm/intel/-/issues/1700
Comment by Nicola Fontana (ntd) - Friday, 19 June 2020, 17:57 GMT
@loqs I've been using 5.7.2-arch1-1 since yesterday without issues. I'll wait a couple more days before updating upstream, just to be on the safe side.
Comment by itsme (itsme) - Sunday, 21 June 2020, 12:08 GMT
@loqs I have another problem, the interface freezes for a short time, usually for 3-7 seconds, but sometimes it happens for a minute or more.

I will try to catch the problem and write to the tracker.
Comment by loqs (loqs) - Sunday, 21 June 2020, 13:04 GMT
See [1] for details to include on the bug report.

[1] https://gitlab.freedesktop.org/drm/intel/-/wikis/How-to-file-i915-bugs
Comment by Nicola Fontana (ntd) - Monday, 22 June 2020, 08:27 GMT
I just closed the upstream bug [1] because (at least in my case) the problem seems to be resolved.

[1] https://gitlab.freedesktop.org/drm/intel/-/issues/1700
Comment by Piotr Bochenski (pit) - Wednesday, 24 June 2020, 13:50 GMT
The i915 problems are not related to the kernel only; there must be something else broken. Even with the kernel downgraded to 5.4.15-arch1-1 I'm still getting occasional hangs.
Relevant logs attached. HW is i5-6300U with HD Graphics 520. Everything up to date besides kernel.
Comment by loqs (loqs) - Wednesday, 24 June 2020, 15:32 GMT
@pit is the issue present under 5.7.5-arch1-1, 5.8-rc2 and drm-tip? The journal extract you posted does not contain all the kernel messages from that boot.
Comment by Piotr Bochenski (pit) - Wednesday, 24 June 2020, 15:38 GMT
@loqs The issue is present in current Arch state with kernel rolled back to 5.4.15-arch1-1
Instead of trying some testing/alpha/debug/etc. packages, I'd prefer to revert some packages to have a stable OS...
Comment by loqs (loqs) - Wednesday, 24 June 2020, 16:15 GMT
The current Arch packages are linux-lts 5.4.48-1 and linux 5.7.15. Have you tested with both those versions?
Which kernel was the journal you posted produced by? The extract did not contain that information.
Why do you believe it is a packaging issue?
Comment by Piotr Bochenski (pit) - Wednesday, 24 June 2020, 16:23 GMT
> Have you tested with both those versions?
No, I've reverted directly to 5.4.15-arch1-1 because I had used it previously with no issues at all.

> Which kernel was the journal you posted produced by?
5.4.15-arch1-1

> Why do you believe it is a packaging issue?
Because it did not happen on that kernel before upgrade.
I had a stable OS with packages from ~January 2020 and upgraded it this Monday. I experienced a GPU hang and came across this bug. As mentioned by the OP, kernel 5.4.15-arch1-1 was supposed to be unaffected by this issue, so I reverted to it right away, but that step did not solve the hangs. This leads me to the conclusion that the Intel issues might not be related to the kernel itself.
Comment by loqs (loqs) - Wednesday, 24 June 2020, 16:51 GMT
See [1]: packaging issues are issues caused by how the package is produced / integrated / distributed, not issues caused by upstream bugs.

The original report as you mention reported no issues under 5.4.15 while your extract shows issues. Is it not a separate issue?
You upgraded on Monday from what kernel version to what kernel version? You did not include the pacman.log for the update.
You could revert all the packages you updated and see if you can reproduce the issue.

If you want support identifying the cause of your issue please use support channels such as the forums, mailing list, IRC.

[1] https://wiki.archlinux.org/index.php/Bug_reporting_guidelines#Upstream_or_Arch?
Comment by George Shearer (docdrow) - Monday, 31 August 2020, 21:04 GMT
Sadly, I have to report that this bug is back and just as bad if not worse. Started happening fairly consistently in 5.8. :(

Comment by loqs (loqs) - Monday, 31 August 2020, 21:39 GMT
@docdrow have you reported the issue upstream so it can be resolved?
