FS#63909 - [linux] 5.3 Random freezes
Attached to Project:
Arch Linux
Opened by Daniel Holz (holzi) - Tuesday, 24 September 2019, 18:45 GMT
Last edited by freswa (frederik) - Friday, 21 February 2020, 22:06 GMT
Details
Description:
Random freezes since 5.3. They last from one to around 15 seconds. The whole system becomes unresponsive and the screen freezes. Reinstalling kernel 5.2.14 fixes the issue. The freezes appear on both battery and AC power. I tried disabling tlp, but that did not help.
Additional info: I added the output of journalctl for the timespan of the freezes.
Steps to reproduce: I just have to use my computer. Sometimes the freezes come early, sometimes only after days. When they appear, they are triggered by file system operations or videos in Chromium.
Closed by freswa (frederik)
Friday, 21 February 2020, 22:06 GMT
Reason for closing: None
Additional comments about closing: This seems pretty stalled to me. If it's still an issue, please file a re-open request. Thank you :)
For me it is immediately triggered when under heavy i/o load, i.e. moving 15GB of data from one SATA SSD to another SATA SSD, or extracting large archives.
This is on a desktop machine (i7-4790k) using the integrated graphics. Will try to reproduce on my other machines.
I'm using root on zfs btw, curious if other people with these freeze issues are using zfs too.
Can someone try the zen kernel? I have been using it for 5 days now and so far there have been no freezes for me.
Another user on the forums noted the same symptoms I was experiencing: https://bbs.archlinux.org/viewtopic.php?pid=1865957#p1865957
I got it "fixed" by disabling any swap-file/partition. Changing swappiness settings didn't help.
Maybe this helps someone. I am now running linux-lts because I need my swap space back...
Tried disabling tlp, which didn't help; disabling swap then fixed it. There was nothing in journalctl while the stutters were happening.
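Since several reports here hinge on whether swap was really off while testing, here is a minimal sketch for checking that. It assumes the usual Linux `/proc/swaps` layout (one header line, then one line per swap area, sizes in KiB); the function names `parse_swaps` and `active_swaps` are made up for illustration.

```python
from pathlib import Path


def parse_swaps(text):
    """Return (device, size_kib, used_kib) tuples from /proc/swaps content."""
    entries = []
    # The first line is the column header: Filename Type Size Used Priority
    for line in text.splitlines()[1:]:
        fields = line.split()
        if len(fields) < 5:
            continue  # skip blank or malformed lines
        entries.append((fields[0], int(fields[2]), int(fields[3])))
    return entries


def active_swaps(path="/proc/swaps"):
    """Read the live swap table; returns [] if the file is missing or empty."""
    swaps_file = Path(path)
    if not swaps_file.exists():
        return []
    return parse_swaps(swaps_file.read_text())


if __name__ == "__main__":
    swaps = active_swaps()
    if not swaps:
        print("no active swap")
    for device, size_kib, used_kib in swaps:
        print(f"{device}: {used_kib} KiB used of {size_kib} KiB")
```

Running it before and after `swapoff` shows whether any swap area is still listed when a freeze occurs.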
commit 1e04eb03877c3e0a38c1be1845be97074a1198b6
Author: Damien Le Moal <damien.lemoal@wdc.com>
Date: Wed Aug 28 13:40:20 2019 +0900
block: mq-deadline: Fix queue restart handling
commit cb8acabbe33b110157955a7425ee876fb81e6bbc upstream.
Commit 7211aef86f79 ("block: mq-deadline: Fix write completion
handling") added a call to blk_mq_sched_mark_restart_hctx() in
dd_dispatch_request() to make sure that write request dispatching does
not stall when all target zones are locked. This fix left a subtle race
when a write completion happens during a dispatch execution on another
CPU:
CPU 0: Dispatch                   CPU1: write completion

dd_dispatch_request()
    lock(&dd->lock);
    ...
    lock(&dd->zone_lock);         dd_finish_request()
    rq = find request             lock(&dd->zone_lock);
    unlock(&dd->zone_lock);
                                  zone write unlock
                                  unlock(&dd->zone_lock);
                                  ...
                                  __blk_mq_free_request
                                  check restart flag (not set)
                                  -> queue not run
    ...
    if (!rq && have writes)
        blk_mq_sched_mark_restart_hctx()
    unlock(&dd->lock)
Since the dispatch context finishes after the write request completion
handling, marking the queue as needing a restart is not seen from
__blk_mq_free_request() and blk_mq_sched_restart() not executed leading
to the dispatch stall under 100% write workloads.
Fix this by moving the call to blk_mq_sched_mark_restart_hctx() from
dd_dispatch_request() into dd_finish_request() under the zone lock to
ensure full mutual exclusion between write request dispatch selection
and zone unlock on write request completion.
Fixes: 7211aef86f79 ("block: mq-deadline: Fix write completion handling")
Cc: stable@vger.kernel.org
Reported-by: Hans Holmberg <Hans.Holmberg@wdc.com>
Reviewed-by: Hans Holmberg <hans.holmberg@wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
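The commit above only touches the mq-deadline scheduler, so it may be worth checking which I/O scheduler each disk is actually using before assuming it applies. A minimal sketch, assuming the standard sysfs layout `/sys/block/<dev>/queue/scheduler`, where the active scheduler is shown in square brackets; the function names are hypothetical.

```python
import re
from pathlib import Path


def active_scheduler(text):
    """Return the bracketed (active) scheduler name, or None if none is bracketed."""
    match = re.search(r"\[([^\]]+)\]", text)
    return match.group(1) if match else None


def schedulers_by_device(sys_block="/sys/block"):
    """Map each block device name to its active scheduler (best effort)."""
    result = {}
    root = Path(sys_block)
    if not root.is_dir():
        return result
    for device in sorted(root.iterdir()):
        sched_file = device / "queue" / "scheduler"
        try:
            result[device.name] = active_scheduler(sched_file.read_text())
        except OSError:
            continue  # device has no scheduler file or it is unreadable
    return result


if __name__ == "__main__":
    for name, scheduler in schedulers_by_device().items():
        print(f"{name}: {scheduler}")
```

A disk reporting `mq-deadline` here is at least a candidate for the stall described in the commit message; other schedulers are not covered by this fix.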
I'm not so sure. I'm on 5.3.4 right now, and I have had two 3-4 second freezes over a few hours. This looks better than before, but we'll see how it works out later.
> I think I have disabled swap at some point while running 5.3, but that didn't help
Btw, yeah, disabling swap seems to fix it. It's reproducible with swap enabled.
I tried disabling swap at runtime with 'swapoff /dev/sda3' which turned swap off but didn't fix the freezing. I downgraded to linux-5.2.9.arch1-1 (which was the latest 5.2 kernel I had in my package cache) and the problem has gone away, so it seems to be related to the 5.3 kernel. I run an nvidia card with the nouveau driver (and don't have the intel driver installed), so that isn't the cause of my problem.