FS#75144 - [linux] f2fs ckpt hang on

Attached to Project: Arch Linux
Opened by txyyh (txyyh) - Wednesday, 22 June 2022, 18:11 GMT
Last edited by Toolybird (Toolybird) - Friday, 07 October 2022, 05:42 GMT
Task Type Bug Report
Category Kernel
Status Closed
Assigned To No-one
Architecture x86_64
Severity High
Priority Normal
Reported Version
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 1
Private No

Details

Description:


Additional info:
core/linux 5.18.5.arch1-1

[27156.182377] INFO: task f2fs_ckpt-8:2:186 blocked for more than 122 seconds.
[27156.182381] Not tainted 5.18.5-arch1-1 #1
[27156.182382] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[27156.182382] task:f2fs_ckpt-8:2 state:D stack: 0 pid: 186 ppid: 2 flags:0x00004000
[27156.182385] Call Trace:
[27156.182386] <TASK>
[27156.182387] __schedule+0x37c/0x11f0
[27156.182393] schedule+0x4f/0xb0
[27156.182395] rwsem_down_write_slowpath+0x1eb/0x5c0
[27156.182397] ? __schedule+0x384/0x11f0
[27156.182400] ? __checkpoint_and_complete_reqs+0x1b0/0x1b0 [f2fs c5c4fcb9d941679040657b635a74e01a990d7b7f]
[27156.182417] __checkpoint_and_complete_reqs+0x7a/0x1b0 [f2fs c5c4fcb9d941679040657b635a74e01a990d7b7f]
[27156.182431] ? __checkpoint_and_complete_reqs+0x1b0/0x1b0 [f2fs c5c4fcb9d941679040657b635a74e01a990d7b7f]
[27156.182444] issue_checkpoint_thread+0x4c/0x110 [f2fs c5c4fcb9d941679040657b635a74e01a990d7b7f]
[27156.182457] ? cpuacct_percpu_seq_show+0x20/0x20
[27156.182460] kthread+0xde/0x110
[27156.182463] ? kthread_complete_and_exit+0x20/0x20
[27156.182465] ret_from_fork+0x22/0x30
[27156.182468] </TASK>
[27156.182521] INFO: task kworker/u8:0:4011 blocked for more than 122 seconds.
[27156.182522] Not tainted 5.18.5-arch1-1 #1
[27156.182523] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[27156.182523] task:kworker/u8:0 state:D stack: 0 pid: 4011 ppid: 2 flags:0x00004000
[27156.182526] Workqueue: writeback wb_workfn (flush-8:0)
[27156.182529] Call Trace:
[27156.182530] <TASK>
[27156.182531] __schedule+0x37c/0x11f0
[27156.182533] schedule+0x4f/0xb0
[27156.182535] schedule_timeout+0x119/0x150
[27156.182537] __wait_for_common+0x94/0x1d0
[27156.182539] ? usleep_range_state+0x90/0x90
[27156.182541] f2fs_issue_checkpoint+0x11f/0x200 [f2fs c5c4fcb9d941679040657b635a74e01a990d7b7f]
[27156.182555] f2fs_balance_fs_bg+0x115/0x360 [f2fs c5c4fcb9d941679040657b635a74e01a990d7b7f]
[27156.182570] ? dd_prepare_request+0x20/0x20
[27156.182572] ? blk_mq_do_dispatch_sched+0xa4/0x3b0
[27156.182576] f2fs_write_node_pages+0x78/0x230 [f2fs c5c4fcb9d941679040657b635a74e01a990d7b7f]
[27156.182590] do_writepages+0xc4/0x1d0
[27156.182593] ? cpumask_next+0x22/0x30
[27156.182595] ? __percpu_counter_sum+0x5b/0x70
[27156.182598] __writeback_single_inode+0x3d/0x350
[27156.182599] writeback_sb_inodes+0x1ea/0x4a0
[27156.182602] __writeback_inodes_wb+0x4c/0xe0
[27156.182603] wb_writeback+0x204/0x2f0
[27156.182605] wb_workfn+0x36f/0x520
[27156.182607] process_one_work+0x1c7/0x380
[27156.182609] worker_thread+0x51/0x380
[27156.182611] ? rescuer_thread+0x3a0/0x3a0
[27156.182612] kthread+0xde/0x110
[27156.182614] ? kthread_complete_and_exit+0x20/0x20
[27156.182616] ret_from_fork+0x22/0x30
[27156.182619] </TASK>
[27279.062370] INFO: task f2fs_ckpt-8:2:186 blocked for more than 245 seconds.
[27279.062374] Not tainted 5.18.5-arch1-1 #1
[27279.062375] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[27279.062376] task:f2fs_ckpt-8:2 state:D stack: 0 pid: 186 ppid: 2 flags:0x00004000
[27279.062379] Call Trace:
[27279.062380] <TASK>
[27279.062382] __schedule+0x37c/0x11f0
[27279.062388] schedule+0x4f/0xb0
[27279.062391] rwsem_down_write_slowpath+0x1eb/0x5c0
[27279.062393] ? __schedule+0x384/0x11f0
[27279.062396] ? __checkpoint_and_complete_reqs+0x1b0/0x1b0 [f2fs c5c4fcb9d941679040657b635a74e01a990d7b7f]
[27279.062417] __checkpoint_and_complete_reqs+0x7a/0x1b0 [f2fs c5c4fcb9d941679040657b635a74e01a990d7b7f]
[27279.062430] ? __checkpoint_and_complete_reqs+0x1b0/0x1b0 [f2fs c5c4fcb9d941679040657b635a74e01a990d7b7f]
[27279.062443] issue_checkpoint_thread+0x4c/0x110 [f2fs c5c4fcb9d941679040657b635a74e01a990d7b7f]
[27279.062456] ? cpuacct_percpu_seq_show+0x20/0x20
[27279.062459] kthread+0xde/0x110
[27279.062462] ? kthread_complete_and_exit+0x20/0x20
[27279.062464] ret_from_fork+0x22/0x30
[27279.062467] </TASK>
[27279.062544] INFO: task kworker/u8:0:4011 blocked for more than 245 seconds.
[27279.062545] Not tainted 5.18.5-arch1-1 #1
[27279.062546] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[27279.062547] task:kworker/u8:0 state:D stack: 0 pid: 4011 ppid: 2 flags:0x00004000
[27279.062550] Workqueue: writeback wb_workfn (flush-8:0)
[27279.062552] Call Trace:
[27279.062553] <TASK>
[27279.062554] __schedule+0x37c/0x11f0
[27279.062557] schedule+0x4f/0xb0
[27279.062559] schedule_timeout+0x119/0x150
[27279.062560] __wait_for_common+0x94/0x1d0
[27279.062563] ? usleep_range_state+0x90/0x90
[27279.062564] f2fs_issue_checkpoint+0x11f/0x200 [f2fs c5c4fcb9d941679040657b635a74e01a990d7b7f]
[27279.062579] f2fs_balance_fs_bg+0x115/0x360 [f2fs c5c4fcb9d941679040657b635a74e01a990d7b7f]
[27279.062594] ? dd_prepare_request+0x20/0x20
[27279.062596] ? blk_mq_do_dispatch_sched+0xa4/0x3b0
[27279.062599] f2fs_write_node_pages+0x78/0x230 [f2fs c5c4fcb9d941679040657b635a74e01a990d7b7f]
[27279.062613] do_writepages+0xc4/0x1d0
[27279.062616] ? cpumask_next+0x22/0x30
[27279.062619] ? __percpu_counter_sum+0x5b/0x70
[27279.062621] __writeback_single_inode+0x3d/0x350
[27279.062623] writeback_sb_inodes+0x1ea/0x4a0
[27279.062626] __writeback_inodes_wb+0x4c/0xe0
[27279.062627] wb_writeback+0x204/0x2f0
[27279.062629] wb_workfn+0x36f/0x520
[27279.062631] process_one_work+0x1c7/0x380
[27279.062633] worker_thread+0x51/0x380
[27279.062635] ? rescuer_thread+0x3a0/0x3a0
[27279.062636] kthread+0xde/0x110
[27279.062638] ? kthread_complete_and_exit+0x20/0x20
[27279.062640] ret_from_fork+0x22/0x30
[27279.062643] </TASK>
[27401.942353] INFO: task f2fs_ckpt-8:2:186 blocked for more than 368 seconds.
[27401.942357] Not tainted 5.18.5-arch1-1 #1
[27401.942358] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[27401.942358] task:f2fs_ckpt-8:2 state:D stack: 0 pid: 186 ppid: 2 flags:0x00004000
[27401.942361] Call Trace:
[27401.942362] <TASK>
[27401.942363] __schedule+0x37c/0x11f0
[27401.942369] schedule+0x4f/0xb0
[27401.942371] rwsem_down_write_slowpath+0x1eb/0x5c0
[27401.942373] ? __schedule+0x384/0x11f0
[27401.942376] ? __checkpoint_and_complete_reqs+0x1b0/0x1b0 [f2fs c5c4fcb9d941679040657b635a74e01a990d7b7f]
[27401.942393] __checkpoint_and_complete_reqs+0x7a/0x1b0 [f2fs c5c4fcb9d941679040657b635a74e01a990d7b7f]
[27401.942407] ? __checkpoint_and_complete_reqs+0x1b0/0x1b0 [f2fs c5c4fcb9d941679040657b635a74e01a990d7b7f]
[27401.942420] issue_checkpoint_thread+0x4c/0x110 [f2fs c5c4fcb9d941679040657b635a74e01a990d7b7f]
[27401.942433] ? cpuacct_percpu_seq_show+0x20/0x20
[27401.942436] kthread+0xde/0x110
[27401.942439] ? kthread_complete_and_exit+0x20/0x20
[27401.942441] ret_from_fork+0x22/0x30
[27401.942444] </TASK>
[27401.942498] INFO: task kworker/u8:0:4011 blocked for more than 368 seconds.
[27401.942500] Not tainted 5.18.5-arch1-1 #1
[27401.942500] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[27401.942501] task:kworker/u8:0 state:D stack: 0 pid: 4011 ppid: 2 flags:0x00004000
[27401.942504] Workqueue: writeback wb_workfn (flush-8:0)
[27401.942507] Call Trace:
[27401.942507] <TASK>
[27401.942508] __schedule+0x37c/0x11f0
[27401.942511] schedule+0x4f/0xb0
[27401.942513] schedule_timeout+0x119/0x150
[27401.942515] __wait_for_common+0x94/0x1d0
[27401.942517] ? usleep_range_state+0x90/0x90
[27401.942519] f2fs_issue_checkpoint+0x11f/0x200 [f2fs c5c4fcb9d941679040657b635a74e01a990d7b7f]
[27401.942533] f2fs_balance_fs_bg+0x115/0x360 [f2fs c5c4fcb9d941679040657b635a74e01a990d7b7f]
[27401.942548] ? dd_prepare_request+0x20/0x20
[27401.942550] ? blk_mq_do_dispatch_sched+0xa4/0x3b0
[27401.942553] f2fs_write_node_pages+0x78/0x230 [f2fs c5c4fcb9d941679040657b635a74e01a990d7b7f]
[27401.942567] do_writepages+0xc4/0x1d0
[27401.942570] ? cpumask_next+0x22/0x30
[27401.942573] ? __percpu_counter_sum+0x5b/0x70
[27401.942575] __writeback_single_inode+0x3d/0x350
[27401.942577] writeback_sb_inodes+0x1ea/0x4a0
[27401.942579] __writeback_inodes_wb+0x4c/0xe0
[27401.942581] wb_writeback+0x204/0x2f0
[27401.942583] wb_workfn+0x36f/0x520
[27401.942585] process_one_work+0x1c7/0x380
[27401.942587] worker_thread+0x51/0x380
[27401.942588] ? rescuer_thread+0x3a0/0x3a0
[27401.942590] kthread+0xde/0x110
[27401.942592] ? kthread_complete_and_exit+0x20/0x20
[27401.942594] ret_from_fork+0x22/0x30
[27401.942597] </TASK>
[27524.822343] INFO: task f2fs_ckpt-8:2:186 blocked for more than 491 seconds.
[27524.822348] Not tainted 5.18.5-arch1-1 #1
[27524.822349] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[27524.822351] task:f2fs_ckpt-8:2 state:D stack: 0 pid: 186 ppid: 2 flags:0x00004000
[27524.822355] Call Trace:
[27524.822356] <TASK>
[27524.822358] __schedule+0x37c/0x11f0
[27524.822365] schedule+0x4f/0xb0
[27524.822368] rwsem_down_write_slowpath+0x1eb/0x5c0
[27524.822371] ? __schedule+0x384/0x11f0
[27524.822374] ? __checkpoint_and_complete_reqs+0x1b0/0x1b0 [f2fs c5c4fcb9d941679040657b635a74e01a990d7b7f]
[27524.822399] __checkpoint_and_complete_reqs+0x7a/0x1b0 [f2fs c5c4fcb9d941679040657b635a74e01a990d7b7f]
[27524.822421] ? __checkpoint_and_complete_reqs+0x1b0/0x1b0 [f2fs c5c4fcb9d941679040657b635a74e01a990d7b7f]
[27524.822441] issue_checkpoint_thread+0x4c/0x110 [f2fs c5c4fcb9d941679040657b635a74e01a990d7b7f]
[27524.822462] ? cpuacct_percpu_seq_show+0x20/0x20
[27524.822466] kthread+0xde/0x110
[27524.822470] ? kthread_complete_and_exit+0x20/0x20
[27524.822473] ret_from_fork+0x22/0x30
[27524.822478] </TASK>
[27524.822532] INFO: task kworker/u8:0:4011 blocked for more than 491 seconds.
[27524.822534] Not tainted 5.18.5-arch1-1 #1
[27524.822536] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[27524.822536] task:kworker/u8:0 state:D stack: 0 pid: 4011 ppid: 2 flags:0x00004000
[27524.822540] Workqueue: writeback wb_workfn (flush-8:0)
[27524.822544] Call Trace:
[27524.822545] <TASK>
[27524.822546] __schedule+0x37c/0x11f0
[27524.822550] schedule+0x4f/0xb0
[27524.822553] schedule_timeout+0x119/0x150
[27524.822555] __wait_for_common+0x94/0x1d0
[27524.822558] ? usleep_range_state+0x90/0x90
[27524.822561] f2fs_issue_checkpoint+0x11f/0x200 [f2fs c5c4fcb9d941679040657b635a74e01a990d7b7f]
[27524.822583] f2fs_balance_fs_bg+0x115/0x360 [f2fs c5c4fcb9d941679040657b635a74e01a990d7b7f]
[27524.822606] ? dd_prepare_request+0x20/0x20
[27524.822609] ? blk_mq_do_dispatch_sched+0xa4/0x3b0
[27524.822612] f2fs_write_node_pages+0x78/0x230 [f2fs c5c4fcb9d941679040657b635a74e01a990d7b7f]
[27524.822635] do_writepages+0xc4/0x1d0
[27524.822638] ? cpumask_next+0x22/0x30
[27524.822641] ? __percpu_counter_sum+0x5b/0x70
[27524.822645] __writeback_single_inode+0x3d/0x350
[27524.822648] writeback_sb_inodes+0x1ea/0x4a0
[27524.822652] __writeback_inodes_wb+0x4c/0xe0
[27524.822655] wb_writeback+0x204/0x2f0
[27524.822658] wb_workfn+0x36f/0x520
[27524.822661] process_one_work+0x1c7/0x380
[27524.822664] worker_thread+0x51/0x380
[27524.822666] ? rescuer_thread+0x3a0/0x3a0
[27524.822668] kthread+0xde/0x110
[27524.822671] ? kthread_complete_and_exit+0x20/0x20
[27524.822674] ret_from_fork+0x22/0x30
[27524.822678] </TASK>
[27647.702327] INFO: task f2fs_ckpt-8:2:186 blocked for more than 614 seconds.
[27647.702332] Not tainted 5.18.5-arch1-1 #1
[27647.702333] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[27647.702334] task:f2fs_ckpt-8:2 state:D stack: 0 pid: 186 ppid: 2 flags:0x00004000
[27647.702337] Call Trace:
[27647.702339] <TASK>
[27647.702340] __schedule+0x37c/0x11f0
[27647.702347] schedule+0x4f/0xb0
[27647.702350] rwsem_down_write_slowpath+0x1eb/0x5c0
[27647.702353] ? __schedule+0x384/0x11f0
[27647.702356] ? __checkpoint_and_complete_reqs+0x1b0/0x1b0 [f2fs c5c4fcb9d941679040657b635a74e01a990d7b7f]
[27647.702374] __checkpoint_and_complete_reqs+0x7a/0x1b0 [f2fs c5c4fcb9d941679040657b635a74e01a990d7b7f]
[27647.702388] ? __checkpoint_and_complete_reqs+0x1b0/0x1b0 [f2fs c5c4fcb9d941679040657b635a74e01a990d7b7f]
[27647.702401] issue_checkpoint_thread+0x4c/0x110 [f2fs c5c4fcb9d941679040657b635a74e01a990d7b7f]
[27647.702414] ? cpuacct_percpu_seq_show+0x20/0x20
[27647.702417] kthread+0xde/0x110
[27647.702420] ? kthread_complete_and_exit+0x20/0x20
[27647.702422] ret_from_fork+0x22/0x30
[27647.702425] </TASK>
[27647.702474] INFO: task kworker/u8:0:4011 blocked for more than 614 seconds.
[27647.702475] Not tainted 5.18.5-arch1-1 #1
[27647.702476] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[27647.702476] task:kworker/u8:0 state:D stack: 0 pid: 4011 ppid: 2 flags:0x00004000
[27647.702479] Workqueue: writeback wb_workfn (flush-8:0)
[27647.702482] Call Trace:
[27647.702483] <TASK>
[27647.702483] __schedule+0x37c/0x11f0
[27647.702486] schedule+0x4f/0xb0
[27647.702488] schedule_timeout+0x119/0x150
[27647.702490] __wait_for_common+0x94/0x1d0
[27647.702492] ? usleep_range_state+0x90/0x90
[27647.702494] f2fs_issue_checkpoint+0x11f/0x200 [f2fs c5c4fcb9d941679040657b635a74e01a990d7b7f]
[27647.702508] f2fs_balance_fs_bg+0x115/0x360 [f2fs c5c4fcb9d941679040657b635a74e01a990d7b7f]
[27647.702523] ? dd_prepare_request+0x20/0x20
[27647.702525] ? blk_mq_do_dispatch_sched+0xa4/0x3b0
[27647.702528] f2fs_write_node_pages+0x78/0x230 [f2fs c5c4fcb9d941679040657b635a74e01a990d7b7f]
[27647.702542] do_writepages+0xc4/0x1d0
[27647.702545] ? cpumask_next+0x22/0x30
[27647.702547] ? __percpu_counter_sum+0x5b/0x70
[27647.702550] __writeback_single_inode+0x3d/0x350
[27647.702552] writeback_sb_inodes+0x1ea/0x4a0
[27647.702554] __writeback_inodes_wb+0x4c/0xe0
[27647.702556] wb_writeback+0x204/0x2f0
[27647.702557] wb_workfn+0x36f/0x520
[27647.702559] process_one_work+0x1c7/0x380
[27647.702562] worker_thread+0x51/0x380
[27647.702563] ? rescuer_thread+0x3a0/0x3a0
[27647.702564] kthread+0xde/0x110
[27647.702566] ? kthread_complete_and_exit+0x20/0x20
[27647.702569] ret_from_fork+0x22/0x30
[27647.702571] </TASK>


Steps to reproduce:
This task depends upon

Closed by  Toolybird (Toolybird)
Friday, 07 October 2022, 05:42 GMT
Reason for closing:  Duplicate
Additional comments about closing:   FS#74906 
Comment by Tarsin Norbin (tarsin) - Thursday, 23 June 2022, 01:29 GMT
This f2fs_ckpt kernel workqueue hang on will cause all "sync" syscall hang on, thus systemd-shutdown will be blocked and we can't safely shutdown or reboot etc.
Comment by loqs (loqs) - Thursday, 23 June 2022, 04:56 GMT
Have you reported the issue upstream on either https://bugzilla.kernel.org/ or linux-f2fs-devel mailing list?
Comment by Maxim Sheviakov (mradermaxlol) - Saturday, 25 June 2022, 07:42 GMT
Experienced the same issue just yesterday with linux-5.18.2. I don't know whether any data loss has occurred as a result, but a systemd service symlink made by "systemctl enable" magically disappeared after a reboot (which happened with a hang, of course).
Comment by Tarsin Norbin (tarsin) - Saturday, 25 June 2022, 13:04 GMT
I guess all write operations won't be executed since f2fs_ckpt hang on, they would just stuck in kernel buffer and no writeback happen, actually many inodes are broken after I reboot.
Comment by Toolybird (Toolybird) - Monday, 01 August 2022, 07:46 GMT
Is this still happening? Unless someone reports it upstream then there is not much Arch can do here.
Comment by Emil Sauer Lynge (emillynge) - Monday, 01 August 2022, 20:50 GMT
I have the same issue.

temporarily switching to the linux-lts kernel (currently 5.15.55-2-lts) seems to have helped.
(remember to install linux-lts-header if you use dkms + update grub config)
Comment by Maxim Sheviakov (mradermaxlol) - Tuesday, 02 August 2022, 05:26 GMT
I have just moved the last machine I have access to back to good ol' ext4 after experiencing this clusterbuck f2fs has gotten into during the 5.18 cycle for a while. Guess I'm not of much help here anymore.

The only thing I can say (in retrospective) is that this issue seems to be present from kernel 5.18.2 onwards and it's pretty random, at least in my experience.
Comment by Sebastian Goth (sgoth) - Tuesday, 06 September 2022, 10:16 GMT
Just started to see this too on 5.19.6 always resulting in the issue  FS#74906  with f2fs_gc hogging a core.
Comment by Sebastian Goth (sgoth) - Tuesday, 06 September 2022, 18:33 GMT
5.17.9-arch1-1 (last 5.17 arch release) seems to work fine so far
Comment by Toolybird (Toolybird) - Friday, 07 October 2022, 05:42 GMT
This seems to be a dupe of  FS#74906 . Upstream is tracking the issue(s) in [1]

[1] https://bugzilla.kernel.org/show_bug.cgi?id=216050

Loading...