FS#77596 - [linux] 6.2 fails to boot on F2FS root

Attached to Project: Arch Linux
Opened by Gereon Schomber (IncredibleLaser) - Tuesday, 21 February 2023, 09:06 GMT
Last edited by Buggy McBugFace (bugbot) - Saturday, 25 November 2023, 20:22 GMT
Task Type Bug Report
Category Packages: Testing
Status Closed
Assigned To Jan Alexander Steffens (heftig)
Architecture x86_64
Severity High
Priority Normal
Reported Version
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 2
Private No

Details

Description:
After installing Linux 6.2 from [testing], my system wouldn't fully boot anymore. My F2FS root was mounted RO (marked RW in /etc/fstab) so no logs were saved. I got a TTY at one point but since the system was RO, not much could be done there. I rebooted to a live USB system and installed 6.1.12 from [core] which continues to work.

Additional info:
* package version(s): Linux 6.2
* /etc/fstab root entry:

# /dev/nvme0n1p2 LABEL=root
UUID=15fddbaa-f25b-4a84-b048-607af48664ae / f2fs rw,relatime,lazytime,background_gc=on,no_heap,inline_xattr,inline_data,inline_dentry,flush_merge,extent_cache,mode=adaptive,active_logs=6,alloc_mode=default,fsync_mode=posix 0 0


Steps to reproduce:
* Install Linux 6.2 on F2FS root
* Reboot

Closed by  Buggy McBugFace (bugbot)
Saturday, 25 November 2023, 20:22 GMT
Reason for closing:  Moved
Additional comments about closing:  https://gitlab.archlinux.org/archlinux/packaging/packages/linux/issues/9
Comment by Jérôme Mahuet (Rydgel) - Tuesday, 21 February 2023, 10:42 GMT
I can confirm having the same issue.
The bug also happens with a self-compiled kernel, so the issue is upstream.
Comment by loqs (loqs) - Wednesday, 22 February 2023, 20:38 GMT
Is there an upstream bug report? Have you bisected [1] the kernel to find the causal commit?

[1] https://wiki.archlinux.org/title/Kernel#Debugging_regressions
Comment by Gereon Schomber (IncredibleLaser) - Thursday, 23 February 2023, 07:55 GMT
I can't report against upstream since that would break the rules ("Please use your distribution's bug tracking tools — This bugzilla is for reporting bugs against upstream Linux kernels.").

I have not bisected the kernel yet.
Comment by Gereon Schomber (IncredibleLaser) - Thursday, 23 February 2023, 13:25 GMT
I just tried, and I can't bisect the kernel because the build fails due to the bug discussed here: https://lore.kernel.org/bpf/0fbad67e-c359-47c3-8c10-faa003e6519f@app.fastmail.com/T/
Comment by loqs (loqs) - Thursday, 23 February 2023, 14:21 GMT
I have not encountered that issue when building the kernel. 6.2-1.4 is the 6.2 tag from https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git with no additional commits, built using Arch's config.
https://drive.google.com/file/d/1KutGCid-3xO_kBNDNB-txw_5dMEGviWC/view?usp=share_link linux-6.2-1.4-x86_64.pkg.tar.zst
https://drive.google.com/file/d/1m_SGqSnJbis9_3jtoHlCzLNWeVRohvQy/view?usp=share_link linux-headers-6.2-1.4-x86_64.pkg.tar.zst

This one is again plain upstream with no additions, built from the commit just before the f2fs pull was merged:
https://drive.google.com/file/d/1aEGho2uJBalKr2PpHJ7umYsB1uz_hKYc/view?usp=share_link linux-6.1.r10907.geb67d239f3aa-1-x86_64.pkg.tar.zst
https://drive.google.com/file/d/1CxweMIFLB66ehOK8RfapxAPvAp-OcR39/view?usp=share_link linux-headers-6.1.r10907.geb67d239f3aa-1-x86_64.pkg.tar.zst
Comment by Gereon Schomber (IncredibleLaser) - Thursday, 23 February 2023, 14:45 GMT
Thanks for your efforts. 6.2-1.4 doesn't work while the one from before the f2fs pull merge does.

I have attached dmesg output from the non-working boot, but I don't see any real hints in there: it complains that the FS is RO but not why it was mounted RO.

Edit: The reason is that my kernel command line did not contain the rw option. Previously, the kernel would mount root RW anyway. It seems this has changed. Also, `/usr/share/systemd/bootctl/arch.conf` does not list the option by default, and this is the template I used when I installed the system.
   dmesg (99.8 KiB)
Comment by loqs (loqs) - Thursday, 23 February 2023, 15:21 GMT
[ 0.059302] Kernel command line: initrd=\amd-ucode.img initrd=\initramfs-linux.img root=PARTLABEL=root rootfstype=f2fs add_efi_memmap libata.allow_tpm=1 resume="PARTUUID=8ba3ef2a-6ab4-4702-98a5-cdb530063828"

If you set it rw does the boot succeed or the error change?
Comment by Gereon Schomber (IncredibleLaser) - Thursday, 23 February 2023, 15:23 GMT
Yeah, I added the option and it boots fine. I added that to my previous post some minutes ago; previous versions of the kernel didn't care, and it's not included by default in the provided template.
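
For anyone checking whether they are affected, here is a minimal sketch that reports whether a kernel command line asks for rw or ro. The function name root_mode_from_cmdline is hypothetical, and treating the last ro/rw word as the winner is an assumption about how the options are evaluated, not something confirmed in this thread.

```shell
# Hypothetical helper: print "rw", "ro", or "unset" for a kernel command line.
# Assumption: if both ro and rw appear, the last one wins.
root_mode_from_cmdline() (
    set -f                      # do not glob-expand words from the command line
    mode="unset"
    for word in $1; do          # intentional word splitting on whitespace
        case "$word" in
            ro|rw) mode="$word" ;;
        esac
    done
    printf '%s\n' "$mode"
)

# Usage on a running system:
#   root_mode_from_cmdline "$(cat /proc/cmdline)"
```

With the command line quoted in the comment above (no rw and no ro), this should print "unset", which is the case that stopped booting with 6.2 in this report.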
Comment by loqs (loqs) - Thursday, 23 February 2023, 15:31 GMT
6.2 included [1]. If you set rootflags=noflush_merge, can you then boot with ro?

[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=967eaad1fed5f6335ea97a47d45214744dc57925
Comment by Gereon Schomber (IncredibleLaser) - Thursday, 23 February 2023, 15:38 GMT
No, removing rw and adding that option leads to the same failed boot as without it.
Comment by loqs (loqs) - Thursday, 23 February 2023, 15:53 GMT
Is the error in dmesg still "F2FS-fs (nvme0n1p2): FLUSH_MERGE not compatible with readonly mode"?
Comment by Raman Mohan (mohan43u) - Friday, 03 March 2023, 13:56 GMT
I confirm this issue still exists. Adding the "rw" kernel parameter fixes it.
Comment by Quentin Bouget (ypsah) - Tuesday, 07 March 2023, 02:57 GMT
Hit this myself, fixed it by removing the flush_merge option from /etc/fstab.
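
That fstab change amounts to dropping one entry from the comma-separated options field. A minimal sketch, assuming a hypothetical helper named strip_mount_option; it only transforms a string and does not edit /etc/fstab itself:

```shell
# Hypothetical helper: remove one option from a comma-separated mount option list.
# $1 = option list (fourth fstab field), $2 = exact option name to drop.
strip_mount_option() {
    printf '%s\n' "$1" \
        | tr ',' '\n' \
        | grep -vxF -- "$2" \
        | paste -sd, -
}

# Example:
#   strip_mount_option "rw,relatime,flush_merge,extent_cache" flush_merge
```

The match is on the whole option name, so noflush_merge would be left alone when flush_merge is removed.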

Documentation of bootparam(7) says the following about ro & rw [0]:

> The 'ro' option tells the kernel to mount the root
> filesystem as 'read-only' so that filesystem consistency
> check programs (fsck) can do their work on a quiescent
> filesystem. No processes can write to files on the
> filesystem in question until it is 'remounted' as
> read/write capable, for example, by 'mount -w -n -o
> remount /'. (See also mount(8).)

> The 'rw' option tells the kernel to mount the root
> filesystem read/write. This is the default.

Do we know why the default appears to be ro now?
Assuming there's a good reason for that change, I wouldn't consider manually setting rw to be a satisfying fix.
Removing the flush_merge option like I did isn't better either; the documentation makes it sound rather useful [1]:

> Merge concurrent cache_flush commands as much as possible
> to eliminate redundant command issues. If the underlying
> device handles the cache_flush command relatively slowly,
> recommend to enable this option.

[0] https://github.com/mkerrisk/man-pages/blob/master/man7/bootparam.7#L173
[1] https://www.kernel.org/doc/Documentation/filesystems/f2fs.txt
Comment by Jan Alexander Steffens (heftig) - Tuesday, 07 March 2023, 03:23 GMT
Comment by Quentin Bouget (ypsah) - Tuesday, 07 March 2023, 03:56 GMT
Looks like it. Thanks for the references.
Comment by Thomas Weidner (thomas001le) - Monday, 13 March 2023, 07:39 GMT
Comment by loqs (loqs) - Monday, 13 March 2023, 21:58 GMT
@thomas001le this is going round in a circle to https://bugs.archlinux.org/task/77596#comment215496
If f2fs cannot support FLUSH_MERGE in read-only mode, then enforcing that requirement does not seem to be a bug; the alternative would be to drop the option.
If ro plus rootflags=noflush_merge (assuming rootflags=noflush_merge is valid) still produces the issue, that would appear to be a bug.
This assumes a workflow of mounting read-only with noflush_merge, running fsck, and then letting the fstab entry change the mount to read-write with flush_merge if desired.
Comment by Thomas Weidner (thomas001le) - Wednesday, 15 March 2023, 08:25 GMT
@loqs, sorry I had missed that earlier comment.

So if flush_merge + ro does not work, it seems to default back to noflush_merge if the option is not explicitly set and just fail if flush_merge is set. That sounds sensible. And indeed, rootflags=flush_merge won't work since it defaults to ro, I am with you here.
IMO it would still be more sensible to just disable flush_merge instead of failing to mount the root filesystem...

But for me, remounting the root filesystem rw didn't work since I had flush_merge in fstab. mount -o rw,remount / produced the same error message, so it looks like the kernel code checks the current ro/rw state and not the future one.

$ cat /proc/cmdline
initrd=\intel-ucode.img initrd=\initramfs-linux.img root=UUID=df0fb79c-d647-41a5-9284-7c211cd9512c rootfstype=f2fs add_efi_memmap

$ systemctl status systemd-remount-fs.service
[...]
Mar 15 09:18:17 kiste2 systemd-remount-fs[206]: /usr/bin/mount for / exited with exit status 32.
[...]

$ dmesg | grep f2fs
[ 2.703924] F2FS-fs (nvme0n1p2): FLUSH_MERGE not compatible with readonly mode

$ mount -o rw,remount /
mount: /: mount point not mounted or bad option.
dmesg(1) may have more information after failed mount system call.

$ mount -o rw,remount,noflush_merge /
[ no output, it worked! ]

$ mount -o remount,flush_merge /
[ it worked again! ]

$ mount | grep f2fs
/dev/nvme0n1p2 on / type f2fs (rw,relatime,lazytime,background_gc=on,discard,no_heap,user_xattr,inline_xattr,acl,inline_data,inline_dentry,flush_merge,barrier,extent_cache,mode=adaptive,active_logs=6,alloc_mode=default,checkpoint_merge,fsync_mode=posix,discard_unit=block,memory=normal)

That confirms it: you can't remount with rw and flush_merge at the same time right now; you have to do it in two steps. This breaks systemd remounting the root filesystem rw. Maybe this is a different bug, but the root cause seems to be the same kernel change.
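
The two-step remount from the transcript above can be wrapped as a small sketch. The function name remount_rw_two_step and its dry-run argument are hypothetical; the two mount invocations are the ones shown in this comment. This is a manual workaround, not a fix, and it needs root when actually run.

```shell
# Hypothetical wrapper around the two remount commands from this comment.
# $1 = mount point (default /), $2 = optional command prefix, e.g. "echo" for a dry run.
remount_rw_two_step() {
    target="${1:-/}"
    run="${2:-}"
    # Step 1: switch to read-write with flush_merge disabled.
    $run mount -o rw,remount,noflush_merge "$target" &&
    # Step 2: re-enable flush_merge now that the filesystem is read-write.
    $run mount -o remount,flush_merge "$target"
}

# Dry run (prints the commands instead of executing them):
#   remount_rw_two_step / echo
```

The second step only runs if the first succeeds, so a failed rw remount does not get masked.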
Comment by Jan Alexander Steffens (heftig) - Thursday, 27 July 2023, 23:06 GMT
Has this been resolved?
Comment by Quentin Bouget (ypsah) - Thursday, 03 August 2023, 09:51 GMT
As of 6.4.7.arch1-2, no.
