Arch Linux

FS#44385 - [linux] boot problem with an unclean btrfs root filesystem

Attached to Project: Arch Linux
Opened by Csaba Miklos (micsuka) - Saturday, 28 March 2015, 19:13 GMT
Last edited by Tobias Powalowski (tpowa) - Wednesday, 08 April 2015, 06:53 GMT
Task Type: Bug Report
Category: Kernel
Status: Closed
Assigned To: Tobias Powalowski (tpowa), Thomas Bächler (brain0)
Architecture: x86_64
Severity: Medium
Priority: Normal
Reported Version:
Due in Version: Undecided
Due Date: Undecided
Percent Complete: 100%
Votes: 2
Private: No

Details

Description:

I have an Arch Linux system with ext4 on /boot and btrfs on the root filesystem. The system doesn't boot when the btrfs filesystem was not unmounted cleanly.
It gets stuck at the following messages:
:: running hook [udev]
:: Triggering uevents
:: performing fsck on '/dev/sda3'
/sbin/fsck.btrfs: BTRFS filesystem
:: mounting '/dev/sda3' on real root

When passing the debug parameter to the kernel, I can see this message on the output:

no db file to read /run/udev/data/+bdi:btrfs-1: No such file or directory

It can be worked around by booting from the install DVD and simply mounting and unmounting the partition.


Additional info:
kernel version: 3.19.2-1 or 3.14.36-1
systemd: 218-2

Steps to reproduce:
Simply reset the machine instead of shutting it down or restarting it cleanly.


Closed by Tobias Powalowski (tpowa)
Wednesday, 08 April 2015, 06:53 GMT
Reason for closing: Fixed
Additional comments about closing: 3.19.3-2
Comment by Doug Newgard (Scimmia) - Saturday, 28 March 2015, 19:51 GMT
First off, the fsck hook is useless on a BTRFS root. I doubt that will change anything, though.
Comment by Csaba Miklos (micsuka) - Saturday, 28 March 2015, 20:06 GMT
Yes, the fsck hook is useless on a BTRFS root, but there are other non-btrfs partitions, and I guess those must be fsck'ed.

You are right, removing the fsck hook doesn't change anything. I also tried the fsck.mode=skip kernel parameter.
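
When testing parameters such as debug or fsck.mode=skip, it can help to confirm that they actually reached the running kernel. A minimal Python sketch of that check follows, assuming a Python interpreter is available on the booted system; the function names `parse_cmdline`, `fsck_skipped` and `current_params` are hypothetical helpers, not part of any Arch tool.

```python
def parse_cmdline(text):
    """Split a kernel command line into a dict of parameter -> value.

    Bare flags such as `debug` map to None. If a parameter repeats, the
    last occurrence wins. Assumption: values contain no quoted spaces,
    which this simple whitespace split does not handle.
    """
    params = {}
    for token in text.split():
        key, sep, value = token.partition("=")
        params[key] = value if sep else None
    return params


def fsck_skipped(params):
    """Return True if fsck.mode=skip is among the parsed parameters."""
    return params.get("fsck.mode") == "skip"


def current_params(path="/proc/cmdline"):
    """Parse the command line of the running kernel (Linux only)."""
    with open(path) as f:
        return parse_cmdline(f.read())
```

On the affected system, `fsck_skipped(current_params())` and `"debug" in current_params()` would show whether the two parameters mentioned above were passed for the current boot.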
Comment by Doug Newgard (Scimmia) - Sunday, 29 March 2015, 03:07 GMT
The other partitions don't matter, since they'll be fsck'd by the init system before being mounted. Only partitions mounted in the initramfs need the fsck hook.

It doesn't really matter, though; it doesn't seem to have any bearing on this problem.
Comment by Csaba Miklos (micsuka) - Sunday, 29 March 2015, 13:39 GMT

I put some extra debugging messages in the initrd boot script and found that it blocks in the mount syscall itself (default_mount_handler()/init_functions).
When I mounted this dirty partition from the boot DVD (with the same parameters), I caught some btrfs-related assertions in dmesg (see attached). It mounted the filesystem without blocking and everything seemed to be OK, though.

So I'm pretty sure that it is a kernel problem.
   log.log (37.1 KiB)
Comment by AK (Andreaskem) - Sunday, 29 March 2015, 14:34 GMT
Your log is on kernel 3.17.6? So it happens on 3.14.36, 3.17.6, 3.19.2?

http://www.spinics.net/lists/linux-btrfs/msg40858.html

Sounds similar but I would think that fix would have found its way into a stable kernel by now?

edit: Indeed, that fix went in with 3.19-rc4 (and some stable kernels).
http://lwn.net/Articles/629131/
Comment by Csaba Miklos (micsuka) - Sunday, 29 March 2015, 15:06 GMT

So, when the btrfs filesystem is dirty, the system boot blocks in the mount syscall. I tried with 3.19.2 and 3.14.37. If I wait a little longer, I can even see the kernel hangcheck timer complaints; the process is stuck in some btrfs transaction.
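
A process that is blocked inside a syscall like this normally shows up in state D (uninterruptible sleep) in /proc/<pid>/stat. The following is a minimal Python sketch for listing such processes, assuming Python is available where the hang is reproduced (it is not in a stock initramfs); `parse_stat` and `blocked_tasks` are hypothetical helper names.

```python
import os


def parse_stat(line):
    """Parse one /proc/<pid>/stat line into (pid, comm, state).

    The command name is wrapped in parentheses and may itself contain
    spaces or parentheses, so split around the last ')'.
    """
    left, _, rest = line.rpartition(")")
    pid_str, _, comm = left.partition("(")
    state = rest.split()[0]
    return int(pid_str), comm, state


def blocked_tasks(proc="/proc"):
    """Return (pid, comm) for every process currently in state D."""
    found = []
    for entry in os.listdir(proc):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self"
        try:
            with open(os.path.join(proc, entry, "stat"), errors="replace") as f:
                pid, comm, state = parse_stat(f.read())
        except OSError:
            continue  # the process exited while we were scanning
        if state == "D":
            found.append((pid, comm))
    return found
```

If a mount process appears in the result of `blocked_tasks()` and stays there, that is consistent with the mount syscall never returning. On kernels that expose it, /proc/<pid>/stack (readable by root) may additionally show where in the kernel the task is waiting.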

When I booted from the boot dvd (I had the archlinux-2015.01.01-dual.iso with kernel 3.17.6) and tried to mount the dirty btrfs partition, I caught that assertion in the dmesg.
After your question, I downloaded the latest boot iso (archlinux-2015.03.01-dual.iso, kernel 3.18.6) and tried to mount the dirty btrfs with this kernel and there are no assertions in the dmesg anymore (I tried 3 times).

So I couldn't reproduce the blocking itself when mounting the partition with these kernels from the boot ISOs, although I called mount with the same arguments as in the initrd script.

I have no idea what the difference could be between running mount from the initrd script and running it from the shell with the same parameters.
