FS#61317 - [systemd] 240.0-3 can't reach login

Attached to Project: Arch Linux
Opened by Andreas (poison) - Wednesday, 09 January 2019, 09:31 GMT
Last edited by Andreas Radke (AndyRTR) - Tuesday, 10 December 2019, 13:07 GMT
Task Type Bug Report
Category Packages: Core
Status Closed
Assigned To Dave Reisner (falconindy)
Christian Hesse (eworm)
Architecture All
Severity High
Priority Normal
Reported Version
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 3
Private No

Details

Description:
After upgrade to systemd-240.0-3 I can't reach login/DM.
Can't seem to switch to a tty either to analyse any further.
Downgrading to systemd-239.370-1 fixes it.


Additional info:
* package version(s)
systemd-240.0-3

* config and/or log files etc.
The only thing suspicious I can spot in the log is:
Jan 09 10:02:34 draco systemd-udevd[402]: Process 'bcache-register /dev/sda1' failed with exit code 1.
Jan 09 10:02:34 draco systemd-udevd[402]: Failed to wait spawned command 'bcache-register /dev/sda1': Input/output error
Jan 09 10:02:34 draco systemd-udevd[378]: Process 'bcache-register /dev/sda2' failed with exit code 1.
Jan 09 10:02:34 draco systemd-udevd[378]: Failed to wait spawned command 'bcache-register /dev/sda2': Input/output error
Jan 09 10:02:34 draco systemd-udevd[377]: Process 'bcache-register /dev/sdc1' failed with exit code 1.
Jan 09 10:02:34 draco systemd-udevd[377]: Failed to wait spawned command 'bcache-register /dev/sdc1': Input/output error

Steps to reproduce:
Upgrade, reboot.


This task depends upon

Closed by  Andreas Radke (AndyRTR)
Tuesday, 10 December 2019, 13:07 GMT
Reason for closing:  Duplicate
Additional comments about closing:   FS#61328 
Comment by Andreas (poison) - Wednesday, 09 January 2019, 10:19 GMT
Can't ping/ssh into machine either, doesn't seem to reach network target.
Comment by Andreas (poison) - Wednesday, 09 January 2019, 11:02 GMT
removing bcache mountpoint from fstab lets me boot with systemd-240.0-3:

/dev/bcache/by-uuid/192cf411-2af7-4c39-8066-cd1145920544 /usr/local/games ext4 defaults,noatime 0 0
Comment by Andreas (poison) - Wednesday, 09 January 2019, 11:14 GMT
can boot with nofail on bcache mount:
/dev/bcache/by-uuid/192cf411-2af7-4c39-8066-cd1145920544 /usr/local/games ext4 defaults,noatime,nofail 0 0

But can not mount any bcache FS at all with systemd-240.0-3:

# mount /usr/local/games/
# mount|grep games
#

nothing in dmesg, FS not mounted.
Comment by Dave Reisner (falconindy) - Wednesday, 09 January 2019, 12:01 GMT
Is your device entry really "/dev/bcache/by-uuid/192cf411-2af7-4c39-8066-cd1145920544" and not "/dev/disk/by-uuid/192cf411-2af7-4c39-8066-cd1145920544" ?

I don't know anything about bcache, but it seems to me that there should be an error logged from bcache-register exiting 1:

https://evilpiepirate.org/git/bcache-tools.git/tree/bcache-register.c#n31
Comment by Andreas (poison) - Wednesday, 09 January 2019, 12:06 GMT
@falconindy: yes, the device entry is "/dev/bcache/by-uuid/192cf411-2af7-4c39-8066-cd1145920544" and from my understanding that is correct:

$ ls -lah /dev/bcache/by-uuid/
total 0
drwxr-xr-x 2 root root 80 Jan 9 12:19 .
drwxr-xr-x 3 root root 60 Jan 9 12:19 ..
lrwxrwxrwx 1 root root 13 Jan 9 12:19 192cf411-2af7-4c39-8066-cd1145920544 -> ../../bcache1
lrwxrwxrwx 1 root root 13 Jan 9 12:19 8395e39c-f9bc-4321-bddc-1e0e8ce01823 -> ../../bcache0


Sorry, I don't understand your comment. There's no error being logged for bcache-register on systemd-239

Comment by Andreas (poison) - Wednesday, 09 January 2019, 12:25 GMT Comment by Dave Reisner (falconindy) - Wednesday, 09 January 2019, 13:55 GMT
> Sorry, I don't understand your comment. There's no error being logged for bcache-register on systemd-239

Then what happens when you run bcache-register manually?
Comment by Andreas (poison) - Wednesday, 09 January 2019, 16:17 GMT
> Then what happens when you run bcache-register manually?
it works and auto-mounts the bcache mountpoint as soon as the devices are registered:

# /usr/lib/udev/bcache-register /dev/sda1
# /usr/lib/udev/bcache-register /dev/sda2
# /usr/lib/udev/bcache-register /dev/sdd2

dmesg:
[ 173.479803] bcache: bch_journal_replay() journal replay done, 3 keys in 3 entries, seq 516292
[ 173.490555] bcache: bch_cached_dev_attach() Caching sdc1 as bcache0 on set c2cb921e-e0d9-4458-bd43-c0eca75df21c
[ 173.490571] bcache: register_cache() registered cache device sda1
[ 174.290616] EXT4-fs (bcache0): mounted filesystem with ordered data mode. Opts: (null)
[ 189.382414] bcache: bch_journal_replay() journal replay done, 1 keys in 2 entries, seq 141743
[ 189.382519] bcache: register_cache() registered cache device sda2
[ 194.781055] bcache: register_bdev() registered backing device sdd2
[ 194.790337] bcache: bch_cached_dev_attach() Caching sdd2 as bcache1 on set cd81aabd-282a-4b96-b654-deb9345fcc15


---
for reference, the messages at boot:
$ journalctl -b|grep bcache
Jan 09 17:08:45 draco systemd-udevd[406]: Process 'bcache-register /dev/sda1' failed with exit code 1.
Jan 09 17:08:45 draco systemd-udevd[406]: Failed to wait spawned command 'bcache-register /dev/sda1': Input/output error
Jan 09 17:08:45 draco systemd-udevd[447]: Process 'bcache-register /dev/sda2' failed with exit code 1.
Jan 09 17:08:45 draco systemd-udevd[447]: Failed to wait spawned command 'bcache-register /dev/sda2': Input/output error
Jan 09 17:08:45 draco systemd-udevd[374]: Process 'bcache-register /dev/sdd2' failed with exit code 1.
Jan 09 17:08:45 draco systemd-udevd[374]: Failed to wait spawned command 'bcache-register /dev/sdd2': Input/output error
Jan 09 17:08:45 draco kernel: bcache: register_bdev() registered backing device sdc1
Jan 09 17:10:15 draco systemd[1]: dev-bcache-by\x2duuid-192cf411\x2d2af7\x2d4c39\x2d8066\x2dcd1145920544.device: Job dev-bcache-by\x2duuid-192cf411\x2d2af7\x2d4c39\x2d8066\x2dcd1145920544.device/start timed out.
Jan 09 17:10:15 draco systemd[1]: Timed out waiting for device /dev/bcache/by-uuid/192cf411-2af7-4c39-8066-cd1145920544.
Jan 09 17:10:15 draco systemd[1]: dev-bcache-by\x2duuid-192cf411\x2d2af7\x2d4c39\x2d8066\x2dcd1145920544.device: Job dev-bcache-by\x2duuid-192cf411\x2d2af7\x2d4c39\x2d8066\x2dcd1145920544.device/start failed with result 'timeout'.
Comment by Andreas (poison) - Wednesday, 09 January 2019, 16:27 GMT
To add further confusion:
After more experiments, every 3rd to 5th or so reboot it manages to actually register+mount one of the bcache partitions:
$ journalctl -b|grep bcache
Jan 09 17:20:48 draco systemd-udevd[412]: Process 'bcache-register /dev/sda1' failed with exit code 1.
Jan 09 17:20:48 draco systemd-udevd[412]: Failed to wait spawned command 'bcache-register /dev/sda1': Input/output error
Jan 09 17:20:48 draco kernel: bcache: register_bdev() registered backing device sdc1
Jan 09 17:20:48 draco kernel: bcache: bch_journal_replay() journal replay done, 1 keys in 3 entries, seq 141745
Jan 09 17:20:48 draco kernel: bcache: register_cache() registered cache device sda2
Jan 09 17:20:48 draco kernel: bcache: register_bdev() registered backing device sdd2
Jan 09 17:20:48 draco kernel: bcache: bch_cached_dev_attach() Caching sdd2 as bcache1 on set cd81aabd-282a-4b96-b654-deb9345fcc15
Jan 09 17:22:18 draco systemd[1]: dev-bcache-by\x2duuid-192cf411\x2d2af7\x2d4c39\x2d8066\x2dcd1145920544.device: Job dev-bcache-by\x2duuid-192cf411\x2d2af7\x2d4c39\x2d8066\x2dcd1145920544.device/start timed out.
Jan 09 17:22:18 draco systemd[1]: Timed out waiting for device /dev/bcache/by-uuid/192cf411-2af7-4c39-8066-cd1145920544.
Jan 09 17:22:18 draco systemd[1]: dev-bcache-by\x2duuid-192cf411\x2d2af7\x2d4c39\x2d8066\x2dcd1145920544.device: Job dev-bcache-by\x2duuid-192cf411\x2d2af7\x2d4c39\x2d8066\x2dcd1145920544.device/start failed with result 'timeout'.
Comment by Dave Reisner (falconindy) - Wednesday, 09 January 2019, 17:18 GMT
So then it's a race condition. What happens if you forcibly load the bcache module in the initramfs (prior to all of this)?
Comment by Andreas (poison) - Wednesday, 09 January 2019, 17:38 GMT
MODULES="ext4 bcache"

same story

Besides, the udev rule already loads the module:
https://evilpiepirate.org/git/bcache-tools.git/tree/69-bcache.rules
Comment by Dave Reisner (falconindy) - Wednesday, 09 January 2019, 17:44 GMT
I understand that the udev rule does this, but not all modules are well written and do their initialization synchronously. I assume that you rebuilt your initramfs with the modified config before rebooting?
Comment by Andreas (poison) - Wednesday, 09 January 2019, 17:45 GMT
> I assume that you rebuilt your initramfs with the modified config before rebooting?
yes
Comment by loqs (loqs) - Wednesday, 09 January 2019, 18:13 GMT Comment by Dave Reisner (falconindy) - Wednesday, 09 January 2019, 18:56 GMT
> Possibly try the fix from https://github.com/systemd/systemd/issues/11314

Wouldn't that be negated by my idea to pre-load the module? I'm also unclear on which fix from 11314 you're referring to -- most of the referenced backports are already included with 240.0-3.
Comment by loqs (loqs) - Wednesday, 09 January 2019, 19:17 GMT
https://github.com/keszybz/systemd/commit/dbbb5e65ff03c8a27bedb8ca69e121eaec8728c8 was the commit I meant to reference.
Yes pre-loading the module makes my suggestion redundant. Would you suggest a git bisection?
Comment by Dan Martins (dmartins) - Wednesday, 09 January 2019, 22:33 GMT
I wanted to share some findings - it looks like this is related to https://github.com/systemd/systemd/issues/11314 and not specifically bcache. There have been similar reports here: https://bbs.archlinux.org/viewtopic.php?id=243253
A common complaint is that the keyboard doesn't work in the emergency shell.

I've been unable to reproduce the issue myself, but in one case, the workaround mentioned in the systemd bug report was successful, suggesting it is a race condition as mentioned in the upstream bug report.

Modify /usr/lib/initcpio/hooks/udev

From:
udevadm trigger --type=subsystems --action=add
udevadm trigger --type=devices --action=add

To:
udevadm trigger --type=subsystems --action=add -v
udevadm trigger --type=devices --action=add -v

Then rebuild the initramfs: mkinitcpio -p linux

Given that this spams the boot messages, it doesn't seem suitable to package, but is hopefully a step in the right direction.
Comment by Dave Reisner (falconindy) - Thursday, 10 January 2019, 15:32 GMT Comment by Dan Martins (dmartins) - Thursday, 10 January 2019, 20:46 GMT
Hi Dave,

Two users so far have reported on the forum that the 240.156 packages don't resolve the problem.
Comment by Dan Martins (dmartins) - Saturday, 12 January 2019, 03:08 GMT
I see systemd has merged a commit and it has been packaged in 240.34 (in testing). I can confirm the test VM I had created now boots OK with systemd-240.34-2.
Thanks Dave!
Comment by Andreas (poison) - Saturday, 12 January 2019, 14:34 GMT
Sorry for the late response.

240.34 no longer results in 'Failed to wait spawned command' but instead: 'failed with exit code 1'

Still no luck though.

$ journalctl -b|grep bcache
Jan 12 15:28:26 draco systemd-udevd[485]: Process 'bcache-register /dev/sda1' failed with exit code 1.
Jan 12 15:28:26 draco systemd-udevd[427]: Process 'bcache-register /dev/sdc1' failed with exit code 1.
Jan 12 15:28:26 draco kernel: bcache: register_bdev() registered backing device sdd2
Jan 12 15:28:26 draco kernel: bcache: bch_journal_replay() journal replay done, 1 keys in 2 entries, seq 143080
Jan 12 15:28:26 draco kernel: bcache: bch_cached_dev_attach() Caching sdd2 as bcache0 on set cd81aabd-282a-4b96-b654-deb9345fcc15
Jan 12 15:28:26 draco kernel: bcache: register_cache() registered cache device sda2
Jan 12 15:29:56 draco systemd[1]: dev-bcache-by\x2duuid-192cf411\x2d2af7\x2d4c39\x2d8066\x2dcd1145920544.device: Job dev-bcache-by\x2duuid-192cf411\x2d2af7\x2d4c39\x2d8066\x2dcd1145920544.device/start timed out.
Jan 12 15:29:56 draco systemd[1]: Timed out waiting for device /dev/bcache/by-uuid/192cf411-2af7-4c39-8066-cd1145920544.
Jan 12 15:29:56 draco systemd[1]: dev-bcache-by\x2duuid-192cf411\x2d2af7\x2d4c39\x2d8066\x2dcd1145920544.device: Job dev-bcache-by\x2duuid-192cf411\x2d2af7\x2d4c39\x2d8066\x2dcd1145920544.device/start failed with result 'timeout'
Comment by Andreas (poison) - Saturday, 12 January 2019, 14:47 GMT
With 240.34 plus bcache in initramfs it works.

Hmmm, let me check if I actually screwed up creating the initramfs with 240.0, that'd be embarrassing :/
Comment by Andreas (poison) - Saturday, 12 January 2019, 14:52 GMT
Darn, sorry, I did screw up :(

240.0 plus bcache in MODULES works.
Comment by Dave Reisner (falconindy) - Saturday, 12 January 2019, 15:11 GMT
So what's the state here? 240.34-2 (not -1) backports the fix for udev loading modules when *not* using systemd in the initramfs.

* Are you using systemd-240.34-2?
* Are you using systemd in the initramfs?
* Do you normally expect to have to explicitly load bcache in MODULES= ?

I really suspect this is a dupe of  FS#61328  after all...

Loading...