FS#62450 - [systemd] 242.0: system is unbootable when using mkinitcpio 'systemd' hook

Attached to Project: Arch Linux
Opened by Sergiu (physicalit) - Tuesday, 23 April 2019, 18:25 GMT
Last edited by Christian Hesse (eworm) - Thursday, 25 April 2019, 13:03 GMT
Task Type Bug Report
Category Packages: Core
Status Closed
Assigned To Dave Reisner (falconindy)
Christian Hesse (eworm)
Architecture All
Severity Critical
Priority Normal
Reported Version
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 16
Private No

Details

Description:
I installed the latest updates and then rebooted the laptop. To my surprise, I was no longer able to boot.

Unfortunately, my failed boot attempts were not recorded, so I don't have any logs to show you:
physicalit@~() $ journalctl --list-boots
......
-3 2f0a728b69d24986b707e8bef7bbf894 Wed 2019-04-17 22:09:19 EEST—Thu 2019-04-18 00:21:52 EEST
-2 f4bb2f16f8ec4180bdc8445813eaf506 Thu 2019-04-18 20:37:55 EEST—Sat 2019-04-20 11:54:44 EEST
-1 9c0dfeb7758c4e039814062446a2d3c8 Sat 2019-04-20 11:54:54 EEST—Tue 2019-04-23 20:18:09 EEST
0 afb02092055d43d7af71a5c232cda092 Tue 2019-04-23 21:04:56 EEST—Tue 2019-04-23 21:11:16 EEST

I will attach dmesg information, if it helps.

The packages that broke the boot are:
systemd-242.0-1-x86_64.pkg.tar.xz
systemd-libs-242.0-1-x86_64.pkg.tar.xz
systemd-sysvcompat-242.0-1-x86_64.pkg.tar.xz

After booting from a live CD, using arch-chroot, and reinstalling the old versions, I was able to boot normally:
systemd 241.67-1
systemd-libs 241.67-1
systemd-sysvcompat 241.67-1
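
A downgrade of this kind can be sketched as below. This is only an illustration under stated assumptions: the helper name downgrade_cmd is hypothetical, the package file names are assumed to follow the naming pattern of the 242.0-1 files listed above, and /var/cache/pacman/pkg is assumed to still hold the old packages. The function only prints the pacman command; it does not run it.

```shell
# Hypothetical helper: print the pacman command that would reinstall the
# three systemd packages at a given version from the local package cache.
# Assumption: file names follow <name>-<version>-x86_64.pkg.tar.xz.
downgrade_cmd() {
    ver="$1"
    cache="${2:-/var/cache/pacman/pkg}"   # assumed default cache location
    printf 'pacman -U'
    for pkg in systemd systemd-libs systemd-sysvcompat; do
        printf ' %s/%s-%s-x86_64.pkg.tar.xz' "$cache" "$pkg" "$ver"
    done
    printf '\n'
}
```

From the live environment, after mounting the installed root (for example on /mnt) and entering it with arch-chroot /mnt, the output of `downgrade_cmd 241.67-1` could be reviewed and then run by hand.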

Additional info:
* package version(s)- Attached

System: Lenovo V330-14IKB

Closed by Christian Hesse (eworm)
Thursday, 25 April 2019, 13:03 GMT
Reason for closing: Fixed
Additional comments about closing: systemd 242.0-2
Comment by Sergiu (physicalit) - Tuesday, 23 April 2019, 18:31 GMT
Forgot to mention: there is no error message at boot time. It just hangs after the success message "Stop udev kernel device Manager".
Comment by loqs (loqs) - Tuesday, 23 April 2019, 18:50 GMT
Comment by Britt Yazel (brittyazel) - Tuesday, 23 April 2019, 22:31 GMT
I have this problem too. It appears to be related to the systemd hook in the mkinitcpio.conf file. Users are reporting that replacing the systemd hook with udev solves the problem.

https://bbs.archlinux.org/viewtopic.php?id=245661
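
The workaround described above might look like the following sketch. The helper name swap_systemd_hook is hypothetical, and the sketch assumes GNU sed and a HOOKS line written as HOOKS=(...) or HOOKS="...". Any sd-* hooks (such as sd-encrypt or sd-vconsole) would also need their non-systemd counterparts, which this sketch does not handle.

```shell
# Hypothetical helper: replace the 'systemd' hook with 'udev' on the HOOKS
# line of a mkinitcpio config file. Only a standalone 'systemd' entry is
# changed; hooks such as sd-vconsole or sd-encrypt are left untouched.
swap_systemd_hook() {
    conf="$1"
    # Match 'systemd' only when delimited by a space, parenthesis, or quote.
    sed -E -i '/^HOOKS=/ s/([( "])systemd([ )"])/\1udev\2/' "$conf"
}
```

After running `swap_systemd_hook /etc/mkinitcpio.conf` as root, the initramfs would have to be regenerated, for example with `mkinitcpio -p linux` (assuming the stock linux preset), before rebooting.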
Comment by loqs (loqs) - Tuesday, 23 April 2019, 22:38 GMT
@brittyazel I would class that as a workaround rather than a fix.
Because the users running testing who encountered this issue chose not to report it on this bug tracker or upstream, the issue remains, with the exact cause unknown and the package now in core.
Comment by Britt Yazel (brittyazel) - Tuesday, 23 April 2019, 23:09 GMT
"Solved" was a poor choice of words; I should have said "works around the problem". Of course, solving this issue involves figuring out why the systemd hook is broken.

I don't run testing on my production machines, so I wasn't testing 242. Apologies for not reporting it sooner.
Comment by loqs (loqs) - Tuesday, 23 April 2019, 23:33 GMT
Apart from a bisection between 241 and 242, this is difficult to debug because nothing is recorded.
The root filesystem will be mounted, but journald does not flush from memory to disk until after switchroot, which comes after the point where the bug stops boot progress.
Can you boot to rd.rescue or rd.emergency? From rd.rescue you may be able to dump the initial journal contents to a file on /sysroot.
If you do not want to perform the bisection, you could also report what is known so far upstream to get more help diagnosing the issue.
As you were not running testing, it is not reasonable to expect you to have reported it.
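
Dumping the journal from the rescue shell could be sketched roughly as follows. This is an assumption-laden illustration: the helper name dump_initrd_journal and the output file name are hypothetical, JOURNALCTL is an invented override that only exists so the function can be dry-run, and it is assumed that journalctl is present in the initramfs and that /sysroot can be remounted read-write.

```shell
# Hypothetical helper: write the current boot's journal to a file under the
# mounted real root so that it survives a reboot.
# JOURNALCTL is an assumed override used only for dry runs.
dump_initrd_journal() {
    root="${1:-/sysroot}"
    out="$root/initrd-journal.txt"
    # The real root may still be read-only at this stage; try to remount it
    # and carry on if that is not possible or not needed.
    mount -o remount,rw "$root" 2>/dev/null || true
    ${JOURNALCTL:-journalctl} -b --no-pager > "$out" && sync
    # Report where the dump was written.
    printf '%s\n' "$out"
}
```

After adding rd.rescue (or rd.emergency) to the kernel command line and reaching the shell, `dump_initrd_journal /sysroot` would leave the file on the real root, where it can be read after booting with the older packages.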
Comment by Benjamin Hodgetts (Enverex) - Wednesday, 24 April 2019, 00:01 GMT
I've just run into this too and have no other access to the system, so I'm a little screwed. For anyone else, this is what it looks like: https://i.imgur.com/I3ysE37.png (originally I saw literally nothing because I had "quiet" set on the kernel command line and no errors are generated, so you get a system that doesn't boot and just shows a black screen, which is even more of a problem).
Comment by Sergiu (physicalit) - Wednesday, 24 April 2019, 12:19 GMT
I did not find any issue related to this on the systemd issue tracker on GitHub.
I was thinking of creating one, so I started looking at mkinitcpio. Unfortunately, I could not work out exactly what mkinitcpio does with the systemd hook, so I cannot yet tell whether this is a systemd issue or a mkinitcpio issue.

Does anyone know how to tag falconindy (Dave Reisner) in this bug report?
Comment by loqs (loqs) - Wednesday, 24 April 2019, 12:48 GMT
@physicalit I do not believe flyspray supports such tags, and task assignment is restricted to Developers and Bug Wranglers.

mkinitcpio will call build() of /usr/lib/initcpio/install/systemd if the systemd hook is used.

Edit:
You could also try bisecting the issue: https://bbs.archlinux.org/viewtopic.php?pid=1841589#p1841589
Comment by Mateusz Marzantowicz (mmarzantowicz) - Wednesday, 24 April 2019, 14:50 GMT
I am also affected by this issue. After upgrading systemd to 242.0, I am no longer able to boot my machine.
Comment by Sergiu (physicalit) - Wednesday, 24 April 2019, 19:24 GMT
Hmm, @loqs, I looked over the file. It might be that one of the default targets or services fails to start, but why would that not time out?
I also found this page, https://fossies.org/diffs/systemd/241_vs_242/, which lists all the modifications made between the two systemd releases.
Eyeballing the page, I did not find anything useful :(

Comment by loqs (loqs) - Wednesday, 24 April 2019, 19:42 GMT
Agreed. Why, after 90 seconds, does systemd not fail a starting service, SIGKILL a stopping service, or print output saying that it failed while attempting to do so?
That is why I suggested the bisection: to find which commit is the cause, or whether it is some change in the build environment, etc.
Comment by Christian Hesse (eworm) - Wednesday, 24 April 2019, 20:57 GMT
I boot all my systems with a systemd-enabled initramfs... I have not seen this at all.
So it looks like one of you will have to figure it out...

And just for reference: I do monitor the bug tracker, not the forums. Please report critical issues to the bug tracker as soon as possible.
Comment by freswa (frederik) - Wednesday, 24 April 2019, 21:07 GMT
Of 4 systems, only one is affected: a system with two disks, similar to this report: https://bbs.archlinux.org/viewtopic.php?pid=1841604#p1841604
Is anyone here affected with only one disk attached?
Comment by Chuan Ji (jichu4n) - Wednesday, 24 April 2019, 21:10 GMT
Encountered a similar issue after upgrading to systemd 242.0 (https://github.com/random-archer/mkinitcpio-systemd-tool/issues/25).

I found this commit in systemd which may be relevant: https://github.com/systemd/systemd/commit/142b8142d7bb84f07ac33fc00527a4d48ac8ef9f#diff-39479f052e5c764d107d871bb1d83a8a
Comment by Sergiu (physicalit) - Wednesday, 24 April 2019, 22:35 GMT
@freswa, it happens that I have two disks in my laptop (one NVMe M.2 SSD and one SATA SSD), so I can confirm that as well.
@jichu4n, I do not know C, but what happens there is fairly straightforward, and it might be related. In the next few days I will try to compile the latest systemd code with those modifications reverted and install it, to see whether the issue still occurs.

If anyone else can do it sooner, it would be very helpful.

If this fixes the issue, we could submit a bug report directly on their issue tracker in order to get a fix into the next release.
Comment by loqs (loqs) - Wednesday, 24 April 2019, 23:19 GMT
test.patch reverts 142b8142d7bb84f07ac33fc00527a4d48ac8ef9f and d0fe45cb151774827a3aca4ea5a19856dec9f600
source bundle for easy building. I suggest installing devtools, then using extra-x86_64-build to build the packages in a clean chroot, then installing them with pacman to test.
Comment by Josh (JoshH100) - Thursday, 25 April 2019, 02:28 GMT
@freswa I have about 25 identical machines plus a few other models that all have a single NVMe drive and are all affected by this issue. I'm pretty shocked this version wasn't removed from core as soon as this report was made. I guess using a systemd-based initramfs is more niche than I thought.
Comment by Sergiu (physicalit) - Thursday, 25 April 2019, 11:41 GMT
So it appears we are a little bit behind; the issue might have been fixed :)
https://bugzilla.redhat.com/show_bug.cgi?id=1702358
https://github.com/systemd/systemd/pull/12346

It is going to be a while until the next RC, I think. Last time, it was approximately two months after the stable release until the next RC was released.
Comment by Christian Hesse (eworm) - Thursday, 25 April 2019, 11:42 GMT
The systemd-enabled initramfs is *not* the only factor. I run this setup without issues.

It looks like the number of drives can be ruled out as well.

Is any other special configuration affected? Boot parameters? Entries in /etc/fstab? Whatever?
Comment by Christian Hesse (eworm) - Thursday, 25 April 2019, 12:15 GMT
Does anybody want to try systemd 242.0-2?
Comment by loqs (loqs) - Thursday, 25 April 2019, 12:17 GMT
@eworm, could you please look at FS#62347 as well? Thanks.
Comment by freswa (frederik) - Thursday, 25 April 2019, 12:30 GMT
242.0-2 works for me. Thanks!