FS#51818 - [linux-lts][linux-zen][linux-grsec][linux] move to mkinitcpio hook lvm2 breakage

Attached to Project: Arch Linux
Opened by Alexandre Figura (arugifa) - Monday, 14 November 2016, 12:54 GMT
Last edited by Doug Newgard (Scimmia) - Friday, 09 December 2016, 23:41 GMT
Task Type Bug Report
Category Packages: Core
Status Closed
Assigned To Tobias Powalowski (tpowa)
Eric Belanger (Snowman)
Andreas Radke (AndyRTR)
Jan Alexander Steffens (heftig)
Christian Hesse (eworm)
Architecture x86_64
Severity Critical
Priority Normal
Reported Version
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 27
Private No

Details

Description:

I just upgraded my system with ``pacman -Syu``. But when I rebooted my machine, my system was broken: lvm was unable to find my root file system.

It appears that lvm2 requires readline 6, but readline 7 was installed during the upgrade. I observed this dependency while analysing pacman logs (see attachment), when pacman generated a new kernel image. This can be reproduced with ``mkinitcpio -p linux``.
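
A quick way to confirm a broken library dependency like this one is to filter `ldd` output for entries marked "not found". The sketch below is illustrative only: `missing_libs` is a hypothetical helper name, and the sample input is made up in the usual `ldd` format rather than taken from the reporter's machine.

```shell
#!/bin/sh
# Hypothetical helper: reads `ldd` output on stdin and prints the names of
# shared libraries that the dynamic linker could not resolve.
missing_libs() {
    awk '$2 == "=>" && $3 == "not" && $4 == "found" { print $1 }'
}

# Illustrative input in ldd's format (not captured from a real system).
printf '\tlibreadline.so.6 => not found\n\tlibc.so.6 => /usr/lib/libc.so.6 (0x00007f0000000000)\n' | missing_libs
# prints: libreadline.so.6
```

On an affected system the real invocation would be `ldd /usr/bin/lvm | missing_libs`.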

Fortunately, I still had readline 6 in my pacman cache. So I booted from a live USB key, chrooted into my system, and installed readline 6 (``pacman -U /var/cache/pacman/pkg/readline-6.3.008-4-x86_64.pkg.tar.xz``). But the new versions of bash and gawk that were installed during the upgrade depend on readline 7, so I downgraded bash and gawk as well. Unfortunately, when I rebooted, lvm was able to find my root file system, but my newly updated gnome-shell (3.22.2-1) refused to start, as it depends on readline 7.

Finally, I kept a copy of the readline 6 shared library file and reinstalled readline 7:

lrwxrwxrwx 1 root root 16 Nov 6 18:42 libreadline.so -> libreadline.so.7
lrwxrwxrwx 1 root root 27 Nov 14 13:02 libreadline.so.6 -> /usr/lib/libreadline.so.6.3
-r-xr-xr-x 1 root root 344904 Nov 14 13:03 libreadline.so.6.3
lrwxrwxrwx 1 root root 18 Nov 6 18:42 libreadline.so.7 -> libreadline.so.7.0
-r-xr-xr-x 1 root root 363064 Nov 6 18:42 libreadline.so.7.0

lvm is now able to find my root file system again when booting my machine, and I can continue to use the cutting-edge packages which depend on readline 7.

The readline dependency of lvm2 should perhaps be pinned or given an upper bound.

Additional info:
* package version(s): lvm2-2.02.167-2, readline-7.0-1

N.B.: a teammate had the same problem. Moreover, as this completely blocks systems using lvm2, I set the severity to critical :)

Closed by  Doug Newgard (Scimmia)
Friday, 09 December 2016, 23:41 GMT
Reason for closing:  Fixed
Comment by Dave Reisner (falconindy) - Monday, 14 November 2016, 13:22 GMT
Seems like an ordering problem. Downgrading was not the correct solution here -- you just needed to rerun `mkinitcpio -P`.

Had you posted the entire transaction from pacman.log, we'd be able to confirm this.
Comment by Roman Z. (romz) - Monday, 14 November 2016, 13:56 GMT
I faced the same issue today and I'm in the process of recovering my system now, using a chroot from a USB device. I'll try re-running `mkinitcpio -P`.
For reference, here is my pacman.log.
Comment by Jan de Groot (JGC) - Monday, 14 November 2016, 14:07 GMT
mkinitcpio is run when the kernel is installed, so it puts the new readline together with the old lvm in the initramfs image.

Can't we solve this with hooks in the future? IMHO the initramfs image should be generated at the end of the transaction, not as soon as kernel is installed.
Comment by Christian Hesse (eworm) - Monday, 14 November 2016, 14:12 GMT
Yes, we can. :D
But somebody has to do the work. This has to work with a combination of different kernels, initramfs images, initcpio hooks and their dependencies, ...
Comment by nasosnik (nasosnik) - Monday, 14 November 2016, 14:56 GMT
I can also confirm this bug. I would also like to highlight that, given the severity of the issue, which renders a system that uses LVM (a very common setup nowadays) unbootable (the fallback initramfs will not save the situation), it would be wiser to test such "stable" packages thoroughly prior to their release.
Comment by Alexandre Figura (arugifa) - Monday, 14 November 2016, 15:04 GMT
Here is my complete Pacman transaction :)
Comment by Dave Reisner (falconindy) - Monday, 14 November 2016, 16:03 GMT
So yes, the problem is ordering and re-running mkinitcpio after pacman completed would avoid this.

edit: specifically, see the timeline:

[2016-11-14 10:44] [ALPM] upgraded readline (6.3.008-4 -> 7.0-1) # this breaks the lvm2 dependency
[2016-11-14 10:44] [ALPM] upgraded linux (4.8.6-1 -> 4.8.7-1) # initramfs gets rebuilt with bad lvm2 binaries
[2016-11-14 10:45] [ALPM] upgraded lvm2 (2.02.166-1 -> 2.02.167-2)
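
The ordering in this timeline can be searched for in a pacman.log. The following is only a heuristic sketch, assuming the `[ALPM] upgraded <package>` and `[ALPM] transaction started` line formats: it reports kernel packages upgraded after readline but before lvm2 within one transaction. The function name is hypothetical, and the kernel package names are the four from this task's title.

```shell
#!/bin/sh
# Hypothetical helper: reads pacman.log lines on stdin and prints each kernel
# package that was upgraded after readline but before lvm2 in the same
# transaction, which is when the initramfs picked up a mismatched lvm binary.
kernels_built_before_lvm2() {
    awk '
        $3 != "[ALPM]" { next }
        $4 == "transaction" && $5 == "started" { rl = 0; lvm = 0; next }
        $4 != "upgraded" { next }
        $5 == "readline" { rl = 1; next }
        $5 == "lvm2"     { lvm = 1; next }
        rl && !lvm && $5 ~ /^linux(-lts|-zen|-grsec)?$/ { print $5 }
    '
}

# Usage (path is the default pacman log location):
#   kernels_built_before_lvm2 < /var/log/pacman.log
```

A kernel is also reported when lvm2 was not upgraded at all in that transaction, so any hit is a prompt to re-run `mkinitcpio -P`, not proof of breakage.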
Comment by Lars Hupel (larsrh) - Tuesday, 15 November 2016, 12:57 GMT
I had the same issue and can confirm that rerunning mkinitcpio in a live system fixed it. What's peculiar about this is that while there is an error message in the installation log (ERROR: binary dependency `libreadline.so.6' not found for `/usr/bin/lvm'), at the end, it says "Image generation successful". I basically ignored that error message because of the success message at the end.
Comment by Dave Reisner (falconindy) - Tuesday, 15 November 2016, 13:04 GMT
Good catch. That's a separate bug in mkinitcpio.
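
Until that is addressed, a wrapper can refuse to trust the final success message. This is a minimal sketch, assuming errors always appear as lines containing `ERROR:` as in the message quoted above; `initramfs_log_ok` is a hypothetical name.

```shell
#!/bin/sh
# Hypothetical helper: reads mkinitcpio output on stdin and succeeds only if
# no line contains "ERROR:", regardless of any later success message.
initramfs_log_ok() {
    ! grep -q 'ERROR:'
}

# Usage from a root shell:
#   mkinitcpio -P 2>&1 | tee /tmp/mkinitcpio.log | initramfs_log_ok \
#       || echo "mkinitcpio reported errors, see /tmp/mkinitcpio.log"
```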
Comment by BoBeR182 (BoBeR182) - Tuesday, 15 November 2016, 13:42 GMT
Confirming that booting from a live USB, mounting / and /boot, then running arch-chroot and mkinitcpio -p linux fixed the issue.

No need to downgrade packages.
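
The recovery steps reported in this thread can be written down as a short command list. The sketch below only prints the commands for review and runs nothing; the device paths are placeholders (assumptions) and `print_recovery_steps` is a hypothetical helper. LVM volumes may need to be activated, and LUKS containers unlocked, before the mounts will work.

```shell
#!/bin/sh
# Hypothetical helper: prints the live-USB recovery commands for the given
# root and boot devices. It does not execute them.
print_recovery_steps() {
    root_dev=$1
    boot_dev=$2
    cat <<EOF
mount $root_dev /mnt
mount $boot_dev /mnt/boot
arch-chroot /mnt mkinitcpio -P
umount -R /mnt
EOF
}

# Placeholder devices, printed for review only.
print_recovery_steps /dev/mapper/vg0-root /dev/sda1
```

`mkinitcpio -P` is used here instead of `-p linux` so that the images for every installed kernel are rebuilt.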
Comment by Eric Mountain (eric.mountain) - Wednesday, 16 November 2016, 08:26 GMT
Might this deserve a mention in the latest news on archlinux.org? A colleague and myself each have a machine to fix today... This looks like it will affect quite a few people.
Comment by TJ Griffiths (teejer) - Wednesday, 16 November 2016, 18:10 GMT
I think I'm having this same issue. My lts kernel(4.4.31-1) doesn't work, but non-lts kernel (4.8.8-1) does work. When I run mkinitcpio -P or mkinitcpio -p linux-lts it does not fix my lts kernel though.

Comment by hheinz (hheinz) - Wednesday, 16 November 2016, 19:35 GMT
Please put this on archlinux.org, I also broke my machine yesterday (and I looked there before to see if there are any urgent issues..)
"mkinitcpio -p linux" from a chroot fixes it but I still can't boot the grsec kernel.

Edit: Please ignore my ignorance, I should have run "mkinitcpio -P" as suggested before.
Comment by Dave Reisner (falconindy) - Wednesday, 16 November 2016, 19:48 GMT
> I think I'm having this same issue.
No, it doesn't sound like you are if the prescribed fix doesn't help you.

> "mkinitcpio -p linux" from a chroot fixes it but I still can't boot the grsec kernel.
Of course it doesn't. 'mkinitcpio -p linux' only fixes the kernel provided by the "linux" package. Is there value in posting this on the front page? Maybe... I doubt anyone will read it, just as people don't seem to read my first post on this bug which recommends 'mkinitcpio -P', rebuilding *all* images for *all* kernels.
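
To see which images `-P` will rebuild, one can list the preset files it processes. A small sketch, assuming the default preset directory `/etc/mkinitcpio.d`; `list_presets` is a hypothetical helper.

```shell
#!/bin/sh
# Hypothetical helper: prints the name of every mkinitcpio preset found in
# the given directory (default: /etc/mkinitcpio.d), one per line.
list_presets() {
    preset_dir=${1:-/etc/mkinitcpio.d}
    for preset in "$preset_dir"/*.preset; do
        [ -e "$preset" ] || continue   # the glob matched nothing
        basename "$preset" .preset
    done
}

# Usage: list_presets
# Each printed name is a valid argument to `mkinitcpio -p <name>`.
```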
Comment by Pig Monkey (pigmonkey) - Wednesday, 16 November 2016, 20:01 GMT
I was just hit with this. I use LUKS/LVM, so it took me a while to get to the libreadline error. Once I saw that I checked here and saw the suggested "mkinitcpio -P". I was able to boot into an install disk, decrypt my drive, chroot, mkinitcpio, and everything is fine now (for both linux and linux-grsec).

I think there is definitely value in posting this on the front page. I would have read it (just like I read the first post on this bug which recommends 'mkinitcpio -P', rebuilding *all* images for *all* kernels) and saved some stress and about an hour of my day.
Comment by Christian Hesse (eworm) - Wednesday, 16 November 2016, 22:13 GMT
Dave changed the assignees, probably because this is not an lvm2 bug but an issue with regenerating the initramfs. My proposed fix is still a simple pacman hook, see the attachment.

This does...
* remove the initramfs regeneration from the install script.
* add a pacman hook that runs mkinitcpio when /boot/vmlinuz-linux or an initcpio hook or install script changes.

That makes it rebuild the initramfs on linux update (obviously), and on package update for packages that ship initcpio files. This includes systemd, lvm2, mkinitcpio, ... Get a list of packages with:

pacman -Qoq /boot/vmlinuz-linux $(find /usr/lib/initcpio/ -type f) | sort | uniq

Finally, this fixes the consistency issue, as the hook is run after the transaction. Things just get better, so we should go for it.

Two things to note:

* Every linux package (linux, linux-zen, linux-lts, linux-grsec, ... AUR packages like linux-git, ...) has to ship its own hook.
* This does not rebuild the initramfs on minor dependency updates. Is that required? A bugfix release for readline does not matter - soname bumps are caught by the package that depends on it, lvm2 in this case. IMHO we are fine with this.

We can work on this later, but let's make things work the easy way for now.

Tobias, your thoughts?
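
For readers without access to the attachment, a hook of the kind described might look roughly like the output of the sketch below. This is not the attached file: it is a reconstruction from the description above in alpm-hooks format, and the `Description` text and the function name `print_initramfs_hook` are assumptions.

```shell
#!/bin/sh
# Hypothetical helper: prints a pacman hook that rebuilds the initramfs for
# one kernel package after the whole transaction has completed.
print_initramfs_hook() {
    pkgbase=$1
    cat <<EOF
[Trigger]
Type = File
Operation = Install
Operation = Upgrade
Target = boot/vmlinuz-$pkgbase
Target = usr/lib/initcpio/*

[Action]
Description = Rebuilding initramfs for $pkgbase
When = PostTransaction
Exec = /usr/bin/mkinitcpio -p $pkgbase
EOF
}

print_initramfs_hook linux
```

The key line is `When = PostTransaction`: the image is built only after every package in the transaction, including lvm2 and readline, is in its final state.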
Comment by Doug Newgard (Scimmia) - Wednesday, 16 November 2016, 23:36 GMT
"Dave changed the assignees, probably because this is not a lvm2 bug but an issue with regenerating the initramfs."

I took the shotgun approach to assignees, assigned anyone who was affiliated with any of the packages involved. This is a big enough problem I wanted to pull in as many people as I could.
Comment by Christian Hesse (eworm) - Thursday, 17 November 2016, 08:04 GMT
Oops, I misread the mail... I thought that Tobias had been added to the assignees; in fact, Dave just removed himself. :-p

Doug, your approach is just fine. ;) I did not blame anybody for whatever.
Comment by Alexandre Figura (arugifa) - Thursday, 17 November 2016, 10:49 GMT
I also agree that writing a note about this issue on the front page and on Twitter is really worth it. There may be people who do not read the news, but there are also a lot of people who do. And for those people, it would save some hours of debugging :)
Comment by Tobias Powalowski (tpowa) - Thursday, 17 November 2016, 13:39 GMT
4.8.8-2 will include eworm's hook to solve this.
Comment by Doug Newgard (Scimmia) - Thursday, 17 November 2016, 17:26 GMT
Added thestinger to the notification list, for linux-grsec.
Comment by John (graysky) - Saturday, 19 November 2016, 12:23 GMT
> Every linux package (linux, linux-zen, linux-lts, linux-grsec, ... AUR packages like linux-git, ...) has to ship its own hook.

Are unique filenames for the hooks important? For example, what happens if an AUR package ships xx-linux.hook and another package also ships xx-linux.hook?
Comment by Jan Alexander Steffens (heftig) - Saturday, 19 November 2016, 12:25 GMT
That would just be a file conflict.
Comment by John (graysky) - Saturday, 19 November 2016, 12:31 GMT
@heftig - I should have mentioned that the PKGBUILD will rename '99-linux.hook' to '99-${pkgbase}.hook' on the filesystem so no package can provide the same hook if there is a unique pkgbase defined (see lines 126-128 of the official PKGBUILD). I guess I answered my own question <<red face>>
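
The renaming described here can be sketched as a template substitution at packaging time. The `%PKGBASE%` placeholder and the helper name `install_kernel_hook` are assumptions for illustration; the thread only states that the installed file is named after `pkgbase`.

```shell
#!/bin/sh
# Hypothetical helper: fills a hook template for one kernel package and
# writes it under a per-package file name, so two kernel packages with
# different pkgbase values never ship the same hook file.
install_kernel_hook() {
    template=$1
    pkgbase=$2
    hook_dir=$3
    mkdir -p "$hook_dir"
    sed "s|%PKGBASE%|$pkgbase|g" "$template" > "$hook_dir/99-$pkgbase.hook"
}

# Usage inside a package() function (paths are illustrative):
#   install_kernel_hook 99-linux.hook "$pkgbase" "$pkgdir/usr/share/libalpm/hooks"
```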
Comment by Piotr Górski (sir_lucjan) - Saturday, 19 November 2016, 14:17 GMT
@graysky - You are absolutely right. I've tested it with linux-ck 4.8.9 and linux 4.8.8-2 from [core].
Comment by Piotr Górski (sir_lucjan) - Saturday, 19 November 2016, 15:25 GMT
For example:

80-linux.hook
99-linux-bfq-haswell.hook
Comment by Freddy (AvelVras) - Thursday, 24 November 2016, 13:33 GMT
Hi everybody

I had the same problem with linux-lts (4.4.31-1 -> 4.4.34-1), lvm2 (2.02.166-1 -> 2.02.167-2) and readline (6.3.008-4 -> 7.0-1), and I can confirm that `mkinitcpio -p linux-lts` resolved it.

Sorry for my bad English.

See you
Comment by Doug Newgard (Scimmia) - Thursday, 08 December 2016, 16:49 GMT
Ping AndyRTR...

linux and linux-zen implemented this 3 weeks ago, and linux-grsec a week ago. Just waiting on linux-lts.
Comment by Andreas Radke (AndyRTR) - Thursday, 08 December 2016, 19:27 GMT
Fixed for lts kernel in svn trunk, will be released to our package tomorrow with new upstream release 4.4.37.
Comment by Pablo Lezaeta (Jristz) - Friday, 09 December 2016, 23:14 GMT
Is there any kernel still missing the hook now?
