FS#22955 - lvm2 init script stopped working for lvm pvs inside luks partitions

Attached to Project: Arch Linux
Opened by A Web (aweb) - Saturday, 19 February 2011, 03:21 GMT
Last edited by Eric Belanger (Snowman) - Tuesday, 17 May 2011, 20:43 GMT
Task Type Bug Report
Category Packages: Core
Status Closed
Assigned To Eric Belanger (Snowman)
Thomas Bächler (brain0)
Architecture All
Severity High
Priority Normal
Reported Version
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 1
Private No

Details

Description:

I have a machine which has an lvm pv on top of a luks partition (which itself is an lvm partition). My /etc/mkinitcpio.conf file contains this line:

HOOKS="base udev autodetect pata scsi sata lvm2 encrypt lvm2 resume filesystems"

My kernel command line is:

kernel /vmlinuz26 cryptdevice=/dev/mapper/host-encrypted:decrypted root=/dev/mapper/host-root ro resume=/dev/mapper/decrypted-swap

This configuration has worked for a long time. However, upon upgrading from lvm 2.02.82-2 -> 2.02.84-1, my machine suddenly stopped being able to boot, because the new script no longer recognizes the decrypted lvm partition.

One possible fix is to apply the following patch to lvm2_hook (i.e., the file that gets installed as /lib/initcpio/hooks/lvm2).

--- lvm2_hook.orig 2011-02-09 23:49:47.000000000 -0800
+++ lvm2_hook 2011-02-18 19:02:30.923337474 -0800
@@ -18,6 +18,7 @@
[ "${quiet}" = "y" ] && LVMQUIET=">/dev/null"

msg "Activating logical volumes..."
+ eval /sbin/lvm vgscan --ignorelockingfailure $LVMQUIET
eval /sbin/lvm vgchange --sysinit -a y $LVMQUIET
fi
}


Steps to reproduce:

Attempt to boot a machine in which the swap/resume partition is an lvm partition inside a luks encrypted partition.
This task depends upon

Closed by  Eric Belanger (Snowman)
Tuesday, 17 May 2011, 20:43 GMT
Reason for closing:  Fixed
Comment by Dave Reisner (falconindy) - Saturday, 19 February 2011, 03:48 GMT
Could you try the initscripts package in testing?
Comment by A Web (aweb) - Saturday, 19 February 2011, 07:30 GMT
Okay, just downloaded and installed initscripts-2011.02.1-1-x86_64.pkg.tar.xz from testing, and it did not fix the problem.

Just to clarify, the problem is happening within the intrd scripts. Things start breaking as soon as the resume hook is run (when it can't find my swap device), which is before switch_root is even called to switch to the real root file system. Since the initscripts package mostly seems to affect what happens after the root file system is mounted, it is unlikely to be the culprit. Most likely the problem is in lvm2, mkinitcpio, or cryptsetup (all of which contribute to the contents of /lib/initcpio). An alternative and reasonable fix would be to have /lib/initcpio/hooks/encrypt run vgscan and vgchange -ay after decrypting a volume.

While testing initscript, I once again made my machine unbootable. In case anyone has the same problem and runs across this bug report, here is how to fix it:

- In grub, add " break=y" to the end of the kernel command line

- When you get a shell, run "lvm vgscan --ignorelockingfailure"

- Then run "lvm vgchange --sysinit -ay"

- Then press control-D.

When your system boots, add the line "eval /sbin/lvm vgscan --ignorelockingfailure $LVMQUIET" to /lib/initcpio/hooks/lvm2 right before the vgchange line. Finally run "mkinitcpio -p kernel26" to re-generate your initrd files. Then your system should at least be bootable until the problem is resolved.


Comment by A Web (aweb) - Saturday, 19 February 2011, 07:43 GMT
By the way, this is the patch that broke things:

http://projects.archlinux.org/svntogit/packages.git/commit/lvm2/trunk?id=d35fc1200636ec40d77f36ad146ff3a5b09ec6f3

The logic behind the patch seems to be that vgscan is no longer initially needed. However, it is needed when used in conjunction with the encrypt hook. This patch is likely to make people's laptops unbootable if they followed the instructions on the following arch wiki page, which says, "The easiest and best method is to set up LVM on top of the encrypted partition instead of the other way around."

https://wiki.archlinux.org/index.php/System_Encryption_with_LUKS_for_dm-crypt#Encrypting_a_LVM_setup

If you still want to avoid the extra vgscan in the common (non-encrypted case), just add vgscan in the encrypt hook, which is part of the cryptsetup package.

Comment by Dave Reisner (falconindy) - Saturday, 19 February 2011, 17:54 GMT
Sorry, I misread the initial report and somehow thought this was initscripts related.

Can you try the attached hook in place of the current lvm2 hook? I don't have an actual block dev I can try this on, so I've only checked the script through busybox ash for syntax errors.
   lvm2 (0.8 KiB)
Comment by A Web (aweb) - Saturday, 19 February 2011, 18:38 GMT
Yes, I can confirm that the revised lvm you supplied fixes my specific problem, and is probably worth committing soon.

However, even this revised script does not seem to be as robust as it could be. For instance, it would not work if for some reason dm-crypt were compiled into the kernel, or if there is some other module that makes lvm pvs available (e.g., what happens if you have some call to mdadm which also requires lvm vgscan?).

What about having some flag file like /tmp/.need_vgscan. Have both the encrypt and mdadm scripts touch /tmp/.need_vgscan. Have lvm2 run vgscan if the file exists, and moreover have it create /tmp/.need_vgscan. This would guarantee that if the lvm2 script is run twice, it will always run vgscan the second time, which would provide a catch-all fix to this problem by just editing mkinitcpio.conf should anyone run into the problem in an unanticipated contexts.

Another failsafe option might be to modify poll_device so that on failure it tries vgscan/vgchange. Or at the very least, if poll_device fails, print some message suggesting booting with break=y and manually running vgscan/vgchange.

I think it's worth being a little paranoid about this, just because making machines non-bootable is the kind of thing that can get people upset about arch linux if they don't know how to fix it. (If I hadn't known about the break=y trick, I would be pretty unhappy.)
Comment by Dave Reisner (falconindy) - Saturday, 19 February 2011, 18:50 GMT
Good call on the non-module case. It would be smarter to call:

grep -q 'crypt_iterate_devices' /proc/kallsyms

...which finds dm_crypt functionality regardless of whether its a module or in-tree.

lvm is a bit of a foreign beast to me, so I'm not sure of what use cases necessitate calling 'vgscan --ignorelockingfailure' versus what use cases make initcpio explode when called. Creating temp files seems too hackish and seems to be prone to creating additional failure points. we should certainly get Thomas's input on this.

I've attached the revised hookscript that greps /proc/kallsyms instead.
   lvm2 (0.8 KiB)
Comment by A Web (aweb) - Saturday, 19 February 2011, 19:22 GMT
The new script does NOT work. The problem is that the version of busybox used by mkinitcpio does not have a grep command.

I guess stepping back, can I ask what the point of omitting the call to vgscan is? At least on my machine, vgscan seems to take only 0.15 seconds to run. How about just adding a kernel command-line option "skipvgscan" for people who really want to shave .2 seconds off their boot time, and having the default (in the absence of this option) be to run vgscan every time the lvm2 hook is run.

Alternatively, you could say there is a contract between hooks where any hook that potentially makes PVs available somehow has to signify it. The temporary file thing is a hack, but since all the hooks are run in the same shell script, how about just setting a variable, need_vgscan=1 and testing for that? A nice benefit of that is that if you run into problems, you can also add need_vgscan=1 to the kernel command-line. Then the encrypt hook would definitely set need_vgscan, and possibly the mdadm script, and lvm2 should set it as well, since the only reason you would run the lvm2 hook twice is that new PVs have become available. (Not sure if mdadm is a problem, but since it can create devices that contain PVs, it may suffer the same problem.)
Comment by Dave Reisner (falconindy) - Saturday, 19 February 2011, 19:37 GMT
Hrmmm, something's wrong here. busybox on the initcpio definitely has grep, but the --install function doesn't actually install it, so we should just replace the call with 'busybox grep -q ....'

I was under the assumption that adding the vgscan call in when it wasn't needed would break boot, as you mention in an earlier comment. If that's not the case, then it seems the simpler solution is just to leave it in. I'd opt for avoiding adding the need for more kernel cmdline options whenever possible.
Comment by A Web (aweb) - Saturday, 19 February 2011, 20:51 GMT
Agreed something is weird with busybox. If I run /lib/initcpio/busybox grep, it works. But if I just run /lib/initcpio/busybox, it doesn't list grep as a defined function, and it definitely doesn't get installed in the initrd environment.

Anyway, more importantly, I don't know of any situation in which vgscan would break anything. It might exit non-zero, but shouldn't break anything. Anyway, the lvm2 script is running vgchange, and there's definitely should be no situation in which vgscan would break but vgchange wouldn't. If anything, vgchange fails if you don't run vgscan.

I think what must have happened is that older kernels always required vgscan, but newer kernels automatically run the equivalent of vgscan on boot-up. This made vgscan redundant but harmless in the common case. (There are no reports of the older lvm2 package breaking, right?) So it probably seemed simpler to eliminate the redundant vgscan call... except that in certain situations (cryptsetup and possibly mdadm) new block devices come into existence that will not have been automatically scanned by the kernel.

The vgscan man page says:

In LVM2, vgscans take place automatically; but you might still need to
run one explicitly after changing hardware.

cryptsetup, since it creates a decrypted version of a device, seems to count as changing hardware.

Anyway, I'm all in favor of just always doing the vgscan as suggested in my original patch.
Comment by Dave Reisner (falconindy) - Saturday, 19 February 2011, 20:55 GMT
Agreed. If the call is harmless but necessary in some cases, then we should just add it.

Yet another attachment for dev review...
   lvm2 (0.8 KiB)
Comment by A Web (aweb) - Saturday, 19 February 2011, 21:16 GMT
Confirmed that the new version works. For consistency you might want both calls to lvm to be either lvm or /sbin/lvm, but either works and the script as-is works just fine. Thanks.
Comment by Thomas Bächler (brain0) - Sunday, 20 February 2011, 14:03 GMT
I am confused, since when does busybox ash support [[ constructs? We don't use them anywhere else, we use the POSIX [ constructs instead.
Comment by Dave Reisner (falconindy) - Sunday, 20 February 2011, 14:30 GMT
We should follow what our shell of choice (busybox ash, not just ash) supports, no? busybox has brought in a lot of convenience features from ksh/bash, such as parameter expansion and [[ (though i don't believe there's any difference between the single and double, unlike in bash). I've recently been working on cleaning up /lib/initcpio/init to cut back on forks and use this. Things like"

cmd="$(echo "${cmd}" | cut -d= -f1 | sed 's|\.|_|g')"
cmd="$(echo "${cmd}" | sed 's|-|_|g')=${rhs}"

Can be turned into:

cmd="${cmd%%=*}"
cmd="${cmd//[.-]/_}=$rhs"

Comment by A Web (aweb) - Sunday, 20 February 2011, 18:54 GMT
I hope this higher-level discussion about shell syntax doesn't delay some sort of fix being committed.

My possibly irrelevant opinion is that if busybox ash treats [[ and [ identically to [ in bash, then it's best to use only [, since you don't gain extra functionality from [[ and it might just confuse people who expect the bash semantics.

Moreover, for what it's worth, #, ##, %, and %% for variable modification are POSIX standard, so ought to be safe. But // is a bash extension, which you might want to avoid. You can see the actual specification here:

http://pubs.opengroup.org/onlinepubs/009695399/utilities/xcu_chap02.html#tag_02_06_02
Comment by Eric Belanger (Snowman) - Saturday, 07 May 2011, 02:49 GMT
Thomas: I'm thinking about simply readding the:
eval /sbin/lvm vgscan --ignorelockingfailure $LVMQUIET
which you removed in:
http://projects.archlinux.org/svntogit/packages.git/commit/lvm2/trunk/lvm2_hook?id=d35fc1200636ec40d77f36ad146ff3a5b09ec6f3
because it appeared to not be necessary. IMO, the hook syntax/improvement should be in its own bug report.

Do you have any objections about that?
Comment by A Web (aweb) - Saturday, 07 May 2011, 05:28 GMT
Yes, please do that! There's no reason for lvm2/luks to remain broken while the shell syntax question is being hashed out.
Comment by Thomas Bächler (brain0) - Saturday, 07 May 2011, 09:11 GMT
Heh, this is a weird case. The first 'lvm2' hook calls 'vgchange', which implicitly calls 'vgscan'. This creates a configuration in /etc/lvm/. The second lvm2 hook calls 'vgchange' again. Now that a configuration exists, it doesn't scan anymore. It seems we actually need the scan - for a minor speedup, we could run this:

[ -d /etc/lvm ] && /sbin/lvm vgscan --sysinit
Comment by Eric Belanger (Snowman) - Monday, 09 May 2011, 00:16 GMT
With the new version of lvm2 (2.02.85), I'm getting a:
/etc/lvm/cache/.cache unlink failed: Read-only filesystem
message on bootup. It's caused by the activate_vgs function in /etc/rc.d/functions. It doesn't seem to be harmful though. We could also just realease a lvm2 2.02.84-2 to fix this bug...
Comment by Thomas Bächler (brain0) - Monday, 09 May 2011, 08:52 GMT
Seems like they screwed up --sysinit again (it should not try to write to /etc, maybe they should start using /run). In my opinion, --sysinit should also scan unconditionally, as it cannot write to /etc/lvm/ anyway.
Comment by Eric Belanger (Snowman) - Friday, 13 May 2011, 00:59 GMT
Should be fixed in lvm2 2.02.85-1 in testing. Let me know how it goes.

FTR, I reported the bootup message on the lvm2 ML, and they came up with a fix. I've applied it to the package.

Loading...