FS#24288 - [udev] doesn't create LVM devices

Attached to Project: Arch Linux
Opened by Alessandro Massignan (ff0000.it) - Sunday, 15 May 2011, 15:03 GMT
Last edited by Tom Gundersen (tomegun) - Tuesday, 17 May 2011, 18:39 GMT
Task Type Bug Report
Category Packages: Core
Status Closed
Assigned To Tom Gundersen (tomegun)
Architecture i686
Severity High
Priority High
Reported Version
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 0
Private No

Details

Description:
After the last update, some weird messages began to appear while the init scripts run, like:

"bootlogd: cannot deduce real console device"

and randomly:

"fsck.ext3: No such file or directory while trying to open /dev/sda3
Possibly non-existent device?"

"/dev/sda3" is mounted as '/'.
In addition, the part of the init script that handles LVM doesn't
recognize any logical volumes (no corresponding devices are created
under "/dev").

I have used a custom kernel with no initrd image for 4 years.
In "/etc/rc.conf" USELVM is set to "yes", and I don't use any
testing packages (besides a few from AUR, but they're not part of
the core packages).


Additional info:
* udev 168-1
* initscripts 2011.05.2-1

[/etc/fstab]
/dev/sda2 /boot ext2 noauto,defaults 0 1
/dev/sda3 / ext3 defaults 0 1
/dev/sda5 swap swap defaults 0 0
/dev/sda6 /opt ext3 defaults 0 1
/dev/sda7 /tmp ext3 defaults 0 1
/dev/sda8 /usr ext3 defaults 0 1
/dev/sda9 /var ext3 defaults 0 1
/dev/vgff0000/home /home ext3 defaults 0 1
/dev/vgff0000/data /data ext3 defaults 0 1


Steps to reproduce:
"bootlogd" error: I can't say how to reproduce it, because I don't know
what it is :-)
"/dev/sda3" error: it fired about 3 times and then init worked fine.
LVM devices error (volumes not detected)


Temporary solution:
I downgraded udev from 168-1 to 167-2, and the "/dev/sda3" error and
LVM device creation seem solved.
The "bootlogd" error isn't solved at all.


Thoughts:
I don't know whether it's a bug or whether the latest updates just
don't get along with my usual setup; you tell me ;-D

Closed by  Tom Gundersen (tomegun)
Tuesday, 17 May 2011, 18:39 GMT
Reason for closing:  Fixed
Additional comments about closing:  Fixed in udev-168-2
Comment by Vladimir (snigga) - Sunday, 15 May 2011, 15:40 GMT
Same thing here. Temporarily solved by downgrading udev to 167-2.
Comment by Tom Gundersen (tomegun) - Sunday, 15 May 2011, 16:05 GMT
@snigga: do you have the exact same setup? no packages from testing, no initramfs?

bootlogd:
This is a new feature in this version of initscripts, the message means that for some reason it does not work on your system. Just ignore it. For details see "man bootlogd" (the BUGS section).

lvm:
This is a real bug, thanks for reporting!

Please try two things:

1)
After you booted, run
vgchange -ay
manually. Do the devices appear in /dev now? Any error messages?

2)
Could you edit your rc.sysinit to replace
/sbin/udevadm settle --quiet --timeout=${UDEV_TIMEOUT:-30}
with
/sbin/udevadm settle
Does this output anything during boot (if there is a problem at this point your boot might stall for a long time, please wait)?
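
For the first check, a small sketch like the one below could list which fstab devices have no node under /dev, instead of inspecting the directory by eye. The function name missing_fstab_devices is made up for illustration. It only looks at first-field entries that start with /dev/ and does not understand UUID= or LABEL= sources.

```sh
#!/bin/sh
# Sketch only: print every /dev/... source listed in an fstab-style file
# whose device node does not currently exist.
# "missing_fstab_devices" is a hypothetical helper, not part of initscripts.
missing_fstab_devices() {
    fstab=${1:-/etc/fstab}
    # Read the first field of each line; the "|| [ -n ... ]" part also
    # handles a final line that lacks a trailing newline.
    while read -r dev _rest || [ -n "$dev" ]; do
        case $dev in
            /dev/*) [ -e "$dev" ] || printf '%s\n' "$dev" ;;
        esac
    done < "$fstab"
}
```

Calling missing_fstab_devices once before and once after "vgchange -ay" would show whether entries such as the /dev/vgff0000/... volumes from the report appear after the manual activation.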
Comment by Vladimir (snigga) - Sunday, 15 May 2011, 16:19 GMT
Yep, no initramfs, no testing progs.
I've got custom kernel, compiled from source, but it's been working for quite a lot of time without any problems.

1) With udev-168-1, after this message ("bootlogd: cannot deduce real console device") it says that there's no /dev/sda3, then the root password prompt appears. I can even boot into runlevel 5, but first I have to manually mount /boot and /home and remount / read-write. But it seems that it can't open some serial devices like /dev/tty* - I have no terminal name in my bash prompt. All the block devices (/dev/sda*) are present in /dev.

2) right away
Comment by Vladimir (snigga) - Sunday, 15 May 2011, 16:29 GMT
2) Well that's queer. I've upgraded udev again and modified rc.sysinit and I'm getting no errors now. The system's loaded alright.
Comment by Tom Gundersen (tomegun) - Sunday, 15 May 2011, 16:31 GMT
@snigga: it seems to me there might be a problem with udev not settling correctly, or that it has to do with your kernel config. To test the settle, could you try adding "sleep 20" before the section containing the call to udevadm settle (i.e., not inside an if)?

@Everyone who sees this problem with a custom kernel:
Could you please let me know what kernel version and attach your .config files? Also, please verify that your kernel satisfies the udev requirements: <http://git.kernel.org/?p=linux/hotplug/udev.git;a=blob;f=README;h=f85637068547d681c6b79686e9add05abf1d44ca;hb=HEAD> (including the required but not strictly needed ones).
Comment by Tom Gundersen (tomegun) - Sunday, 15 May 2011, 16:33 GMT
@snigga: sorry, didn't see that second comment you made. What modification did you make to rc.sysinit? Could you attach it?

@the rest of you: could you try with snigga's rc.sysinit?
Comment by Vladimir (snigga) - Sunday, 15 May 2011, 16:34 GMT
I'm gonna try.
Here's my kernel config.
   kernelcfg (67.5 KiB)
Comment by Vladimir (snigga) - Sunday, 15 May 2011, 16:34 GMT
Right as you said earlier:

"Could you edit your rc.sysinit to replace
/sbin/udevadm settle --quiet --timeout=${UDEV_TIMEOUT:-30}
with
/sbin/udevadm settle"

And now it's working.
Comment by Tom Gundersen (tomegun) - Sunday, 15 May 2011, 16:36 GMT
@snigga: good it is working. may I also have your rc.conf (it seems that somehow UDEV_TIMEOUT is set to 0)?
Comment by Vladimir (snigga) - Sunday, 15 May 2011, 16:45 GMT
Sure.
   rc.conf (3.5 KiB)
Comment by Tom Gundersen (tomegun) - Sunday, 15 May 2011, 16:54 GMT
Thanks.

If you don't mind I have a few more things I'd like to ask, just so we know exactly what the problem is (and whether or not it is upstream in udev or in initscripts).

So UDEV_TIMEOUT is not set in your rc.conf. This should not make a difference, but could you try to set UDEV_TIMEOUT=30?

If that does not make a difference, could you test which of these works and which does not (then I think we will have it narrowed down):
/sbin/udevadm settle --timeout=30
/sbin/udevadm settle --quiet

Thanks so much for taking the time to hunt this down :-)
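
On why an unset UDEV_TIMEOUT should not matter: the rc.sysinit line uses the shell expansion ${UDEV_TIMEOUT:-30}, which falls back to 30 when the variable is unset or empty, but passes an explicit 0 through unchanged. The sketch below only demonstrates that expansion; effective_timeout is a hypothetical helper and does not call udevadm.

```sh
#!/bin/sh
# Sketch only: show what "${UDEV_TIMEOUT:-30}" expands to.
# Call with no argument to simulate an unset variable, or with one
# argument to simulate UDEV_TIMEOUT being set to that value.
# The function body runs in a subshell so the caller's environment
# is left untouched.
effective_timeout() (
    unset UDEV_TIMEOUT
    [ "$#" -gt 0 ] && UDEV_TIMEOUT=$1
    printf '%s\n' "${UDEV_TIMEOUT:-30}"
)

effective_timeout        # unset  -> prints 30
effective_timeout ""     # empty  -> prints 30
effective_timeout 0      # zero   -> prints 0
```

Current udevadm man pages document that a timeout of 0 only checks the queue and returns immediately, so an accidental UDEV_TIMEOUT=0 would behave very differently from the default.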
Comment by Vladimir (snigga) - Sunday, 15 May 2011, 17:10 GMT
That's fun. Seems that the problem has totally disappeared. Neither restoring the rc.sysinit, nor adding UDEV_TIMEOUT made that happen again. No idea what's changed... I'll be in touch =)
Comment by Tom Gundersen (tomegun) - Sunday, 15 May 2011, 17:14 GMT
Ok. Let me know if it reappears.

Anyone else?
Comment by Alessandro Massignan (ff0000.it) - Sunday, 15 May 2011, 17:58 GMT
Hi all,

I installed udev 168-1 again, but the system acts the same way. I tried the following lines:

* /sbin/udevadm settle --timeout=30
* /sbin/udevadm settle --quiet

in /etc/rc.sysinit, but it acts like "I, udev, (randomly) don't want to create the devices
you'd like"... :-|
I attach rc.conf and my .config... The kernel config seems fine, except for CONFIG_UNIX,
which I have as a module (@snigga: how is yours?).
Comment by Vladimir (snigga) - Sunday, 15 May 2011, 18:02 GMT
@ff0000.it: What do you mean? My config is attached some messages above, you can see for yourself: CONFIG_UNIX=y.
But I doubt that it's the key...
Comment by Alessandro Massignan (ff0000.it) - Sunday, 15 May 2011, 18:03 GMT
@snigga: sorry i didn't see it... my fault :-)
Comment by Tom Gundersen (tomegun) - Sunday, 15 May 2011, 18:07 GMT
I'm pretty sure the problem is lack of devtmpfs. I already thought this after seeing snigga's config, and ff0000.it's seems to confirm it.

Could you try with the standard arch kernel to see if it can still be reproduced?
Comment by Vladimir (snigga) - Sunday, 15 May 2011, 18:56 GMT
The problem has appeared again spontaneously.
First, I noticed that bootlogd refused to load ("bootlogd: cannot deduce real console device") while the system loaded alright. After spending some time reading the man page I tried to append some options to the kernel: console=tty0, then console=tty1. Neither seemed to work, but I got the message "failed to open the device /dev/sda3" again instead. I managed to trigger the behaviour by "playing" with these kernel options and the rc.sysinit file (with the "/sbin/udevadm settle" part) until the system totally refused to boot, giving me the root password maintenance prompt. Then I added a line "sleep=10"

after this part:

/sbin/udevadm settle
fi
---here---

and the system started to load fine. There were no additional info messages though.
Comment by Vladimir (snigga) - Sunday, 15 May 2011, 19:01 GMT
I'm gonna try to add devtmpfs into the kernel and see if this works.
Comment by Alessandro Massignan (ff0000.it) - Sunday, 15 May 2011, 19:06 GMT
@tomegun: I recompiled the kernel with CONFIG_DEVTMPFS=y (and switched CONFIG_UNIX from 'm' to 'y'),
reverted /etc/rc.sysinit back to its original form (with the --quiet and --timeout options set) and then
rebooted five times. The logical volumes were found every time, with the same notice printed:

/dev/mapper/vgff0000-home not set up by udev: Falling back to direct node creation.
/dev/mapper/vgff0000-data not set up by udev: Falling back to direct node creation.
The link /dev/vgff0000/home should had been created by udev but it was not found. Falling back to direct link creation.
The link /dev/vgff0000/data should had been created by udev but it was not found. Falling back to direct link creation.
2 logical volume(s) in volume group "vgff0000" now active

So I think the situation is not totally clear, but at least it is working.
I don't know what to do about the bootlogd issue; could a kernel option be involved? I have
CONFIG_UNIX98_PTYS=y and CONFIG_LEGACY_PTYS=n.


Thanks :-)
Comment by Tom Gundersen (tomegun) - Sunday, 15 May 2011, 19:06 GMT
I found reports about this in other distros, and there is now a thread on the udev mailing list discussing the problem (I linked to this bug there).

This has not been confirmed, but it looks like "udevadm settle" is somehow broken. If you add the sleep it will allow udev to settle, so that is why the problem is fixed. If you use devtmpfs then udev settles much faster anyway, so that is why only you guys with self-compiled kernels are seeing this problem.
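
As an aside on the sleep workaround: a fixed sleep waits the full time even when the device is already there. A hedged alternative, purely as a sketch, is to poll for one specific node with a timeout. The helper name wait_for_node is hypothetical; this is neither what "udevadm settle" does internally nor the fix discussed in this report, only a stopgap for testing.

```sh
#!/bin/sh
# Sketch only: wait up to $2 seconds (default 10) for the path $1 to
# appear, polling once per second.
# Returns 0 as soon as the path exists, 1 if the timeout expires.
wait_for_node() {
    node=$1
    remaining=${2:-10}
    while [ ! -e "$node" ]; do
        [ "$remaining" -gt 0 ] || return 1
        sleep 1
        remaining=$((remaining - 1))
    done
    return 0
}
```

For example, "wait_for_node /dev/sda3 20 || echo '/dev/sda3 did not show up'" placed before the fsck step would wait only as long as needed.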

If you are up to the task, I'm sure it would be much appreciated if you were to try compiling udev 168 from git, and see if the problem is fixed by reverting one or both of


commit ead7c62ab7641e150c6d668f939c102a6771ce60
Author: Kay Sievers <kay.sievers@vrfy.org>
Date: Wed Apr 20 02:18:22 2011 +0200

udevadm: settle - kill alarm()



commit a3eca08b19711bf6322e639ca2b2c81d91896a67
Author: Kay Sievers <kay.sievers@vrfy.org>
Date: Wed Apr 13 18:44:28 2011 +0200

udevadm: settle - watch queue file
Comment by Vladimir (snigga) - Sunday, 15 May 2011, 19:08 GMT
Ok, that's clear. I'll see what I can do.

And the bootlogd still refuses to work. Any ideas?
Comment by Tom Gundersen (tomegun) - Sunday, 15 May 2011, 19:10 GMT
@ff0000.it: great! thanks for reporting back.

About bootlogd: try commenting out your /dev/pts line from fstab. If that does not work, please open a separate bug report against initscripts (as it is unrelated and much less critical ;-) ).
Comment by Alessandro Massignan (ff0000.it) - Sunday, 15 May 2011, 19:18 GMT
@tomegun: you're welcome, thanks to you all for serving us one of the best distros out there :-)
I have some work to finish, then (if sleep doesn't fall on me) I'll give udev-git and pts a try.
Comment by Alessandro Massignan (ff0000.it) - Sunday, 15 May 2011, 19:47 GMT
@tomegun: here I am :-). I cloned the udev git repo, set up a tarball to compile with the ABS
scripts (with a little adaptation), commented out devpts in /etc/fstab and rebooted: the bootlogd
issues are gone (there's another one I haven't mentioned, but it belongs to initscripts) and the LVM2
VGs are detected and their devices created (the notice "... not set up by udev:
Falling back to direct node creation... blah, blah, blah, ..." still remains).
So it seems almost resolved ;-D

Thanks mate!
Comment by Vladimir (snigga) - Sunday, 15 May 2011, 19:50 GMT
As for me - I recompiled kernel with devtmpfs and all issues are resolved, bootlogd loads fine.

Thanks a lot!
Comment by Alessandro Massignan (ff0000.it) - Sunday, 15 May 2011, 20:03 GMT
@snigga: did you install udev as a package or from git? Did you set CONFIG_DEVTMPFS_MOUNT in the kernel config?
@tomegun: the issue during init is that mkdir finds 2 pre-existing directories ("/tmp/.ICE-unix" and
"/tmp/.X11-unix"), so I suppose it either doesn't test for their existence or fails to delete them first.
Comment by Tom Gundersen (tomegun) - Sunday, 15 May 2011, 20:07 GMT
@ff0000.it: see FS#24279.
Comment by Vladimir (snigga) - Sunday, 15 May 2011, 20:08 GMT
@ff0000.it: no, I've just set CONFIG_DEVTMPFS_MOUNT in the kernel config and everything's gone fine.
Comment by Alessandro Massignan (ff0000.it) - Monday, 16 May 2011, 06:58 GMT
@tomegun: thanks for the redirection, I'm one of the many with a traditional filesystem mounted on /tmp, so I temporarily edited the init script to force the clean-up of that directory.
However, for this issue do we select CONFIG_DEVTMPFS, or CONFIG_DEVTMPFS_MOUNT, or both? And can we ignore messages like these (output of the
"vgchange" command in activate_vgs() of /etc/rc.d/functions):

/dev/mapper/... not set up by udev: Falling back to direct node creation.
/dev/mapper/... not set up by udev: Falling back to direct node creation.
The link /dev/... should had been created by udev but it was not found.
Falling back to direct link creation.
The link /dev/... should had been created by udev but it was not found.
Falling back to direct link creation.
2 logical volume(s) in volume group "..." now active

? Last question: is this a real bug?!? I mean, to solve it we only changed
our kernel configuration (we went off the rails by using a custom kernel
instead of the packaged one), so as far as I'm concerned it was my "fault" ;-).

Thanks again for support! :-)
Comment by Gerardo Exequiel Pozzi (djgera) - Monday, 16 May 2011, 07:08 GMT
@Ale: Only CONFIG_DEVTMPFS, because devtmpfs is mounted manually by the init script in the initramfs and/or by rc.sysinit (if not mounted already, for example on systems without an initramfs).
Comment by Vladimir (snigga) - Monday, 16 May 2011, 08:13 GMT
@ff0000.it: "However, for this issue do we select CONFIG_DEVTMPFS, or CONFIG_DEVTMPFS_MOUNT, or both?"
In fact, it seems that you can't set CONFIG_DEVTMPFS_MOUNT without CONFIG_DEVTMPFS. I've set both and it works for me; I don't know if CONFIG_DEVTMPFS alone is sufficient, I haven't tested it.
Comment by Tom Gundersen (tomegun) - Monday, 16 May 2011, 08:27 GMT
CONFIG_DEVTMPFS is sufficient, but there is no reason to not also use CONFIG_DEVTMPFS_MOUNT as it will probably speed up boot a bit (it will let your kernel mount devtmpfs for you, otherwise this is done later by our initscripts).
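
For anyone comparing custom kernel configs, the options mentioned in this thread can be read out of a .config file with a short sketch like this. The helper name kconfig_state is made up for illustration; it assumes the usual "CONFIG_X=y" / "# CONFIG_X is not set" layout of a kernel .config.

```sh
#!/bin/sh
# Sketch only: print the value of one kernel option from a .config file.
# Prints "y" or "m" (or another value) when the option is set, and
# "unset" when it is absent or listed as "# CONFIG_X is not set".
kconfig_state() {
    opt=$1
    cfg=$2
    # Match the option name exactly up to "=", so CONFIG_DEVTMPFS does
    # not accidentally match CONFIG_DEVTMPFS_MOUNT.
    val=$(sed -n "s/^${opt}=\(.*\)\$/\1/p" "$cfg" | head -n 1)
    printf '%s\n' "${val:-unset}"
}
```

For example, "kconfig_state CONFIG_DEVTMPFS /usr/src/linux/.config" (the path is an assumption) should print "y" on a kernel that avoids the problem discussed here, and CONFIG_UNIX can be checked the same way.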
Comment by Tom Gundersen (tomegun) - Monday, 16 May 2011, 08:29 GMT
About the original bug, if someone can still reproduce it, there was a report that reverting <http://git.kernel.org/?p=linux/hotplug/udev.git;a=commit;h=ff2c503df091e6e4e9ab48cdb6df6ec8b7b525d0> will fix it. Can someone confirm?
Comment by Gerardo Exequiel Pozzi (djgera) - Monday, 16 May 2011, 14:33 GMT
@Tom: CONFIG_DEVTMPFS_MOUNT (or devtmpfs.mount=1 at cmdline) is only functional when root-FS is managed directly by kernel routines and not via initramfs mechanism ;)
Comment by Tom Gundersen (tomegun) - Monday, 16 May 2011, 16:46 GMT
@djgera: yes, that's correct I should have said that. I just assumed that since these guys had self-compiled kernels they did not use an initramfs, but that might of course not be the case :-)
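
Whichever component ends up mounting it, the result can be checked after boot by looking at what is mounted on /dev. The following is a minimal sketch; dev_fstype is a hypothetical helper that reads a /proc/mounts-style file, and the last entry for /dev is taken as the effective one.

```sh
#!/bin/sh
# Sketch only: print the filesystem type mounted on /dev according to a
# /proc/mounts-style file, or "none" if /dev is not a mount point.
# When /dev appears more than once, the last entry wins.
dev_fstype() {
    mounts=${1:-/proc/mounts}
    awk '$2 == "/dev" { t = $3 } END { print (t == "" ? "none" : t) }' "$mounts"
}
```

On a kernel with CONFIG_DEVTMPFS one would expect "devtmpfs" here; seeing "tmpfs" instead would suggest that /dev is populated by udev alone, which is the situation this report ties to the slow settle.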
Comment by Igor Saric (karabaja4) - Tuesday, 17 May 2011, 02:01 GMT
@Tom Gundersen (tomegun)

these are the answers you wanted from the bug FS#24272. Sorry it took so long, I was unavailable.

1. Both
-- /proc/sys/kernel/hotplug
-- /sys/kernel/uevent_helper
are completely empty.

2. I use stock ARCH kernel

3. Replacing
-- /sbin/udevadm settle --quiet --timeout=${UDEV_TIMEOUT:-30}
with
-- /sbin/udevadm settle
made absolutely no difference in boot process or boot messages.

4. ** If you keep everything up to date, except downgrade udev, does that solve the problem? **
Yes, if everything is up to date with downgraded udev (and mkinitcpio, because it requires newest udev) everything works fine.

5. Again, I used and currently am using stock ARCH kernel.

Something I noticed that may or may not be useful: with udev 167 the system waited noticeably longer on "waiting for udev events to be processed", like 2 seconds longer. I wonder if that wait period is the cause of the USB drive not being recognized, or if it is a udev bug like this one...
Comment by Tom Gundersen (tomegun) - Tuesday, 17 May 2011, 10:02 GMT
@karabaja4: thanks for the feedback. Your last comment is useful. It means that you almost certainly are hit by the udev bug. Once we fix that your system will probably work again (the problem with USB is still there, but you might not notice it if udev takes long enough to settle other events).

If the USB problems persist and you want to keep using USB at boot, then I can only suggest trying systemd (from community) as it does not rely on events to settle. There is however nothing we can do in initscripts due to the limitations of its design (it assumes all hardware is there at boot and never changes).
Comment by Tom Gundersen (tomegun) - Tuesday, 17 May 2011, 11:33 GMT
Hi all,

Could anyone still experiencing this (I guess it is only karabaja4 left?) try to see if this package solves the problem: <http://www.pps.jussieu.fr/~teg/udev-168-2-x86_64.pkg.tar.xz>? (I will upload the i686 version soon).

I notice that with this package my udev settle takes noticeably longer, so it is promising.
Comment by Hector Mtz-Seara Monne (hseara) - Tuesday, 17 May 2011, 12:39 GMT
Hi all,
I have the problem with a USB disk at boot. I'm using an i686 system. Waiting for a solution if possible :). Otherwise I'm moving to systemd.
Comment by Igor Saric (karabaja4) - Tuesday, 17 May 2011, 18:31 GMT
@Tom: both your udev 168-2 package from the link you provided, and the udev 168-2 package that was uploaded to the repo today, fix the problem for me. It takes a bit longer for udev events to be processed but the drive is recognized.
Comment by Tom Gundersen (tomegun) - Tuesday, 17 May 2011, 18:39 GMT
@karabaja4: thanks for letting me know. I'll close this now then.

The reason it takes longer is that, before, udev would not wait for the events to finish processing (that was the bug).
