FS#3369 - ldap group management breaks udev on start
Attached to Project:
Arch Linux
Opened by Alex Matviychuk (alexmat) - Friday, 21 October 2005, 20:33 GMT
Last edited by arjan timmerman (blaasvis) - Wednesday, 02 November 2005, 10:13 GMT
Details
I've been using udev and ldap user and group authentication
for a while now, but the latest udev update makes my machine
stuck on [busy] when loading udev. If I set nsswitch.conf
=> "group files" instead of "group files ldap",
everything works fine. "group files ldap" used to work fine.
I checked and ldap groups are working fine if I enable it
after bootup. The thing is the ldap server is not booted up
before udev.. but this was never necessary before.
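For anyone comparing setups, the two variants described above would look roughly like this in the file itself. This is only a sketch of the relevant line; the rest of the file is assumed to be unchanged.

```
# /etc/nsswitch.conf (fragment)

# Hangs at [busy] during boot for the reporter: local files first, then LDAP
group: files ldap

# Boots normally: local files only
#group: files
```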
OK, that's all history now because I got sick of finding workarounds for workarounds. I dug into udev and found what causes it to halt. It is indeed trying to resolve group names against an LDAP server that's active, but on the network, and since the network services don't start without udev it becomes a circular dependency.
However, udev worked just fine a few updates ago, so what happened? udev.rules used to assign devices to groups using numbers... that is, until recently. Now half the rules use numbers and half use names. I switched all group names to their numerical mappings and voilà! Everything is smooth again.
I don't know how to resolve this in a clean manner for UDEV and LDAP. Putting in numbers instead of group names is a chore and not all systems may use the same mappings (although I would think most people stick with the default group mappings). However, I can't imagine how NSS_LDAP can work with the current UDEV, because the system insists on timing out waiting for an LDAP server it's never going to reach.
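The name-to-number substitution described above can be scripted rather than done by hand. The following is a minimal sketch, not what was actually run here: `numeric_groups` is a hypothetical helper name, and it assumes the rules spell group assignments as GROUP="name" and that the local group file has the usual name:passwd:gid:members layout. It reads the group file directly, so no NSS (and therefore no LDAP) lookup is involved.

```shell
#!/bin/sh
# Rewrite GROUP="name" to GROUP="gid" using only a local group file.
# Usage: numeric_groups GROUP_FILE RULES_FILE   (result goes to stdout)
numeric_groups() {
    awk -F: '
        # First file: group database, fields are name:passwd:gid:members
        NR == FNR { gid[$1] = $3; next }
        # Second file: rules. Replace every GROUP="name" whose name is known
        # and leave unknown names (and numeric values) untouched.
        {
            line = $0
            out = ""
            while (match(line, /GROUP="[^"]*"/)) {
                name = substr(line, RSTART + 7, RLENGTH - 8)
                if (name in gid)
                    repl = "GROUP=\"" gid[name] "\""
                else
                    repl = substr(line, RSTART, RLENGTH)
                out = out substr(line, 1, RSTART - 1) repl
                line = substr(line, RSTART + RLENGTH)
            }
            print out line
        }
    ' "$1" "$2"
}

# Example (paths are assumptions, adjust to your system):
#   numeric_groups /etc/group /etc/udev/rules.d/udev.rules > udev.rules.numeric
```

The output would still need to be reviewed and copied into place by hand, and the substitution has to be repeated whenever a udev update replaces the rules file.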
I did a man on nsswitch.conf and there were some interesting bits in there about switches like TRYAGAIN and UNAVAIL, however, I could not get any of them to make UDEV skip the LDAP entry in the nsswitch.conf on boot.
There must be a nice way to do this that I am overlooking. Help me Obi Judd Kenobi! You're my only hope ;P
A workaround solved the matter...
--- start_udev 2005-12-18 17:21:41.000000000 +0100
+++ start_udev.new 2005-12-18 17:21:09.000000000 +0100
@@ -92,8 +92,8 @@
# You can use the shell scripts above by calling run_udev or execute udevstart
# which does the same thing, but much faster by not using shell.
# only comment out one of the following lines.
-#run_udev
-/sbin/udevstart
+run_udev
+#/sbin/udevstart
echo "making extra nodes"
make_extra_nodes
Thanks for sharing :)
EDIT:
tell us which version of udev did work?
http://cvs.archlinux.org/cgi-bin/viewcvs.cgi/base/udev/udev.rules.diff?r1=1.7&r2=1.8&cvsroot=Current&only_with_tag=MAIN
Version 1.7 of udev.rules was the last one to work, because 1.8 contains entries of the form group="$VALUE". When the system boots, it tries to resolve $VALUE to a number. I have my nsswitch.conf set to: group files ldap, so it tries to resolve the name against an LDAP server and times out (hangs indefinitely). I changed all the group names to group numbers and udev will now start up fine. But it breaks again with every udev update. My options are to disable ldap (not really an option for me at this point) or mess with udev all the time (although I still haven't tried the suggestion from dimorph, maybe that will help things).
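To see which names in a rules file would actually go past the local group file, something like the sketch below may help. It is an illustration only: `missing_groups` is a hypothetical helper, it assumes rules written as GROUP="name", and it relies on the usual NSS behaviour where a name found in "files" is answered locally and a name not found there is passed on to the next source (ldap in this setup).

```shell
#!/bin/sh
# Print group names used in a rules file that are absent from a local group file.
# Usage: missing_groups GROUP_FILE RULES_FILE
missing_groups() {
    grep -o 'GROUP="[^"]*"' "$2" |
    sed 's/^GROUP="//; s/"$//' |
    sort -u |
    while IFS= read -r name; do
        case $name in
            '')       continue ;;  # empty value, nothing to look up
            *[!0-9]*) ;;           # contains a non-digit: needs a name lookup
            *)        continue ;;  # purely numeric: no lookup needed
        esac
        # Exact match on the first field of the group file
        awk -F: -v n="$name" '$1 == n { found = 1 } END { exit !found }' "$1" ||
            printf '%s\n' "$name"
    done
}

# Example (paths are assumptions):
#   missing_groups /etc/group /etc/udev/rules.d/udev.rules
```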
Just to be clear, version 1.7 of udev.rules works fine with ldap, 1.8 does not.
You'll notice there are no instances of group="$VALUE" in 1.7.
All big distros use group="name", so I think that's the standard way.
I don't know how the others deal with ldap, but their start_udev doesn't differ from ours.
https://www.redhat.com/archives/fedora-devel-list/2005-September/msg00406.html
Unfortunately, that thread doesn't resolve anything.
I know I'm not the only one with the problem and it may not be an Arch Linux specific thing, but it is a problem as far as I can tell. Are any of the devs using an nss_ldap setup with a recent udev?
Did the workaround from Dimorph work at all? If so, we can use that until a fix comes from upstream.
I set "bind_policy soft" in /etc/nss_ldap.conf and now all is well. I get messages that it's failing a few ldap lookups early in the boot (before the modules are loaded and the network started up) but it just continues on its merry way after that, and the ldap kicks in fine once the network comes up.
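For reference, the change amounts to a single line. The comment is my reading of the nss_ldap documentation rather than something verified in this thread: with the default (hard) policy the library keeps retrying an unreachable server, while soft makes the lookup fail right away.

```
# /etc/nss_ldap.conf (fragment)

# Return immediately when the LDAP server cannot be reached,
# instead of retrying until it answers.
bind_policy soft
```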
Thanks for the suggestions, all.