FS#13859 - [initscripts] Udev with LDAP blocked at "Starting Udev daemon"
Attached to Project:
Arch Linux
Opened by Paul Ezvan (paulez) - Wednesday, 18 March 2009, 21:28 GMT
Last edited by Tom Gundersen (tomegun) - Sunday, 27 March 2011, 16:24 GMT
Details
Description:
Since the last udev update, the system doesn't start and stays blocked at "Starting Udev daemon". The system uses LDAP authentication with nss_ldap and PAM. The problem occurs on two different systems; I haven't rebooted the others :p It was previously working fine without the workaround described in the wiki. I am not sure that LDAP is the problem, but I don't know how to find out what is blocking the udev daemon startup.
Additional info:
* package version(s): udev-139-1
Steps to reproduce:
1. Configure nss to use LDAP
2. Reboot
3. The system is blocked during startup at the step "Starting Udev daemon".
Closed by Tom Gundersen (tomegun)
Sunday, 27 March 2011, 16:24 GMT
Reason for closing: No response
Additional comments about closing: See my last comment.
hosts: files dns
My best guess is that the new udev added several rules that rely on users/groups which are not present locally in /etc/passwd and /etc/group, so the lookup falls through to the next source listed in nsswitch.conf, and if your nsswitch.conf looks like
passwd: files ldap
group: files ldap
shadow: files ldap
then nss_ldap will try to fetch the necessary entries from LDAP. In the process it does a hostname resolution and hangs in an infinite loop, since it can't contact the LDAP server for the hostname lookup. It's just a guess, but I suppose the fix would be to make sure all users/groups used by udev are present in /etc/passwd and /etc/group.
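To test that guess, one could list the user and group names that the udev rules assign and compare them with the local files. The following is only a minimal sketch: the helper names `referenced_names` and `local_names` are made up for illustration, the regular expression covers only plain `OWNER=`/`GROUP=` style assignments, and the rules directories in the `__main__` part are assumed locations that may differ on your system.

```python
import glob
import re

# Matches assignments such as OWNER="root", GROUP:="disk" or GROUP+="video".
# Comparisons (==) and keys like ENV{GROUP} are deliberately not matched.
ASSIGNMENT = re.compile(r'\b(OWNER|GROUP)\s*(?::=|\+=|=)\s*"([^"]*)"')


def referenced_names(rules_text):
    """Return (users, groups) named by OWNER/GROUP assignments in udev rules text."""
    users, groups = set(), set()
    for line in rules_text.splitlines():
        if line.lstrip().startswith("#"):
            continue  # skip comment lines
        for key, value in ASSIGNMENT.findall(line):
            # Numeric IDs and substitutions ($..., %...) are skipped in this sketch.
            if not value or value.isdigit() or "$" in value or "%" in value:
                continue
            (users if key == "OWNER" else groups).add(value)
    return users, groups


def local_names(db_text):
    """Return the names found in the first field of passwd/group-style text."""
    names = set()
    for line in db_text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            names.add(line.split(":", 1)[0])
    return names


if __name__ == "__main__":
    users, groups = set(), set()
    # Assumed rule locations; adjust to wherever your rules actually live.
    paths = glob.glob("/lib/udev/rules.d/*.rules") + glob.glob("/etc/udev/rules.d/*.rules")
    for path in paths:
        with open(path, encoding="utf-8", errors="replace") as fh:
            found_users, found_groups = referenced_names(fh.read())
        users |= found_users
        groups |= found_groups
    with open("/etc/passwd") as fh:
        print("users missing from /etc/passwd:", sorted(users - local_names(fh.read())))
    with open("/etc/group") as fh:
        print("groups missing from /etc/group:", sorted(groups - local_names(fh.read())))
```

Any name it prints would have to be resolved through the next source in nsswitch.conf, which is the situation suspected above.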
group: files ldap[unavail=return]
The configuration "hosts: dns ldap" is the default given by nsswitch.conf.ldap. Maybe we should change this to avoid this kind of problem?
I have the same problem. Arch needs more than 180 seconds to load udev. The workaround doesn't work for me. What should I do?
Greetings from Germany!
please paste your exact nsswitch.conf (without a workaround)
passwd: files ldap
group: files ldap
shadow: files ldap
publickey: files
hosts: files dns ldap
networks: files
protocols: files
services: files
ethers: files
rpc: files
netgroup: files
# End /etc/nsswitch.conf
I resolved the problem by setting "bind_policy soft" in nss_ldap.conf instead of "bind_policy hard".
bind_policy soft seems to be a bad workaround because LDAP resolution may fail under high network load.
This workaround seems to be better: add the following to /etc/nss_ldap.conf:
nss_reconnect_tries 1
nss_reconnect_sleeptime 1
nss_reconnect_maxsleeptime 8
nss_reconnect_maxconntries 2
I have just tested it on an Arch box: it boots fine with bind_policy hard, so for the moment I prefer this solution. I think it should be added to the default /etc/nss_ldap.conf file.
I guess that is a good idea, not only for this but also for other purposes: udevd can have a separate startup script, which can then use a conf.d/udev file or something like this.
Are people still experiencing this problem?
@Kaiting: could you find out what users/groups are missing? I just looked through my /etc/group and /lib/udev/rules.d/, and everything looks to be the way it should.
In my opinion, this problem should be solved by filing bug reports against packages that add udev rules, but do not add the required users/groups.
There is a similar RedHat bug report (https://bugzilla.redhat.com/show_bug.cgi?id=234541), in which the udev maintainer comments:
"Kay Sievers 2007-04-02 05:41:03 EDT
Please just fix your /etc/nsswitch.conf to lookup /etc/group first. And make
sure, _all_ required system users exist in the /etc/group file.
This isn't a bug in udev nor glibc. System users which are not stored on the
local system, but used in udev-rules, are a "configuration bug", which can't be
worked around."
I suggest closing this bug.
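The first half of that advice, looking up the local files first, can be checked mechanically. Below is a rough sketch under the assumption that nsswitch.conf uses the simple "database: source source ..." layout pasted earlier in this thread; `first_sources` and `not_files_first` are hypothetical helper names, not part of any existing tool.

```python
def first_sources(nsswitch_text):
    """Map each database in nsswitch.conf-style text to its first listed source."""
    result = {}
    for line in nsswitch_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if ":" not in line:
            continue
        database, _, sources = line.partition(":")
        fields = sources.split()
        if fields:
            result[database.strip()] = fields[0]
    return result


def not_files_first(nsswitch_text, databases=("passwd", "group")):
    """Return the databases whose first source is not 'files'.

    A database that is not listed at all is reported as well.
    """
    first = first_sources(nsswitch_text)
    return [db for db in databases if first.get(db) != "files"]


if __name__ == "__main__":
    with open("/etc/nsswitch.conf") as fh:
        print("not looked up in local files first:", not_files_first(fh.read()))
```

An empty list only confirms the lookup order; whether every user and group used by the udev rules actually exists locally still has to be checked separately.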