Arch Linux


FS#25900 - thunderbird (6.x, 5.x) doesn't work for users with NFS home directories

Attached to Project: Arch Linux
Opened by Marek Kozlowski (guayasil) - Monday, 05 September 2011, 12:55 GMT
Last edited by Jan de Groot (JGC) - Monday, 28 November 2011, 10:34 GMT
Task Type: Bug Report
Category: Packages: Extra
Status: Closed
Assigned To: Ionut Biru (wonder)
Architecture: x86_64
Severity: Critical
Priority: Normal
Reported Version:
Due in Version: Undecided
Due Date: Undecided
Percent Complete: 100%
Votes: 0
Private: No

Details

Description:
Arch, XFCE 4.8.
For an unknown reason, thunderbird works fine for users with local home directories. For those with NFS homes, a `segmentation fault' occurs. Tracing gives no useful error.

I tried configuring my NFS client according to https://wiki.archlinux.org/index.php/NFS , with or without the 'nolock' fstab option -- no difference.

AFAIK my server serves NFSv3, but no other application (libreoffice, firefox, numerous graphical and development apps, etc.) complains, and there are no errors in the logs.


Closed by  Jan de Groot (JGC)
Monday, 28 November 2011, 10:34 GMT
Reason for closing:  Won't fix
Additional comments about closing:  This is a design flaw in nss_ldap. Either use nscd to move the LDAP lookup code out of process, or switch to nss_ldapd, which fixes this design flaw.
Comment by Ionut Biru (wonder) - Monday, 05 September 2011, 13:55 GMT
You need to recompile firefox/thunderbird with debug symbols by adding flags to mozconfig and removing others (especially the ones that enable stripping; run ./configure --help | grep debug to see what to add). You also need to add options=(!strip) to the PKGBUILD. After that you can run
gdb /usr/lib/thunderbird-6.0.1/thunderbird-bin

gdb must report that it found debug symbols.

Then you report this issue on http://bugzilla.mozilla.org
Comment by Marek Kozlowski (guayasil) - Monday, 05 September 2011, 14:37 GMT
Sorry for such an imprecise description; my impatience, my fault. I'm currently debugging and analyzing it... and the problem gets more and more mysterious. Give me 3-4 hours to think it over and I'll provide you with all the details.
Comment by Marek Kozlowski (guayasil) - Monday, 05 September 2011, 16:45 GMT
I have a local user (`uszatekm') and some remote users (for example, `kozlowskim'). Remote users authenticate via pam_ldap and nss_ldap. Their homes are located on a remote filesystem mounted via NFS; the `/etc/fstab' entries are:

194.29.178.12:/home/samba /home/samba nfs rw,nolock 0 0
194.29.178.12:/home/samba /home2/samba nfs rw,nolock 0 0

Everything works perfectly for both local and remote users except thunderbird, which segfaults if run by a remote user. I straced thunderbird for both users. The line (#7141 in the attached log for the local `uszatekm'):

1367 symlink("194.29.178.135:+1367", "/home/uszatekm/.thunderbird/7i6ywbr6.default/lock") = 0

is never reached for the remote user (`kozlowskim'); instead a segfault occurs.
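
To find the point of failure in an strace log like the one described here, it can help to print the last few calls before the crash. The following is a minimal sketch, assuming the log was written to a file (for example with `strace -f -o') and that strace marks the crash with a line containing SIGSEGV. The function name and the log file name are hypothetical, not taken from the attached logs.

```shell
#!/bin/sh
# Print the lines leading up to the first SIGSEGV marker in an strace log.
# Usage: calls_before_segv LOGFILE [CONTEXT_LINES]
calls_before_segv() {
    log=$1
    context=${2:-5}
    # -m 1 stops at the first matching line; -B prints the lines before it.
    grep -m 1 -B "$context" 'SIGSEGV' "$log"
}

# Example (hypothetical file name):
# calls_before_segv thunderbird-kozlowskim.strace 10
```

Comparing this output for the failing user with the same region of the working user's log shows which call is the last one both runs have in common.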

Mozilla applications don't like NFS, and a lock is involved, so it seemed obvious that there is some problem with locks on NFS. I traced it and tested numerous configurations and observed no difference. But finally I did an experiment: I removed the remote mounts and created a local home directory for the remote (LDAP) user `kozlowskim'. Well... exactly the same segfault occurs.

--> it seems to be unrelated to NFS (the same without it),
--> the only difference (except NFS) is that remote users authenticate via pam_ldap and nss_ldap. But it seems to have nothing to do with the segfault while creating a symlink/lock.

Seems like squaring the circle and I'm completely confused...
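
Since the remaining difference between the two users is the LDAP lookup path, one quick check is to see which NSS sources are configured for the passwd database. NSS modules are loaded into the process that performs the lookup, so a source listed here runs inside thunderbird itself unless a caching daemon handles the lookup. This is only a sketch; the function name is hypothetical and the default file location is the usual /etc/nsswitch.conf.

```shell
#!/bin/sh
# Print the lookup sources configured for one NSS database.
# Usage: nss_sources DATABASE [FILE]
nss_sources() {
    db=$1
    conf=${2:-/etc/nsswitch.conf}
    # Match lines such as "passwd: files ldap", then drop trailing comments.
    sed -n "s/^[[:space:]]*$db:[[:space:]]*//p" "$conf" |
        sed 's/[[:space:]]*#.*$//'
}

# Example:
# nss_sources passwd
# getent passwd kozlowskim
```

If the passwd line lists ldap, every program that resolves that user loads the LDAP NSS module into its own address space.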

Can anyone check/test how thunderbird works with NFS and LDAP users? Maybe we'll be able to identify the problem.
Comment by Jan de Groot (JGC) - Monday, 05 September 2011, 21:42 GMT
You might want to try recompiling the nss_ldap package. In an ideal world, NSS modules should be built against the running glibc. We've had several glibc updates in the meantime and we also changed CFLAGS/LDFLAGS in the process, so this could be caused by a glibc/nss_ldap mismatch.

Edit:
Try starting the nscd daemon, it should offload all the lookup code to a caching daemon, also speeds up your passwd/group lookups a lot.
Comment by Marek Kozlowski (guayasil) - Tuesday, 06 September 2011, 07:32 GMT
"Try starting the nscd daemon, it should offload all the lookup code to a caching daemon, also speeds up your passwd/group lookups a lot."

Bingo!!
Comment by Jan de Groot (JGC) - Tuesday, 06 September 2011, 08:04 GMT
And the next bug report will be "nscd has random crashes", which could be a duplicate of this bug. Did you try to recompile nss_ldap to rule this out?
Comment by Marek Kozlowski (guayasil) - Tuesday, 06 September 2011, 08:47 GMT
Just switched from Gentoo after "ten years of compiling" ;-)
I have some problems with nss_ldap compilation (`make'):

[...]
CVSVERSIONDIR=. vers_string -v
/bin/sh: vers_string: command not found
make[1]: *** [vers.c] Error 127
make[1]: Leaving directory `/home/aki/Downloads/nss_ldap/nss_ldap-265'
make: *** [all] Error 2

I did `chmod +x vers_string' and edited the Makefile according to http://134.75.123.21/twiki/bin/view/Main/LdapInstallation
Now I have:

make all-am
make[1]: Entering directory `/home/aki/Downloads/nss_ldap/nss_ldap-265'
CVSVERSIONDIR=
if gcc -DHAVE_CONFIG_H -I. -I. -I. -DLDAP_REFERRALS -DLDAP_DEPRECATED -D_REENTRANT -g -O2 -Wall -fPIC -MT vers.o -MD -MP -MF ".deps/vers.Tpo" -c -o vers.o vers.c; \
then mv -f ".deps/vers.Tpo" ".deps/vers.Po"; else rm -f ".deps/vers.Tpo"; exit 1; fi
gcc: error: vers.c: No such file or directory
gcc: fatal error: no input files
compilation terminated.
make[1]: *** [vers.o] Error 1
make[1]: Leaving directory `/home/aki/Downloads/nss_ldap/nss_ldap-265'
make: *** [all] Error 2

May I ask for some help?
Comment by Jan de Groot (JGC) - Tuesday, 06 September 2011, 09:10 GMT
Use abs for this:
# pacman -S abs
# abs extra/nss_ldap
$ cp -r /var/abs/extra/nss_ldap ~
$ cd ~/nss_ldap
$ makepkg

The commands prefixed with # should be run as root, the ones with $ as a regular user. After doing this, you'll have a recompiled nss_ldap package in ~/nss_ldap.
Comment by Marek Kozlowski (guayasil) - Tuesday, 06 September 2011, 10:20 GMT
Thank you very much for the howto (looks like good old ebuilds ;-) ).
I prepared a new package with ABS and installed it (`pacman -U ...pkg.tar.gz'), then I stopped nscd. Unfortunately, there is no difference between the "new" and the "old" package. It seems that nss_ldap is OK (not broken by minor glibc changes). I'm used to "new fascinating Mozilla's bugs" and it looks like one of them :-(
Comment by Jan de Groot (JGC) - Tuesday, 06 September 2011, 23:01 GMT
OK, this problem has been known upstream for 7 years (SEVEN!). It is caused by Thunderbird linking to the Mozilla LDAP C SDK, which is different from the OpenLDAP libraries that libnss-ldap links to. These libraries have some symbols with the same name, so when you use nss_ldap, the process ends up using a mix of Mozilla and OpenLDAP libraries, which causes crashes.

There are two ways to fix it:
- Build Thunderbird against the system libldap; there are patches available for that
- Switch to nss_ldapd
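
The symbol clash described above can be inspected by listing the dynamic symbols each library defines and intersecting the two lists. The following is a rough sketch, not a verified diagnosis of this bug: the function names are hypothetical and the library paths in the example are placeholders that must be adjusted to the installed files.

```shell
#!/bin/sh
# Print the dynamic symbols a shared library defines, one per line, sorted.
# Usage: defined_symbols LIBRARY
defined_symbols() {
    # nm -D reads the dynamic symbol table; --defined-only skips imports.
    # The symbol name is the last field of each output line.
    nm -D --defined-only "$1" | awk '{print $NF}' | LC_ALL=C sort -u
}

# Print the symbols present in both lists.
# Usage: common_symbols LIST_A LIST_B  (files with one sorted symbol per line)
common_symbols() {
    LC_ALL=C comm -12 "$1" "$2"
}

# Example (placeholder paths):
# defined_symbols /usr/lib/libldap.so > openldap.syms
# defined_symbols /path/to/thunderbird/MOZILLA_LDAP_LIBRARY.so > mozldap.syms
# common_symbols openldap.syms mozldap.syms
```

Any name printed by the last command is defined by both libraries, so a call made by one library may be resolved to the other library's implementation when both are loaded into the same process.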
Comment by Stefan J. Betz (encbladexp) - Saturday, 26 November 2011, 07:46 GMT
I think we should rebuild Thunderbird (Firefox too???) with system libldap.
Comment by Marek Kozlowski (guayasil) - Saturday, 26 November 2011, 08:09 GMT
Never had that problem with firefox. But "sicher ist sicher" (better safe than sorry) anyway...
