FS#5894 - ls segfaults

Attached to Project: Arch Linux
Opened by Johannes Jordan (FoPref) - Saturday, 25 November 2006, 15:13 GMT
Task Type Bug Report
Category System
Status Closed
Assigned To No-one
Architecture not specified
Severity Critical
Priority Normal
Reported Version 0.7.2 Gimmick
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 0
Private No

Details

Since the last update, ls segfaults reading /root/ on one of my systems!

The updated packages in question are listed in the log:
[11/24/06 23:05] upgraded glibc (2.4-4 -> 2.5-2)
[11/24/06 23:05] upgraded coreutils (6.4-2 -> 6.5-1)

As root, the following happens:
[root@fsi ~]# ls
Segmentation fault
[root@fsi ~]# ls --color=none
awstats-6.5 awstats-6.5.tar.gz crontab dokuwiki-2006-03-09 exclude mailman-backup-2006-09-18.tar.bz2 root-snapshots server-backup
[root@fsi ~]# ls --color=none -l
Segmentation fault
[root@fsi ~]# valgrind ls --color=none -l
==9776== Memcheck, a memory error detector.
==9776== Copyright (C) 2002-2006, and GNU GPL'd, by Julian Seward et al.
==9776== Using LibVEX rev 1606, a library for dynamic binary translation.
==9776== Copyright (C) 2004-2006, and GNU GPL'd, by OpenWorks LLP.
==9776== Using valgrind-3.2.0, a dynamic binary instrumentation framework.
==9776== Copyright (C) 2000-2006, and GNU GPL'd, by Julian Seward et al.
==9776== For more details, rerun with: -v
==9776==
==9776== Conditional jump or move depends on uninitialised value(s)
==9776== at 0x400AD59: _dl_relocate_object (in /lib/ld-2.5.so)
==9776== by 0x4003EC7: dl_main (in /lib/ld-2.5.so)
==9776== by 0x40135C5: _dl_sysdep_start (in /lib/ld-2.5.so)
==9776== by 0x40011E1: _dl_start (in /lib/ld-2.5.so)
==9776== by 0x4000846: (within /lib/ld-2.5.so)
==9776==
==9776== Conditional jump or move depends on uninitialised value(s)
==9776== at 0x400A9D4: _dl_relocate_object (in /lib/ld-2.5.so)
==9776== by 0x4003EC7: dl_main (in /lib/ld-2.5.so)
==9776== by 0x40135C5: _dl_sysdep_start (in /lib/ld-2.5.so)
==9776== by 0x40011E1: _dl_start (in /lib/ld-2.5.so)
==9776== by 0x4000846: (within /lib/ld-2.5.so)
==9776==
==9776== Conditional jump or move depends on uninitialised value(s)
==9776== at 0x400A889: _dl_relocate_object (in /lib/ld-2.5.so)
==9776== by 0x4003EC7: dl_main (in /lib/ld-2.5.so)
==9776== by 0x40135C5: _dl_sysdep_start (in /lib/ld-2.5.so)
==9776== by 0x40011E1: _dl_start (in /lib/ld-2.5.so)
==9776== by 0x4000846: (within /lib/ld-2.5.so)
==9776==
==9776== Conditional jump or move depends on uninitialised value(s)
==9776== at 0x400A891: _dl_relocate_object (in /lib/ld-2.5.so)
==9776== by 0x4003EC7: dl_main (in /lib/ld-2.5.so)
==9776== by 0x40135C5: _dl_sysdep_start (in /lib/ld-2.5.so)
==9776== by 0x40011E1: _dl_start (in /lib/ld-2.5.so)
==9776== by 0x4000846: (within /lib/ld-2.5.so)
==9776==
==9776== Conditional jump or move depends on uninitialised value(s)
==9776== at 0x400B161: _dl_relocate_object (in /lib/ld-2.5.so)
==9776== by 0x4003EC7: dl_main (in /lib/ld-2.5.so)
==9776== by 0x40135C5: _dl_sysdep_start (in /lib/ld-2.5.so)
==9776== by 0x40011E1: _dl_start (in /lib/ld-2.5.so)
==9776== by 0x4000846: (within /lib/ld-2.5.so)
==9776==
==9776== Conditional jump or move depends on uninitialised value(s)
==9776== at 0x400A889: _dl_relocate_object (in /lib/ld-2.5.so)
==9776== by 0x4003D64: dl_main (in /lib/ld-2.5.so)
==9776== by 0x40135C5: _dl_sysdep_start (in /lib/ld-2.5.so)
==9776== by 0x40011E1: _dl_start (in /lib/ld-2.5.so)
==9776== by 0x4000846: (within /lib/ld-2.5.so)
==9776==
==9776== Conditional jump or move depends on uninitialised value(s)
==9776== at 0x400A891: _dl_relocate_object (in /lib/ld-2.5.so)
==9776== by 0x4003D64: dl_main (in /lib/ld-2.5.so)
==9776== by 0x40135C5: _dl_sysdep_start (in /lib/ld-2.5.so)
==9776== by 0x40011E1: _dl_start (in /lib/ld-2.5.so)
==9776== by 0x4000846: (within /lib/ld-2.5.so)
==9776==
==9776== Conditional jump or move depends on uninitialised value(s)
==9776== at 0x400A9D4: _dl_relocate_object (in /lib/ld-2.5.so)
==9776== by 0x4003D64: dl_main (in /lib/ld-2.5.so)
==9776== by 0x40135C5: _dl_sysdep_start (in /lib/ld-2.5.so)
==9776== by 0x40011E1: _dl_start (in /lib/ld-2.5.so)
==9776== by 0x4000846: (within /lib/ld-2.5.so)
==9776==
==9776== Conditional jump or move depends on uninitialised value(s)
==9776== at 0x400AD59: _dl_relocate_object (in /lib/ld-2.5.so)
==9776== by 0x401165F: dl_open_worker (in /lib/ld-2.5.so)
==9776== by 0x400D6F1: _dl_catch_error (in /lib/ld-2.5.so)
==9776== by 0x4011068: _dl_open (in /lib/ld-2.5.so)
==9776== by 0x4126DA0: do_dlopen (in /lib/libc-2.5.so)
==9776== by 0x400D6F1: _dl_catch_error (in /lib/ld-2.5.so)
==9776== by 0x4126E90: dlerror_run (in /lib/libc-2.5.so)
==9776== by 0x4126FB5: __libc_dlopen_mode (in /lib/libc-2.5.so)
==9776== by 0x410322D: __nss_lookup_function (in /lib/libc-2.5.so)
==9776== by 0x41032BF: __nss_lookup (in /lib/libc-2.5.so)
==9776== by 0x4104D45: __nss_passwd_lookup (in /lib/libc-2.5.so)
==9776== by 0x40BFA23: getpwuid_r@@GLIBC_2.1.2 (in /lib/libc-2.5.so)
==9776==
==9776== Conditional jump or move depends on uninitialised value(s)
==9776== at 0x400A9D4: _dl_relocate_object (in /lib/ld-2.5.so)
==9776== by 0x401165F: dl_open_worker (in /lib/ld-2.5.so)
==9776== by 0x400D6F1: _dl_catch_error (in /lib/ld-2.5.so)
==9776== by 0x4011068: _dl_open (in /lib/ld-2.5.so)
==9776== by 0x4126DA0: do_dlopen (in /lib/libc-2.5.so)
==9776== by 0x400D6F1: _dl_catch_error (in /lib/ld-2.5.so)
==9776== by 0x4126E90: dlerror_run (in /lib/libc-2.5.so)
==9776== by 0x4126FB5: __libc_dlopen_mode (in /lib/libc-2.5.so)
==9776== by 0x410322D: __nss_lookup_function (in /lib/libc-2.5.so)
==9776== by 0x41032BF: __nss_lookup (in /lib/libc-2.5.so)
==9776== by 0x4104D45: __nss_passwd_lookup (in /lib/libc-2.5.so)
==9776== by 0x40BFA23: getpwuid_r@@GLIBC_2.1.2 (in /lib/libc-2.5.so)
==9776==
==9776== Invalid read of size 1
==9776== at 0x4022108: strlen (in /usr/lib/valgrind/x86-linux/vgpreload_memcheck.so)
==9776== by 0x8052F06: (within /bin/ls)
==9776== by 0x804B480: (within /bin/ls)
==9776== by 0x804D8F6: (within /bin/ls)
==9776== by 0x804F4C3: (within /bin/ls)
==9776== by 0x404A7C7: (below main) (in /lib/libc-2.5.so)
==9776== Address 0x0 is not stack'd, malloc'd or (recently) free'd
==9776==
==9776== Process terminating with default action of signal 11 (SIGSEGV)
==9776== Access not within mapped region at address 0x0
==9776== at 0x4022108: strlen (in /usr/lib/valgrind/x86-linux/vgpreload_memcheck.so)
==9776== by 0x8052F06: (within /bin/ls)
==9776== by 0x804B480: (within /bin/ls)
==9776== by 0x804D8F6: (within /bin/ls)
==9776== by 0x804F4C3: (within /bin/ls)
==9776== by 0x404A7C7: (below main) (in /lib/libc-2.5.so)
==9776==
==9776== ERROR SUMMARY: 20 errors from 11 contexts (suppressed: 0 from 0)
==9776== malloc/free: in use at exit: 16,231 bytes in 7 blocks.
==9776== malloc/free: 79 allocs, 72 frees, 23,333 bytes allocated.
==9776== For counts of detected errors, rerun with: -v
==9776== searching for pointers to 7 not-freed blocks.
==9776== checked 76,468 bytes.
==9776==
==9776== LEAK SUMMARY:
==9776== definitely lost: 0 bytes in 0 blocks.
==9776== possibly lost: 0 bytes in 0 blocks.
==9776== still reachable: 16,231 bytes in 7 blocks.
==9776== suppressed: 0 bytes in 0 blocks.
==9776== Reachable blocks (those to which a pointer was found) are not shown.
==9776== To see them, rerun with: --show-reachable=yes
Segmentation fault

This bug is 100% reproducible. Running find -ls in the same directory caused no problems.

Workaround: Re-installing older packages made ls work as expected:
[root@fsi pkg]# pacman -U glibc-2.4-4.pkg.tar.gz coreutils-6.4-2.pkg.tar.gz binutils-2.17-1.pkg.tar.gz
[root@fsi ~]# ls
awstats-6.5 awstats-6.5.tar.gz crontab dokuwiki-2006-03-09 exclude mailman-backup-2006-09-18.tar.bz2 root-snapshots server-backup
[root@fsi ~]# valgrind ls -l
==9991== Memcheck, a memory error detector.
==9991== Copyright (C) 2002-2006, and GNU GPL'd, by Julian Seward et al.
==9991== Using LibVEX rev 1606, a library for dynamic binary translation.
==9991== Copyright (C) 2004-2006, and GNU GPL'd, by OpenWorks LLP.
==9991== Using valgrind-3.2.0, a dynamic binary instrumentation framework.
==9991== Copyright (C) 2000-2006, and GNU GPL'd, by Julian Seward et al.
==9991== For more details, rerun with: -v
==9991==
total 7248
drwx------ 5 sifrblen 513 4096 Dec 24 2005 awstats-6.5
-rw-r----- 1 root root 1051780 Dec 24 2005 awstats-6.5.tar.gz
-rw-r--r-- 1 root root 114 Nov 17 18:26 crontab
drwxr-xr-x 7 sijojord 1000 4096 Mar 9 2006 dokuwiki-2006-03-09
-rw-r----- 1 root root 109 Oct 25 16:56 exclude
-rw-r--r-- 1 root root 6326864 Sep 18 16:42 mailman-backup-2006-09-18.tar.bz2
drwxr-xr-x 2 root root 4096 Nov 25 15:37 root-snapshots
drwxr-xr-x 3 root root 4096 Nov 25 14:45 server-backup
==9991==
==9991== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 19 from 1)
==9991== malloc/free: in use at exit: 12,302 bytes in 21 blocks.
==9991== malloc/free: 106 allocs, 85 frees, 28,751 bytes allocated.
==9991== For counts of detected errors, rerun with: -v
==9991== searching for pointers to 21 not-freed blocks.
==9991== checked 76,476 bytes.
==9991==
==9991== LEAK SUMMARY:
==9991== definitely lost: 0 bytes in 0 blocks.
==9991== possibly lost: 0 bytes in 0 blocks.
==9991== still reachable: 12,302 bytes in 21 blocks.
==9991== suppressed: 0 bytes in 0 blocks.
==9991== Reachable blocks (those to which a pointer was found) are not shown.
==9991== To see them, rerun with: --show-reachable=yes


As you can see, none of the error messages, not just the invalid read at the end, appear with the old versions. Perhaps this helps determine where it went wrong.


By the way, glibc is listed under NoUpgrade. Why did pacman upgrade it?

Closed by  Jan de Groot (JGC)
Monday, 27 November 2006, 22:39 GMT
Reason for closing:  Works for me
Additional comments about closing:  6.6 is in x86_64 also, so this bug is no longer valid.
Comment by Johannes Jordan (FoPref) - Saturday, 25 November 2006, 15:14 GMT
Further investigation shows that the new glibc/binutils packages work, while the new coreutils package leads to the erroneous behaviour.
Comment by Jan de Groot (JGC) - Saturday, 25 November 2006, 16:02 GMT
coreutils 6.5-2 and 6.6 were released last week and no longer have this problem. The crash happens because ls tries to look up a user ID from its cache, which fails when that user ID no longer exists on your system.
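The failure mode described in this comment can be illustrated with a small sketch. It is written in Python and is not the coreutils source; the names owner_name, _name_cache and the lookup parameter are hypothetical and exist only for this example. The idea is that a cached "no such user" result must be turned into the numeric ID before it is used as a string. Using it directly would be consistent with the strlen read at address 0x0 in the valgrind trace above, though that link is an assumption.

```python
import pwd

# Hypothetical cache of uid -> login name. None records "no passwd entry"
# so the failed lookup is not repeated for every file owned by that uid.
_name_cache = {}


def owner_name(uid, lookup=pwd.getpwuid):
    """Return the login name for uid, or the numeric ID as text if none exists."""
    if uid not in _name_cache:
        try:
            _name_cache[uid] = lookup(uid).pw_name
        except KeyError:
            # pwd.getpwuid raises KeyError when the uid has no passwd entry.
            _name_cache[uid] = None
    name = _name_cache[uid]
    # Fall back to the number instead of treating the missing name as a string.
    return name if name is not None else str(uid)
```

For a file on disk, the owner could be resolved with owner_name(os.stat(path).st_uid) after importing os. The lookup parameter is there only so the fallback can be exercised without depending on the local passwd database.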
Comment by Tobias Powalowski (tpowa) - Sunday, 26 November 2006, 09:44 GMT
Jan, can this be closed or not?
Comment by Jan de Groot (JGC) - Sunday, 26 November 2006, 10:37 GMT
For i686 this problem doesn't exist anymore; I don't know the state of x86_64, though.
