FS#11096 - Segmentation fault when using pacman 3.2.0-1 on x64

Attached to Project: Pacman
Opened by Mark (voidzero) - Saturday, 02 August 2008, 21:31 GMT
Last edited by Dan McGee (toofishes) - Saturday, 09 August 2008, 15:09 GMT
Task Type: Bug Report
Category: Backend/Core
Status: Closed
Assigned To: Xavier (shining), Dan McGee (toofishes)
Architecture: x86_64
Severity: Critical
Priority: Normal
Reported Version: 3.2.0
Due in Version: 3.2.1
Due Date: Undecided
Percent Complete: 100%
Votes: 1
Private: No

Details

Description:
With pacman 3.2.0-1 installed on x86_64, I get a segmentation fault at the end of the operation. It seems to occur after the actual package installation, because a downgrade solves the problem for me (i.e. the package gets downgraded and this is reflected in the package database). However, /var/lib/pacman/db.lck is not removed.

I am not using any tweaks, have a normal pacman.conf, and am using the stock Arch Linux kernel.

#################

# pacman -S pacman
(...)
checking package integrity...
(1/1) checking for file conflicts [#####################] 100%
(1/1) upgrading pacman [#####################] 100%
error: segmentation fault
error: Internal pacman error: Segmentation fault.

#################

Debug (unnecessary parts omitted)
debug: adding entry 'pacman' in 'local' cache
debug: executing post_upgrade script...
debug: . /tmp/alpm_6rAuWl/.INSTALL; post_upgrade 3.2.0-1 3.2.0-1
debug: chrooting in /
debug: executing ". /tmp/alpm_6rAuWl/.INSTALL; post_upgrade 3.2.0-1 3.2.0-1"
debug: call to waitpid succeeded
error: segmentation fault
error: Internal pacman error: Segmentation fault.

Closed by Dan McGee (toofishes)
Saturday, 09 August 2008, 15:09 GMT
Reason for closing: Fixed
Additional comments about closing: Fixed in commit d8f8a126658a25fdad
Comment by Mark (voidzero) - Saturday, 02 August 2008, 21:32 GMT
With strace output it shows as follows:

stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2917, ...}) = 0
socket(PF_FILE, SOCK_DGRAM, 0) = 14
fcntl(14, F_SETFD, FD_CLOEXEC) = 0
connect(14, {sa_family=AF_FILE, path="/dev/log"}, 110) = 0
sendto(14, "<12>Aug 2 23:31:53 pacman: upgr"..., 65, MSG_NOSIGNAL, NULL, 0) = 65
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2917, ...}) = 0
--- SIGSEGV (Segmentation fault) @ 0 (0) ---
Comment by Xavier (shining) - Sunday, 03 August 2008, 22:17 GMT
It would be great if you could build pacman with debug symbols, and get a gdb backtrace.
Is this issue reproducible on any x86_64 box? And when installing any packages, or just pacman?
It is really a problem that none of the pacman developers / contributors seem to have an x86_64 box...
Comment by Allan McRae (Allan) - Monday, 04 August 2008, 07:36 GMT
I have tried and can't replicate this on my x86_64 laptop.
Comment by Mark (voidzero) - Monday, 04 August 2008, 14:16 GMT
I can reproduce this on my server as well. So right now two systems seem to have the same problem. I'll try my third x86_64 system today: a laptop with the same configuration.

How do I get debug symbols into pacman? I couldn't find this in the options.

As an aside, and in response to your comment, Xavier: I am open to helping with x86_64 packages and wouldn't mind joining the team. I have 12 years of Linux experience.
Comment by Xavier (shining) - Tuesday, 05 August 2008, 07:05 GMT
Just add the --enable-debug configure flag when running ./configure.
If you build it from a PKGBUILD, you also need to disable stripping with:
options=(!strip)

http://wiki.archlinux.org/index.php/Debug_-_Getting_Traces
Comment by Mark (voidzero) - Tuesday, 05 August 2008, 13:54 GMT
Okay, I built it with !strip and --enable-debug and have noticed something strange. I tested it by installing the package 'unison':

###############
# pacman -S unison
warning: unison-2.27.57-1 is up to date -- reinstalling
resolving dependencies...
looking for inter-conflicts...

Targets (1): unison-2.27.57-1 [1.87 MB]

Total Download Size: 0.00 MB
Total Installed Size: 5.66 MB

Proceed with installation? [Y/n]
checking package integrity...
(1/1) checking for file conflicts [#####] 100%
(1/1) upgrading unison [#####] 100%
NOTE:
For gtk1 frontend please add 'gtk' package.
For gtk2 frontend please add 'gtk2' package.
Default X11 frontend is set to gtk2.

If you want to default to gtk1 unison:
'rm /usr/bin/unison-x11'
'ln -s /usr/bin/unison-gtk /usr/bin/unison-x11'

error: segmentation fault
error: Internal pacman error: Segmentation fault.
Please submit a full bug report with --debug if appropriate.
#############

Now, when run with --debug, it seems to actually work just fine. :|

#############
(...)
NOTE:
For gtk1 frontend please add 'gtk' package.
For gtk2 frontend please add 'gtk2' package.
Default X11 frontend is set to gtk2.

If you want to default to gtk1 unison:
'rm /usr/bin/unison-x11'
'ln -s /usr/bin/unison-gtk /usr/bin/unison-x11'

[15:52:18] debug: call to waitpid succeeded
[15:52:18] debug: running "ldconfig -r /"
[15:52:18] debug: closing database 'local'
[15:52:18] debug: unregistering database 'local'
[15:52:18] debug: freeing package cache for repository 'local'
[15:52:18] debug: closing database 'compiz-fusion'
[15:52:18] debug: unregistering database 'compiz-fusion'
[15:52:18] debug: freeing package cache for repository 'compiz-fusion'
[15:52:18] debug: closing database 'kdemod'
[15:52:18] debug: unregistering database 'kdemod'
[15:52:18] debug: freeing package cache for repository 'kdemod'
[15:52:18] debug: closing database 'archlinuxfr'
[15:52:18] debug: unregistering database 'archlinuxfr'
[15:52:18] debug: freeing package cache for repository 'archlinuxfr'
[15:52:18] debug: closing database 'testing'
[15:52:18] debug: unregistering database 'testing'
[15:52:18] debug: freeing package cache for repository 'testing'
[15:52:18] debug: closing database 'core'
[15:52:18] debug: unregistering database 'core'
[15:52:18] debug: freeing package cache for repository 'core'
[15:52:18] debug: closing database 'extra'
[15:52:18] debug: unregistering database 'extra'
[15:52:18] debug: freeing package cache for repository 'extra'
[15:52:18] debug: closing database 'community'
[15:52:18] debug: unregistering database 'community'
[15:52:18] debug: closing database 'unstable'
[15:52:18] debug: unregistering database 'unstable'
##############
Comment by Mark (voidzero) - Tuesday, 05 August 2008, 14:50 GMT
Okay. I'm using metalog and have the "UseSyslog" option set. It used to work, but now I'm not so sure. When I take it out, the segfaults seem to vanish.

This was never a concern before 3.2.0.
Comment by Xavier (shining) - Tuesday, 05 August 2008, 17:46 GMT
I never tried that option before, but I just did, just in case. It worked fine, with syslog-ng already running.
I also installed metalog and added a pacman section to its config file, it worked great too.
But well, that is all on i686.
Anyway, this syslog stuff has not changed at all. The only thing pacman does is call vsyslog (see man vsyslog).
I don't see how it could have stopped working with pacman 3.2.0.
Comment by Dan McGee (toofishes) - Wednesday, 06 August 2008, 00:52 GMT
I'd be less interested in the --debug output and more interested in a gdb backtrace. Can you still reproduce that with the symbols-compiled version?
Comment by Mark (voidzero) - Thursday, 07 August 2008, 10:45 GMT
Please help me with the gdb backtrace. What commands do I need to use?
Comment by Dan McGee (toofishes) - Thursday, 07 August 2008, 10:55 GMT
gdb --args pacman -S unison

or similar. When it loads up, type the following:
b main (to break at the start of the main routine)
r (to run the program)
c (to continue past the breakpoint we set at main)

Once you get it to segfault, it will drop you into the GDB prompt. Type 'bt' to get a backtrace. 'c' will then continue and 'q' will quit.
Comment by Xavier (shining) - Friday, 08 August 2008, 14:07 GMT
Are you sure you never changed the LogFile option either?
That seems more relevant, judging by the strace output above.

But it would be even better if you could provide us that backtrace.
Comment by Mark (voidzero) - Friday, 08 August 2008, 14:12 GMT
I will do the gdb trace tonight and will post my pacman.conf as well. Sorry for the slight delays.
Comment by Mark (voidzero) - Friday, 08 August 2008, 22:19 GMT
Okay, what I got was this:
(1/1) upgrading unison [#####################] 100%
NOTE:
For gtk1 frontend please add 'gtk' package.
For gtk2 frontend please add 'gtk2' package.
Default X11 frontend is set to gtk2.

If you want to default to gtk1 unison:
'rm /usr/bin/unison-x11'
'ln -s /usr/bin/unison-gtk /usr/bin/unison-x11'


Program received signal SIGSEGV, Segmentation fault.
0x00007f99c6605970 in strlen () from /lib/libc.so.6
(gdb)



Pacman.conf, options section:
[options]
UseSyslog
ShowSize
TotalDownload
UseDelta
LogFile = /var/log/pacman.log
HoldPkg = pacman glibc kernel26

When UseSyslog is commented out, the problem does not occur.
Comment by Mark (voidzero) - Friday, 08 August 2008, 22:23 GMT
Sorry, rather misread the gdb instructions. Below is the backtrace ;)

Program received signal SIGSEGV, Segmentation fault.
0x00007f0458539970 in strlen () from /lib/libc.so.6
(gdb) bt
#0 0x00007f0458539970 in strlen () from /lib/libc.so.6
#1 0x00007f0458506396 in vfprintf () from /lib/libc.so.6
#2 0x00007f045969d582 in _alpm_logaction () from /usr/lib/libalpm.so.3
#3 0x00007f0459693d89 in alpm_logaction () from /usr/lib/libalpm.so.3
#4 0x000000000040abe3 in cb_trans_evt ()
#5 0x00007f0459683c6f in commit_single_pkg () from /usr/lib/libalpm.so.3
#6 0x00007f045968528b in _alpm_add_commit () from /usr/lib/libalpm.so.3
#7 0x00007f045969be5f in _alpm_trans_commit () from /usr/lib/libalpm.so.3
#8 0x00007f0459699a63 in _alpm_sync_commit () from /usr/lib/libalpm.so.3
#9 0x00007f045969be88 in _alpm_trans_commit () from /usr/lib/libalpm.so.3
#10 0x000000000040969d in sync_trans ()
#11 0x0000000000409bd4 in pacman_sync ()
#12 0x0000000000407594 in main ()
Comment by Dan McGee (toofishes) - Saturday, 09 August 2008, 03:20 GMT
  • Field changed: Status (Researching → Requires Testing)
  • Field changed: Category (General → Backend/Core)
  • Field changed: Reported Version (None → 3.2.0)
  • Field changed: Due in Version (Undecided → 3.2.1)
  • Field changed: Architecture (All → x86_64)
Woo, got this one!
http://lists.opensuse.org/opensuse-programming/2008-02/msg00005.html
http://lists.opensuse.org/opensuse-programming/2008-02/msg00008.html

Patch attached. Is there any way you could test this? I wish I had an x86_64 box to test on, but I don't.
Comment by Allan McRae (Allan) - Saturday, 09 August 2008, 12:04 GMT
I can confirm that enabling UseSyslog causes the crash and the above patch fixes it.
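
Editorial note: the patch itself is not reproduced in this thread. The backtrace (strlen called from vfprintf inside _alpm_logaction) and the fact that the crash only appears with UseSyslog are consistent with one va_list being consumed by vsyslog and then reused by vfprintf. Reusing a va_list that way is undefined behavior in C. It tends to go unnoticed on i686, where va_list is usually a plain pointer passed by value, and to crash on x86_64, where va_list is an array type and the callee advances the caller's state. The following is a minimal sketch of the safe pattern, assuming that is the cause here; log_action and log_message are hypothetical names for illustration, not libalpm's actual code.

```c
#define _DEFAULT_SOURCE   /* makes glibc's <syslog.h> declare vsyslog() */
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <syslog.h>

/* Hypothetical stand-in for a routine that writes one message to both
 * syslog and a log file. Returns the number of characters written to
 * logstream (0 if logstream is NULL), or a negative value on error. */
int log_action(FILE *logstream, int use_syslog, const char *fmt, va_list args)
{
    int written = 0;
    assert(fmt != NULL);

    if (use_syslog) {
        /* After vsyslog() returns, the va_list it was given is
         * indeterminate. Give it a copy so `args` stays usable. */
        va_list args_syslog;
        va_copy(args_syslog, args);
        vsyslog(LOG_WARNING, fmt, args_syslog);
        va_end(args_syslog);
    }

    if (logstream != NULL) {
        /* Safe: `args` has not been consumed by anyone yet. */
        written = vfprintf(logstream, fmt, args);
        fflush(logstream);
    }
    return written;
}

/* Variadic front end that callers would use. */
int log_message(FILE *logstream, int use_syslog, const char *fmt, ...)
{
    va_list args;
    va_start(args, fmt);
    int written = log_action(logstream, use_syslog, fmt, args);
    va_end(args);
    return written;
}
```

A call such as log_message(fp, 1, "upgraded %s\n", "unison") then formats the same arguments once for syslog and once for the file. Without the va_copy, the second formatter may read a garbage pointer for %s, which would match a crash inside strlen.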
