FS#9024 - pacman segfault during xorg + nvidia installation
Attached to Project:
Pacman
Opened by André Prata (nDray) - Thursday, 27 December 2007, 19:12 GMT
Last edited by Dan McGee (toofishes) - Saturday, 05 January 2008, 23:46 GMT
Details
I was reinstalling Arch on my laptop and trying some new things: testing stuff, messing with pacman's cache and such. I came to the point where I install X, which I usually do with

pacman -Sy xorg-{server,xinit} xf86-input-{mouse,keyboard} xorg-fonts-{100,75}dpi nvidia synaptics

Pacman just segfaulted. I thought I had done something wrong earlier, screwing up pacman's cache or so, so I just formatted again, since I had only just begun. To my surprise, doing no "new things" produced the same error. I began installing fewer packages at a time and realised that pacman -S nvidia by itself, without X installed, made pacman segfault. I ran it with --debug, and pacman gracefully tells me that "nvidia-utils provides its own conflict".

I then installed all of the above X packages and then, separately, nvidia. It tells me that there's a conflict with libgl. This shouldn't happen at all, it's just stupid: I need to have both xorg-server and nvidia. What I can't understand is why yesterday I had these same packages living together on my system.

I provide the --debug output here. Hope you can do something about it. I'm available for testing; I'm not installing the rest of the system, as that's only a spare laptop.
Closed by Dan McGee (toofishes)
Saturday, 05 January 2008, 23:46 GMT
Reason for closing: Fixed
Additional comments about closing: fixed in GIT
After some talk on IRC, MrElending led me to the conclusion that pacman recursively grabs nvidia, then xorg-server, then libgl, but doesn't acknowledge that nvidia already provides libgl.
I guess this should be improved in pacman, for example by looking at provides first and dependencies second, but I'll wait to hear from the devs.
I have a few problems:
1) The debug log isn't complete, is it? Where is the segfault? It might also help to have the non-debug output of pacman, to see more easily what happens.
2) I can't reproduce it in any situation: with pacman 3.0 or 3.1, with or without the libgl package installed.
This has never happened before, maybe because earlier pacman fetched nvidia-utils first, I don't know. This was the first time, and adding nvidia-utils to the command line solves the problem, so it's good enough for me anyway.
About the debug output: I ran "pacman --debug -S nvidia > debuginfo", which only captures stdout. stderr printed something like "pacman internal error: segfaulted..."; you must know which message I'm talking about.
If you need it that much, I think I could repeat the procedure and store both stdout and stderr...
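A minimal sketch of capturing both streams in one file, in case it helps. The `noisy` function is a hypothetical stand-in for the pacman call (it only writes one line to each stream), and the temporary file stands in for `debuginfo`; the point is the `2>&1` after the file redirection.

```shell
# Hypothetical stand-in for "pacman --debug -S nvidia":
# one line on stdout, one line on stderr.
noisy() {
    echo "debug: written to stdout"
    echo "error: written to stderr" >&2
}

log=$(mktemp)

# With only "> file", the stderr line does not reach the file.
# (It is discarded here to keep the demo quiet; normally it lands on the terminal.)
noisy > "$log" 2>/dev/null
stdout_only=$(grep -c '' "$log")

# With "> file 2>&1", stderr follows stdout into the same file.
noisy > "$log" 2>&1
both=$(grep -c '' "$log")

echo "stdout only: $stdout_only line(s); both streams: $both line(s)"
rm -f "$log"
```

The order matters: `2>&1` has to come after `> file`, otherwise stderr is duplicated onto the old stdout (the terminal) before stdout is redirected.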
The main problem is that I'm unable to reproduce this bug after several tries. But maybe someone else can.
I may then provide the pacman log... even syslog-ng... I don't usually activate the daemon, but I could do it.... If you need such, I may try to help...
pkgx.log (66.4 KiB)
xorg-server xorg-xinit xf86-input-mouse xf86-input-keyboard xorg-fonts-100dpi xorg-fonts-75dpi ttf-dejavu nvidia synaptics
But still, it bothers me that I'm unable to reproduce this with 3.0.
nDray, you are using 3.0.6, right?
What I don't get is the following lines in the log:
debug: CONFLICTS:: nvidia-utils conflicts with libgl
debug: CONFLICTS:: nvidia-utils conflicts with libgl
Looking at the code, it shouldn't be possible to get duplicates in the list; there is a check to prevent them.
See the 3.0.6 code, libalpm/conflict.c (lines 162-163):
if(miss && !_alpm_depmiss_isin(miss, baddeps)) {
    baddeps = alpm_list_add(baddeps, miss);
And when I run pacman 3.0.6 on my system, this check seems to work fine, because when I look at my debug log, I get only one line:
debug: CONFLICTS:: nvidia-utils conflicts with libgl
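A quick way to check a debug log for repeated conflict lines, sketched with standard tools. The two sample lines are the ones quoted from the log above; writing them to a temporary file is only for illustration, and a real check would read the attached pkgx.log instead.

```shell
# Sample data: the duplicated lines quoted from the report's debug log.
log=$(mktemp)
cat > "$log" <<'EOF'
debug: CONFLICTS:: nvidia-utils conflicts with libgl
debug: CONFLICTS:: nvidia-utils conflicts with libgl
EOF

# Keep only the conflict lines, then print each line that occurs more than once.
# uniq -d only compares adjacent lines, hence the sort first.
dupes=$(grep 'CONFLICTS::' "$log" | sort | uniq -d)

echo "$dupes"
rm -f "$log"
```

An empty result would mean every conflict was reported only once, which is what the check in conflict.c is supposed to guarantee.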
I can't help you at all with that duplicate check.
The command was exactly:
# PKGX="xorg-server xorg-xinit xf86-input-mouse xf86-input-keyboard xorg-fonts-100dpi xorg-fonts-75dpi ttf-dejavu nvidia synaptics"
# pacman -Sy $PKGX > pkgx.log 2>&1
I could repeat the error over and over....
I did a base install, but not all packages were actually installed.
I omitted licenses, lilo, nano, syslog-ng, mailx, logrotate, reiserfsprogs, xfsprogs, jfsutils, pcmciautils, gettext and lvm2, and I believe that's it. After that, it's all in the pacman.log.
I had 3.1 pacman package installed there, so I was using a locally built 3.0.
I tried on another box where I had the official 3.0 package, and I could reproduce the bug.
I couldn't reproduce the bug here because I made a debug build with: ./configure --enable-debug && make
I rebuilt it with: ./configure && make, and now I can reproduce the bug. Thanks.
http://www.archlinux.org/pipermail/pacman-dev/2007-October/009687.html
That does prevent duplicate conflicts in the baddeps list in conflict.c (even with a non-debug pacman build), and so would prevent the segfault in sync.c .
But the code in sync.c is not totally safe, because it implicitly assumes that the conflict list contains no duplicates.
In 3.1, that depmiss_isin doesn't exist anymore, but duplicate conflicts are now avoided with the _alpm_conflict_isin function, which works correctly.
And the code in sync.c (around line 600) still segfaults in case of duplicate conflicts. That shouldn't happen though, so it's probably not a big problem.
And anyway, it's quite hard to figure out what that code does, which makes it hard to make it more bulletproof.
Well, lines 556-570 are _really_ odd.
The good old "choose from providers" question.
1. The funny thing is that "pacman -S nvidia xorg-server" leads to a completely different result.
2. This example shows that removing packages from the target list may be needed (I refer to your sync branch, Xavier).
And which pacman version did you try exactly? git?
2. Yes, that's what I figured, and that's why I made a change in my sync branch, as I said in FS#8899.