FS#66522 - SEGFAULT in libalpm.so.12

Attached to Project: Pacman
Opened by Jonas Große Sundrup (cherti) - Sunday, 03 May 2020, 12:31 GMT
Last edited by Andrew Gregory (andrewgregory) - Tuesday, 19 May 2020, 19:57 GMT
Task Type Bug Report
Category General
Status Closed
Assigned To No-one
Architecture All
Severity Low
Priority Normal
Reported Version 5.2.1
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 0
Private No

Details

# Summary and Info:
pacman-version: 5.2.1-4
pyalpm-version: 0.9.1-2

I regularly, though unfortunately not reproducibly, run into a segfault within libalpm.so.12 when it is used via pyalpm.

I have been running into this issue while writing blinky [0]. At least once it happened when this function [1] was called with 'python-mastodon' as an argument; see the trimmed-down example under Steps to Reproduce below.

[0] https://github.com/cherti/blinky
[1] https://github.com/cherti/blinky/blob/master/blinky/pacman.py#L25


Steps to Reproduce:
This is a little difficult. In *theory*, the following code (blinky's logic trimmed down to where the crash occurred) should trigger the issue, but it doesn't:

#!/usr/bin/env python3

import pyalpm, pycman

handle = pycman.config.init_with_config('/etc/pacman.conf')
ldb = handle.get_localdb()
# sdbs = handle.get_syncdbs()

s = pyalpm.find_satisfier(ldb.pkgcache, 'python-mastodon')

I haven't even been able to reproduce the issue reliably with blinky itself: on repeated runs of the same command, the segfault sometimes triggers and sometimes does not.

Attached is everything I extracted from coredumpctl so far, in the hope that it might be useful.

If there is anything you need, or anything I could try in order to investigate the issue further, feel free to get back to me. Any pointers on how to pin down the issue further are also greatly appreciated!

Closed by  Andrew Gregory (andrewgregory)
Tuesday, 19 May 2020, 19:57 GMT
Reason for closing:  Not a bug
Additional comments about closing:  Reopen this if you can replicate in a single-threaded environment.
Comment by Allan McRae (Allan) - Sunday, 03 May 2020, 12:55 GMT
Any chance you want to compile pacman with debug symbols and get a more complete coredump? I am having a hard time seeing how to get to where the crash occurs from the function above it in the stack trace.
Comment by Jonas Große Sundrup (cherti) - Sunday, 03 May 2020, 17:00 GMT
Tell me what you want changed in the PKGBUILD to get the coredump you need, and I'll run with that pacman until I hit the issue again (which might take a while or might not; it's a bit unpredictable).
Comment by Jonas Große Sundrup (cherti) - Sunday, 03 May 2020, 17:23 GMT
I have added --enable-debug to the configure call in pacman's PKGBUILD and will report back once I get another coredump. If there are more debug flags you might find useful, I'm happy to add them as well!
Comment by Jonas Große Sundrup (cherti) - Monday, 04 May 2020, 18:33 GMT
So, I got a coredump with (hopefully all relevant) debug symbols enabled in the pacman package.
The backtrace is attached again; furthermore, I uploaded the complete coredump here:
https://share.cherti.org/398dfecbc3c180d08f5e/blinky.coredump

As I haven't so far been able to reproduce the issue in slimmed-down code, it unfortunately still has the entirety of blinky around it.
It again crashed on the same line of the blinky code as noted in the report above, but with a different package, whereas the package noted in the report above did not cause any problems.

If there is anything else I can do to help get to the core of this, feel free to reach out!
Comment by Andrew Gregory (andrewgregory) - Monday, 04 May 2020, 19:53 GMT
You appear to be loading the db in multiple threads at the same time. That is not thread-safe.
Comment by Jonas Große Sundrup (cherti) - Tuesday, 05 May 2020, 09:11 GMT
Oh, that is very possible; I wasn't aware it was not thread-safe (and wrongly assumed different inner workings of pyalpm). Then this issue might be entirely my fault. I have wrapped the call to find_satisfier on the local DB in some locking, and so far I have not been able to reproduce the issue.

I haven't observed the issue with the sync DBs so far, only with the local one. Is that pure luck, or is find_satisfier called on the sync DBs in pyalpm (potentially accidentally) thread-safe because the sync DBs are structured differently on disk?

Thanks a lot for taking a look into it!
