FS#48973 - [llvm-libs] Any executable can not use multiple versions of libLLVM simultaneously

Attached to Project: Arch Linux
Opened by Armin Kazmi (apriori) - Saturday, 16 April 2016, 09:09 GMT
Last edited by Evangelos Foutras (foutrelis) - Monday, 19 March 2018, 22:48 GMT
Task Type Bug Report
Category Packages: Extra
Status Closed
Assigned To Evangelos Foutras (foutrelis)
Architecture x86_64
Severity Low
Priority Normal
Reported Version
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 0
Private No

Details

Description:

The simplest possible C++ program:
"int main() { return 0; }"

crashes when linked against two different libLLVM.so versions (3.7 and 3.5) simultaneously.
This crash is specific to the build configuration used in the Arch packages. Completely
separate vanilla builds of both LLVM versions in a local prefix work just fine, so something
is wrong in the PKGBUILDs. So far I have failed to identify the exact issue.

Also failing: Arch llvm35 combined with a manual external 3.7/3.8 build.
Also failing: Arch llvm 3.7 combined with a manual external 3.5/3.8 build.

Contextual information:
While this requirement might sound bogus at first, here is an example:
An application uses an older libLLVM to do some codegen for GPGPU stuff
and then fires up an OpenGL renderer. The OpenGL driver (radeon/mesa) pulls in the
newer libLLVM into the process.
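
To check whether a process has actually ended up with more than one libLLVM mapped, something along these lines could be called from inside the application. This is only a sketch: it assumes glibc's dl_iterate_phdr extension from <link.h>, and the names count_loaded_objects, count_match and MatchState are made up for illustration.

```cpp
#include <link.h>    // dl_iterate_phdr, dl_phdr_info (glibc extension)
#include <cstddef>
#include <cstring>

namespace {

// Hypothetical helper state: what to look for and how often it was seen.
struct MatchState {
    const char* needle;  // substring searched for in each object's path
    int count;           // number of matching objects seen so far
};

// Called once per object loaded into the process; returning 0 keeps iterating.
int count_match(dl_phdr_info* info, std::size_t /*size*/, void* data) {
    auto* state = static_cast<MatchState*>(data);
    if (info->dlpi_name != nullptr &&
        std::strstr(info->dlpi_name, state->needle) != nullptr) {
        ++state->count;
    }
    return 0;
}

}  // namespace

// Counts the loaded objects whose path contains `needle`.
int count_loaded_objects(const char* needle) {
    MatchState state{needle, 0};
    dl_iterate_phdr(count_match, &state);
    return state.count;
}
```

If count_loaded_objects("libLLVM") returns more than 1 after the renderer has been initialized, the process is in the situation described above. Names such as libLLVM-3.5.so and libLLVM.so.3.7 both contain the substring, so both would be counted.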


Additional info:
* package version(s)
extra/llvm35 and extra/llvm
(gcc 5.3.0)


Steps to reproduce:

1. Write the minimal C/C++ program "int main() { return 0; }"
2. Build using g++ and link against llvm35 and llvm37: g++ test.cpp -o test /usr/lib/libLLVM.so.3.7 /usr/lib/libLLVM-3.5.2.so
3. Run the program and observe the segfault

Stack trace of thread 17528:
#0 0x00007f62885a6709 _ZN4llvm2cl6Option11addArgumentEv (libLLVM.so.3.7)
#1 0x00007f628657ce51 n/a (libLLVM-3.5.so)
#2 0x00007f628a7db3ba call_init.part.0 (ld-linux-x86-64.so.2)
#3 0x00007f628a7db4cb _dl_init (ld-linux-x86-64.so.2)
#4 0x00007f628a7ccdca _dl_start_user (ld-linux-x86-64.so.2)

Other variants of this bug:

When mixing Arch's LLVM 3.7 with a manual 3.8 build, the "duplicate options" assertion is triggered when the program runs.

Comparison of CXXFLAGS :

From makepkg.conf (unmodified, assuming also these flags were used for extra/llvm(35)):
CFLAGS="-march=x86-64 -mtune=generic -O2 -pipe -fstack-protector-strong --param=ssp-buffer-size=4"

Manual build:
CFLAGS="-fPIC -m64 -march=native"

Closed by  Evangelos Foutras (foutrelis)
Monday, 19 March 2018, 22:48 GMT
Reason for closing:  Fixed
Additional comments about closing:  LLVM 6 should behave better in this regard. Please file a new bug if you discover any related problems.
Comment by Jan de Groot (JGC) - Monday, 18 April 2016, 10:49 GMT
This is caused by mixing symbols from two different versions of LLVM. From your trace you can see LLVM 3.5 calling into a 3.7 function, which will segfault or, failing that, trigger an assertion.

LLVM added versioned symbols in 3.7, but it appears this only works when building LLVM with autotools. As we build with CMake for various reasons, our builds don't have versioned symbols and you can't mix versions.

Even if we had versioned symbols, I don't think you would be able to mix in libraries from pre-3.7 versions.
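
The effect can be pictured with a toy model. The following is not LLVM code, only a sketch under the assumption that, without versioned symbols, the static initializers of both library copies end up registering their options in one shared, process-wide registry. OptionRegistry and run_static_initializers are hypothetical names; the two option names are the ones that appear in the error output quoted further down in this task.

```cpp
#include <cstddef>
#include <initializer_list>
#include <set>
#include <string>

// Hypothetical stand-in for a process-wide command-line option registry.
class OptionRegistry {
public:
    // Returns false if an option with this name was already registered.
    bool add(const std::string& name) { return names_.insert(name).second; }
    std::size_t size() const { return names_.size(); }

private:
    std::set<std::string> names_;
};

// Simulates the static initializers of one library copy registering its
// options. Returns the number of registrations rejected as duplicates.
int run_static_initializers(OptionRegistry& registry) {
    int duplicates = 0;
    for (const char* name : {"enable-value-profiling", "simplifycfg-sink-common"}) {
        if (!registry.add(name)) {
            ++duplicates;
        }
    }
    return duplicates;
}
```

Running run_static_initializers twice against the same OptionRegistry reports two duplicates on the second pass, while giving each simulated library copy its own registry reports none. The real failure is harsher than this model: in the trace above the 3.7 function operates on an object created by 3.5 code, which is presumably why it segfaults instead of merely reporting a duplicate.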

Comment by Armin Kazmi (apriori) - Sunday, 24 April 2016, 15:42 GMT
Interesting. I'd just like to add that externally there is something happening about this "bug":

CERN noticed it as well: https://sft.its.cern.ch/jira/browse/ROOT-7744

And mesa has a bug report: https://bugs.freedesktop.org/show_bug.cgi?id=93103

Seems like rebuilding the mesa DRI with static llvm might be a viable workaround.
Comment by JP Cimalando (jpcima) - Tuesday, 27 June 2017, 05:09 GMT
I have observed and "fixed" what I believe to be the same bug, but with a different symptom.

When the application and the driver, in my case mesa/radeon, both use libLLVM.so, the program aborts with a message signaling a double initialization of the LLVM library (the emulator rpcs3, which uses LLVM as its recompiler, is in this category).

The same error can also be triggered with OpenCL in several ways. One way is to start a program which uses the OpenCL icd, such as luxmark, when you have both mesa-opencl and pocl installed, two backends linked to the shared LLVM library. The bug can also be reproduced by attempting to load the MesaOpenCL library twice in succession (see test program load-cl.c).

In all cases you will obtain an error like this:

mesa: for the -simplifycfg-sink-common option: may only occur zero or one times!
mesa: CommandLine Error: Option 'enable-value-profiling' registered more than once!
LLVM ERROR: inconsistency in registered CommandLine options
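
For readers without access to the attachment, the "load twice in succession" case might look roughly like the sketch below. It is not the attached load-cl.c, only one plausible reading of the description: open a shared object with dlopen, close it, and open it again. The function name load_twice is made up, and the library path in the usage note is an assumption.

```cpp
#include <dlfcn.h>
#include <cstdio>

// Loads the shared object at `path`, unloads it, then loads it a second time.
// Returns true only if both dlopen calls succeed. If the library aborts the
// process during its own initialization (as in the errors above), this
// function never returns at all.
bool load_twice(const char* path) {
    for (int attempt = 1; attempt <= 2; ++attempt) {
        void* handle = dlopen(path, RTLD_NOW | RTLD_LOCAL);
        if (handle == nullptr) {
            std::fprintf(stderr, "attempt %d failed: %s\n", attempt, dlerror());
            return false;
        }
        dlclose(handle);
    }
    return true;
}
```

A caller would pass the path of the OpenCL backend, for example load_twice("/usr/lib/libMesaOpenCL.so.1") (path assumed, adjust to the local installation). On older glibc versions the program has to be linked with -ldl.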

Now for the fix: as apriori has said, static linking is a possibility, and having recompiled mesa that way, I can confirm that it solved all my problems.
In mesa, this is supposedly done by passing --disable-llvm-shared-libs, but this functionality is broken in the build system (mesa 17.1.3). I have written a tiny patch for mesa to make the static build possible (static-link-2.patch).
Comment by Evangelos Foutras (foutrelis) - Monday, 19 March 2018, 22:41 GMT
@jpcima: Your load-cl.c example will work correctly with LLVM 6 (currently in [testing]) thanks to https://reviews.llvm.org/D40459

As for the original bug, LLVM 6 introduced versioned symbols but I'm not sure if this allows loading two different libLLVM versions.
