FS#58933 - [mesa] Segmentation Fault for glxspheres64

Attached to Project: Arch Linux
Opened by Patrick Young (kmahyyg) - Friday, 08 June 2018, 15:59 GMT
Last edited by Doug Newgard (Scimmia) - Monday, 20 August 2018, 15:09 GMT
Task Type Bug Report
Category Packages: Extra
Status Closed
Assigned To Jan de Groot (JGC)
Andreas Radke (AndyRTR)
Laurent Carlier (lordheavy)
Architecture All
Severity Medium
Priority Normal
Reported Version
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 18
Private No



Cannot run CSGO and glxspheres64.

Also, when I can run CSGO with nvidia-utils-396.24-2 & nvidia-396.24-7, the picture sometimes freezes for about 5 seconds.

Additional info:

virtualgl 2.5.2-3
primus 20151110-7
bumblebee 3.2.1-17

Steps to reproduce:

$ sudo primusrun glxspheres64

Polygons in scene: 62464 (61 spheres * 1024 polys/spheres)
Visual ID of window: 0x145
Context is Direct
OpenGL Renderer: GeForce 940MX/PCIe/SSE2
[1] 3182 segmentation fault sudo primusrun glxspheres64

Closed by  Doug Newgard (Scimmia)
Monday, 20 August 2018, 15:09 GMT
Reason for closing:  Fixed
Additional comments about closing:  mesa 18.1.6-1
Comment by Patrick Young (kmahyyg) - Friday, 08 June 2018, 16:04 GMT
*Update: with nvidia (396.24-7) and nvidia-utils-396.24-2, glxspheres64 still reports a segmentation fault.
Comment by Patrick Young (kmahyyg) - Friday, 08 June 2018, 16:14 GMT
*Update: with both versions of nvidia, neither CSGO nor glxspheres64 can run.
*Update: running ```optirun glxspheres64``` works fine. This issue may be caused by primus.
Comment by Patrick Young (kmahyyg) - Friday, 08 June 2018, 16:18 GMT
---- edit: use pastebin instead of directly pasting here



4 hours ago I could still run CSGO.
But now I cannot run it. Here's my pacman.log.
Comment by Patrick Young (kmahyyg) - Saturday, 09 June 2018, 01:29 GMT
[ 847.682847] [DEBUG] LD_LIBRARY_PATH: /usr/lib/nvidia:/usr/lib32/nvidia:/usr/lib:/usr/lib32
[ 847.682858] [DEBUG] Socket path: /var/run/bumblebee.socket
[ 847.682878] [DEBUG] Accel/display bridge: primus
[ 847.682910] [DEBUG] VGL Compression: proxy
[ 847.682933] [DEBUG] VGLrun extra options:
[ 847.682952] [DEBUG] Primus LD Path: /usr/lib/primus:/usr/lib32/primus
[ 848.006646] [INFO]Response: Yes. X is active.

[ 848.006664] [INFO]Running application using primus.
[ 848.006750] [DEBUG]Process glxspheres64 started, PID 7306.
Polygons in scene: 62464 (61 spheres * 1024 polys/spheres)
Visual ID of window: 0x145
Context is Direct
OpenGL Renderer: GeForce 940MX/PCIe/SSE2
[ 848.485955] [DEBUG]SIGCHILD received, but wait failed with No child processes
[ 848.486084] [DEBUG]Socket closed.
[ 848.486145] [DEBUG]Killing all remaining processes.
Comment by Sven Mauch (SvenMauch) - Saturday, 09 June 2018, 11:10 GMT
Likely related to mesa (18.0.4-1 -> 18.1.1-1). Note that lib32-mesa is still on 18.0.4. Also related: https://bbs.archlinux.org/viewtopic.php?pid=1790732
Comment by Jakub Janek (CrafterSvK) - Sunday, 10 June 2018, 06:39 GMT
Not related to mesa, as I have multilib-testing enabled and have the same version of lib32-mesa and mesa (18.1.1-1).
Comment by Patrick Young (kmahyyg) - Sunday, 10 June 2018, 06:45 GMT
Same. I cannot locate the problem, but it still exists. I tried all the methods in the Arch Wiki; none of them works.
Comment by Sven Mauch (SvenMauch) - Sunday, 10 June 2018, 10:43 GMT
I can confirm this still happens when upgrading lib32-mesa from multilib-testing. However it starts working again when downgrading both to 18.0.4. What's left is the question whether this is an issue with mesa or if primus/bumblebee needs patching.
Comment by Patrick Young (kmahyyg) - Sunday, 10 June 2018, 11:34 GMT
@SvenMauch 18.0.4-0 or 18.0.4-1?
I'll have a try.
Comment by Sven Mauch (SvenMauch) - Sunday, 10 June 2018, 11:36 GMT
@Patrick Young
Doesn't matter as far as I can tell, but I'm currently on mesa 18.0.4-1 and lib32-mesa 18.0.4-2 if you'd like to give it a shot.
Comment by Patrick Young (kmahyyg) - Sunday, 10 June 2018, 14:46 GMT
Confirmed. Caused by mesa: after downgrading mesa from 18.1.1-1 to 18.0.4-1, the problem no longer occurs.
Comment by Adriano Fantini (OdinEidolon) - Tuesday, 12 June 2018, 10:12 GMT
Confirmed: I cannot run primusrun or optirun with the latest mesa 18.1.1-1 and had to revert to 18.0.4-1.
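For readers checking their own system, here is a minimal sketch that tests whether a mesa version string falls between the first version reported as broken in this task (18.1.1-1) and the version named in the closing comment as fixed (18.1.6-1). The function names are hypothetical, the range is only what this task reports, and the comparison relies on GNU `sort -V` rather than pacman's own version logic.

```shell
#!/bin/sh
# version_lt A B: succeeds when A sorts strictly before B (GNU sort -V).
version_lt() {
    [ "$1" != "$2" ] &&
        [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# mesa_in_reported_range VER: succeeds for 18.1.1-1 <= VER < 18.1.6-1.
# Assumption: these bounds come only from the reports in this task.
mesa_in_reported_range() {
    ! version_lt "$1" "18.1.1-1" && version_lt "$1" "18.1.6-1"
}

# Example with a version mentioned later in this task.
if mesa_in_reported_range "18.1.4-1"; then
    echo "18.1.4-1 is inside the reported range"
fi
```

On an Arch system the installed version could be fed in with something like `mesa_in_reported_range "$(pacman -Q mesa | awk '{print $2}')"`.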
Comment by loqs (loqs) - Thursday, 14 June 2018, 15:12 GMT
https://bugs.archlinux.org/task/58933#comment170241 mentioned https://bbs.archlinux.org/viewtopic.php?pid=1790732 which was bisected to 8d0d89715984e321315631dd6667e05813d26e03 in xserver
Please try building xorg-xserver with the above patch which is a revert of the commit adjusted to apply cleanly.
Comment by Gavin Troy (wofall) - Tuesday, 26 June 2018, 23:21 GMT
Comment by Jeb Rosen (jebrosen) - Wednesday, 04 July 2018, 01:41 GMT
Building without HAVE_DRI3_MODIFIERS, as mentioned in the Debian bug report -- by running in prepare() `sed -i "/pre_args += '-DHAVE_DRI3_MODIFIERS'/d" meson.build` -- does make primusrun work again for me. I have no idea what functionality/optimization might be lost, but this might serve as a suitable workaround for some use cases.
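To illustrate what that sed expression does, here is a small sketch that applies it to a throwaway file. The sample file content is made up for illustration and is not the real meson.build; only the sed expression is taken from the comment above. It assumes GNU sed (for `-i` without a suffix argument), and `strip_dri3_modifiers` is a hypothetical helper name.

```shell
#!/bin/sh
# Delete the line that adds -DHAVE_DRI3_MODIFIERS, editing the file in place.
strip_dri3_modifiers() {
    # $1: path to a meson.build-style file
    sed -i "/pre_args += '-DHAVE_DRI3_MODIFIERS'/d" "$1"
}

# Made-up sample content, only to show which line the pattern removes.
sample=$(mktemp)
cat > "$sample" <<'EOF'
if with_dri3
  pre_args += '-DHAVE_DRI3'
  pre_args += '-DHAVE_DRI3_MODIFIERS'
endif
EOF

strip_dri3_modifiers "$sample"

# Count the lines that still mention the define.
remaining=$(grep -c 'HAVE_DRI3_MODIFIERS' "$sample")
echo "lines still mentioning HAVE_DRI3_MODIFIERS: $remaining"
```

Only the exact `pre_args += '-DHAVE_DRI3_MODIFIERS'` line is dropped; the plain `-DHAVE_DRI3` line is left alone because the pattern includes the closing quote after `_MODIFIERS`.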
Comment by Giancarlo Razzolini (grazzolini) - Wednesday, 04 July 2018, 02:58 GMT
Even though the patch applies cleanly, the build process fails. Downgrading both mesa and lib32-mesa to 18.0.4 does the trick.
Comment by Carlos (cyberconan) - Wednesday, 04 July 2018, 15:51 GMT
Another option is to downgrade xorg-server from 1.20 to 1.19 and keep mesa at the latest version.
Comment by loqs (loqs) - Wednesday, 04 July 2018, 17:33 GMT
@grazzolini apologies for the broken patch. This one at least builds, but I lack the hardware to test it.
Comment by SilverMight (SilverMight) - Sunday, 08 July 2018, 01:55 GMT
@loqs didn't seem to work for me, still got a segfault
Comment by chriscjsus (chriscjsus) - Tuesday, 10 July 2018, 17:26 GMT
PRIMUS_UPLOAD=1 primusrun glxspheres64

No segmentation fault after setting environment variable PRIMUS_UPLOAD=1

No need to downgrade or patch xorg/mesa.
Comment by Giancarlo Razzolini (grazzolini) - Tuesday, 10 July 2018, 17:41 GMT

Setting PRIMUS_UPLOAD=1 impacts the performance. I got worse results with it than with optirun. Right now, downgrading mesa is the way to go. Also, this is the upstream primus bug: https://github.com/amonakov/primus/issues/201
Comment by chriscjsus (chriscjsus) - Tuesday, 10 July 2018, 18:18 GMT
On my system, skylake/nvidia 970m, primusrun is still better than optirun. I do not have 32-bit mesa installed so not sure about 32-bit apps. I tested with glxspheres64 and Unigine Valley.
Comment by Jeb Rosen (jebrosen) - Wednesday, 11 July 2018, 03:35 GMT
PRIMUS_UPLOAD=0, the default, corresponds to autodetecting the faster method between PRIMUS_UPLOAD=1 and PRIMUS_UPLOAD=2. It's actually the autodetection itself that is causing the segfault on my machine; specifying either =1 or =2 works fine, and both are faster than optirun for me.

(Additional speculation at https://github.com/amonakov/primus/issues/201#issuecomment-404027454)
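As a sketch of how the workaround could be wrapped, the hypothetical helper below picks an explicit PRIMUS_UPLOAD value so that the autodetecting default (0) is never used. The helper name and the choice of 1 as the fallback are assumptions; which of 1 or 2 is faster may differ per machine.

```shell
#!/bin/sh
# primus_upload_value: print the PRIMUS_UPLOAD value to use.
# Keeps an explicit 1 or 2 from the environment; anything else
# (unset, empty, or the autodetecting 0) falls back to 1.
primus_upload_value() {
    case "${PRIMUS_UPLOAD:-0}" in
        1|2) echo "$PRIMUS_UPLOAD" ;;
        *)   echo 1 ;;
    esac
}

# Intended usage (not executed here):
#   PRIMUS_UPLOAD=$(primus_upload_value) primusrun glxspheres64
echo "would run with PRIMUS_UPLOAD=$(primus_upload_value)"
```

Defining this in a shell profile would let an existing `PRIMUS_UPLOAD=2` setting pass through while avoiding the autodetection path described above.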
Comment by McFyrn (McFyrn) - Tuesday, 17 July 2018, 06:53 GMT
I found an interesting/weird workaround for mesa 18.1.4: if I enable the TearFree option on my *intel* iGPU, then primusrun works like before without the PRIMUS_UPLOAD option:

$ cat /etc/X11/xorg.conf.d/20-intel-tearfree.conf
Section "Device"
    Identifier "Intel Graphics"
    Driver "intel"
    Option "TearFree" "true"
EndSection

Also, I ran some benchmarks comparing optirun, primusrun and nvidia-xrun; glxgears and glxspheres64 can't be used for benchmarking:
- glxgears: optirun ~160 FPS / primusrun ~160 FPS / nvidia-xrun ~13500 FPS (!)
- glxspheres64: optirun ~267 Mpix/s / primusrun ~250 Mpix/s / nvidia-xrun ~4550 Mpix/s (!!)

With unigine-heaven it gives more coherent scores:
- optirun 643 / primusrun 947 / nvidia-xrun 897
The scores don't change much between mesa 18.0.4 + primusrun (with or without vblank_mode, with or without the intel tearfree work-around), and mesa 18.1.4 + primusrun + vblank_mode + intel work-around.
Comment by Carlos (cyberconan) - Wednesday, 18 July 2018, 17:07 GMT
Thanks McFyrn! I'm going to test your solution.

Very important! I never had xf86-video-intel installed because it is not required and not recommended (https://wiki.archlinux.org/index.php/intel_graphics#Installation), but if you set the intel driver in Xorg.conf you need to install xf86-video-intel or X11 won't start.
Comment by McFyrn (McFyrn) - Tuesday, 24 July 2018, 21:09 GMT
Carlos: thanks for the reminder. I tried to remove it, but it seems impossible to have a tear-free desktop using Awesome WM and Compton, so I had to reinstall xf86-video-intel.