FS#53284 - mesa-17.0.1-2 introduces extreme unresponsiveness on nvidia system with wayland
Attached to Project:
Arch Linux
Opened by Matt Sturgeon (mattsturgeon) - Monday, 13 March 2017, 01:13 GMT
Last edited by Laurent Carlier (lordheavy) - Friday, 13 September 2019, 15:06 GMT
Details
Description:
I have just tried updating mesa (and lib32-mesa) from 17.0.1-1 to -2 and experienced some strange behaviour, including extremely unresponsive mouse and keyboard input, high CPU usage, multiple monitor layout changes, and conky not rendering. This happens only while using gdm or gnome wayland (gnome xorg was unaffected) with the latest mesa version. Downgrading to mesa 17.0.1-1 fixed all issues.
At first I thought this was related to nvidia-libgl being merged into nvidia-utils, but that is not the case as I can reproduce this with both versions of nvidia. I have not tested if nouveau is affected.
Hardware:
* Chipset z170
* CPU i5-6600k 4.5GHz
* RAM 16GB DDR4 2400MHz
* GPU MSI GTX 970
Software:
* mesa 17.0.1-2
* gdm 3.22.3-1
* gnome-shell 3.22.3-1
* nvidia-dkms 378.13-3
* nvidia-utils 378.13-5
* libglvnd 0.2.999+g4ba53457-1
Steps to reproduce:
* Install gdm and mesa 17.0.1-2 on a nvidia machine with the proprietary drivers; observe poor performance and other strange behaviour.
* Downgrade to mesa 17.0.1-1; observe normal performance and behaviour.
* xorg and tty sessions are not affected. AFAICT only wayland sessions (including gdm) are affected.
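For anyone who wants to repeat the downgrade step, here is a rough sketch that only prints the pacman commands rather than running them. It assumes the old packages are still in the default pacman cache directory and use the .pkg.tar.xz suffix; the helper name downgrade_cmd is made up for illustration.

```shell
# Hypothetical helper: print (do not run) the command that would reinstall a
# cached package version. Cache directory and file suffix are assumptions.
downgrade_cmd() {
    pkg=$1
    ver=$2
    arch=${3:-x86_64}
    cache=${4:-/var/cache/pacman/pkg}
    printf 'pacman -U %s/%s-%s-%s.pkg.tar.xz\n' "$cache" "$pkg" "$ver" "$arch"
}

# The two packages named in the report.
downgrade_cmd mesa 17.0.1-1
downgrade_cmd lib32-mesa 17.0.1-1
```

Review the printed lines and run them as root if the files exist in the cache.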
Closed by Laurent Carlier (lordheavy)
Friday, 13 September 2019, 15:06 GMT
Reason for closing: Upstream
Additional comments about closing: GNOME on Wayland works relatively well now with the NVIDIA driver, with the exception of XWayland apps. This is unrelated to the original bug report and has been documented in the wiki.
In addition gdm sometimes starts to act normally after a couple of seconds.
I am using:
mesa 17.0.1-2
lib32-mesa 17.0.1-2
nvidia 378.13-3
nvidia-utils 378.13-5
libglvnd 0.2.999
lib32-libglvnd 0.2.999
sddm 0.14.0-2
sddm-kcm 5.9.3-1
I attach my journalctl and Xorg.0.log
journalctl (237.4 KiB)
I have experienced blank display managers and/or display managers not starting before, but those issues were usually caused by updates to the nvidia package.
slashME: I've not actually checked if Wayland is being used or not, but I only experienced the bug in gdm and the normal gnome session (the "GNOME on Xorg" session wasn't affected). I probably should have checked if forcing gdm not to use Wayland in gdm.conf has any impact.
It looks like it doesn't detect the NVIDIA driver at all. When I run glxinfo (from the Wayland session), there is only information about the mesa driver, not NVIDIA.
The following is shown in the journal when GNOME Shell is starting:
kernel: [drm:nvidia_drm_gem_import_nvkms_memory [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000100] Failed to import NVKMS memory to GEM object
kernel: [drm:nvidia_drm_gem_import_nvkms_memory [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000100] Failed to import NVKMS memory to GEM object
kernel: [drm:nvidia_drm_gem_import_nvkms_memory [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000100] Failed to import NVKMS memory to GEM object
gnome-shell[770]: Failed to apply DRM plane transform 0: Invalid argument
gnome-shell[770]: Failed to apply DRM plane transform 0: Invalid argument
org.gnome.Shell.desktop[770]: Disabling glamor and dri3, EGL setup failed
org.gnome.Shell.desktop[770]: Failed to initialize glamor, falling back to sw
Edit: to clarify: I'm using the proprietary driver with nvidia-drm.modeset=1
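A quick way to confirm that the modeset setting mentioned above actually took effect is to read the module parameter back from sysfs. This is only a sketch: the function name and the optional path argument are made up, and it assumes the parameter file reports Y or N as kernel boolean parameters normally do.

```shell
# Report whether nvidia-drm kernel mode setting is active.
# nvidia_modeset_state is a hypothetical helper; the optional argument exists
# only so the check can be pointed at another file.
nvidia_modeset_state() {
    param_file=${1:-/sys/module/nvidia_drm/parameters/modeset}
    if [ ! -r "$param_file" ]; then
        # Module not loaded, or the parameter file is missing.
        echo "unknown"
        return 0
    fi
    case "$(cat "$param_file")" in
        Y|1) echo "enabled" ;;
        *)   echo "disabled" ;;
    esac
}

nvidia_modeset_state
```

On the kernel command line the setting is spelled nvidia-drm.modeset=1; in a modprobe.d file the equivalent line is "options nvidia-drm modeset=1".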
You can verify that it was *not* fixed in 17.1: if you grab the mesa PKGBUILD from ABS and bump the version to 17.1.0-rc1, the issue is still there.
You can make these changes to the PKGBUILD: https://gist.github.com/slokhorst/5f2f08cceb4924469684535474ec9229 . The two EGL patches can be skipped as they landed upstream (just before the 17.1 branch point https://cgit.freedesktop.org/mesa/mesa/commit/?id=ce562f9e3fab769d64b0e5453ec2b4f8710a31ce )
Edit: I'm stupid, it was the same commit as on the 17.1 branch.
Apr 20 17:35:58 vashnix gnome-shell[627]: Failed to apply DRM plane transform 0: Invalid argument
Apr 20 17:35:58 vashnix org.gnome.Shell.desktop[627]: Disabling glamor and dri3, EGL setup failed
Apr 20 17:35:58 vashnix org.gnome.Shell.desktop[627]: Failed to initialize glamor, falling back to sw
I was unable to run 'glxinfo' at all and the system painting was noticeably slower than normal (dragging windows around), and apps that depended on GLX were failing to open. Has someone found any other workarounds to ensure Nvidia's acceleration is working?
Set WaylandEnable=false in /etc/gdm/custom.conf and login to the "GNOME on Xorg" session instead of the "GNOME" session.
The NVIDIA driver doesn't support Xwayland yet, so GLX applications such as glxgears will run on llvmpipe.
Not using Wayland also works around the issue (WaylandEnable=false in /etc/gdm/custom.conf and login to GNOME Xorg session).
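As a sketch of the workaround, the fragment in the comments below is what the GDM config would contain, and the small function checks whether it is already set. The function name is made up, and the check is deliberately simple: it looks for an uncommented WaylandEnable=false line anywhere in the file rather than parsing the [daemon] section.

```shell
# Fragment expected in /etc/gdm/custom.conf:
#   [daemon]
#   WaylandEnable=false

# Hypothetical helper: succeeds if an uncommented WaylandEnable=false line
# is present in the given file (default: /etc/gdm/custom.conf).
gdm_wayland_disabled() {
    conf=${1:-/etc/gdm/custom.conf}
    grep -Eq '^[[:space:]]*WaylandEnable[[:space:]]*=[[:space:]]*false[[:space:]]*$' "$conf" 2>/dev/null
}

if gdm_wayland_disabled; then
    echo "GDM Wayland disabled"
else
    echo "GDM Wayland not disabled (or config not readable)"
fi
```

GDM has to be restarted for the change to apply, after which only the Xorg session is offered.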
If this is possible to fix with the current Nvidia drivers, I believe the fix would have to be done with changes to the Nvidia or glvnd package, not the Mesa one. I'd be curious to test the same Nvidia driver set on Fedora to see if Wayland is working properly with their configuration.
I'm curious if this is related to the 'Failed to import NVKMS memory into GEM object' error, which Google leads me to believe is related to CONFIG_HARDENED_USERCOPY, however this doesn't happen with weston-eglstream so I'm really confused as to where the issue is.
Maybe for the scope of this bug, until XWayland on Nvidia has Nvidia glx support there should be a file included with that package that defaults GDM to use X11?
FS#54099 - though I'm probably misusing the issue tracker by doing so, so feel free to merge these two issues as required.
* afaik mesa 17.0.1-2 was about mesa making use of GLVND, which would explain why that version bump revealed GLVND issues.
Is there any update on the GLVND issue? Has this been reported upstream somewhere?
[1] https://bugzilla.gnome.org/show_bug.cgi?id=773629#c70
If I add “nvidia-drm.modeset=1” to /usr/lib/modprobe.d/nvidia.conf, I get "Disabling glamor and dri3, EGL setup failed, Failed to initialize glamor, falling back to sw". Wayland will start, but in software rendering mode (I am guessing).
Edit:
I got my answer for fedora.
Fedora hasn't enabled --enable-egl-device in mutter including rawhide and f26
http://pkgs.fedoraproject.org/cgit/rpms/mutter.git/tree/mutter.spec#n117
Hope this gets fixed soon...
[drm:nvidia_drm_gem_import_nvkms_memory [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000100] Failed to import NVKMS memory to GEM object
https://github.com/negativo17/nvidia-driver/issues/27
Devices are created by the udev rules:
0 crw-rw-rw- 1 root root 195, 0 16 aug 12.35 /dev/nvidia0
0 crw-rw-rw- 1 root root 195, 255 16 aug 12.35 /dev/nvidiactl
0 crw-rw-rw- 1 root root 195, 254 16 aug 12.35 /dev/nvidia-modeset
But gnome shell fails to launch, see log:
https://hastebin.com/vecerojude.pl
FS#54980 http://copr-dist-git.fedorainfracloud.org/cgit/mvicomoya/mutter-eglstream/mutter.git/tree/?id=cc8df0d436c549f2faf00a6ca50a8760b5d70e6e
https://copr.fedorainfracloud.org/coprs/mvicomoya/mutter-eglstream/
@leigh123linux, do you mean that the patches should solve the sluggishness? From what I understand the bug report is about window resizing?
After my last comment, my system was unable to start using wayland for some time (as Martin Wallin said), but today it was able to start again, with the same sluggishness as before. So this issue is still there.
This info might help find the problem.
okt 11 00:23:19 - gnome-shell[5393]: Failed to apply DRM plane transform 0: Invalid argument
okt 11 00:23:19 - org.gnome.Shell.desktop[5393]: Disabling glamor and dri3, EGL setup failed
okt 11 00:23:19 - org.gnome.Shell.desktop[5393]: Failed to initialize glamor, falling back to sw
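Since the same handful of messages keeps showing up in everyone's journal, a small filter makes it easier to compare boots. This sketch counts lines matching the signatures quoted in this report; the function name is made up and the pattern list is just what has been posted here, not an exhaustive set.

```shell
# Hypothetical helper: read journal text on stdin and print how many lines
# match the failure messages quoted in this report.
count_gl_fallback_lines() {
    grep -Ec 'Failed to import NVKMS memory to GEM object|Failed to apply DRM plane transform|EGL setup failed|Failed to initialize glamor'
}

# Only query the journal when journalctl is available; grep exits non-zero
# on a count of 0, which is not an error here.
if command -v journalctl >/dev/null 2>&1; then
    journalctl -b -o cat 2>/dev/null | count_gl_fallback_lines || true
fi
```

A count of 0 after logging into the Wayland session would suggest the fallback to software rendering did not happen on that boot.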
I don't know if any of this helps, but here are my experiences with the Wayland session finally working with NVIDIA DRM.
I have observed that even though all nvidia related modules are loaded, the nvidia driver doesn't think there is a display and glxinfo seems to indicate that Mesa is running the show.
[ ~]$ nvidia-settings
ERROR: Unable to find display on any available system
[ ~]$ glxinfo | grep glx
server glx vendor string: SGI
server glx version string: 1.4
server glx extensions:
client glx vendor string: Mesa Project and SGI
client glx version string: 1.4
client glx extensions:
If I log out of a Wayland session and try to log back in to a Wayland session, gnome crashes back to the GDM login screen, however I can still log into a Gnome Xorg session.
Even if I log into a Gnome Xorg session first, log out and log into a Wayland session, the Wayland session loads fine (with the exception of the above), but any attempt to log out of a Wayland session and log back into a Wayland session a second or subsequent time causes gnome to crash back to the GDM login screen. Rebooting again allows an initial login to a Wayland session.
I just added `export EGL_PLATFORM=wayland` and performance is still bad.
~ glxinfo | egrep "(glx|OpenGL)"
server glx vendor string: SGI
server glx version string: 1.4
server glx extensions:
client glx vendor string: Mesa Project and SGI
client glx version string: 1.4
client glx extensions:
OpenGL vendor string: VMware, Inc.
OpenGL renderer string: llvmpipe (LLVM 5.0, 256 bits)
OpenGL core profile version string: 3.3 (Core Profile) Mesa 17.2.2
OpenGL core profile shading language version string: 3.30
OpenGL core profile context flags: (none)
OpenGL core profile profile mask: core profile
OpenGL core profile extensions:
OpenGL version string: 3.0 Mesa 17.2.2
OpenGL shading language version string: 1.30
OpenGL context flags: (none)
OpenGL extensions:
OpenGL ES profile version string: OpenGL ES 3.0 Mesa 17.2.2
OpenGL ES profile shading language version string: OpenGL ES GLSL ES 3.00
OpenGL ES profile extensions:
~ echo $EGL_PLATFORM
wayland
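To make the glxinfo check above less of an eyeball exercise, something like the following classifies the renderer line. It is a sketch: the function name is made up, and it only treats llvmpipe and softpipe as software renderers, which is an assumption about the renderer strings Mesa reports.

```shell
# Hypothetical helper: read glxinfo output on stdin and classify the first
# "OpenGL renderer string" line as software, hardware or unknown.
renderer_kind() {
    line=$(grep 'OpenGL renderer string:' | head -n 1)
    case "$line" in
        *llvmpipe*|*softpipe*) echo "software" ;;
        "")                    echo "unknown" ;;
        *)                     echo "hardware" ;;
    esac
}

# Only call glxinfo when it is installed.
if command -v glxinfo >/dev/null 2>&1; then
    glxinfo 2>/dev/null | renderer_kind
fi
```

Note that glxinfo goes through Xwayland in a Wayland session, so "software" here says nothing about whether native Wayland clients are accelerated.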
I have the same results as @partizan, still falling back to llvmpipe.
$ glxinfo | egrep "(glx|OpenGL)"
server glx vendor string: SGI
server glx version string: 1.4
server glx extensions:
client glx vendor string: Mesa Project and SGI
client glx version string: 1.4
client glx extensions:
OpenGL vendor string: VMware, Inc.
OpenGL renderer string: llvmpipe (LLVM 5.0, 256 bits)
OpenGL core profile version string: 3.3 (Core Profile) Mesa 17.2.2
OpenGL core profile shading language version string: 3.30
OpenGL core profile context flags: (none)
OpenGL core profile profile mask: core profile
OpenGL core profile extensions:
OpenGL version string: 3.0 Mesa 17.2.2
OpenGL shading language version string: 1.30
OpenGL context flags: (none)
OpenGL extensions:
OpenGL ES profile version string: OpenGL ES 3.0 Mesa 17.2.2
OpenGL ES profile shading language version string: OpenGL ES GLSL ES 3.00
OpenGL ES profile extensions:
$ eglinfo
EGL API version: 1.4
EGL vendor string: NVIDIA
$ env | grep wayland
WAYLAND_DISPLAY=wayland-0
EGL_PLATFORM=wayland
XDG_SESSION_TYPE=wayland
from journal:
okt 19 10:06:45 gnome-shell[5405]: Failed to apply DRM plane transform 0: Invalid argument
okt 19 10:06:45 org.gnome.Shell.desktop[5405]: Disabling glamor and dri3, EGL setup failed
okt 19 10:06:45 org.gnome.Shell.desktop[5405]: Failed to initialize glamor, falling back to sw
https://devtalk.nvidia.com/default/topic/925605/linux/nvidia-364-12-release-vulkan-glvnd-drm-kms-and-eglstreams/post/5188874/#5188874
Which means that anything that uses GLX will be in a funky state, _apart_ from the need of custom EGL_stream codepath to have Wayland working with the blob.
AFAICT gnome-shell has the latter, while others (kwin, xfce?...) do not.
Although from a quick skim this report is going through multiple unrelated issues.
So I decided to check the nvidia drivers again. It is usable, does not eat CPU and feels smooth enough.
> glxinfo | egrep "(glx|OpenGL)"
still shows "OpenGL renderer string: llvmpipe (LLVM 5.0, 256 bits)"
But I don't care as long as it works well.
Closing application results in complete wayland freeze.
"
* Added new application profile settings, "EGLVisibleDGPUDevices" and "EGLVisibleTegraDevices", to control which discrete and Tegra GPU devices, respectively, may be enumerated by EGL. See the "Application Profiles" appendix of the driver README for more details.
* Corrected the SONAME of the copy of the libnvidia-egl-wayland library included in the .run installer package to libnvidia-egl-wayland.so.1. The SONAME had previously been versioned incorrectly with the full version number of the library.
"
Rough ideas:
- strace eglinfo or LD_DEBUG=libs eglinfo - see if libEGL_nvidia.so (or similar not 100% on the name) is attempted to be opened
- no: track what's happening in GLVND (aka libEGL.so) - might have to rebuild the package with debug symbols
- yes: forward the information to Nvidia - only they have the code for the driver :-\
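The first of the ideas above could be sketched like this. It assumes the NVIDIA vendor library name contains "libEGL_nvidia.so" (the comment above is itself not 100% sure of the name), and the function name is made up. The dynamic loader writes its LD_DEBUG trace to stderr, hence the 2>&1.

```shell
# Hypothetical helper: read a loader trace on stdin and report whether the
# NVIDIA EGL vendor library was looked up at all.
nvidia_egl_attempted() {
    if grep -q 'libEGL_nvidia\.so'; then
        echo "yes"
    else
        echo "no"
    fi
}

# Only run the trace when eglinfo is installed.
if command -v eglinfo >/dev/null 2>&1; then
    LD_DEBUG=libs eglinfo 2>&1 | nvidia_egl_attempted
fi
```

A "no" would point at GLVND never trying the NVIDIA vendor library, while a "yes" would point at a failure inside the driver.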
Anyone else still having issues or can we close this?
I can't seem to start it in Wayland mode any more at all. GDM starts it in X11 mode, and running `XDG_SESSION_TYPE=wayland exec dbus-run-session gnome-session` from a TTY gives a bunch of errors.
Edit: https://bugs.archlinux.org/task/57957
* The mouse pointer lags some of the time
* Sometimes Keyboard input just doesn't ever reach the application that I'm writing in.
* Animations etc. are lagging
+# -D xwayland_eglstream=true \ # requires weston-eglstream from AUR
Not sure why it should require weston-eglstream.
So I wouldn't expect any improvements anyway. I might rebuild xorg myself to test it out.
xwayland still uses software renderer
> journalctl -x -o cat | grep glamor
glamor: 'wl_drm' not supported
Missing Wayland requirements for glamor GBM backend
glamor: Using nvidia's EGLStream interface, direct rendering impossible.
glamor: Performance may be affected. Ask your vendor to support GBM!
but applications that support wayland natively are using the nvidia driver, like
> glmark2-wayland
OpenGL Information
GL_VENDOR: NVIDIA Corporation
GL_RENDERER: GeForce GTX 1080/PCIe/SSE2
GL_VERSION: 4.6.0 NVIDIA 418.56
gnome-shell performance is quite good with software rendering (on AMD Ryzen 7 2700 at least)
But it's not stable enough; today I had two crashes.
https://blogs.gnome.org/uraeus/2019/04/03/preparing-for-fedora-workstation-30/
Fedora folks are working with nvidia, and by the fall we may have this issue fixed.
I have updated [GNOME#Wayland sessions] and [NVIDIA#DRM kernel mode setting] in the wiki to explain the XWayland situation and added the link that partizan posted.
[1] https://wiki.archlinux.org/index.php/GNOME#Wayland_sessions
[2] https://wiki.archlinux.org/index.php/NVIDIA#DRM_kernel_mode_setting