FS#64465 - nvidia-dkms: X11 Warning, couldn't open module glxserver_nvidia

Attached to Project: Arch Linux
Opened by EnSER (EnSER) - Monday, 11 November 2019, 21:49 GMT
Last edited by Sven-Hendrik Haase (Svenstaro) - Tuesday, 09 March 2021, 00:12 GMT
Task Type Bug Report
Category Packages: Extra
Status Closed
Assigned To Sven-Hendrik Haase (Svenstaro)
Architecture x86_64
Severity Medium
Priority Normal
Reported Version
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 0
Private No

Details

Description:
When I'm using "nvidia-dkms", 2D acceleration isn't working in X11.
For example, Firefox is sluggish and YouTube videos stutter.

Xorg.0.log shows:
[ 4.232] (**) NVIDIA(0): Enabling 2D acceleration
[ 4.232] (II) Loading sub module "glxserver_nvidia"
[ 4.232] (II) LoadModule: "glxserver_nvidia"
[ 4.232] (WW) Warning, couldn't open module glxserver_nvidia
[ 4.232] (EE) NVIDIA: Failed to load module "glxserver_nvidia" (module does not exist, 0)
[ 4.232] (EE) NVIDIA(0): Failed to initialize the GLX module; please check in your X
[ 4.232] (EE) NVIDIA(0): log file that the GLX module has been loaded in your X
[ 4.232] (EE) NVIDIA(0): server, and that the module is the NVIDIA GLX module. If
[ 4.232] (EE) NVIDIA(0): you continue to encounter problems, Please try
[ 4.232] (EE) NVIDIA(0): reinstalling the NVIDIA driver.

When I replace "nvidia-dkms" with "nvidia", the issue is gone and 2D acceleration works.

Let me know if/what additional information is needed.

Additional info:
* package version(s) 435.21-17
GPU is Nvidia GeForce RTX2060 Super

Steps to reproduce:
System with Nvidia GPU
Install package "nvidia-dkms"
Reboot
Check Xorg.0.log for log entry above
Wait some time until GPU clocks down
Open Firefox and play Youtube video

Closed by Sven-Hendrik Haase (Svenstaro)
Tuesday, 09 March 2021, 00:12 GMT
Reason for closing: No response
Comment by Doug Newgard (Scimmia) - Monday, 11 November 2019, 23:03 GMT
That's part of nvidia-utils. Did you change versions of that package or something when switching?
Comment by EnSER (EnSER) - Tuesday, 12 November 2019, 07:14 GMT
No, I did not touch nvidia-utils.

The only thing I have to do to reproduce this is:
pacman -S nvidia-dkms -> After the next reboot, the issue is there
pacman -S nvidia -> After the next reboot, the issue is gone

I can see that "/usr/lib/nvidia/xorg/libglxserver_nvidia.so.435.21" and the link "/usr/lib/nvidia/xorg/libglxserver_nvidia.so" are there in both cases. My guess is that the cause is indirect, but my knowledge of how Xorg and Nvidia interact is close to zero.

Comment by Sven-Hendrik Haase (Svenstaro) - Tuesday, 12 November 2019, 23:47 GMT
Update and check whether issue remains.
Comment by EnSER (EnSER) - Wednesday, 13 November 2019, 17:14 GMT
The issue still remains with nvidia-dkms 440.31-1.
Behavior is identical to 435.21-17
Comment by Sven-Hendrik Haase (Svenstaro) - Sunday, 24 November 2019, 06:28 GMT
There's a new update you can test. Apart from that, I have absolutely no idea. I'm going to see if anyone else from the team knows what's up.
Comment by EnSER (EnSER) - Monday, 25 November 2019, 21:05 GMT
The issue is still there. I compared the two cases and can see a difference between nvidia and nvidia-dkms in Xorg.0.log.
The following messages are missing when the issue happens:
[ 4.429] (II) xfree86: Adding drm device (/dev/dri/card0)
[ 4.431] (**) OutputClass "nvidia" ModulePath extended to "/usr/lib/nvidia/xorg,/usr/lib/xorg/modules,/usr/lib/xorg/modules"
[ 4.431] (**) OutputClass "nvidia" setting /dev/dri/card0 as PrimaryGPU

[ 4.480] (II) Applying OutputClass "nvidia" options to /dev/dri/card0
[ 4.480] (**) NVIDIA(0): Option "AllowEmptyInitialConfiguration"

Not sure what this means, maybe a timing issue?
I'm booting from efistub, but I doubt this could be the problem.

I have attached logs for the ok case with "nvidia-440.36-2" and the error one with "nvidia-dkms-440.36-2".
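
A comparison like the one described above can be sketched in Python. This is only an illustration, not the method actually used in this report: the function names are hypothetical, and it assumes each log line starts with the bracketed timestamp seen in the excerpts, which is stripped so that timing differences between boots do not show up as mismatches.

```python
import re

# Matches the leading "[ 4.429] " timestamp that prefixes each Xorg log line.
TIMESTAMP = re.compile(r"^\[\s*\d+\.\d+\]\s*")


def strip_timestamp(line):
    """Return the log message without its leading timestamp."""
    return TIMESTAMP.sub("", line.rstrip("\n"))


def missing_lines(good_log, bad_log):
    """Return messages present in good_log but absent from bad_log.

    Both arguments are the full text of an Xorg log. Order follows the
    good log, and repeated messages are reported only once.
    """
    bad_messages = {strip_timestamp(line) for line in bad_log.splitlines()}
    seen = set()
    missing = []
    for line in good_log.splitlines():
        message = strip_timestamp(line)
        if message and message not in bad_messages and message not in seen:
            seen.add(message)
            missing.append(message)
    return missing


# Hypothetical usage with two saved logs (file names are placeholders):
#   good = open("Xorg.0.log.nvidia").read()
#   bad = open("Xorg.0.log.nvidia-dkms").read()
#   for message in missing_lines(good, bad):
#       print(message)
```

Lines that merely differ in content (for example, a version string) will also be listed, so the output still needs to be read by hand.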
Comment by Sven-Hendrik Haase (Svenstaro) - Monday, 02 December 2019, 10:46 GMT
Frankly, I have no idea how to tackle this further.
Comment by EnSER (EnSER) - Tuesday, 14 January 2020, 10:27 GMT
I'm no longer able to reproduce this issue.
I have changed my mainboard+cpu from AMD to Intel, so I'm not sure whether this was fixed indirectly or was hardware related.
Comment by Benjamin Robin (benjarobin) - Wednesday, 17 February 2021, 23:40 GMT
  • Field changed: Percent Complete (100% → 0%)
The bug is still there, I analyzed it, and I can explain it. But the comment section is closed.
Comment by Emil (xexaxo) - Thursday, 18 February 2021, 13:02 GMT
The issue is basically the same as FS#69673.

In particular, the user has provided a "Device" section, which causes the "OutputClass" section (the default config shipped with the package, /usr/share/X11/xorg.conf.d/10-nvidia-drm-outputclass.conf) to be ignored.

Since that config is ignored, the ModulePath is never extended to /usr/lib/nvidia/xorg, so the glxserver_nvidia module cannot be found.

Going forward I see two routes for the user to take:
- reduce/remove the custom config, or
- chase nvidia/xorg-server to change the Device/OutputClass semantics


Edit: it's also possible that you're hitting a race condition, where the device node is missing when OutputClass is parsed, but present when "Device" is handled.
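
To check whether a custom "Device" section is present at all, something like the following Python sketch could help. It is an assumption-laden illustration: the function names are hypothetical, the parsing is a simple line match rather than a full xorg.conf parser, and the directories in the usage comment are the usual locations, which may not cover every setup (a standalone /etc/X11/xorg.conf would need to be checked separately).

```python
import re
from pathlib import Path

# Matches lines such as:  Section "Device"
SECTION = re.compile(r'^\s*Section\s+"([^"]+)"', re.IGNORECASE)


def sections_in(text):
    """Return the names of all Section blocks in xorg.conf-style text."""
    names = []
    for line in text.splitlines():
        line = line.split("#", 1)[0]  # ignore anything after a comment marker
        match = SECTION.match(line)
        if match:
            names.append(match.group(1))
    return names


def files_with_device_section(directories):
    """Return the *.conf files in the given directories that declare a Device section."""
    hits = []
    for directory in directories:
        path = Path(directory)
        if not path.is_dir():
            continue
        for conf in sorted(path.glob("*.conf")):
            try:
                text = conf.read_text(errors="replace")
            except OSError:
                continue  # unreadable file, skip it
            if any(name.lower() == "device" for name in sections_in(text)):
                hits.append(conf)
    return hits


# Hypothetical usage:
#   for conf in files_with_device_section(["/etc/X11/xorg.conf.d",
#                                          "/usr/share/X11/xorg.conf.d"]):
#       print(conf)
```

A hit only shows that a "Device" section exists. Whether it is what causes the OutputClass config to be skipped still has to be confirmed against the Xorg log.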
Comment by Sven-Hendrik Haase (Svenstaro) - Wednesday, 24 February 2021, 09:24 GMT
@benjarobin is there anything you can add here to help us fix this case from a packaging perspective?
