Arch Linux

FS#36114 - testing/nvidia_319.32-3 oops with bumblebee

Attached to Project: Arch Linux
Opened by Tod Jackson (shirokuro) - Friday, 12 July 2013, 04:18 GMT
Last edited by Tobias Powalowski (tpowa) - Tuesday, 06 August 2013, 09:31 GMT
Task Type: Bug Report
Category: Packages: Extra
Status: Closed
Assigned To: Tobias Powalowski (tpowa), Ionut Biru (wonder)
Architecture: All
Severity: Medium
Priority: Normal
Reported Version:
Due in Version: Undecided
Due Date: Undecided
Percent Complete: 100%
Votes: 5
Private: No

Details

Description:
After a few runs of optirun glxinfo and optirun glxspheres (which worked only after a long pause), I checked dmesg and found that they had caused a kernel oops. The relevant dmesg output is attached. After a few optirun invocations the nVidia card crashed in a way that also broke the framebuffer console, so I had to press Ctrl+Alt+Delete after exiting X.

Two patches I tried a few weeks ago on kernel 3.10 with the nVidia driver built fine, but the card would not display any graphics at all.

Additional info:

* package version(s)
testing nvidia_319.32-3

* config and/or log files etc.
-dmesg attached
-default bumblebee settings
optirun glxinfo | grep -i render showed the nVidia card was being used

Steps to reproduce:
Attempt to use the nVidia card with optirun or primusrun.

Closed by Tobias Powalowski (tpowa)
Tuesday, 06 August 2013, 09:31 GMT
Reason for closing: Fixed
Comment by Tod Jackson (shirokuro) - Friday, 12 July 2013, 23:40 GMT
From https://devtalk.nvidia.com/default/topic/549532/linux-3-10-incompatibility-in-function-lsquo-nv_i2c_del_adapter-rsquo-error-void-value-not-igno/ :
http://pastebin.com/N0a5KMZa

This patch supposedly avoids the following scenario from my dmesg by disabling the proc fs: [ 223.662924] proc_dir_entry 'driver/nvidia' already registered. I don't have the energy to test it today, but I wanted to at least mention it.
Comment by Sergey (bsergik) - Friday, 26 July 2013, 09:38 GMT
I would move this bug into the "Packages: extra" category, because the broken nvidia driver has landed in "extra".

I have just updated the linux (3.9.9-1 -> 3.10.2-1) and nvidia (319.32-2 -> 319.32-4) packages, and since then I cannot use the nvidia card anymore.

The nvidia card is the second video card in my laptop; I use it through bumblebee. When I use optirun, I get these errors in kern.log:

Jul 26 13:08:11 sherlock kernel: [ 96.369637] bbswitch: enabling discrete graphics
Jul 26 13:08:11 sherlock kernel: [ 96.642638] pci 0000:01:00.0: power state changed by ACPI to D0
Jul 26 13:08:11 sherlock kernel: [ 96.699359] nvidia: module license 'NVIDIA' taints kernel.
Jul 26 13:08:11 sherlock kernel: [ 96.699363] Disabling lock debugging due to kernel taint
Jul 26 13:08:11 sherlock kernel: [ 96.703821] vgaarb: device changed decodes: PCI:0000:01:00.0,olddecodes=io+mem,decodes=none:owns=none
Jul 26 13:08:11 sherlock kernel: [ 96.703964] [drm] Initialized nvidia-drm 0.0.0 20130102 for 0000:01:00.0 on minor 1
Jul 26 13:08:11 sherlock kernel: [ 96.703968] NVRM: loading NVIDIA UNIX x86_64 Kernel Module 319.32 Wed Jun 19 15:51:20 PDT 2013
Jul 26 13:08:16 sherlock kernel: [ 101.336323] NVRM: GPU at 0000:01:00.0 has fallen off the bus.
Jul 26 13:08:16 sherlock kernel: [ 101.354706] NVRM: RmInitAdapter failed! (0x25:0x28:1148)
Jul 26 13:08:16 sherlock kernel: [ 101.354713] NVRM: rm_init_adapter(0) failed

After I downgraded the linux, bbswitch and nvidia packages, everything works fine again.
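
To check whether another kernel log shows the same failure, the NVRM messages quoted above can be searched for mechanically. This is only a minimal sketch and not part of the original report: the function name find_nvrm_failures and the pattern list are assumptions made for illustration, and the patterns cover just the three failure lines in this excerpt.

```python
import re

# Failure signatures copied from the kern.log excerpt above.
# Illustrative only: this is not an exhaustive list of NVRM errors.
FAILURE_PATTERNS = [
    re.compile(r"NVRM: GPU at \S+ has fallen off the bus"),
    re.compile(r"NVRM: RmInitAdapter failed"),
    re.compile(r"NVRM: rm_init_adapter\(\d+\) failed"),
]


def find_nvrm_failures(lines):
    """Return the log lines that match any of the known failure signatures."""
    return [
        line.rstrip("\n")
        for line in lines
        if any(pattern.search(line) for pattern in FAILURE_PATTERNS)
    ]
```

Given the ten kern.log lines above, this should return the last three; the "NVRM: loading NVIDIA UNIX" line is a normal load message and is not matched.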
Comment by ricsch (ricsch) - Saturday, 03 August 2013, 14:31 GMT
Patch http://pastebin.com/JDpkR3kt from the linked NVIDIA developer forum thread fixes the issue for me flawlessly. The one currently used in [extra] is broken: with it, /proc does not get cleaned up correctly, which causes various problems.
Comment by Tod Jackson (shirokuro) - Monday, 05 August 2013, 02:31 GMT
ricsch, thanks! I can confirm your linked patch works perfectly. Unfortunately the current testing package (release 5) has the same issue as in the initial bug post.

[ 13.584302] bbswitch: version 0.7
[ 13.584316] bbswitch: Found integrated VGA device 0000:00:02.0: \_SB_.PCI0.GFX0
[ 13.584326] bbswitch: Found discrete VGA device 0000:01:00.0: \_SB_.PCI0.P0P2.PEGP
[ 13.585052] bbswitch: detected an Optimus _DSM function
[ 13.585070] bbswitch: Succesfully loaded. Discrete card 0000:01:00.0 is on
[ 13.590410] [drm] Module unloaded
[ 13.592413] bbswitch: disabling discrete graphics
[ 13.604575] pci 0000:01:00.0: Refused to change power state, currently in D0
[ 13.605178] pci 0000:01:00.0: power state changed by ACPI to D3cold
[ 78.642989] bbswitch: enabling discrete graphics
[ 79.148256] pci 0000:01:00.0: power state changed by ACPI to D0
[ 79.212440] vgaarb: device changed decodes: PCI:0000:01:00.0,olddecodes=none,decodes=none:owns=none
[ 79.213339] [drm] Initialized nvidia-drm 0.0.0 20130102 for 0000:01:00.0 on minor 1
[ 79.213350] NVRM: loading NVIDIA UNIX x86_64 Kernel Module 319.32 Wed Jun 19 15:51:20 PDT 2013
[ 83.740441] [drm] Module unloaded
[ 83.743370] bbswitch: disabling discrete graphics
[ 83.756867] pci 0000:01:00.0: Refused to change power state, currently in D0
[ 83.757497] pci 0000:01:00.0: power state changed by ACPI to D3cold
[ 221.673107] bbswitch: enabling discrete graphics
[ 222.173581] pci 0000:01:00.0: power state changed by ACPI to D0
[ 222.236098] vgaarb: device changed decodes: PCI:0000:01:00.0,olddecodes=none,decodes=none:owns=none
[ 222.236658] [drm] Initialized nvidia-drm 0.0.0 20130102 for 0000:01:00.0 on minor 1
[ 222.236671] NVRM: loading NVIDIA UNIX x86_64 Kernel Module 319.32 Wed Jun 19 15:51:20 PDT 2013
[ 225.176789] [drm] Module unloaded
[ 225.180274] bbswitch: disabling discrete graphics
[ 225.192950] pci 0000:01:00.0: Refused to change power state, currently in D0
[ 225.193658] pci 0000:01:00.0: power state changed by ACPI to D3cold
[ 229.066062] bbswitch: enabling discrete graphics
[ 229.566500] pci 0000:01:00.0: power state changed by ACPI to D0
[ 229.602918] vgaarb: device changed decodes: PCI:0000:01:00.0,olddecodes=none,decodes=none:owns=none
[ 229.603871] [drm] Initialized nvidia-drm 0.0.0 20130102 for 0000:01:00.0 on minor 1
[ 229.603902] NVRM: loading NVIDIA UNIX x86_64 Kernel Module 319.32 Wed Jun 19 15:51:20 PDT 2013
[ 232.616486] [drm] Module unloaded
[ 232.620213] bbswitch: disabling discrete graphics
[ 232.632752] pci 0000:01:00.0: Refused to change power state, currently in D0
[ 232.633446] pci 0000:01:00.0: power state changed by ACPI to D3cold
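
The excerpt above repeats the same pattern several times: bbswitch enables the card, the nvidia module loads, and a few seconds later the card is disabled again. A rough way to pull those cycles out of a dmesg dump is sketched below. It is an illustration only, not something from the original thread: the function name power_cycles is hypothetical, and the regular expression assumes the bbswitch message wording seen in this log.

```python
import re

# Matches dmesg lines such as "[ 78.642989] bbswitch: enabling discrete graphics".
# The message wording is taken from the log above and is assumed to be stable.
LINE_RE = re.compile(
    r"\[\s*(\d+\.\d+)\]\s+bbswitch: (enabling|disabling) discrete graphics"
)


def power_cycles(lines):
    """Return (enabled_at, disabled_at) timestamp pairs, one per power cycle."""
    cycles = []
    enabled_at = None
    for line in lines:
        match = LINE_RE.search(line)
        if not match:
            continue
        timestamp = float(match.group(1))
        action = match.group(2)
        if action == "enabling":
            enabled_at = timestamp
        elif enabled_at is not None:
            # A "disabling" line closes the cycle opened by the last "enabling".
            cycles.append((enabled_at, timestamp))
            enabled_at = None
    return cycles
```

For the log above this should yield three cycles, each lasting roughly three to five seconds; the first "disabling" line at boot has no matching "enabling" line and is skipped.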
