FS#58558 - [nvidia] rminitadapter failed

Attached to Project: Arch Linux
Opened by Tobias Schwarz (tobyblack) - Saturday, 12 May 2018, 12:19 GMT
Last edited by Sven-Hendrik Haase (Svenstaro) - Monday, 14 May 2018, 09:21 GMT
Task Type Bug Report
Category Upstream Bugs
Status Closed
Assigned To Sven-Hendrik Haase (Svenstaro)
Felix Yan (felixonmars)
Architecture All
Severity High
Priority Normal
Reported Version
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 4
Private No

Details

Description:
I updated my nvidia package to 396.24-2 and my linux package to 4.16.8-1. After a restart I only get a black screen. After switching to the onboard graphics I saw several errors:
* RmInitAdapter failed
* lightdm could not be started
* linux-uvc kernel module could not be loaded

After downgrading to nvidia 390.48-13 and linux 4.16.7-1 it works again, but GNOME now freezes a few seconds after login.

Additional info:
* package version(s)
nvidia 396.24-2 (+ matching nvidia-utils) and linux 4.16.8-1

Closed by  Sven-Hendrik Haase (Svenstaro)
Monday, 14 May 2018, 09:21 GMT
Reason for closing:  Upstream
Comment by EQ (vityafx) - Saturday, 12 May 2018, 17:13 GMT
There is a kernel panic, so people are switching to the nvidia-390xx packages. Here is the log of the kernel panic with 396.24 (the "nvidia" package in the official Arch Linux repos):

May 12 19:32:32 purplejam kernel: resource sanity check: requesting [mem 0x000e0000-0x000fffff], which spans more than pnp 00:09 [mem 0x000e0000-0x000effff]
May 12 19:32:32 purplejam kernel: caller _nv028815rm+0x57/0x90 [nvidia] mapping multiple BARs
May 12 19:32:32 purplejam kernel: resource sanity check: requesting [mem 0x000c0000-0x000fffff], which spans more than PCI Bus 0000:00 [mem 0x000c0000-0x000dffff window]
May 12 19:32:32 purplejam kernel: caller _nv001112rm+0xe3/0x1d0 [nvidia] mapping multiple BARs
May 12 19:32:32 purplejam kernel: NVRM: GPU at PCI:0000:01:00: GPU-cd5e26d8-892a-7f10-97dc-ee95377c06b4
May 12 19:32:32 purplejam kernel: NVRM: GPU Board Serial Number:
May 12 19:32:32 purplejam kernel: NVRM: Xid (PCI:0000:01:00): 62, 0ac0(2f10) 00000000 00000000
May 12 19:32:33 purplejam systemd-networkd[289]: enp3s0: Configured
May 12 19:32:54 purplejam kernel: NVRM: RmInitAdapter failed! (0x53:0xffff:1957)
May 12 19:32:54 purplejam kernel: NVRM: rm_init_adapter failed for device bearing minor number 0
May 12 19:32:54 purplejam kernel: BUG: unable to handle kernel paging request at 0000000000001fa8
May 12 19:32:54 purplejam kernel: IP: _nv010416rm+0x30/0x1a0 [nvidia]
May 12 19:32:54 purplejam kernel: PGD 0 P4D 0
May 12 19:32:54 purplejam kernel: Oops: 0000 [#1] PREEMPT SMP PTI
May 12 19:32:54 purplejam kernel: Modules linked in: it87 hwmon_vid mousedev intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel nvidia_drm(PO) nvidia_modeset(PO) snd_hda_codec_hdmi kvm snd_hda_codec_realtek irqbypass nvidia(PO) snd_hda_codec_generic crct10dif_pclmul crc32_pclmul i2c_algo_bit ghash_clmulni_intel pcbc iTCO_wdt iTCO_vendor_support gpio_ich drm_kms_helper snd_hda_intel ppdev aesni_intel snd_hda_codec drm aes_x86_64 crypto_simd glue_helper snd_hda_core cryptd intel_cstate snd_hwdep snd_pcm intel_gtt ipmi_devintf agpgart snd_timer ipmi_msghandler r8169 snd intel_uncore mei_me syscopyarea sysfillrect sysimgblt mii fb_sys_fops mei i2c_i801 soundcore lpc_ich input_leds shpchp intel_rapl_perf parport_pc led_class pcspkr rtc_cmos parport evdev mac_hid vboxnetflt(O) vboxnetadp(O) vboxpci(O) vboxdrv(O)
May 12 19:32:54 purplejam kernel: crypto_user ip_tables x_tables ext4 crc32c_generic crc16 mbcache jbd2 fscrypto hid_generic usbhid hid sd_mod ata_generic pata_acpi serio_raw atkbd libps2 ata_piix libata ehci_pci ehci_hcd crc32c_intel scsi_mod usbcore usb_common i8042 serio
May 12 19:32:54 purplejam kernel: CPU: 0 PID: 68 Comm: kworker/0:1 Tainted: P O 4.16.8-1-ARCH #1
May 12 19:32:54 purplejam kernel: Hardware name: Gigabyte Technology Co., Ltd. Z68P-DS3/Z68P-DS3, BIOS F7 10/12/2011
May 12 19:32:54 purplejam kernel: Workqueue: events os_execute_work_item [nvidia]
May 12 19:32:54 purplejam kernel: RIP: 0010:_nv010416rm+0x30/0x1a0 [nvidia]
May 12 19:32:54 purplejam kernel: RSP: 0018:ffffa7ff01b33d88 EFLAGS: 00010246
May 12 19:32:54 purplejam kernel: RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000001
May 12 19:32:54 purplejam kernel: RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff9006a9f6c008
May 12 19:32:54 purplejam kernel: RBP: ffff900699885ff8 R08: 0000000000000001 R09: 0000000000000087
May 12 19:32:54 purplejam kernel: R10: 0000000000000000 R11: 0000000000000000 R12: ffff900699a43508
May 12 19:32:54 purplejam kernel: R13: 0000000000000000 R14: ffff900699bd71c8 R15: ffff9006ad384a80
May 12 19:32:54 purplejam kernel: FS: 0000000000000000(0000) GS:ffff9006bfa00000(0000) knlGS:0000000000000000
May 12 19:32:54 purplejam kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
May 12 19:32:54 purplejam kernel: CR2: 0000000000001fa8 CR3: 00000001f200a004 CR4: 00000000000606f0
May 12 19:32:54 purplejam kernel: Call Trace:
May 12 19:32:54 purplejam kernel: ? _nv001065rm+0x84/0xe0 [nvidia]
May 12 19:32:54 purplejam kernel: ? rm_execute_work_item+0x49/0xc0 [nvidia]
May 12 19:32:54 purplejam kernel: ? kmem_cache_alloc+0xa1/0x1b0
May 12 19:32:54 purplejam kernel: ? os_execute_work_item+0x40/0x60 [nvidia]
May 12 19:32:54 purplejam kernel: ? process_one_work+0x1d1/0x3b0
May 12 19:32:54 purplejam kernel: ? worker_thread+0x2b/0x3d0
May 12 19:32:54 purplejam kernel: ? process_one_work+0x3b0/0x3b0
May 12 19:32:54 purplejam kernel: ? kthread+0x112/0x130
May 12 19:32:54 purplejam kernel: ? kthread_create_on_node+0x60/0x60
May 12 19:32:54 purplejam kernel: ? ret_from_fork+0x35/0x40
May 12 19:32:54 purplejam kernel: Code: 89 f4 53 48 83 ed 08 89 fb e8 8d 11 00 00 85 c0 0f 85 0d 01 00 00 48 8b 05 ee 93 bb 00 89 de 45 31 ed 48 89 c7 ff 90 88 01 00 00 <48> 8b b0 a8 1f 00 00 48 89 c7 48 89 c3 e8 ae 4c 2c 00 48 85 c0
May 12 19:32:54 purplejam kernel: RIP: _nv010416rm+0x30/0x1a0 [nvidia] RSP: ffffa7ff01b33d88


The card is GTX 1060 6GB.
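
For anyone who wants to compare failure codes across machines, here is a minimal Python sketch that pulls the three values out of "RmInitAdapter failed!" lines in a saved log. The function name `find_rm_init_failures` is made up for illustration, and the three fields are treated as opaque strings because their meaning is not documented in this report.

```python
import re

# Matches lines such as:
#   NVRM: RmInitAdapter failed! (0x53:0xffff:1957)
# The three fields are kept as opaque strings (assumption: their
# meaning is driver-internal and not documented in this report).
RM_INIT_RE = re.compile(
    r"NVRM: RmInitAdapter failed! "
    r"\((0x[0-9a-fA-F]+):(0x[0-9a-fA-F]+):(\d+)\)"
)


def find_rm_init_failures(log_text):
    """Return one (field1, field2, field3) tuple per failure line found."""
    return [match.groups() for match in RM_INIT_RE.finditer(log_text)]


if __name__ == "__main__":
    sample = (
        "May 12 19:32:54 purplejam kernel: NVRM: RmInitAdapter failed! "
        "(0x53:0xffff:1957)\n"
        "May 12 19:32:54 purplejam kernel: NVRM: rm_init_adapter failed "
        "for device bearing minor number 0\n"
    )
    for fields in find_rm_init_failures(sample):
        print(fields)
```

The input can be any text dump of the kernel log; lines that do not match are simply ignored.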
Comment by EQ (vityafx) - Saturday, 12 May 2018, 17:28 GMT
I found some possibly related lines about this error in the README of the NVIDIA module source:


A. On some notebooks with Optimus graphics, the NVIDIA driver may not be able
to retrieve the Video BIOS due to interactions between the System BIOS and
the Linux kernel's ACPI subsystem. On affected notebooks, applications that
require the GPU will fail, and messages like the following may appear in
the system log:


NVRM: failed to copy vbios to system memory.
NVRM: RmInitAdapter failed! (0x30:0xffffffff:858)
NVRM: rm_init_adapter(0) failed


Such problems are typically beyond the control of the NVIDIA driver, which
relies on proper cooperation of ACPI and the System BIOS to retrieve
important information about the GPU, including the Video BIOS.
Comment by EQ (vityafx) - Saturday, 12 May 2018, 17:37 GMT
Toby, could you tell me which motherboard you have? Mine is a Gigabyte GA-Z68P-DS3 (from 2011) with an old BIOS. According to the text above, the problem could be related to our old BIOS. I also read somewhere on the internet that a BIOS update helped someone who had this problem in 2014.
Comment by Tobias Schwarz (tobyblack) - Saturday, 12 May 2018, 20:05 GMT
I have the MSI Z170A PC MATE Intel Z170 Motherboard. Also a GTX 1060 6GB.
Comment by Marc Sven Schulte (msschulte) - Sunday, 13 May 2018, 07:54 GMT
I can confirm the issue as well, but without a kernel panic (Xorg sits at 100% CPU time).

NVRM: RmInitAdapter failed! (0x26:0xffff:1123)
NVRM: rm_init_adapter failed for device bearing minor number 0

Mainboard: ASUS H87I-PLUS
Board: Gainward GeForce GTX 1060 6GB Phoenix
Operating System: Linux 4.16.8-1-ARCH
Driver Version: 396.24
Comment by Darek (blablo) - Sunday, 13 May 2018, 09:35 GMT
@vityafx log:
> May 12 19:32:32 purplejam kernel: NVRM: Xid (PCI:0000:01:00): 62, 0ac0(2f10) 00000000 00000000
https://docs.nvidia.com/deploy/xid-errors/index.html -> Internal micro-controller halt (newer drivers)

Please report this upstream at [1] following the instructions at [2]. Thanks

[1] https://devtalk.nvidia.com/default/board/98/linux/
[2] https://devtalk.nvidia.com/default/topic/522835/linux/if-you-have-a-problem-please-read-this-first/
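
When preparing the upstream report, the Xid number can be pulled out of the log with a few lines of Python. This is only a sketch: the function name `parse_xid` is hypothetical, and the lookup table contains just the one entry quoted above (Xid 62); any other Xid is returned without a description.

```python
import re

# Matches lines such as:
#   NVRM: Xid (PCI:0000:01:00): 62, 0ac0(2f10) 00000000 00000000
XID_RE = re.compile(r"NVRM: Xid \((PCI:[0-9a-fA-F:.]+)\): (\d+)")

# Only the entry mentioned in this thread; see NVIDIA's Xid
# documentation for the full list (not reproduced here).
XID_DESCRIPTIONS = {
    62: "Internal micro-controller halt (newer drivers)",
}


def parse_xid(line):
    """Return (pci_device, xid, description_or_None), or None if no Xid."""
    match = XID_RE.search(line)
    if match is None:
        return None
    xid = int(match.group(2))
    return match.group(1), xid, XID_DESCRIPTIONS.get(xid)


if __name__ == "__main__":
    line = (
        "May 12 19:32:32 purplejam kernel: NVRM: Xid (PCI:0000:01:00): "
        "62, 0ac0(2f10) 00000000 00000000"
    )
    print(parse_xid(line))
```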

Comment by Darek (blablo) - Sunday, 13 May 2018, 10:38 GMT
Comment by Sven-Hendrik Haase (Svenstaro) - Monday, 14 May 2018, 08:34 GMT
Is there anything I can do about this in the package?
Comment by EQ (vityafx) - Monday, 14 May 2018, 08:39 GMT
Comment by Sven-Hendrik Haase (Svenstaro) - Monday, 14 May 2018, 09:19 GMT
Alright. Closing this, as it is confirmed to be an upstream issue. Just flag the package once an update becomes available.
