FS#78019 - [nvidia-dkms] System crash during sleep (Failed to map NvKmsKapiMemory)

Attached to Project: Arch Linux
Opened by Thomas (uhthomas) - Tuesday, 28 March 2023, 17:36 GMT
Last edited by Toolybird (Toolybird) - Friday, 28 April 2023, 20:26 GMT
Task Type: Bug Report
Category: Packages: Extra
Status: Closed
Assigned To: No-one
Architecture: All
Severity: Low
Priority: Normal
Reported Version:
Due in Version: Undecided
Due Date: Undecided
Percent Complete: 100%
Votes: 0
Private: No

Details

Description:

Most of the time, when my system sleeps, I come back to find it in a state as if it had been powered off and on again. Today, I caught it mid-crash. There was one error on the screen, which I was also able to find with dmesg.

[48408.603546] [drm:__nv_drm_gem_map_nvkms_memory_offset [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000900] Failed to map NvKmsKapiMemory 0x0000000032cd8710

This has happened every single day for months.

It may be helpful to know I have an RTX 3080 Ti and I'm using GNOME Wayland.

Additional info:
* package version(s)

```
❯ paru -Q gnome-shell nvidia-dkms
gnome-shell 1:43.3-2
nvidia-dkms 530.41.03-1
```

* config and/or log files etc.

Additional content:

[39546.798557] usb 2-3.3: 1:2 : unsupported format bits 0x100000000
[45007.229979] razermouse: Command timed out. status: 04 transaction_id.id: 1f remaining_packets: 00 protocol_type: 00 data_size: 02, command_class: 07, command_id.id: 80 Params: 00000000000000000000000000000000 .
[47411.439329] razermouse: Command timed out. status: 04 transaction_id.id: 1f remaining_packets: 00 protocol_type: 00 data_size: 02, command_class: 07, command_id.id: 80 Params: 00000000000000000000000000000000 .
[48012.512420] razermouse: Command timed out. status: 04 transaction_id.id: 1f remaining_packets: 00 protocol_type: 00 data_size: 02, command_class: 07, command_id.id: 80 Params: 00000000000000000000000000000000 .
[48397.526804] snd_hda_codec_hdmi hdaudioC1D0: HDMI: invalid ELD data byte 17
[48397.527115] snd_hda_codec_hdmi hdaudioC1D0: HDMI: invalid ELD data byte 0
[48397.529174] rfkill: input handler enabled
[48404.651964] PM: suspend entry (deep)
[48404.691679] Filesystems sync: 0.039 seconds
[48404.696596] Freezing user space processes
[48408.603546] [drm:__nv_drm_gem_map_nvkms_memory_offset [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000900] Failed to map NvKmsKapiMemory 0x0000000032cd8710
[48408.606067] Freezing user space processes completed (elapsed 3.909 seconds)
[48408.606073] OOM killer disabled.
[48408.606074] Freezing remaining freezable tasks
[48408.607921] Freezing remaining freezable tasks completed (elapsed 0.001 seconds)
[48408.607945] printk: Suspending console(s) (use no_console_suspend to debug)
[48408.752222] ACPI: PM: Preparing to enter system sleep state S3
[48409.102931] ACPI: PM: Saving platform NVS memory
[48409.103014] Disabling non-boot CPUs ...
[48409.107550] smpboot: CPU 1 is now offline
[48409.112927] smpboot: CPU 2 is now offline
[48409.117746] smpboot: CPU 3 is now offline
[48409.121618] smpboot: CPU 4 is now offline
[48409.126124] smpboot: CPU 5 is now offline
[48409.130420] smpboot: CPU 6 is now offline
[48409.134783] smpboot: CPU 7 is now offline
[48409.138624] smpboot: CPU 8 is now offline
[48409.143890] smpboot: CPU 9 is now offline
[48409.148356] smpboot: CPU 10 is now offline
[48409.152869] smpboot: CPU 11 is now offline
[48409.156461] smpboot: CPU 12 is now offline
[48409.159976] smpboot: CPU 13 is now offline
[48409.163341] smpboot: CPU 14 is now offline
[48409.166263] smpboot: CPU 15 is now offline
[48409.169307] smpboot: CPU 16 is now offline
[48409.169880] Spectre V2 : Update user space SMT mitigation: STIBP off
[48409.172312] smpboot: CPU 17 is now offline
[48409.175595] smpboot: CPU 18 is now offline
[48409.178359] smpboot: CPU 19 is now offline
[48409.180633] smpboot: CPU 20 is now offline
[48409.183238] smpboot: CPU 21 is now offline
[48409.186366] smpboot: CPU 22 is now offline
[48409.188758] smpboot: CPU 23 is now offline
[48409.190789] smpboot: CPU 24 is now offline
[48409.193096] smpboot: CPU 25 is now offline
[48409.195495] smpboot: CPU 26 is now offline
[48409.197679] smpboot: CPU 27 is now offline
[48409.199787] smpboot: CPU 28 is now offline
[48409.202380] smpboot: CPU 29 is now offline
[48409.204396] smpboot: CPU 30 is now offline
[48409.210071] smpboot: CPU 31 is now offline
[48409.210862] ACPI: PM: Low-level resume complete
[48409.210883] ACPI: PM: Restoring platform NVS memory
[48409.210997] LVT offset 0 assigned for vector 0x400
[48409.211456] Enabling non-boot CPUs ...
[48409.211538] x86: Booting SMP configuration:
[48409.211539] smpboot: Booting Node 0 Processor 1 APIC 0x2
[48409.217432] ACPI: \_PR_.C002: Found 2 idle states
[48409.218149] CPU1 is up
[48409.218239] smpboot: Booting Node 0 Processor 2 APIC 0x4
[48409.220710] ACPI: \_PR_.C004: Found 2 idle states
[48409.221663] CPU2 is up
[48409.221705] smpboot: Booting Node 0 Processor 3 APIC 0x6
[48409.224165] ACPI: \_PR_.C006: Found 2 idle states
[48409.225040] CPU3 is up
[48409.225067] smpboot: Booting Node 0 Processor 4 APIC 0x8
[48409.227568] ACPI: \_PR_.C008: Found 2 idle states
[48409.228443] CPU4 is up
[48409.228470] smpboot: Booting Node 0 Processor 5 APIC 0xa
[48409.230997] ACPI: \_PR_.C00A: Found 2 idle states
[48409.232131] CPU5 is up
[48409.232227] smpboot: Booting Node 0 Processor 6 APIC 0xc
[48409.234749] ACPI: \_PR_.C00C: Found 2 idle states
[48409.235971] CPU6 is up
[48409.236002] smpboot: Booting Node 0 Processor 7 APIC 0xe
[48409.238528] ACPI: \_PR_.C00E: Found 2 idle states
[48409.239574] CPU7 is up
[48409.239604] smpboot: Booting Node 0 Processor 8 APIC 0x10
[48409.242107] ACPI: \_PR_.C010: Found 2 idle states
[48409.243388] CPU8 is up
[48409.243418] smpboot: Booting Node 0 Processor 9 APIC 0x12
[48409.245961] ACPI: \_PR_.C012: Found 2 idle states
[48409.247316] CPU9 is up
[48409.247347] smpboot: Booting Node 0 Processor 10 APIC 0x14
[48409.249881] ACPI: \_PR_.C014: Found 2 idle states
[48409.251338] CPU10 is up
[48409.251371] smpboot: Booting Node 0 Processor 11 APIC 0x16
[48409.253911] ACPI: \_PR_.C016: Found 2 idle states
[48409.255410] CPU11 is up
[48409.255439] smpboot: Booting Node 0 Processor 12 APIC 0x18
[48409.257945] ACPI: \_PR_.C018: Found 2 idle states
[48409.259516] CPU12 is up
[48409.259554] smpboot: Booting Node 0 Processor 13 APIC 0x1a
[48409.262096] ACPI: \_PR_.C01A: Found 2 idle states
[48409.263589] CPU13 is up
[48409.263617] smpboot: Booting Node 0 Processor 14 APIC 0x1c
[48409.266177] ACPI: \_PR_.C01C: Found 2 idle states
[48409.267691] CPU14 is up
[48409.267720] smpboot: Booting Node 0 Processor 15 APIC 0x1e
[48409.270297] ACPI: \_PR_.C01E: Found 2 idle states
[48409.271840] CPU15 is up
[48409.271870] smpboot: Booting Node 0 Processor 16 APIC 0x1
[48409.274419] ACPI: \_PR_.C001: Found 2 idle states
[48409.276263] Spectre V2 : Update user space SMT mitigation: STIBP always-on
[48409.276269] CPU16 is up
[48409.276296] smpboot: Booting Node 0 Processor 17 APIC 0x3
[48409.278869] ACPI: \_PR_.C003: Found 2 idle states
[48409.280663] CPU17 is up
[48409.280736] smpboot: Booting Node 0 Processor 18 APIC 0x5
[48409.283269] ACPI: \_PR_.C005: Found 2 idle states
[48409.285102] CPU18 is up
[48409.285147] smpboot: Booting Node 0 Processor 19 APIC 0x7
[48409.287668] ACPI: \_PR_.C007: Found 2 idle states
[48409.289389] CPU19 is up
[48409.289420] smpboot: Booting Node 0 Processor 20 APIC 0x9
[48409.291921] ACPI: \_PR_.C009: Found 2 idle states
[48409.293981] CPU20 is up
[48409.294011] smpboot: Booting Node 0 Processor 21 APIC 0xb
[48409.296585] ACPI: \_PR_.C00B: Found 2 idle states
[48409.298469] CPU21 is up
[48409.298497] smpboot: Booting Node 0 Processor 22 APIC 0xd
[48409.301067] ACPI: \_PR_.C00D: Found 2 idle states
[48409.303298] CPU22 is up
[48409.303333] smpboot: Booting Node 0 Processor 23 APIC 0xf
[48409.305914] ACPI: \_PR_.C00F: Found 2 idle states
[48409.308201] CPU23 is up
[48409.308231] smpboot: Booting Node 0 Processor 24 APIC 0x11
[48409.310740] ACPI: \_PR_.C011: Found 2 idle states
[48409.312894] CPU24 is up
[48409.312930] smpboot: Booting Node 0 Processor 25 APIC 0x13
[48409.315525] ACPI: \_PR_.C013: Found 2 idle states
[48409.318607] CPU25 is up
[48409.318634] smpboot: Booting Node 0 Processor 26 APIC 0x15
[48409.321232] ACPI: \_PR_.C015: Found 2 idle states
[48409.323771] CPU26 is up
[48409.323801] smpboot: Booting Node 0 Processor 27 APIC 0x17
[48409.326412] ACPI: \_PR_.C017: Found 2 idle states
[48409.328893] CPU27 is up
[48409.328922] smpboot: Booting Node 0 Processor 28 APIC 0x19
[48409.331437] ACPI: \_PR_.C019: Found 2 idle states
[48409.334050] CPU28 is up
[48409.334080] smpboot: Booting Node 0 Processor 29 APIC 0x1b
[48409.336701] ACPI: \_PR_.C01B: Found 2 idle states
[48409.339214] CPU29 is up
[48409.339246] smpboot: Booting Node 0 Processor 30 APIC 0x1d
[48409.341859] ACPI: \_PR_.C01D: Found 2 idle states
[48409.344558] CPU30 is up
[48409.344590] smpboot: Booting Node 0 Processor 31 APIC 0x1f
[48409.347213] ACPI: \_PR_.C01F: Found 2 idle states
[48409.350159] CPU31 is up
[48409.354103] ACPI: PM: Waking up from system sleep state S3
[48409.384858] nvme nvme1: Shutdown timeout set to 8 seconds
[48409.410221] nvme nvme0: 32/0/0 default/read/poll queues
[48409.410890] nvme nvme1: 32/0/0 default/read/poll queues
[48409.670168] usb 2-3.4: reset full-speed USB device number 6 using xhci_hcd
[48409.692199] ata6: SATA link down (SStatus 0 SControl 300)
[48409.692238] ata10: SATA link down (SStatus 0 SControl 300)
[48409.695191] ata3: SATA link down (SStatus 0 SControl 300)
[48409.695192] ata5: SATA link down (SStatus 0 SControl 300)
[48409.695233] ata4: SATA link down (SStatus 0 SControl 300)
[48409.695233] ata9: SATA link down (SStatus 0 SControl 300)
[48409.766116] usb 1-6.3: reset full-speed USB device number 4 using xhci_hcd
[48409.926124] usb 1-6.1: reset high-speed USB device number 3 using xhci_hcd
[48410.024890] OOM killer enabled.
[48410.024891] Restarting tasks ...
[48410.026350] Bluetooth: hci0: RTL: examining hci_ver=0a hci_rev=000b lmp_ver=0a lmp_subver=8761
[48410.027391] Bluetooth: hci0: RTL: rom_version status=0 version=1
[48410.027395] Bluetooth: hci0: RTL: loading rtl_bt/rtl8761bu_fw.bin
[48410.027414] Bluetooth: hci0: RTL: loading rtl_bt/rtl8761bu_config.bin
[48410.027427] Bluetooth: hci0: RTL: cfg_sz 6, total sz 27814
[48410.031826] done.
[48410.031838] random: crng reseeded on system resumption
[48410.032120] PM: suspend exit
[48410.167332] Bluetooth: hci0: RTL: fw version 0x09a98a6b
[48410.244610] Bluetooth: MGMT ver 1.22
[48410.247580] Bluetooth: hci0: Bad flag given (0x1) vs supported (0x0)

* link to upstream bug report, if any

Steps to reproduce:

Allow the system to sleep, maybe unplug a USB device, wait.
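
To check whether a given crash matches the log above, where the driver error lands between `PM: suspend entry` and `PM: suspend exit`, the kernel log can be filtered for `*ERROR*` lines inside a suspend window. The following is a minimal sketch and not part of the original report: the function name `errors_during_suspend` is hypothetical, and it assumes the default `[seconds.microseconds]` dmesg timestamp format shown in the log above.

```python
import re

# Matches the "[seconds.micros] message" shape of the dmesg lines quoted above.
LINE_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s+(.*)$")


def errors_during_suspend(log_text):
    """Return (timestamp, message) pairs for *ERROR* lines inside a suspend window.

    A window opens at "PM: suspend entry" and closes at "PM: suspend exit".
    (Hypothetical helper for illustration; not an NVIDIA or Arch tool.)
    """
    in_suspend = False
    hits = []
    for raw in log_text.splitlines():
        match = LINE_RE.match(raw.strip())
        if not match:
            continue  # skip anything that is not a timestamped kernel line
        stamp, message = float(match.group(1)), match.group(2)
        if message.startswith("PM: suspend entry"):
            in_suspend = True
        elif message.startswith("PM: suspend exit"):
            in_suspend = False
        elif in_suspend and "*ERROR*" in message:
            hits.append((stamp, message))
    return hits


# Three lines copied from the log in this report.
SAMPLE = "\n".join([
    "[48404.651964] PM: suspend entry (deep)",
    "[48408.603546] [drm:__nv_drm_gem_map_nvkms_memory_offset [nvidia_drm]] "
    "*ERROR* [nvidia-drm] [GPU ID 0x00000900] Failed to map NvKmsKapiMemory "
    "0x0000000032cd8710",
    "[48410.032120] PM: suspend exit",
])

for stamp, message in errors_during_suspend(SAMPLE):
    print(f"{stamp:.6f} {message}")
```

On a real system the same function could be fed the saved output of `dmesg` instead of `SAMPLE`; reading the kernel log may require root, depending on configuration.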

Closed by Toolybird (Toolybird) - Friday, 28 April 2023, 20:26 GMT
Reason for closing: Upstream
Additional comments about closing: If still happening, please report upstream.

Comment by Toolybird (Toolybird) - Tuesday, 28 March 2023, 21:29 GMT
A few hits show up when searching for that error online, e.g. [1]. You'll have to report this upstream to Nvidia because this is *clearly* an upstream issue. Please let us know what you find out.

[1] https://forums.developer.nvidia.com/t/495-44-failed-to-map-nvkmskapimemory/194081
