FS#74886 - [nvidia] traps: Missing ENDBR with Linux 5.18.0-arch1
Attached to Project:
Arch Linux
Opened by sven (commonuser) - Saturday, 28 May 2022, 20:29 GMT
Last edited by Sven-Hendrik Haase (Svenstaro) - Monday, 06 June 2022, 15:37 GMT
Details
Description:
Module loading fails with kernel error "Missing ENDBR".

Additional info:
May 28 22:17:38 kernel: nvidia: loading out-of-tree module taints kernel.
May 28 22:17:38 kernel: nvidia: module license 'NVIDIA' taints kernel.
May 28 22:17:38 kernel: Disabling lock debugging due to kernel taint
May 28 22:17:38 kernel: nvidia: module verification failed: signature and/or required key missing - tainting kernel
May 28 22:17:38 kernel: nvidia-nvlink: Nvlink Core is being initialized, major device number 507
May 28 22:17:38 kernel:
May 28 22:17:38 kernel: traps: Missing ENDBR: _nv011430rm+0x0/0x10 [nvidia]
May 28 22:17:38 kernel: ------------[ cut here ]------------
May 28 22:17:38 kernel: kernel BUG at arch/x86/kernel/traps.c:252!
May 28 22:17:38 kernel: invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
May 28 22:17:38 kernel: CPU: 13 PID: 528 Comm: systemd-modules Tainted: P OE 5.18.0-arch1-1 #1 b71a70fe104889aac2f32556bc52f649da2881d2
May 28 22:17:38 kernel: Hardware name: Dell Inc. XPS 15 9510/01V4T3, BIOS 1.9.0 03/17/2022
May 28 22:17:38 kernel: RIP: 0010:exc_control_protection+0xc2/0xd0
May 28 22:17:38 kernel: Code: 8b 93 80 00 00 00 be f9 00 00 00 48 c7 c7 d3 ab e6 9f e8 d1 01 50 ff e9 72 ff ff ff 48 c7 c7 ba ab e6 9f e8 c7 31 fb ff 0f 0b <0f> 0b 66 66 2e 0f 1f 84 00 00 00 00 00 90 66 0f 1f 00 55 53 48 89
May 28 22:17:38 kernel: RSP: 0018:ffffb7f280ef7b48 EFLAGS: 00010002
May 28 22:17:38 kernel: RAX: 0000000000000033 RBX: ffffb7f280ef7b68 RCX: 0000000000000027
May 28 22:17:38 kernel: RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff93176f7616a0
May 28 22:17:38 kernel: RBP: 0000000000000003 R08: 0000000000000000 R09: ffffb7f280ef7968
May 28 22:17:38 kernel: R10: 0000000000000003 R11: ffffffffa06caa08 R12: 0000000000000000
May 28 22:17:38 kernel: R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
May 28 22:17:38 kernel: FS: 00007faac0cf8380(0000) GS:ffff93176f740000(0000) knlGS:0000000000000000
May 28 22:17:38 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
May 28 22:17:38 kernel: CR2: 0000564ced0d4000 CR3: 0000000109ef0006 CR4: 0000000000f70ee0
May 28 22:17:38 kernel: PKRU: 55555554
May 28 22:17:38 kernel: Call Trace:
May 28 22:17:38 kernel: <TASK>
May 28 22:17:38 kernel: asm_exc_control_protection+0x22/0x30
May 28 22:17:38 kernel: RIP: 0010:_nv011430rm+0x0/0x10 [nvidia]
May 28 22:17:38 kernel: Code: 66 2e 0f 1f 84 00 00 00 00 00 48 83 ec 08 e8 07 0f 1e 00 48 83 c4 08 48 89 c7 e9 bb ff ff ff 66 2e 0f 1f 84 00 00 00 00 00 90 <48> 89 f7 e9 18 08 00 00 0f 1f 84 00 00 00 00 00 48 89 f7 e9 18 08
May 28 22:17:38 kernel: RSP: 0018:ffffb7f280ef7c10 EFLAGS: 00010202
May 28 22:17:38 kernel: RAX: ffffffffc25e90e0 RBX: ffffffffc46e2b10 RCX: 0000000000000000
May 28 22:17:38 kernel: RDX: 0000000000043187 RSI: 0000000000000010 RDI: ffffffffc46e2b10
May 28 22:17:38 kernel: RBP: ffff931042b6dfe0 R08: 0000000000000020 R09: ffffffffc46e2b50
May 28 22:17:38 kernel: R10: 0000000000039688 R11: ffff93178f7fa000 R12: 0000000000000010
May 28 22:17:38 kernel: R13: ffff931042b6b000 R14: 00007faac158332c R15: ffffb7f280ef7d80
May 28 22:17:38 kernel: ? _nv034888rm+0x20/0x20 [nvidia 41a8e80d4727066c67f87d1723f6a7740a16e698]
May 28 22:17:38 kernel: _nv011428rm+0x24/0xe0 [nvidia 41a8e80d4727066c67f87d1723f6a7740a16e698]
May 28 22:17:38 kernel: _nv034889rm+0xe/0xa0 [nvidia 41a8e80d4727066c67f87d1723f6a7740a16e698]
May 28 22:17:38 kernel: _nv034892rm+0x1d/0x30 [nvidia 41a8e80d4727066c67f87d1723f6a7740a16e698]
May 28 22:17:38 kernel: _nv034894rm+0x2f/0x40 [nvidia 41a8e80d4727066c67f87d1723f6a7740a16e698]
May 28 22:17:38 kernel: _nv015562rm+0x15/0x70 [nvidia 41a8e80d4727066c67f87d1723f6a7740a16e698]
May 28 22:17:38 kernel: _nv000644rm+0x9/0x20 [nvidia 41a8e80d4727066c67f87d1723f6a7740a16e698]
May 28 22:17:38 kernel: ? cdev_add+0x4d/0x60
May 28 22:17:38 kernel: rm_init_rm+0x17/0x60 [nvidia 41a8e80d4727066c67f87d1723f6a7740a16e698]
May 28 22:17:38 kernel: nvidia_init_module+0x22e/0x5b0 [nvidia 41a8e80d4727066c67f87d1723f6a7740a16e698]
May 28 22:17:38 kernel: ? nvidia_init_module+0x5b0/0x5b0 [nvidia 41a8e80d4727066c67f87d1723f6a7740a16e698]
May 28 22:17:38 kernel: nvidia_frontend_init_module+0x50/0x91 [nvidia 41a8e80d4727066c67f87d1723f6a7740a16e698]
May 28 22:17:38 kernel: ? nvidia_init_module+0x5b0/0x5b0 [nvidia 41a8e80d4727066c67f87d1723f6a7740a16e698]
May 28 22:17:38 kernel: do_one_initcall+0x5a/0x220
May 28 22:17:38 kernel: do_init_module+0x4a/0x240
May 28 22:17:38 kernel: __do_sys_init_module+0x138/0x1b0
May 28 22:17:38 kernel: do_syscall_64+0x5c/0x90
May 28 22:17:38 kernel: ? syscall_exit_to_user_mode+0x26/0x50
May 28 22:17:38 kernel: ? do_syscall_64+0x6b/0x90
May 28 22:17:38 kernel: ? handle_mm_fault+0xb2/0x280
May 28 22:17:38 kernel: ? do_user_addr_fault+0x1db/0x680
May 28 22:17:38 kernel: ? do_syscall_64+0x6b/0x90
May 28 22:17:38 kernel: ? exc_page_fault+0x74/0x170
May 28 22:17:38 kernel: entry_SYSCALL_64_after_hwframe+0x44/0xae
May 28 22:17:38 kernel: iwlwifi 0000:00:14.3 wlp0s20f3: renamed from wlan0
May 28 22:17:38 kernel: RIP: 0033:0x7faac0f12c3e
May 28 22:17:38 kernel: Code: 48 8b 0d 5d b1 0e 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 49 89 ca b8 af 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 2a b1 0e 00 f7 d8 64 89 01 48
May 28 22:17:38 kernel: RSP: 002b:00007fff1f730d08 EFLAGS: 00000246 ORIG_RAX: 00000000000000af
May 28 22:17:38 kernel: RAX: ffffffffffffffda RBX: 000055cd28f8eb10 RCX: 00007faac0f12c3e
May 28 22:17:38 kernel: RDX: 00007faac158332c RSI: 0000000003bb4e40 RDI: 00007faaba54c010
May 28 22:17:38 kernel: RBP: 00007faaba54c010 R08: 000055cd28f8ea10 R09: 0000000000000000
May 28 22:17:38 kernel: R10: 0000000000000005 R11: 0000000000000246 R12: 00007faac158332c
May 28 22:17:38 kernel: R13: 000055cd28f8ecc0 R14: 000055cd28f8e7c0 R15: 000055cd28f938c0
May 28 22:17:38 kernel: </TASK>
May 28 22:17:38 kernel: Modules linked in: mousedev snd_sof acpi_cpufreq(-) kfifo_buf snd_sof_utils hid_sensor_iio_common snd_soc_hdac_hda industrialio snd_hda_ext_core snd_ctl_led snd_soc_acpi_intel_match snd_soc_acpi snd_hda_codec_realtek(+) hid_sensor_hub soundwire_bus nvidia(POE+) intel_ishtp_hid hid_mul>
May 28 22:17:38 kernel: intel_lpss_pci processor_thermal_device psmouse videobuf2_common intel_lpss btbcm pcspkr snd i2c_i801 spi_intel_pci processor_thermal_rfim btmtk btintel spi_intel i2c_smbus soundcore i915 mei_me idma64 cfg80211 videodev processor_thermal_mbox drm_buddy ucsi_acpi mc bluetooth processo>
May 28 22:17:38 kernel: vivaldi_fmap crc32_pclmul crc32c_intel ghash_clmulni_intel tpm_tis nvme aesni_intel tpm_tis_core crypto_simd tpm xhci_pci cryptd nvme_core rng_core rtsx_pci xhci_pci_renesas i8042 serio
May 28 22:17:38 kernel: ---[ end trace 0000000000000000 ]---
May 28 22:17:38 kernel: RIP: 0010:exc_control_protection+0xc2/0xd0
May 28 22:17:38 kernel: Code: 8b 93 80 00 00 00 be f9 00 00 00 48 c7 c7 d3 ab e6 9f e8 d1 01 50 ff e9 72 ff ff ff 48 c7 c7 ba ab e6 9f e8 c7 31 fb ff 0f 0b <0f> 0b 66 66 2e 0f 1f 84 00 00 00 00 00 90 66 0f 1f 00 55 53 48 89
May 28 22:17:38 kernel: RSP: 0018:ffffb7f280ef7b48 EFLAGS: 00010002
May 28 22:17:38 kernel: RAX: 0000000000000033 RBX: ffffb7f280ef7b68 RCX: 0000000000000027
May 28 22:17:38 kernel: RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff93176f7616a0
May 28 22:17:38 kernel: RBP: 0000000000000003 R08: 0000000000000000 R09: ffffb7f280ef7968
May 28 22:17:38 kernel: R10: 0000000000000003 R11: ffffffffa06caa08 R12: 0000000000000000
May 28 22:17:38 kernel: R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
May 28 22:17:38 kernel: FS: 00007faac0cf8380(0000) GS:ffff93176f740000(0000) knlGS:0000000000000000
May 28 22:17:38 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
May 28 22:17:38 kernel: CR2: 0000564ced0d4000 CR3: 0000000109ef0006 CR4: 0000000000f70ee0
May 28 22:17:38 kernel: PKRU: 55555554
May 28 22:17:38 systemd[1]: systemd-modules-load.service: Main process exited, code=killed, status=11/SEGV
May 28 22:17:38 systemd[1]: systemd-modules-load.service: Failed with result 'signal'.
May 28 22:17:38 systemd[1]: Failed to start Load Kernel Modules.

Steps to reproduce:
Reboot with the current linux and nvidia packages.
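To make the trace above easier to read: with IBT enabled, the target of an indirect call is expected to begin with an ENDBR64 instruction (bytes f3 0f 1e fa). The sketch below is only an illustration, not part of the original report. It takes the "Code:" line for _nv011430rm from the log, where the byte at RIP is wrapped in <..>, and checks whether the bytes at RIP start with ENDBR64. The helper names are made up for this example.

```python
import re

ENDBR64 = bytes.fromhex("f30f1efa")  # machine encoding of the endbr64 instruction


def bytes_at_rip(code_line: str) -> bytes:
    """Return the bytes starting at RIP from a kernel oops "Code:" line.

    The oops format wraps the byte at the faulting RIP in <..>.
    (Hypothetical helper, written for this illustration only.)
    """
    tokens = code_line.split("Code:", 1)[-1].split()
    for i, tok in enumerate(tokens):
        if tok.startswith("<") and tok.endswith(">"):
            rest = [tok.strip("<>")] + tokens[i + 1:]
            # Keep only well-formed hex byte tokens (skips truncation markers).
            rest = [t for t in rest if re.fullmatch(r"[0-9a-fA-F]{2}", t)]
            return bytes.fromhex("".join(rest))
    raise ValueError("no <..> marker found in Code: line")


def starts_with_endbr64(code_line: str) -> bool:
    """True if the instruction at RIP is endbr64."""
    return bytes_at_rip(code_line).startswith(ENDBR64)


# "Code:" line for _nv011430rm+0x0, copied from the log above.
nvidia_code = (
    "Code: 66 2e 0f 1f 84 00 00 00 00 00 48 83 ec 08 e8 07 0f 1e 00 48 83 c4 "
    "08 48 89 c7 e9 bb ff ff ff 66 2e 0f 1f 84 00 00 00 00 00 90 <48> 89 f7 "
    "e9 18 08 00 00 0f 1f 84 00 00 00 00 00 48 89 f7 e9 18 08"
)

print(starts_with_endbr64(nvidia_code))  # False: the function starts with 48 89 f7
```

The function entry starts with 48 89 f7 rather than f3 0f 1e fa, which is consistent with the "Missing ENDBR" message.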
Mai 28 23:07:00 arch-precission kernel: nvidia-nvlink: Nvlink Core is being initialized, major device number 509
Mai 28 23:07:00 arch-precission kernel:
Mai 28 23:07:00 arch-precission kernel: traps: Missing ENDBR: _nv011430rm+0x0/0x10 [nvidia]
Mai 28 23:07:00 arch-precission kernel: ------------[ cut here ]------------
Mai 28 23:07:00 arch-precission kernel: kernel BUG at arch/x86/kernel/traps.c:252!
Mai 28 23:07:00 arch-precission kernel: invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
Mai 28 23:07:00 arch-precission kernel: CPU: 0 PID: 345 Comm: systemd-modules Tainted: P OE 5.18.0-arch1-1 #1 b71a70fe1048>
Mai 28 23:07:00 arch-precission kernel: usb 5-2.3.3: New USB device found, idVendor=2109, idProduct=2813, bcdDevice=90.01
Mai 28 23:07:00 arch-precission kernel: Hardware name: Dell Inc. Precision 5760/04NVXT, BIOS 1.6.0 12/10/2021
Mai 28 23:07:00 arch-precission kernel: RIP: 0010:exc_control_protection+0xc2/0xd0
Mai 28 23:07:00 arch-precission kernel: Code: 8b 93 80 00 00 00 be f9 00 00 00 48 c7 c7 d3 ab 66 b0 e8 d1 01 50 ff e9 72 ff ff ff 48 c7 >
Mai 28 23:07:00 arch-precission kernel: usb 5-2.3.3: New USB device strings: Mfr=1, Product=2, SerialNumber=0
Mai 28 23:07:00 arch-precission kernel: RSP: 0018:ffffbb0e4110fc28 EFLAGS: 00010002
Mai 28 23:07:00 arch-precission kernel: RAX: 0000000000000033 RBX: ffffbb0e4110fc48 RCX: 0000000000000027
Mai 28 23:07:00 arch-precission kernel: RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff9eb30f4216a0
Mai 28 23:07:00 arch-precission kernel: RBP: 0000000000000003 R08: 0000000000000000 R09: ffffbb0e4110fa48
Mai 28 23:07:00 arch-precission kernel: R10: 0000000000000003 R11: ffffffffb0ecaa08 R12: 0000000000000000
Mai 28 23:07:00 arch-precission kernel: R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
Mai 28 23:07:00 arch-precission kernel: FS: 00007f8895148380(0000) GS:ffff9eb30f400000(0000) knlGS:0000000000000000
Mai 28 23:07:00 arch-precission kernel: usb 5-2.3.3: Product: USB2.0 Hub
Mai 28 23:07:00 arch-precission kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Mai 28 23:07:00 arch-precission kernel: CR2: 000055a27fec80c0 CR3: 0000000105518003 CR4: 0000000000f70ef0
Mai 28 23:07:00 arch-precission kernel: PKRU: 55555554
Mai 28 23:07:00 arch-precission kernel: Call Trace:
Mai 28 23:07:00 arch-precission kernel: <TASK>
Mai 28 23:07:00 arch-precission kernel: usb 5-2.3.3: Manufacturer: VIA Labs, Inc.
Mai 28 23:07:00 arch-precission kernel: asm_exc_control_protection+0x22/0x30
Mai 28 23:07:00 arch-precission kernel: RIP: 0010:_nv011430rm+0x0/0x10 [nvidia]
Edit:
https://github.com/NVIDIA/open-gpu-kernel-modules/issues/256
Edit2:
Based on open-gpu-kernel-modules, hopefully the fix is as simple as this: -fcf-protection=none is set in two Makefiles. Switching it to -fcf-protection=branch and adding -mharden-sls=all (for straight-line speculation) at least makes objtool happy.
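As a rough illustration of the flag change described above (a sketch only, not the actual NVIDIA patch): the function below rewrites -fcf-protection=none to -fcf-protection=branch in Makefile text and appends -mharden-sls=all if it is not already there. The sample line and the CFLAGS variable name are hypothetical. This comment does not say which two Makefiles are affected or what they look like.

```python
def enable_ibt_flags(makefile_text: str) -> str:
    """Switch -fcf-protection=none to =branch and add -mharden-sls=all.

    Hypothetical helper for illustration; it works on plain text and does
    not know anything about the real Makefile layout.
    """
    patched = makefile_text.replace("-fcf-protection=none", "-fcf-protection=branch")
    if "-mharden-sls=all" not in patched:
        patched = patched.replace(
            "-fcf-protection=branch", "-fcf-protection=branch -mharden-sls=all"
        )
    return patched


# Hypothetical Makefile line; the variable name is an assumption.
sample = "CFLAGS += -O2 -fcf-protection=none\n"
print(enable_ibt_flags(sample), end="")
# prints: CFLAGS += -O2 -fcf-protection=branch -mharden-sls=all
```

Whether this is sufficient for the proprietary module is not established here; the comment above only reports that objtool stops complaining.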
With nvidia-open-515.48.07-2 and without ibt=off on the kernel command line:
-> All fine. Thank you all very much.
[1] https://github.com/NVIDIA/open-gpu-kernel-modules/issues/256#issuecomment-1141350315
[2] https://github.com/NVIDIA/open-gpu-kernel-modules/issues/256#issuecomment-1142294080