FS#72957 - Kernel Oops after upgrading from 5.14.11-arch1-1 to 5.15.6.arch2-1

Attached to Project: Arch Linux
Opened by Jann Foehringer (JannF) - Wednesday, 08 December 2021, 20:10 GMT
Last edited by Sven-Hendrik Haase (Svenstaro) - Sunday, 20 February 2022, 04:02 GMT
Task Type Bug Report
Category Kernel
Status Closed
Assigned To Jan Alexander Steffens (heftig)
Architecture x86_64
Severity High
Priority Normal
Reported Version
Due in Version Undecided
Due Date Undecided
Percent Complete 100%
Votes 0
Private No

Details

Description:

After upgrading from 5.14.11-arch1-1 to 5.15.6.arch2-1, I get a kernel oops on every boot.
Trying to shut down the machine results in a system hang, and it is impossible to unmount filesystems or reboot the machine using sysrq.
Syncing the disk using sysrq still works, but one is forced to hard power off the machine, resulting in a dirty journal.


Additional info:

Machine is a CTL Chromebox running Mr.Chromebox's coreboot 4.14 build and booting using EDK2/Tianocore UEFI.
Full dmesg of 5.14 and 5.15 is shareable on request.


Steps to reproduce:

Simply boot the system and the following appears in dmesg:

[ 12.678702] BUG: kernel NULL pointer dereference, address: 000000000000030c
[ 12.678811] #PF: supervisor read access in kernel mode
[ 12.678891] #PF: error_code(0x0000) - not-present page
[ 12.678970] PGD 0 P4D 0
[ 12.679013] Oops: 0000 [#1] PREEMPT SMP PTI
[ 12.679074] CPU: 3 PID: 527 Comm: systemd-udevd Not tainted 5.15.6-arch2-1 #1 cfba5f24b926d50e4fcc5026b2bafd12217f3134
[ 12.679236] Hardware name: Google Wukong/Wukong, BIOS MrChromebox-4.14 07/25/2021
[ 12.679341] RIP: 0010:cros_ec_check_features+0xc/0xf0
[ 12.679420] Code: 01 e4 eb 92 41 bc f4 ff ff ff eb 8a e8 9d 86 29 00 66 66 2e 0f 1f 84 00 00 00 00 00 66 90 0f 1f 44 00 00 41 54 55 48 89 fd 53 <83> bf 0c 03 00 00 ff 89 f3 74 32 85 db 8d 53 1f 89 d9 b8 01 00 00
[ 12.679690] RSP: 0018:ffffab3d008eba48 EFLAGS: 00010246
[ 12.679770] RAX: ffff9049c3c26c00 RBX: ffff9049c1a36800 RCX: 0000000000000001
[ 12.679871] RDX: 0000000000000000 RSI: 0000000000000029 RDI: 0000000000000000
[ 12.679989] RBP: 0000000000000000 R08: 0000000000000004 R09: 0000000000000000
[ 12.680086] R10: 0000000000000000 R11: 0000000000000000 R12: ffff9049c445ab68
[ 12.680190] R13: ffff9049c1a36890 R14: 0000000000000000 R15: ffffffffc04475c0
[ 12.680292] FS: 00007f3e85627a40(0000) GS:ffff904f26cc0000(0000) knlGS:0000000000000000
[ 12.680378] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 12.680426] CR2: 000000000000030c CR3: 000000014032c003 CR4: 00000000003706e0
[ 12.680483] Call Trace:
[ 12.680504] <TASK>
[ 12.680523] cros_typec_probe+0xe3/0x5b0 [cros_ec_typec 379355f85101ac1eaaa59b885eb9df7616af88ab]
[ 12.680594] ? device_pm_check_callbacks+0x31/0xf0
[ 12.680635] platform_probe+0x3f/0xa0
[ 12.680666] really_probe+0x203/0x400
[ 12.680709] __driver_probe_device+0x112/0x190
[ 12.680745] driver_probe_device+0x1e/0x90
[ 12.680777] __driver_attach+0xc8/0x1e0
[ 12.680807] ? __device_attach_driver+0xf0/0xf0
[ 12.680844] ? __device_attach_driver+0xf0/0xf0
[ 12.680880] bus_for_each_dev+0x8d/0xe0
[ 12.680913] bus_add_driver+0x136/0x1f0
[ 12.680945] driver_register+0x8f/0xf0
[ 12.680983] ? 0xffffffffc044b000
[ 12.681010] do_one_initcall+0x57/0x220
[ 12.681051] do_init_module+0x5c/0x270
[ 12.681131] load_module+0x25de/0x27e0
[ 12.681210] ? __do_sys_init_module+0x12e/0x1b0
[ 12.681295] __do_sys_init_module+0x12e/0x1b0
[ 12.681375] do_syscall_64+0x5c/0x90
[ 12.684165] ? do_user_addr_fault+0x20b/0x6b0
[ 12.687534] ? syscall_exit_to_user_mode+0x23/0x50
[ 12.690808] ? exc_page_fault+0x72/0x180
[ 12.694016] entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 12.696266] RIP: 0033:0x7f3e85f5f32e
[ 12.699135] Code: 48 8b 0d 45 0b 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 49 89 ca b8 af 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 12 0b 0c 00 f7 d8 64 89 01 48
[ 12.702535] RSP: 002b:00007fff64cc5198 EFLAGS: 00000246 ORIG_RAX: 00000000000000af
[ 12.704050] RAX: ffffffffffffffda RBX: 000056166291da30 RCX: 00007f3e85f5f32e
[ 12.705574] RDX: 00007f3e860b6a9d RSI: 000000000000d778 RDI: 0000561662a32e30
[ 12.707082] RBP: 0000561662a32e30 R08: 000056166292ba40 R09: 0000000000000000
[ 12.708688] R10: 000056166298aa80 R11: 0000000000000246 R12: 00007f3e860b6a9d
[ 12.710124] R13: 000056166291da30 R14: 000056166291d5e0 R15: 000056166291da30
[ 12.711519] </TASK>
[ 12.711520] Modules linked in: ac97_bus(+) pcc_cpufreq(-) processor_thermal_mbox cros_ec_typec(+) video intel_uncore r8169(+) snd_pcm_dmaengine cros_usbpd_notify btrtl processor_thermal_rapl snd_hwdep btbcm snd_pcm i2c_i801 pcspkr btintel realtek typec i2c_smbus snd_timer ttm mdio_devres tpm_tis_spi intel_rapl_common cfg80211 bluetooth intel_xhci_usb_role_switch acpi_als joydev mousedev intel_soc_dts_iosf snd libphy int3403_thermal roles industrialio_triggered_buffer intel_lpss_pci int340x_thermal_zone cros_ec_lpcs ecdh_generic intel_lpss kfifo_buf intel_gtt intel_pch_thermal tpm_tis_core rfkill idma64 cros_ec soundcore industrialio int3400_thermal acpi_thermal_rel coreboot_table mac_hid ipmi_devintf ipmi_msghandler sg crypto_user fuse bpf_preload ip_tables x_tables ext4 crc32c_generic crc16 mbcache jbd2 hid_semitek usbhid dm_crypt cbc encrypted_keys dm_mod trusted asn1_encoder tee tpm rng_core crct10dif_pclmul crc32_pclmul crc32c_intel sdhci_pci ghash_clmulni_intel cqhci aesni_intel
[ 12.713985] sdhci xhci_pci crypto_simd cryptd mmc_core xhci_pci_renesas
[ 12.726324] CR2: 000000000000030c
[ 12.726325] ---[ end trace 60a2102d16e30b02 ]---
[ 12.726326] RIP: 0010:cros_ec_check_features+0xc/0xf0
[ 12.726332] Code: 01 e4 eb 92 41 bc f4 ff ff ff eb 8a e8 9d 86 29 00 66 66 2e 0f 1f 84 00 00 00 00 00 66 90 0f 1f 44 00 00 41 54 55 48 89 fd 53 <83> bf 0c 03 00 00 ff 89 f3 74 32 85 db 8d 53 1f 89 d9 b8 01 00 00
[ 12.726333] RSP: 0018:ffffab3d008eba48 EFLAGS: 00010246
[ 12.726335] RAX: ffff9049c3c26c00 RBX: ffff9049c1a36800 RCX: 0000000000000001
[ 12.726336] RDX: 0000000000000000 RSI: 0000000000000029 RDI: 0000000000000000
[ 12.726337] RBP: 0000000000000000 R08: 0000000000000004 R09: 0000000000000000
[ 12.726337] R10: 0000000000000000 R11: 0000000000000000 R12: ffff9049c445ab68
[ 12.726338] R13: ffff9049c1a36890 R14: 0000000000000000 R15: ffffffffc04475c0
[ 12.726339] FS: 00007f3e85627a40(0000) GS:ffff904f26cc0000(0000) knlGS:0000000000000000
[ 12.726340] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 12.726341] CR2: 000000000000030c CR3: 000000014032c003 CR4: 00000000003706e0
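
A rough sanity check of the trace above, added as an illustration rather than as part of the original report: the byte marked `<83>` in the Code: line starts the faulting instruction, and the bytes `83 bf 0c 03 00 00 ff` decode as `cmp dword ptr [rdi+0x30c], -1`. With RDI = 0 this reads address 0x30c, which matches CR2, so the first argument passed to cros_ec_check_features appears to have been a NULL pointer. The small decoder below is a sketch that handles only this one encoding; the helper name `decode_faulting_cmp` is hypothetical.

```python
# Values copied from the oops above.
CODE_LINE = (
    "01 e4 eb 92 41 bc f4 ff ff ff eb 8a e8 9d 86 29 00 66 66 2e 0f 1f 84 "
    "00 00 00 00 00 66 90 0f 1f 44 00 00 41 54 55 48 89 fd 53 <83> bf 0c 03 "
    "00 00 ff 89 f3 74 32 85 db 8d 53 1f 89 d9 b8 01 00 00"
)
RDI = 0x0000000000000000
CR2 = 0x000000000000030C


def decode_faulting_cmp(code_line):
    """Decode `cmp dword [rdi+disp32], imm8` at the <..>-marked byte.

    Only this single x86-64 encoding (opcode 0x83 /7, ModRM mod=10 rm=111)
    is handled; anything else raises ValueError.
    """
    tokens = code_line.split()
    start = next(i for i, t in enumerate(tokens) if t.startswith("<"))
    raw = [int(t.strip("<>"), 16) for t in tokens[start:start + 7]]

    opcode, modrm = raw[0], raw[1]
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if opcode != 0x83 or mod != 0b10 or reg != 7 or rm != 7:
        raise ValueError("not the cmp [rdi+disp32], imm8 encoding")

    # disp32 is little-endian; imm8 is sign-extended by the CPU.
    disp = int.from_bytes(bytes(raw[2:6]), "little", signed=True)
    imm = int.from_bytes(bytes(raw[6:7]), "little", signed=True)
    return disp, imm


disp, imm = decode_faulting_cmp(CODE_LINE)
fault_addr = RDI + disp
print(f"cmp dword [rdi+{disp:#x}], {imm}")
print(f"computed fault address {fault_addr:#x}, CR2 {CR2:#x}")
```

For a full disassembly, the kernel tree's scripts/decodecode can be fed the Code: line instead of decoding it by hand.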


Closed by  Sven-Hendrik Haase (Svenstaro)
Sunday, 20 February 2022, 04:02 GMT
Reason for closing:  Upstream
Additional comments about closing:  This appears to be an upstream issue and should be fixed as such unless there's an easy patch. However, Linux releases often and I think this thing should definitely be fixed upstream.
Comment by Jann Foehringer (JannF) - Wednesday, 08 December 2021, 21:06 GMT
I dug as far as I could, and this seems to have been introduced by this commit: https://github.com/torvalds/linux/commit/a8db7a3f8ac69e558c7bfbd04802201c39a104ad

So it may or may not be an Arch issue, and I have also filed a bug report upstream: https://bugzilla.kernel.org/show_bug.cgi?id=215269
