FS#80230 - [linux] `sudo chown -R root:tracing /sys/kernel/debug/tracing/` triggers kernel BUG
Attached to Project: Arch Linux
Opened by Milian Wolff (milianw) - Saturday, 11 November 2023, 10:04 GMT
Last edited by Toolybird (Toolybird) - Wednesday, 15 November 2023, 21:20 GMT
Details
Description:
I can reliably trigger this kernel crash now with
```
$ uname -a
Linux agathemoarbauer 6.6.1-arch1-1 #1 SMP PREEMPT_DYNAMIC Wed, 08 Nov 2023 16:05:38 +0000 x86_64 GNU/Linux
```
All I need to do is run:
```
sudo sysctl -w kernel.yama.ptrace_scope=0
sudo mount -o remount,mode=755 /sys/kernel/debug
sudo mount -o remount,mode=755 /sys/kernel/debug/tracing
sudo mount -o remount,mode=755 /sys/kernel/tracing
sudo chown -R root:tracing /sys/kernel/debug/tracing/
2828 Killed sudo chown -R root:tracing /sys/kernel/debug/tracing/
```
dmesg shows:
```
[ 60.723813] BUG: kernel NULL pointer dereference, address: 0000000000000058
[ 60.723817] #PF: supervisor read access in kernel mode
[ 60.723819] #PF: error_code(0x0000) - not-present page
[ 60.723820] PGD 0 P4D 0
[ 60.723821] Oops: 0000 [#1] PREEMPT SMP NOPTI
[ 60.723823] CPU: 5 PID: 2830 Comm: chown Tainted: P OE 6.6.1-arch1-1 #1 be166a630cd909acf8820643140e9106c6ea80e6
[ 60.723825] Hardware name: LENOVO 20Y30018GE/20Y30018GE, BIOS N40ET42W (1.24 ) 07/26/2023
[ 60.723826] RIP: 0010:eventfs_set_attr+0x28/0xd0
[ 60.723830] Code: 90 90 f3 0f 1e fa 0f 1f 44 00 00 41 55 41 54 49 89 d4 55 48 89 fd 48 c7 c7 a0 b7 b2 b2 53 48 89 f3 e8 8c 16 8c 00 4c 8b 6b 78 <41> f6 45 58 01 0f 85 87 00 00 00 48 89 de 4c 89 e2 48 89 ef e8 4f
[ 60.723830] RSP: 0018:ffffc90007fdbd38 EFLAGS: 00010246
[ 60.723832] RAX: 0000000000000000 RBX: ffff88810047e180 RCX: 8000000000000000
[ 60.723832] RDX: ffff888174b6ce00 RSI: ffff88810047e180 RDI: ffffffffb2b2b7a0
[ 60.723833] RBP: ffffffffb2b20620 R08: 0000000000000000 R09: ffffffffb2a4a488
[ 60.723834] R10: 00000000654f505f R11: 0000000018432441 R12: ffffc90007fdbe00
[ 60.723835] R13: 0000000000000000 R14: ffff88810047e180 R15: ffff8881004c02a8
[ 60.723835] FS: 00007f4ac6f4c740(0000) GS:ffff88901f540000(0000) knlGS:0000000000000000
[ 60.723836] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 60.723837] CR2: 0000000000000058 CR3: 000000021ad48001 CR4: 0000000000f70ee0
[ 60.723838] PKRU: 55555554
[ 60.723839] Call Trace:
[ 60.723840] <TASK>
[ 60.723842] ? __die+0x23/0x70
[ 60.723845] ? page_fault_oops+0x171/0x4e0
[ 60.723847] ? generic_permission+0x39/0x220
[ 60.723850] ? exc_page_fault+0x7f/0x180
[ 60.723853] ? asm_exc_page_fault+0x26/0x30
[ 60.723857] ? eventfs_set_attr+0x28/0xd0
[ 60.723858] ? eventfs_set_attr+0x24/0xd0
[ 60.723859] notify_change+0x1f2/0x4b0
[ 60.723862] ? chown_common+0x222/0x230
[ 60.723863] chown_common+0x222/0x230
[ 60.723865] do_fchownat+0xa3/0x100
[ 60.723866] __x64_sys_fchownat+0x1f/0x30
[ 60.723867] do_syscall_64+0x5d/0x90
[ 60.723869] ? syscall_exit_to_user_mode+0x2b/0x40
[ 60.723871] ? do_syscall_64+0x6c/0x90
[ 60.723872] ? do_syscall_64+0x6c/0x90
[ 60.723874] ? do_syscall_64+0x6c/0x90
[ 60.723875] entry_SYSCALL_64_after_hwframe+0x6e/0xd8
[ 60.723877] RIP: 0033:0x7f4ac704e2ce
[ 60.723904] Code: 48 8b 0d 65 8a 0d 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 49 89 ca b8 04 01 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 32 8a 0d 00 f7 d8 64 89 01 48
[ 60.723905] RSP: 002b:00007ffed0b735b8 EFLAGS: 00000246 ORIG_RAX: 0000000000000104
[ 60.723906] RAX: ffffffffffffffda RBX: 00005625366cfd80 RCX: 00007f4ac704e2ce
[ 60.723907] RDX: 0000000000000000 RSI: 00005625366cfe10 RDI: 0000000000000004
[ 60.723908] RBP: 0000000000000001 R08: 0000000000000100 R09: 0000000000000007
[ 60.723908] R10: 00000000000003e9 R11: 0000000000000246 R12: 00005625366cb9c0
[ 60.723909] R13: 00005625366cfd10 R14: 00005625366cfe10 R15: 0000000000000000
[ 60.723910] </TASK>
[ 60.723910] Modules linked in: snd_seq_dummy snd_hrtimer snd_seq snd_seq_device rfcomm ccm cmac algif_hash algif_skcipher af_alg bnep nvidia_drm(POE) nvidia_modeset(POE) snd_ctl_led snd_soc_skl_hda_dsp snd_soc_intel_hda_dsp_common snd_soc_hdac_hdmi snd_sof_probes nvidia_uvm(POE) snd_hda_codec_realtek snd_hda_codec_generic snd_soc_dmic intel_tcc_cooling snd_sof_pci_intel_tgl x86_pkg_temp_thermal intel_powerclamp snd_sof_intel_hda_common soundwire_intel coretemp snd_sof_intel_hda_mlink nvidia(POE) soundwire_cadence snd_sof_intel_hda kvm_intel snd_sof_pci snd_sof_xtensa_dsp snd_sof kvm snd_sof_utils snd_soc_hdac_hda vfat snd_hda_ext_core fat snd_soc_acpi_intel_match irqbypass snd_soc_acpi iwlmvm crct10dif_pclmul joydev soundwire_generic_allocation mousedev soundwire_bus crc32_pclmul snd_hda_codec_hdmi snd_soc_core polyval_clmulni polyval_generic snd_compress gf128mul mac80211 ac97_bus ghash_clmulni_intel snd_pcm_dmaengine sha512_ssse3 btusb aesni_intel btrtl uvcvideo snd_hda_intel crypto_simd btintel videobuf2_vmalloc
[ 60.723940] snd_intel_dspcfg cryptd libarc4 btbcm uvc snd_intel_sdw_acpi btmtk videobuf2_memops iTCO_wdt hid_multitouch snd_hda_codec intel_pmc_bxt videobuf2_v4l2 mei_hdcp mei_wdt mei_pxp rapl ee1004 iTCO_vendor_support intel_rapl_msr snd_hda_core bluetooth processor_thermal_device_pci_legacy videodev processor_thermal_device iwlwifi processor_thermal_rfim spi_nor videobuf2_common snd_hwdep intel_cstate think_lmi processor_thermal_mbox intel_uncore psmouse pcspkr cfg80211 mc ecdh_generic mtd firmware_attributes_class wmi_bmof mei_me ucsi_acpi processor_thermal_rapl snd_pcm i2c_i801 intel_lpss_pci typec_ucsi intel_rapl_common i2c_smbus intel_lpss thunderbolt i2c_hid_acpi mei snd_timer typec intel_soc_dts_iosf idma64 i2c_hid int3403_thermal roles int340x_thermal_zone int3400_thermal acpi_tad acpi_pad acpi_thermal_rel mac_hid i2c_dev crypto_user fuse acpi_call(OE) dm_mod loop ip_tables x_tables ext4 crc32c_generic crc16 mbcache jbd2 i915 thinkpad_acpi rtsx_pci_sdmmc ledtrig_audio i2c_algo_bit mmc_core serio_raw
[ 60.723972] platform_profile drm_buddy ttm snd atkbd nvme intel_gtt libps2 vivaldi_fmap soundcore drm_display_helper nvme_core xhci_pci crc32c_intel spi_intel_pci rfkill rtsx_pci spi_intel xhci_pci_renesas cec nvme_common video i8042 serio wmi
[ 60.723982] CR2: 0000000000000058
[ 60.723983] ---[ end trace 0000000000000000 ]---
[ 60.723984] RIP: 0010:eventfs_set_attr+0x28/0xd0
[ 60.723985] Code: 90 90 f3 0f 1e fa 0f 1f 44 00 00 41 55 41 54 49 89 d4 55 48 89 fd 48 c7 c7 a0 b7 b2 b2 53 48 89 f3 e8 8c 16 8c 00 4c 8b 6b 78 <41> f6 45 58 01 0f 85 87 00 00 00 48 89 de 4c 89 e2 48 89 ef e8 4f
[ 60.723986] RSP: 0018:ffffc90007fdbd38 EFLAGS: 00010246
[ 60.723987] RAX: 0000000000000000 RBX: ffff88810047e180 RCX: 8000000000000000
[ 60.723987] RDX: ffff888174b6ce00 RSI: ffff88810047e180 RDI: ffffffffb2b2b7a0
[ 60.723988] RBP: ffffffffb2b20620 R08: 0000000000000000 R09: ffffffffb2a4a488
[ 60.723989] R10: 00000000654f505f R11: 0000000018432441 R12: ffffc90007fdbe00
[ 60.723989] R13: 0000000000000000 R14: ffff88810047e180 R15: ffff8881004c02a8
[ 60.723990] FS: 00007f4ac6f4c740(0000) GS:ffff88901f540000(0000) knlGS:0000000000000000
[ 60.723991] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 60.723991] CR2: 0000000000000058 CR3: 000000021ad48001 CR4: 0000000000f70ee0
[ 60.723992] PKRU: 55555554
[ 60.723992] note: chown[2830] exited with irqs disabled
```
Additional info:
* package version(s)
* config and/or log files etc.
* link to upstream bug report, if any

Steps to reproduce:
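For anyone comparing their own crash against this one, the two most useful fields in the dmesg output above are the kernel-mode `RIP` line (the faulting function) and `CR2` (the address that faulted). Below is a minimal sketch that pulls both out of an oops log on stdin. The helper name `oops_summary` is made up for illustration and is not an existing tool; it assumes the log uses the same `RIP: 0010:` and `CR2:` layout as the trace in this report.

```shell
#!/bin/sh
# Sketch only: oops_summary is a hypothetical helper, not part of any package.
# Reads a kernel oops log on stdin and prints the first kernel-mode RIP symbol
# and the first CR2 value it finds.
oops_summary() {
    awk '
        # Kernel-mode RIP looks like "RIP: 0010:eventfs_set_attr+0x28/0xd0".
        # The user-mode one ("RIP: 0033:...") is deliberately not matched.
        !seen_rip && match($0, /RIP: 0010:[^ ]+/) {
            rip = substr($0, RSTART + 10, RLENGTH - 10)
            seen_rip = 1
        }
        # CR2 holds the address whose access raised the page fault.
        !seen_cr2 && match($0, /CR2: [0-9a-f]+/) {
            cr2 = substr($0, RSTART + 5, RLENGTH - 5)
            seen_cr2 = 1
        }
        END { printf "rip=%s cr2=%s\n", rip, cr2 }
    '
}
```

Usage would be something like `sudo dmesg | oops_summary`. For the trace in this report the fault address is 0x58 while R13 is 0, which is consistent with a read at a small offset from a NULL pointer inside `eventfs_set_attr`.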
Closed by Toolybird (Toolybird)
Wednesday, 15 November 2023, 21:20 GMT
Reason for closing: Upstream
Additional comments about closing: Queued for 6.6.2
```
inxi -GSC -xx
System:
Host: agathemoarbauer Kernel: 6.6.1-arch1-1 arch: x86_64 bits: 64
compiler: gcc v: 13.2.1 Desktop: KDE Plasma v: 5.27.9 tk: Qt v: 5.15.11
wm: kwin_x11 dm: SDDM Distro: Arch Linux
CPU:
Info: 8-core model: 11th Gen Intel Core i7-11850H bits: 64 type: MT MCP
arch: Tiger Lake rev: 1 cache: L1: 640 KiB L2: 10 MiB L3: 24 MiB
Speed (MHz): avg: 816 high: 1067 min/max: 800/4800 cores: 1: 800 2: 800
3: 800 4: 800 5: 800 6: 1067 7: 800 8: 800 9: 800 10: 800 11: 800 12: 800
13: 800 14: 800 15: 800 16: 800 bogomips: 79888
Flags: avx avx2 ht lm nx pae sse sse2 sse3 sse4_1 sse4_2 ssse3 vmx
Graphics:
Device-1: Intel TigerLake-H GT1 [UHD Graphics] vendor: Lenovo driver: i915
v: kernel arch: Gen-12.1 ports: active: eDP-1 empty: DP-1, DP-2, DP-3,
DP-4, HDMI-A-1 bus-ID: 00:02.0 chip-ID: 8086:9a60
Device-2: NVIDIA GA104M [GeForce RTX 3070 Mobile / Max-Q] vendor: Lenovo
driver: nvidia v: 545.29.02 arch: Ampere pcie: speed: 16 GT/s lanes: 16
bus-ID: 01:00.0 chip-ID: 10de:249d
Device-3: Luxvisions Innotech Integrated RGB Camera driver: uvcvideo
type: USB rev: 2.0 speed: 480 Mb/s lanes: 1 bus-ID: 3-8:2 chip-ID: 30c9:0032
Display: x11 server: X.Org v: 21.1.9 with: Xwayland v: 23.2.2
compositor: kwin_x11 driver: X: loaded: modesetting,nvidia
alternate: fbdev,intel,nouveau,nv,vesa dri: iris gpu: i915 display-ID: :0
screens: 1
Screen-1: 0 s-res: 3840x2400 s-dpi: 192
Monitor-1: eDP-1 model: LG Display 0x06aa res: 3840x2400 dpi: 284
diag: 406mm (16")
API: EGL v: 1.5 platforms: device: 0 drv: nvidia device: 1 drv: iris
device: 3 drv: swrast gbm: drv: kms_swrast surfaceless: drv: nvidia x11:
drv: iris inactive: wayland,device-2
API: OpenGL v: 4.6.0 compat-v: 4.5 vendor: intel mesa v: 23.2.1-arch1.2
glx-v: 1.4 direct-render: yes renderer: Mesa Intel UHD Graphics (TGL GT1)
device-ID: 8086:9a60
API: Vulkan v: 1.3.269 surfaces: xcb,xlib device: 0 type: discrete-gpu
driver: nvidia device-ID: 10de:249d device: 1 type: integrated-gpu
driver: mesa intel device-ID: 8086:9a60
```
[1]: https://archive.archlinux.org/packages/l/linux/linux-6.6.arch1-1-x86_64.pkg.tar.zst
[2]: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=055907ad2c14838c90d63297f7bab8d180a5d844
[3]: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=ea4c30a0a73fb5cb2604539db550f1e620bb949c
[4]: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=9aaee3eebc91dd9ccebf6b6bc8a5f59d04ef718b
[5]: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=fa18a8a0539b02cc621938091691f0b73f0b1288
[6]: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=9034c87d61be8cff989017740a91701ac8195a1d
I have now reproduced it on a second machine with 6.6.1-arch1-1 - the last command is really enough to trigger it on its own:
```
sudo chown -R root:tracing /sys/kernel/debug/tracing/
```
Reverting back to 6.6-arch1-1 fixes the issue for me.
If you point me to a tutorial that makes it simple for me to compile my own arch kernel package with these patches reverted, I would be willing to try it out.
Thanks
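Before running the `chown` on another machine, it may be worth checking whether it runs the release this report was reproduced on. A minimal sketch follows; the helper name `is_reported_bad_release` is hypothetical, and it only knows about the single release confirmed in this report (6.6.1-arch1-1). It makes no claim about any other kernel.

```shell
#!/bin/sh
# Sketch only: is_reported_bad_release is a hypothetical helper.
# Returns 0 if the given release string is the one this report reproduced
# the crash on, 1 for anything else (which means "unknown", not "safe").
is_reported_bad_release() {
    case "$1" in
        6.6.1-arch1-1) return 0 ;;  # reproduced on two machines in this report
        *)             return 1 ;;  # not covered by this report
    esac
}
```

For example: `is_reported_bad_release "$(uname -r)" && echo "this is the release from the report"`.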
If the generated kernel does not have the issue, comment out line 84 to stop reverting the last commit, then rebuild and retest. If the generated kernel does have the issue, uncomment lines 85 and 86, then rebuild and retest.
I recommend enabling parallel compilation [3] before building the kernel.
```
$ pkgctl repo clone --protocol=https linux # obtain the PKGBUILD [1]
$ cd linux/
$ git apply -v PKGBUILD.diff
$ pkgctl build
# pacman -U linux-6.6.1.arch1-1.1-x86_64.pkg.tar.zst linux-headers-6.6.1.arch1-1.1-x86_64.pkg.tar.zst # If you do not need linux-headers you can skip installing it
```
[1]: https://wiki.archlinux.org/title/Arch_build_system#Using_the_pkgctl_tool
[2]: https://wiki.archlinux.org/title/Patching_packages#Applying_patches
[3]: https://wiki.archlinux.org/title/Makepkg#Parallel_compilation
```
==> Verifying source file signatures with gpg...
linux-6.6.1.tar ... FAILED (unknown public key 38DBBDC86092693E)
linux-v6.6.1-arch1.patch.zst ... FAILED (unknown public key 3B94A80E50A477C7)
==> ERROR: One or more PGP signatures could not be verified!
==> ERROR: Could not download sources.
```
My `archlinux-keyring` is up to date. What do I do now?
$ gpg --import keys/pgp/*
[1]: https://wiki.archlinux.org/title/GnuPG#Import_a_public_key
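To confirm the import covered everything, the key IDs can be read straight from the failed build output and checked against the local GnuPG keyring. This is a rough sketch: `missing_keys` and `check_keys` are hypothetical helper names, and the parsing assumes the `FAILED (unknown public key ...)` wording shown in the error above.

```shell
#!/bin/sh
# Sketch only: both helpers are hypothetical and exist for illustration.

# Print one key ID per "unknown public key" line read from stdin.
missing_keys() {
    sed -n 's/.*unknown public key \([0-9A-Fa-f]\{1,\}\)).*/\1/p'
}

# Read key IDs from stdin and report any that gpg still cannot find.
# gpg --list-keys exits non-zero when the key is not in the keyring.
check_keys() {
    while read -r id; do
        gpg --list-keys "$id" >/dev/null 2>&1 || echo "still missing: $id"
    done
}
```

With the build output saved to a file (the name `build.log` is just an example), `missing_keys < build.log | check_keys` should print nothing once `gpg --import keys/pgp/*` has been run from the package checkout.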
Thanks for your help with setting this up loqs, much appreciated. Can you take it from here?
[1]: https://lore.kernel.org/stable/20231105160139.660634360%40goodmis.org/
[2]: https://bugzilla.kernel.org/
[3]: https://www.kernel.org/doc/html/latest/admin-guide/reporting-issues.html
Cheers
Thank you. Have you been able to test the proposed fix [1]?
[1]: https://lore.kernel.org/stable/20231112121817.2713c150%40rorschach.local.home/
The following will adjust the PKGBUILD to apply the proposed fix with incremented pkgrel 1.2.
```
$ cd linux/
$ git reset --hard
$ git apply -v PKGBUILD.diff # The PKGBUILD.diff attached to this comment, not the previous one
$ pkgctl build
# pacman -U linux-6.6.1.arch1-1.2-x86_64.pkg.tar.zst linux-headers-6.6.1.arch1-1.2-x86_64.pkg.tar.zst # If you do not need linux-headers you can skip installing it
```
Cheers
[1]: https://git.kernel.org/pub/scm/linux/kernel/git/stable/stable-queue.git/commit/queue-6.6/eventfs-check-for-null-ef-in-eventfs_set_attr.patch?id=a9d042fde10315e4844883a4303193dde9dcf93b